For the past two years, we have helped users build fast, resilient, and stable applications with Scylla, an enterprise-grade database solution. During that time, our early adopters migrated from a variety of database solutions. Most of the migrations we completed were from Apache Cassandra (enterprise and open-source versions), but we have also seen users migrate from MongoDB, HBase, relational systems such as MySQL and Postgres, and key/value stores like Memcached and Redis.
Migration strategies differ between users and systems. In general, we can divide Apache Cassandra-to-Scylla migrations into two main strategies: cold migration and hot migration.
During the cold migration process, neither the legacy system nor the Scylla system is operational. The cold migration strategy is easier on the operators, as it lets the team stop the legacy database system at a point in time and resume service from that same point on the new Scylla system. However, users cannot use the system during the migration process, a constraint very few organizations are willing to accept.
The second strategy is a hot migration. In contrast to a cold migration, in a hot migration both the legacy and the Scylla deployments are fully operational during the migration process. We describe the required steps for a hot migration in the following document.
What about other database migrations?
For a document-based solution, we work with the user to “serialize” the data, for example, converting a JSON data entry to a columnar one. The following is a simple example of one of the conversions, in which a user profile is converted from a document model to a columnar one.
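A minimal sketch of such a conversion in Python, assuming a hypothetical user profile document; the field names (`user_id`, `name`, `emails`, `address`) and values are illustrative and not taken from a real deployment. The nested address object is flattened into a tuple, standing in for a user-defined type, and the JSON array becomes a set, standing in for a CQL collection.

```python
import json

# Hypothetical document-model record; field names and values are illustrative.
profile_json = """
{
  "user_id": 42,
  "name": "Jane Doe",
  "emails": ["jane@example.com", "jd@example.org"],
  "address": {"street": "1 Main St", "city": "Springfield", "zip": "12345"}
}
"""


def document_to_row(doc: dict) -> dict:
    """Map a nested profile document onto a flat, column-oriented row."""
    address = doc.get("address", {})
    return {
        "user_id": doc["user_id"],
        "name": doc["name"],
        # A JSON array maps naturally onto a collection column such as set<text>.
        "emails": set(doc.get("emails", [])),
        # A nested object maps onto a user-defined type; a tuple stands in here.
        "address": (
            address.get("street"),
            address.get("city"),
            address.get("zip"),
        ),
    }


row = document_to_row(json.loads(profile_json))
print(row["address"])
```

Keeping the mapping in one small function makes it easy to review which document fields end up in which column before any data is moved.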
The next model is a Scylla data model to hold the same address information. We will need to create a user-defined type for addresses:
Create the table:
And insert the data:
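The three steps above could look roughly like the following sketch. The keyspace, table, type, and column names (`mykeyspace`, `users`, `address`, and so on) are assumptions made for illustration, as is the sample row. The CQL is held in Python strings so the sketch runs without a cluster; with a driver, each statement would typically be passed to the session's execute call.

```python
# Hypothetical schema: every name and value below is illustrative only.

# 1. A user-defined type that holds the address fields.
CREATE_TYPE = """
CREATE TYPE IF NOT EXISTS mykeyspace.address (
    street text,
    city text,
    zip text
);
"""

# 2. A table that embeds the type as a frozen column.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS mykeyspace.users (
    user_id int PRIMARY KEY,
    name text,
    emails set<text>,
    address frozen<address>
);
"""

# 3. An insert that uses a UDT literal for the address.
INSERT_USER = """
INSERT INTO mykeyspace.users (user_id, name, emails, address)
VALUES (42, 'Jane Doe', {'jane@example.com', 'jd@example.org'},
        {street: '1 Main St', city: 'Springfield', zip: '12345'});
"""

statements = [CREATE_TYPE, CREATE_TABLE, INSERT_USER]
for stmt in statements:
    print(stmt.strip())
```

Declaring the address as `frozen<address>` means the whole value is written and read as a unit, which is usually sufficient when the address is always replaced in full.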
Migrations from a different database architecture
Migrations from a different database architecture, such as a relational one, require more attention to data models and data retrieval patterns. For example, some databases offer joins and aggregation functions. For joins, we recommend that users denormalize their data model around the queries the application actually issues. Here is an example of denormalizing a relational database model.
The relational data model and query look like the following:
Here is our sample data after insertion:
The following is a query we can deploy in a relational database scenario:
The outcome of the query will be:
The query retrieves, based on a player name, the games in which he or she scored, and whether it was a home or an away game.
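A minimal sketch of this kind of relational model and join, using SQLite from Python purely as a stand-in for a relational database. The schema (`players`, `games`, `goals`), the team names, dates, and goal counts are all invented placeholders, not real match results.

```python
import sqlite3

# In-memory database; schema and rows are hypothetical placeholders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE players (id INTEGER PRIMARY KEY, name TEXT, team TEXT);
CREATE TABLE games   (id INTEGER PRIMARY KEY, home_team TEXT,
                      away_team TEXT, played_on TEXT);
CREATE TABLE goals   (player_id INTEGER REFERENCES players(id),
                      game_id   INTEGER REFERENCES games(id),
                      goals     INTEGER);

INSERT INTO players VALUES (1, 'Cristiano Ronaldo', 'Team A');
INSERT INTO games VALUES (10, 'Team A', 'Team B', '2017-01-07'),
                         (11, 'Team C', 'Team A', '2017-01-14');
INSERT INTO goals VALUES (1, 10, 2), (1, 11, 1);
""")

# Join the three tables and derive home/away from the player's team.
query = """
SELECT g.played_on,
       CASE WHEN g.home_team = p.team THEN 'home' ELSE 'away' END AS venue,
       s.goals
FROM players p
JOIN goals s ON s.player_id = p.id
JOIN games g ON g.id = s.game_id
WHERE p.name = ?
ORDER BY g.played_on
"""
rows = conn.execute(query, ("Cristiano Ronaldo",)).fetchall()
for row in rows:
    print(row)
```

The point of the sketch is the shape of the query: two joins and a filter on a non-key column, which is the pattern that has to be reworked when moving to a denormalized model.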
The following example is one of the options for storing the needed information in Scylla and enabling the query. Please note that Scylla does not support arbitrary WHERE clauses; queries are expected to restrict on the table's primary key columns. With Materialized Views, users can create different views of a table and query the information through a different key should they need less granular access. You can read more about Scylla’s implementation of Materialized Views here.
Here is the complete games info:
And the query to learn how Cristiano Ronaldo did for his home games:
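As a rough sketch of one possible denormalized layout, the table below keys every row by player name and clusters by venue and date, so the home-games lookup restricts only on primary key columns. The keyspace, table, and column names are assumptions, and the rows are invented placeholders rather than real match results. A small in-memory model follows the CQL to show the access pattern without requiring a cluster.

```python
from collections import defaultdict

# Hypothetical denormalized table: one partition per player.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS mykeyspace.player_games (
    player_name text,
    venue text,          -- 'home' or 'away'
    played_on date,
    opponent text,
    goals int,
    PRIMARY KEY ((player_name), venue, played_on)
);
"""

# The query restricts the partition key and the first clustering column.
HOME_GAMES_QUERY = """
SELECT played_on, opponent, goals
FROM mykeyspace.player_games
WHERE player_name = 'Cristiano Ronaldo' AND venue = 'home';
"""

# In-memory stand-in: partition key -> rows ordered by clustering columns.
sample_rows = [
    ("Cristiano Ronaldo", "home", "2017-01-07", "Team B", 2),
    ("Cristiano Ronaldo", "away", "2017-01-14", "Team C", 1),
]
partitions = defaultdict(list)
for player, venue, played_on, opponent, goals in sample_rows:
    partitions[player].append((venue, played_on, opponent, goals))
for part in partitions.values():
    part.sort()  # mimic clustering order: venue, then played_on


def home_games(player: str) -> list:
    """Return (played_on, opponent, goals) for the player's home games."""
    return [r[1:] for r in partitions.get(player, []) if r[0] == "home"]


print(home_games("Cristiano Ronaldo"))
```

Putting `venue` ahead of `played_on` in the clustering key is a design choice driven by this one query; an application that mostly asks for games by date range would likely order the clustering columns differently.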
As we demonstrated above, it is possible to migrate from different databases to Scylla. At our recent user conference, we presented a talk about database migration. You can see the presentation and the slides on our website.