Dec 19

Making the Move: Migrating to Scylla

For the past two years, we have helped users build fast, resilient, and stable applications with Scylla, an enterprise-grade database solution. During that time, our early adopters migrated from a variety of database solutions. Most of the migrations we completed were from Apache Cassandra (enterprise and open-source versions), but we have also seen users migrate from MongoDB, HBase, relational systems such as MySQL and Postgres, and key/value stores like Memcached and Redis.

Migration strategies differ between users and systems. In general, we can divide Apache Cassandra-to-Scylla migrations into two main strategies: cold migration and hot migration.

Cold Migration

During the cold migration process, neither the legacy system nor the Scylla system is operational. The cold migration strategy is easier on the operators, as it lets the team stop the legacy database system at a point in time and resume from that same point on the new Scylla system. However, users cannot use the system during the migration process, which is a constraint very few organizations are willing to accept.

Hot Migration

The second strategy is a hot migration. In contrast to a cold migration, both the legacy and Scylla deployments are fully operational during a hot migration. We describe the required steps for a hot migration in the following document.

What about other database migrations?

For a document-based solution, we work with the user to “serialize” the data, for example, converting a JSON data entry to a columnar one. The following is a simple example of one of the conversions, in which a user profile is converted from a document model to a columnar one.

The corresponding Scylla data model holds the same address information. We first need to create a user-defined type for addresses:

Create the table:


And insert the data:
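As a hedged sketch of the three steps above, the following Python renders the CQL for a user-defined type, a table, and an insert from a JSON document. The keyspace `demo`, the table `users`, the field names, and the sample profile are all assumptions made for illustration, not the schema used in the original example.

```python
import json

# Hypothetical document, standing in for a record from a document store.
DOCUMENT = json.loads("""
{
  "user_id": 1,
  "name": "Jane Doe",
  "address": {"street": "1 Main St", "city": "Springfield", "zip_code": "12345"}
}
""")

ADDRESS_FIELDS = ("street", "city", "zip_code")

# Assumed schema: the nested address object becomes a user-defined type.
CREATE_TYPE = (
    "CREATE TYPE IF NOT EXISTS demo.address "
    "(street text, city text, zip_code text);"
)
CREATE_TABLE = (
    "CREATE TABLE IF NOT EXISTS demo.users "
    "(user_id int PRIMARY KEY, name text, address frozen<address>);"
)


def cql_text(value):
    """Quote a value as a CQL text literal, doubling embedded single quotes."""
    return "'" + str(value).replace("'", "''") + "'"


def insert_statement(doc):
    """Render one document as a CQL INSERT with a UDT literal for the address."""
    addr = doc["address"]
    udt = "{" + ", ".join(
        f"{field}: {cql_text(addr[field])}" for field in ADDRESS_FIELDS
    ) + "}"
    return (
        "INSERT INTO demo.users (user_id, name, address) "
        f"VALUES ({int(doc['user_id'])}, {cql_text(doc['name'])}, {udt});"
    )


INSERT = insert_statement(DOCUMENT)
for statement in (CREATE_TYPE, CREATE_TABLE, INSERT):
    print(statement)
```

Rendering literals by hand keeps the sketch runnable without a cluster. In an application, a driver's prepared statements with bound values would be the safer choice.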

Migrations from a different database architecture

Migrations from a different database architecture, such as a relational one, require more attention to data models and data retrieval patterns. For example, some databases offer joins and aggregation functions. For joins, we recommend that users denormalize their data model and design it around the queries the application runs. Here is an example of denormalizing a relational database model.

The relational data model and query look like the following:

Here is our sample data after insertion:

player_id | name
1         | Cristiano Ronaldo
2         | Lionel Messi
3         | David Beckham

Game | Against         | Goals | Game Type | Player
1    | F.C. Barcelona  | 1     | away      | 1
2    | Atletico Madrid | 2     | home      | 1
3    | F.C. Valencia   | 1     | home      | 1
4    | Malaga          | 2     | away      | 3
5    | Sevilla         | 1     | away      | 2
6    | Real Madrid     | 1     | away      | 2

The following is a query we can deploy in a relational database scenario:
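As a minimal sketch, the relational model, the sample data, and a join of this kind can be reproduced with an in-memory SQLite database in Python. The table and column names (`players`, `games`, `game_id`, and so on) are assumptions inferred from the sample data, and the query is one plausible form rather than the exact statement from the original example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed normalized schema: games reference players by id.
conn.executescript("""
CREATE TABLE players (
    player_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE games (
    game_id   INTEGER PRIMARY KEY,
    against   TEXT NOT NULL,
    goals     INTEGER NOT NULL,
    game_type TEXT NOT NULL,   -- 'home' or 'away'
    player_id INTEGER NOT NULL REFERENCES players(player_id)
);
""")

conn.executemany(
    "INSERT INTO players VALUES (?, ?)",
    [(1, "Cristiano Ronaldo"), (2, "Lionel Messi"), (3, "David Beckham")],
)
conn.executemany(
    "INSERT INTO games VALUES (?, ?, ?, ?, ?)",
    [
        (1, "F.C. Barcelona", 1, "away", 1),
        (2, "Atletico Madrid", 2, "home", 1),
        (3, "F.C. Valencia", 1, "home", 1),
        (4, "Malaga", 2, "away", 3),
        (5, "Sevilla", 1, "away", 2),
        (6, "Real Madrid", 1, "away", 2),
    ],
)

# Join games to players and keep one player's home games.
HOME_GAMES_QUERY = """
SELECT g.game_id, g.against, g.goals, g.game_type, g.player_id
FROM games AS g
JOIN players AS p ON p.player_id = g.player_id
WHERE p.name = ? AND g.game_type = 'home'
ORDER BY g.game_id
"""

rows = conn.execute(HOME_GAMES_QUERY, ("Cristiano Ronaldo",)).fetchall()
for row in rows:
    print(row)
```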

The outcome of the query will be:

Game | Against         | Goals | Game Type | Player
2    | Atletico Madrid | 2     | home      | 1
3    | F.C. Valencia   | 1     | home      | 1

The query retrieves, based on a player name, the games in which he or she scored, and whether it was a home or an away game.

The following example is one option for storing the needed information in Scylla and enabling the query. Note that Scylla does not support arbitrary WHERE clauses; queries are generally expected to restrict the table's key columns. With Materialized Views, users can create different views of a table and query them when they need to access the data by other columns. You can read more about Scylla’s implementation of Materialized Views here.

Here is the complete games info:

And the query to learn how Cristiano Ronaldo did for his home games:
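A hedged illustration of the idea: the join is resolved at write time, so each player's games live together and can be read by key. The sketch below models that in memory in Python and carries the CQL it would correspond to as strings. The keyspace `demo`, the table `games_by_player`, and the choice of `player_name` as partition key with `game_type` and `game_id` as clustering columns are assumptions, not the schema from the original example.

```python
from collections import defaultdict

# Assumed denormalized table: one partition per player, clustered by game type.
CREATE_TABLE = (
    "CREATE TABLE demo.games_by_player ("
    "player_name text, game_type text, game_id int, against text, goals int, "
    "PRIMARY KEY (player_name, game_type, game_id));"
)
HOME_GAMES_QUERY = (
    "SELECT game_id, against, goals, game_type FROM demo.games_by_player "
    "WHERE player_name = 'Cristiano Ronaldo' AND game_type = 'home';"
)

PLAYERS = {1: "Cristiano Ronaldo", 2: "Lionel Messi", 3: "David Beckham"}
GAMES = [  # (game_id, against, goals, game_type, player_id)
    (1, "F.C. Barcelona", 1, "away", 1),
    (2, "Atletico Madrid", 2, "home", 1),
    (3, "F.C. Valencia", 1, "home", 1),
    (4, "Malaga", 2, "away", 3),
    (5, "Sevilla", 1, "away", 2),
    (6, "Real Madrid", 1, "away", 2),
]


def denormalize(players, games):
    """Copy the player name into every game row and group rows by player."""
    partitions = defaultdict(list)
    for game_id, against, goals, game_type, player_id in games:
        partitions[players[player_id]].append(
            {"game_type": game_type, "game_id": game_id,
             "against": against, "goals": goals}
        )
    # Mimic clustering order: (game_type, game_id) within each partition.
    for rows in partitions.values():
        rows.sort(key=lambda r: (r["game_type"], r["game_id"]))
    return dict(partitions)


def games_for(partitions, player_name, game_type):
    """Key lookup plus clustering-column filter; no join is needed at read time."""
    return [r for r in partitions.get(player_name, [])
            if r["game_type"] == game_type]


PARTITIONS = denormalize(PLAYERS, GAMES)
HOME_GAMES = games_for(PARTITIONS, "Cristiano Ronaldo", "home")
for row in HOME_GAMES:
    print(row)
```

The trade-off in this design is that the player name is duplicated in every row, in exchange for a read that touches a single partition.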

For aggregation functions, some are available in Scylla today, while others are in development. For heavier analytical workloads, we also recommend using additional tools such as Spark or Presto.
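Where a needed aggregate is not available on the server side, one fallback is to aggregate in the application. The sketch below sums goals per player in Python over rows assumed to have already been fetched. The row shape `(player_name, goals)` is hypothetical, and this is an illustration of the idea rather than a recommended replacement for Spark or Presto on large data sets.

```python
from collections import Counter

# Hypothetical (player_name, goals) rows, as a client might read them back.
ROWS = [
    ("Cristiano Ronaldo", 1),
    ("Cristiano Ronaldo", 2),
    ("Cristiano Ronaldo", 1),
    ("David Beckham", 2),
    ("Lionel Messi", 1),
    ("Lionel Messi", 1),
]


def goals_per_player(rows):
    """Client-side equivalent of SUM(goals) grouped by player."""
    totals = Counter()
    for player_name, goals in rows:
        totals[player_name] += goals
    return dict(totals)


TOTALS = goals_per_player(ROWS)
print(TOTALS)
```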

As demonstrated above, it is possible to migrate from different databases to Scylla. At our recent user conference, we presented a talk about database migration. You can see the presentation and the slides on our website.

Also, check out our migration documentation, and if you are ready to get started, contact us to learn about the professional services we offer to help with your migration.

About Eyal Gutkind

Eyal Gutkind is a solution architect for Scylla. Prior to Scylla, Eyal held product management roles at Mirantis and DataStax. Prior to DataStax, Eyal spent 12 years with Mellanox Technologies in various engineering management and product marketing roles. Eyal holds a BSc degree in Electrical and Computer Engineering from Ben Gurion University, Israel, and an MBA from the Fuqua School of Business at Duke University, North Carolina.


Tags: Apache Cassandra, Cassandra, deployment, Migration