Another week, another Spark and Scylla post! This time, we’re back with the Scylla Spark Migrator; we’ll take a short tour through its innards to see how it is implemented. Read why we implemented the Scylla Spark Migrator in this blog. Overview: When developing the Migrator, we had several design goals in mind. First, the Migrator should be highly efficient in terms of resource usage. Resource efficiency in the land of Spark applications usually translates to avoiding data shuffles between nodes. Data shuffles hurt Spark’s performance because they incur additional I/O costs. Moreover, shuffles usually get slower […]
Welcome to a whole new chapter in our Spark and Scylla series! This post will introduce the Scylla Migrator project – a Spark-based application that will easily and efficiently migrate existing Cassandra tables into Scylla. Over the last few years, ScyllaDB has helped many customers migrate from existing Cassandra installations to a Scylla deployment. The migration approach is detailed in this document. Briefly, the process consists of several phases: create an identical schema in Scylla to hold the data; configure the application to perform dual writes; snapshot the historical data from Cassandra and load it into Scylla; configure the […]
The Scylla team is pleased to announce the release of Scylla 2.3, a production-ready Scylla Open Source minor release. The Scylla 2.3 release includes CQL enhancements, new troubleshooting tools, performance improvements and more. Experimental features include Materialized Views, Secondary Indexes, and Hinted Handoff (details below). Starting from Scylla 2.3, packages are also available for Ubuntu 18.04 and Debian 9. Scylla is an open source, Apache Cassandra-compatible NoSQL database, with superior performance and consistently low latency. Find the Scylla 2.3 repository for your Linux distribution here. Our open source policy is to support only the current active release and its predecessor. […]
ScyllaDB is a diamond sponsor at the upcoming Distributed Data Summit in San Francisco on September 14th. Come see us!
The Mutant Monitoring System series has come to an end. In this post, we will summarize each day of the training series and explain what readers can learn.
AdGear joined us at the Big Data Montreal meetup and we discussed Real-time Data at Scale and how we helped them achieve their goal of 1 million queries per second.
Benchmarking is no easy task, especially when comparing databases with different “engines” under the hood. You want your benchmark to be fair, to run each database on its optimal setup and hardware, and to keep the comparison as apples-to-apples as possible. (For more on this topic, see our webinar on the “Do’s and Don’ts of Benchmarking Databases.”) We kept this in mind when conducting this Scylla versus Cassandra benchmark, which compares Scylla and Cassandra on AWS EC2, using cassandra-stress as the load generator. Most benchmarks compare different software stacks on the same hardware and try to max out the throughput. […]
Ola Cabs shares their two-year journey with Scylla and how it lived up to their expectations. Learn how they graduated from using Scylla for very simple and non-critical use cases to running it for their mission-critical flows.
The Intel Memory Group is behind the revolutionary Optane SSD, which provides breakthrough performance and is 5-8x faster at low queue depths than traditional SSDs. Intel began working with ScyllaDB staff last year to build a big memory system at high-volume scale. They chose Scylla because they needed a solution that can fully leverage the hardware to derive the best possible performance.
Learn how developers create applications that connect to databases by using the Cassandra libraries available for programming languages. In this post you will learn how to create a sample Node.js application in Docker that connects to the Mutant Monitoring System.