ScyllaDB Open Source 5.X

ScyllaDB Open Source 5.0

ScyllaDB Open Source has a rich set of new production-ready features, including support for the new AWS EC2 I4i instances, IO scheduler improvements, load and stream of SSTables for easier restorations, Repair Based Node Operations (RBNO), materialized view improvements, and more. Already the best high-performance NoSQL database for big data workloads, ScyllaDB was built to replace Apache Cassandra and Amazon DynamoDB by taking their best attributes and improving on performance, scalability, and cost-efficiency.

What’s New in ScyllaDB Open Source 5.1

Production Ready Features

Distributed SELECT count(*) (5.1+)

ScyllaDB now automatically runs SELECT count(*) statements on all nodes and all shards in parallel, which brings a considerable speedup, up to 100X in larger clusters. This feature is limited to queries that do not use GROUP BY or filtering. Each such query uses a new form of super-coordinator to distribute the work across the cluster and aggregate the results.
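
The fan-out and aggregation described above can be illustrated with a small simulation. This is only a sketch of the idea, not ScyllaDB's implementation: the CLUSTER layout, the node names, and both function names are hypothetical, and a thread pool stands in for the nodes and shards working in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy cluster: node name -> list of shards, each shard a list of rows.
CLUSTER = {
    "node1": [["a", "b"], ["c"]],
    "node2": [["d"], ["e", "f", "g"]],
    "node3": [[], ["h"]],
}


def count_shard(rows):
    """Partial count computed locally by a single shard."""
    return len(rows)


def distributed_count(cluster):
    """Fan the count out to every shard in parallel, then sum the partial counts."""
    shards = [shard for node_shards in cluster.values() for shard in node_shards]
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(count_shard, shards))
    # The coordinator's only remaining job is to add up the partial results.
    return sum(partials)


if __name__ == "__main__":
    print(distributed_count(CLUSTER))
```

Because each shard only reports a single number, the coordinator's work stays small no matter how many rows each shard holds.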

Limit Partition Access Rate (5.1+)

It is now possible to limit the read and write rates for a partition with a new WITH per_partition_rate_limit clause for the CREATE TABLE and ALTER TABLE statements. This is useful to prevent hot-partition problems when high-rate reads or writes are bogus (for example, arriving from spam bots).
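
As a rough illustration, the helper below renders an ALTER TABLE statement that uses this clause. The helper itself and the shop.events table are hypothetical. The option names max_reads_per_second and max_writes_per_second are assumed here, so check them against the ScyllaDB CQL reference before relying on them.

```python
def rate_limit_statement(keyspace, table, max_reads=None, max_writes=None):
    """Render an ALTER TABLE statement that sets per-partition rate limits.

    The option names below are an assumption; verify them in the CQL reference.
    """
    options = {}
    if max_reads is not None:
        options["max_reads_per_second"] = int(max_reads)
    if max_writes is not None:
        options["max_writes_per_second"] = int(max_writes)
    if not options:
        raise ValueError("set at least one of max_reads or max_writes")
    body = ", ".join(f"'{name}': {value}" for name, value in options.items())
    return f"ALTER TABLE {keyspace}.{table} WITH per_partition_rate_limit = {{{body}}};"


if __name__ == "__main__":
    # Hypothetical keyspace and table names, for illustration only.
    print(rate_limit_statement("shop", "events", max_reads=100, max_writes=50))
```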

Load and Stream (5.1+)

This feature extends nodetool refresh to allow loading arbitrary SSTables that do not belong to a particular node into the cluster. It loads the SSTables from disk, calculates which nodes own the data, and automatically streams the data to those nodes. This is particularly useful when restoring a cluster from backup.
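
The "calculate the owning nodes" step can be pictured with a toy token ring. This is a simplified sketch under loud assumptions: the three-node RING, the 360-token space, and the SHA-256 based token function are all hypothetical (real clusters use a different partitioner and many more tokens), and replication is ignored, so each key maps to a single owner.

```python
import bisect
import hashlib
from collections import defaultdict

# Hypothetical three-node ring; each node owns tokens up to and including its own.
RING = [(100, "node1"), (200, "node2"), (300, "node3")]
TOKENS = [token for token, _ in RING]
TOKEN_SPACE = 360  # toy token space, chosen so that some tokens wrap around


def token_for(partition_key):
    """Toy token function; a stand-in for the cluster's real partitioner."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % TOKEN_SPACE


def owner_of(token):
    """Find the first node whose token is >= the given token, wrapping around."""
    index = bisect.bisect_left(TOKENS, token) % len(RING)
    return RING[index][1]


def plan_streams(partition_keys):
    """Group partition keys by the node that should receive them."""
    plan = defaultdict(list)
    for key in partition_keys:
        plan[owner_of(token_for(key))].append(key)
    return dict(plan)


if __name__ == "__main__":
    print(plan_streams(["user:1", "user:2", "user:3", "user:4"]))
```

The point of the sketch is that the loading node does not need to own any of the data: it only needs the ring to decide where each partition should be sent.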

Materialized View: Prune (5.1+)

So-called “ghost rows” manifest when rows in a materialized view do not correspond to any base table rows. Such inconsistencies should be prevented altogether and ScyllaDB strives to avoid them, but if they happen, the new PRUNE MATERIALIZED VIEW statement can be used to restore a materialized view to a fully consistent state without rebuilding it from scratch.
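
To make the notion of a ghost row concrete, here is a minimal in-memory sketch. It does not show how PRUNE MATERIALIZED VIEW works internally. The data model is a hypothetical simplification: the base table is a dict from primary key to the column used as the view key, and the view is a list of (view_key, base_primary_key) pairs.

```python
def is_ghost(view_row, base):
    """A view row is a ghost if no base row currently produces it."""
    view_key, base_pk = view_row
    return base.get(base_pk) != view_key


def prune_view(view_rows, base):
    """Return the view with ghost rows removed, leaving valid rows untouched."""
    return [row for row in view_rows if not is_ghost(row, base)]


if __name__ == "__main__":
    # Hypothetical base table: user id -> email (the view is keyed by email).
    base = {1: "alice@example.com", 2: "bob@example.com"}
    view = [
        ("alice@example.com", 1),  # matches the base row
        ("bob@old.example", 2),    # stale: the base row now has a different email
        ("carol@example.com", 3),  # orphan: base row 3 no longer exists
    ]
    print(prune_view(view, base))
```

Only the rows that fail the check are dropped, which is why pruning can be much cheaper than rebuilding the whole view.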

Materialized View: Synchronous Mode (5.1+)

In ordinary (asynchronous) materialized views, a write returns before the view is updated. In synchronous materialized views, the write does not return until the view is updated. This enhances consistency but reduces availability, because in some situations all nodes might need to be functional.

Support for AWS EC2 I4i Series Instances

ScyllaDB now supports the new AWS EC2 I4i series instances. The I4i series provides superior performance over the I3 series due to a number of factors: the Intel Xeon Ice Lake processors, the AWS Nitro System, and low-latency Nitro NVMe SSDs. ScyllaDB can achieve 2x throughput and lower latencies on I4i instances compared with comparable I3 instances.

I/O Scheduler Improvements

A new I/O scheduler was integrated via a Seastar update. The new scheduler is better at restricting disk I/O in order to keep latency low.

Improved Reverse Queries

Reverse queries are SELECT statements that return rows in the reverse of the order defined in the table schema. If no order was defined, the default order is ascending (ASC). For example, imagine rows in a partition sorted by time in ascending order. A reverse query would return the rows in descending order, with the newest rows first. Reverse queries were improved in ScyllaDB Open Source 4.6 and are further improved in 5.0 in two ways: they now return short pages to limit memory consumption, and they now leverage ScyllaDB’s row-based cache (before 5.0 they bypassed the cache).
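
The semantics of a reverse query served in short pages can be sketched as follows. This illustrates the observable behavior only, not ScyllaDB's internals. The function name, the sample rows, and the page size are hypothetical.

```python
def reverse_pages(rows_ascending, page_size):
    """Yield rows newest-first in pages of at most page_size rows.

    Only one page is materialized at a time, which is what keeps memory bounded.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    end = len(rows_ascending)
    while end > 0:
        start = max(0, end - page_size)
        # Take the last unread slice and flip it so the newest row comes first.
        yield rows_ascending[start:end][::-1]
        end = start


if __name__ == "__main__":
    # Hypothetical partition stored in ascending time order.
    rows = ["09:00", "10:00", "11:00", "12:00", "13:00"]
    for page in reverse_pages(rows, page_size=2):
        print(page)
```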

New Virtual Tables for Configuration and Nodetool Information

A new system.config virtual table allows querying and updating a subset of configuration parameters over CQL. These updates are not persistent; the parameters revert to the values in scylla.yaml after a restart. Nodetool command information can also be accessed via virtual tables, including snapshots, protocol servers, runtime info, and a virtual table replacement for nodetool versions. Virtual tables allow remote access over CQL, including for ScyllaDB Cloud users.

Experimental Features

Alternator Time-to-Live (TTL) (5.1+)

This release supports Time-to-Live (TTL) expiration of data in our DynamoDB-compatible API, known as “Alternator.” In DynamoDB, TTL has a deletion delay of up to 48 hours. With ScyllaDB’s Alternator, you can set a custom deletion delay (by default set to 24 hours). Alternator also uses BYPASS CACHE for the scans employed in TTL expiration, reducing the impact on user workloads. We’ve also implemented new metrics to observe TTL expirations.

User Defined Functions and Aggregates with WebAssembly (5.1+)

This feature allows users to create User Defined Aggregates (UDAs) and User Defined Functions (UDFs) using a WebAssembly (WASM) engine now available in ScyllaDB. The CQL syntax of UDAs and UDFs in ScyllaDB is compatible with Apache Cassandra.

Schema Changes using Raft (5.0+)

Unstable schema management has been a problem in all Apache Cassandra and ScyllaDB versions so far. Using the new Raft consensus protocol support in ScyllaDB, you can now perform immediate and safe schema management for DDL operations such as CREATE, ALTER, and DROP on a KEYSPACE, TABLE, INDEX, UDT, or MV.

More Robust Tombstone Garbage Collection

Tombstones (markers that indicate deleted records) that are older than the most recent repair can now be automatically purged, and newer ones will be kept. This drops tombstones more frequently if repairs are made in a timely manner, and prevents data resurrection if repairs are delayed beyond gc_grace_seconds.
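
The rule above reduces to a comparison between each tombstone's deletion time and the time of the last repair. The sketch below shows that comparison only. The Tombstone type, the field names, and the integer timestamps are hypothetical, and the real feature works with repair history in more detail than a single timestamp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tombstone:
    key: str
    deleted_at: int  # deletion timestamp, in seconds (hypothetical unit)


def split_by_repair(tombstones, last_repair_at):
    """Split tombstones into (purgeable, kept) relative to the last repair time."""
    # Older than the last repair: all replicas have seen the deletion, safe to drop.
    purge = [t for t in tombstones if t.deleted_at < last_repair_at]
    # Written at or after the last repair: keep until a later repair covers them.
    keep = [t for t in tombstones if t.deleted_at >= last_repair_at]
    return purge, keep


if __name__ == "__main__":
    stones = [Tombstone("a", 100), Tombstone("b", 250), Tombstone("c", 400)]
    purge, keep = split_by_repair(stones, last_repair_at=300)
    print([t.key for t in purge], [t.key for t in keep])
```

Tying the decision to repair time instead of a fixed timeout is what lets frequent repairs purge sooner while delayed repairs never purge too early.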

Other ScyllaDB Open Source Features

See what we introduced in ScyllaDB 4.0.
See what we introduced prior to ScyllaDB 4.0.

Resources

Our new feature-based minor release for the ScyllaDB Open Source 5.0 major release brings improved capabilities, stability and performance.
Learn about the latest evolution of our monstrously fast and scalable NoSQL database and the first milestone in ScyllaDB V.
Our open source Amazon DynamoDB-compatible API allows you to run your database on any cloud or on premises.

Read details on ScyllaDB’s newest capabilities, and see all of the software quality fixes that went into the latest major release.

Discover how ScyllaDB Open Source 5.0 uses a custom IO scheduler and algorithms to keep disk load perfectly balanced.

ScyllaDB adopted Raft as a consensus protocol to provide immediately consistent schema changes. Learn more about how we did it.

ScyllaDB University

Get started on the path to ScyllaDB expertise

ScyllaDB Cloud

It’s easy to get started with our NoSQL DBaaS