ScyllaDB Open Source Release 3.0

The ScyllaDB team is pleased to announce the availability of ScyllaDB Open Source 3.0, a production-ready major release.

ScyllaDB Open Source 3.0 brings major new features, including:

  • Production-ready Materialized Views (MV)

  • Global Secondary Indexes (GSI)

  • Hinted Handoffs

  • New disk format, compatible with Apache Cassandra 3.0.

ScyllaDB is an open source, Apache Cassandra-compatible NoSQL database, with superior performance and consistently low latency. Find the ScyllaDB Open Source Release 3.0 repository for your Linux distribution here.

Our open source policy is to support only the current active release and its predecessor. Therefore, with the release of ScyllaDB Open Source Release 3.0, ScyllaDB Open Source Release 2.3 is still supported, but ScyllaDB Open Source Release 2.2 is officially retired.

New features in ScyllaDB Open Source 3.0

Materialized Views (MV)
With ScyllaDB 3.0, MV is production ready (graduated from a long incubation in experimental mode) and feature-compatible with Apache Cassandra 3.0, including:

  • Creating an MV based on any subset of columns of the base table, including the primary key columns

  • Updating an MV on base table DELETE or UPDATE

  • Indexing of existing data when creating an MV

  • Support for MV hinted handoff

  • Topology changes, and migration of MV token ranges

  • Sync of TTL between a base table and an MV table

  • nodetool viewbuildstatus

  • Unselected columns keep MV row alive #3362, CASSANDRA-13826

The following MV functions are not available in ScyllaDB 3.0:

  • MV on static and collection columns

See ScyllaDB MV documentation here
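
As a minimal sketch of the feature described above, the following CQL creates a view that re-keys a base table by a regular column. The table, view, and column names (users, users_by_email, and so on) are hypothetical and chosen only for illustration; they do not come from the release itself.

```cql
-- Hypothetical base table, for illustration only.
CREATE TABLE users (
    user_id int,
    email text,
    country text,
    PRIMARY KEY (user_id)
);

-- The view's primary key must contain every base primary key column,
-- and each view key column must be restricted with IS NOT NULL.
CREATE MATERIALIZED VIEW users_by_email AS
    SELECT user_id, email, country
    FROM users
    WHERE email IS NOT NULL AND user_id IS NOT NULL
    PRIMARY KEY (email, user_id);

-- Query the view by its own partition key.
SELECT * FROM users_by_email WHERE email = 'alice@example.com';
```

Writes go to the base table only; the view is kept up to date by the database. Build progress for a view created over existing data can be checked with nodetool viewbuildstatus, as listed above.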

Global Secondary Index (GSI)
Unlike Apache Cassandra, ScyllaDB’s Secondary Indexes are global and based on MV. This means every secondary index creates a materialized view under the hood, using all the columns of the original base table’s primary key, and the required indexed columns. With ScyllaDB 3.0, GSI is production ready; it was experimental in previous versions (since 2.2).

The following SI functions are not available in ScyllaDB 3.0:

  • Indexing of static and collection columns (same as for MV above)
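
A short sketch of a global secondary index on a regular column follows. The table and index names are hypothetical and used only for illustration.

```cql
-- Hypothetical table, for illustration only.
CREATE TABLE users (
    user_id int,
    email text,
    country text,
    PRIMARY KEY (user_id)
);

-- Index a regular (non-key) column. Per the description above,
-- ScyllaDB backs the index with a materialized view under the hood.
CREATE INDEX users_by_country ON users (country);

-- The indexed column can now be restricted without the partition key.
SELECT * FROM users WHERE country = 'FR';
```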

Hinted Handoff
Hinted Handoff (see ScyllaDB 2.1) was experimental in previous versions and is now production ready.

In case of a temporary node outage or unavailability, ScyllaDB stores hints in commitlog-like files, scoped per destination node and per shard. More on this may be found in this blog and our documentation.

New File Format
ScyllaDB Open Source 3.0 includes an Apache Cassandra 3.x file format (mc) which is more efficient and requires less disk space than the ScyllaDB 2.x and Apache Cassandra 2.x format (la). Note that ScyllaDB Open Source 3.0 is still able to read the old file format.

The new disk format is disabled by default, allowing a controlled move to the new format following an upgrade. It is enabled by setting the new enable_sstables_mc_format parameter in scylla.yaml. More on the new and old file formats here.

Full (multi-partition) scan improvement (#1865)
Full scan is a common use case of analytics where one needs to query data *without* a key. In ScyllaDB 3.0, full table scans are significantly improved. More on full scans in ScyllaDB Open Source 3.0 here.

CQL: Enable ALLOW FILTERING for regular and primary key columns #2025
ScyllaDB now permits ALLOW FILTERING to minimize the data returned to a client based on filtering criteria. Filtering can also be used in conjunction with secondary indexes to create more sophisticated queries. However, filtering should be used with caution, as it can hamper overall database performance. Read more on the topic in this blog.

For example:

CREATE TABLE t (p int, c int, v int, PRIMARY KEY (p, c));
SELECT * FROM t WHERE c = 0 AND v = 1 ALLOW FILTERING;
SELECT * FROM t WHERE p = 0 AND v = 1 ALLOW FILTERING;
SELECT * FROM t WHERE p = 0 AND c = 0 AND v = 1 ALLOW FILTERING;

Tooling:

  • Move to Cassandra 3.11 Java tools, including nodetool and cassandra-stress

  • Move to node_exporter 0.17

Significant improvements to ScyllaDB streaming reduce the time it takes to repair, add a node to a cluster, and perform other cluster operations that use streaming under the hood. More here.

Metrics Updates from ScyllaDB 2.3 to ScyllaDB 3.0

The ScyllaDB Grafana Monitoring project now includes a ScyllaDB 3.0 dashboard on branch-2.0.

In particular, look for the new Materialized View panels.

See here for all metrics changed in 3.0.

Known Issues

  • In rare cases on large machines, ScyllaDB may not start. The last log message will be “Completed migration of legacy schema tables”. In these cases, issue the command

    > systemctl restart scylla-server

    to cause ScyllaDB to start again #4096

  • A few MV and SI backpressure issues are pushed to 3.1 #4090
    View writes are not totally isolated from normal writes, potentially causing the latter to time out under some workloads, such as batch writes to tables with MV or SI.

  • Using the new Filtering feature in combination with LIMIT applies the limit per page instead of globally. This means a request might get more values in the response than requested #4100
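
For illustration, a query of the following shape is the kind affected by the LIMIT issue above. The table definition is hypothetical and included only to make the sketch self-contained.

```cql
-- Hypothetical table, for illustration only.
CREATE TABLE t (p int, c int, v int, PRIMARY KEY (p, c));

-- LIMIT combined with ALLOW FILTERING. Per issue #4100, the limit is
-- applied per page in 3.0, so a paged result may contain more than
-- 10 rows in total.
SELECT * FROM t WHERE v = 1 LIMIT 10 ALLOW FILTERING;
```

Until the issue is resolved, a client that needs an exact row count would have to enforce the limit itself when consuming the result.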