Scylla Enterprise Release 2019.1.0

The Scylla team is pleased to announce the release of Scylla Enterprise 2019.1.0, a production-ready Scylla Enterprise major release.

The Scylla Enterprise 2019.1 release is based on Scylla Open Source 3.0, inheriting features from that release, and also add new features which are only offered to our Scylla Enterprise customers. Scylla Enterprise 2019.1 includes backported bug fixes from upstream releases and master.

Related Links

Scylla Enterprise customers are encouraged to upgrade to Scylla Enterprise 2019.1, and are welcome to contact our Support Team with questions.

New Distribution Support

Scylla Enterprise packages are newly available for:

  • Debian 9
  • Ubuntu 18.04

Deprecated

Scylla Enterprise for:

  • Ubuntu 14
  • Debian 8

Note that Scylla Enterprise 2019.1 is not supported Ubuntu 14 and Debian 8. If you are using Ubuntu 14, or Debian 8 please contact Scylla support for help to upgrade to a later OS release.

Additional AWS EC2 instance type

Scylla Enterprise 2019.1 AMI is optimized to work with AWS EC2 i3 instances, now including i3.metal #620

New features In 2019.1

Scylla 2019.1 will include all the Scylla Enterprise features of previous releases, including Auditing and In-Memory Tables, in addition, it will include the following, new, Scylla Enterprise only features:

Workload Prioritization – Technology Preview

Scylla Enterprise will enable its users to safely prioritize different workloads. For example, a user may balance real-time operational workloads with big-data analytical workloads within a single database cluster. Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP) have very different data access patterns and characteristics. OLTP involves many small and varied transactions, including mixed writes, updates, and reads, with a high sensitivity to latency. In contrast, OLAP emphasizes the throughput of broad scans across datasets. By introducing capabilities that isolate workloads, Scylla Enterprise will uniquely support simultaneous OLTP and OLAP workloads without sacrificing latency or throughput. Note that workload prioritization can be used to differentiate between any two or more different workloads (for example, different OLTP workloads with different priorities or SLAs).

Technology Preview features are product innovations that require real-world field use, customer validation, and feedback to ensure their functionality, quality and performance meet customer expectations. Your feedback will help influence their development and maturation. We caution users against enabling them in a production environment without significant internal testing and validation for their use case and workload.

Features Inherited From Scylla Open Source

The following Scylla Open Source features are now part of the Enterprise branch and are included with the Scylla 2019.1 release:

Materialized Views (MV)

In Scylla Enterprise 2019.1, MV is production ready (graduated from a long incubation in Scylla open Source) and feature-compatible with Apache Cassandra 3.0, including:

  • Creating a MV based on any subset of columns of the base table, including the primary key columns
  • Updating a MV for base table DELETE or UPDATE.
  • Indexing of existing data when creating an MV
  • Support for MV hinted handoff
  • Topology changes, and migration of MV token ranges
  • Sync of TTL between a base table and an MV table
  • nodetool viewbuildstatus
  • Unselected columns keep MV row alive #3362, CASSANDRA-13826

The following MV functions are not available in Scylla Enterprise 2019.1:

MV on static and collection columns

See Scylla MV documentation here

Global Secondary Index (GSI)

Unlike Apache Cassandra, Scylla’s Secondary Indexes are global and based on MV. This means every secondary index creates a materialized view under the hood, using all the columns of the original base table’s primary key, and the required indexed columns. This feature is production ready in Scylla Enterprise 2019.1

See Scylla SI documentation here

CQL: Enable ALLOW FILTERING for regular and primary key columns

Scylla now permits ALLOW FILTERING to minimize data returned to a client based on filtering criteria. Filtering can also be used in conjunction with secondary indexes to create more sophisticated queries. However, filtering is to be used with caution, as it can hamper overall database performance. Read more on the topic in this blog.

For example:

CREATE TABLE t (p int, c int, v int, PRIMARY KEY (p, c));
select * from t where c = 0 and v = 1 allow filtering;
select * from t where p = 0 and v = 1 allow filtering;
select * from t where p = 0 and c = 0 and v = 1 allow filtering;

Support multi-column restrictions in ALLOW FILTERING

For example:

CREATE TABLE t (a int, b int, c int, d int, e int, PRIMARY KEY ((a, b), c, d));
SELECT * FROM t WHERE (c, d) = (1, 2) ALLOW FILTERING;
SELECT * FROM t WHERE (c, d) IN ((1, 2), (1,3), (1,4)) ALLOW FILTERING;
SELECT * FROM t WHERE (c, d) < (1, 3) ALLOW FILTERING;

Support for collections in ALLOW FILTERING

Filtering can also be used to find an element in collections:

CREATE TABLE t (p frozen<map<text, text>>, c1 frozen<list>, c2 frozen<set>, v map<text, text>, id int, PRIMARY KEY(p, c1, c2));CREATE TABLE t (p frozen<map<text, text>>, c1 frozen<list>, c2 frozen<set>, v map<text, text>, id int, PRIMARY KEY(p, c1, c2));
SELECT id FROM t WHERE c1 CONTAINS 3 ALLOW FILTERING;
SELECT id FROM t WHERE p CONTAINS KEY 'a' ALLOW FILTERING;
SELECT id FROM t WHERE v CONTAINS KEY 'y1' AND c2 CONTAINS 3 ALLOW FILTERING;

PER PARTITION LIMIT

Limit the number of returned rows per partition. For example, the following will return two rows per partition:

SELECT * FROM users PER PARTITION LIMIT 2;

While the previously exist LIMIT was applied to the global number of rows. The two limits can be combined:

SELECT * FROM users PER PARTITION LIMIT 1 LIMIT 10

Hinted Handoff

Hinted Handoffs are now available in Scylla Enterprise 2019.1 as a production ready feature. In case of temporary node outage or unavailability, Scylla stores hints in a commitlog-like files. The hints are used to update the node efficiently when it becomes available again. More on this may be found in this blog and our documentation.

New File Format (SSTable 3.0 “mc” format)

Scylla Open Source 3.0 includes an Apache Cassandra 3.x file format (mc) which is more efficient and requires less disk space than the Scylla 2.x and Apache Cassandra 2.x format (la). Note that Scylla Open Source 3.0 is still able to read the old SSTable file format.

Note that the new SSTable file format is disabled by default, allowing a controlled upgrade to the new format following an upgrade. It is enabled by setting the new enable_sstables_mc_format parameter in scylla.yaml. More on the new and old file formats here.

Note that some features of the mc SSTable file format are not backward compatible with older formats, meaning that once the whole cluster is upgraded to 2019.1 and the new SSTable file format is enabled and used for storing new data, downgrade might not be possible without data loss.

Full (multi-partition) scan improvement

Full scan is a common use case of analytics where one needs to query data *without* a key. In Scylla 3.0, full table scans are significantly improved. More on full scans in Scylla Open Source 3.0 here.

Role Based Access Control (RBAC)

Role Based Access Control (RBAC) is a method of reducing lists of authorized users to a few roles assigned to multiple users. RBAC is sometimes referred to as role-based security. RBAC is used for Workload Prioritization (see above)

GoogleCloudSnitch

Use the GoogleCloudSnitch for deploying Scylla on the Google Cloud Engine (GCE) platform across one or more regions. The region is treated as a data center and the availability zones are treated as racks within the datacenter. All communication occurs over private IP addresses within the same logical network.

To use the GoogleCloudSnitch, add the snitch to the scylla.yaml file which is located under /etc/scylla/ for all nodes in the cluster.

Tooling

Large Partitions

Scylla now identifies large partitions and makes the information available in a system table. This allows you to identify large partitions, giving you the ability to fix them. The threshold for large partitions can be set in scylla.yaml:

compaction_large_partition_warning_threshold_mb parameter (default 100MB)

Example:

SELECT * FROM system.large_partitions;

iotune v2

Iotune is a storage benchmarking tool that runs as part of the scylla_setup script. Iotune runs a short benchmark on the Scylla storage and uses the results to set the Scylla io_properties.yaml configuration file (formerly called io.conf). Scylla uses these settings to optimize I/O performance, specifically through setting max storage bandwidth and max concurrent requests.The new iotune output matches the IO scheduler configuration, is time-limited (2 minutes) and produces more consistent results than the previous version.

scyllarepair – deprecated

Scyllarepair, a script for running repairs is deprecated and will not be part of the following releases. Instead, Scylla Manager is the preferred tool for cluster-wide operations and repair scheduling

Other tooling:

  • Tracing: Added prepared statement parameters #1657

CQL

Datetime Functions Support

Scylla supports Cassandra-style datetime functions, including currentTimestamp, currentDate, currentTime, and currentTimeUUID.

JSON support

Support for the Javascript Object Notation (JSON) format, including inserting JSON documents, retrieving data in JSON and providing helper functions to transform native CQL types into JSON and vice versa. Schemas are still enforced for all operations — one cannot just insert random JSON documents into a table. The new API is simply a convenient way of working with JSON without having to convert everything back and forth on the client-side.

Other CQL:

  • Different timeouts for reads and range scans can now be set #3013
  • TWCS support for using millisecond values in timestamps #3152 (used by KairosDB and others)
  • Support for timeuuid functions: now, currentTimeUUID (alias of now), minTimeuuid and maxTimeuuid #2950

Performance Improvements

Streaming Improvements

Significant improvements, reducing the time it takes to repair, add a node to a cluster and other cluster operations which use streaming under the hood. More here.

Row-level cache eviction

The cache is capable of freeing individual rows to satisfy memory reclamation requests. Rows are freed starting from the least recently used ones, with insertion counting as a use. More here.

CPU Scheduler and Compaction Controller for Size Tiered Compaction Strategy (STCS)

With Scylla’s thread-per-core architecture, many internal workloads are multiplexed on a single thread. These internal workloads include compaction, flushing memtables, serving user reads and writes, and streaming. The CPU scheduler isolates these workloads from each other, preventing, for example, a compaction using all of the CPU and preventing normal read and write traffic from using its fair share. The CPU scheduler complements the I/O scheduler which solves the same problem for disk I/O. Together, these two are the building blocks for the compaction controller.

Other performance improvements:

  • Dynamic controllers for compaction strategies: Size Tiered Compaction Strategy, Leveled Compaction Strategy and Time Window Compaction Strategy.
  • Enhanced tagging for scheduling groups to improve performance isolation
  • Stateful paging to improve the performance of partition range scans queries. #1865. More here
  • Improve row digest hash #2884 – Scylla moved from md5 hashes to xxHash, which improved performance significantly. Mean latency on reads improved 23%. Write latencies improved by 18%.
  • Improving the performance of promoted index for wide partitions, by reducing their memory footprint #2981
  • Size-based sampling rate in SSTable summary files – automatically tune

Known issue

During the upgrade or right after you might see error messages similar to:

[shard X] storage_proxy - Exception when communicating with : std::runtime_error (Failed to load schema version 2ccf9826-0fc5-37f6-9092-dcbabef2bfcd)

Those failures are transient and not problematic. The message appears once and then the schemas synchronize, there is no impact.