ScyllaDB Enterprise Release 2019.1.0

The ScyllaDB team is pleased to announce the release of ScyllaDB Enterprise 2019.1.0, a production-ready ScyllaDB Enterprise major release.

The ScyllaDB Enterprise 2019.1 release is based on ScyllaDB Open Source 3.0, inheriting features from that release, and also add new features which are only offered to our ScyllaDB Enterprise customers. ScyllaDB Enterprise 2019.1 includes backported bug fixes from upstream releases and master.

Related Links

ScyllaDB Enterprise customers are encouraged to upgrade to ScyllaDB Enterprise 2019.1, and are welcome to contact our Support Team with questions.

New Distribution Support

ScyllaDB Enterprise packages are newly available for:

  • Debian 9
  • Ubuntu 18.04

Deprecated

ScyllaDB Enterprise for:

  • Ubuntu 14
  • Debian 8

Note that ScyllaDB Enterprise 2019.1 is not supported Ubuntu 14 and Debian 8. If you are using Ubuntu 14, or Debian 8 please contact ScyllaDB support for help to upgrade to a later OS release.

Additional AWS EC2 instance type

ScyllaDB Enterprise 2019.1 AMI is optimized to work with AWS EC2 i3 instances, now including i3.metal #620

New features In 2019.1

ScyllaDB 2019.1 will include all the ScyllaDB Enterprise features of previous releases, including Auditing and In-Memory Tables, in addition, it will include the following, new, ScyllaDB Enterprise only features:

Workload Prioritization – Technology Preview

ScyllaDB Enterprise will enable its users to safely prioritize different workloads. For example, a user may balance real-time operational workloads with big-data analytical workloads within a single database cluster. Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP) have very different data access patterns and characteristics. OLTP involves many small and varied transactions, including mixed writes, updates, and reads, with a high sensitivity to latency. In contrast, OLAP emphasizes the throughput of broad scans across datasets. By introducing capabilities that isolate workloads, ScyllaDB Enterprise will uniquely support simultaneous OLTP and OLAP workloads without sacrificing latency or throughput. Note that workload prioritization can be used to differentiate between any two or more different workloads (for example, different OLTP workloads with different priorities or SLAs).

Technology Preview features are product innovations that require real-world field use, customer validation, and feedback to ensure their functionality, quality and performance meet customer expectations. Your feedback will help influence their development and maturation. We caution users against enabling them in a production environment without significant internal testing and validation for their use case and workload.

Features Inherited From ScyllaDB Open Source

The following ScyllaDB Open Source features are now part of the Enterprise branch and are included with the ScyllaDB 2019.1 release:

Materialized Views (MV)

In ScyllaDB Enterprise 2019.1, MV is production ready (graduated from a long incubation in ScyllaDB open Source) and feature-compatible with Apache Cassandra 3.0, including:

  • Creating a MV based on any subset of columns of the base table, including the primary key columns
  • Updating a MV for base table DELETE or UPDATE.
  • Indexing of existing data when creating an MV
  • Support for MV hinted handoff
  • Topology changes, and migration of MV token ranges
  • Sync of TTL between a base table and an MV table
  • nodetool viewbuildstatus
  • Unselected columns keep MV row alive #3362, CASSANDRA-13826

The following MV functions are not available in ScyllaDB Enterprise 2019.1:

MV on static and collection columns

See ScyllaDB MV documentation here

Global Secondary Index (GSI)

Unlike Apache Cassandra, ScyllaDB’s Secondary Indexes are global and based on MV. This means every secondary index creates a materialized view under the hood, using all the columns of the original base table’s primary key, and the required indexed columns. This feature is production ready in ScyllaDB Enterprise 2019.1

See ScyllaDB SI documentation here

CQL: Enable ALLOW FILTERING for regular and primary key columns

ScyllaDB now permits ALLOW FILTERING to minimize data returned to a client based on filtering criteria. Filtering can also be used in conjunction with secondary indexes to create more sophisticated queries. However, filtering is to be used with caution, as it can hamper overall database performance. Read more on the topic in this blog.

For example:

CREATE TABLE t (p int, c int, v int, PRIMARY KEY (p, c));
select * from t where c = 0 and v = 1 allow filtering;
select * from t where p = 0 and v = 1 allow filtering;
select * from t where p = 0 and c = 0 and v = 1 allow filtering;

Support multi-column restrictions in ALLOW FILTERING

For example:

CREATE TABLE t (a int, b int, c int, d int, e int, PRIMARY KEY ((a, b), c, d));
SELECT * FROM t WHERE (c, d) = (1, 2) ALLOW FILTERING;
SELECT * FROM t WHERE (c, d) IN ((1, 2), (1,3), (1,4)) ALLOW FILTERING;
SELECT * FROM t WHERE (c, d) < (1, 3) ALLOW FILTERING;

Support for collections in ALLOW FILTERING

Filtering can also be used to find an element in collections:

CREATE TABLE t (p frozen<map<text, text>>, c1 frozen<list>, c2 frozen<set>, v map<text, text>, id int, PRIMARY KEY(p, c1, c2));CREATE TABLE t (p frozen<map<text, text>>, c1 frozen<list>, c2 frozen<set>, v map<text, text>, id int, PRIMARY KEY(p, c1, c2));
SELECT id FROM t WHERE c1 CONTAINS 3 ALLOW FILTERING;
SELECT id FROM t WHERE p CONTAINS KEY 'a' ALLOW FILTERING;
SELECT id FROM t WHERE v CONTAINS KEY 'y1' AND c2 CONTAINS 3 ALLOW FILTERING;

PER PARTITION LIMIT

Limit the number of returned rows per partition. For example, the following will return two rows per partition:

SELECT * FROM users PER PARTITION LIMIT 2;

While the previously exist LIMIT was applied to the global number of rows. The two limits can be combined:

SELECT * FROM users PER PARTITION LIMIT 1 LIMIT 10

Hinted Handoff

Hinted Handoffs are now available in ScyllaDB Enterprise 2019.1 as a production ready feature. In case of temporary node outage or unavailability, ScyllaDB stores hints in a commitlog-like files. The hints are used to update the node efficiently when it becomes available again. More on this may be found in this blog and our documentation.

New File Format (SSTable 3.0 “mc” format)

ScyllaDB Open Source 3.0 includes an Apache Cassandra 3.x file format (mc) which is more efficient and requires less disk space than the ScyllaDB 2.x and Apache Cassandra 2.x format (la). Note that ScyllaDB Open Source 3.0 is still able to read the old SSTable file format.

Note that the new SSTable file format is disabled by default, allowing a controlled upgrade to the new format following an upgrade. It is enabled by setting the new enable_sstables_mc_format parameter in scylla.yaml. More on the new and old file formats here.

Note that some features of the mc SSTable file format are not backward compatible with older formats, meaning that once the whole cluster is upgraded to 2019.1 and the new SSTable file format is enabled and used for storing new data, downgrade might not be possible without data loss.

Full (multi-partition) scan improvement

Full scan is a common use case of analytics where one needs to query data *without* a key. In ScyllaDB 3.0, full table scans are significantly improved. More on full scans in ScyllaDB Open Source 3.0 here.

Role Based Access Control (RBAC)

Role Based Access Control (RBAC) is a method of reducing lists of authorized users to a few roles assigned to multiple users. RBAC is sometimes referred to as role-based security. RBAC is used for Workload Prioritization (see above)

GoogleCloudSnitch

Use the GoogleCloudSnitch for deploying ScyllaDB on the Google Cloud Engine (GCE) platform across one or more regions. The region is treated as a data center and the availability zones are treated as racks within the datacenter. All communication occurs over private IP addresses within the same logical network.

To use the GoogleCloudSnitch, add the snitch to the scylla.yaml file which is located under /etc/scylla/ for all nodes in the cluster.

Tooling

Large Partitions

ScyllaDB now identifies large partitions and makes the information available in a system table. This allows you to identify large partitions, giving you the ability to fix them. The threshold for large partitions can be set in scylla.yaml:

compaction_large_partition_warning_threshold_mb parameter (default 100MB)

Example:

SELECT * FROM system.large_partitions;

iotune v2

Iotune is a storage benchmarking tool that runs as part of the scylla_setup script. Iotune runs a short benchmark on the ScyllaDB storage and uses the results to set the ScyllaDB io_properties.yaml configuration file (formerly called io.conf). ScyllaDB uses these settings to optimize I/O performance, specifically through setting max storage bandwidth and max concurrent requests.The new iotune output matches the IO scheduler configuration, is time-limited (2 minutes) and produces more consistent results than the previous version.

scyllarepair – deprecated

ScyllaDBrepair, a script for running repairs is deprecated and will not be part of the following releases. Instead, ScyllaDB Manager is the preferred tool for cluster-wide operations and repair scheduling

Other tooling:

  • Tracing: Added prepared statement parameters #1657

CQL

Datetime Functions Support

ScyllaDB supports Cassandra-style datetime functions, including currentTimestamp, currentDate, currentTime, and currentTimeUUID.

JSON support

Support for the Javascript Object Notation (JSON) format, including inserting JSON documents, retrieving data in JSON and providing helper functions to transform native CQL types into JSON and vice versa. Schemas are still enforced for all operations — one cannot just insert random JSON documents into a table. The new API is simply a convenient way of working with JSON without having to convert everything back and forth on the client-side.

Other CQL:

  • Different timeouts for reads and range scans can now be set #3013
  • TWCS support for using millisecond values in timestamps #3152 (used by KairosDB and others)
  • Support for timeuuid functions: now, currentTimeUUID (alias of now), minTimeuuid and maxTimeuuid #2950

Performance Improvements

Streaming Improvements

Significant improvements, reducing the time it takes to repair, add a node to a cluster and other cluster operations which use streaming under the hood. More here.

Row-level cache eviction

The cache is capable of freeing individual rows to satisfy memory reclamation requests. Rows are freed starting from the least recently used ones, with insertion counting as a use. More here.

CPU Scheduler and Compaction Controller for Size Tiered Compaction Strategy (STCS)

With ScyllaDB’s thread-per-core architecture, many internal workloads are multiplexed on a single thread. These internal workloads include compaction, flushing memtables, serving user reads and writes, and streaming. The CPU scheduler isolates these workloads from each other, preventing, for example, a compaction using all of the CPU and preventing normal read and write traffic from using its fair share. The CPU scheduler complements the I/O scheduler which solves the same problem for disk I/O. Together, these two are the building blocks for the compaction controller.

Other performance improvements:

  • Dynamic controllers for compaction strategies: Size Tiered Compaction Strategy, Leveled Compaction Strategy and Time Window Compaction Strategy.
  • Enhanced tagging for scheduling groups to improve performance isolation
  • Stateful paging to improve the performance of partition range scans queries. #1865. More here
  • Improve row digest hash #2884 – ScyllaDB moved from md5 hashes to xxHash, which improved performance significantly. Mean latency on reads improved 23%. Write latencies improved by 18%.
  • Improving the performance of promoted index for wide partitions, by reducing their memory footprint #2981
  • Size-based sampling rate in SSTable summary files – automatically tune

Known issue

During the upgrade or right after you might see error messages similar to:

[shard X] storage_proxy - Exception when communicating with : std::runtime_error (Failed to load schema version 2ccf9826-0fc5-37f6-9092-dcbabef2bfcd)

Those failures are transient and not problematic. The message appears once and then the schemas synchronize, there is no impact.