SSTable

SSTable Definition

Sorted Strings Table (SSTable) is a persistent file format used by ScyllaDB, Apache Cassandra, and other NoSQL databases to take the in-memory data stored in memtables, order it for fast access, and store it on disk in a persistent, ordered, immutable set of files. Immutable means SSTables are never modified. They are later merged into new SSTables or deleted as data is updated.

See the discussion of SSTables and LSM-Trees in Martin Kleppmann’s book Designing Data-Intensive Applications.

Diagram showing data simultaneously moving to Commit log and Memtable. The Memtable is periodically flushed to SSTable for storage.

SSTable FAQs

What Is SSTable In Cassandra?

Sorted Strings Table (SSTable) is a file format used by Apache Cassandra, ScyllaDB, and other NoSQL databases when memtables are flushed from memory to durable storage. ScyllaDB has always tried to maintain compatibility with Apache Cassandra, and file formats are no exception. SSTables are saved as a persistent, ordered, immutable set of files on disk. They are created by a memtable flush and deleted by compaction.

What is the Use of SSTable in Cassandra?

The purpose of a database is to persistently and efficiently store data. That storage needs to be durable, so that the data isn’t lost when the system is shut down or an error occurs. Keeping all the data only in memory would be fast, but not durable. Writing every update to storage immediately would be very slow and inefficient at scale.

This is where we need to understand Memtable vs. SSTable. When data is committed, ScyllaDB or Cassandra stores the changes in a commitlog, which is a file that only allows appending, so writes are quick. Simultaneously the data is written to an in-memory cache of key/column data called a memtable. Periodically the memtable is flushed to persistent storage in the form of SSTables on disk. SSTables in Cassandra or ScyllaDB serve as the building blocks of the total data stored in the database. SSTables are immutable, so updates to data create a new SSTable file instead of changing the existing ones.
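
The write path described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how ScyllaDB or Cassandra implement it: the class name TinyStore, the JSON-lines file layout, and the flush_threshold parameter are all hypothetical and chosen only for illustration.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy write path: commit log append, memtable update, periodic flush."""

    def __init__(self, directory, flush_threshold=3):
        self.directory = directory
        self.flush_threshold = flush_threshold  # hypothetical: flush after N keys
        self.memtable = {}
        self.sstable_paths = []
        self.commitlog_path = os.path.join(directory, "commitlog.jsonl")

    def write(self, key, value):
        # 1. Append to the commit log first so the write survives a crash.
        with open(self.commitlog_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"key": key, "value": value}) + "\n")
        # 2. Apply the same change to the in-memory memtable.
        self.memtable[key] = value
        # 3. Flush once the memtable reaches the threshold.
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        if not self.memtable:
            return None
        name = f"sstable-{len(self.sstable_paths) + 1}.jsonl"
        path = os.path.join(self.directory, name)
        # Rows are written in key order; the file is never modified afterwards.
        with open(path, "w", encoding="utf-8") as out:
            for key in sorted(self.memtable):
                row = {"key": key, "value": self.memtable[key]}
                out.write(json.dumps(row) + "\n")
        self.sstable_paths.append(path)
        self.memtable = {}
        # In this toy model the flushed entries are no longer needed for
        # recovery, so the log is simply truncated.
        open(self.commitlog_path, "w").close()
        return path


def read_sstable(path):
    """Load every row of a flushed file, in the order it was written."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f]


store = TinyStore(tempfile.mkdtemp())
for key, value in [("cherry", 3), ("apple", 1), ("banana", 2)]:
    store.write(key, value)
print([row["key"] for row in read_sstable(store.sstable_paths[0])])
# ['apple', 'banana', 'cherry']
```

The third write reaches the threshold and triggers a flush, so the keys come back in sorted order regardless of the order in which they were written.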

For each SSTable the database creates an index file and a data file. The index file helps locate data faster in the sorted data file.
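
To illustrate why an index speeds up lookups in a sorted data file, here is a minimal sketch that uses in-memory byte buffers in place of real files. The function names, the tab-separated record layout, and the index_every parameter are assumptions made for this example; the actual on-disk index format is more elaborate.

```python
import bisect
import io


def build_sstable(rows, index_every=2):
    """Write sorted rows to a data buffer and record a sparse index of offsets."""
    data = io.BytesIO()
    index = []  # (key, byte offset of that key's record in the data buffer)
    for i, (key, value) in enumerate(sorted(rows.items())):
        if i % index_every == 0:
            index.append((key, data.tell()))
        data.write(f"{key}\t{value}\n".encode("utf-8"))
    return data.getvalue(), index


def lookup(data, index, key):
    """Jump to the nearest indexed offset at or before key, then scan forward."""
    keys = [k for k, _ in index]
    pos = bisect.bisect_right(keys, key) - 1
    if pos < 0:
        return None  # key sorts before the first record
    stream = io.BytesIO(data)
    stream.seek(index[pos][1])
    for line in stream:
        k, v = line.decode("utf-8").rstrip("\n").split("\t", 1)
        if k == key:
            return v
        if k > key:
            return None  # passed the place where the key would be stored
    return None


data, index = build_sstable({"d": "4", "a": "1", "c": "3", "b": "2", "e": "5"})
print(lookup(data, index, "d"))  # 4
print(lookup(data, index, "bb"))  # None
```

Because the data is sorted, the index only needs an entry for every few keys: a binary search over the index narrows the search to a small region, and a short forward scan finishes the lookup.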

SSTables are the on-disk component of a Log-Structured Merge (LSM) tree. This structure is more efficient for write-heavy, fast-growing, extremely large data sets than a traditional B-tree (pronounced “Bee tree”).

When Does SSTable Compaction Occur?

As data continues to be written and updated, more immutable SSTable files are created. So the same record, with different versions of the data, may be found across the different SSTable files on disk. The system understands which of these records are the most current, and only responds to query requests with the latest version. However, without some way to remove the outdated ones, the SSTable count and data volume stored would get very high, and the disks would fill up.

Compaction is a process that writes a whole new SSTable file using data found across the existing SSTables. When the same key appears in several SSTables, the process discards the obsolete versions and writes only the most current one. Deleted rows (indicated by a marker called a tombstone) and entire deleted columns are also cleaned up, and the process creates a new index for the compacted SSTable file.
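
The merge at the heart of compaction can be sketched as follows. This is a simplified illustration, not the algorithm used by either database: the (key, timestamp, value) tuple layout and the use of None as a tombstone marker are assumptions made for the example. It also drops tombstones immediately, whereas real systems typically keep them for a grace period before purging.

```python
import heapq

TOMBSTONE = None  # hypothetical marker for a deleted row


def compact(sstables):
    """Merge sorted SSTables into one, keeping the newest version of each key.

    Each SSTable is a list of (key, timestamp, value) tuples sorted by key.
    """
    merged = heapq.merge(*sstables, key=lambda row: row[0])
    result = []
    current = None  # newest version seen so far for the key being processed
    for row in merged:
        if current is not None and row[0] != current[0]:
            # Finished a key: keep it unless its newest version is a deletion.
            if current[2] is not TOMBSTONE:
                result.append(current)
            current = None
        if current is None or row[1] > current[1]:
            current = row
    if current is not None and current[2] is not TOMBSTONE:
        result.append(current)
    return result


older = [("a", 1, "x"), ("b", 1, "y"), ("c", 1, "z")]
newer = [("a", 2, "x2"), ("b", 3, TOMBSTONE)]
print(compact([older, newer]))
# [('a', 2, 'x2'), ('c', 1, 'z')]
```

Key "a" keeps only its newer value, key "b" disappears because its latest version is a tombstone, and key "c" is copied unchanged. Since every input is already sorted, the merge is a single sequential pass over the input files.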

Are SSTables Used in BigTable?

Yes. Google’s proprietary Cloud Bigtable data storage system also uses SSTables internally.

Are ScyllaDB SSTables Different from Cassandra SSTables?

ScyllaDB supports the same SSTable format as Apache Cassandra. The current SSTable file format used in both ScyllaDB and Cassandra has the extension .md. SSTables can be relocated from a Cassandra data directory into a ScyllaDB data directory.

ScyllaDB maintains a larger number of smaller SSTables than Cassandra. In ScyllaDB, each CPU core manages its own subset of SSTables, corresponding to the data partitions assigned to that core. This internal sharding means cores do not compete with one another to read and write the same data partitions.

Are ScyllaDB SSTables Different from DataStax SSTables?

DataStax Enterprise (DSE) uses its own proprietary SSTable file format. Whereas ScyllaDB and Apache Cassandra both use the “.md” file format, DataStax uses a proprietary .bti format. Files migrated into this proprietary file format cannot be shared with Apache Cassandra or ScyllaDB.

What Are Strategies for ScyllaDB SSTable Compaction?

ScyllaDB supports several strategies for compacting SSTables, ranging from size-based (Size Tiered Compaction Strategy, or STCS) to time-based (Time Window Compaction Strategy or TWCS) to an incremental, consistently-compacting strategy (Incremental Compaction Strategy, or ICS), which can be tailored to the needs of the application.

Incremental Compaction Strategy (ICS) is unique to ScyllaDB Enterprise. It is not available in ScyllaDB Open Source or in any version of Cassandra. ICS makes more efficient use of persistent disk. Whereas SSTables using other compaction strategies require a reserve of 50% of free disk space to accommodate compactions, with ICS you can use up to about 85% of your disk, and only need about 15% of free disk space to conduct compactions. This lowers overall costs and improves the utilization of your hardware resources.
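
As a quick worked example of the percentages above, the following sketch computes how much of a disk can hold data under each free-space reserve. The 1,000 GB disk size and the function name are hypothetical.

```python
def usable_capacity_gb(disk_gb, free_reserve_percent):
    """Space left for data after setting aside free space for compaction."""
    return disk_gb * (100 - free_reserve_percent) / 100


disk_gb = 1000  # hypothetical node with a 1,000 GB data disk
print(usable_capacity_gb(disk_gb, 50))  # 500.0 with a 50% reserve
print(usable_capacity_gb(disk_gb, 15))  # 850.0 with a 15% reserve
```

On the same hardware, the smaller reserve leaves 350 GB more room for data.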

How Large is an SSTable in Cassandra?

The initial size of an SSTable is related to the size of the memtable it came from. The initial maximum size depends on your memtable settings and the size of your heap. The memtable_total_space_in_mb setting affects the size allocated. Size on disk can be reduced by using SSTable compression, which uses a block-based compression scheme.
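
The idea behind block-based compression can be sketched with Python's standard zlib module. This is an illustration of the general technique only: the function names and the offset list are assumptions, and the tiny 64-byte block size is chosen to keep the example small, not to reflect any real setting.

```python
import zlib


def compress_blocks(data, block_size=64):
    """Compress fixed-size blocks independently and record each block's offset."""
    blocks = []
    offsets = []  # byte offset of each compressed block in the output
    position = 0
    for start in range(0, len(data), block_size):
        compressed = zlib.compress(data[start:start + block_size])
        offsets.append(position)
        blocks.append(compressed)
        position += len(compressed)
    return b"".join(blocks), offsets


def read_byte_range(compressed, offsets, block_size, start, length):
    """Return data[start:start + length], decompressing only the blocks touched."""
    if length <= 0 or not offsets:
        return b""
    first = start // block_size
    last = min((start + length - 1) // block_size, len(offsets) - 1)
    out = bytearray()
    for i in range(first, last + 1):
        begin = offsets[i]
        end = offsets[i + 1] if i + 1 < len(offsets) else len(compressed)
        out += zlib.decompress(compressed[begin:end])
    skip = start - first * block_size
    return bytes(out[skip:skip + length])


data = bytes(range(256)) * 4
compressed, offsets = compress_blocks(data, block_size=64)
assert read_byte_range(compressed, offsets, 64, 100, 50) == data[100:150]
```

Compressing each block on its own means a read only has to decompress the blocks that contain the requested bytes, rather than the whole file.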

How to Resolve a Corrupted SSTable?

SSTable files occasionally become corrupted. With a minimum replication factor (RF) of 3, the operator takes the node with the corrupt SSTable offline. The operator then runs the sstablescrub tool against the offline node. If that tool is unable to delete all the corrupt SSTable files, the files can be manually removed from the file system. At this point, a repair operation is run on the affected column family. If a corrupted SSTable cannot be successfully scrubbed and repaired, the operator might need to restore the file from the latest stored backup, after which the repair operation is run again.

Does ScyllaDB Offer Solutions for Managing SSTables?

ScyllaDB is an open source wide-column NoSQL database. It is completely compatible with the Cassandra SSTable format, and it increases performance by using more and smaller SSTables with a shard-per-core design to reduce competition for resources.

ScyllaDB’s approach to NoSQL data store design is optimized for modern hardware. ScyllaDB Enterprise, the commercial version of ScyllaDB, also provides Incremental Compaction Strategy (ICS) for SSTables, making it far more efficient with storage resources. ScyllaDB also offers ScyllaDB Manager, a tool for automatically repairing and backing up SSTables. It is free for all ScyllaDB Enterprise users and for users of ScyllaDB Open Source NoSQL Database for clusters of up to five nodes.
