Cassandra Compaction

Cassandra Compaction Definition

Optimized to rapidly write large amounts of data, Apache Cassandra places all incoming data into internal files called SSTables in an append-only manner. For this reason, SSTables hold both newly inserted data and updates or deletes of previously inserted data, and different SSTables can hold copies of the same data.

Apache Cassandra compaction is the process of reconciling different data copies stored in different SSTables. Cassandra compaction of SSTables is a crucial background activity for maintenance and performance. Compaction in Cassandra involves various techniques and different timing for performing different operations.

Routine compactions are important to a healthy Cassandra cluster, and Cassandra compaction configuration can vary significantly depending on a specific table’s uses. It is possible to change the compaction strategy in Cassandra after creating the table; however, since the data will then need to be rewritten with the new compaction strategy, this affects cluster performance in direct proportion to how much data the table holds. This is why identifying and implementing the most effective Cassandra compaction strategy early on is critical.

Watch How to Ruin Performance by Choosing the Wrong Compaction Strategy

Cassandra Compaction FAQs

What is Cassandra Compaction?

Cassandra compaction covers several kinds of operations, each taking a number of SSTables as input and producing new SSTables. Cassandra triggers minor compaction automatically. In a major compaction, a user runs a compaction across the whole node. In other words, all SSTables on the node are compacted. Finally, in a user-defined compaction, the user triggers compaction across a specific set of SSTables.

Choosing the compaction strategy that best fits the workload delivers the best performance, both for compaction itself and for querying.

Size Tiered Compaction Strategy (STCS) is the default Cassandra compaction strategy. It is triggered when the system has enough (four by default) similarly sized SSTables. STCS is ideal when the I/O from LCS (explained below) is too high, for time series workloads that are not purely time series and run on spinning disks, or as a fallback when other strategies don’t fit the workload. However, space amplification is a major problem for STCS and it is not as useful for similar sized workloads.
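To make the "enough similarly sized SSTables" trigger more concrete, here is a minimal Python sketch. It groups SSTable sizes into buckets and reports any bucket that reaches the threshold of four. The function names, the sample sizes, and the rule that a size is "similar" when it falls within 0.5x to 1.5x of the bucket average are assumptions made for illustration, not Cassandra's actual implementation.

```python
from typing import List


def stcs_buckets(sizes_mb: List[float], bucket_low: float = 0.5,
                 bucket_high: float = 1.5) -> List[List[float]]:
    """Group SSTable sizes so each bucket holds sizes close to its average.

    bucket_low and bucket_high are illustrative similarity bounds.
    """
    buckets: List[List[float]] = []
    for size in sorted(sizes_mb):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if avg * bucket_low <= size <= avg * bucket_high:
                bucket.append(size)
                break
        else:
            # No existing bucket is similar enough, so start a new one.
            buckets.append([size])
    return buckets


def compaction_candidates(sizes_mb: List[float],
                          min_threshold: int = 4) -> List[List[float]]:
    """Return the buckets that hold at least min_threshold SSTables."""
    return [b for b in stcs_buckets(sizes_mb) if len(b) >= min_threshold]


# Hypothetical SSTable sizes in MB: four small files, two medium, one large.
print(compaction_candidates([10, 11, 12, 13, 100, 105, 400]))
```

In this sample only the four small SSTables form a bucket large enough to compact; the medium and large files wait until more files of similar size accumulate.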

Leveled Compaction Strategy (LCS) is optimized for workloads with lots of deletes and updates, or read heavy workloads. With LCS, the system uses small, fixed-size (by default 160 MB) SSTables distributed across different levels. It is not ideal for immutable time series data. Although LCS solves the major problem with space-amplification that Size-Tiered Compaction Strategy experiences, it has its own problem that can cause both write and read performance to suffer: write-amplification.

Time Window Compaction Strategy (TWCS) is intended mostly for immutable time series data. It is similar to the deprecated DateTiered Compaction Strategy (DTCS).
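As a rough illustration of why TWCS suits time series data, the Python sketch below groups SSTables into fixed time windows by their newest timestamp, so that data written around the same time ends up together. The one-day window, the function names, and the sample SSTables are assumptions for illustration only.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def window_start(timestamp_s: int, window_size_s: int) -> int:
    """Round a timestamp down to the start of its time window."""
    return timestamp_s - (timestamp_s % window_size_s)


def group_by_window(sstables: List[Tuple[str, int]],
                    window_size_s: int = 86400) -> Dict[int, List[str]]:
    """Map each window start to the names of the SSTables that fall in it.

    Each SSTable is a (name, newest_timestamp_in_seconds) pair.
    """
    windows: Dict[int, List[str]] = defaultdict(list)
    for name, newest_ts in sstables:
        windows[window_start(newest_ts, window_size_s)].append(name)
    return dict(windows)


# Hypothetical SSTables with their newest write timestamps, in seconds.
sample = [("a", 10), ("b", 86399), ("c", 86400), ("d", 200000)]
print(group_by_window(sample))
```

Because whole windows of immutable data age together, an entire window's SSTable can eventually be dropped once its data expires, which is the property that makes this approach attractive for time series.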

There are several less common Cassandra compaction types to consider. A scrub compaction may repair broken SSTables, but because it can remove otherwise valid data when that data is corrupted, it can also leave the node in need of a full repair. And after upgrading to a new major version of Cassandra, run nodetool upgradesstables to rewrite existing SSTables in the new format.

Finally, sub range compaction, which compacts only a given token range, is possible if you have enough information to narrow the range down. Use nodetool compact -st x -et y, where x and y are the start and end tokens, to select all SSTables that hold data in your sub range and issue a compaction for those SSTables.

Cassandra alternatives such as ScyllaDB may also offer different compaction strategies. For example, Incremental Compaction Strategy (ICS) solves the temporary space requirement issue in STCS using a hybrid technique to reduce the space overhead of STCS and combine the strengths of both STCS and LCS. ICS ensures that with a certain configuration in place, the fragment size, disk size, and number of shards can be used to calculate the worst-case temporary space requirement for compaction. Expressed as a percentage of the disk, that requirement decreases significantly for larger disks, because the space overhead is proportional to the logarithm of the disk size.

ICS delivers less expensive storage at scale by allowing users to optimize disk utilization without increasing read or write amplification. This means a capability to store more data on the existing cluster, using fewer nodes.

Compare Cassandra Compaction Strategies

Selecting a Cassandra Compaction Strategy

Multiple Cassandra Compaction strategies exist, each optimized for a different use case. There are Cassandra compaction best practices for each strategy.

To choose and implement a compaction strategy, first understand the differences between the options and determine the right Cassandra compaction strategy for each type of table. Then configure that strategy and its subproperties, such as min_threshold, and test it against your data.

If the table processes time series data, use TimeWindowCompactionStrategy (TWCS).

If the table handles significantly more reads than writes, LeveledCompactionStrategy (LCS) might be the best choice. LCS works best if there are at least twice as many reads as writes, particularly randomized reads. However, LCS can be overwhelmed by a high number of writes, and when reads and writes are approximately equal, LCS leaves users with a performance penalty that may not be worth the benefit.

SizeTieredCompactionStrategy (STCS) is ideal when table data changes infrequently, if ever, there are few upserts, or the data is immutable.
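The rules of thumb above can be summarized in a small Python sketch. The function name and its two inputs are hypothetical, and the sketch deliberately ignores other factors (disk type, update and delete rates, available space) that a real decision should weigh and test.

```python
def suggest_strategy(time_series: bool, reads_per_write: float) -> str:
    """Return a starting-point compaction strategy for a table.

    time_series: True if the table holds time series data.
    reads_per_write: approximate ratio of reads to writes for the table.
    """
    if time_series:
        return "TimeWindowCompactionStrategy"
    if reads_per_write >= 2.0:
        # LCS works best with at least twice as many reads as writes.
        return "LeveledCompactionStrategy"
    # Otherwise fall back to the default strategy.
    return "SizeTieredCompactionStrategy"


print(suggest_strategy(time_series=False, reads_per_write=5.0))
```

Treat the result as a first candidate to benchmark, not as a final answer.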

How to Set a Cassandra Compaction Strategy

Compaction strategies are set as part of the CREATE or ALTER statement when creating or altering tables. Refer to the CQL syntax for details.
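As an illustration of what such a statement looks like, the Python sketch below renders an ALTER TABLE statement that sets the compaction map. The helper name, the keyspace shop, and the table orders are hypothetical; the helper only builds a string and does no escaping, so it is not suitable for untrusted input.

```python
def alter_compaction_cql(keyspace: str, table: str, strategy: str,
                         **options: object) -> str:
    """Render a CQL ALTER TABLE statement that sets a compaction strategy.

    Extra keyword arguments become subproperties in the compaction map.
    """
    settings = {"class": strategy, **options}
    body = ", ".join(f"'{key}': '{value}'" for key, value in settings.items())
    return f"ALTER TABLE {keyspace}.{table} WITH compaction = {{{body}}};"


# Hypothetical keyspace and table; 160 MB is the LCS default SSTable size.
print(alter_compaction_cql("shop", "orders", "LeveledCompactionStrategy",
                           sstable_size_in_mb=160))
```

The same compaction map can be supplied in the WITH clause of a CREATE TABLE statement.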

More on How the Cassandra Compaction Process Works

The importance of Cassandra compaction is directly related to how the Cassandra write path works.

First, Cassandra stores recent writes in an in-memory structure called the Memtable. Once the Memtable has accumulated enough writes, Cassandra flushes it to disk. Cassandra stores data on disk in Sorted String Tables (SSTables), relatively simple data structures that resemble a sorted array of strings.

Cassandra merges and pre-sorts the memtable data according to a Primary Key before it writes a new SSTable. A Primary Key is made up of a Partition Key and any defined Clustering Keys. The Partition Key is the unique key that determines which node stores the data.
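The write path described so far can be sketched in a few lines of Python. The class name, the tiny flush threshold, and the use of plain tuples as SSTables are assumptions for illustration; in particular, this sketch sorts by the raw key values, whereas a real cluster orders partitions by a token derived from the Partition Key.

```python
from typing import Any, Dict, List, Tuple

Key = Tuple[str, int]  # (partition key, clustering key)


class Memtable:
    """Toy in-memory write buffer that flushes to sorted, immutable SSTables."""

    def __init__(self, flush_threshold: int = 3) -> None:
        self.rows: Dict[Key, Any] = {}
        self.flush_threshold = flush_threshold
        self.sstables: List[Tuple[Tuple[Key, Any], ...]] = []

    def write(self, partition_key: str, clustering_key: int, value: Any) -> None:
        # A later write to the same primary key replaces the earlier one in memory.
        self.rows[(partition_key, clustering_key)] = value
        if len(self.rows) >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        # Sort by primary key, then freeze the result as an immutable tuple.
        self.sstables.append(tuple(sorted(self.rows.items())))
        self.rows = {}


memtable = Memtable(flush_threshold=3)
memtable.write("b", 1, "x")
memtable.write("a", 2, "y")
memtable.write("a", 1, "z")  # third write reaches the threshold and flushes
print(memtable.sstables)
```

After the third write the buffer is empty again and one sorted SSTable exists, regardless of the order in which the rows arrived.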

The SSTable is written to disk in one continuous write operation. Once on disk, SSTables are immutable, so any updates to or deletions of SSTable data are written to a new SSTable. If data is updated regularly enough, Cassandra may need to read from multiple SSTables even to retrieve just one row.

This is why compaction operations periodically combine and rewrite SSTables. Compactions merge disparate row data into new SSTables and prune deleted data to keep read operations optimized and reclaim disk space. Use nodetool (for example, nodetool compactionstats) to check Cassandra compaction status.
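A compaction of this kind can be sketched in Python as a merge in which the newest version of each row wins and deleted rows are pruned. The dictionary layout, the use of None as a deletion marker, and the function name are assumptions for illustration. The sketch also drops deleted rows immediately, which is a simplification: a real compaction drops them only when it is safe to do so.

```python
from typing import Dict, List, Optional, Tuple

# Each SSTable maps a row key to (write timestamp, value); None marks a delete.
SSTable = Dict[str, Tuple[int, Optional[str]]]


def compact(sstables: List[SSTable]) -> SSTable:
    """Merge SSTables into one, keeping only the newest live version per key."""
    latest: SSTable = {}
    for sstable in sstables:
        for key, (timestamp, value) in sstable.items():
            if key not in latest or timestamp > latest[key][0]:
                latest[key] = (timestamp, value)
    # Prune rows whose newest version is a deletion, and keep keys sorted.
    return {k: v for k, v in sorted(latest.items()) if v[1] is not None}


older = {"k1": (1, "a"), "k2": (1, "b")}
newer = {"k1": (2, "a2"), "k2": (3, None), "k3": (2, "c")}
print(compact([older, newer]))
```

After the merge, a read for k1 touches one SSTable instead of two, and the space held by the deleted k2 is reclaimed.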

What is Cassandra Compact Storage?

A Cassandra compact table is defined with the outmoded COMPACT STORAGE option which shouldn’t be used for new tables. The Cassandra compact table option is maintained for definitions created before CQL version 3 to maintain backward compatibility. Creating a table with this option limits it in many ways, for example rendering it unable to use static columns and collections.

What are Cassandra Tombstones?

Cassandra does not delete data from disk immediately. Instead, the system writes a tombstone, a special value that indicates that data has been deleted. Tombstones prevent deleted data from being returned during reads and mark that data for removal during a later compaction.

Compaction only triggers dropping the tombstones if all SSTables that might hold relevant data are included.
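The condition above can be expressed as a simple set check, sketched here in Python. The function name and the use of exact key sets are assumptions for illustration; a real system has to estimate which SSTables might hold a key rather than know it exactly.

```python
from typing import Dict, Set


def can_drop_tombstone(key: str, compacting: Set[str],
                       all_sstables: Dict[str, Set[str]]) -> bool:
    """True only if every SSTable that might hold `key` is being compacted.

    all_sstables maps an SSTable name to the set of keys it may contain.
    compacting is the set of SSTable names included in this compaction.
    """
    holders = {name for name, keys in all_sstables.items() if key in keys}
    return holders <= compacting


# Hypothetical SSTables: "k1" appears in both sst1 and sst3.
tables = {"sst1": {"k1", "k2"}, "sst2": {"k2"}, "sst3": {"k1"}}
print(can_drop_tombstone("k1", {"sst1", "sst2"}, tables))  # sst3 is left out
print(can_drop_tombstone("k2", {"sst1", "sst2"}, tables))
```

If the tombstone for k1 were dropped while sst3 stayed out of the compaction, the older copy of k1 in sst3 would become visible again, which is exactly what the rule prevents.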

When multiple data directories are used, tombstones and the actual data they cover are always kept in the same data directory. This way, if a disk is lost, all versions of a partition are lost with it and no data can be undeleted. One compaction strategy instance runs per data directory, in addition to the separate compaction strategy instances for repaired and unrepaired data. In other words, with two data directories, there will actually be four compaction strategy instances running.

Single SSTable tombstone compaction relies on a histogram of tombstone expiration times to locate SSTables that hold many droppable tombstones. Such an SSTable can then be compacted on its own, although how many tombstones are actually dropped depends on how much it overlaps with other SSTables.

Does ScyllaDB Offer a Comparable Alternative to Cassandra Compaction?

Users of ScyllaDB can use all four of Apache Cassandra’s traditional compaction strategies: Date-Tiered, Leveled, Size-Tiered, and Time-Window—implemented in ScyllaDB using the same heuristics that Apache Cassandra uses. This makes the switch from Cassandra to ScyllaDB simple.

But ScyllaDB users also have a fifth, newer compaction strategy, Incremental Compaction Strategy (ICS), which is not available in Apache Cassandra. This hybrid approach, which combines the best aspects of leveled and size-tiered compaction strategies, shares the low write amplification benefit of size-tiered compaction while reducing its space amplification weakness, allowing users to store far more data on their existing infrastructure. Using ICS, ScyllaDB can support more workloads with greater flexibility and cost savings.

Cassandra and ScyllaDB users all know that the wrong compaction strategy can impair their workload’s performance. ScyllaDB makes it easier to implement the right compaction strategy and switch seamlessly from Apache Cassandra.
