How ScyllaDB performs on the new I8g and I8ge instances, across different workload types
Let’s start with the bottom line. For ScyllaDB, the new Graviton4-based i8g instances deliver up to 2x the throughput of i4i with better latency, and i8ge delivers up to 2x the throughput of i3en, also with better latency. Benchmarks also show single-digit millisecond latency during maintenance operations like scaling. Fast and smooth scaling is an important part of the new ScyllaDB X Cloud offering.
The chart below shows ScyllaDB’s max throughput under a 10 ms P99 latency SLA for different workloads, on the old i4i and i3en and the new i8g and i8ge.
AWS recently launched the I8g and I8ge storage-optimized EC2 instances powered by AWS Graviton4 CPUs and 3rd-generation AWS Nitro SSDs. They’re designed for I/O-intensive workloads like real-time databases, search, and analytics (so a nice fit for ScyllaDB).
| Instance Family | Use Case | Number of vCPUs per instance | Storage |
|---|---|---|---|
| i8g | Compute bound | 2 to 96 | 0.5 to 22.5 TB |
| i8ge | Storage bound | 2 to 192 | 1.25 to 120 TB |
Reduced TCO in ScyllaDB Cloud
Based on our performance results, ScyllaDB users migrating to Graviton4 can reduce infrastructure requirements by up to 50% compared to the previous-generation i4i and i3en instances. This translates into a significantly lower total cost of ownership (TCO), since fewer nodes are needed to sustain the same workload.
These improvements stem from a few factors – both in the new instances themselves, and in the match between ScyllaDB and these instances.
The new I8g architecture features:
- vCPU-to-core mapping: On x86, each vCPU is one of two hyperthreads sharing a physical core; on i8g (ARM), each vCPU maps to a full physical core
- Larger caches: 64 KB instruction cache and 64 KB data cache per core, compared to 32 KB/48 KB on Intel (shared between the two hyperthreads)
- Faster storage and networking (see spec above)
In addition, ScyllaDB’s design allows it to take full advantage of the new server types:
- The shard-per-core architecture scales with linear performance to any number of cores
- The I/O scheduler takes full advantage of the 3rd-generation AWS Nitro SSDs, exploiting their higher I/O rates and lower latency without overloading them and driving latency back up
- ARM’s relaxed memory model suits Seastar applications. Since locks and fences are rare, the memory subsystem has more opportunities to reorder memory accesses to optimize performance.
What this means for you
i8g and i8ge are now available on ScyllaDB Cloud.
If you’re running ScyllaDB Cloud, the net impact is:
- Compute-bound workloads: Move from i4i to i8g. This should provide up to 2x throughput at the same ScyllaDB Cloud price.
- Storage-bound workloads: Move from i3en to i8ge. Here, you should expect up to 2x higher throughput at the same ScyllaDB Cloud price. Note that using the new ScyllaDB dictionary-based compression can lower the storage cost further.
For both use cases, ScyllaDB can keep the 10ms P99 latency SLA during maintenance operations, including scaling out and scaling down.
What we measured
- Max Throughput: The maximum requests per second the database can handle
- Max Throughput under SLA: The maximum requests per second under a P99 latency of 10 ms. Only throughput with latency below this SLA counts. This throughput can be sustained under any operation, such as scaling and repair. This is the number you should use when sizing your ScyllaDB database on i8g instances.
- P99 Latency: The P99 latency measured at the Max Throughput under SLA
Results
Read Workload – cached data
Cached data: working set size < available RAM, resulting in close to 100% cache hit rate.
| Instance type | Max throughput | Max Throughput Under Latency SLA | Improvement | P99 in ms |
|---|---|---|---|---|
| i4i.4xlarge | 1,062,578 | 750,000 | 100% | 7.84 |
| i8g.4xlarge | 1,434,215 | 1,300,000 | 135% | 6.29 |
| i3en.3xlarge | 585,975 | 550,000 | 100% | 4.37 |
| i8ge.3xlarge | 962,504 | 800,000 | 164% | 6.38 |
Read Workload – non-cached data, storage only
Non-cached data: working set size >> available RAM, resulting in 0% cache hit rate. When most of the data is not cached, storage becomes a significant factor for performance.
| Instance type | Max throughput | Max Throughput Under Latency SLA | Improvement | P99 in ms |
|---|---|---|---|---|
| i4i.4xlarge | 218,674 | 210,000 | 100% | 4.56 |
| i8g.4xlarge | 444,548 | 300,000 | 203% | 4.24 |
| i3en.3xlarge | 145,702 | 140,000 | 100% | 6.83 |
| i8ge.3xlarge | 259,693 | 255,000 | 178% | 7.95 |
Write Workload
| Instance type | Max throughput | Max Throughput Under Latency SLA | Improvement | P99 in ms |
|---|---|---|---|---|
| i4i.4xlarge | 289,154 | 150,000 | 100% | 2.4 |
| i8g.4xlarge | 689,474 | 600,000 | 238% | 4.02 |
| i3en.3xlarge | 217,072 | 200,000 | 100% | 5.42 |
| i8ge.3xlarge | 452,968 | 400,000 | 209% | 3.41 |
Tests under maintenance operations
ScyllaDB takes pride in testing under realistic use cases, including scaling out and in, repair, backups, and various failure tests.
The following results show the average P99 latency (across all nodes) for different maintenance operations on a 3-node cluster of i8ge.3xlarge, using the setup below.
Setup
- ScyllaDB version: 2025.3.1-20250907.2bbf3cf669bb
- DB node amount: 3
- DB instance types: i8ge.3xlarge
- Loader node amount: 4
- Loader instance type: c5.2xlarge
- Throughput: Read 41K, Write 81K, Mixed 35K
Results
Read Test: Read Latency
| Operation | Read P99 latency in ms |
|---|---|
| Base: Steady State | 0.95 |
| During Repair | 4.92 |
| During Add Node (scale out) | 2.68 |
| During Replace Node | 3.10 |
| During Decommission Node (downscale) | 2.44 |
Write Test: Write Latency
| Operation | Write P99 latency in ms |
|---|---|
| Steady State | 2.22 |
| During Repair | 3.24 |
| Add Node (scale out) | 2.49 |
| Replace Node | 3.07 |
| Decommission Node (downscale) | 2.37 |
Mixed Test: Write and Read Latency
| Operation | Write P99 Latency in ms | Read P99 Latency in ms |
|---|---|---|
| Steady state | 2.03 | 2.11 |
| During Repair | 3.21 | 4.70 |
| Add Node (scale out) | 2.19 | 2.71 |
| Replace Node | 3.00 | 3.37 |
| Decommission Node (downscale) | 2.20 | 3.05 |
The results indicate that ScyllaDB can meet the latency SLA during maintenance operations. This is critical for ScyllaDB Cloud, and in particular ScyllaDB X Cloud, where scaling out and in is automatic and can happen multiple times per day. It’s also critical in unexpected failure cases, when a node must be replaced rapidly without hurting availability or violating the latency SLA.
Test Setup
ScyllaDB cluster
- 3-node cluster
- i4i.4xlarge vs. i8g.4xlarge
- i3en.3xlarge vs. i8ge.3xlarge
Loaders
- Loader node amount: 4
- Loader instance type: c7i.8xlarge
Workload
- Replication Factor (RF): 3
- Consistency Level (CL): Quorum
- Data size: 650 GB for read/mixed, 1.5 TB for write

