Scylla Monitoring Stack

Complete visibility into your Scylla database cluster

Scylla Monitoring Stack is a bundle of four components: a Prometheus metric collector, an alert manager, Grafana 6 dashboards, and the Grafana Loki log aggregation system. It can be deployed as containers or directly onto a host, and it collects aggregated metrics, logs, and events through Scylla Manager.

The stack empowers DevOps, Infrastructure Operations, and Database Administrators to quickly find and fix issues impacting the performance of their Scylla cluster. Teams can drill down from high-level dashboards to detailed metrics to determine next steps.

Grafana Dashboards

Scylla Monitoring Stack includes a set of pre-built dashboards to monitor your Scylla cluster in real time. Hundreds of different metrics populate dashboard components for your team to review historical trends and identify anomalous behavior in your cluster.

    • Overview – General overview of cluster health
    • Detailed – In-depth look at the server level
    • CQL – Detailed CQL metrics pointing out trouble areas such as drivers that are not shard-aware, non-paged queries, etc.
    • OS – OS metrics reported by the node_exporter agent such as storage and network
    • IO – Scylla IO performance metrics focusing on the IO Queue
    • CPU – CPU related metrics such as core utilization
    • Errors – A single place for errors generated by Scylla
    • Manager – The Scylla Manager Dashboard
    • DynamoDB API – Track usage of Scylla’s DynamoDB-compatible API

CQL Troubleshooting

The CQL dashboard helps teams identify query issues, poor data models, and unexpected driver behavior. Teams can quickly see, for example, if their cluster is being hit by a lot of heavy queries with full table scans where “allow filtering” is enabled.

Queue Monitoring

Quickly identify queue latency and performance for the commitlog, compaction, memtable, and more. The dashboard supports dynamic classes added to Enterprise releases.

Cluster Health

Quickly identify nodes in your cluster and drill down to detailed OS-level metrics such as CPU utilization, IO, and errors. Teams can quickly decide whether nodes need to be rebooted or whether a rolling upgrade is needed on nodes running old versions.

Alerting

Set conditional alerts for your Scylla cluster within the alert manager so your team knows when incidents arise. Out-of-the-box alert triggers are included for conditions such as:

    • Low free disk space on the root partition
    • Node availability status changes
    • CQL availability on a node
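
As a rough illustration of what a conditional alert such as low free disk space boils down to, here is a minimal Python sketch. It is not the stack's actual alert rule: the function name, the 10% threshold, and the sample values are assumptions made for this example. In a real deployment the condition would be written as a Prometheus alerting rule over metrics reported by node_exporter.

```python
def low_disk_space(avail_bytes, size_bytes, min_free_fraction=0.10):
    """Return True when the free fraction of a partition is below the threshold.

    min_free_fraction is a hypothetical default, not a value taken from
    Scylla Monitoring Stack.
    """
    if size_bytes <= 0:
        raise ValueError("size_bytes must be positive")
    return avail_bytes / size_bytes < min_free_fraction


# Hypothetical samples: (available bytes, total bytes) for each node's root partition.
root_partitions = {
    "node1": (50 * 2**30, 200 * 2**30),   # 25% free
    "node2": (8 * 2**30, 200 * 2**30),    # 4% free
}

firing = sorted(
    node for node, (avail, size) in root_partitions.items()
    if low_disk_space(avail, size)
)
print(firing)
```

With these sample values only node2 falls below the threshold, so it is the only node an alert would fire for.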

Chart Annotations

Database administrators can annotate heavy tasks such as backup or repair with their start and finish times. This helps cross-functional teams see why there may be additional latency or reduced throughput during particular periods.
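
One possible way to create such an annotation is through Grafana's HTTP API, which accepts a JSON body with `time`, `timeEnd`, `tags`, and `text` fields posted to `/api/annotations`. The sketch below only builds that body. The helper name, the tag, and the timestamps are hypothetical, and whether your deployment creates annotations this way is an assumption.

```python
import json
from datetime import datetime, timezone


def build_annotation(text, start, end, tags):
    """Build a region annotation body; Grafana expects times in epoch milliseconds."""
    if end < start:
        raise ValueError("end must not precede start")
    return {
        "time": int(start.timestamp() * 1000),
        "timeEnd": int(end.timestamp() * 1000),
        "tags": list(tags),
        "text": text,
    }


# Hypothetical repair window, given in UTC.
start = datetime(2021, 1, 1, 2, 0, tzinfo=timezone.utc)
end = datetime(2021, 1, 1, 3, 30, tzinfo=timezone.utc)
payload = build_annotation("Repair", start, end, ["repair"])

# This JSON would be sent as the body of an authenticated POST to /api/annotations.
print(json.dumps(payload))
```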

Advisor

The Advisor is a new concept in Scylla Monitoring. It identifies potential problems and notifies users about them. The Advisor section in the Overview dashboard has two parts. The first lists various detected issues, such as unprepared statements. The second indicates how balanced the system is. When the cluster works properly, all nodes and shards should behave the same, so an outlier shard could be the result of a problem. For example, if the number of CQL connections varies between shards, it may indicate a driver configuration issue.
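
The balance idea can be sketched in a few lines of Python. This is an illustration of the reasoning only, not the Advisor's actual implementation: the function name, the median-based comparison, and the 50% tolerance are assumptions made for this example.

```python
from statistics import median


def find_outlier_shards(connections_per_shard, tolerance=0.5):
    """Return ids of shards whose connection count deviates from the median
    by more than `tolerance`, expressed as a fraction of the median."""
    counts = list(connections_per_shard.values())
    if not counts:
        return []
    mid = median(counts)
    if mid == 0:
        # With a zero median, any shard holding connections stands out.
        return sorted(s for s, c in connections_per_shard.items() if c != 0)
    return sorted(
        s for s, c in connections_per_shard.items()
        if abs(c - mid) / mid > tolerance
    )


# Hypothetical CQL connection counts, keyed by shard id.
sample = {0: 12, 1: 11, 2: 12, 3: 40}
outliers = find_outlier_shards(sample)
print(outliers)
```

Here shard 3 holds far more connections than its peers, which is the kind of imbalance that would point toward a driver configuration issue.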

Resources

Read the documentation for additional information to get started.

Find out what’s new in Scylla Monitoring Stack 3.2.

Grab the latest source code, builds, and detailed quick start instructions.

Monitoring Scylla using Prometheus metrics and Datadog.

Take the free Operations course at Scylla University and get certified.

Guide to help you upgrade from the 2.x to 3.x monitoring stack.
