See all blog posts

Announcing ScyllaDB Manager 1.1, a Production-Ready Release of ScyllaDB Manager.

ScyllaDB Release

The ScyllaDB team is pleased to announce the release of ScyllaDB Manager 1.1, a production-ready release of ScyllaDB Manager for ScyllaDB Enterprise customers.

ScyllaDB Manager adds centralized cluster administration and recurrent task automation to ScyllaDB Enterprise. ScyllaDB Manager 1.x includes automation of periodic repair, with future releases providing rolling upgrades, recurrent backup, and more. With time, ScyllaDB Manager will become the focal point of ScyllaDB Enterprise cluster management, including a GUI frontend. ScyllaDB Manager is available for all ScyllaDB Enterprise customers. It can also be downloaded from scylladb.com for a 30-day trial.

Related links

New features in ScyllaDB Manager 1.1

 

ScyllaDB Manager Metrics, Dashboard, and Alerts

ScyllaDB Manager now reports metrics over the Prometheus protocol. You can use ScyllaDB Manager metrics directly or with the ScyllaDB Manager dashboard from the ScyllaDB Monitoring Stack.

release

With the latest addition of alerts to the ScyllaDB Monitoring Stack, an alert will be triggered in case a repair has failed, or if ScyllaDB Manager exits for any reason. More on Alerts here.

Full List of ScyllaDB Manager metrics:

Parameter Description
cluster Cluster unique identification (string)
host Host IP address
shard Shard (core) number
unit Repair Unit
quantile Histograms quantile

 

Metric Type Description
repair_duration_seconds summary repair_duration_seconds. The time repair has been running in seconds.
repair_segments_success gauge Number of repaired segment
repair_segments_error gauge Number of segments that failed to repair
repair_segments_total gauge Total number of segments to repair.
Where repair_segments_error + repair_segments_success = repair_segments_total
ssh_open_streams_count gauge Number of active (multiplexed) connections to ScyllaDB node.
log_error_total counter counter Total number of ERROR messages
log_info_total counter counter Total number of INFO messages

Repair Retries

Starting from ScyllaDB Manager 1.1, during a repair, should a segment fail for any reason, ScyllaDB Manager skips the segment and continues until all segments have been repaired. It will then go back to the skipped segments and try to repair it a second time. Each repair attempt is considered a retry. The number of retries is configurable in the scylla-manager.yaml (see below)

You can follow the progress of retries using the sctool repair progress command or Grafana Manager dashboard.

Repair Configuration

The following new configuration parameters are now available:

sctool updates

  • New date format across the tool. The new format contains TZ (UTC) info and is easier to read ie. 13 Apr 18 00:00 UTC
  • Start/End/Duration info in the repair progress command
  • Task list can be used without a cluster argument to see tasks on all the clusters

REST API updates

  • New /ping service for testing manager availability
  • New /metrics service expose Manager metrics (also, see ScyllaDB Manager Metrics above)
  • New /progress/{run_id} return status of a specific repair
  • Rest API Update:
    /cluster/{cluster_id}/repair/unit/{unit_id}/progress is now /cluster/{cluster_id}/repair/unit/{unit_id}/progress/{run_id} making the repair run ID mandatory.

Other improvements

  • New script scyllamgr_ssh_test performs a quick test of SSH connectivity between ScyllaDB manager and ScyllaDB nodes

Noteworthy bug fixes in ScyllaDB Manager 1.1

  • All Time types are now represented as UTC (Coordinated Universal Time)
  • sctool: The task list command now displays duration instead of end time. In addition, any task list command that is run without a cluster name argument, reports a list of tasks from all clusters.
  • ScyllaDB Manager log level is now configurable in the manager.yaml file
  • ScyllaDB REST API was reporting SSH errors with confusing error messages, this has been fixed.
  • dist: A Restart directive was added to the scylla-mgmt service

About Tzach Livyatan

Tzach Livyatan has a B.A. and MSc in Computer Science (Technion, Summa Cum Laude), and has had a 15 year career in development, system engineering and product management. In the past he worked in the Telecom domain, focusing on carrier grade systems, signalling, policy and charging applications.