
Motivation
In many real-life installations, one of the most important parameters of the system is the longest response time. Naturally, the shorter, the better. Therefore capturing a request that took a long time and understanding why it took so long is a critical and challenging task. Slow query tracing to the rescue!
Example
One common problem with ill-formed queries is that a query has to go to a remote Data Center in order to achieve the required consistency level.
Let’s see how Slow Query Logging is going to help us catch such problems.
Let’s create a clean cluster in an Amazon AWS environment with 3 Nodes in the us-east region and 3 in the us-west region. We will use Ec2MultiRegionSnitch for simplicity:
$ nodetool status
Datacenter: us-east
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 54.87.27.55 237.34 KB 256 ? 7bda100f-ea8e-4e20-8d11-54d64d76c661 1c
UN 54.90.129.115 247.68 KB 256 ? ad71327a-b750-4817-8c78-79801d21dee3 1c
UN 50.19.22.61 247.68 KB 256 ? 44448021-f96d-4de9-b018-daa5853acea5 1c
Datacenter: us-west
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 52.53.167.135 235.41 KB 256 ? 00e6fe23-be5c-4d9b-b933-68030d0e6a5b 1c
UN 54.153.93.219 274.82 KB 256 ? 3212ee29-17fd-437c-9e9c-2b9b09f0fd36 1c
UN 54.67.34.6 321.74 KB 256 ? ce0899a8-e898-465e-9c36-34e29cede5b4 1c
Then let’s create a keyspace and a table for the demonstration. We’ll run cqlsh here and below from the Node with address 50.19.22.61:
cqlsh> CREATE KEYSPACE my_ks with replication = {'class': 'NetworkTopologyStrategy', 'us-east' : '2', 'us-west' : '2'};
cqlsh> CREATE table my_ks.my_cf1 ( c1 text PRIMARY KEY, c2 text);
Let’s add a row to it:
cqlsh> INSERT INTO my_ks.my_cf1 (c1, c2) VALUES ('key1', 'value1');
Let’s enable Slow Query Logging:
curl -X POST --header "Content-Type: application/json" --header "Accept: application/json" "http://127.0.0.1:10000/storage_service/slow_query?enable=true&ttl=86000&threshold=10000"
Now let’s read the row above with the “bad” consistency level: QUORUM
cqlsh> CONSISTENCY QUORUM
Consistency level set to QUORUM.
cqlsh> SELECT * FROM my_ks.my_cf1 where c1='key1';
c1 | c2
------+--------
key1 | value1
(1 rows)
Now let’s stop Slow Query Logging:
curl -X POST --header "Content-Type: application/json" --header "Accept: application/json" "http://127.0.0.1:10000/storage_service/slow_query?enable=false&ttl=86000&threshold=10000"
Let’s see if we have anything in the system_traces.node_slow_log:
cqlsh> SELECT * from system_traces.node_slow_log ;
start_time | node_ip | shard | command | date | duration | parameters | session_id | source_ip | table_names | username
--------------------------------------+-------------+-------+---------------------------------------------+--------------------------+----------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------+-----------+------------------+-----------
bd212320-94d5-11e6-925a-000000000003 | 50.19.22.61 | 25 | SELECT * FROM my_ks.my_cf1 where c1='key1'; | 2016-10-18 01:54:00+0000 | 211827 | {'consistency_level': 'QUORUM', 'page_size': '100', 'query': 'SELECT * FROM my_ks.my_cf1 where c1=''key1'';', 'serial_consistency_level': 'SERIAL', 'user_timestamp': '1476755640338012'} | bd20ff20-94d5-11e6-925a-000000000003 | 127.0.0.1 | {'my_ks.my_cf1'} | anonymous
(1 rows)
Gotcha!!!
Now let’s check what actually happened there. Let’s look at the corresponding records in the system_traces.sessions and system_traces.events tables:
cqlsh> SELECT * from system_traces.sessions where session_id=bd20ff20-94d5-11e6-925a-000000000003;
session_id | client | command | coordinator | duration | parameters | request | started_at
--------------------------------------+-----------+---------+-------------+----------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------+--------------------------
bd20ff20-94d5-11e6-925a-000000000003 | 127.0.0.1 | QUERY | 50.19.22.61 | 211827 | {'consistency_level': 'QUORUM', 'page_size': '100', 'query': 'SELECT * FROM my_ks.my_cf1 where c1=''key1'';', 'serial_consistency_level': 'SERIAL', 'user_timestamp': '1476755640338012'} | Execute CQL3 query | 2016-10-18 01:54:00+0000
(1 rows)
cqlsh> SELECT * from system_traces.events where session_id=bd20ff20-94d5-11e6-925a-000000000003;
session_id | event_id | activity | source | source_elapsed | thread
--------------------------------------+--------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+----------------+----------
bd20ff20-94d5-11e6-925a-000000000003 | bd21232e-94d5-11e6-925a-000000000003 | Parsing a statement | 50.19.22.61 | 1 | shard 25
bd20ff20-94d5-11e6-925a-000000000003 | bd2126f3-94d5-11e6-925a-000000000003 | Processing a statement | 50.19.22.61 | 97 | shard 25
bd20ff20-94d5-11e6-925a-000000000003 | bd212c1a-94d5-11e6-925a-000000000003 | Creating read executor for token 1573573083296714675 with all: {54.90.129.115, 54.87.27.55, 54.67.34.6, 54.153.93.219} targets: {54.90.129.115, 54.87.27.55, 54.67.34.6} repair decision: NONE | 50.19.22.61 | 229 | shard 25
bd20ff20-94d5-11e6-925a-000000000003 | bd212cff-94d5-11e6-925a-000000000003 | read_digest: sending a message to /54.87.27.55 | 50.19.22.61 | 252 | shard 25
bd20ff20-94d5-11e6-925a-000000000003 | bd2131af-94d5-11e6-925a-000000000003 | read_digest: sending a message to /54.67.34.6 | 50.19.22.61 | 372 | shard 25
bd20ff20-94d5-11e6-925a-000000000003 | bd21335f-94d5-11e6-925a-000000000003 | read_data: sending a message to /54.90.129.115 | 50.19.22.61 | 415 | shard 25
bd20ff20-94d5-11e6-925a-000000000003 | bd21759b-94d5-11e6-925a-000000000003 | read_data: got response from /54.90.129.115 | 50.19.22.61 | 2111 | shard 25
bd20ff20-94d5-11e6-925a-000000000003 | bd217950-94d5-11e6-925a-000000000003 | read_digest: got response from /54.87.27.55 | 50.19.22.61 | 2206 | shard 25
bd20ff20-94d5-11e6-925a-000000000003 | bd417280-94d5-11e6-925a-000000000003 | read_digest: got response from /54.67.34.6 | 50.19.22.61 | 211747 | shard 25
bd20ff20-94d5-11e6-925a-000000000003 | bd41750f-94d5-11e6-925a-000000000003 | Done processing - preparing a result | 50.19.22.61 | 211812 | shard 25
(10 rows)
If we look at the traces above, we can immediately see that one of the targets of this query is the Node 54.67.34.6 from the remote DC. We then see that 50.19.22.61 (our Coordinator in the us-east DC) sends a digest request to 54.67.34.6 in us-west, and it takes 211375us until the response arrives.
That was clearly the reason why the query was so slow. Problem found!
A fix for this problem may be choosing a different CL, e.g. LOCAL_QUORUM.
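In cqlsh, that change is a single command before re-running the same query (shown here as a sketch of what the session would look like):
cqlsh> CONSISTENCY LOCAL_QUORUM
Consistency level set to LOCAL_QUORUM.
cqlsh> SELECT * FROM my_ks.my_cf1 where c1='key1';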
If we run all the steps above but choose the consistency level LOCAL_QUORUM instead of QUORUM, we will notice that no new entry is added to system_traces.node_slow_log, and the reason is simple: this time the whole query takes only 1243us!!! As you can see, it’s an absolutely different story!.. 😉
Final notes
Unlike other databases, we decided not to hold back such important functionality, and we provide it in our open source version. Query tracing is a basic feature, almost as basic as queries themselves. Additionally, we went all-in: we trace not just the culprit query but also all of the intermediate timing traces that help explain why the query is slow. It’s not enough to have a separate detailed-trace mechanism: in many cases, re-executing the same query will not reproduce the long latency that caused the query to be recorded as slow in the first place. Therefore we record both the query and its execution trace.
This, of course, doesn’t come for free.
On an AWS cluster with 3 c3.8xlarge Nodes, when Slow Query Logging was enabled with a really huge threshold, the throughput in a standard cassandra-stress read test behaved as follows (the number of threads was controlled with the `-rate threads=N` parameter):
- with a single thread, throughput dropped 0.5%: from 903.2 to 898.6 ops/sec
- with 250 threads, throughput dropped 5.09%: from 116314.13 to 110390.63 ops/sec
- with 500 threads, throughput dropped 17.6%: from 176246.63 to 145297.75 ops/sec
The latency wasn’t affected at all: 1.1 ms (measured in the single thread test).
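For reference, a cassandra-stress read run of that shape might look roughly like the following (the node address and operation count are placeholders, not the exact options used for the numbers above):
# read workload with 250 client threads against one of the cluster nodes
$ cassandra-stress read n=1000000 -rate threads=250 -node 54.87.27.55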
Naturally, the more loaded the system is CPU-wise, the heavier the penalty for the extra CPU that Slow Query Logging requires.
So, although the feature is disabled by default, it’s quite safe to enable it on systems where each shard is not more than 50%-60% loaded (use scyllatop "*reactor*load*" to verify that).
node_slow_log table schema
Below is a description of the system_traces.node_slow_log schema and its columns; a sketch of the table definition follows the list.
Column descriptions
- start_time and date: the time when the query began
- node_ip: the address of the Coordinator Node
- shard: the shard ID on the Coordinator where the query was handled
- command: the query command, e.g. select * from my_ks.my_cf
- duration: the duration of the query handling in microseconds
- parameters: query parameters, like the parameters column in the system_traces.sessions table
- session_id: the corresponding Tracing session ID
- source_ip: the address of the Client that sent this query
- table_names: a list of tables used by this query, where applicable
- username: the user name used for authentication with this query
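For orientation, here is a sketch of what the table definition roughly corresponds to, with column types inferred from the sample output above (they may differ between versions; run DESCRIBE TABLE system_traces.node_slow_log on your cluster for the authoritative schema):
-- approximate definition; types and primary key are inferred, not authoritative
CREATE TABLE system_traces.node_slow_log (
    start_time timeuuid,
    node_ip inet,
    shard int,
    command text,
    date timestamp,
    duration int,
    parameters map<text, text>,
    session_id uuid,
    source_ip inet,
    table_names set<text>,
    username text,
    PRIMARY KEY (start_time, node_ip, shard)
);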
How to enable and configure
By default slow query logging is disabled.
There is a REST API that allows configuring this feature and querying its current configuration.
To set the parameters run:
curl -X POST --header "Content-Type: application/json" --header "Accept: application/json" "http://<Node's address>:10000/storage_service/slow_query?enable=<true|false>&ttl=<in seconds>&threshold=<threshold in microseconds>"
e.g. in order to enable the feature on a Node with address 127.0.0.1, set the ttl to 86000s and the threshold to 10000us:
curl -X POST --header "Content-Type: application/json" --header "Accept: application/json" "http://127.0.0.1:10000/storage_service/slow_query?enable=true&ttl=86000&threshold=10000"
To get the current configuration run:
curl -X GET --header "Content-Type: application/json" --header "Accept: application/json" "http://<Node's address>:10000/storage_service/slow_query"
e.g. after the POST command above, the query result will look as follows:
$ curl -X GET --header "Content-Type: application/json" --header "Accept: application/json" "http://127.0.0.1:10000/storage_service/slow_query"
{"threshold": 10000, "enable": true, "ttl": 86000}
$