Mutant Monitoring Systems (MMS) Day 5 – Visualizing Data with Zeppelin
This is part 5 of a series of blog posts that provides a story arc for Scylla Training.
In the last post, we simulated an attack on Division 3 and learned about consistency levels, how to recover from a node failure, and how to repair the Scylla cluster. Now that operations are back to normal, we can return to the important part of the Mutant Monitoring System: analyzing data. In this post, we will learn how to visualize the data in MMS with Apache Zeppelin.
Apache Zeppelin is a Java-based web application that lets users interact with a variety of data sources such as MySQL, Spark, Hadoop, and Scylla. Once in Zeppelin, you can run CQL queries, view the output as a table, and save the results. The query results can also be visualized in a variety of charts. To get started with Zeppelin, let’s build and run the container.
Building and Running the Scylla Cluster and Zeppelin
First, the Scylla cluster should be up and running with the data imported from the previous blog posts. The MMS Git repository has been updated to import the keyspaces and data automatically. If you already have the repository cloned, simply run “git pull” in the scylla-code-samples directory. Otherwise, clone it:
git clone https://github.com/scylladb/scylla-code-samples.git
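The choice between cloning and pulling can be sketched as a small shell helper. This is only an illustration: the function name `sync_repo` and its two arguments are hypothetical and not part of the MMS repository.

```shell
# Hypothetical helper: clone a repository on first use, otherwise update it.
# Usage: sync_repo REPO_URL TARGET_DIR
sync_repo() {
    repo_url=$1
    repo_dir=$2
    if [ -d "$repo_dir/.git" ]; then
        # A checkout already exists, so pull the latest changes into it
        git -C "$repo_dir" pull
    else
        # No checkout yet, so clone the repository into the target directory
        git clone "$repo_url" "$repo_dir"
    fi
}
```

For this series, the call would be `sync_repo https://github.com/scylladb/scylla-code-samples.git scylla-code-samples`.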
Modify docker-compose.yml and add the following line under the environment: section of scylla-node1:
Now the container can be built and run:
docker-compose up -d
After roughly 60 seconds, the existing MMS data will be automatically imported.
To build and run Zeppelin in Docker, run the following commands:
docker build -t zeppelin .
docker run --name zeppelin --network mms_web -p 9080:8080 -d zeppelin
Shortly after running the above commands, you should be able to access the Zeppelin Web Interface in your browser at http://127.0.0.1:9080/.
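Zeppelin can take a little while to start inside the container. If you would rather not keep reloading the browser, a polling loop along these lines may help. It is a minimal sketch: the function name `wait_for_http`, the default retry count, and the delay are assumptions, and it assumes the Zeppelin front page answers with HTTP 200 once it is ready.

```shell
# Hypothetical helper: poll a URL until it answers with HTTP 200.
# Usage: wait_for_http URL [attempts] [delay_seconds]
wait_for_http() {
    url=$1
    attempts=${2:-30}   # assumed default: give up after 30 tries
    delay=${3:-2}       # assumed default: wait 2 seconds between tries
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -s hides progress output, -o discards the body,
        # -w prints only the HTTP status code
        code=$(curl -s -o /dev/null -w '%{http_code}' "$url") || code=000
        if [ "$code" = "200" ]; then
            echo "ready after $i attempt(s)"
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    echo "no response from $url after $attempts attempt(s)" >&2
    return 1
}
```

With the port mapping used above, the call would be `wait_for_http http://127.0.0.1:9080/`.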
Now that Zeppelin is loaded, we can get started with creating a Notebook. A Notebook in Zeppelin is a collection of paragraphs, each holding a query against a source like Scylla together with its results. Click Notebook at the top of the web console and click “Create new note”. Type MMS for the Note Name and choose Cassandra for the Default Interpreter (Scylla is compatible with Cassandra’s CQL, so the Cassandra interpreter works with it). Finally, click Create Note.
We can use Zeppelin to run queries just like cqlsh. Let’s view all of the tracking data like we did before by typing the following in the query box:
The data can be visualized in a variety of graphs by choosing one from the toolbar:
Let’s visualize the heat readings from the Tracking system for each mutant by clicking on the Area Chart button. Once the Area Chart has been chosen, click on the settings link and adjust the values as shown:
We can see that Jim Jeffries had the highest heat signature. If you recall from Day 3, Jim is known to reach heat levels of around 20 or higher when he receives negative feedback on his comedy performance, and he is likely to injure his critics out of anger. Because of this, we alerted the authorities.
Let’s use the information available to see which mutant is likely the fastest with a Line Chart with the following settings:
Here we can see that Bob Loblaw was the fastest mutant. Also, Jim Jeffries should probably hit the gym and practice cardiovascular exercises a few times a week.
Luckily, no havoc was wreaked on Division 3 or the general public today. With this downtime, we were able to focus on a new way to analyze data from the Mutant Monitoring System. With Apache Zeppelin, we can easily visualize our queries and turn them into useful graphs. Please be safe out there as we continue to track the mutants and evolve our Monitoring System.