Hello again! Following up on our previous post on saving data to Scylla, this time we’ll discuss using Spark Structured Streaming with Scylla and see how streaming workloads can be written into ScyllaDB. Our code samples repository for this post contains an example project along with a docker-compose.yaml file with the necessary infrastructure for running it. We’re going to use the infrastructure to run the code samples throughout the post and run the project itself, so start it up as follows: After that is done, launch the Spark shell as in the previous posts: With that done, let’s […]
Welcome back! Last time, we discussed how Spark executes our queries and how Spark’s DataFrame and SQL APIs can be used to read data from Scylla. That concluded the querying data segment of the series; in this post, we will see how data from DataFrames can be written back to Scylla. As always, we have a code sample repository with a docker-compose.yaml file with all the necessary services we’ll need. After you’ve cloned it, start up the services with docker-compose: After that is done, launch the Spark shell as in the previous posts in order to run the samples in […]
What is DC/OS? From https://dcos.io: DC/OS (the datacenter operating system) is an open-source, distributed operating system based on the Apache Mesos distributed systems kernel. DC/OS manages multiple machines in the cloud or on-premises from a single interface; deploys containers, distributed services, and legacy applications into those machines; and provides networking, service discovery and resource management to keep the services running and communicating with each other. Scylla on DC/OS: A centralized management system is often used in modern data centers, and lately the most popular and in-demand type of such a management system is centered around running and controlling containers at scale. […]
In this post we discuss the enhanced filtering support coming in Scylla 2.4 and compare it to the recommended alternatives and their performance.
The Mutant Monitoring System series has come to an end. In this post, we will summarize each day of the training series and explain what readers can learn.
Welcome to part 1 of an in-depth series of posts revolving around the integration of Spark and Scylla. In this series, we will delve into many aspects of a Scylla/Spark solution: from the architectures and data models of the two products, through strategies to transfer data between them, and up to optimization techniques and operational best practices. The series will include many code samples which you are encouraged to run locally, modify and tinker with. The GitHub repo contains the docker-compose.yaml file which you can use to easily run everything locally. In this post, we will introduce the main stars […]
In this installment of the MMS series, we look at the features of the Mutant Monitoring Web Console written in Node.js and how it works.
In this installment of the MMS series, we take a look at how to store binary blobs into a Scylla cluster using the Java programming language.
Division 3 now wants to dive back into data analytics to learn how to prevent the attacks. In this installment of the Mutant Monitoring series, we take a look at how to use Apache Spark, Hive, and Superset to analyze and visualize data from the Mutant Monitoring System.
In this installment of the MMS series, we take a look at Materialized Views. Materialized Views automate the tedious and inefficient work that must be done when an application maintains several tables with the same data organized differently.