NoSQL Case Study: FireEye

Learn how FireEye accelerated Threat Analysis by 1000% with ScyllaDB as a Back-end to JanusGraph.

About FireEye

FireEye Threat Intelligence is a portfolio of subscriptions and services that help customers combat cyberthreats. Subscriptions range from analyst-generated reports delivered via email to API-level access to the full suite of intelligence services. Forrester Research positioned FireEye as the only company in the ‘leader’ quadrant in the Threat Intel Services market sector.

As a recognized and highly visible thought leader in cyberintelligence, FireEye's experts are often called upon by the media to provide analysis and perspectives on the topic. That visibility puts FireEye teams under tremendous pressure to deliver timely, accurate, and trustworthy cybersecurity solutions.

Learn about how FireEye was able to accelerate Threat Analysis with ScyllaDB in the NoSQL Case Study below.

The Challenge with Database Scaling

FireEye’s Threat Intelligence application centralizes, organizes, and processes threat intel data to support analysts. It does so by grouping threats using analytical correlation, and by processing and recording vast quantities of data. Data objects range widely, from DNS data and RSS feeds to domain names and URLs. Overall, the graph represents relationships between actors (for example, hackers and criminal organizations) and threat vectors represented by URLs, email or IP addresses, and content files containing the fingerprints of malware. In this way, the graph can capture the rich web of interrelated properties that define, for example, an email phishing campaign.
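To make the graph model concrete, here is a minimal in-memory sketch in Python. It is purely illustrative and is not FireEye's schema or the JanusGraph API: the `ThreatGraph` class, the node identifiers, and the `kind` property are all hypothetical names chosen for this example. It shows how nodes with properties and edges between them let an analyst start from an actor and find everything connected within a few hops.

```python
from collections import defaultdict, deque


class ThreatGraph:
    """Toy property graph (hypothetical; not FireEye's actual data model)."""

    def __init__(self):
        self.props = {}               # node id -> dict of properties
        self.adj = defaultdict(set)   # node id -> set of neighbouring node ids

    def add_node(self, node_id, **props):
        self.props.setdefault(node_id, {}).update(props)

    def add_edge(self, a, b):
        # Relationships are treated as undirected in this sketch.
        self.adj[a].add(b)
        self.adj[b].add(a)

    def related(self, start, max_hops):
        """Breadth-first search: map each reachable node to its hop distance."""
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if dist[node] == max_hops:
                continue  # do not expand beyond the hop limit
            for nxt in self.adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        del dist[start]
        return dist


# Made-up example data describing a phishing campaign.
g = ThreatGraph()
g.add_node("actor:example-group", kind="actor")
g.add_node("email:phish@example.test", kind="email")
g.add_node("url:http://login.example.test", kind="url")
g.add_node("ip:203.0.113.7", kind="ip")
g.add_node("file:sample-hash", kind="malware_fingerprint")

g.add_edge("actor:example-group", "email:phish@example.test")
g.add_edge("email:phish@example.test", "url:http://login.example.test")
g.add_edge("url:http://login.example.test", "ip:203.0.113.7")
g.add_edge("url:http://login.example.test", "file:sample-hash")

two_hops = g.related("actor:example-group", max_hops=2)
three_hops = g.related("actor:example-group", max_hops=3)
```

With this data, `two_hops` contains the phishing email and the URL it links to, while `three_hops` also reaches the IP address and the malware fingerprint. A production graph database performs the same kind of traversal, but over storage distributed across many machines.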

The graph size has grown to 500M nodes, with around 1.5B edges connecting them. Each node has more than 100 associated properties, which are aspects accumulated over several years. Using this array of billions of properties, analysts are capable of asking trillions of questions.

“Using ScyllaDB turned out to be a game-changer in terms of performance and the types of analysis our application is able to do effortlessly.”

– Krishna Palati, Sr. Manager, DevOps Engineering, FireEye

FireEye’s legacy system used PostgreSQL with a custom graph database system to store and facilitate analysis of threat intelligence data. As the team of analysts grew into the hundreds, the system ran into scaling issues and its limitations became painfully apparent. The existing solution was not up to the task, and a redesign on a new platform was required.

The team had become a victim of its own success. The PostgreSQL-based system was slow, difficult to scale, and neither distributed nor highly available. The team’s objective was to completely replace it with a new system that was scalable, distributed, and highly available.

ScyllaDB Outperforms Apache Cassandra, Berkeley DB, and Apache HBase

The team started by evaluating a selection of graph databases, including OrientDB, Synapse, AWS Neptune, and JanusGraph. Based on their evaluation, JanusGraph provided the best support for FireEye’s use case. Because JanusGraph has a pluggable backend, the team also needed to identify a compatible backend storage solution. JanusGraph provides support for four databases by default: ScyllaDB, Apache Cassandra, Berkeley DB, and Apache HBase.

Based on the technical evaluation, ScyllaDB was a clear winner. ScyllaDB beat the other databases not only in raw performance but also in manageability. Some aspects of ScyllaDB that appealed to the team are its ease of setup and auto-tuning capabilities. Migrating their data from PostgreSQL, the team also noticed that ScyllaDB provided an 80% compression rate. Notably, the project went from concept to deployment in only 4 months.

“A system is only as strong or as performant as its underlying database solution. ScyllaDB clearly was a winner for us,” said Krishna Palati, Sr. Manager, DevOps Engineering, FireEye. “GraphDB on top of a relational database was really horrible! It would take anywhere from 30 seconds to two or three minutes for workload queries to return. ScyllaDB was a game-changer for us.”

By deploying ScyllaDB, the team has been able to achieve exceptional performance. For example, a query that traverses 15,000 graph nodes returns results in about 300ms, which is 100x faster than the legacy system.
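The arithmetic behind the 100x figure can be checked against the numbers quoted above. The short Python sketch below assumes that the 30-second to three-minute range Palati quotes for the legacy system is a fair baseline for the 300ms query. That is an assumption, since the source does not say the two measurements refer to the same query. The variable and function names are illustrative.

```python
def speedup(old_ms, new_ms):
    """Return how many times faster the new latency is than the old one."""
    return old_ms / new_ms


NEW_LATENCY_MS = 300         # ~300ms for a 15,000-node traversal (from the text)
LEGACY_LOW_MS = 30 * 1000    # "30 seconds" (from Palati's quote)
LEGACY_HIGH_MS = 180 * 1000  # "three minutes" (from Palati's quote)

low = speedup(LEGACY_LOW_MS, NEW_LATENCY_MS)
high = speedup(LEGACY_HIGH_MS, NEW_LATENCY_MS)
print(f"Speedup range: {low:.0f}x to {high:.0f}x")
```

Under that assumption, the low end of the legacy range corresponds to the 100x figure, and the slowest legacy queries would imply an even larger gap.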

FireEye measured a 10x increase in speed using ScyllaDB

While ScyllaDB Cloud seemed a good option, the team chose to run ScyllaDB themselves within a secure enclave guarded by an NGINX gateway. Today, the ScyllaDB solution is deployed on AWS i3.8xlarge instances in 7-node clusters. Each node is provisioned with 32 vCPUs, 244GB of memory, and 16TB of SSD storage. Once the team gained experience with the system, they were able to dramatically slash the storage footprint while preserving the 1000-2000% performance increase they had experienced by switching to ScyllaDB. Ultimately, they reduced AWS spend to 10% of the original cost.

But the real gain was found in business efficiency. M-Trends is the annual publication based on FireEye Mandiant frontline investigations. It reports on high-interest, significant cyber attacks across multiple industries and regions. Previously, generating the report required the efforts of 1,500 analysts working around the globe. With the new ScyllaDB-based system, the report can be generated by 5 analysts within a couple of days.

The team attributed their efficiency and success to ScyllaDB’s technical resources and community. “We are so impressed by people like Dor, who is the CEO of the company, who is jumping on and responding to customers’ questions and requests,” Palati added. “That’s unheard of!”

Based on this successful rollout, FireEye is looking for opportunities to expand their usage of ScyllaDB across the company. The team has executive buy-in to seek out opportunities to replace legacy NoSQL deployments, in particular, Apache Cassandra.