We’ve spent the first quarter of 2017 researching how developers build a wide range of mission-critical solutions. In our effort to better understand how open source solutions contribute to applications built on Amazon Web Services (AWS), we surveyed 117 active developers on AWS who attended the 2017 AWS Summit in San Francisco about their NoSQL database adoption.
We had the opportunity to meet more than 1,000 of the 6,500 attendees at AWS Summit in San Francisco this year, and we consider this group a significant sample because of the enormous impact that the San Francisco Bay Area has on technology, society, and the economy, both here in the United States and abroad.
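For readers who want a rough sense of how much weight 117 responses out of 6,500 attendees can carry, the usual margin-of-error formula for a sampled proportion gives a quick estimate. This is only a sketch and not the method used for this survey: it assumes a simple random sample (which a conference survey is not), a 95% confidence level (z = 1.96), and the worst-case proportion of 0.5. The function name `margin_of_error` is ours, for illustration only.

```python
import math


def margin_of_error(sample_size, population_size, proportion=0.5, z=1.96):
    """Margin of error for a sampled proportion, with finite population correction.

    Assumes a simple random sample; z=1.96 corresponds to 95% confidence.
    """
    # Standard error of a proportion for the given sample size.
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    # Correction for sampling a sizable share of a finite population.
    fpc = math.sqrt((population_size - sample_size) / (population_size - 1))
    return z * standard_error * fpc


# Figures from this post: 117 respondents out of 6,500 attendees.
moe = margin_of_error(sample_size=117, population_size=6500)
print(f"Margin of error: +/- {moe * 100:.1f} percentage points")
```

Under those assumptions the margin comes out to roughly 9 percentage points, so the percentages reported below are best read as approximate.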
NoSQL Survey highlights
Our summary of the findings includes these major highlights:
More than 50% of respondents use Docker in production, and more than 63% of respondents use some container solution. Twenty percent of respondents use more than one container service in production.
More than 90% of respondents use MySQL in production, with MongoDB, Postgres, Redis, and Apache Cassandra rounding out the top 5 databases.
As a young vendor, we’re happy to see the buds of adoption. At present, we have dozens of proof-of-concept trials of Cassandra migrations and several MongoDB ones, so we look forward to growing by hundreds of percent in the next year.
An interesting open question is how much data is stored in each of these databases; users presumably place less data in MySQL than in big data solutions. We’ll add it as a question next time.
Latency is the most common database headache, suffered by 25% of respondents, followed by complexity, scalability, and cost.
Latency, management, and scalability are the biggest database headaches for AWS developers. Let’s first take a look at latency. NoSQL databases these days are usually Java-based technologies that consume large amounts of system resources such as CPU and memory. These solutions suffer from sudden latency hiccups, expensive locking, and low throughput caused by poor processor utilization and garbage collection.
Management and scalability are tied for second place. Time is a valuable resource in any organization, and based on these results, administrators are probably losing too much of it trying to manage and scale their databases to keep up with business needs. Users are tired of endlessly tuning a huge matrix of knobs. A good database should tune itself to the workload automatically and dynamically. As for scaling, sometimes reality bites and not every database lives up to its promise. Before marrying a database, make sure it scales easily.
Activity tracking, heavy read/write, and time series lead as top NoSQL use cases.
Activity tracking and monitoring is the leading use case. That is no surprise, given how many variations of tracking there are: AdTech (tracking web visitors), cybersecurity (tracking what desktop processes touch), mobile application tracking, location-based tracking, and banks and e-commerce businesses that track buyer and visitor activity, to name a few. Almost everything is collected, measured, and analyzed, and the data needs to be consumed in real time, 24×7, at multiple physical locations.
The goal of this survey is to benchmark a statistically significant sample of data on the criteria and results of developers’ NoSQL decisions. Data is provided as a benchmark only. Questions or comments? Feel free to discuss over Twitter via @scylladb.