Graph Database

Graph Database Definition

Graph databases can be used to digitally map relationships for a host of real-world uses in enterprise and business computing. A graph database, sometimes called a semantic database, is a software application designed to query, store, and modify network graphs. Graph databases are a type of NoSQL database.

In a graph database, each entity is stored as a node, and each relationship between two nodes is represented by an edge. This network of nodes and edges, rather than tables or documents, is the structure used to store the data. A family tree is a simple example of a graph: people are the nodes, and parent-child relationships are the edges.
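The node-and-edge model described above can be sketched in a few lines of Python. This is only an in-memory illustration, not how any particular graph database stores data; the `Graph` class, the `PARENT_OF` label, and the family names are all made up for the example.

```python
from collections import defaultdict


class Graph:
    """A tiny in-memory directed graph: nodes plus labeled edges (illustrative only)."""

    def __init__(self):
        self.nodes = {}                 # node id -> dict of properties
        self.edges = defaultdict(list)  # node id -> list of (label, target id)

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties

    def add_edge(self, source, label, target):
        self.edges[source].append((label, target))

    def neighbors(self, node_id, label):
        """Return the targets reachable from node_id over edges with this label."""
        return [target for (edge_label, target) in self.edges[node_id]
                if edge_label == label]


# Hypothetical family tree: people are nodes, PARENT_OF relationships are edges.
family = Graph()
for name in ("Ann", "Bob", "Cara", "Dan"):
    family.add_node(name)
family.add_edge("Ann", "PARENT_OF", "Bob")
family.add_edge("Ann", "PARENT_OF", "Cara")
family.add_edge("Bob", "PARENT_OF", "Dan")

children = family.neighbors("Ann", "PARENT_OF")
grandchildren = [g for c in children for g in family.neighbors(c, "PARENT_OF")]
print(children)       # ['Bob', 'Cara']
print(grandchildren)  # ['Dan']
```

Finding grandchildren is just two hops along stored edges, which is the kind of traversal the rest of this page refers to.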

Graph database platforms are ideal for analyzing relationships, which is why there has been a lot of interest in using graph databases to mine connected data from social networks, power recommendation engines, and manage complex supply chains.

Graph databases can traverse relationships rapidly because the relationships are persisted in the database rather than computed with joins at query time. This makes them well suited to applications that need to create relationships between data and query them quickly. Social network analysis and fraud detection are two of the key use cases.

As an example, a social network platform might show you a list of people you may know based on a query against a graph database. The people in the results are nodes, and their connections to you, such as “friends of friends,” are paths along edges.
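A “people you may know” query of this kind can be sketched as a two-hop traversal. The following is a minimal Python illustration under assumed data; the `friends` mapping and the function name are hypothetical, and a real platform would run the equivalent query inside its graph database.

```python
# Hypothetical friendship graph: each person maps to the set of their friends.
friends = {
    "you":  {"ana", "ben"},
    "ana":  {"you", "cleo", "dev"},
    "ben":  {"you", "cleo"},
    "cleo": {"ana", "ben"},
    "dev":  {"ana"},
}


def people_you_may_know(graph, person):
    """Rank friends of friends by how many mutual friends they share with person."""
    direct = graph[person]
    mutual_counts = {}
    for friend in direct:
        for candidate in graph[friend]:
            # Skip the person themselves and anyone already connected.
            if candidate != person and candidate not in direct:
                mutual_counts[candidate] = mutual_counts.get(candidate, 0) + 1
    # Most mutual friends first; ties broken alphabetically.
    return sorted(mutual_counts.items(), key=lambda kv: (-kv[1], kv[0]))


suggestions = people_you_may_know(friends, "you")
print(suggestions)  # [('cleo', 2), ('dev', 1)]
```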

Graph Database FAQs

What is a Graph Database?

Graph databases store data using a topological data model, in which the connections between items are part of the data. Nodes in graph databases can represent companies, customers, or any other entity. Graph database administrators can create usable data models even as data volumes scale.

How do graph databases work to query information?

Different graph databases share several similarities:

Data storage
Data is recorded and represented in a topological schema
Users retrieve data using a query language

The structure of graph databases varies. Some businesses use RDF databases, a type of NoSQL graph database sometimes referred to as a triple store, which stores and retrieves “triples”: data organized as a subject-predicate-object relationship.

For example:

Joe > is friends with > Jane

If an application wants to use graph analytics to learn about Joe and the database holds data on Joe’s friends, it only needs to search the graph database for the matching triple pattern ( :Joe :is friends with ? ). There is probably other data on Joe, too, like where he shops ( :Joe :shops at ? ) or what he listens to ( :Joe :listens to ? ).
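The triple patterns above can be illustrated with a small pattern matcher in Python. This is a sketch only: real RDF stores are typically queried with SPARQL and use indexes rather than a linear scan, and the object values `CornerMarket` and `Jazz` are invented for the example.

```python
# Subject-predicate-object triples. Only the Joe/Jane friendship comes from the
# text above; the other values are hypothetical.
triples = [
    ("Joe", "is_friends_with", "Jane"),
    ("Joe", "shops_at", "CornerMarket"),
    ("Joe", "listens_to", "Jazz"),
    ("Jane", "listens_to", "Jazz"),
]


def match(store, subject=None, predicate=None, obj=None):
    """Return all triples matching the pattern; None plays the role of '?'."""
    return [
        (s, p, o)
        for (s, p, o) in store
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# ( :Joe :is friends with ? )
joes_friends = [o for (_, _, o) in match(triples, "Joe", "is_friends_with")]
# ( ? :listens to :Jazz )
jazz_listeners = [s for (s, _, _) in match(triples, predicate="listens_to", obj="Jazz")]

print(joes_friends)    # ['Jane']
print(jazz_listeners)  # ['Joe', 'Jane']
```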

Types of Graph Databases

There are two basic types of graph database data models: property graphs and knowledge graphs. Property graphs consist of nodes and edges, both of which can carry properties. Knowledge graphs place a more complex emphasis on relationships and analysis. They include RDF (Resource Description Framework) graphs, such as the example above, which emphasize data integration, focus on the semantic aspects of data, and store information in triples. RDF graphs conform to a set of standards promulgated by the World Wide Web Consortium (W3C) for representing statements, and they are best for providing rich semantics and inferences from data and for representing complex metadata and master data.

Indexing strategies for both types of graphs are generally similar although differences remain. Over time, the architectural distinctions between knowledge graphs and property graphs are likely to become less important.

Advantages and Disadvantages of Graph Databases

Among the main advantages of graph databases over relational databases is a more flexible, high-performance format for identifying and analyzing distant connections between data, based on factors such as the quality or strength of relationships. Speed is another important benefit. Because graph databases store relationships directly, relationship-heavy queries run much faster and users need not execute long chains of join operations.

The main disadvantages of graph databases are the lack of a single, universally adopted query language and a data model that is less appropriate for transaction-based systems.

Graph Database Use Cases

Fraud Detection

Real-time fraud detection systems are among the most advanced graph database applications. Graph databases highlight relationships, and queries can show when flagged credit card numbers or email addresses are being used, or when multiple people in different physical locations are associated with the same IP address or personal email address. Graph analytics helps establish patterns between nodes; here, it shows anomalous behavior among cardholders, purchase categories, purchase locations, terminals, transactions, etc.
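One of the patterns mentioned above, several accounts tied to the same IP address, can be sketched as follows. This is a simplified Python illustration with made-up login events and an arbitrary threshold; a production system would combine many more signals and run the query inside the graph database.

```python
from collections import defaultdict

# Hypothetical (account, ip_address) login events. The addresses come from
# ranges reserved for documentation.
logins = [
    ("acct_1", "203.0.113.7"),
    ("acct_2", "203.0.113.7"),
    ("acct_3", "203.0.113.7"),
    ("acct_4", "198.51.100.2"),
    ("acct_1", "198.51.100.9"),
]


def shared_ip_clusters(events, min_accounts=3):
    """Return IP addresses used by at least min_accounts distinct accounts."""
    accounts_by_ip = defaultdict(set)
    for account, ip in events:
        accounts_by_ip[ip].add(account)  # edge: account -USED-> ip
    return {
        ip: sorted(accounts)
        for ip, accounts in accounts_by_ip.items()
        if len(accounts) >= min_accounts
    }


suspicious = shared_ip_clusters(logins)
print(suspicious)  # {'203.0.113.7': ['acct_1', 'acct_2', 'acct_3']}
```

The `min_accounts` threshold is an assumption for the sketch; choosing it in practice depends on the data.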

Recommendation Engines

E-commerce recommendation engines are another example of when to use graph databases. Graph databases let users store, as a graph, the relationships between categories of data such as friends, interests, and purchase history. For example, you can see what trusted friends buy, or what people who share a hobby use to pursue it.
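The “what trusted friends buy” idea can be sketched as a walk from a person to their friends and then to the friends' purchases. The data and the scoring rule below are assumptions made for illustration, not a description of any specific recommendation engine.

```python
# Hypothetical data: who is friends with whom, and who bought what.
friends = {"you": {"ana", "ben"}}
purchases = {
    "you": {"tent"},
    "ana": {"tent", "stove", "lantern"},
    "ben": {"stove", "boots"},
}


def recommend(person):
    """Suggest items bought by friends that the person does not own yet."""
    owned = purchases.get(person, set())
    scores = {}
    for friend in friends.get(person, set()):
        for item in purchases.get(friend, set()) - owned:
            scores[item] = scores.get(item, 0) + 1  # one vote per friend
    # Items bought by more friends rank first; ties broken alphabetically.
    return sorted(scores, key=lambda item: (-scores[item], item))


recommendations = recommend("you")
print(recommendations)  # ['stove', 'boots', 'lantern']
```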

Social Network Analysis

No introduction to graph databases would be complete without a discussion of social networks and social media analysis. Social networks are the perfect use case for graph databases because they can manage multi-dimensional connections and engagements between many nodes. A social network graph analysis can determine:

Number of nodes/User activity
Connection density/User influence
Two-way engagement/Connection density and direction

Graph analytics makes it possible to identify complex patterns rapidly, for example to filter out bot accounts.
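Measures like those listed above can be computed directly from a set of edges. The sketch below uses a small, invented “follows” graph and standard definitions for a directed graph (density as edges divided by n(n-1), reciprocity as the share of edges that are returned); how a given analytics product defines these measures may differ.

```python
from collections import Counter

# Hypothetical directed "follows" edges: (follower, followed).
follows = {
    ("ana", "ben"),
    ("ben", "ana"),
    ("ana", "cleo"),
    ("dev", "ana"),
}

users = {user for edge in follows for user in edge}
node_count = len(users)

# Density of a directed graph: actual edges over possible edges n * (n - 1).
density = len(follows) / (node_count * (node_count - 1))

# Two-way engagement: the fraction of edges whose reverse edge also exists.
mutual_edges = {(a, b) for (a, b) in follows if (b, a) in follows}
reciprocity = len(mutual_edges) / len(follows)

# A simple influence proxy: who is followed by the most users.
in_degree = Counter(followed for (_, followed) in follows)
most_followed = in_degree.most_common(1)[0]

print(node_count)          # 4
print(round(density, 3))   # 0.333
print(reciprocity)         # 0.5
print(most_followed)       # ('ana', 2)
```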

Graph Database vs Relational Database

A NoSQL graph database stores data as a network graph and prioritizes relationships between data.

Relational databases store data in tables defined by rows and columns. Each row is identified by a unique primary key, and rows in other tables can link to it by storing that key as a foreign key.

Graph databases are made up of nodes and the edges that connect them. Nodes represent particular entities, and edges represent the relationships between nodes. Graph databases are designed to be scalable and flexible, and they store the data relationships themselves as data. This emphasis on data relationships helps users explore complex data sets and make connections between data points.

Relational databases place their relational focus on the columns of data tables rather than on individual data points. It is easy to add data to either kind of database. However, because relational databases require complex joins across tables to answer relationship-heavy queries, those queries are typically faster in graph databases.
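The difference can be illustrated with a toy comparison in Python. Both functions answer “who are Joe's friends?”, first in a relational style that matches keys across two tables, then in a graph style that follows stored edges. The table layout and names are hypothetical, and real relational databases use indexes and query planners rather than the plain scans shown here.

```python
# Relational style: a people table plus a link table of key pairs.
people = [(1, "Joe"), (2, "Jane"), (3, "Sam")]  # (id, name)
friendships = [(1, 2), (2, 3)]                  # (person_id, friend_id)


def friends_via_join(name):
    """Resolve names to ids, match rows in the link table, then map ids back."""
    person_ids = [pid for (pid, n) in people if n == name]
    friend_ids = [fid for (pid, fid) in friendships if pid in person_ids]
    return [n for (pid, n) in people if pid in friend_ids]


# Graph style: each node keeps direct references to its neighbors.
adjacency = {"Joe": ["Jane"], "Jane": ["Sam"], "Sam": []}


def friends_via_traversal(name):
    """Follow the stored edges; no key matching is needed at query time."""
    return adjacency[name]


print(friends_via_join("Joe"))       # ['Jane']
print(friends_via_traversal("Joe"))  # ['Jane']
```

Each extra hop (friends of friends, and so on) adds another round of key matching in the first version, while the second version only follows one more stored reference per node.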

Does ScyllaDB Offer Solutions for Graph Databases?

Yes. ScyllaDB is an ideal data storage layer for graph databases like JanusGraph, which can plug in to NoSQL databases such as Apache HBase, Google Cloud Bigtable, Oracle Berkeley DB Java Edition, Apache Cassandra, and ScyllaDB for its data storage layer. With ScyllaDB, users get low and consistent latency, high availability, up to 10x throughput, ease of use, and a highly scalable system.

A group at IBM compared using ScyllaDB as the JanusGraph storage backend vs. Apache Cassandra and HBase. They found that ScyllaDB displayed nearly 35% higher throughput when inserting vertices than HBase and almost 3X Cassandra’s throughput. ScyllaDB’s throughput was 160% better than HBase and more than 4X that of Cassandra when inserting edges. ScyllaDB performed 72% better than Cassandra in a query performance test and nearly 150% better than HBase.

Learn more about using ScyllaDB as the data storage layer for open-source graph databases.

Apache® and Apache Cassandra® are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. Amazon DynamoDB® and Dynamo Accelerator® are trademarks of Amazon.com, Inc. No endorsements by The Apache Software Foundation or Amazon.com, Inc. are implied by the use of these marks.
