Graph Databases and the Connections in Data [Updated]

by   |   March 31, 2016 12:30 pm   |   2 Comments

Emil Eifrem, CEO, Neo Technology

As the NoSQL sector continues to attract attention, graph databases are generating real and lasting excitement. In fact, interest in this sector has grown by a whopping 500 percent in the last two years alone. Forrester Research has reported that graph databases – the fastest-growing category in database management systems – will reach more than 25 percent of enterprises by 2017.

Despite the market momentum of graph databases, some people still consider them mysterious. But graph databases rely on intuitive principles that resemble tasks we perform every day. If you have ever worked out a route on a mass transit map or followed a family tree, you have manually run your own graph-based query. Relational database management systems, by comparison, have a steep learning curve.
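To make the transit-map comparison concrete, here is a minimal Python sketch of the kind of query a rider runs by eye. The station names and the `TRANSIT_MAP` layout are invented for illustration, and breadth-first search is just one common way to find a route with the fewest stops. It is not how any particular graph database works internally.

```python
from collections import deque

# Hypothetical transit map: each station lists the stations one stop away.
TRANSIT_MAP = {
    "Airport": ["Central"],
    "Central": ["Airport", "Museum", "Harbor"],
    "Museum": ["Central", "University"],
    "Harbor": ["Central", "University"],
    "University": ["Museum", "Harbor", "Stadium"],
    "Stadium": ["University"],
}


def shortest_route(graph, start, goal):
    """Breadth-first search: return a route with the fewest stops, or None."""
    queue = deque([[start]])  # each queue entry is the path walked so far
    visited = {start}
    while queue:
        path = queue.popleft()
        station = path[-1]
        if station == goal:
            return path
        for neighbor in graph.get(station, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None  # goal is not reachable from start


route = shortest_route(TRANSIT_MAP, "Airport", "Stadium")
print(" -> ".join(route))
```

The search only ever follows connections outward from the current station, which is the same thing a finger does when it traces a line on the printed map.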

In fact, it’s likely that you have come across a product or service powered by a graph database within the last few hours. Many everyday businesses have created new products and services and re-imagined existing ones by bringing data relationships to the fore. That’s because graph databases model, store, and query both data and their relationships, which is crucial for next-generation applications that feature use cases such as real-time recommendations, graph-based search, and identity and access management.

For example, Walmart, which deals with almost 250 million customers weekly through its 11,000 stores across 27 countries and through its retail websites in 10 countries, wanted to understand the behavior and preferences of online buyers with enough speed and depth to make real-time, personalized, “you may also like” recommendations. By using a graph database, Walmart is able to connect masses of complex buyer and product data quickly to gain insight into customer needs and product trends.
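A “you may also like” recommendation can be pictured as a short walk over purchase relationships: from a customer to the products they bought, on to other customers who bought the same products, and then to what else those customers bought. The Python sketch below illustrates that idea with made-up data. The names `PURCHASES` and `you_may_also_like` are hypothetical, and this is not a description of Walmart’s actual system.

```python
from collections import Counter

# Hypothetical purchase relationships: customer -> products bought.
PURCHASES = {
    "alice": {"tent", "sleeping_bag", "lantern"},
    "bob": {"tent", "lantern", "camp_stove"},
    "carol": {"tent", "camp_stove", "cooler"},
    "dave": {"blender"},
}


def you_may_also_like(purchases, customer, limit=3):
    """Recommend products bought by customers who share a purchase with `customer`."""
    owned = purchases[customer]
    scores = Counter()
    for other, products in purchases.items():
        # Skip the customer themselves and anyone with no product in common.
        if other == customer or not (owned & products):
            continue
        # Each similar customer "votes" for the products we do not own yet.
        for product in products - owned:
            scores[product] += 1
    # Most votes first; ties broken alphabetically so the result is stable.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [product for product, _ in ranked[:limit]]


print(you_may_also_like(PURCHASES, "alice"))
```

In this toy data, two customers who overlap with alice bought a camp stove and one bought a cooler, so the camp stove ranks first. A production system would weigh far more signals, but the shape of the query, hopping across relationships, is the same.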

Zephyr Health, a San Francisco-based software company offering a data analytics platform for pharmaceutical, biotech, and medical device companies, sought to enable customers to unlock more value from their data relationships. Doing so would enable pharmaceutical companies, for example, to find the right doctors for a clinical trial by understanding relationships among a complex mix of public and private data such as specialty, geography, and clinical trial history.

Old-school SQL databases were not up to the task. Traditional SQL databases don’t handle data relationships well, and most NoSQL databases don’t handle data relationships at all. Nor are they well equipped to handle data that’s always changing – such as streams of new information coming in from doctors’ surveys.

Zephyr turned to a graph database for its capability and scale. Graph databases are designed to easily model and navigate networks of data with extremely high performance.

To fully appreciate the value of the graph, consider that early adopters of graph databases such as Facebook and LinkedIn became household names and unrivaled leaders in their sectors.

A “graph” can be thought of like a whiteboard sketch: When you draw on a whiteboard with circles and lines, sketching out data, you are drawing a graph. Graph databases store and process data within the structure you have drawn, providing performance advantages and making it easy to evolve the data model.
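As a rough illustration of the whiteboard idea, the Python sketch below stores circles as nodes and lines as relationships that point directly at other nodes. The class and function names (`Node`, `Relationship`, `connect`, `neighbors`) and the sample data are assumptions made for this example. It is an in-memory toy, not the storage format of any real graph database.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """A circle on the whiteboard: a label plus free-form properties."""
    label: str
    properties: dict = field(default_factory=dict)
    relationships: list = field(default_factory=list)  # outgoing lines


@dataclass(eq=False)
class Relationship:
    """A line on the whiteboard: a named, directed link to another node."""
    kind: str
    target: Node
    properties: dict = field(default_factory=dict)


def connect(source, kind, target, **properties):
    """Draw a line of the given kind from source to target."""
    source.relationships.append(Relationship(kind, target, properties))


def neighbors(node, kind):
    """Follow every outgoing line of the given kind."""
    return [rel.target for rel in node.relationships if rel.kind == kind]


ann = Node("Person", {"name": "Ann"})
ben = Node("Person", {"name": "Ben"})
acme = Node("Company", {"name": "Acme"})
connect(ann, "KNOWS", ben, since=2012)
connect(ann, "WORKS_AT", acme)
connect(ben, "WORKS_AT", acme)

# Evolving the model: a new kind of node and line needs no schema change.
city = Node("City", {"name": "Springfield"})
connect(acme, "LOCATED_IN", city)

print([n.properties["name"] for n in neighbors(ann, "KNOWS")])
```

Because each relationship holds a direct reference to its target, following a line is a lookup on the node itself rather than a search through a separate table, and adding the `LOCATED_IN` line did not require touching any existing node.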

The Seven Bridges Puzzle

Far from being a recent data-handling development, graph theory is nearly 300 years old and can be traced to Swiss mathematician Leonhard Euler. Euler was looking to solve an old riddle known as the “Seven Bridges of Königsberg.” Set on the Pregel River, the city of Königsberg included two large islands connected to each other and the mainland by seven bridges. The challenge was to map a route through the city that would cross each bridge exactly once. Euler realized that by reducing the problem to its basics, eliminating all features except landmasses and the bridges connecting them, he could develop a mathematical structure that proved such a walk was impossible.

Today’s graphs are based entirely on Euler’s design: each landmass is now referred to as a “node” (or “vertex”), while each bridge is a “link” (also known as a “relationship” or an “edge”). With graph databases, end users do not need to know anything about graph theory to experience immediate practical benefits.
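For readers who are curious about the theory anyway, Euler’s argument can be checked in a few lines of Python. The sketch below encodes the seven bridges as links between four nodes and counts how many bridges touch each landmass. In a connected graph, a walk that crosses every link exactly once can exist only if zero or two nodes have an odd number of links. The node labels and function names are illustrative, and the check assumes the graph is connected, as Königsberg was.

```python
from collections import Counter

# The seven bridges as links between four landmasses (labels are illustrative).
BRIDGES = [
    ("north_bank", "island_a"),
    ("north_bank", "island_a"),
    ("south_bank", "island_a"),
    ("south_bank", "island_a"),
    ("island_a", "island_b"),
    ("north_bank", "island_b"),
    ("south_bank", "island_b"),
]


def node_degrees(edges):
    """Count how many bridge ends touch each landmass."""
    degrees = Counter()
    for a, b in edges:
        degrees[a] += 1
        degrees[b] += 1
    return degrees


def has_euler_walk(edges):
    """True if a connected graph could be walked using every edge exactly once.

    Assumes the graph is connected; only the odd-degree condition is checked.
    """
    odd_nodes = [node for node, degree in node_degrees(edges).items() if degree % 2 == 1]
    return len(odd_nodes) in (0, 2)


degrees = node_degrees(BRIDGES)
print(dict(degrees))
print(has_euler_walk(BRIDGES))
```

All four landmasses turn out to have an odd number of bridges (five for one island, three for each of the others), so the check fails and no such walk exists. Dropping any single bridge leaves exactly two odd nodes, at which point a walk becomes possible.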

Everyday Use

Graphs are a vital part of our online lives, powering everything from social media sites – including Twitter and Facebook – to the retail recommendations on eBay. Online dating also owes much of its success to the way graphs can analyze even the most complex relationships, looking not only at location and personal details, but also passions, hobbies, and attitudes, and relationships between all of those things, to identify potential matches. In addition, enterprise efforts in fraud detection, master data management, and network and IT operations are vastly improving thanks to relationship-based insights rooted in graph database usage.

Interest in the graph will continue to grow. The real-time nature of a graph database makes it an excellent platform for unlocking business value from data relationships that simply can’t be identified using traditional SQL or most NoSQL databases. The uses and applications for graph databases seem endless, and it’s exciting to consider what innovations they will continue to power as the world unlocks the value of data relationships.

Emil Eifrem is CEO of Neo Technology and co-founder of Neo4j, the world’s leading graph database. Before founding Neo, he was the CTO of Windh AB, where he headed the development of highly complex information architectures for Enterprise Content Management Systems. Committed to sustainable open source, he guides Neo along a balanced path between free availability and commercial reliability. Emil is a frequent conference speaker and author on NoSQL databases, and tweets at @emileifrem.


  1. Adam Butkus
    Posted October 3, 2015 at 1:01 am | Permalink

    This is an excellent intro to graph databases. My group is redefining our products because we realize that this tech will scale much better and provide the kinds of answers people need. Nice job, Emil.

  2. Posted October 6, 2015 at 8:35 am | Permalink

    We build EDM systems, and a lot of the added value we bring to the table depends on our ability to model client data as some flavor of a directed graph.

    We do it all in SQL, and I am somewhat skeptical about the “revolutionary” nature of the graph DB. Isn’t it all just rows and columns at some level? How much of the graph DB “magic” is really just an encapsulation of traversal queries I write every day against my SQL Server data models?

    Or am I missing something fundamental about the new technology?
