Data with Relationships Yields Insights Before Analytics

June 2, 2017 5:30 am

Dr. Jans Aasman Ph.D, CEO, Franz Inc.

Data’s utility is not rooted in amassing the largest quantities of data; it’s predicated on understanding how data relates–both to itself and to business objectives–in a timeframe that fits the speed of business.

What are Data Relationships?

Relationships between data can be formed by linking data through meaningful statements within a Semantic Graph, or triplestore, database. These relationships provide a form of machine intelligence that is essential to understanding how data elements relate to one another. Triplestores use these statements as the basis for further inferences about the way that data interrelates.

For example, a hospital’s electronic medical record (EMR) system would hold data about a patient expressed in triples like:

Patient X takes Drug Aspirin

Patient X takes Drug Insulase

Combining the EMR data with a publicly available medical drug database would add triples such as:

Chlorpropamide has the brand name Insulase

Chlorpropamide has Drug Interaction with Aspirin

Based on these data relationships, the triplestore’s reasoning can immediately conclude that Patient X is at risk of a drug interaction: Insulase is a brand name of Chlorpropamide, which interacts with Aspirin.
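The inference chain in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how a production triplestore reasons: the predicate names (`takes`, `hasBrandName`, `interactsWith`) and the helper functions are hypothetical, and the rule is hand-written rather than derived by a semantic reasoner.

```python
# Toy triple set mirroring the statements above. Predicate names such as
# "hasBrandName" are hypothetical, not taken from a real vocabulary.
TRIPLES = {
    ("PatientX", "takes", "Aspirin"),
    ("PatientX", "takes", "Insulase"),
    ("Chlorpropamide", "hasBrandName", "Insulase"),
    ("Chlorpropamide", "interactsWith", "Aspirin"),
}


def match(triples, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def generic_name(triples, drug):
    """Map a brand name to its generic drug if a hasBrandName triple exists."""
    hits = match(triples, p="hasBrandName", o=drug)
    return hits[0][0] if hits else drug


def interaction_risks(triples, patient):
    """Infer (drug, drug) pairs the patient takes that are known to interact."""
    taken = {
        generic_name(triples, o)
        for _, _, o in match(triples, s=patient, p="takes")
    }
    return sorted(
        (a, b)
        for a in taken
        for b in taken
        if a != b and match(triples, s=a, p="interactsWith", o=b)
    )
```

Calling `interaction_risks(TRIPLES, "PatientX")` on this toy data returns the single pair Chlorpropamide and Aspirin, even though no triple states directly that the patient takes Chlorpropamide. The conclusion comes from combining the EMR triples with the drug database triples.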

Such an example illustrates the immediate usefulness of relationships between data. Creating relationships between data enables “intelligent inferencing”, in other words, data learning from data. Such inferencing is invaluable when leveraged at scale to account for the numerous subtleties within big data.

Leveraging data relationships quickly, cost-effectively, and in a user-friendly format across industries is the surest way to entrench data culture and maximize data’s worth to the enterprise. Whether identifying time-sensitive investment opportunities in finance, treatment possibilities in healthcare, or apposite clinical trials in life sciences, extracting value unmistakably hinges on exploiting relationships between data.

Auto-Linking Builds Relationships Rapidly

The most effective means of determining relationships between data elements, particularly when involving unstructured and semi-structured data, is to allow the data itself to indicate those relationships.

This capacity eludes traditional relational methods, in which users have to construct tables to define all the relationships between data they can think of prior to analyzing them. If they want to ask a question that was not previously modeled with the requisite schema or if sources or requirements change, end users have to wait on IT for reconfiguration while transitory business opportunities pass.

But in a graph-aware environment, the data itself determines how its elements relate to one another. Semantic graphs link all data in a uniform fashion with an emphasis on the edges (the links between nodes), a critical factor in ascertaining relationships and one that goes beyond the capabilities of non-semantic property graphs focused solely on the nodes themselves.

Below is an e-commerce example where three online retailers (eBay, Amazon, Google) sell the same anti-wrinkle cream with slightly different pictures, quantities, and descriptions. Machine learning was built into the graph database to compute similarity between products. The purple boxes show the similarity between products; the grey boxes show the prices for these products. Note how the products might be very much the same while the prices differ widely!

Figure: e-commerce example where three online retailers (eBay, Amazon, Google) sell the same anti-wrinkle cream with slightly different pictures, quantities, and descriptions

This approach mirrors basic human intuition for recognizing relationships, works outside the fixed parameters of a traditional schema, and spans data sources, structures, and types. It provides flexibility and innate relationship awareness at the pace of modern business, which schema-bound technologies struggle to match.
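As a rough illustration of how similarity links between product listings could be computed, the sketch below uses Jaccard similarity over title tokens. This is a simple stand-in chosen for brevity, not the machine learning method used in the example above, and the listing titles, prices, threshold, and the `similarTo` edge label are all hypothetical.

```python
from itertools import combinations

# Hypothetical listings for the same product at three retailers.
# Titles and prices are invented for illustration only.
LISTINGS = {
    "ebay": {"title": "anti wrinkle cream 50 ml jar", "price": 19.99},
    "amazon": {"title": "anti wrinkle face cream 50 ml", "price": 34.50},
    "google": {"title": "anti wrinkle cream 2 x 50 ml", "price": 27.00},
}


def jaccard(a, b):
    """Jaccard similarity of the whitespace-separated token sets of two strings."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def similarity_edges(listings, threshold=0.5):
    """Emit (source, 'similarTo', target, score) edges for similar listings."""
    edges = []
    for (key_a, a), (key_b, b) in combinations(sorted(listings.items()), 2):
        score = jaccard(a["title"], b["title"])
        if score >= threshold:
            edges.append((key_a, "similarTo", key_b, round(score, 2)))
    return edges


def price_spread(listings):
    """Difference between the highest and lowest price across listings."""
    prices = [item["price"] for item in listings.values()]
    return max(prices) - min(prices)
```

Each edge produced by `similarity_edges` could be stored in the graph alongside the price of each listing, so that a query can surface near-identical products whose prices diverge.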

Because the data itself identifies the relationships between individual nodes, the data discovery process is automated: users learn which data pertains to a particular use case (and, in several instances, even how it relates). Semantic models can expedite integration by aligning data with existing standards, which can then be extended to include new data or concerns in a holistic manner. The pivotal point is that the technology is tasked with understanding the relationships for a particular purpose, so users simply reap the benefits with a clear awareness of how data relates to their specific use case.

Relationships Yield Insights BEFORE Analytics

On a basic level, graph-aware analytics are enriched with a degree of context that largely escapes users relying on other technologies. This contextualized understanding yields insight into data’s meaning prior to analytics, which significantly impacts both the questions asked and their answers.

The picture below is derived from a national chamber of commerce database. It shows how legal entities (both C-level executives and companies) relate to each other through role relationships (for example, person A was a bookkeeper for company B from time t1 to time t2), and also how some of these entities are related through common addresses, telephone numbers, and email addresses.

Figure: national chamber of commerce database showing how legal entities relate to each other

Users are able to explore their data by leveraging relationships found in wildly disparate data sets, relationships so rudimentary that they might otherwise go unnoticed. The linked data approach allows all enterprise data to be incorporated into such queries, which substantially enhances the thoroughness of analytics and its results. Given the speed at which data moves from initial ingestion to analytic output, relationship-sensitive analytics gives users highly adaptive, tailored results for increasingly targeted use cases that would otherwise be unattainable.
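The idea of entities being related through common addresses and telephone numbers can be sketched as follows. The records, identifiers, and the `shares_*` edge labels are hypothetical and only illustrate the linking step; they are not drawn from the chamber of commerce data described above.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical entity records; all values are invented for illustration.
RECORDS = [
    {"id": "person_A", "address": "1 Main St", "phone": "555-0100"},
    {"id": "company_B", "address": "1 Main St", "phone": "555-0199"},
    {"id": "company_C", "address": "9 Oak Ave", "phone": "555-0100"},
]


def shared_attribute_links(records, keys=("address", "phone", "email")):
    """Link every pair of entities that share a value for one of the keys."""
    # Index entity ids by (attribute, value) so shared values are found directly.
    index = defaultdict(set)
    for record in records:
        for key in keys:
            value = record.get(key)
            if value:
                index[(key, value)].add(record["id"])

    links = set()
    for (key, _value), ids in index.items():
        for a, b in combinations(sorted(ids), 2):
            links.add((a, "shares_" + key, b))
    return sorted(links)
```

On this toy data, `person_A` is linked to `company_B` through a shared address and to `company_C` through a shared phone number, so the two companies become indirectly connected even though no record mentions both of them.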

It’s All About Relationships!

Don’t fall for perhaps the most persistent myth in the era of big data: that the value of your data is intrinsically linked to its quantity. The common illusion is that the more data organizations obtain, the better their chances of deriving a competitive advantage.

But the reality is that an organization’s ability to effectively determine the relationships between its data is essential to monetizing data, whether big data or otherwise, structured or unstructured, within or across data sets. Understanding data’s context, meaning, and interrelations markedly improves analytics while accelerating data discovery and otherwise time-consuming integration efforts.

Failing to properly contextualize data and its meaning prior to analytics greatly limits the overall value data produces for organizations, which may then end up spending more on infrastructure and architecture than they earn from their knowledge.

Dr. Jans Aasman Ph.D is the CEO of Franz Inc., an early innovator in Artificial Intelligence and leading supplier of Semantic Graph Database technology. Dr. Aasman’s previous experience and educational background include:

– Experimental and cognitive psychology at the University of Groningen, specialization: Psychophysiology, Cognitive Psychology

– Tenured Professor in Industrial Design at the Technical University of Delft. Title of the chair: Informational Ergonomics of Telematics and Intelligent Products

– KPN Research, the research lab of the major Dutch telecommunication company

– Carnegie Mellon University. Visiting Scientist at the Computer Science Department of Prof. Dr. Allen Newell
