Dark Data Compliance: Fuhgettaboutit

by Jim Sterne   |   February 12, 2016 5:30 am

Editor’s Note: Jim Sterne is the founder of the eMetrics Summit, taking place April 3-6 in San Francisco. For more information about the event and to register, click here. Use the Data Informed code DIPAW15 for 15 percent off of your registration.

The eMetrics Summit is co-located with the Predictive Analytics World Conference. For more information about that event and to register, click here. Use the Data Informed code DIPAW15 for 15 percent off of your registration.

 

Jim Sterne, Founder, eMetrics Summit

There are a couple of bits of the European data protection law that are logical and civil, and a couple of bits that are inconceivable and impossible. Let’s take, for example, the Right to Be Forgotten. It’s simply impossible.

For decades, we in marketing have tried our very best to embrace customer centricity and the teachings of Peppers and Rogers in their 1996 book, The One to One Future. Treat each customer as an individual, just like the kindly, old shopkeeper in the general store, out West.

In all this time, the most intractable problem has been trying to match up an individual between data silos. Sales force automation systems, customer relationship management systems, database marketing efforts, accounting systems, and customer contact management systems all have been rather self-centered and not very good at sharing information.

 

The Unbearable Heaviness of Harmonizing

Through the magic of APIs, ETL, data warehouses, data lakes and clouds, we have strived mightily to get these streams to flow together and create a 360-degree view of the customer. If only we could get all of those disparate collections of attributes to play nicely with each other, we could treat our customers individually. Or, at least, treat them in segments finer than gender and age bucket. Even microsegments are more aspirational than rational.

It’s the perennial struggle. All of the sins of the schema designers are visited upon those who would free the data for the benefit of customer-kind. Data is captured and stored for unique purposes in unique ways by unique people with very unique ideas about what constitutes “clever” and “elegant.”

Once the spigots open and data starts to flow, the lakes grow, the clouds condense, and hopes rise. A giant analytics engine is then wheeled in to slice, dice, and mulch data in ways heretofore unimaginable.

A Toolbox of Analytics Modelers

But the engine is not as nimble as it might be for asking this question in this way by this analyst, so he downloads an open-source tool, pulls a sample data set and, perhaps, mingles it with another stream from elsewhere to scratch that curiosity itch.

A different analyst has a very simple report to churn out and simply uses Excel, a perfectly valid choice that results in half a dozen spreadsheets on his laptop.

The boss’s boss’s boss emails that a certain number is needed for the meeting she’s attending first thing in the morning, which causes a flurry of cubes to be generated, data to be shared, and multiple versions of reports to be dropped into a myriad of PowerPoint decks.

The above is done every day by hundreds of analysts in any given enterprise. It’s how creative people get work done. It’s a wonderful feeling, being able to access what’s needed, drop it into the most appropriate and convenient tool, and follow one’s iterative thoughts.

But this is “dark data.” It is not controlled, managed, noted, logged, classified, protected, or archived.

Can You Forget Me Now?

And then the call comes from Angela Merkel’s lawyer. She wants to exercise her right to be forgotten. After all, it’s the law.

After all the years we have tried, and failed, to get all of our data talking together for the benefit of our customers, there is now a legal requirement to get all of our data talking together in order to forget our customers.

No technical magic was delivered with the legislation. We were not able to walk on water before the EU decided this was a good idea and, lo and behold, we cannot walk on water now.

But wait, it gets better.

Another tenet of the data-protection rules is a right to data portability. That is, the right to have your personal data transferred between service providers.

Whose Data Is It, Anyway? 

If I wish to move from one health care provider to another, it would be very valuable to take my test results and health care history with me. But imagine if I asked Amazon to send all of the data about everything I have purchased and everything I have searched for … to Walmart.

Inconceivable.

The only sound louder than the entire data team laughing will be the hordes of lawyers filing injunctions.

There is a movement afoot to see to it that customers actually own their own data. The Vendor Relationship Management Project out of Harvard puts the shoe on the other foot and suggests that customers should manage the relationship with their vendors, rather than the other way around, as things stand today.

That means battle lines will be drawn between the public (just as with the European Commission) and commercial enterprises.

Businesses will be placed in a position of drawing a line in the sand somewhere between the data they have collected about customers, the data they have purchased about customers, the data they have derived or inferred about customers, and the metadata they have generated about their data about their customers. Should customers have the right to make you forget all of that?

What to Do?

For now, your best steps are to hope for the best and prepare your data to be much better documented and managed. Data-stream stewards will need to be able to verify, validate, and vindicate every bit and, when required by law, to intentionally forget about it.

Your first step in the right direction will be to create a customer data classification system in order to establish a data attribute catalog of the data you have now, the data you may collect in the future, and the data you don’t yet know it will be possible to derive.
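For the technically inclined, here is a minimal sketch of what such a catalog might look like, assuming the four kinds of customer data listed earlier (collected, purchased, derived, and metadata) serve as the classification. Every name in it (Origin, Attribute, Catalog, erasure_targets, and the sample systems and attributes) is hypothetical and made up for illustration, not taken from any product or regulation.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Origin(Enum):
    """How a piece of customer data came to exist (hypothetical classification)."""
    COLLECTED = "collected"
    PURCHASED = "purchased"
    DERIVED = "derived"
    METADATA = "metadata"


@dataclass(frozen=True)
class Attribute:
    name: str       # e.g. "email"
    system: str     # where it lives, e.g. "crm"
    origin: Origin  # which of the four kinds it is
    personal: bool  # does it identify or describe an individual?


@dataclass
class Catalog:
    attributes: List[Attribute] = field(default_factory=list)

    def register(self, attribute: Attribute) -> None:
        """Record one attribute and where it is stored."""
        self.attributes.append(attribute)

    def erasure_targets(self) -> Dict[str, List[str]]:
        """Group personal attributes by system: the places a
        request to be forgotten would have to reach."""
        targets: Dict[str, List[str]] = {}
        for attr in self.attributes:
            if attr.personal:
                targets.setdefault(attr.system, []).append(attr.name)
        return targets


# Sample entries; the systems and attribute names are invented.
catalog = Catalog()
catalog.register(Attribute("email", "crm", Origin.COLLECTED, personal=True))
catalog.register(Attribute("household_income", "marketing_db", Origin.PURCHASED, personal=True))
catalog.register(Attribute("churn_score", "analytics_lake", Origin.DERIVED, personal=True))
catalog.register(Attribute("table_row_count", "analytics_lake", Origin.METADATA, personal=False))

print(catalog.erasure_targets())
```

The catch, of course, is that a catalog only knows what has been registered. The half a dozen spreadsheets on an analyst's laptop stay dark until somebody writes them down.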

Jim Sterne is founder of the eMetrics Summit and co-founder of the Digital Analytics Association. He has consulted to some of the world’s largest companies and lectured at MIT, Stanford, USC, Harvard, and Oxford.
