Companies that are delving into big data may be putting customer privacy and corporate intellectual property at risk because they haven’t thought through how their data handling practices need to change.
Just mashing up data sets can be risky, says Ian Glazer, research vice president with Gartner. “Anytime you aggregate information, you aggregate the risk associated with that information,” he says. Even when information just sits in a data warehouse, the likelihood that it will be disclosed accidentally is greater than if it were siloed.
Meanwhile, deep data mining techniques may expose information about individuals that used to be anonymous. “Computer science literature has asserted very convincingly that data we once thought were de-identified could be re-identified and linked back to individuals with relative ease,” says Omer Tene, managing director of Tene and Associates and a visiting fellow at the Berkeley Center for Law and Technology.
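To make the re-identification risk concrete, here is a minimal sketch of a linkage attack: records with names removed are matched against a named data set through fields the two have in common. Everything in it is invented for illustration, including the records, the field names, the choice of quasi-identifiers and the `link_records` function; it is not taken from the research Tene refers to.

```python
# Fields that are not names but can still single a person out when combined.
# This particular set is an assumption made for the example.
QUASI_IDENTIFIERS = ("zip", "birth_date", "sex")


def link_records(deidentified, named):
    """Return (name, record) pairs where a nameless record matches exactly one named person."""
    index = {}
    for person in named:
        key = tuple(person[q] for q in QUASI_IDENTIFIERS)
        index.setdefault(key, []).append(person["name"])

    matches = []
    for record in deidentified:
        key = tuple(record[q] for q in QUASI_IDENTIFIERS)
        names = index.get(key, [])
        if len(names) == 1:  # a unique match means the record is re-identified
            matches.append((names[0], record))
    return matches


# Made-up "de-identified" records: no names, but quasi-identifiers remain.
medical = [
    {"zip": "02139", "birth_date": "1970-03-14", "sex": "F", "diagnosis": "asthma"},
    {"zip": "02139", "birth_date": "1982-11-02", "sex": "M", "diagnosis": "diabetes"},
]

# Made-up named records, standing in for a public or purchased list.
directory = [
    {"name": "Alice Example", "zip": "02139", "birth_date": "1970-03-14", "sex": "F"},
    {"name": "Bob Example", "zip": "02139", "birth_date": "1982-11-02", "sex": "M"},
    {"name": "Carl Example", "zip": "02139", "birth_date": "1982-11-02", "sex": "M"},
]

reidentified = link_records(medical, directory)
```

In this toy data, only the first medical record maps to a single person; the second matches two people and stays ambiguous. The more fields two data sets share, the fewer such ambiguous cases remain.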
The results may also create new intellectual property that corporate policies don’t cover. All of which has the potential to make big data a big minefield as companies and individuals sort out how to balance business needs with privacy. “We are on the cusp of greater regulation,” says Matthew Karlyn, a Boston-based partner with the law firm Foley & Lardner.
Companies that ask the right questions now about how employees use data, how they manage new data-based products and services, and how customers expect their personal information to be handled will be in a better position to protect customer privacy and corporate trade secrets.
1. Do employees understand the rules?
Insights from analyzing big data may come from the integration of many previously siloed data sources as well as the addition of new sources. Meanwhile, each data set may be governed by different rules, depending on where it was collected and why, and whether it’s subject to regulation or licensing agreements.
Access controls and corporate privacy policies go only so far, says Glazer: “We provide our information workers with very few clues” about the data and how its use may be restricted. He compares the typical enterprise environment to a chemistry lab full of unlabeled jars: workers eventually learn what’s in the jars only by making (possibly damaging) mistakes. “They don’t have the right information they need to handle [data] properly.”
Good metadata about why information has been collected, the company’s obligations to protect it and the parameters for using it can help to prevent misuse. “People will respect other people’s privacy so long as they realize” which uses are appropriate.
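As a rough illustration of what such metadata might look like in practice, the sketch below attaches a label to each data set recording why it was collected, what obligations apply and which uses are allowed, then checks a proposed use against the label. The class, its fields, the example data sets and the rule that a mash-up inherits only the uses every source permits are all hypothetical; a real policy would be set by a company's legal and privacy teams.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetLabel:
    """Hypothetical label describing how a data set may be used."""
    name: str
    collected_for: str                       # why the data was gathered
    obligations: tuple = ()                  # e.g. regulations or license terms
    permitted_uses: frozenset = frozenset()  # uses the company has approved


def is_use_permitted(label, proposed_use):
    """True if the proposed use appears on the data set's approved list."""
    return proposed_use in label.permitted_uses


def combined_permitted_uses(labels):
    """For combined data sets, allow only the uses that every source permits."""
    labels = list(labels)
    if not labels:
        return frozenset()
    allowed = set(labels[0].permitted_uses)
    for label in labels[1:]:
        allowed &= label.permitted_uses
    return frozenset(allowed)


# Made-up example labels.
orders = DatasetLabel(
    name="orders",
    collected_for="billing",
    obligations=("payment processor contract",),
    permitted_uses=frozenset({"billing", "fraud_detection"}),
)
support = DatasetLabel(
    name="support_tickets",
    collected_for="customer service",
    permitted_uses=frozenset({"customer_service", "fraud_detection"}),
)

mashup_uses = combined_permitted_uses([orders, support])
```

Here a worker who wanted to use the order data for marketing would get a clear "no" from `is_use_permitted(orders, "marketing")`, and the combined data would be cleared only for fraud detection. The labels on the jars, in Glazer's analogy, travel with the data.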
2. Do you know how you’ll protect your results?
Existing practices for handling data may not address what workers are allowed to do with the results of deep data mining, says Karlyn, who represents companies in IT and outsourcing initiatives. The algorithm that generates new insight into customer behavior, or the insight itself, might be a trade secret. But you can’t assume developers know that the analytics tools they are building make data more valuable to the company and therefore shouldn’t be shared.
“Before the concept of big data, we gathered data, we stored the data, but we didn’t really do anything with it,” says Karlyn. “Now we’re manipulating it and looking at what we’re collecting [for] how to monetize it to give the company competitive advantage.” In addition, if companies use data to create new products or services, he says, they will need to evaluate whether existing data collection policies should be revised to cover these new uses.
Meanwhile, companies that sell big data-based products need to think through how they’ll license them so that customers won’t misuse the underlying information. Data brokers, for example, may spell out how the data they provide can be used.
Take LexisNexis, says Gartner’s Glazer. The company offers a variety of analytics and risk management services, including background checks. “They Hoover up an amazing amount of information and process it,” Glazer says. “They have a lot of controls about what can and can’t be combined.” In addition, the company vets its customers, a spokesman says, to “affirm that the prospective user is a legitimate organization with a permissible purpose.”
3. Are you being straight with your customers?
In addition to complying with privacy policies, any big data project should take into account the cultural norms of your industry and customer expectations, says Glazer. “You can do something totally legal but it feels unsettling.” For example, targeted online advertising that follows a user from website to website can make some users uncomfortable.
Even using publicly available information doesn’t let you off the hook. “The information might be public, but what isn’t public is the proprietary output of the analysis,” Glazer adds. By correlating information from many sources, analysts can reveal more about a person than any one source reveals on its own.
Tene says companies and individuals need to find a way to balance their interests. Beneficial innovations may depend on businesses and researchers having access to massive data collections, he and co-author Jules Polonetsky, director of the Future of Privacy Forum, argue in a Stanford Law Review Online article published in February.
“One way to reconcile this tension is providing individuals with greater access to their own information,” Tene says, “and greater transparency about the decisional criteria companies are using to analyze the data. If businesses let individuals have a piece of the pie, then [they] will be less concerned about sharing information.”
Elana Varon is contributing editor with Data Informed. Tell her your stories about leading and managing data driven organizations at firstname.lastname@example.org. Follow her on Twitter @elanavaron.