NEW YORK—“When we are talking about big data today, we’re talking about a philosophical outlook,” said Sean Gourley, CTO and cofounder of Quid, a San Francisco-based analytics company, near the start of this year’s GigaOm Structure Data two-day conference here.
That outlook suggests that while technology innovations continue to bring about ways to automate the collection of data from machines, and the analysis of that data, it is people who will direct the industry’s course and its influence on society. For example, Gourley noted that analytics currently does a great job of deciding where to position a cereal box in a supermarket but a much poorer job of solving the problem of childhood obesity.
He said that was due to big data’s current tendency to reach for low-hanging fruit, rather than bringing a more human and directed perspective to analysis. “Big data has trouble with big problems,” Gourley said.
Many of the event speakers on March 20 reinforced Gourley’s thought. But the crowd appeared young, hip and optimistic about the potential for using larger and larger datasets to power cutting-edge analytics designed to improve the prospects of companies and the lives of consumers. At the same time, participants said the industry is still working to overcome challenges like formulating standards of practice around data science techniques and finding answers to data privacy questions.
More than a few presenters acknowledged that “big data” was last year’s buzzword, and that the industry’s priority today is dealing with the often-cited three V’s of data: volume, velocity and variety. “The fourth V is ‘value’,” quipped one participant, Altaf Rupani, CTO of product development at Dow Jones.
In other words, if the big data industry expects to fulfill its promise of more analytical firepower for businesses, it will need to deliver more user-friendly ways to access data in real time, and in as many flavors and colors as possible.
Some data-driven successes are already evident, of course. EMC’s chief strategist, Paul Maritz, put forth the notion of big data analysis as a disruptive force. “Amazon does not manage brands,” he said during one discussion on stage. “Amazon manages the customer relationship.” Maritz suggested that big data analysis will continue to unearth similar pivotal moments.
Among other issues that panelists discussed on the event’s first day:
The potential impact of location data and mobile devices. Gaurav Dhillon, CEO of SnapLogic, a software company that provides “smart connectors” for linking various data sources, emphasized the role that location data will play in the future, allowing retailers, for example, to determine which part of the store a customer is in and how they can help the shopper based on that location.
So valuable are mobile devices as data collectors that Ira “Gus” Hunt, CTO of the Central Intelligence Agency (CIA), called the smartphone “a mobile sensor platform,” and hammered home the importance of operational excellence when gathering petabytes of data, since “we don’t know the future value of the data.”
The value proposition of data privacy. Privacy concerns made a few prominent appearances. Michael Palmer, head of innovation at Aetna, said that “privacy concerns are at the top of the list” for the insurance giant, which must de-identify all data before analyzing it and adhere to HIPAA rules on privacy.
But participants on a panel called “Addressing the Tension Between Personalization and Privacy” were all confident that concerns over personal security would subside as the value proposition for releasing individual data improved. “People will see the benefits of sharing data,” said David Shim, founder and CEO of Placed, a location analytics firm. Ken Chahine, a senior vice president at Ancestry.com, agreed, highlighting the benefits the online genealogy firm’s customers receive in return for submitting their DNA and other personal information.
The industry’s need to collaborate. In one of the day’s more insightful presentations, Matt Wood, principal data scientist at Amazon Web Services, underlined the role that “collaboration and sharing” will play in innovation in the big data space. He counseled against the “snowflake data science” approach, asserting that “reproducibility is increasingly important” when implementing best-of-breed analytics approaches.
Alec Foege, a contributing editor at Data Informed, is a writer and independent research professional based in Connecticut, and author of the book The Tinkerers: The Amateurs, DIYers, and Inventors Who Make America Great. He can be reached at firstname.lastname@example.org. Follow him on Twitter at @alecfoege.