Advancements in a number of technologies over the past 15 years have enabled enterprises to access and analyze data like never before.
Yet one increasingly important factor in leveraging data is decidedly low-tech: people, or more specifically, a crowd of people.
Crowdsourcing, the use of a large number of people to create, collect, filter and analyze huge amounts of data, is becoming a viable labor option because Internet-enabled workers anywhere in the world can contribute to a data project by taking on specific tasks, such as testing the accuracy of an algorithm.
This kind of crowdsourcing differs from Wikipedia, the online encyclopedia that allows anybody who can connect to the site to add and edit information. With the exception of its small staff, Wikipedia’s contributors are unpaid volunteers. In contrast, a number of crowdsourcing service providers now offer their own paid “crowds” to enterprises that need huge amounts of data collected and manipulated.
“The brilliant thing about crowdsourcing is it can do just about anything,” says David Bratvold, founder of Daily Crowdsource, a website devoted to crowdsource-related news and education.
That includes handling huge amounts of data, Bratvold says.
“You can use crowdsourcing to do big data analysis,” he says. “You can use crowds to collect data. Say you want to collect thousands of email addresses; crowdsourcing can do that. If you want to cleanse it, make sure it’s accurate, validate it, crowds can do all of that too.”
Bratvold says that top players in the field include Serv.io, TopCoder, Amazon Mechanical Turk, Smartlink, Lingotek and Lionbridge Technologies. The website Crowdsourcing.org includes a list of nearly 2,500 crowd labor providers covering a wide range of tasks.
Lionbridge, a $450 million Massachusetts-based public company that got its start in 1996 providing translation and localization services, launched its crowdsourcing division about 10 years ago. That division now generates about $150 million in revenue, according to Dori Albert, Lionbridge’s enterprise crowdsourcing practice manager.
“We built a portal so we could hire translators as contractors from all over the globe,” Albert says. “Then some of the more forward-thinking tech companies—the Googles and Microsofts of the world—asked if we could build them a virtual team of workers that they would pay by task to do all kinds of things, but essentially mostly search relevance testing, rating search results to test algorithms.”
From that beginning, Lionbridge has assembled a “crowd” of more than 100,000 Internet users based in more than 100 countries. Albert says Lionbridge keeps them busy.
“The response to our business process crowdsourcing service has been unbelievable,” she says. “We’re having a hard time keeping up with demand right now.”
Many Hands Make the Repetitive Business Process Lighter
Bratvold says it’s important to understand that “crowdsourcing isn’t a solution, it’s a work process.”
“It’s ideal for stuff that’s very repetitive,” he says. “What you do with crowdsourcing is you break it down into tiny steps. You want to tag half a million images? Rather than hire one guy to do them all, hire half a million people and be done by the afternoon.”
Crowdsourcing repetitive, process-oriented tasks can save an enterprise time and money, Bratvold says, adding that the process can improve the quality of the data.
“Just imagine you personally tagging 4,000 photos,” he says. “After a while you start wondering, ‘Did I just tag those last 10 photos correctly?’ Then you’ve got to go back and check. And before too long you’re going to start tagging them wrong. It’s just the nature of things.”
Crowdsourcing a data project helps reduce human error by spreading the load, Bratvold says.
“You can even have someone else tag it a second time just to ensure it’s accurate,” he says. “The process removes so much inefficiency that you can’t help but save money.”
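The second-pass check Bratvold describes can be sketched in a few lines of code. The example below is a minimal illustration, not any provider's actual method: it assumes each photo is tagged independently by several workers, accepts a tag only when enough of them agree, and sets everything else aside for review. The function name `consensus_tags`, the `min_agreement` threshold and the sample data are all hypothetical.

```python
from collections import Counter


def consensus_tags(submissions, min_agreement=2):
    """Pick one tag per item by majority vote among independent workers.

    submissions: iterable of (item_id, worker_id, tag) tuples (assumed format).
    Returns (accepted, needs_review): accepted maps item_id -> winning tag;
    needs_review lists items where no single tag reached min_agreement votes.
    """
    votes = {}
    for item_id, worker_id, tag in submissions:
        # Keep one vote per worker per item so a repeat submission cannot stack votes.
        votes.setdefault(item_id, {})[worker_id] = tag.strip().lower()

    accepted, needs_review = {}, []
    for item_id, by_worker in votes.items():
        ranked = Counter(by_worker.values()).most_common(2)
        top_tag, top_count = ranked[0]
        tied = len(ranked) > 1 and ranked[1][1] == top_count
        if top_count >= min_agreement and not tied:
            accepted[item_id] = top_tag
        else:
            needs_review.append(item_id)
    return accepted, needs_review


# Hypothetical sample data: two photos, each tagged by more than one worker.
sample = [
    ("photo-1", "w1", "dog"),
    ("photo-1", "w2", "Dog"),
    ("photo-1", "w3", "cat"),
    ("photo-2", "w1", "dog"),
    ("photo-2", "w4", "car"),
]
accepted, needs_review = consensus_tags(sample)
print(accepted)      # {'photo-1': 'dog'}
print(needs_review)  # ['photo-2']
```

Raising `min_agreement` trades cost for confidence: each photo must be tagged by more workers before a label is accepted, and more photos end up in the review pile.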
Of course, assembling an anonymous crowd to perform menial tasks for small amounts of pay introduces potential quality control issues. To draw on Bratvold’s example, what’s to stop a lazy crowd laborer from tagging every photo they handle with the word “dog”?
Albert says that’s why it’s imperative to offer large customers a level of quality control.
“What we have found is that if you want to provide enterprise-level services to a business, no one is going to buy a workforce where they have no idea who it is,” she says. “We’re taking the concept of crowdsourcing and blending that with our project management, governance and service levels that customers would be used to from a large traditional outsourcing engagement. That’s what differentiates us from the ‘wild crowd.’”
Bratvold says enterprises that want to crowdsource data projects “need to learn what they’re doing.”
“I always tell people to start very slow because crowdsourcing will speed up whatever you’re doing,” he says. “So if you’re doing something right it will speed up and exaggerate that, and if you’re doing something wrong it will speed up and exaggerate that.”