Dice, a leading tech-industry job site, has unveiled a new service that gathers, sifts and analyzes millions of personal profiles to help recruiters identify candidates for hard-to-fill IT jobs. While the OpenWeb service, announced January 30, promises to make online background searches of job seekers more efficient, it also raises privacy questions.
OpenWeb is based on a database that culls information from 400 million to 500 million personal profiles on more than 40 social and professional networking sites to let recruiters search for the candidates they want as well as check the background of those they consider, according to the company.
“What we’re trying to do is give our recruiter customers the ability to see as complete a picture as possible of people’s background: their hard skills and experience and the softer side, the ambitions or preferences they express online,” said Scot Melland, chairman, president and CEO of Dice Holdings Inc.
Dice software parses and analyzes the data to find all the information relating to a single person and combines all the verifiable data points into a single “super profile” far richer than those usually offered by job seekers themselves, according to Dice CTO Bennett Smith.
Each piece of information has to be verified as belonging to a single person by appearing in profiles from three different sources under the same name or other identifying criterion, Smith said.
OpenWeb “generically, is the ‘John Smith’ problem solved,” Smith said, referring to the difficulty in verifying which “John Smith” among dozens is the real owner of a particular profile. “It’s basically the ability to put two pieces of information together with the confidence they’re both about the same person.”
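As a rough illustration of the three-source rule Smith describes, the Python sketch below keeps a data point only when it shows up under the same identifier in at least three different sources. Dice has not published its matching logic, so this is one possible reading of the rule, not the company's implementation; the function name, the tuple layout and the sample sites and values are all hypothetical.

```python
from collections import defaultdict

MIN_SOURCES = 3  # threshold taken from Smith's description of the rule


def verified_data_points(observations, min_sources=MIN_SOURCES):
    """Return {identity_key: {field: [values]}} for corroborated data points.

    `observations` is an iterable of (identity_key, source, field, value)
    tuples. This layout is an assumption made for the sketch.
    """
    # Collect the distinct sources that report each (person, field, value).
    sources_seen = defaultdict(set)
    for identity_key, source, field, value in observations:
        sources_seen[(identity_key, field, value)].add(source)

    # Keep only data points reported by enough different sources.
    profiles = defaultdict(lambda: defaultdict(set))
    for (identity_key, field, value), sources in sources_seen.items():
        if len(sources) >= min_sources:
            profiles[identity_key][field].add(value)

    # Convert to plain dicts with sorted value lists for stable output.
    return {
        key: {field: sorted(values) for field, values in fields.items()}
        for key, fields in profiles.items()
    }


# Hypothetical sample data: one skill seen on three sites, one city on two.
sample = [
    ("john.smith@example.org", "site_a", "skill", "Java"),
    ("john.smith@example.org", "site_b", "skill", "Java"),
    ("john.smith@example.org", "site_c", "skill", "Java"),
    ("john.smith@example.org", "site_a", "city", "Austin"),
    ("john.smith@example.org", "site_b", "city", "Austin"),
]
print(verified_data_points(sample))
```

In this sample the skill survives because three sites agree on it, while the city is dropped because only two do. Repeated mentions on the same site count once, since sources are stored in a set.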
A single “super profile” could contain data from LinkedIn (such as a candidate’s resume and professional connections), Facebook (personal history, likes, dislikes, photos and friends), as well as a candidate’s level of knowledge and activity from comments, questions or answers posted at online communities for application developers and technical sites such as GitHub and SourceForge.
OpenWeb profiles are constantly updated with new information from the 400 million to 500 million profiles Dice culled from other websites, which make up the base of raw data used to generate new profiles. Neither the subjects of the profiles nor the sites that hosted them are consulted. Dice runs searches against other websites in the same way Google or other search engines would, Melland said.
“What makes us feel comfortable about this is that all this information is publicly available. We’re not going behind firewalls to gather information. It’s the same information recruiters could get by searching on their own for a candidate,” Melland said.
Six Years in Development
The analytics that allow the service to work are the result of more than six years of development of natural-language processing engines and analysis, which Dice executives consider so uniquely valuable that they bought the small software company that built the core features. The company, WorkDigital Ltd., builds cloud-based analytics focusing on natural-language processing, semantics and metadata to make profiles from many sites searchable from one interface, as demonstrated at its TheSocialCV.com.
OpenWeb is currently in beta testing but should become generally available to Dice’s recruiter customers during the second quarter of this year. When the service launches, only recruiters will have access to the profiles. Dice does plan a second service, probably to debut later this year, that will let the subjects of the profiles see their own information collected in OpenWeb and possibly modify it, Melland said.
It’s not clear yet how much control people would have over their own profiles or how they could make corrections. Users wanting to make changes would have to verify their identities and the information, though how and when that will happen depends on “a whole gamut of [product] decisions and how much interaction is appropriate,” Smith said.
In an online demonstration of the OpenWeb service, Melland pulled up the profile of a Java developer with a specific amount of experience and geographical location. The candidate’s involvement in developer communities reveals details about his professional engagement, he said.
“You can see he is involved in projects on GitHub, which shows he’s a passionate developer, it’s not just his day job,” Melland said. “You can see he’s very involved in Stack Overflow issues and that the community thinks his answers are pretty good from the way his comments are rated in the communities.”
Online Profiles May Offer What a Hiring Manager Can’t Ask
There may be value for recruiters in aggregating data about candidates from many profiles, but there’s also potential risk for the recruiters and hiring managers who might run afoul of federal rules designed to prevent discriminatory hiring, according to Aleecia McDonald, director of privacy research at Stanford University’s Center for Internet and Society. McDonald is the former co-chair of the World Wide Web Consortium group working on a standard technical implementation for Do Not Track features in Web browsers, and a consultant for Mozilla on the Do Not Track project.
“In the United States there is a whole list of protected classes employers can’t even ask a candidate about,” McDonald said, citing questions about marital status, sexual orientation, ethnic or religious background and other characteristics against which employers are not allowed to discriminate.
“A lot of employers are using Facebook today to get extra information on candidates, but there is a risk to that: You can wind up learning information you should not have. If that happens it’s very difficult to document that you did not use that information in hiring decisions,” McDonald said.
In addition to potential charges of discrimination, by republishing data from other sites Dice is bypassing the promises each of those sites made about how they will re-use that data, potentially violating those agreements without giving either sites or individuals the chance to opt in, she said.
Online background searches are routine, but “the goal is to get information that is more personal,” according to Michael Gualtieri, analytics analyst at Forrester Research, Inc. “Companies are looking for non-obvious information, connections they might not see otherwise that could become part of a hiring decision.”
“Facebook is working on something similar with its Facebook Network Search, but that’s only one site,” Gualtieri said. “Gathering even public information together on a large scale would help look for those [pieces of] non-obvious information, but also makes that information a lot more public,” he said.
Gualtieri said that while OpenWeb could identify job candidates whom Dice calls passive job seekers, the resulting recruiting effort could appear unsolicited. “If someone calls me with some offer I didn’t ask about, it might not seem like a compliment. It might just be spam,” he added.
Melland disagreed, saying few people would consider an unsolicited job offer to be spam. Copyright and privacy issues don’t really apply to the profiles Dice builds, he said, because all the data in OpenWeb is publicly available from the host sites, meaning profile subjects have already chosen to publish it.
Still, ownership of the data could become an issue, McDonald said.
“If someone posts something in Facebook and Dice scoops it up and profits from that data, the people who posted that data could claim they have copyright and say not to take that information,” McDonald said. “Facebook’s user agreement also claims copyright, so if it shows up on Dice, Facebook might want to have something to say about it, too.”
Kevin Fogarty is a veteran writer, editor and analyst whose work has appeared in CNN.com, InformationWeek, CIO, Computerworld, Network World and other news and technology sites. Reach him at email@example.com or on Twitter at @kevinfogarty.