When Columbus sought support for his westward sea route to the “Indies” in Southeast Asia, he was handicapped by the fact that he did not have convincing answers to a few simple questions: How far is the trip? How long will it take? What will it cost? This lack of information made it difficult to convince potential sponsors, one of which he eventually found in the Spanish Crown.
He apparently did not know the work of Eratosthenes of Cyrene, a Greek mathematician and astronomer working in Alexandria, who calculated the circumference of the earth around 240 B.C., more than 1,700 years before Columbus set sail. Had Columbus known this figure, he would not have expected to reach the Indies after so short a voyage. Even 500 years later, the term “West Indies” for the Caribbean basin sounds like a mockery of the poor great man.
You may think that in the end it didn’t matter to Columbus whether he ran ashore in America or Asia since he was able to send treasures home by the shipload. But you would certainly be ill advised to propose a project with such an uncertain outcome to any sponsor – be it a customer or a venture capitalist. It’s way too hard to strike gold by accident. Try the lottery instead.
Modern Organizations Realize They Don’t Know What They Know
Dr. Thomas Lackner, head of innovation at Siemens, once said, “If Siemens knew what Siemens knows, we would be much more efficient.”
Lackner's challenge was that when the company took on complex new projects such as large power stations, nobody knew which existing Siemens components could be reused. Redeveloping hundreds of existing components for each project is a recipe for exploding costs, missed delivery dates and, ultimately, going out of business. But in any large company it is a nontrivial challenge to “know what you know.” Thus, Siemens implemented a “techno search” on top of a cognitive search platform to provide access to Siemens’ accumulated knowledge.
Speed is of the essence in the pharmaceutical industry, where the winner takes all. The first on the market with a new drug can profit substantially, while the followers may never recoup their investments.
When working on a new drug or repositioning an existing one, every pharmaceutical company must know the best experts in its own R&D as well as outside experts in pharmacology, medicine, biology and genetics. For example, AstraZeneca CTO Nick Brown and his team implemented a cognitive search platform and real-time expert search as part of an app store of information applications on top of that platform. They were able to provide concrete results with seven “InfoApps” for their colleagues in R&D within four months.
The French Institute for Radiological Protection and Nuclear Safety (Institut de Radioprotection et de Sûreté Nucléaire – IRSN) could not afford to lose its institutional knowledge when a large wave of founding employees retired 30 to 40 years after most French reactors were built. It had to make it fast and easy for newcomers to find any information relevant to the safety and maintenance of each of France’s nuclear installations.
All these examples show the value of preserving knowledge and avoiding institutional amnesia. By serving the right information to the right people at the right time, you help them:
- Make better decisions and act accordingly
- Better serve customers
- Accelerate business processes
- Design new, more efficient business processes
- Maintain industrial installations, aircraft, cars and other complex machinery
- Perform tasks more efficiently: Big organizations lose millions per year while their employees search – often in vain – for information they need to do their jobs
How do you locate knowledge in seconds? By extracting insight from enterprise data. Note that we deliberately blend people and organizations into the “you” of this question. People must benefit from the accumulated knowledge of the organizations in which they work.
We use the term “insight” to mean relevant, meaningful information for a given user. We use the term “enterprise data” to mean internal and external information accessible to an enterprise or public organization. This data can be structured in enterprise applications and databases or unstructured as in natural language texts, images, videos and audio.
Extracting Insight from Data
Modern technologies are helping organizations find out what data is about and what it might mean to a person. Relevant information for one person may be garbage for others.
Some of these technologies include:
- Search technology locates information and retrieves it instantly through a simple user interface: natural language queries.
- Natural language processing (NLP) determines what texts are about, who wrote them and whether two texts have similar content even though they don’t use the same vocabulary.
- Machine learning (ML) algorithms cluster content into categories without human guidance; classify content according to a sample classification, without explicit rules for doing so; compute the similarity of content; provide recommendations for users interested in certain topics; and perform predictive analysis, for example using linear regression. Predictive analysis helps forecast future values of variables and identify “outliers”: implausible values that might hint, for example, at fraudulent monetary transactions or the impending failure of an engine.
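To make the last item more concrete, here is a minimal sketch of outlier detection with linear regression. It assumes a single numeric predictor and a simple rule that flags any point whose residual exceeds two standard deviations. The function names, the threshold and the sample readings are illustrative assumptions, not the method of any particular platform.

```python
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def find_outliers(xs, ys, threshold=2.0):
    """Return indexes of points that lie far from the fitted line."""
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    spread = pstdev(residuals)
    if spread == 0:
        return []  # every point sits exactly on the line
    return [i for i, r in enumerate(residuals) if abs(r) > threshold * spread]


# Hypothetical engine readings: a steady trend with one implausible value.
hours = list(range(10))
temps = [1, 3, 5, 7, 9, 11, 40, 15, 17, 19]

print(find_outliers(hours, temps))  # [6]
```

The same fitted line can also be used to predict a future value (slope * x + intercept). A production system would likely use a more robust fit, since a single extreme value pulls an ordinary least squares line toward itself.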
The Importance of Combination
Combination is important when collecting and analyzing data. You need to intelligently combine artificial intelligence (AI), NLP, statistics and search technologies to provide the most relevant information to the right people at the right time within their work contexts. Sophisticated NLP, using classical linguistics as well as AI-based NLP, extracts concepts from texts. ML algorithms can then be unleashed on the enriched collection of content. This leads to better and faster results and keeps the algorithms from going off on a tangent when the raw enterprise data alone is not rich or exhaustive enough for them to work properly.
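The benefit of enriching content before analyzing it can be sketched in a few lines. The example below assumes a tiny hand-written concept dictionary standing in for real NLP concept extraction, and cosine similarity over word counts standing in for a real ML algorithm. All names and sample texts are hypothetical.

```python
import math
from collections import Counter

# Hypothetical concept dictionary: different surface terms, same concept.
CONCEPTS = {
    "turbine": "power_plant_component",
    "generator": "power_plant_component",
    "failure": "malfunction",
    "breakdown": "malfunction",
    "outage": "malfunction",
}


def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    words = (w.strip(".,;:!?").lower() for w in text.split())
    return [w for w in words if w]


def enrich(tokens):
    """Replace each token with its concept when the dictionary knows it."""
    return [CONCEPTS.get(t, t) for t in tokens]


def cosine(a, b):
    """Cosine similarity between two token lists, based on term counts."""
    va, vb = Counter(a), Counter(b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    sa = sum(c * c for c in va.values())
    sb = sum(c * c for c in vb.values())
    return dot / math.sqrt(sa * sb) if sa and sb else 0.0


doc_a = "Turbine failure reported"
doc_b = "Generator breakdown reported"

raw = cosine(tokenize(doc_a), tokenize(doc_b))
enriched = cosine(enrich(tokenize(doc_a)), enrich(tokenize(doc_b)))
print(f"raw: {raw:.2f}, enriched: {enriched:.2f}")
```

On the raw words the two reports share only “reported” and look largely unrelated. After enrichment they map to the same concepts, so the similarity computation sees them as describing the same kind of event.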
High Performance and Scalability
To serve the ever more impatient user, a cognitive search and analytics platform must have a solid high-performance, scalable architecture. It must be able to serve organizations with more than 10,000 concurrent and intensive users in a call center where customer profiles are extracted from data scattered across 30 enterprise applications and displayed in less than two seconds. It must be able to manage 500 million documents and billions of database records. Large organizations may well have more than 350,000 users – some of them occasional, others intensive.
Steps to Add New Knowledge/Information
Knowing what you need to know, when you need it, is helpful. But what about adding knowledge or relevant information to the corpus of available data and information?
In general, humans are best at “associative thinking,” such as combining ideas that were previously unconnected to create new ideas and insight. Machines or software will not surpass humans in this field anytime soon.
Nevertheless, machines can detect patterns in vast sets of data that humans fail to see because of the sheer volume. They can find semantic similarities in sets of documents too large for any human to read in a lifetime. Machines can trace relationships between tens of thousands of people where humans would be lost in the mesh. They can then add these relationships to the body of knowledge so that they can be exploited quickly. You need a place to store this additional insight so that it can be retrieved quickly together with related information. We call this repository a “logical data warehouse.”
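As a small illustration of tracing relationships between people, the sketch below assumes that relationships have already been extracted as pairs (for example, co-authors of the same document) and uses a breadth-first search to find the shortest chain linking two people. The names, the pairs and the function names are hypothetical.

```python
from collections import defaultdict, deque


def build_graph(pairs):
    """Turn (person, person) pairs into an undirected adjacency map."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def connection_path(graph, start, goal):
    """Breadth-first search for the shortest chain from start to goal."""
    if start == goal:
        return [start]
    previous = {start: None}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        for neighbor in graph.get(person, ()):
            if neighbor in previous:
                continue  # already reached by a path at least as short
            previous[neighbor] = person
            if neighbor == goal:
                # Walk back through the recorded predecessors.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None  # no chain of relationships connects the two people


# Hypothetical co-authorship pairs extracted from documents.
pairs = [("Ana", "Ben"), ("Ben", "Chloe"), ("Chloe", "Dev"), ("Eli", "Fay")]
graph = build_graph(pairs)

print(connection_path(graph, "Ana", "Dev"))  # ['Ana', 'Ben', 'Chloe', 'Dev']
print(connection_path(graph, "Ana", "Fay"))  # None
```

A chain found this way is exactly the kind of derived relationship that would be stored alongside the source data so that it does not have to be recomputed for every query.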
The Winning Combination: Humans and Computers
The best results in most disciplines are obtained by intelligently combining the strengths of humans and machines. This point is best illustrated by the history of computers in chess. The world was stunned when IBM’s Deep Blue won a match against then world champion Garry Kasparov in 1997.
However, in the first mixed tournament, where humans with computers played against humans or against computers alone, none of the stars won. Neither a grandmaster nor a supercomputer won. The winners were two students with three laptops. The students were not as good as the grandmasters, and their laptops were not as powerful as Deep Blue. But the intelligent combination of humans and computers enabled them to beat the best humans and the best computers.
Working together, computers and humans combine their cognitive strengths to capture, analyze and produce insights that are no longer lost, so that no one ever again has to set sail without knowing where they are going and when they will arrive. Amnesia and accidental discoveries will be a thing of the past.
Hans-Josef Jeanrond, CMO of Sinequa, brings more than 25 years of experience in IT marketing to the job, having served as marketing director at SAP France for 6 years and as marketing director for a whole range of other companies. His training as a computer scientist at the universities of Saarbrücken (Germany), Oxford and Edinburgh, and more than 15 years’ experience in software R&D, enable him to mediate between the technical world and that of end users interested in the business value of IT innovations.