It’ll take more than world-class researchers and deep-pocketed pharmaceutical companies to find a cure for multiple sclerosis. According to Dr. Murali Ramanathan, data analytics “is the very first and earliest step of drug discovery” for debilitating diseases like MS.
Ramanathan would know. A professor of pharmaceutical sciences, he leads a research team at the State University of New York (SUNY) at Buffalo that is using data analytics to study the potential causes of MS, a disease of the central nervous system that can cause numbness, paralysis and vision loss. Approximately 400,000 people in the United States live with MS, and 200 more are diagnosed each week, according to the National Multiple Sclerosis Society.
Fortunately, researchers are making gains: In September, the U.S. Food and Drug Administration approved Aubagio, a once-a-day tablet for adults with relapsing forms of the disease. And in early November, scientific journal The Lancet reported on new trials that demonstrate a cancer drug’s effectiveness in battling MS.
These advances fall short of a cure, though. And Ramanathan says that behind today’s clinical trials and FDA approvals is “a data-intensive problem with lots of computation difficulties” that only data analytics can solve.
That’s why Ramanathan and his SUNY team are developing algorithms capable of slicing through huge sets of scientific data. Relying on an IBM Netezza supercomputer and Revolution Analytics software, the group is exploring the environmental factors, such as toxins or diet, that may cause MS, as well as how variables such as gender, geography, ethnicity, working conditions and sun exposure can lead to the disease. Using Revolution Analytics, researchers can run parallel computations on huge amounts of genetic data at record speeds, without having to rewrite their algorithms.
Studying Multiple Hypotheses of a Disease in Parallel
While data analytics’ role in medical research is hardly new, Ramanathan says the very nature of MS renders models and algorithms a scientific necessity.
Part of the problem, says Ramanathan, is that “the genetics of MS are not simple. There’s no particular gene that is directly responsible for MS.” As a result, researching the disease requires examining hundreds of thousands of genetic variations, called single nucleotide polymorphisms (SNPs), and how these variations interact with one another. What’s more, environmental factors such as sun exposure, vitamin D levels, Epstein-Barr virus infection and smoking are also believed to contribute to MS.
“The problem gets complex very fast due to a combinatorial explosion,” says Ramanathan. “You’re no longer looking at individual genes or an individual environment. You’re looking at pairs of genes and different combinations. In fact, the number of combinations increases explosively along with the number of predictors and variables which creates a whole host of data and analytics problems.”
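To make the scale of that explosion concrete, here is a minimal sketch that counts how many distinct combinations exist at each interaction order. The figure of 500,000 predictors is an assumption chosen only to stand in for "hundreds of thousands" of SNPs, and `count_combinations` is a hypothetical helper name, not part of the SUNY team's software.

```python
from math import comb
from typing import Dict


def count_combinations(n_predictors: int, max_order: int) -> Dict[int, int]:
    """Return the number of distinct k-way combinations for k = 1..max_order."""
    return {k: comb(n_predictors, k) for k in range(1, max_order + 1)}


# Assumed predictor count, for illustration only.
counts = count_combinations(500_000, 4)
for order, total in counts.items():
    print(f"{order}-way combinations: {total:.3e}")
```

Each additional interaction order multiplies the count by several orders of magnitude, which is why exhaustively testing every combination quickly becomes impractical.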
To address this dilemma, the SUNY team created algorithms capable of identifying interactions among genes as well as among thousands of SNPs and environmental factors, combinations that can number in the quintillions. Variables can be added to or removed from each model quickly and easily without having to write hundreds of lines of code, a change from SUNY’s previous data analytics system, which required researchers to rewrite entire algorithms. And the system’s search capabilities let researchers drill down to identify the most promising combinations of variables.
“We use the search metric to guide our algorithm to places in the combinatorial space that are rich in interaction,” says Ramanathan.
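The article does not describe the team's algorithm or its search metric, so the following is only a generic illustration of metric-guided search: a small beam search that scores combinations of variables and extends only the highest-scoring ones instead of enumerating them all. The function `guided_search`, the variable names and the toy scoring function are all hypothetical.

```python
from itertools import combinations
from typing import Callable, FrozenSet, List, Sequence

Combo = FrozenSet[str]


def guided_search(
    variables: Sequence[str],
    score: Callable[[Combo], float],
    max_size: int = 3,
    beam_width: int = 5,
) -> List[Combo]:
    """Beam search: keep only the best-scoring combinations at each size."""
    # Start from all pairs and keep the top-scoring few.
    pairs = [frozenset(p) for p in combinations(variables, 2)]
    beam = sorted(pairs, key=score, reverse=True)[:beam_width]
    found = list(beam)
    # Grow each surviving combination by one variable at a time.
    for _ in range(max_size - 2):
        candidates = {c | {v} for c in beam for v in variables if v not in c}
        beam = sorted(candidates, key=score, reverse=True)[:beam_width]
        found.extend(beam)
    return sorted(set(found), key=score, reverse=True)[:beam_width]


# Toy metric (an assumption): reward variables in a made-up "interacting" set.
interacting = frozenset({"snp_a", "smoking", "vitamin_d"})


def toy_score(combo: Combo) -> float:
    return len(combo & interacting) - 0.5 * len(combo - interacting)


top = guided_search(
    ["snp_a", "snp_b", "smoking", "vitamin_d"], toy_score, beam_width=2
)
print(top[0])
```

The design choice is the trade-off the quote hints at: a narrow beam examines far fewer combinations than an exhaustive scan, at the cost of possibly missing interactions whose smaller subsets score poorly.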
But while the SUNY team has “been able to do analysis they’ve never been able to do before,” Ramanathan says discovering new drugs to treat MS remains a challenging undertaking.
“None of the drugs we have today are a cure for MS,” says Ramanathan. “We still don’t know what causes MS. And finally, we really have no treatments whatsoever for the progressive forms of MS.”
Analytics, however, gives hope by offering researchers a chance to perform “target identification,” says Ramanathan. Researchers typically devote huge amounts of time and effort to determining what target they’ll be researching. For example, a team may decide to target the immune response in MS from a particular subgroup of patients. Once the target is selected, vast resources are consolidated and devoted to that single target, which can result in a colossal waste of time and money if the chosen target winds up being off the mark.
Analytics can help by integrating data from disparate sources and identifying a target, such as a subgroup of patients that responds better to one particular drug. That process is more likely to lead to discoveries and, eventually, drug development.
“The real advantage of analytics emerges when we can identify subgroups of patients with a particular feature, disease or pathological mechanism more efficiently. That’s where the benefit of analytics is,” says Ramanathan. The rest is up to razor-sharp researchers and the right funding.
Cindy Waxer is a Toronto-based freelance journalist and a contributor to publications including The Economist and MIT Technology Review. She can be reached at firstname.lastname@example.org or via Twitter @Cwaxer.