In Cancer Research, Big Data Needs to Be Smart, Too
October 23, 2013
The University of Texas MD Anderson Cancer Center has announced that it will partner with IBM's Watson supercomputer to fight cancer - specifically leukemia. But myeloma patients may well ask: "What can Big Data do for me?" The answer is: "A lot."
The IMF has been conducting studies with Big Data analyses for a number of years now. But let's credit the MD Anderson-Watson alliance with spotlighting the very important role Big Data can and should play in seeking cures for cancer, including myeloma.
The idea is that Watson, the same computer that won on "Jeopardy," can review data on the 100,000 or so patients cared for each year at MD Anderson and spot trends. With sample sizes that large, patterns that might otherwise be missed, such as relationships between treatments and recovery, can emerge, as long as you have immense cognitive computing power comparing each patient to the next.
The IMF, through its research arm, the International Myeloma Working Group (IMWG), has been using Big Data somewhat differently. A 2008 study and a 2010 follow-up study on the impact of age on myeloma recovery used data from 10,549 patients, all of whom had myeloma. That's about 10% of the sample Watson will have access to annually, but every patient in the IMWG sample was carefully evaluated, treated, and monitored within clinical trials. They were, in other words, not a random assortment of patients, as referrals to a particular medical center such as MD Anderson are.
That difference makes the aging study far more robust than sheer sample size might suggest, and the conclusions can have a profound impact on treatment.
Similarly, in 2005, the same patient sample was used for the development of the International Staging System for myeloma, which has added greatly to our understanding of how to categorize patients and apply different treatments based on those categories.
With a random sample such as that used in the Watson program, there are many variables and unknowns that make it difficult to fully interpret outcomes, no matter how much you "crunch the data" with a supercomputer. Connecting genetic information to random outcomes can highlight trends, but that doesn't necessarily give enough targeted information to immediately advance research and provide a path to a cure.
In 2009, the IMF, as part of Bank on a Cure, identified genetic features linked to bone disease and the likelihood of myeloma. However, even four years later, it has proven challenging to apply this information directly to strategies for preventing myeloma and/or to the development of new therapies. More recently, in 2012, we were able to use Big Data to generate genetic FISH information to enhance the ISS staging system.
So, the use of Big Data is not new to the IMF. We believe in Big Data and we want Watson to succeed. But Watson needs precise input to produce clear output. Teaching Watson the nuances of all the different cancers is no easy task. After all, Watson's a whiz at math, but he didn't go to medical school. Getting the most out of Big Data requires us to be smart about the data provided and to think ahead about the answers we need and want. So let's see how this works out, and hope Watson succeeds in medical school and an oncology fellowship.
Dr. Durie sincerely appreciates and reads all comments left here. However, he cannot answer specific medical questions and encourages readers to contact the trained IMF Hotline staff instead. Questions are answered with input from Dr. Durie and/or other scientific advisors and IMWG members as appropriate. To contact the IMF Hotline, call 800-452-CURE, toll-free in the US and Canada, or send an email to email@example.com. Hotline hours are 9 am to 4 pm PST. Friday summer hours are 9 am to 3 pm PDT. Thank you.