$1M Digital Mammography Challenge Calls for Machine Learning Algorithms

A coalition of oncology and technology partners led by Sage Bionetworks and DREAM Challenges announced last week the opening of the training phase for the Digital Mammography DREAM Challenge, an open-science data competition designed to improve the accuracy of mammography screening.

With funding from the Laura and John Arnold Foundation (LJAF), the Challenge will award up to $1.2 million to data scientists, researchers and coding experts who develop predictive algorithms that achieve milestone goals related to reducing the recall rate of mammography screening.

The coalition supporting the Challenge includes organizers, sponsors, partners and advisors from the health, tech, regulatory and for-profit competition sectors: Amazon Web Services, FDA, Group Health Cooperative, IBM, Icahn School of Medicine at Mount Sinai, Innocentive, NCI, Radish Medical and Seattle Cancer Care Alliance.

Each year, more than 40 million women in the U.S. undergo routine mammogram testing to screen for breast cancer. Mammograms are widely considered to be the most accessible and cost-effective breast cancer screening method.

However, the United States Preventive Services Task Force and the American Cancer Society recently issued changes to recommendations regarding start age and frequency of screening. These changes are due, in part, to the large number of false-positive mammograms.

One in 10 women undergoing screening mammography is recalled for diagnostic workup, and fewer than 5 percent of those recalled will eventually be found to have cancer. Recalled patients often experience stress and additional medical costs, and some undergo interventions, including unnecessary biopsies.
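As a rough illustration of the scale these figures imply, the arithmetic can be sketched in Python. Applying the one-in-10 recall rate and the 5 percent figure uniformly to all 40 million annual screenings is an assumption made here for illustration; the statistics are reported separately, and because 5 percent is an upper bound, the resulting count of recalls without cancer is a lower bound.

```python
# Back-of-the-envelope arithmetic from the figures quoted above.
# Assumption: both rates apply uniformly to every annual screening.

def recall_breakdown(screened, recall_one_in, max_cancer_percent):
    """Return (recalled, max_with_cancer, min_without_cancer) as integers."""
    recalled = screened // recall_one_in
    max_with_cancer = recalled * max_cancer_percent // 100
    min_without_cancer = recalled - max_with_cancer
    return recalled, max_with_cancer, min_without_cancer


recalled, max_with_cancer, min_without_cancer = recall_breakdown(
    screened=40_000_000,    # "more than 40 million women" per year
    recall_one_in=10,       # "one in 10 women" recalled
    max_cancer_percent=5,   # "fewer than 5 percent" found to have cancer
)

print(f"Recalled for diagnostic workup: {recalled:,}")
print(f"Found to have cancer (at most): {max_with_cancer:,}")
print(f"Recalled without cancer (at least): {min_without_cancer:,}")
```

Under these assumptions, roughly 4 million women are recalled each year, and at least 3.8 million of those recalls do not lead to a cancer diagnosis.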

The Digital Mammography DREAM Challenge, running through mid-2017, will seek to attract data experts from both inside and outside the medical field to develop predictive algorithms that will reduce false-positive mammograms while maintaining or improving cancer detection.

Participants will be asked to create algorithms that will help doctors determine whether a patient’s mammogram has a high or low likelihood of harboring a breast cancer, and whether or not a patient should undergo additional testing.

New algorithms may allow doctors to customize screening regimens for patients and identify women who would benefit from more or less frequent screening.

“Our goal is for the Challenge to demonstrate that we can extract more information from mammograms than what meets the eye,” explained Dr. Christoph Lee of the Seattle Cancer Care Alliance, a clinical advisor to the Challenge.

“Now that we are in an age where machines can train and learn how to recognize images, there is the possibility that machines can learn to recognize cancer-specific pixelated patterns in a digital mammogram that humans cannot detect. If highly accurate algorithms can help provide women with more clinically relevant and accurate information, then we can dramatically change the field of breast cancer screening,” said Lee.

To create the algorithms, participants will use anonymous patient data including nearly 650,000 digitized mammograms provided by the Group Health Cooperative through the NCI-funded Breast Cancer Surveillance Consortium and by the Icahn School of Medicine at Mount Sinai.

The medical images will be securely stored in the Amazon and IBM clouds. Solvers’ algorithms will be evaluated against corresponding data on known patient outcomes, and scores will be assigned based on measures of accuracy. Algorithms that identify the fewest false positives while maintaining high rates of cancer detection will receive the highest ranking on the publicly accessible Challenge leaderboard.
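The exact scoring metric is not specified here, so the following is only a minimal sketch of the trade-off described: comparing recall decisions against known outcomes to measure how many cancers are detected (sensitivity) and how many cancer-free patients are recalled anyway (false-positive rate). The function names, the two example models and the toy data are all hypothetical.

```python
# Illustrative scoring sketch; not the Challenge's actual metric.
# outcomes:    1 if the patient was later found to have cancer, else 0
# predictions: 1 if the algorithm would recall the patient, else 0

def confusion_counts(outcomes, predictions):
    """Count true/false positives and negatives for binary decisions."""
    tp = fp = tn = fn = 0
    for has_cancer, recalled in zip(outcomes, predictions):
        if has_cancer and recalled:
            tp += 1
        elif recalled:
            fp += 1
        elif has_cancer:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity_and_fpr(outcomes, predictions):
    """Return (sensitivity, false-positive rate) for binary decisions."""
    tp, fp, tn, fn = confusion_counts(outcomes, predictions)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return sensitivity, false_positive_rate


# Made-up outcomes for ten patients, two of whom have cancer.
outcomes = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
model_a = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]  # recalls two healthy patients
model_b = [1, 0, 1, 0, 1, 0, 0, 0, 0, 0]  # recalls one healthy patient

score_a = sensitivity_and_fpr(outcomes, model_a)
score_b = sensitivity_and_fpr(outcomes, model_b)
print("Model A (sensitivity, FPR):", score_a)
print("Model B (sensitivity, FPR):", score_b)
```

In this toy data both models detect every cancer, but model B recalls fewer healthy patients, so it would rank higher under the rule described above.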

“This Challenge holds great promise to improve breast cancer screening,” Group Health Cooperative Senior Investigator Dr. Diana Buist said. “It is only possible through the long-term investment the National Cancer Institute has made in the Breast Cancer Surveillance Consortium, which provides real clinical data harmonized with gold standard cancer ascertainment.”

Challenge organizers are working to maximize solver participation through a targeted marketing campaign and by directly engaging the combined solver communities of DREAM and Innocentive, which together include more than 370,000 registered individuals from over 200 countries.

The Challenge is also part of the Coding4Cancer initiative that was featured at Vice President Biden’s June 2016 National Cancer Moonshot Summit. Coding4Cancer seeks to drive improvements in cancer detection methods through the development of better algorithms for imaging tools. Coding4Cancer will hold a second Challenge in 2017 to improve lung cancer screening techniques.

“The Digital Mammography DREAM Challenge is the start of a larger movement focused on using prizes and Challenges to improve early cancer diagnosis,” LJAF Vice President of Science and Technology Michael Stebbins said. “We are eager to hear more exciting ideas that will help to improve the use of medical imaging techniques to support early diagnosis.”

Because of the massive volume of data (more than 10 terabytes) and the sensitivity of the Challenge’s digitized mammograms, solvers will not have direct access to the data. Instead of the usual approach of bringing the data to the algorithm, organizers will bring solvers’ algorithms to the data for training and scoring.
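The idea of bringing the algorithm to the data can be sketched conceptually as follows. This is a hypothetical illustration, not the Challenge's actual infrastructure: the `PrivateEvaluator` class, the `threshold_model` function and the toy "images" are invented for the example, and a real system would enforce isolation at the infrastructure level rather than through Python naming conventions.

```python
# Conceptual sketch of a model-to-data setup (hypothetical names throughout).
# The organizer holds the data; the solver submits only a function and
# receives back only an aggregate score.

class PrivateEvaluator:
    """Organizer-side harness that never hands raw data to the solver."""

    def __init__(self, images, outcomes):
        self._images = images
        self._outcomes = outcomes

    def score(self, predict):
        """Run a submitted predict(image) function and return accuracy only."""
        correct = sum(
            1
            for image, outcome in zip(self._images, self._outcomes)
            if predict(image) == outcome
        )
        return correct / len(self._outcomes)


# Solver side: a toy model that flags an image when its mean pixel
# intensity exceeds 0.5. Real submissions would be far more complex.
def threshold_model(image):
    return 1 if sum(image) / len(image) > 0.5 else 0


# Organizer side: tiny made-up "images" (lists of intensities) and outcomes.
evaluator = PrivateEvaluator(
    images=[[0.9, 0.8], [0.1, 0.2], [0.7, 0.9], [0.2, 0.1]],
    outcomes=[1, 0, 1, 0],
)
accuracy = evaluator.score(threshold_model)
print(f"Accuracy returned to solver: {accuracy:.2f}")
```

The point of the design is that only the score crosses the boundary: the solver's code travels to where the data lives, and the data itself never leaves.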

Sponsorships from both AWS and IBM are providing cloud computing for data hosting as well as the computational firepower needed to support solvers’ deep learning approaches for model training.

DREAM founder and IBM Research Director Dr. Gustavo Stolovitzky remarked, “We at DREAM are thrilled to be running this massive machine learning exercise on mammography data. And with Sage Bionetworks’ ability to host all the solvers’ re-runnable submissions and open source code after the Challenge, we are also creating a large resource of reproducible methods, all in a consistent, portable framework that can be run on any data set and developed further by any end-users to continue to advance the emerging opportunity to apply computer vision to medical imaging.”

Article published by icrunchdata