Can we trust scientific discoveries made using machine learning?


WASHINGTON – (February 15, 2019) – Rice University statistician Genevera Allen says scientists must keep questioning the accuracy and reproducibility of scientific discoveries made by machine-learning techniques until researchers develop new computational systems that can critique themselves.

Allen, an associate professor of statistics, computer science and electrical and computer engineering at Rice and of pediatric neurology at Baylor College of Medicine, will address the topic at both a press conference and a general session today at the 2019 annual meeting of the American Association for the Advancement of Science (AAAS).

"The question is, 'Can we really trust the discoveries that are currently being made using machine-learning techniques applied to large datasets?'" Allen said. "The answer in many situations is probably, 'Not without verification,' but work is underway on next-generation machine-learning systems that will evaluate the uncertainty and reproducibility of their predictions."

Machine learning (ML) is a branch of statistics and computer science concerned with building computational systems that learn from data instead of following explicit instructions. According to Allen, much of the attention in the ML field has focused on developing predictive models that allow ML to make predictions about future data based on its understanding of the data it has studied.

"Many of these techniques are designed to always make a prediction," she said. "They never come back with 'I don't know' or 'I didn't discover anything,' because they aren't made to."

She said that uncorroborated data-based discoveries from recently published ML studies of cancer data are a good example.

"In precision medicine, it's important to find groups of patients with similar genomic profiles so you can develop drug therapies that are targeted to the specific genome of their disease," Allen said. "People have applied machine learning to genomic data from clinical cohorts to find groups, or clusters, of patients with similar genomic profiles.

"But there are cases where the discoveries aren't reproducible; the clusters found in one study are completely different from those found in another," she said. "Why? Because most of today's machine-learning techniques always say, 'I found a group.' Sometimes it would be far more useful if they said, 'I think some of these are really grouped together, but I'm not sure about these others.'"

Allen will discuss the uncertainty and reproducibility of ML techniques for data-based discoveries at a 10 a.m. press conference today. She will also discuss case studies and research aimed at addressing uncertainty and reproducibility in the 3:30 p.m. general session, "Machine Learning and Statistics: Applications in Genomics and Computer Vision." Both sessions take place at the Marriott Wardman Park Hotel.
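
As a rough illustration of the difference between a technique that always reports a group and one that can answer "I'm not sure," the following minimal Python sketch assigns points to the nearest of two cluster centers but abstains when the two centers are nearly equidistant. It is not Allen's method or any published algorithm. The function name `assign_with_abstain`, the `min_ratio` threshold and the sample coordinates are all hypothetical, chosen only to make the idea concrete.

```python
import math

def assign_with_abstain(points, centroids, min_ratio=1.5):
    """Assign each point to its nearest centroid, or None when unsure.

    Hypothetical rule for illustration: a point gets a label only if its
    second-nearest centroid is at least `min_ratio` times farther away
    than its nearest one. With min_ratio=1.0 the rule never abstains.
    """
    labels = []
    for p in points:
        # Sort centroids by distance to this point (ties broken by index).
        dists = sorted((math.dist(p, c), i) for i, c in enumerate(centroids))
        (d1, nearest), (d2, _) = dists[0], dists[1]
        confident = d2 >= min_ratio * d1
        labels.append(nearest if confident else None)
    return labels

# Made-up data: two cluster centers and four points, two of them ambiguous.
centroids = [(0.0, 0.0), (10.0, 0.0)]
points = [(1.0, 0.0), (9.0, 1.0), (5.0, 0.0), (4.5, 2.0)]

always_answers = assign_with_abstain(points, centroids, min_ratio=1.0)
can_abstain = assign_with_abstain(points, centroids, min_ratio=1.5)

print(always_answers)  # [0, 1, 0, 0]
print(can_abstain)     # [0, 1, None, None]
```

With the threshold set to 1.0, every point is placed in a group, including the two that sit roughly midway between the centers. With the stricter threshold, those two points come back as `None`, the code's version of "I'm not sure about these others."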


Allen is the founding director of Rice's Center for Transforming Data to Knowledge (D2K Lab) and a member of the Jan and Dan Duncan Neurological Research Institute at Texas Children's Hospital. Her research focuses on modern multivariate analysis, graphical models, statistical machine learning and data integration, with an emphasis on statistical methods that help scientists make sense of "big data" from high-throughput genomics, neuroimaging and other applications. Her previous honors include a CAREER award from the National Science Foundation, the Young Statistician's Award from the International Biometric Society and a place on Forbes' "30 Under 30" list in science and health care.

AAAS is the world's largest multidisciplinary scientific society and the annual meeting of AAAS, held February 14-17, is the largest general scientific meeting in the world. For more information, visit: https: //

AAAS Annual Meeting 2019:

About the meeting: https: // /about the meeting /?

Program: https: // /program/?

Press room: https: // /aaasnewsroom /2019 /

Media Registration: https: // /aaasnewsroom /2019 /recording/?

Media Contacts: https: // /aaasnewsroom /2019 /contacts /

Allen's research: http: // /~ gallen /index.html

Rice's D2K Lab: https: // /sure/d2k-lab

A high-resolution IMAGE is available for download at:

https: // /files/2019 /02 /0211_AAAS-Allen01a-lg-yvriyj.jpg?

CAPTION:

Genevera Allen is a Rice University statistician and scientist and the founding director of Rice's D2K Lab. (Photo by Jeff Fitlow / Rice University)

This release is available online at

Follow Rice News and Media Relations via Twitter @RiceUNews.

Located on a 300-acre forested campus in Houston, Rice University is consistently ranked among the nation's top 20 universities by U.S. News & World Report. Rice has highly respected schools of Architecture, Business, Continuing Studies, Engineering, Humanities, Music, Natural Sciences and Social Sciences and is home to the Baker Institute for Public Policy. With 3,962 undergraduates and 3,027 graduate students, Rice's undergraduate student-to-faculty ratio is just under 6-to-1. Its residential college system builds close-knit communities and lifelong friendships, which is part of the reason Rice is ranked No. 1 for lots of race/class interaction and No. 2 for quality of life by the Princeton Review. Rice is also rated as a best value among private universities by Kiplinger's Personal Finance. To read "What they're saying about Rice," go to http://tinyurl.comcom /RiceUniversityAperçu.

Disclaimer: AAAS and EurekAlert! are not responsible for the accuracy of news releases posted to EurekAlert! by contributing institutions or for the use of any information through the EurekAlert system.
