Crisis in science
Rice University statistician Genevera Allen issued a serious warning at a major scientific conference this week: scientists rely on machine learning algorithms to find patterns in their data even when the algorithms are merely latching onto noise that will not be reproduced by another experiment.
"There is a general acknowledgment of a reproducibility crisis in science at the moment," she told the BBC. "I would venture to say that much of this comes from the use of machine learning techniques in science."
Reproducibility
According to Allen, the problem can arise when scientists collect a large amount of genomic data and then use poorly understood machine learning algorithms to find clusters of similar genomic profiles.
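To see why this is risky, here is a minimal sketch using hypothetical data and parameter choices (an illustration, not Allen's own analysis): a standard clustering algorithm such as k-means, asked for three groups, will return three groups even when the "genomic" matrix is pure random noise, and nothing in its output flags the structure as spurious.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(42)
# Pure noise with the shape of a small genomic study: 200 samples x 1,000 features.
X = rng.normal(size=(200, 1000))

# Ask k-means for three clusters; it will carve the samples into three groups
# regardless, because that is what the algorithm is built to do.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

print("Cluster sizes:", np.bincount(labels))
print("Silhouette score:", round(silhouette_score(X, labels), 3))
```

The output looks like a legitimate grouping of samples; only a second, independent data set (or an explicit stability check) reveals that the clusters are artifacts of noise.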
"Often, these studies are not inaccurate as long as there is not another big data set that someone applies these techniques to and says," Oh my God, the results of these two studies do not overlap " , she said at the BBC.
Drawn to noise
According to Allen, the problem with machine learning is that it is built to find patterns, even when there are none. She suspects the solution will come from next-generation algorithms that are better able to evaluate the reliability of their own predictions.
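One simple way to probe the kind of reproducibility Allen describes, sketched below with hypothetical data (an illustration, not her group's method), is to ask whether clusters found in one cohort carry over to an independent cohort. For pure noise, the agreement between the two "studies" collapses to roughly zero.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(0)
# Two independent "cohorts" of pure noise standing in for genomic profiles.
cohort_a = rng.normal(size=(200, 1000))
cohort_b = rng.normal(size=(200, 1000))

# "Study 1": learn clusters on cohort A, then assign cohort B's samples
# to the nearest of those cluster centers.
study_1 = KMeans(n_clusters=3, n_init=10, random_state=1).fit(cohort_a)
labels_carried_over = study_1.predict(cohort_b)

# "Study 2": cluster cohort B from scratch.
labels_from_scratch = KMeans(n_clusters=3, n_init=10, random_state=2).fit_predict(cohort_b)

# An Adjusted Rand Index near 0 means the two studies' groupings barely agree,
# i.e. the "discovery" does not reproduce across data sets.
print("Agreement between the two studies (ARI):",
      round(adjusted_rand_score(labels_carried_over, labels_from_scratch), 3))
```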
"The question is:" Can we really trust the discoveries that are currently being made using machine-learning techniques applied to large datasets? "Allen said in a press release." The answer in many situations is probably "Not without verification", but work is underway on next-generation machine learning systems that will evaluate the number of students in the classroom. uncertainty and reproducibility of their predictions. "