How Many Doctored Papers Are There?




Just how much shit is there in the scientific literature? "Quite a bit" comes the answer from anyone with real experience of it, but that is not too quantitative. Here, however, is an analysis by a (perfectly respectable) journal of its own output, and the results are... well, they range from "very bad" to "honestly, I expected even worse," depending on your level of cynicism.

Molecular and Cellular Biology and its parent organization (the American Society for Microbiology) reviewed papers published in the journal from 2009 to 2016 (960 in total, 120 random papers per year), looking for doctored or duplicated images (which is still one of the easiest ways to detect carelessness and fraud). The procedure they used seems effective, but it does not scale very well: the first step was to have Elisabeth Bik look at every paper (here is an interview with her and the other co-authors). She seems to have a very good eye for image problems, and as an amateur astronomer, I can tell you that she would have made a very effective comet or supernova hunter for exactly the same reasons. "Cuts and embellishments" alone were not counted as problematic; there had to be duplications and/or serious alterations.

What they found was 59 papers with clear duplicates, and in each case the authors were contacted:

cases of inappropriate image duplication led to 42 corrections, 5 retractions and 12 cases in which no action was taken (Table 1). The reasons for not taking action included origin in laboratories that had since closed (2 papers), resolution of the issue in correspondence (4 papers) and occurrence of the event more than six years earlier (6 papers), in accordance with ASM policy and the regulations established in 42 CFR § 93.105 for pursuing allegations of research misconduct. Among the retracted papers, one contained several image problems, so that a correction was not an appropriate remedy, and for another retracted paper the original underlying data were not available, but the study was sound enough to allow a new paper to be submitted for consideration.

It's worth noting that this paper also reports how long all of this took, and it's substantial: at least 6 hours of work per paper, involving hundreds of emails in all and a lot of feedback and … for nothing. As usual, cleaning something up takes a lot longer than making the mess in the first place. On that point, the journal introduced pre-publication screening of images in 2013, and the incidence of problems did indeed drop from that year on. (They did not tell Elisabeth Bik when the policy was introduced, so as not to bias her.)

As these figures show, the good news is that many of the duplicated images appear to be pure carelessness and can be fixed. But roughly 10% of the flagged papers had to be retracted completely. Extrapolating from this experience (and from two other previously reviewed journals), the estimate is that about 35,000 papers should be pulled from the 2009-2016 PubMed literature database (nearly 9 million articles), and, well, sure, that means that many more papers still need to be corrected. Overall, the number of junk publications can be described as "small but still significant", and there is no reason to let them clutter up the literature.
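For anyone who wants to check the arithmetic, here is a minimal sketch in Python that uses only the Molecular and Cellular Biology figures quoted in this post. The function name is made up for illustration, and the 9,000,000 total is my rounding of "nearly 9 million". Because the 35,000 estimate also draws on two other journals, this single-journal version is not expected to reproduce it, only to land in the same general range.

```python
def extrapolate_from_mcb(pubmed_papers: int = 9_000_000) -> dict:
    """Back-of-the-envelope extrapolation from the MCB sample alone.

    Assumption: PubMed as a whole behaves like this one journal's sample,
    which is a simplification (the published estimate pools other journals).
    """
    sampled = 960          # 120 random papers per year, 2009 to 2016
    flagged = 59           # papers with clear image duplications
    corrections = 42
    retractions = 5
    no_action = 12

    # The three outcomes should account for every flagged paper.
    assert corrections + retractions + no_action == flagged

    return {
        "flag_rate": flagged / sampled,                # share of sampled papers flagged
        "retraction_share": retractions / flagged,     # share of flagged papers retracted
        "naive_retractions": pubmed_papers * retractions / sampled,
    }


if __name__ == "__main__":
    result = extrapolate_from_mcb()
    print(f"Flagged: {result['flag_rate']:.1%} of sampled papers")
    print(f"Retracted: {result['retraction_share']:.1%} of flagged papers")
    print(f"Naive PubMed-wide retractions: {result['naive_retractions']:,.0f}")
```

On these inputs the flag rate comes out to about 6.1%, the retracted share of flagged papers to about 8.5%, and the naive PubMed-wide figure to roughly 47,000, the same order of magnitude as the pooled 35,000 estimate.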

Increased screening at the editorial stage is worth the effort: it takes time, but not as much as the time it takes to go back and fix things later. (And this matches another long-standing piece of advice: if you do not want shit to land on you, then do not let it get airborne in the first place.) As the authors note, this is a recent problem, due to the proliferation of the digital tools needed to do this sort of thing, and, to be fair, those same tools also make honest mistakes faster and easier to make. It also admits of modern solutions: software to catch image duplication has been (and is being) worked on by several groups, and should spare us the need to clone Elisabeth Bik.

This brings us back to the question in the first paragraph, about how much shit is out there. Papers with clearly fraudulent images are obviously in that category, but there are many other, less obvious ways for a paper to be fraudulent. I would therefore call this estimate of 35,000 a likely undercount, even allowing for the many articles in PubMed that do not contain images of this kind.

But beyond fraud, there are plenty of papers that are honest but simply not good: statistically underpowered studies, irreproducible procedures, inadequate descriptions, conclusions that do not necessarily follow from the data presented. The literature has always had these things in it. Poor-quality work did not have to wait for image-editing programs to make it possible; we all come with the necessary software pre-installed between our ears. Eliminating fraud is an obvious first step, but it is also (unfortunately) the easiest one. The rest is, as has always been the case, up to the readers to watch out for.

This raises one last point, which has been made here and in other places before. In these modern times, as the Firesign Theatre said, some of the customers of the scientific literature are not human beings. Machine learning software is very promising for analyzing the huge pile of knowledge that we have generated, but such algorithms are easily poisoned by the garbage-in, garbage-out problem. Data curation is and always will be a crucial step for any machine learning effort to yield useful conclusions, and studies like this one remind us that curating the biomedical literature is no simple thing.
