Machine learning model built on the largest U.S. COVID-19 dataset predicts disease severity




A centralized repository of COVID-19 health records built last year is beginning to show results, starting with a new study released today. The repository, the largest set of COVID-19 records to date, was assembled by a team of researchers and data experts to help make sense of the disease.

The study, published in the journal JAMA Network Open, examined risk factors for severe cases of COVID-19 and traced the progression of the disease over time. The authors built machine learning models to predict which hospitalized patients would develop serious illness based on information gathered on their first day in the hospital.

Using the centralized database, called the National COVID Cohort Collaborative Data Enclave, or N3C, allowed the research team to include hundreds of thousands of patient records in their analysis. The study drew on data from 34 medical centers and included more than 1.3 million adults: 174,568 who tested positive for COVID-19 and 1,133,848 who tested negative. The records span January 2020 through December 2020.

The analysis shows how treatment for COVID-19 changed during 2020 as doctors tried new therapies and gained more experience with the disease. The percentage of patients treated with hydroxychloroquine, an antimalarial drug promoted by former President Donald Trump before it was shown to be ineffective, fell to nearly zero in May 2020. Use of the steroid dexamethasone increased in June, after studies showed that it could improve survival rates.

The study also confirmed that survival rates for patients with COVID-19 improved over the course of 2020. In March and April, 16% of people admitted to the hospital with COVID-19 died. By September and October, that figure had fallen to just under 9%.

People who had higher heart rates, breathing rates, and temperatures when they arrived at the hospital were more likely to need drastic interventions like ventilation. They were also more likely to die. Abnormal white blood cell counts, inflammation markers, blood acidity, and kidney function were also linked to more severe cases. The research team built machine learning models using these and other data points to predict which patients would become seriously ill. With additional testing, the models could potentially serve as the basis for clinical decision-making tools, the authors wrote.
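The paper itself does not include code, but the general approach can be illustrated. Below is a minimal sketch, assuming a gradient-boosted classifier trained on day-one vitals and labs; the feature names, synthetic data, and coefficients are illustrative assumptions, not values or methods taken from the N3C study.

```python
# Minimal sketch (not the study's code): predicting a severe course from
# day-one vitals and labs with scikit-learn. All data below is synthetic.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 5000

# Hypothetical day-one measurements for each admitted patient.
X = pd.DataFrame({
    "heart_rate": rng.normal(90, 20, n),        # beats per minute
    "respiratory_rate": rng.normal(20, 6, n),   # breaths per minute
    "temperature_c": rng.normal(37.5, 1.0, n),  # degrees Celsius
    "wbc_count": rng.normal(9, 4, n),           # white blood cells, 10^9/L
    "crp": rng.lognormal(3, 1, n),              # inflammation marker, mg/L
    "creatinine": rng.lognormal(0, 0.4, n),     # kidney function, mg/dL
})

# Synthetic outcome: more abnormal vitals and labs raise the odds of severity.
logit = (
    0.02 * (X["heart_rate"] - 90)
    + 0.08 * (X["respiratory_rate"] - 20)
    + 0.5 * (X["temperature_c"] - 37.5)
    + 0.05 * (X["wbc_count"] - 9)
    + 0.005 * X["crp"]
    + 0.8 * (X["creatinine"] - 1.0)
    - 2.0
)
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Gradient-boosted trees handle mixed, skewed clinical features reasonably well.
model = GradientBoostingClassifier(random_state=0)
model.fit(X_train, y_train)

probs = model.predict_proba(X_test)[:, 1]
print(f"AUROC on held-out synthetic data: {roc_auc_score(y_test, probs):.3f}")
```

In practice, a model like this would be trained on real patient records, validated across many sites, and evaluated prospectively before it could inform any clinical decision, which is the additional testing the authors refer to.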

Researchers have been analyzing the trajectory of COVID-19 since the very beginning of the pandemic. This study has the advantage of drawing on a large and diverse dataset: it is not limited to one hospital or state. In the United States, researchers are often limited to studying the medical records of patients at the institutions where they work. That means the number of records they can include in a study may be small, and they cannot easily verify whether their findings would apply in other places.

A resource like the N3C, which brings together records from dozens of institutions, sidesteps these limitations. The N3C currently includes data from 73 healthcare facilities and holds records on more than 2 million COVID-19 patients. More than 200 research projects using the data are underway, including studies examining risk factors for COVID-19 reinfection and the disease’s impact on pregnancy. The resource is not perfect: standardizing information across hospitals is difficult, and the data on many patients may be incomplete.

Yet having such a large dataset is invaluable. Researchers are using the resource to conduct studies they might not have been able to complete with their own institution’s data alone, Elaine Hill, a health economist at the University of Rochester who is working on the pregnancy research, told The Verge last fall. “It sheds light on things we wouldn’t be able to do,” she said.
