Pitt leads creation of global, cloud-based data system for infectious diseases
- Written by: Tyler O'Neal, Staff Editor
- Category: INDUSTRY
Backed by a five-year, $6.7 million National Institutes of Health (NIH) grant, the University of Pittsburgh Graduate School of Public Health today announced that it plans to lead the culture shift toward data sharing that is rippling through scientific fields, and to harness that shift to improve global knowledge of infectious diseases.
Pitt Public Health will lead a multidisciplinary group of computer scientists, biostatisticians and biomedical informatics experts to direct the inaugural Network Coordination Center for the Models of Infectious Disease Agent Study (MIDAS), a collaborative research network originally launched by the NIH in 2004 to assist the nation in preparing for infectious disease threats.
"The scientific community is increasingly recognizing that sharing research data and software not only benefits individual research projects, but increases the impact of science and innovation on the greater good. However, nobody's figured out exactly how to do this for global infectious diseases," said Wilbert van Panhuis, M.D., Ph.D., assistant professor of epidemiology at Pitt Public Health and biomedical informatics at Pitt's School of Medicine, who will lead the new center. "What we're going to do is leverage that interest in 'open science' to create a framework that will make it easy to share, find and use research data and software to combat infectious diseases."
Currently, infectious disease researchers share their data in a scattershot way, such as by including it in online supplements to research publications or by paying to upload it to a generic data repository, Van Panhuis explained. When large datasets are warehoused in this way, they can be difficult for other scientists to find and may not be formatted in ways that make them easy to use.
The MIDAS Network Coordination Center will strive to accelerate infectious disease research and discoveries by developing a cloud-based platform where scientists can store, share, access and use massive libraries of infectious disease data with high-performance computing. The approach will be aligned with the NIH Strategic Plan for Data Science and will follow the "Findable, Accessible, Interoperable, and Reusable" (FAIR) Data Principles, which were published in 2016 to guide management and stewardship of large datasets for scientific use.
In its first year, the MIDAS Network Coordination Center will largely concentrate on standardizing and uploading hundreds of existing infectious disease datasets into its platform, as well as reaching out to scientists who use such data to ask how MIDAS data and software can best serve them.
"Our hope is that after that first year, the MIDAS network will be able to demonstrate the benefits of open science and open data for making new discoveries," Van Panhuis said. "We'll also be going after new data ourselves, on behalf of MIDAS, collecting datasets from health organizations and government entities worldwide, so that the scientists have to spend less time obtaining data and can instead concentrate on making discoveries with it."
Van Panhuis is an infectious disease epidemiologist funded by the NIH Big Data to Knowledge program to improve access to, and integration of, public health data for research and policy. He's uniquely positioned to direct the Network Coordination Center, having served as the lead scientist for Project Tycho for more than 10 years. Project Tycho is a success story of unlocking data in global health. In five years, the Project Tycho user community has grown to over 4,000 members.
Much like Project Tycho, the new MIDAS Coordination Center will support an open, community-driven discovery process, where all scientists and the broader community have the chance to analyze datasets, Van Panhuis said. But, unlike Project Tycho, it will contain more diverse datasets and coordinate a specific community of infectious disease modelers, including by facilitating meetings, workshops and software sharing. It also will be flexible enough to manage data and modeling support in emergencies such as pandemics, will provide outreach and educational opportunities, and will strengthen the sense of community and collaboration within MIDAS and the broader community.