IU Data to Insight Center receives NSF grant to investigate In-Situ Archive for reliable, long-term data storage

Data provenance, or the process of ensuring the history and quality of stored and archived data, is the focus of a new project funded by the National Science Foundation (NSF) and led by the Data to Insight Center (D2I), a division of the Pervasive Technology Institute at Indiana University.

As scientists increasingly rely on advanced computational methods, properly preserving and archiving the resulting data has become essential to scientific accuracy and research quality. This is particularly true for scientists working in the areas of climate and global environmental change. Climate and environmental data collected today will need to be preserved for study by scientists working now and decades into the future. These scientists will require assurances that the data is genuine and has not been damaged or altered.

During this two-year, $200,000 project, D2I researchers will build an "In-Situ Archive," a provenance-focused data archive that can be used alongside a researcher's typical tools for storing data.

"Some data collections are constantly evolving, so the traditional process of archiving a snapshot is not enough. Moreover, some organizations cannot afford heavy-weight archival solutions that require a high level of computer science expertise to install and run," said project Principal Investigator Beth Plale, professor of computer science and director of D2I.

D2I will develop the In-Situ Archive system in collaboration with the International Forestry Resources and Institutions (IFRI) network. With researchers in 11 countries on four continents, the IFRI network collects forestry observation data from remote locations. IFRI's database is a highly valuable research collection containing 15 years of detailed data on forest sites, governance, and uses. The In-Situ Archive aims to address challenges faced by IFRI users, many of whom have limited computing and bandwidth capabilities but still need high-quality, long-term data preservation.

"The development of the In-Situ Archive will build a foundation for outreach through IFRI that has broad potential for science and policy impacts worldwide," continued Plale. "Maintaining accurate and reliable data on the environment and global climate change is an area of emerging importance for many generations. And it is an area that we must improve now -- before valuable data are damaged or lost forever."