ERICA to the Rescue!

A screenshot of ERICA, the online archive containing 500,000 full-text PDFs rescued from ERIC

ERIC, the research repository of the Department of Education, was defunded this week. The ERIC catalog lists over 2 million education-related publications. More than 500,000 of these publications are directly hosted by ERIC as full-text PDFs. If the ERIC website goes offline, most of these 500,000 Open Access publications will not be available anywhere else on the internet.

Several volunteers and partners of the Data Rescue Project (DRP) have been working tirelessly for the last few weeks to back up the ERIC catalog and full-text PDFs. A DRP volunteer who goes by the username crizzo has single-handedly submitted the links to 500,000 PDFs to the Internet Archive’s Wayback Machine. To ensure continued access to the publications, crizzo has developed a lightweight rescue catalog called ERICA, which lists all the PDFs saved in the Wayback Machine together with their basic metadata, such as title, author, and publication year. The open-source code and metadata for ERICA can be found in our code repository. Because ERICA is a simple static website that does not require a server backend, anyone can clone this repository and host the website, whether on a university server, a home NAS, or even a mini-computer like a Raspberry Pi.
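To illustrate what hosting a static website involves, here is a minimal sketch that serves a local directory over HTTP using only the Python standard library. It is not part of ERICA itself: the function name `make_static_server` and its parameters are hypothetical, and the sketch assumes you have already cloned the repository to a local folder whose built site files can be served as-is.

```python
import functools
import http.server
import pathlib


def make_static_server(site_dir, host="127.0.0.1", port=8000):
    """Return an HTTP server that serves the files in site_dir.

    Hypothetical helper for illustration; site_dir is assumed to be a
    local checkout of a static site containing an index.html.
    """
    root = pathlib.Path(site_dir).resolve()
    if not root.is_dir():
        raise NotADirectoryError(f"{root} is not a directory")

    # SimpleHTTPRequestHandler serves files from `directory` without any
    # application logic, which is all a static website needs.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(root)
    )
    return http.server.ThreadingHTTPServer((host, port), handler)
```

Calling `make_static_server("path/to/checkout").serve_forever()` would then make the site available at `http://127.0.0.1:8000/` on your own machine. For a public mirror, a production web server is the more usual choice; this sketch is only meant for trying the site out locally.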

If you would like all 500,000 PDFs, including the original ERIC metadata, as a bulk data dump, you can download the 600 GB ERIC torrent from SciOp.net. SciOp is a torrent tracker developed by our friends at Safeguarding Research & Culture (SRC), an initiative dedicated to decentralised data preservation. Anyone can help preserve public data using their own computer by downloading and seeding any of the dataset torrents listed on SciOp.net. The complete ERIC metadata, including the additional 1.5 million catalog entries for publications not directly hosted on ERIC, is also available as a research dataset on Zenodo, compiled by Eric Phetteplace, a systems librarian at the California College of the Arts.

ERIC is only one of many public data repositories threatened by funding cuts. We are grateful to all the volunteers who are spending countless hours to preserve public data in their free time. At the Data Rescue Project we do our utmost to make every backup discoverable and accessible through our Data Rescue Tracker, which already lists over 800 datasets from almost 70 government agencies. A big thank you to everyone who contributes to these shared data rescue efforts!