The Historic Education Archive
The Historic Education Archive at the James E. Walker Library is a digital collection containing curated educational items published before 1925. This archive was created for students and faculty at the College of Education at Middle Tennessee State University who want to research the history and evolution of pedagogical methods.
As the Digitization Specialist, my primary role in this project was to collect metadata for, and scan, the titles chosen for the HEA. I also helped curate the collection by gathering statistics on each item through WorldCat by OCLC and identifying rare titles against a set of evaluation criteria. This narrowed the collection down to 800 titles, all of which I digitized. Digitization began with recording metadata for each book according to an internal standard, using FOLIO, our ILS. I then scanned most of the titles on a high-speed scanner with the CapturePerfect software. Items that would be moved to our Special Collections were scanned on a Bookeye instead.
One major challenge we encountered was ensuring that each title would be accessible to visually impaired users through Optical Character Recognition (OCR). The proprietary OCR software integrated with our scanner did not produce readable output, given the age and formatting of most of the HEA items. To address this issue, I wrote a Python script, built on the open-source OCR engine Tesseract, that runs OCR on specific titles in the collection. This script and supporting documentation can be found on this GitHub repository.
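As a rough illustration of the approach, and not the script from the repository itself, a batch OCR pass over scanned pages might look like the sketch below. It assumes the Tesseract command-line tool is installed and on the PATH, that each page is saved as a TIFF image, and that searchable PDFs are the desired output. The function names, folder names, and the `eng` language setting are all hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(image: Path, out_dir: Path, lang: str = "eng") -> list[str]:
    """Build the Tesseract CLI call that turns one page image into a searchable PDF."""
    # Tesseract takes an output *base* name and appends ".pdf" itself.
    output_base = out_dir / image.stem
    return ["tesseract", str(image), str(output_base), "-l", lang, "pdf"]


def ocr_directory(scan_dir: Path, out_dir: Path, pattern: str = "*.tif") -> list[Path]:
    """Run Tesseract on every matching image in scan_dir; return the expected PDF paths."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("The tesseract executable was not found on PATH.")
    out_dir.mkdir(parents=True, exist_ok=True)
    outputs = []
    for image in sorted(scan_dir.glob(pattern)):
        # check=True raises CalledProcessError if Tesseract fails on a page.
        subprocess.run(build_command(image, out_dir), check=True)
        outputs.append(out_dir / f"{image.stem}.pdf")
    return outputs
```

Calling `ocr_directory(Path("scans"), Path("ocr_output"))` would then write one PDF with an embedded text layer per page image, which is the kind of output a screen reader can work with. Calling the command-line tool directly keeps the sketch free of third-party Python packages; a wrapper library would work equally well.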
The archive is currently being processed and will be published in full through the JEWLScholar digital repository.