About the Project
The University of Iowa Libraries received funding from the National Historical Publications and Records Commission (NHPRC) (grant RD10020) to digitize the microfilm edition of the Henry A. Wallace (1888-1965) Papers and create an open-access online collection. The 67 reels of microfilm contain approximately 67,000 frames depicting correspondence (letters, telegrams, and postcards), appointment books, and memoranda by Wallace, the 33rd vice president of the United States. Wallace, an Iowa native, and his heirs have donated portions of his personal papers to The UI Libraries Special Collections & Archives since 1954. Contents of the microfilm edition span the years 1900-1965 and include all of the documents in the categories listed above as the collection existed at the time of filming. In 1970, The UI Libraries prepared a joint index to the microfilm editions of Wallace papers at Iowa, the Library of Congress, and the Franklin D. Roosevelt Library. Each institution filmed its collection independently; Iowa’s reformatting project and the compilation of the joint index were supported by an NHPRC grant. The collection also includes four reels of correspondence from Wallace’s father, Henry Wallace (1836-1916), the founder of Wallaces’ Farmer.
In the spirit of the Digitizing Historic Records grant, which funds cost-effective methods to digitize nationally significant historical record collections and make the digital versions freely available online, the project team developed an effective workflow that repurposed existing descriptive material, rather than creating new metadata. Efficiencies were also achieved in the areas of image processing and upload. Here are a few key efficiencies and the associated trade-offs:
Displaying images as they appeared on microfilm. We bypassed individual cropping and straightening of the 60,000 images and opted instead to batch resize them for display. The images are not pristine, but this decision cut processing time dramatically.
Organizing digital files at the reel level. We created one compound object per reel. Such large objects taxed the content management system, which resulted in slower load times. But the gains in processing time, the ability to better manage search results, and the ability to retain the organizational structure created during the microfilm project outweighed the concerns about load times.
Describing items at the reel level. Providing item-level description for each letter or memo does not scale. Instead, we repurposed metadata from the microfilm inventory and performed optical character recognition (OCR) scanning on the digital images to allow full text searching. Metadata was added at the reel level (“parent record”), rather than at the level of the individual images within each reel (“child records”); controlled access points for major correspondents were added to parent-level records, with names drawn from the existing inventory and mapped to standardized headings from the Library of Congress name authority file.
Providing a digital surrogate of the print index. Initially, users could only locate specific items by consulting the online index and then searching the collection by microfilm reel and frame number. While not the most elegant solution, this cross-referencing is familiar to researchers and was deemed a necessary and acceptable trade-off for the purposes of this project. As a second phase of this project, the Libraries transformed the index into a searchable database of letters from all three repositories. Users can now search with precision on date and correspondents.
Reformatting
Test scans of a random sample of the Wallace film by our vendor confirmed that we needed to scan in 8-bit grayscale; 1-bit bitonal (black and white) scanning lost too much information, especially penciled notations. The difference between 300 and 400 ppi was discernible enough that we chose 400 ppi, TIFF (ITU Group IV), uncompressed.
Every frame on each reel was numbered, including all targets. We therefore retained the targets and any duplicate exposures during scanning so that files could be named automatically; the targets were removed later. Our file naming convention used a simple reel/frame number: Ia(reel number)-(frame number to 4 digits). Example for reel 40, frame 1: Ia40-0001.
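The naming convention can be illustrated with a short Python sketch. This is not the project's actual tooling; the function names are hypothetical, and the sketch assumes that the reel number is written without zero padding (the only example given is reel 40) and that frame numbers run from 1 to 9999.

```python
import re

# Pattern for names such as "Ia40-0001": "Ia", reel number, hyphen, 4-digit frame.
# Assumption: the reel number is not zero-padded.
_NAME_PATTERN = re.compile(r"^Ia(\d+)-(\d{4})$")


def frame_filename(reel: int, frame: int) -> str:
    """Build a base file name from a reel number and a frame number."""
    if reel < 1:
        raise ValueError("reel number must be positive")
    if not 1 <= frame <= 9999:
        raise ValueError("frame number must fit in four digits")
    return f"Ia{reel}-{frame:04d}"


def parse_frame_filename(name: str) -> tuple[int, int]:
    """Recover (reel, frame) from a base file name, or raise ValueError."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a reel/frame file name: {name!r}")
    return int(match.group(1)), int(match.group(2))


print(frame_filename(40, 1))
print(parse_frame_filename("Ia40-0001"))
```

Because the frame number is encoded in the name, a citation from the index (reel and frame) maps directly to a file, and vice versa.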
JPEG images were derived from the original TIFF files and resized to 1500 pixels wide for display. The original resolution (400 ppi) was retained in the display images to increase the accuracy of optical character recognition (OCR). Images were uploaded into the CONTENTdm digital asset management system as non-hierarchical compound objects (multi-page items were not nested). OCR processing within CONTENTdm allowed search terms to be highlighted directly on the page image.
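As a rough illustration of the resize step, the sketch below computes the pixel dimensions of a display image 1500 pixels wide. It assumes the height was scaled proportionally, which is not stated above; the function name and the sample master dimensions are hypothetical, and the imaging software actually used is not named here.

```python
def display_size(width_px: int, height_px: int, target_width: int = 1500) -> tuple[int, int]:
    """Return (width, height) for a derivative scaled to target_width.

    Assumption: the aspect ratio of the master image is preserved.
    """
    if width_px <= 0 or height_px <= 0 or target_width <= 0:
        raise ValueError("dimensions must be positive")
    scale = target_width / width_px
    return target_width, round(height_px * scale)


# Hypothetical master frame of 3000 x 4000 pixels.
print(display_size(3000, 4000))
```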
We have converted the print index into a searchable database. The results link to our digitized materials or to the reel number for the other collections. When users start typing in the correspondent search box, names will appear, showing the number of results for each name, based on sender or recipient. Select the name you want and press search.
The resulting list includes everything to or from the individual selected. The film numbers for the Iowa microfilm link directly to the digitized version. The results display in date order. Columns can re-sort by sender, recipient or film number. You can also search by year or by multiple years.
If you are interested in a specific date, use the general search box and format the date with the year first, e.g. 1945-04-12. This top search box also lets you search by first name, or by last name to retrieve correspondence for everyone with that last name.
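For anyone preparing date queries programmatically, the year-first format matches ISO 8601 calendar dates, which Python's standard library produces directly. This is only a convenience sketch; the function name is hypothetical, and the query string still has to be entered in the search box by hand.

```python
from datetime import date


def search_date(year: int, month: int, day: int) -> str:
    """Format a date year first (YYYY-MM-DD), as the general search box expects."""
    # date() validates the values, so an impossible date raises ValueError.
    return date(year, month, day).isoformat()


print(search_date(1945, 4, 12))
```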