My Life as a Digital Archiving Lab Intern

Written by Digital Archiving Lab intern, Chase Monroe ’21.

This spring, I had the opportunity to intern in the Digital Archiving Lab under the supervision of the Digital Resources Librarian, Angie Kemp. My major project during the internship involved migrating the UMW publications (The Battlefield, The Aubade, Alumni Magazines, and more) from Eagle Explorer into our Digital Collections in Preservica. Throughout this project, I learned the basics of digital collections project management, including the creation and transformation of metadata.

Moving to Preservica provides virus protection for the publications, keeps the PDFs searchable, and provides workflows and options for file types that may become obsolete over time. Our Digital Collections has many historic resources like The Centennial Image Collection, The James L. Farmer Collection, and the UMW Blueprints and Architectural Drawings, so the addition of the publications creates a one-stop shop for searching.

While Eagle Explorer allows users to search full-text publications, the publications and the data describing them (metadata) are actually hosted in the Internet Archive (IA), which uses its own metadata elements. The metadata in our Digital Collections is in the Dublin Core schema (Figure 1). Metadata can be arranged in other schemas as well, but using a single schema matters because it standardizes the data elements that go into our Digital Collections. A uniform standard also makes it easy to search for the same date or subject across all collections and to share data across platforms.

A screenshot of a gray box with bolded metadata categories containing descriptive information about a photograph.

Figure 1. A cropped screenshot of metadata of the “James Farmer teaching civil rights class” photograph in the James L. Farmer Collection that has some of the Dublin Core elements (in bold).

I started the metadata transformation process by reviewing the descriptive metadata in the Internet Archive to see if there was anything we wanted to remove, keep, or edit for Preservica. However, our Digital Collections requires the Dublin Core schema, which is not used by the Internet Archive. So, I mapped each Internet Archive element to its corresponding Dublin Core element and assessed the metadata going into Preservica. For instance, the “Call number” element in IA corresponds to the “Relation” element in Dublin Core. Next, I made a new project folder in Oxygen XML Editor containing the batch of XML records for the publication that would be transformed into a new set for Preservica. Then, I created a Dublin Core template in Oxygen XML Editor (Figure 2) to visualize the Dublin Core schema for the XSL Stylesheet.

Screenshot of an XML file opened in Oxygen XML Editor software. Metadata elements, which are in angle brackets, and their associated content are listed.

Figure 2. A cropped screenshot of the Battlefield Dublin Core Template in Oxygen XML Editor.
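For readers who want to see the mapping idea in code, here is a minimal sketch in Python rather than in XSL. It is not the stylesheet used in the project. Only the “Call number” to “Relation” pairing comes from the mapping described above; the source tag names (such as `call_number`), the other pairings, the `oai_dc` wrapper element, and the sample record are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Standard Dublin Core element namespace, plus the OAI wrapper namespace.
# Using the oai_dc wrapper here is an assumption, not the project's template.
DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
ET.register_namespace("dc", DC_NS)
ET.register_namespace("oai_dc", OAI_DC_NS)

# Hypothetical crosswalk: source tag -> Dublin Core element name.
# Only call number -> relation is taken from the post; the rest are examples.
CROSSWALK = {
    "title": "title",
    "date": "date",
    "subject": "subject",
    "call_number": "relation",
}


def to_dublin_core(source_xml: str) -> ET.Element:
    """Build a Dublin Core record from one source metadata record."""
    source = ET.fromstring(source_xml)
    record = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for child in source:
        dc_name = CROSSWALK.get(child.tag)
        text = (child.text or "").strip()
        if dc_name is None or not text:
            continue  # drop elements that are unmapped or empty
        ET.SubElement(record, f"{{{DC_NS}}}{dc_name}").text = text
    return record


# Made-up sample record, not real collection data.
sample = """<metadata>
  <title>Sample issue</title>
  <call_number>SAMPLE-001</call_number>
  <scanner>not mapped, so it is dropped</scanner>
</metadata>"""

record = to_dublin_core(sample)
print(ET.tostring(record, encoding="unicode"))
```

Keeping the crosswalk in one small table mirrors the review step: anything not listed is left out of the new record, and each kept element lands under exactly one Dublin Core name.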

After my Dublin Core template was complete, I created the XSL Stylesheet in Oxygen (Figure 3) using the Dublin Core template as a guide. The XSL [eXtensible Stylesheet Language] Stylesheet allows you to change the format of a batch of XML records all at once into Dublin Core or other schemas! Angie provided me with the stylesheet template, and I made edits depending on the specific needs of the individual publication.

Screenshot of an XSL transformation file opened in the Oxygen XML software program, displaying the Dublin Core elements in angle brackets and the transformation directions for creating the content for each element.

Figure 3. This cropped screenshot of the Battlefield XSL sheet in Oxygen XML Editor shows the description that will appear in all new metadata files and the subject element that will pull from the IA XML files.

Once I had checked the XSL sheet for any errors, I began the transformation process by right-clicking on the original IA XML folder in the “project” tab and selecting “configure transformation” in Oxygen. I finished all the technical input, including configuring Oxygen to recognize the XSL file I created, and the software produced a new output folder of my final metadata. For the Battlefield, 100 IA XML files were transformed into Dublin Core XML files (Figure 4). You can view the transformed publications (Figure 5) in our Digital Collections here!

Screenshot of an XML file opened in Oxygen XML Editor software. Metadata elements, which are in angle brackets, and their associated content are listed.

Figure 4. An example of a resulting output XML file in Oxygen XML Editor from the batch transformation process.
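The batch step can also be pictured outside of Oxygen. The sketch below, again in Python rather than XSL, applies one per-record transform to every XML file in a folder and writes the results to a new output folder. It follows the pattern shown in Figure 3 (a fixed description added to every record, and subjects pulled from the source file), but the description text, the `record` wrapper element, the `subject` tag name, and the file names are placeholders I am assuming for the example.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Placeholder text; the real description is not quoted in the post.
FIXED_DESCRIPTION = "Placeholder description shared by every record."


def transform_record(source: ET.Element) -> ET.Element:
    """Add the fixed description, then copy each non-empty subject."""
    record = ET.Element("record")  # hypothetical wrapper element
    ET.SubElement(record, f"{{{DC_NS}}}description").text = FIXED_DESCRIPTION
    for subject in source.iter("subject"):
        text = (subject.text or "").strip()
        if text:
            ET.SubElement(record, f"{{{DC_NS}}}subject").text = text
    return record


def transform_folder(input_dir, output_dir) -> int:
    """Transform every .xml file in input_dir; return how many were written."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for source_path in sorted(input_dir.glob("*.xml")):
        source = ET.parse(str(source_path)).getroot()
        result = ET.ElementTree(transform_record(source))
        # Keep the original file name so input and output can be compared.
        result.write(str(output_dir / source_path.name),
                     encoding="utf-8", xml_declaration=True)
        count += 1
    return count


# Small demo with made-up files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    src, out = Path(tmp, "ia_xml"), Path(tmp, "dublin_core")
    src.mkdir()
    for n in (1, 2, 3):
        (src / f"issue_{n}.xml").write_text(
            f"<metadata><subject>Topic {n}</subject></metadata>",
            encoding="utf-8")
    print(transform_folder(src, out), "files transformed")
```

Returning the count gives a quick sanity check: the number of files written should match the number of source files in the batch.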

The project was a team effort; Carolyn Parsons, Sarah Appleby, Angie Kemp, and I assessed what data was necessary to keep for the publications. After the final decisions, I moved forward on my own, and Angie reviewed the files after I finished each publication. I communicated with Angie daily, whether I was asking questions or getting help with creating an XSL file.

Finally, I would like to thank Sarah and Carolyn for their valued input on the publication migration project and for their kindness. I enjoyed working with them in team meetings. I am so thankful for this opportunity to work for the Digital Archiving Lab, and with Angie, whom I have known since I was a freshman. She is a wonderful mentor, and working with her furthered my passion for working with metadata. I am proud to announce that I applied for the online Masters of Information Science program at the University of Tennessee in Knoxville, starting Fall 2021.

Screenshot of a digital collection web page, with a gray box containing bolded metadata categories and descriptive information about the collection. Thumbnails of 12 publication covers with titles and dates are listed below the gray metadata box. Facets allowing refinement by decade display in a gray box to the left of the thumbnails.

A screenshot of the Student Handbook publication within our Digital Collections.

April 15, 2021
