Since 2009, AVP has served as the chief consultant to Indiana University in its efforts to develop a comprehensive media preservation program, which resulted in what is now known as the Media Digitization and Preservation Initiative (MDPI). AVP provides analysis, planning, and roadmapping regarding digitization technology and systems, digital storage, operations and workflows, and finances associated with the digitization of over 220,000 hours of audiovisual content. This work helped the project obtain an initial $15M round of funding from the University, which funded a partnership with a private company, the development of a top-notch digitization facility, and the hiring of multiple positions.

Related to these efforts, AVP developed software for Indiana University to aid in the surveying and prioritization of its large-scale collections based on both technical and intellectual criteria. The software was developed as open source and is known as MediaSCORE and MediaRIVERS.

In 2015, IU hired AVP to investigate models and develop a strategy for a high-throughput description of the audiovisual materials being digitized as part of MDPI to improve discoverability and access. AVP gathered information from collections managers and users of MDPI content to understand whether metadata existed (it often did not) and, if so, in which formats and structures. AVP also noted optimal outputs and potential uses for the metadata, as well as related rights and permissions issues for the digitized objects.

The need for new approaches to the mass generation of metadata for digital audiovisual material was clear. AVP identified nearly 30 existing metadata generation mechanisms (MGMs), including natural language processing, facial recognition, legacy closed caption recovery, human-generated metadata, OCR of images, and transcription, that have the potential to capture and produce metadata at a massive scale when unified in a modular audiovisual metadata platform (AMP) architecture.

This initial research led to the proposal for a three-phase iterative approach to metadata capture. AMP would act as the workflow engine that pushes data from one MGM to the next. It would also serve as a decision engine, store metadata for processing, and provide a metadata warehouse for longer-term storage of all the metadata generated, which would in turn serve as a source for target systems such as Avalon.

First-phase MGMs would create sets of data that could be analyzed by second- and third-phase MGMs. By phase three, MGMs would begin to triangulate the various outputs from earlier processes to improve usability and provide confidence scoring to improve search results. Interviews conducted by AVP with users and collection managers produced a set of metadata fields requisite for discovery in Avalon, IU’s audiovisual access system, along with an assessment of each field’s overall value for discovery and its value for generating other metadata (e.g., general keywords can be analyzed to produce names, subject terms, and dates).
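The phased flow described above can be illustrated with a minimal Python sketch. It is not AMP's actual design: the `Record` and `Pipeline` classes, the stub MGMs, the sample transcripts, and the noisy-OR rule used to combine confidence scores are all assumptions made for illustration only.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Record:
    """One metadata value produced by an MGM (hypothetical structure)."""
    field_name: str
    value: str
    source: str
    confidence: float


# An MGM receives the records from earlier phases and returns new records.
MGM = Callable[[List[Record]], List[Record]]


@dataclass
class Pipeline:
    """Toy workflow engine: runs MGMs phase by phase and stores all output."""
    phases: Dict[int, List[MGM]] = field(default_factory=dict)
    warehouse: List[Record] = field(default_factory=list)

    def register(self, phase: int, mgm: MGM) -> None:
        self.phases.setdefault(phase, []).append(mgm)

    def run(self) -> List[Record]:
        for phase in sorted(self.phases):
            # MGMs in a phase see only what earlier phases produced.
            snapshot = list(self.warehouse)
            for mgm in self.phases[phase]:
                self.warehouse.extend(mgm(snapshot))
        return self.warehouse


# Phase 1 stubs: stand-ins for real tools, returning made-up sample text.
def speech_to_text(_records: List[Record]) -> List[Record]:
    return [Record("transcript", "Recorded in Bloomington in 1975",
                   "speech_to_text", 0.8)]


def closed_captions(_records: List[Record]) -> List[Record]:
    return [Record("transcript", "A 1975 campus recording",
                   "closed_captions", 0.9)]


# Phase 2: derive dates from the phase 1 transcripts.
def extract_dates(records: List[Record]) -> List[Record]:
    out = []
    for r in records:
        if r.field_name == "transcript":
            for year in re.findall(r"\b(?:19|20)\d{2}\b", r.value):
                out.append(Record("date", year,
                                  f"extract_dates<-{r.source}", r.confidence))
    return out


# Phase 3: triangulate agreeing values. Noisy-OR is an assumed scoring
# rule that treats the sources as independent.
def triangulate(records: List[Record]) -> List[Record]:
    miss: Dict[str, float] = {}
    for r in records:
        if r.field_name == "date":
            miss[r.value] = miss.get(r.value, 1.0) * (1.0 - r.confidence)
    return [Record("date_scored", value, "triangulate", 1.0 - m)
            for value, m in miss.items()]


pipeline = Pipeline()
pipeline.register(1, speech_to_text)
pipeline.register(1, closed_captions)
pipeline.register(2, extract_dates)
pipeline.register(3, triangulate)

results = pipeline.run()
scored = [r for r in results if r.field_name == "date_scored"]
```

In this sketch the two phase-one transcripts both mention the same year, so the phase-three step emits a single scored date whose confidence (roughly 0.98) is higher than that of either source alone. A real platform would also need the decision logic, rights handling, and export to target systems that the proposal describes.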

Additionally, AVP performed an analysis of all costs, resources, staffing allocations, technology, and services required in the implementation of AMP within IU. This project offered IU: (a) an architecture and strategy for AMP, (b) a realistic view of what would be required to implement AMP, and (c) the opportunity for vast improvements to discoverability and access to their audiovisual collections.
