AVPreserve, METRO Release Audiovisual Cataloging & Reporting Tool
4 September 2013
AVPreserve announces the Beta release of the AVCC Cataloging Tool and Reporting Modules in conjunction with The Metropolitan New York Library Council’s (METRO) Keeping Collections Project. The tool is offered as a free resource to help archives manage, preserve, and create access to their collections.
AVCC is a set of forms and guidelines developed to enable efficient item-level cataloging of audiovisual collections. Each module (Audio, Video, Film) includes individualized data entry forms and reports that quantify information such as format types, base types, target format sizes, and other data critical to prioritizing and planning preservation work with audiovisual materials.




Based on years of experience with how audiovisual collections are typically labeled and stored, AVCC establishes a minimal set of required and recommended fields for basic intellectual control that are not entirely dependent on playback and labeling, along with deeper descriptive fields that can be enhanced as content becomes accessible. The focus of AVCC is two-fold: to uncover hidden collections via record creation and to support preservation reformatting in order to enable access to the content itself.
More information and a request form for access to your own module are available through the METRO Keeping Collections website at http://keepingcollections.org/avcc-cataloging-toolkit/. AVCC is currently in Beta form and has been designed in Google Docs. Currently only the Audio and Video modules are available. The development of a more stable web-based database utility is anticipated in early 2014, and your feedback in testing this current version will help guide that work. Please direct any questions to AVPreserve Senior Consultant Josh Ranger via the AVCC or avpreserve.com contact forms.
AVCC was developed by AudioVisual Preservation Solutions with support from The Metropolitan New York Library Council (METRO) Keeping Collections project. Keeping Collections was launched to ensure the sustainability and accessibility of New York State's archival collections as part of the New York State Archives Documentary Heritage Program. Keeping Collections provides a variety of free and affordable services to any not-for-profit organization in the metropolitan New York area that collects, maintains, and provides access to archival materials.
AVCC is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/ or send a letter to Creative Commons, 444 Castro Street, Suite 900, Mountain View, California, 94041, USA.
AVCC ©2013 AudioVisual Preservation Solutions, Inc.
AVPreserve Releases Interstitial – Free Error Detection Tool
3 September 2013
AVPreserve is pleased to announce the release of interstitial, a new tool designed to detect dropped samples in audio digitization processes. These dropped samples are caused by fleeting interruptions in the hardware/software pipeline on a digital audio workstation. The tool follows up on our work with the Federal Agencies Digitization Guidelines Initiative (FADGI) to define and study the issue of Audio Interstitial Errors.
interstitial compares two streams of digitized audio captured to a digital audio workstation and a secondary reference device. Irregularities that appear in the workstation stream and not in the other point to issues like Interstitial Errors that relate to samples lost when writing to disk. This utility will greatly decrease post-digitization quality control time and help further research on this problem.
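As a rough illustration of the underlying idea (a sketch only, not the tool's actual implementation), the following compares two time-aligned captures and flags points where the workstation stream suddenly shifts against the reference, which is the signature of a dropped sample. The array names, window size, and lag limit are assumptions made for this example.

```python
# Illustrative sketch only -- not the interstitial tool's actual algorithm.
# Assumes two time-aligned mono captures of the same signal: `daw` (the
# workstation capture) and `ref` (the reference device capture), both as
# 1-D NumPy float arrays at the same sample rate.
import numpy as np

def local_offset(a, b, max_lag=8):
    """Estimate how many samples `a` is shifted against `b` within a short window."""
    best_lag, best_score = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            x, y = a[lag:], b[:len(b) - lag]
        else:
            x, y = a[:lag], b[-lag:]
        n = min(len(x), len(y))
        score = float(np.dot(x[:n], y[:n]))  # simple correlation score
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def find_dropped_samples(daw, ref, window=4096, hop=2048, max_lag=8):
    """Report positions where the workstation stream's alignment against the
    reference suddenly changes -- the signature of dropped samples."""
    flags, prev = [], 0
    end = min(len(daw), len(ref)) - window
    for start in range(0, end, hop):
        lag = local_offset(daw[start:start + window],
                           ref[start:start + window], max_lag)
        if lag != prev:
            flags.append((start, prev, lag))  # offset changed inside this hop
            prev = lag
    return flags

# daw = ...  # workstation capture as a 1-D float array
# ref = ...  # reference device capture, aligned at the start
# print(find_dropped_samples(daw, ref))
```

A flag where the estimated lag jumps from 0 to 1, for instance, marks the region where the workstation capture lost a sample that the reference device retained.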
interstitial is a free tool available for download via GitHub. Visit our Tools page for this and other free resources.
Toward Less Precious Cataloging
9 August 2013
For a number of years after college I was in the workers compensation and general liability industry, bouncing around among positions from file clerk to claims adjuster to data analyst. Somewhere in there I did transcriptions of phone interviews between adjusters and claimants. It was one of the more depressing parts of the job, so I tried to distract myself by attempting to transcribe things as faithfully as possible, reflecting every single er, uhm, stutter, pause, and garbled sentence fragment. I wanted to reflect how people really spoke in all of its confusion and banal glory.
Coming out of literary studies, I forced myself to imagine there was some meaning to those various tics and patterns, but I also wanted to see how far I could stretch the plasticity of punctuation. What length of pause was a comma, a dash, a double (or triple!) dash, an ellipsis? Are fragmented thoughts governed by semicolons even if such marking was unintended by the speaker? One almost has to feel (or, at least I do) that under copyright law my expression of this recorded speech was so transformative that those transcriptions were wholly new works of art!
Did I mention I was very bored at this job?
It is true, however, that transcription is an art. It is the interpretation of information in one medium into another, not necessarily compatible, medium. I guess it wouldn’t be considered a fine art because it is more along the lines of didacticism — the direct communication of that information with maximal understanding as a goal, rather than an approach of impressionism, nuance, and open questions of interpretation.
But an art it is, requiring specialized knowledge to be done well and skill greater than just typing the words as they come. One can look at the frequent mistakes of spell check and speech-to-text, and consider the complex set of rules structured around such utilities, to appreciate how difficult written language is, in spite of our false sense of security in the rules of grammar. The ability of a text to be read by a machine does not confer infallibility and comprehension to that machine — on its own or as a conduit to a user. Despite Strunk & White’s efforts, there was a reason they wrote of elements of style.
*********************************
I find cataloging and related record creation to follow along these same lines. It too is a system of communicating information that is more of an art than its (multiple) rules-based structures may suggest. It is also a system in which the art and stylistic traditions can and do fail machine readability, even when specifically designed for such.
You see, part of the art of cataloging is a series of visual tropes or signifiers that denote concepts or avenues of interpretation. For instance, [bracketed text]. This typically does not mean that there are actual brackets in the original text, but rather may mean “This is an unofficial value I made the decision to assign to this field”…or maybe “I couldn’t read the handwriting and this is my guess”…or maybe “This is commentary, not actual information from the object”. Likewise brackets may be used to contain text such as […] or [?], which could mean “Unintelligible”, or “Maybe?”, or “Just guessin’ here”. (That last is my own way of sayin’ it.) One sees this frequently with titles, dates, and durations especially, though such visual signifiers may appear anywhere in a record.
And these markings are a thing of beauty — poetic layers of expression in six line strokes. A toehold of information in an uncertain world. A reflection of the cataloger’s fidelity to truth, even when it means admitting the truth is unknown and that one has failed in identifying it.
But.
But.
But, as can at times be the case, such art is impractical, limiting the efficiency of working with large data sets, the advantage of machine readability for cleaning up and searching data, and the achievement of goals associated with minimal processing, which for archives is often the first pass at record creation.
As I have argued before, the outcomes of processing legacy audiovisual collections should support planning for reformatting and for collection management. This means creating records at or near item level. This means data points like format, duration, creation date, physical characteristics, asset type, and content type must be quantified to some degree. This means no uncertainties. There must be some kind of value for planning and analysis to work, whether it is an estimate or a guess or yeah-that’s-wrong-but-things-will-come-out-in-the-wash — or whether we just use common sense and context to support our assumptions rather than undermining them. This means being less precious.
The diacritics of uncertainty are meaningful, but when analyzing data sets they can push segments of records aside into groupings where they likely shouldn’t be, and they make it difficult or impossible to properly tally data for planning and selection. A bracketed number or a number with text in the same field is not a number to a machine. It cannot be summed. An irregular date cannot be easily parsed and grouped within a range. A name or term with a question mark is a different value than that same term without the question mark.
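To make that concrete, here is a small, hypothetical illustration (the values are invented, not pulled from any real catalog) of how hedged entries fall out of a simple tally, and how a blunt normalization pass recovers them:

```python
import re

# Hypothetical catalog values -- illustration only, not from any real collection.
durations = ["30", "[30?]", "ca. 45", "60"]  # minutes, as a cataloger might write them

# A machine cannot sum what it cannot parse:
total = 0
for value in durations:
    try:
        total += int(value)
    except ValueError:
        pass  # "[30?]" and "ca. 45" silently fall out of the tally
print(total)  # 90 -- not the 165 minutes actually on the shelf

# One blunt normalization pass: strip the hedging, keep the number.
def normalize_duration(value):
    match = re.search(r"\d+", value)
    return int(match.group()) if match else None

print(sum(normalize_duration(v) or 0 for v in durations))  # 165
```

The normalized numbers are admittedly guesses dressed up as values, which is precisely the trade being argued for here.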
We know all of this — it is why we have controlled vocabularies and syntactical rules, and why we fret about selecting the “right” metadata schema. There is much theory around archival practice that comes down to exactness and authenticity of objects and documentation. However, there is a division in how such things are expressed and understood by a computer and by the human mind. There is also a division in exactness as data and exactness as a compulsive behavior. And any exactness is laid to waste by the reality of how collections are created.
When processing for reformatting, it must be kept in mind that the data created is not the record of record. It is a step toward a complete record. It can be revised when content is watchable, or when embedded technical metadata gives us exact values such as duration. We know we are wrong or uncertain. The machine does not need to know it, cannot know it. But that fact will be apparent enough to future generations over time, as it is in so many other areas.
— Joshua Ranger
AVPreserve Second Lining It To SAA 2013
8 August 2013
AVPreserve Senior Consultant Joshua Ranger and Digital & Metadata Preservation Specialist Alex Duryee will be attending the Society of American Archivists 2013 Annual Meeting in New Orleans next week.
Josh will be presenting on the panel “Streamlining Processing of Audiovisual Collections” (Session 309) on Friday morning along with expert colleagues from the Library of Congress and UCLA. He will be discussing approaches AVPreserve uses in processing collections and performing assessments as outlined in his white paper “What’s Your Product”, as well as the AVCC and Catalyst tools we have developed to support efficient inventory creation.
Alex will be presenting in Tuesday’s Research Forum “Foundations and Innovations” on the topic of compiling and providing access to large sets of email correspondence.
We’re excited to be contributing as members of SAA this year and look forward to seeing friends old and new. We love to chat about media archiving, so look for us and stop and talk — it will be too hot to do anything else in NOLA. And oh yeah, buttons.
New Tools In Development At AVPreserve
10 July 2013
AVPreserve is in the finishing phases of development for a number of new digital preservation tools that we’re excited to release in the near future.
Fixity: Fixity is a utility for the documentation and regular review of checksums of stored files. Fixity scans a folder or directory, creating a database of the files, locations, and checksums contained within. The review utility then runs through the directory and compares results against the stored database in order to assess any changes. Rather than reporting a simple pass/fail of a directory or checksum change, Fixity emails a report to the user documenting flagged items along with the reason for a flag, such as that a file has been moved to a new location in the directory, has been edited, or has failed a checksum comparison for other reasons. Supplementing tools like BagIt that review files at points of exchange, when run regularly Fixity becomes a powerful tool for monitoring digital files in repositories, servers, and other long-term storage locations.
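As a rough sketch of the monitoring pattern described above (not Fixity's actual code or report format), the following compares a fresh scan of a directory against a stored manifest and classifies each difference; the file layout, hashing algorithm, and manifest format are assumptions made for this example.

```python
# Illustrative sketch of checksum monitoring -- not Fixity's actual code.
# A manifest maps relative path -> checksum; re-scans are compared against it.
import hashlib
import json
import pathlib

def checksum(path):
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def scan(root):
    root = pathlib.Path(root)
    return {str(p.relative_to(root)): checksum(p)
            for p in sorted(root.rglob("*")) if p.is_file()}

def compare(stored, current):
    report = {"new": [], "moved": [], "changed": [], "missing": []}
    old_by_sum = {digest: path for path, digest in stored.items()}
    for path, digest in current.items():
        if path in stored:
            if stored[path] != digest:
                report["changed"].append(path)   # edited, or failed its checksum
        elif digest in old_by_sum:
            report["moved"].append(f"{old_by_sum[digest]} -> {path}")
        else:
            report["new"].append(path)
    moved_sources = {m.split(" -> ")[0] for m in report["moved"]}
    report["missing"] = [p for p in stored
                         if p not in current and p not in moved_sources]
    return report

# First run: store a manifest. Later runs: re-scan, compare, and send the report.
# json.dump(scan("/archive"), open("manifest.json", "w"), indent=2)
# stored = json.load(open("manifest.json"))
# print(compare(stored, scan("/archive")))
```

Run on a schedule, that compare step is what turns a one-time checksum manifest into ongoing monitoring of a storage location.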
Interstitial Error Utility: Following up on our definition and further study of the issue of Audio Interstitial Errors (report recently published on the FADGI website), AVPreserve has been working on a tool to automatically find dropped-sample errors in digitized audio files. Prior to this, an engineer would have to use reporting tools such as those in WaveLab that flag irregular-seeming points in the audio signal and then manually review each one to determine if the report was true or false. These reports can produce hundreds of flags, the majority of which are not true errors, greatly increasing the QC time. The Interstitial Error Utility compares two streams of the digitized audio captured on separate devices. Irregularities that appear in one stream and not in the other point to issues like Interstitial Errors that relate to samples lost when writing to disk. This utility will greatly decrease quality control time and help us further our research on this problem.
AVCC: Thanks to the beta testing of our friends at the University of Georgia our AVCC cataloging tool should be ready for an initial public release in the coming months. AVCC provides a simple template for documenting audio, video, and film collections, breaking the approach down into a granular record set with recommendations for minimal capture and more complex capture. The record set focuses on technical aspects of the audiovisual object with the understanding that materials in collections are often inaccessible and have limited descriptive annotations. The data entered then feeds into automated reports that support planning for preservation prioritization, storage, and reformatting. The initial release of AVCC is based in Google Docs integrated with Excel, but we are in discussions to develop the tool into a much more robust web-based utility that will, likewise, be free and open to the public.
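To illustrate the kind of automated reporting described above, here is a toy aggregation over a handful of invented inventory records; the field names and the per-minute data rates are assumptions for this example, not AVCC's actual record set or report logic.

```python
# Illustrative aggregation only -- field names and rates are assumptions,
# not AVCC's actual record set or report logic.
from collections import Counter

records = [
    {"format": "U-matic", "minutes": 30},
    {"format": "U-matic", "minutes": 60},
    {"format": "1/4in audio", "minutes": 90},
]

# Assumed preservation target data rates in GB per minute (hypothetical values).
GB_PER_MINUTE = {"U-matic": 1.4, "1/4in audio": 0.02}

format_counts = Counter(r["format"] for r in records)
total_minutes = sum(r["minutes"] for r in records)
storage_gb = sum(r["minutes"] * GB_PER_MINUTE.get(r["format"], 0) for r in records)

print(format_counts)                                  # items per format
print(total_minutes, "minutes to reformat")
print(round(storage_gb, 1), "GB of target files (estimated)")
```

Reports along these lines (items per format, total duration, estimated target storage) are what turn an item-level inventory into numbers a preservation plan can be built on.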
Watch our Twitter and Facebook feeds for more developments as we get ready to release these tools!
AMIA 2013 Presentations
8 June 2013
AVP was involved in a number of panels and events at the 2013 Association of Moving Image Archivists Conference in Richmond, Virginia, including organizing the first ever AMIA HackDay and presentations related to the imminent decay of magnetic media, the importance of metadata development for digital preservation, and the intricacies of vendor selection for digital asset management systems.
- CHRIS LACINAK, “THE END OF ANALOG MEDIA: THE COST OF INACTION AND WHAT YOU CAN DO ABOUT IT” (PDF)
- MIKE CASEY, INDIANA UNIVERSITY, “WHY MEDIA PRESERVATION CAN’T WAIT: THE WEATHERING STORM” (PDF)
- AMIA & DLF HACKDAY 2013 PROJECTS (Wiki)
- SETH ANDERSON, “NAVIGATING THE DIGITAL ARCHIVE: FIRST, KNOW THYSELF” (PDF)
- SETH ANDERSON, “MASTERING YOUR DATA: TOOLS FOR METADATA MANAGEMENT IN AV ARCHIVES” (PDF)
- KARA VAN MALSSEN, “FROM ZERO TO DAM” (PDF)
MARAC 2013 Presentations
8 June 2013
The Mid-Atlantic Region Archives Conference epitomizes the importance and reach of regional professional organizations, opening educational and networking opportunities to broader audiences that cannot regularly attend national conventions. AVP was a proud first-time participant on two panels this year on the topics of preserving complex digital artworks and the refinement of archival data using tools such as Open Refine. We look forward to presenting at future meetings.
AVPreserve At NYAC 2013
30 May 2013
AVPreserve President Chris Lacinak will be chairing a panel at the upcoming New York Archives Conference to be held June 5th-7th in Brookville, New York. NYAC provides an opportunity for archivists, records managers, and other collection caretakers from across the state to further their education, see what their colleagues are doing, and make the personal connections that are becoming ever more important for collaborating and finding opportunities to combine efforts. We’re pleased to see that this year’s meeting is co-sponsored by the Archivists Round Table of Metropolitan New York, another in their critical efforts to support archivists and collections in New York City and beyond.
Chris will be the chair for the Archiving Complex Digital Artworks panel with the presenters Ben Fino-Radin (Museum of Modern Art), Danielle Mericle (Cornell University), and Jonathan Minard (Deepspeed Media and Eyebeam Art and Technology Center). It’s hard to believe, but artists have been creating digital and web-based artworks for well over 20 years, and the constant flux in the technologies used to make and display those works has brought a critical need for new methodologies to preserve the art and the ability to access it. Ben, Danielle, and Jonathan are leading thinkers and practitioners in the field and will be discussing the fascinating projects they are involved with. We’re proud to work with them and happy their efforts are getting this exposure. We hope to see a lot of our friends and colleagues from the area in Brookville!
New Software Development Services At AVPreserve
29 May 2013
One of the core concepts behind the work we do is that, because the necessary tools for media collection management and preservation have been limited in scope and because the need for such tools is so urgent, it is up to us as a community to develop those tools and support others who do as well, rather than hoping the commercial sector will get around to it any time soon in the manner we require. This has been an important factor in why AVPreserve has looked to open source environments in developing utilities such as DV Analyzer, BWF MetaEdit, AVI MetaEdit, and more.
While most are aware that AVPreserve has been a part of developing those desktop utilities, over the past year we have been doing a greater amount of work in the design and development of more complex, web-based systems we are using for the various needs of cataloging, managing and updating metadata, collection analysis, and digital preservation. With the collaboration of our regular development team and the designer Perry Garvin, we work on all aspects of frontend and backend development, from needs assessment to wireframes, implementation, and support. Some of the projects we have been working on are built entirely from scratch, while others focus on implementation of existing systems or the development of micro-services that integrate with those systems to round out their capabilities for managing collections.
Some of our new systems include:
Catalyst Inventory Software:
In support of our Catalyst Assessment and Inventory Services we developed this software to support greater efficiency in the creation of item-level inventories for audiovisual collections. At the core is a system for capturing images of media assets, uploading them to our Catalyst servers, and then completing the inventory record based on those images. Besides setting up a process that limits disruption onsite and moves the majority of work offsite, Catalyst also provides an experience-based minimal set of data points that are reasonably discernible from media objects without playback. This, along with the use of controlled vocabularies and some automated field completion, increases the speed of data entry and provides an organization with a usable inventory much more quickly. Additionally, the images provide information that is not necessarily transcribable, that extends into deeper description beyond the scope of an inventory, or that may be meaningful for identification beyond typical means. This may include rundown sheets or other paper inserts, symbols and abbreviations, or extrapolation based on format type or arrangement. Search and correct identification can be performed without pulling multiple boxes or multiple tapes, and extra information can be gleaned from the images. Catalyst was recently used to perform an item-level inventory of a collection of over 82,000 items in approximately three months.
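As a toy illustration of the controlled-vocabulary and automated field completion approach mentioned above (the vocabulary, field names, and defaults here are invented, not Catalyst's actual schema), record creation can look roughly like this:

```python
# Toy sketch of controlled-vocabulary entry with automated field completion.
# The vocabulary, field names, and defaults are invented for illustration.
FORMATS = {
    "VHS":           {"media_type": "video", "default_minutes": 120},
    "Betacam SP":    {"media_type": "video", "default_minutes": 30},
    "Audiocassette": {"media_type": "audio", "default_minutes": 60},
}

def make_record(barcode, fmt, minutes=None):
    if fmt not in FORMATS:
        raise ValueError(f"{fmt!r} is not in the controlled format vocabulary")
    entry = FORMATS[fmt]
    return {
        "barcode": barcode,
        "format": fmt,
        "media_type": entry["media_type"],  # completed automatically from the format
        "minutes": minutes if minutes is not None else entry["default_minutes"],
    }

print(make_record("AV0001", "VHS"))  # duration falls back to the format default
```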
AMS Metadata Management Tool
The AMS was built to store and manage a set of 2.5 million records consolidated from over 120 organizations. The system allows administrative-level review and analytic reporting on all record sets combined, as well as locally controlled and password-protected search, browse, and edit features for the data. This editing feature integrates aspects of Open Refine and Mint to perform complex data analysis quickly and easily in order to complete normalization and cleanup of records or identify gaps. Records are structured to provide description of the content on the same page along with a listing of all instantiations of that content where multiple copies exist, while also providing streaming access to proxies.
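A rough sketch of that one-description, many-instantiations shape (the class and field names below are assumptions for illustration, not the AMS data model):

```python
# Rough sketch of a one-description / many-instantiations record shape.
# Class and field names are assumptions for illustration, not the AMS data model.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Instantiation:
    identifier: str
    format: str
    generation: str                  # e.g. "Master", "Dub"
    proxy_url: Optional[str] = None  # streaming proxy, where one exists

@dataclass
class ContentRecord:
    title: str
    description: str
    organization: str
    instantiations: List[Instantiation] = field(default_factory=list)

record = ContentRecord("Evening News, 1978-05-02", "Local broadcast...", "Station X")
record.instantiations.append(Instantiation("x-001", "2 inch Quad", "Master"))
record.instantiations.append(
    Instantiation("x-002", "U-matic", "Dub", "https://example.org/proxy/x-002"))
print(len(record.instantiations), "instantiations of", record.title)
```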
Indiana University MediaScore
AVPreserve’s development team supported Indiana University in the design and creation of MediaScore, a new survey tool used to gather inventories of collections. Within MediaScore assets are grouped at the collection or sub-collection level and then broken down by format type with like characteristics (i.e., acetate-backed 1/4″ audio, polyester-backed 1/4″ audio, etc.), documenting such things as creation date, duration, recording standards, chemical characteristics, and damage or decay. This data is then fed into an algorithm that assigns a prioritization score to the collection based on format and condition risk factors. In this way the University is able to gather the hard numbers they need for planning budgets, timelines, and storage needs for digitizing their campus-wide collections while also providing a logic for where to start and how to proceed with reformatting a total collection of over 500,000 items that they feel must be addressed within the next 15 years.
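As a rough illustration of that kind of prioritization scoring (the risk factors, weights, and values below are invented for this example, not Indiana University's actual MediaScore algorithm):

```python
# Illustrative prioritization scoring only -- the factors, weights, and values
# below are invented, not Indiana University's actual MediaScore algorithm.
FORMAT_RISK = {"lacquer disc": 5, "acetate 1/4in audio": 4, "VHS": 3, "CD-R": 2}
CONDITION_RISK = {"severe decay": 5, "moderate decay": 3, "good": 1}

def priority_score(fmt, condition, pre_1970):
    """Higher score means reformat sooner."""
    score = FORMAT_RISK.get(fmt, 1) + CONDITION_RISK.get(condition, 1)
    if pre_1970:
        score += 1  # age treated as one more simple risk factor
    return score

collections = [
    ("acetate 1/4in audio", "moderate decay", True),
    ("VHS", "good", False),
]
for fmt, cond, old in sorted(collections, key=lambda c: -priority_score(*c)):
    print(priority_score(fmt, cond, old), fmt, cond)
```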
**************
These are just a few of the projects we’re working on right now. We’re thrilled to be augmenting our services this way and to be able to provide fuller support to our clients, as well as to continue developing resources that will contribute to the archival field. We’re looking forward to new opportunities to collaborate and provide solutions in this way.
AVPreserve Welcomes Alex Duryee
24 May 2013
AVPreserve is pleased to welcome Alex Duryee to our team. Alex is a 2012 MLIS graduate of Rutgers University with a specialization in Digital Libraries and comes to us from Rhizome, where he worked on the preservation of net- and disc-based artworks. Experienced in web archiving, data preservation, and the migration of content from optical and magnetic digital storage, Alex is working on a number of digital preservation projects with us. These range from developing new tools and software to support integrity assurance for digital repositories and audio transfers (extending our work with BWF MetaEdit and Digital Audio Interstitial Errors) to the analysis and normalization of a dataset consisting of 2.5 million records. The work he is doing is part of our growing focus on developing tools and support for managing metadata and digital files in a preservation environment. We’re excited to add this to our continuing services involving physical collections and excited to have Alex on board!