Digital Preservation Is People

By: Joshua Ranger
November 14, 2014

Preservation is a resource-heavy endeavor. People. Time. Equipment. Infrastructure. Facilities. Space. Training. And, I suppose, some cash-ola.

Collection management is, to a degree, resource management. This is what I have available: How do I balance it? Where can I add and where can I cut and still do the most good? (Or, as can unfortunately be the case, where do I have to cut, and cut, and cut, and cut, and cut…)

In the best-case scenario, these decisions can be made or primarily influenced by the archivist or knowledgeable caretaker. But in terms of facilities or IT support, that may not always be the case. Broader institutional budgets, policies, or departmental politics may limit the caretaker’s influence in such matters.

Not that IT is being malicious here. IT is another resource-heavy undertaking, and IT departments have their own concerns, their own resources to manage, and policies and procedures that have to be applied across multiple departments whose needs may be more focused on day-to-day computing requirements, finite or scheduled storage, and access to cat videos or online shopping with minimal load time.

Really, though, due to the increasingly interrelated nature of archives and other departments like IT, policy and budgetary decisions will be under repeated review and questioning by those further afield from our field. This has definitely been the case with file storage and digital preservation, where archives require a greater degree of space, monitoring, and retention than has typically been applied under normal records management.

As the digitization of legacy materials and the accessioning of born-digital collections ramp up and up in archives, the need for preservation-quality digital storage continues to explode. And in spite of the belief that storage is getting cheaper, the fact remains that going from 1TB to 10TB is a rapid increase that can be burdensome to absorb all of a sudden when compared to former budgets and system management needs.

The potential or imagined cost savings of cloud computing and storage have inextricably hung the cloud over the IT conversation and, therefore, over the conversation about the storage and access of digital archival materials. Depending on collection size, type, and internal resources, the cloud is not always the answer, but the option exists and must necessarily be reviewed. In the past few months there have been a number of blog posts and white papers on the topic of using cloud storage as a preservation environment, ranging from the British National Archives’ work on guidance and our own series on assessing cloud storage service providers, to some more recent excoriations of the viability of Amazon S3 as reliable preservation storage.

Amazon in and of itself is likely not a great preservation storage solution. Amazon in tandem with well-outlined requirements and a preservation-related system and/or enforced preservation policies and/or other storage management services has potential. S3 is not the only cloud storage solution, but it often becomes a metonymic stand-in for all cloud services because it is the big fish in the pond. The cloud is not a specific type of storage media or system. Rather, it is merely a computing or storage service provided remotely by a third party. These are the same services in-house IT may provide, depending on resources and capabilities. The service may involve disk or tape. It may involve complex management software or a simple web interface. It may involve cursing at your computer. No, it will involve cursing at your computer.

Regardless, as we all should know by now, any third-party solution is a preservation risk due to potential changes in support or the discontinuation of a product (or the discontinuation of a service provider). Preservation decisions are inherently risk management decisions: determining where trade-offs are acceptable, where higher risks are allowed, and where lower risks are required. Strategies such as redundancy and geographic separation are risk management solutions. But risk assessment and planning are also part of risk management. Where are the potential weak points in the chain, and how do I account for them or prepare for an inevitable break in the chain? What is my exit strategy from a provider and, in assessing a vendor, will their policies support those requirements?

A major issue in the archival field is the professionalization of the field, which is in part impacted by the application of standards and by available training in areas such as AV and digital preservation. The problem is, the education is not in place widely enough and the standards are not firmly in place. Or the problem is, there is no single standard to follow and apply when it comes to file-based collection management and preservation. Applying standards (or non-standards) and tools is a decision tree and, of course, an assessment of risk regarding those decisions. What is high risk, what is lower risk, and what am I willing to gamble on that scale?

It’s a new world, and a new model to adapt to in order to manage files on a more accelerated schedule for migration, refreshing, obsolescence, and continual change. There are increased risks with cloud-based storage for preservation, and the simple cost of storage alone is not the whole story: bandwidth costs, a layer of file management, and digital preservation services such as file integrity checks are just part of the additional required costs beyond storage. Of course, this is the same with the long-term preservation of physical objects, where facilities costs, environmental monitoring, staff time to physically manage and retrieve/reshelve materials, and regular conservation work are all costs above just buying a big room to stick things in.
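
For readers less familiar with what a file integrity check involves, here is a minimal sketch in Python. It assumes a hypothetical manifest that maps file paths to SHA-256 checksums recorded at some earlier point; the function names and the manifest format are illustrative assumptions, not any particular tool's or vendor's method.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large AV files never sit in memory whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(manifest):
    """Compare each file against its previously recorded checksum.

    `manifest` is a hypothetical dict of {file path: expected SHA-256 hex digest}.
    Returns a dict of {file path: "ok", "changed", or "missing"}.
    """
    results = {}
    for name, expected in manifest.items():
        path = Path(name)
        if not path.is_file():
            results[name] = "missing"
        elif sha256_of(path) == expected:
            results[name] = "ok"
        else:
            results[name] = "changed"
    return results
```

Note that a check like this has to read every byte of every file. Run against remotely hosted copies, that typically means pulling the data back across the network, which is one reason bandwidth belongs in the same budget conversation as storage.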

And in the same way that we need to work with Facilities to ensure that materials, HVAC, cleaning and maintenance, lightbulb replacement, and other physical storage design and upkeep activities meet archival policy standards, we also need to be aware of those issues with digital storage. This includes knowing how to argue for certain decisions under review by others, knowing how to review the decisions of others to ensure they are suitable prior to implementation, and being able to create or adjust policies to be flexible enough to fit new scenarios or make the best of situations that are not 100% under our control. Being a part of a larger institution means that we do not always have complete control over decisions, and existing infrastructures may be under continual review by others from perspectives that have very little to do with preservation or best practices.

Within archiving and preservation we can get caught up in or obsessed with materials, with stuff (the boxes, the mylar sleeves, the hardware, the software, the Clip Kloppers), but in the end these are mere tools. It’s always good to step back and remember that, even in the floating world of digital, preservation and collection care are very much about people, about communicating and collaborating and figuring out how to make the best use of the tools available within the given situation.
