Preservation Health Check: Monitoring Threats to Digital Repository Content

Contents:

  • Acknowledgement
  • Introduction
  • Preservation Metadata: Costs and Benefits
  • Preservation Metadata for Threat Assessment
  • Methodology
  • Next Steps
  • Notes
  • References

OCLC has launched a Preservation Health Check (PHC) pilot program to develop a general approach to monitoring the health of repository content based on its associated preservation metadata. One of its goals is to confirm the value of collecting and maintaining preservation metadata and of using it to monitor threats to digital content. This report provides background on the problem the PHC project addresses, the project’s approach to operationalizing the concept of a preservation health check, some preliminary findings, and next steps. It explores the opportunities for using preservation metadata to support threat assessment exercises and, in particular, evaluates the utility of PREMIS preservation metadata as an evidence base for such assessments.

The threat assessment model chosen for the study is the Simple Property-Oriented Threat (SPOT) Model. The reasoning is that the PREMIS Data Dictionary represents an evidence base of information that is potentially useful for assessing the imminence of threats to the archived content in a digital repository, while the SPOT Model serves as a framework for organizing and assessing the evidence supplied by PREMIS-based preservation metadata. The SPOT Model identifies six essential properties for digital objects; this report uses the property ‘persistence’ to illustrate the pilot project’s preliminary mapping and reasoning.

The paper concludes that there is reason to believe that PREMIS metadata can support threat assessment. In Phase 2, the project team hopes to explore the issue further by constructing additional generalized logic sequences and diagrams that demonstrate how PREMIS metadata can be used to assess the threats defined in the SPOT Model. Once these examples are constructed, the team will test their efficacy on a data set of “real-world” preservation metadata.
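
To make the idea of a logic sequence over preservation metadata more concrete, the following is a minimal Python sketch of how event metadata might feed an assessment of the ‘persistence’ property. It is not taken from the report and does not reproduce the project’s mapping. The field names are loosely modeled on PREMIS Event semantic units (eventType, eventDateTime, eventOutcome); the status labels, the one-year threshold, and the function names are hypothetical choices made for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Event:
    # Fields loosely follow PREMIS Event semantic units; values are illustrative.
    event_type: str            # e.g. "fixity check"
    event_date_time: datetime  # when the event took place
    event_outcome: str         # e.g. "pass" or "fail" (assumed vocabulary)


def latest_fixity_check(events: List[Event]) -> Optional[Event]:
    """Return the most recent fixity-check event, or None if there is none."""
    checks = [e for e in events if e.event_type == "fixity check"]
    if not checks:
        return None
    return max(checks, key=lambda e: e.event_date_time)


def assess_persistence(
    events: List[Event],
    now: datetime,
    max_age: timedelta = timedelta(days=365),  # hypothetical policy threshold
) -> str:
    """Classify the evidence for 'persistence' using hypothetical labels."""
    last = latest_fixity_check(events)
    if last is None:
        return "unknown"   # no evidence recorded either way
    if last.event_outcome != "pass":
        return "threat"    # latest check reports a problem
    if now - last.event_date_time > max_age:
        return "stale"     # evidence exists but is older than the threshold
    return "ok"
```

A real assessment would presumably draw on more than fixity events (storage and environment metadata, for example), and the thresholds would be a matter of local repository policy rather than fixed values.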

For those tasked with justifying the creation, collection, and maintenance of extensive preservation metadata, OCLC’s pilot project is something to follow closely. A good deal of literature exists on the theory of preservation metadata and its role in reducing risk, but it remains theory. This report begins to clarify, in more concrete terms, how preservation metadata could be used to assess threats to digital content on an operational level.

The paper is important for anyone involved in defining, implementing, and promoting the use of preservation metadata, and for those trying to understand how preservation metadata works with threat models. It does not provide all the answers, since it reports only on Phase 1 of the project, but it offers some interesting and promising first results.