D3.1 Open research challenges and research roadmap for SCAPE

Contents:

  • 1 Introduction
  • 2 Research in Digital Preservation
  • 2.1 Digital Preservation Research roadmaps
  • 2.2 Current Research Questions in DP
  • 2.3 Research Goals in SCAPE
  • 2.3.1 Overview and method
  • 2.3.2 Scalable platform
  • 2.3.3 Scalable planning and watch
  • 2.3.4 Scalable components
  • 2.3.5 Additional research data testbed goals
  • 2.4 Emerging Topics
  • 3 Community involvement
  • 3.1 Digital information models
  • 3.2 Value, utility, cost, risk and benefit
  • 3.3 Organizational aspects
  • 3.4 Experimentation, simulation, and prediction
  • 3.5 Changing paradigms, shift, evolution
  • 3.6 Future content and the long tail
  • 4 Digital Preservation Challenges
  • 4.1 Future preservation infrastructures
  • 4.2 Advanced simulation and prediction models
  • 4.3 Information models and benchmarking
  • 4.4 Identification of emerging topics
  • 5 Conclusions and Outlook

This report outlines the research roadmap of the SCAPE project, which focuses on the scalability of preservation systems in terms of storage and processing as well as decision making and control. It outlines the key goals of the R&D work packages in SCAPE, grouped by sub-project (preservation components, preservation platform, and preservation planning and watch), and places them within the European digital preservation research landscape. Each research goal briefly outlines the state of the art, key contributions, and open issues.

The document also reports on the results of a workshop on Open Research Challenges, organized at IPRES 2012. It includes a summary of discussions held around six topics:

  1. Digital information models
  2. Value, utility, cost, risk and benefit
  3. Organizational aspects
  4. Experimentation, simulation, and prediction
  5. Changing paradigms, shift, evolution
  6. Future content and the long tail

The report goes on to identify and outline common gaps and openings for future research. Finally, it identifies three emerging critical research topics that arise from the cross-section of the identified open problems. These are, broadly speaking:

  1. Future preservation infrastructures
  2. Advanced simulation and prediction models
  3. Information models and benchmarking

This report is especially good at describing a complex project with ambitious goals. The goals are clearly defined and put into context (state of the art, SCAPE’s intended contributions, and open issues), giving the reader a good understanding of some of the most difficult issues repositories face when trying to implement preservation services. The project focuses on two areas. The first is scalability: how to handle capacity management and quality of service when confronted with huge collections of complex objects. The second is automated preservation planning: how to help organizations move from high-level strategic planning and decision making to the execution of automated preservation plans. The report's consideration of an organization’s maturity level, and its ability to take on such challenges, is welcome.

The second half of the report, summarizing a session on research challenges at IPRES 2012, gives readers a useful overview of issues confronting the field, from the philosophical to the practical: how digital data differs (if it does), and the ongoing evolution of digital preservation concepts. It concludes with some interesting opinions on continuing critical research topics, including whether, given the complexity of operating trustworthy digital preservation systems (TDR), individual institutions are wise to go it alone.

The report certainly motivates the reader to stay informed about what the SCAPE project plans to deliver. It is interesting reading for anyone involved in digital preservation.