D24.2 Implementation and Testing of an Authenticity Protocol on a Specific Domain

Contents:

  • 1. Introduction
  • 2. Health-care Data Repository in Vicenza
  • 3. Social Science Data Repository at the UK Data Archive
  • 4. Scientific Experimental Data Repository in HEP
  • 5. Articulation with the Rest of the APARSEN WPs and Tasks
  • 6. Integration and Outreach
  • 7. Conclusions
  • References
  • Appendix – Ingest at the UK Data Archive

The APARSEN project has developed an “authenticity management model”, described in deliverable D24.1. The model is based on the principle of performing controls and collecting authenticity evidence in connection with specific events in a digital resource’s lifecycle. It is complemented by a set of operational guidelines that allow an institution to set up an Authenticity Management Policy, i.e. to identify the relevant transformations in the lifecycle and to specify which controls should be performed and which authenticity evidence should be collected in connection with those transformations. The aim of this deliverable was to test the model and the guidelines at an operational level on the concrete problem of setting up or improving an LTDP repository in a given environment, arriving at the definition of an adequate authenticity management policy.

The analysis was undertaken in multiple test environments provided by APARSEN partners. The repository of the health-care system in Vicenza (Italy) is discussed in section 2. In the Vicenza case study the guidelines have been tested (for at least two of the workflows) to their full extent: from the preliminary analysis, to the identification of the relevant lifecycle events, to the detailed specification of the AERs. Moreover, in one case the process has been carried through to the formal definition of the authenticity management policy, down to the specification of the authenticity protocol.

A second case study, presented in section 3, deals with the social science and humanities repository at the UK Data Archive at the University of Essex. The Archive ingests extremely heterogeneous collections with limited influence over actions in pre-ingest keeping systems, and it therefore limits its detailed responses to the SUBMIT and INGEST events.
Like the CERN case study (described below), the Archive operates within a highly connected, long-standing relationship with its depositors (across the governmental and academic sectors); unlike the other case studies, however, its workflow involves extensive curation and enrichment (mainly for standardization and context) during AIP creation, relying on complex manual processing by specialized teams. The last case study, discussed in section 4, is devoted to scientific experimental data management at CERN, and more generally in the High Energy Physics community, which manages an immense data flow and has to face the considerable complexity and diversity of its research data output. Section 5 describes how this work relates to the other work packages and tasks of APARSEN. Section 6 discusses the integration of the WP24 activity with other projects, and how the results of the RTD activity could actually be translated into practice. Finally, section 7 offers some concluding remarks.
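The model’s core idea, performing a control and capturing authenticity evidence whenever a relevant lifecycle event occurs, can be illustrated with a minimal sketch. The record structure, field names, and fixity control below are assumptions for illustration only, not the normative AER layout defined by the APARSEN model; the SUBMIT and INGEST event names are those mentioned in the case studies above.

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

# Hypothetical AER structure: field names are assumptions, not the
# normative layout specified in deliverable D24.1.
@dataclass
class AuthenticityEvidenceRecord:
    event: str        # lifecycle event, e.g. "SUBMIT" or "INGEST"
    resource_id: str  # identifier of the digital resource
    timestamp: str    # when the control was performed (UTC, ISO 8601)
    agent: str        # who or what performed the control
    evidence: dict    # evidence collected by the control

def record_event(event: str, resource_id: str,
                 payload: bytes, agent: str) -> AuthenticityEvidenceRecord:
    """Perform a simple fixity control and collect evidence for one event."""
    return AuthenticityEvidenceRecord(
        event=event,
        resource_id=resource_id,
        timestamp=datetime.now(timezone.utc).isoformat(),
        agent=agent,
        evidence={
            "sha256": hashlib.sha256(payload).hexdigest(),
            "size_bytes": len(payload),
        },
    )

# Evidence captured at SUBMIT can be re-checked at INGEST: matching
# digests support a claim that the resource was not altered in between.
submitted = record_event("SUBMIT", "dataset-001", b"example content", "producer")
ingested = record_event("INGEST", "dataset-001", b"example content", "archive")
assert submitted.evidence["sha256"] == ingested.evidence["sha256"]
print(json.dumps(asdict(submitted), indent=2))
```

An actual authenticity protocol would of course specify many more controls per transformation (identity checks, format validation, provenance capture) and a persistence format for the records; the sketch only shows the event-to-evidence coupling that the guidelines operationalize.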

This deliverable provides valuable real-world feedback on the applicability of APARSEN’s authenticity management model. Readers should select the case study most relevant to their own situation. The UK Data Archive case study is particularly interesting because it describes the Archive’s interest in moving from a largely manual, procedures-based management scenario to a more granular system of event-based information capture, while emphasizing the financial and time investments such a move will incur. The Archive is currently in transition from systems that do not log all events, whose event data is not all machine-readable or actionable, and which have little control over pre-ingest, producer-related activity that may affect an object’s provenance and authenticity. This situation is a very common one, and the Archive’s feedback offers some useful approaches for gradually introducing the needed protocols, balancing authenticity requirements against real-world archive constraints.
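One gradual path from procedures-based logs to event-based capture is to parse existing free-text log entries into structured, machine-actionable events. The sketch below assumes a hypothetical legacy log format and invented field names and identifiers; it is not the UK Data Archive’s actual log format or schema, only an illustration of the kind of retrofit such a transition involves.

```python
import json
import re
from datetime import datetime, timezone

# Hypothetical legacy log format, e.g. "2013-05-07 ingest study SN-7210 by curator1".
# The pattern and field names are assumptions for illustration.
LOG_PATTERN = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<action>\w+) study (?P<study>\S+) by (?P<operator>\S+)"
)

def parse_log_line(line: str) -> dict:
    """Turn one legacy log line into a structured event, or raise ValueError."""
    m = LOG_PATTERN.match(line.strip())
    if m is None:
        raise ValueError(f"unparseable log line: {line!r}")
    return {
        "event": m.group("action").upper(),  # normalized event name, e.g. INGEST
        "object_id": m.group("study"),
        "agent": m.group("operator"),
        "occurred": m.group("date"),         # when the action happened
        "recorded": datetime.now(timezone.utc).isoformat(),  # when it was captured
    }

event = parse_log_line("2013-05-07 ingest study SN-7210 by curator1")
print(json.dumps(event, indent=2))
```

Lines that fail to parse can be queued for manual review rather than discarded, which lets an archive introduce structured capture incrementally without losing the evidence already held in its procedural records.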