PrestoPRIME

The Project

Objectives

Audiovisual content collections are transforming from archives of analogue materials to very large stores of digital data. As time-based digital media and their related metadata are edited, re-used and re-formatted in a continuously evolving environment, the concept of the unique original loses its meaning and we require new dynamic preservation processes. The purpose of PrestoPRIME is therefore to research and develop practical solutions for the long-term preservation of digital audiovisual objects, programmes and collections, and to find ways to increase access by making media archives available within the framework of European on-line digital libraries. This will result in a range of tools and services made available through a networked Competence Centre.

Audiovisual recordings preserve the history of the 20th Century so that the events and personalities can be seen and heard – something never before possible, and a record with unique impact and meaning to all people. Unlike documents which may last for centuries or even millennia, audiovisual recording materials have a life expectancy measured at best in decades. The EC-sponsored projects Presto and PrestoSpace have addressed the first stage of the problem by developing a preservation factory approach and methods for digitising audiovisual collections. As a result, the processes and technology for digitising analogue holdings are well known. Although the process is far from complete, in the region of 10 million hours of European audiovisual content have already been digitised. Audiovisual media is now being ‘born digital’, and will also be kept as electronic files. The new challenge is how to ensure the long-term viability of the content within these files. As the archives follow the production processes into the digital domain, they need to confront the issues of digital preservation. Digital preservation requires the indefinite error-free storage of digital information, with means for its retrieval and interpretation, irrespective of changes in technologies, support and data formats, or changes in the requirements of the user community.

It is wrong to think that simply copying or migrating media files to a new format, whenever required, will solve the problem. Technical obsolescence, transcoding, reformatting, data movement, and physical media deterioration all introduce threats and the risk of loss of quality and content. Checking their integrity is not straightforward, and files, being invisible, are easily lost or mislaid. It is often supposed that digital copies are necessarily identical, but the size of audiovisual files increases the probability of read errors. The emergence of transcoding artefacts in the digital audiovisual domain is an obvious danger when “lossy” codecs and formats are used: more subtle risks are emerging in the audio domain, which reveals artefacts as a result of multiple transcoding with professional digital equipment that supports supposedly lossless formats. Nor would the problem be solved by the invention of some new, ‘everlasting’ data storage medium: even standard file formats disappear. MPEG-1 video, which was introduced as a digital equivalent to VHS videotape in the early 1990s, is already effectively obsolete. It is therefore clear that archives and libraries need to store not just the audiovisual files, but all the information needed to keep file content readable and usable.
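
One common way to check that a copy really is identical to its source is to compare cryptographic checksums of the two files. The following is a minimal sketch of such a check, not a PrestoPRIME tool: the function names `file_digest` and `copy_is_intact`, the choice of SHA-256 and the 1 MiB chunk size are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks.

    Hypothetical helper: the hash algorithm and chunk size are illustrative.
    """
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Reading in chunks keeps memory use flat even for very large media files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_is_intact(original, copy):
    """Return True when both files have the same SHA-256 digest."""
    return file_digest(original) == file_digest(copy)
```

A matching digest only shows that the bytes are unchanged. It says nothing about whether the format can still be decoded, which is why the information needed to interpret the file has to be kept as well.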

Many different standards, reference models and frameworks have been suggested for managing contents in digital libraries. The OAIS Reference Model defines the processes required for long-term preservation and access to information objects, establishing a common language to describe them. But although the OAIS model discusses preservation through migration, it does not specify how to monitor objects or the systems they are stored in, identify when migration should take place, how it should take place, or to what an object should be migrated. Research is now in progress (for example by the National Archives and Records Administration (NARA) and the EU SHAMAN project) to create a theory of preservation that extends the concept from one of just sending the records into the future to one that can also preserve a description of the environment that is being used to manage and read the records. The entire information context should be described so well that the records can be migrated into an independent preservation environment without losing authenticity or integrity, so that the data remains indefinitely accessible. But while digital preservation technology offers a range of tools, processes and standards, very few of these have even been tested on audiovisual files. The use and re-use of content in different territories attracts a further level of metadata in the form of rights. A recent study by RAI has identified 800 different types of rights associated with its archived assets alone. This number will increase further when scaled at the European level, and the ability to manage rights will have a bearing on the provision of public access to audiovisual archive content. In the digital universe, the broadcast archives (which hold by far the largest collections of audiovisual content) are part of a wider system of library, museum and private sector media collections, each of which has its own media creation and preservation practices.
Changing technologies and consumption patterns are also blurring and shifting the boundaries between personal and public media: a growing volume of user-generated content is entering the public space not only through the Internet and sites such as YouTube, but through the mainstream broadcasters. The capture and use of user-generated content by professional media organisations raises issues of the compatibility of user-generated metadata with library and archive norms. Any general solution for long-term audiovisual preservation needs to accommodate the different approaches of media creators, the archive and library communities, and support the unified on-line access proposed by the European Digital Library initiative.

PrestoPRIME’s goal is therefore to research and develop methods for applying long-term preservation practice and preservation theory to the practical requirements of sound and moving image content, so that it can be safely preserved in affordable distributed and federated environments. In response to these complex needs, PrestoPRIME will research new technologies for long-term audiovisual preservation that recognise the specificities of audiovisual media on the one hand and, on the other, the long-term promise of approaches based on multivalent techniques in distributed and federated environments. This requires that we take into account:
  • Complexity of audiovisual file formats and wrapper files that include multiple signals and time-dependent metadata, in formats largely unknown to digital library preservation technology.
  • Preservation of the signal: audiovisual files encode an approximation of a signal, which is reconstituted in the playback. The primary issue is preservation of the signal. Migration of the "optimal representation of the signal" is an absolute requirement for public access.
  • Large numbers of large files: all technology has failures, and mass storage technology works by getting failure rates down to acceptable levels and using redundancy and maintenance to deal with failures when they do happen. When moving from terabytes to petabytes to exabytes, this approach is expensive and if you turn your back for a second, data losses can be catastrophic. We need storage management technology that "understands" the audiovisual signal, can perform a "best endeavour" playback of the data that is still readable when problems do happen, and certainly doesn’t just say "sorry, read error, your content is lost forever".
  • Cost of safety: we need to optimise the trade-offs between quality, quantity, cost and safety so that archives can afford to preserve huge volumes of digital audiovisual content to acceptable standards.
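
The scaling concern in the list above can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes bit errors and replica losses are independent events, which real storage systems only approximate, and the function names and any rates passed to them are hypothetical rather than measured figures.

```python
import math


def prob_any_bit_error(size_bytes, bit_error_rate):
    """Probability of at least one bit error when reading size_bytes once.

    Assumes every bit fails independently with probability bit_error_rate.
    """
    bits = size_bytes * 8
    # 1 - (1 - p)^n, computed via log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(bits * math.log1p(-bit_error_rate))


def prob_all_copies_lost(per_copy_loss, copies):
    """Probability that every one of `copies` independent replicas is lost."""
    return per_copy_loss ** copies
```

Calling `prob_any_bit_error` with the same assumed error rate for a gigabyte, a terabyte and a petabyte shows the first probability rising with volume, while `prob_all_copies_lost` shows how each extra replica reduces the chance of total loss at the cost of more storage. That is the trade-off between quantity, cost and safety described in the final bullet.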

PrestoPRIME will pursue the research and development with four strands of activity, each of which is associated with one principal Objective, against which progress and outcomes will be assessed:

Archives, libraries, museums and other collections. To be achieved by:

  • Comparing strategies for audiovisual preservation, including multivalent, emulation and migration approaches, and creating a standard data model for audiovisual content preservation.
  • Researching metadata solutions for long-term audiovisual preservation to manage migration between data models, metadata enrichment, the maintenance of correct relationships between audiovisual objects, the integrity of files, and a model of the preservation process itself.
  • Modelling storage and processing as a set of services with well-defined interfaces, service level agreements and quality of service metrics appropriate to preservation.
  • Producing software tools and a framework for implementing and combining both migration and multivalent preservation strategies optimised for audiovisual content.
  • Developing a service oriented storage infrastructure that enables multivalent and migration preservation strategies to be deployed on both local and federated storage and processing.
  • Creating a quality assessment and risk management framework for monitoring and optimising the long-term safety and integrity of audiovisual content when using migration/multivalent strategies.

To research and develop means of ensuring the long-term future access to audiovisual content in dynamically changing contexts. To be achieved by:

  • Establishing metadata interoperability between audiovisual archives, cultural heritage institutions, the Semantic Web and content portals, with services for metadata conversion and deployment.
  • Developing tools and services to enrich archive content with user generated metadata and align it with existing metadata.
  • Developing services for tracking audiovisual content, using fingerprinting techniques and documenting the provenance of audiovisual content items.
  • Modelling rights associated with audiovisual content and developing standard-compliant services for archive rights management.

To integrate, evaluate and demonstrate tools and processes for audiovisual digital permanence and access. To be achieved by:

  • Integrating the software modules and metadata tools in a functioning testbed.
  • Validating the preservation actions and tools in the testbed.
  • Developing test material and proving the ability of the integrated testbed to carry out the required actions for long-term preservation with very large volumes of audiovisual content.
  • Integrating the PrestoPRIME software with existing market-leading digital libraries software.

To establish a European networked Competence Centre to gather the knowledge created by PrestoPRIME and deliver advanced digital preservation advice and services in conjunction with the European Digital Library Foundation and other projects. To be achieved by:

  • Setting up the legal and operational structure of the networked Competence Centre for audiovisual preservation and the European Association of Audiovisual Archives.
  • Defining and initiating activities for monitoring and registering technology and best practice.
  • Issuing guidance on standards relating to audiovisual content and its preservation.
  • Establishing a communication platform for the Competence Centre, to manage the Centre’s outputs.
  • Publishing economic guidance and business models for digital media preservation activities.