
Thursday, January 03, 2008

Institutional Repositories and Digital Preservation

Blogging on Peer-Reviewed Research

Hockx-Yu, H. (2006). Digital preservation in the context of institutional repositories. Program: electronic library and information systems, 40(3), 232-243. DOI: 10.1108/00330330610681312

This article looks at the digital preservation dimension of developing institutional repositories. Helen Hockx-Yu was a programme manager working in this area for the JISC (Joint Information Systems Committee), so she is also able to outline the JISC view on institutional repositories and some of its initiatives to assist people developing them.

I am interested because De Montfort University Library is responsible for developing DORA: the De Montfort University Open Research Archive. I also administer the Library's LOCKSS Archive, which attempts to digitally preserve content from the electronic journal services to which the Library subscribes. There is not necessarily an overlap between these two activities: we are not using LOCKSS to preserve content from the Institutional Repository, though in theory, that may be possible.

Institutional repositories
The question 'What is an institutional repository?' is answered with the definition 'digital collections that capture and preserve the intellectual output of a single or multi-university community'. DORA at least claims to be an attempt to capture De Montfort's intellectual output. DORA also follows the pattern of most currently established institutional repositories in collecting 'e-prints' of scholarly publications. An IR could aim to include images, video, sound and other file formats. The wider the intake of file formats, the more problems are created from the digital preservation point of view.

Institutional Repositories are still new arrivals in the information landscape. Hockx-Yu can point to potential benefits for:
  • Authors: gain visibility.

  • Users: find information more easily (usually without charge).

  • Institutions: increase research profiles.

  • Funders: see wider dissemination of research outputs.
The push to develop IRs comes partly from institutions and from those Research Councils that have mandated that research outputs be made freely available. Whether these motivating factors reach and mobilise the people doing the research remains to be seen. The JISC has set up a number of projects, such as SHERPA and ROMEO, to help overcome some of the obstacles.

Digital preservation
Digital preservation refers to the series of managed activities necessary to ensure continued access to digital materials for as long as necessary. Backing up the files may be part of this activity, but it is not the whole process. For example, you can save a file, but unless you know what application you need to open it, it will not be of much use to anyone. There may be an argument that institutional repositories need not bother with digital preservation: much of the material held is a poorer version of the polished articles published in commercial journals, and those journals should take on the burden of preservation as part of their service. Some publishers are more willing than others to take on this responsibility. In my view this is one of the lessons of the LOCKSS Pilot project.
There may be other materials held in a repository, such as images or research databases, where the repository holds the primary preservation responsibility. Other JISC-sponsored projects can help to assess the costs and practices used to preserve such material: LIFE and eSPIDA.

One question that we as institutional repository managers need to address is that of format migration. Should we accept any file format submitted, or migrate those deemed to be 'non-standard' to a format based on an open standard? Hockx-Yu does not say whether PDF or the archive version of PDF is appropriate for this, though the James study 1 of file formats may offer some guidance here.
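
As a rough illustration of what such a policy might look like at the point of deposit, here is a minimal Python sketch. The list of 'open' formats, the function name triage and the two outcome labels are my own assumptions for the example; which formats belong on the list is exactly the question that neither the article nor this post settles.

```python
from pathlib import PurePath

# Hypothetical policy list, for illustration only. Deciding which formats
# really count as acceptable open standards is the open question.
OPEN_FORMATS = {".pdf", ".txt", ".xml", ".csv", ".png"}


def triage(filename):
    """Return a deposit decision based only on the file extension."""
    suffix = PurePath(filename).suffix.lower()
    if suffix in OPEN_FORMATS:
        return "accept"
    # Anything else, including files with no extension, is flagged
    # so that a person can decide whether to migrate it.
    return "review for migration"


for name in ["thesis.PDF", "results.sav", "README"]:
    print(name, "->", triage(name))
```

Checking the extension alone is a weak test, since extensions can be missing or wrong, so a real workflow would want to identify the format from the file contents as well.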

The one JISC-sponsored preservation project that we have been involved with does not get any detailed treatment in Hockx-Yu's article. The LOCKSS Pilot attempts to preserve material published in commercial or open access journals. Some LOCKSS members have set up private networks to share the preservation work for material they hold. That would not be covered by the Pilot sponsored by JISC, but LOCKSS may still touch upon material held in institutional repositories.

There is a debate about who has the responsibility for digital preservation: institutions or commercial publishers. One way of gathering evidence on this problem would be to take a sample of an institution's intellectual output and check how much of it is being actively preserved in a way appropriate for that institution. This could mean checking against the titles preserved within Portico, if the institution is a Portico member. A similar test could be done against the 'archival units' held by a LOCKSS member. You could also compare for overlaps with the content held in an institutional repository. If only the IR is making an effort to preserve material, then by default it may have found itself with the primary preservation responsibility.
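
The comparison described above is essentially a set of overlap checks, and could be sketched in Python as follows. This is only a sketch under assumptions: the function names, the item identifiers and the idea that each service's holdings can be exported as a simple list of identifiers are all hypothetical, and neither Portico nor LOCKSS is being queried here.

```python
def preservation_coverage(output_ids, portico_ids, lockss_ids, repository_ids):
    """Map each item of research output to the services that hold it."""
    services = {
        "Portico": set(portico_ids),
        "LOCKSS": set(lockss_ids),
        "IR": set(repository_ids),
    }
    return {
        item: {name for name, held in services.items() if item in held}
        for item in output_ids
    }


def ir_only(coverage):
    """Items where the repository is the only service preserving them."""
    return sorted(item for item, held_by in coverage.items() if held_by == {"IR"})


def unpreserved(coverage):
    """Items that none of the services hold."""
    return sorted(item for item, held_by in coverage.items() if not held_by)


# Made-up identifiers, purely for illustration.
research_output = ["article-A", "article-B", "dataset-C", "image-D"]
coverage = preservation_coverage(
    research_output,
    portico_ids=["article-A"],
    lockss_ids=["article-A", "article-B"],
    repository_ids=["article-B", "dataset-C"],
)
print("IR is sole preserver of:", ir_only(coverage))
print("Not preserved anywhere:", unpreserved(coverage))
```

The items in the first list are the ones for which the repository has, by default, ended up with the primary preservation responsibility.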

1 James, H., Ruusalepp, R., Anderson, S., Pinfield, S. (2003), Feasibility and Requirements Study on Preservation of E-Print,
