D-Lib Magazine
Neil Beagrie
Introduction
This article provides a retrospective of digital preservation work by the Joint Information Systems Committee (JISC) [1] and sets out progress to date with the Continuing Access and Digital Preservation Strategy for the Joint Information Systems Committee (JISC) 2002-5 [2] and its implementation plan, which were approved by JISC committees in October 2002. In previous years, much had been achieved by JISC and other bodies engaged in digital preservation efforts with relatively modest investment. However, the escalating scale and complexity of digital resources to be curated and preserved, and the subsequent urgency of developing a critical mass of expertise, shared services and tools for long-term digital preservation, required a step change in investment and approaches. The Strategy set out the case for higher investment by the UK Higher and Further Education (HE and FE) sectors in digital preservation and the principles and priorities for JISC-funded activities and external partnerships to be followed over a three-year period and beyond.
Background
The management and preservation of digital materials are of increasing importance for a wide range of activities within education and research. Much of the knowledge base and intellectual assets of institutions and staff are now in digital form. Unless significant effort is put urgently into digital preservation and securing long-term access to these digital resources, uncertainties over archiving will continue to impede the growth and take-up of digital services and new working practices. In addition, unless digital assets can be preserved over time, current investment in digitisation and digital content will only secure short-term rather than lasting benefits. If the challenges in digital preservation are not addressed, the threat is very real and insidious: it will eat away the future of our cultural heritage, knowledge economies, and information society.
Although there are a number of well-known individual examples of loss or near loss, such as the BBC Domesday Disks [3], statistics on current losses are difficult to compile. Wider overviews are rare. In part this is because few organisations wish to publicise losses. Also, sometimes the information can be recovered or substituted in some way (e.g., substituting a paper version of an e-publication). In such cases, the loss is often more subtle: information can effectively be degraded through loss of functionality, linking, or supplementary documentation, substantially reducing its real value. There is good statistical information on the current explosive growth of digital information [4] and clear projections for a future data deluge in areas such as scientific research [5]. Instruments currently being built and experiments being planned now will, in a few years, generate more data than has previously been generated in the whole of human history up to this point. Not all of this information has constant and persistent value, but a significant proportion of it does. A serious and worsening gap has developed between our ability to create digital information and our infrastructure and capacity to manage and preserve it over time. Some commentators have referred to the likely cumulative effect of this as a future "digital dark ages". As a committee of the UK Higher and Further Education Funding Councils, the JISC serves some 200 Higher Education institutions and over 500 Further Education institutions across the UK. Its mission is to promote innovative use of ICT in tertiary education. In its criteria for funding projects and services, JISC seeks to ensure they are:
In addition to running the high-speed academic network for the UK, JISC provides national services and electronic content, and JISC therefore has a vested interest in digital preservation. It has been one of the leading institutions worldwide in undertaking research and development on the long-term preservation of digital materials through projects and services such as the Arts and Humanities Data Service [6], Cedars [7], Camileon [8], and the JISC/NPO research studies [9]. In June 2000 the author was appointed by JISC to build on this work, developing policy, guidance, and collaboration with a range of partners to address the growing challenges and the threat of digital information loss. There were three major objectives and related outcomes:
The JISC Continuing Access and Digital Preservation Strategy
The Strategy itself is both an advocacy document supporting a funding bid for the implementation plan and a road map for future work. The implementation plan sets out in more detail the specific categories of material that need to be covered, critical issues and how it is proposed these issues should be addressed, and phases and funding for this work to be progressed. For JISC and HE/FE institutions, responsibility for and degree of influence over long-term preservation are complex. They directly create or fund the creation of digital materials but are also major licensees of commercial content or are heavily dependent on digital content created and stored by others (e.g., government, research data centres, The British Library or The National Archives). JISC's role therefore has been to support collective action on behalf of the sector and develop advice for individual institutions, to seek appropriate licensing arrangements for commercial content, and to build collaboration and action on long-term preservation with other agencies. The Strategy and implementation plan cover an extremely wide range of activities and materials, but there are some key departures in the Strategy that are worth describing here in more detail. They include:
Emphasis on supporting records management and appraisal
A strong emphasis was placed within the Strategy on records management as one of the strands that can contribute to awareness and implementation of long-term preservation in institutions. Many Higher and Further Education institutions have traditionally placed relatively low value on the importance of their own internally produced information. As a result, relatively few institutions have implemented records management, archive or digital preservation programmes, and even fewer have employed staff with the relevant professional skills to address these issues. This is especially true of the smaller institutions within the UK. We recognised that implementation of the Data Protection Act and the Freedom of Information Act in the UK had major implications for institutions. Good records management would be central to compliance with existing and forthcoming legislation and potentially could be a major driver in changing existing institutional practices. Initial work with the community at the institutional level on the Strategy therefore focussed on records management and was led by Steve Bailey, the JISC Electronic Records Manager. JISC activity in this area has sought to familiarise senior management and key decision makers with the concept of the records continuum and to promote a holistic approach to the management of information from creation through current use to final disposition. The JISC has also been developing a range of toolkits designed to help institutions address their own specific information-related issues. The publication of an updated and expanded version of the Record Lifecycle Report [13], first published in 1999, represented one such tool and forms the basis for many others such as action plans and templates for conducting information audits and creating record retention schedules.
Not all institutions have professional records managers or archivists, or wide experience in electronic records management amongst their staff. There was therefore a need for a wide range of training events, publications for the community, and opportunities for data exchange and supported self-help. We have aimed to support these institutions through events, a programme call, and JISC services and publications [14]. JISC itself has also taken steps to advance its own internal records management, including appointment of the JISC Electronic Records Manager and implementation of an electronic records management system, as well as developing criteria, procedures, and banners for retention and archiving of documents on its own website (see Figure 1 below).
Risk assessment and feasibility studies to identify and scope challenges and recommended actions for specific categories and classes of digital material of interest to JISC.
Studies can be a double-edged sword: they can be a mechanism for delaying or putting off action by funding bodies, or they can provide a mechanism for intensively assessing options and scale of challenges for future action. Digital preservation remains complex and sometimes high risk. We can learn from our mistakes as well as our successes, but risks need to be managed if overall programmes and funding are to thrive. The JISC studies have been undertaken with a commitment to moving the Strategy forward and also with a firm commitment to taking an incremental, phased approach. In this way, detailed feasibility and scoping studies and risk assessment precede pilot projects or prototyping of new services. These major studies are intended to cover all significant areas of the JISC's and institutional digital collections. To date these studies and supporting publications include the following:
All of these publications have been subjected to rigorous peer review and public consultation on draft versions, and they have been made publicly accessible on the JISC website. Figure 2, a diagram from the Data Curation for e-science study, provides an example of content from one of these studies. It also usefully illustrates the connections that exist between the many different categories of material we have covered.
Outlining a preservation layer for the JISC Information Environment and establishing the role of a Curation Centre to support generic services and research.
The JISC believes that the Strategy will be best achieved by developing and encouraging adoption of architectures, standards and practices that comply with widely adopted frameworks for creation and management of digital materials. To date the JISC has been heavily involved in the development of frameworks for life-cycle management [24] and has contributed to the development of the Reference Model for Open Archival Information Systems (OAIS) [25] and the RLG/OCLC Trusted Digital Repositories: Attributes and Responsibilities report [26]. The JISC is promoting use of the broad understandings and concepts embodied in the Reference Model for Open Archival Information Systems (OAIS) as a conceptual model for construction and management of digital archives. The high-level functional model for OAIS archives is shown in Figure 3 below.
However, the JISC is not solely concerned with archives and archival collections but also with hosting and access services, mirrored collections, and distributed national services. Some adaptation of the OAIS Model is therefore needed for deployment in the JISC Information Environment. Figure 4, taken from the Strategy, shows a JISC adaptation and simplification of the model, which focuses on the implementation of storage and preservation planning in a more complex environment, with distributed retention and preservation over varying periods. In this model JISC is seeking to ensure three elements to high professional standards: onsite storage; replication to offsite trusted third party archival repositories; and preservation planning. The professional administration of computer storage onsite combined with offsite replication is a feature of all effective business continuity planning and short-term data security. It is also a prerequisite and precursor for long-term preservation activities. It is the existence of trusted preservation services and preservation planning functions, however, that is central to long-term preservation.
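As a rough illustration of the first two elements of this model, the Python sketch below compares files held onsite with their offsite replicas by checksum. It is not drawn from the Strategy: the use of SHA-256, the directory layout, and the names `sha256_of` and `replication_report` are assumptions made for this example only, and a real repository would typically add scheduling and audit logging.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replication_report(onsite: Path, offsite: Path) -> dict:
    """Compare each file under `onsite` with its counterpart under `offsite`.

    Both directories are hypothetical stand-ins for an onsite store and an
    offsite replica; paths are reported relative to the onsite root.
    """
    report = {"ok": [], "missing_offsite": [], "mismatch": []}
    for source in sorted(p for p in onsite.rglob("*") if p.is_file()):
        relative = source.relative_to(onsite)
        replica = offsite / relative
        if not replica.is_file():
            # No copy exists at the offsite location.
            report["missing_offsite"].append(relative.as_posix())
        elif sha256_of(source) != sha256_of(replica):
            # A copy exists but its content differs from the onsite file.
            report["mismatch"].append(relative.as_posix())
        else:
            report["ok"].append(relative.as_posix())
    return report
```

Entries under `missing_offsite` or `mismatch` would suggest that replication needs attention before longer-term preservation planning can rely on the offsite copy.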
The model proposed for archival storage in the JISC Information Environment largely fits that which exists or is being put in place for JISC-funded archival services (UK Data Archive or the Arts and Humanities Data Service, which have duplicated off-site storage of their holdings). Similar arrangements on a time-limited or renewable basis would be applicable to hosted services. Institutional arrangements could also be fitted to this model, although potentially these may benefit most from third-party or common services being developed to support preservation planning or remote storage. Currently, JISC is evaluating LOCKSS and its potential future application to UK repositories, and is encouraging further development of preservation assessment tools, strategies, and added preservation functionality for institutional repository software through a community call for proposals [27]. The model proposed (and standards as they are developed) could equally apply to external services. Escrow arrangements and secure remote archival storage with trusted preservation services could also meet the needs of highly commercial materials. Issues and potential implementations of such arrangements were explored in a JISC feasibility study for e-journals. Further work is being investigated in this area in line with the feasibility study recommendations [28]. The area of the model that is least developed in implemented service environments is preservation planning. For institutions and repositories holding highly specialised data or a wide range of file formats, this is likely to be a particularly crucial area for future development. In addition to the three elements of onsite storage, offsite replication, and preservation planning, the Strategy envisages a range of shared services as part of the national infrastructure.
Within existing provision in the UK, there are a number of discipline (or "faculty-level") specific support services for digital preservation funded by the research councils and JISC, including the constituent services within the Arts and Humanities Data Service, the UK Data Archive, and the Natural Environment Research Council data centres. The Strategy also argued the case for the creation of a Digital Curation Centre to provide generic curation and preservation services as well as new research and development. Such generic services were either missing or would otherwise be created separately in individual institutions, with potentially wasteful duplication or inefficient re-distribution. A particular concern was that, with the massive predicted growth in both electronic research data and publications, existing organisational structures and methods would not scale; hence new approaches and organisations would need to be added to the national (perhaps ultimately the international) infrastructure. The Digital Curation Centre is now in the process of being formed by a consortium consisting of the Universities of Edinburgh and Glasgow, Council for the Central Laboratory of the Research Councils, and UKOLN, with funding from JISC and the e-science core programme [29]. How generic support services through bodies such as the Digital Curation Centre might function in relation to discipline-specific services and functions for scientific data was outlined in the Audit of e-science data in the UK study and is shown in Figure 5 below.
Outcomes
To date the JISC Continuing Access and Digital Preservation Strategy has advanced the digital preservation agenda in the UK in the following ways:
Conclusions
There is a growing impetus behind national efforts on digital preservation in the UK and other countries. These efforts rarely result from an overall strategy; more often they consist of a patchwork created by individual funders and institutions, with synergies and collaborations established slowly and sometimes painfully on the ground. There is also an impetus behind international efforts that can begin to address the gaps left by national initiatives, and to combine work on common issues as well as to cascade lessons from countries engaged in early work to those just starting out in the field. The education and research sectors in each country can play an important part in digital preservation through research and innovation, development of skills and training, the catalyzing of the efforts of others, and the curation and preservation of works they themselves create and of their many special collections. The unique status and role of JISC has allowed it to focus and leverage much of this activity within its sector in the UK. Work done to date by JISC has shown how important advocacy is for digital preservation and has highlighted the importance of not focussing on digital preservation as an end in itself but as a means to an end: that of longevity for and ongoing access to essential resources and digital heritage of value to many different user communities. Often the term digital preservation itself can be a barrier to some audiences who (unlike the library or archive communities) do not intuitively understand the meaning of the term. Advocacy and press campaigns to raise awareness and change perceptions have been a key factor in the success of work done in the UK to date, together with a willingness to define digital preservation very broadly or utilise other more inclusive terms such as digital curation where appropriate.
Ultimately, digital preservation will be successful when it can be seen not as a stand-alone institutional activity but as an activity embedded in how institutions manage and approach digital information and resources on an ongoing basis. This ultimately is what JISC has been seeking to achieve in promoting digital preservation strategies and lifecycle approaches to management of digital resources. It remains a simple objective, yet one immensely challenging to achieve. Nevertheless, encouraging progress is being made towards accomplishing it.
Acknowledgements
I would particularly like to thank my colleagues Helen Hockx-yu, Alan Robiette, and Maggie Jones for commenting on a draft of this article and Philip Lord for permission to reproduce Figures 2 and 5. Development of the JISC Continuing Access and Digital Preservation Strategy benefited from comments made by many colleagues on preliminary drafts. Key parts of its implementation have also been dependent on the skills and dedication of staff in JISC services and institutions, and of contractors. I am very grateful to everyone who has contributed to the Strategy and its implementation in different ways over the past few years.
References
[1] Joint Information Systems Committee: <http://www.jisc.ac.uk>.
[2] Beagrie, Neil, 2002, The Continuing Access and Digital Preservation Strategy for the Joint Information Systems Committee (JISC) 2002-5: <http://www.jisc.ac.uk/index.cfm?name=pres_continuing>.
[3] BBC Domesday, Camileon Project: <http://www.si.umich.edu/CAMILEON/domesday/domesday.html>.
[4] Lyman, Peter and Varian, Hal R., 2003, How Much Information, 2003. <http://www.sims.berkeley.edu/how-much-info-2003>.
[5] Hey, Tony and Trefethen, Anne, 2002, "The Data Deluge: an e-science Perspective" in: Berman, Fran (Ed.) et al, 2003, Grid Computing: Making the Global Infrastructure a Reality, (John Wiley and Sons). Also available online at: <http://www.ecs.soton.ac.uk/~ajgh/DataDeluge(final).pdf>.
[6] Arts and Humanities Data Service: <http://www.ahds.ac.uk>.
[7] Cedars (CURL exemplars in digital archives) Project: <http://www.leeds.ac.uk/cedars/>.
[8] CAMiLEON (Creative Archiving at Michigan & Leeds: Emulating the Old on the New) Project: <http://www.si.umich.edu/CAMILEON/>.
[9] The seven JISC/NPO Preservation Studies can be accessed from: <http://www.ukoln.ac.uk/services/elib/papers/supporting/>.
[10] Digital-Preservation Announcement and Information list: <http://www.jiscmail.ac.uk/lists/DIGITAL-PRESERVATION.html>.
[11] Beagrie, Neil and Jones, Maggie, 2001, Preservation Management of Digital Materials: A Handbook, (The British Library: London). Also available online: <http://www.dpconline.org/graphics/handbook/>.
[12] Digital Preservation Coalition: <http://www.dpconline.org>.
[13] Parker, Elizabeth, 2003, Study of the Records Lifecycle (revised edition of original first published 1999) (Joint Information Systems Committee). The revised Function Activity Model (FAM) & Record Retention Schedule (RRS) are available online at: <http://www.jisc.ac.uk/index.cfm?name=srl_structure>.
[14] Further information on the JISC digital preservation and records management programmes is available online at: <http://www.jisc.ac.uk/index.cfm?name=programme_preservation> and <http://www.jisc.ac.uk/index.cfm?name=programme_supporting_irm>.
[15] Day, Michael, 2003, Collecting and Preserving the World Wide Web: a feasibility study undertaken for JISC and the Wellcome Trust: <http://www.jisc.ac.uk/uploaded_documents/archiving_feasibility.pdf>.
[16] Charlesworth, Andrew, 2003, Legal issues relating to the archiving of Internet resources in the UK, EU, USA and Australia: a study undertaken for JISC and the Wellcome Trust: <http://www.jisc.ac.uk/uploaded_documents/archiving_legal.pdf>.
[17] Jones, Maggie, 2003, Archiving E-Journals Consultancy - Final Report: <http://www.jisc.ac.uk/uploaded_documents/ejournalsfinal.pdf>.
[18] Jones, Maggie, 2003, UK LOCKSS Workshop Monday 24 November 2003: Workshop Report: <http://www.jisc.ac.uk/uploaded_documents/LOCKSSWorkshopReport.doc>.
[19] Lord, Philip, and Macdonald, Alison, 2003, Data curation for e-Science in the UK: an audit to establish requirements for future curation and provision: <http://www.jisc.ac.uk/uploaded_documents/e-ScienceReportFinal.pdf>.
[20] Lord, Philip, and Macdonald, Alison, 2003, Data curation for e-Science in the UK: an audit to establish requirements for future curation and provision: Appendices: <http://www.jisc.ac.uk/uploaded_documents/e-scienceAppendices.pdf>.
[21] James, Hamish, et al, 2003, Feasibility and Requirements Study on Preservation of E-Prints: <http://www.jisc.ac.uk/uploaded_documents/e-prints_report_final.pdf>.
[22] Wheatley, Paul, 2003, Survey and assessment of sources of information on file formats and software documentation: <http://www.jisc.ac.uk/uploaded_documents/FileFormatsreport.pdf>.
[23] Smith, Kevin, 2001, Joint Information Systems Committee (JISC) Records Management in JISC funded Project-Based Programmes: <http://www.jisc.ac.uk/uploaded_documents/RecordsManagement.rtf>.
[24] Beagrie, Neil, and Greenstein, Daniel, 1998, A Strategic Policy Framework for Creating and Preserving Digital Collections, British Library Research and Innovation Report 107 (British Library: London). Also available online at: <http://www.ukoln.ac.uk/services/elib/papers/supporting/pdf/framework.pdf>.
[25] Consultative Committee for Space Data Systems, 2002, Reference Model for an Open Archival Information System (OAIS). CCSDS 650.0-B-1: Blue Book. Issue 1. January 2002 adopted as ISO 14721:2002: <http://ssdoo.gsfc.nasa.gov/nost/wwwclassic/documents/pdf/CCSDS-650.0-B-1.pdf>.
[26] RLG/OCLC, 2002, Trusted Digital Repositories: Attributes and Responsibilities, (Research Libraries Group: Mountain View, California). Also available online at: <http://www.rlg.org/longterm/repositories.pdf>.
[27] JISC Invitation To Tender: Technical Appraisal of the LOCKSS system: <http://www.jisc.ac.uk/index.cfm?name=funding_lockss> and JISC Circular 4/04: Call for Projects in Supporting Institutional Digital Preservation and Asset Management: <http://www.jisc.ac.uk/index.cfm?name=funding_circular4_04>.
[28] op cit [13] and [27]. Other recommendations in the report are also being reviewed and will be progressed in due course.
[29] Digital Curation Centre: <http://www.dcc.ac.uk>.
[30] UK Web Archiving Consortium: <http://www.webarchive.org.uk/>.
Copyright © 2004 Neil Beagrie
D-Lib Magazine Access Terms and Conditions doi:10.1045/july2004-beagrie