Digitising History
CHAPTER 5: DOCUMENTING A DATA CREATION PROJECT

 

5.2 Guidelines for documenting a data creation project

5.2.1 Contents

A description of the contents of the data collection should be provided in sufficient detail to allow any potential user to assess whether it is suitable for their needs. This factual description should include, where applicable:

  • Title, which describes the contents and gives an indication of the temporal and geographic coverage.
  • Main types of information it contains.
  • Strengths and weaknesses.
  • Time period(s) covered, including details of any data which only partially cover the time period.
  • Periodicity of the data collection (e.g. monthly, annual, decennial).
  • Name(s) of the country, region, county, town or village covered. If the names or the administrative units were different during the time period covered by the data collection, document those names or administrative units and their present-day equivalents.
  • Types of spatial units that can be used to analyse the data collection.
  • Language(s) used.
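
One way to keep this description alongside the data is as a small structured record that can be checked for gaps. The Python sketch below is illustrative only: the class name `ContentsDescription`, its field names and the sample values are assumptions made for this example, not part of any standard.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ContentsDescription:
    """Hypothetical record mirroring the checklist above."""
    title: str = ""
    information_types: list[str] = field(default_factory=list)
    strengths_and_weaknesses: str = ""
    time_periods: list[str] = field(default_factory=list)
    periodicity: str = ""
    # Historical place name -> present-day equivalent.
    places: dict[str, str] = field(default_factory=dict)
    spatial_units: list[str] = field(default_factory=list)
    languages: list[str] = field(default_factory=list)


def missing_items(description: ContentsDescription) -> list[str]:
    """Return the names of checklist items that are still empty."""
    return [f.name for f in fields(description) if not getattr(description, f.name)]


# Invented sample values, for illustration only.
example = ContentsDescription(
    title="Parish burials, Exampleshire, 1800-1850",
    time_periods=["1800-1850"],
    periodicity="annual",
    places={"Old Example Hundred": "Example District"},
    languages=["English"],
)
print(missing_items(example))
```

A check of this kind only shows that an item has been filled in; whether the description is detailed enough for a potential user remains a matter of judgement.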

5.2.2 Provenance

The provenance of a data collection needs to be documented in detail. This information should include how, why, when and by whom the data collection was created and used.

Who created the data collection and why?

A data collection's intellectual context should be documented thoroughly enough to enable someone who has not been involved in the project to understand the intellectual framework in which it was created. This information should include:

  • Other title(s) and reference number(s) that have been used to identify the data collection during the data creation process.
  • Name(s), affiliation(s) and role(s) of all the individual(s) or organisation(s) who have been involved in the data creation process.
  • Names of any organisation(s) or individual(s) that funded the creation of the data collection, with grant numbers and titles where appropriate.
  • Description and history of the research project (or other process) which gave rise to the data collection, including the main aims, objectives and topics of research.
  • Description and history of how the data collection has been used.
  • Bibliographic references for any publications based upon or about the data collection.
  • Bibliographic references to any related data collections.

How was the data collection created?

The way in which a data collection was created should be described in sufficient detail to allow any potential user to understand the steps that were taken. This information should include:

  • How and why the methods used and the structure and format of the data collection were chosen.
  • Hardware and software used to create the data collection, and whether it has at any point been converted to new systems or formats.
  • Dates relating to the creation of the data collection, including any dates when it was significantly amended.

Which sources were used to create the data collection?

Detailed information about the source(s) used to create the data collection should be provided so that any user can trace the data collection back to its original source(s) and understand the relationship between the data collection and the source(s). This information should include:

  • List of sources, including archival or bibliographic references.
  • Purpose, scope, content, provenance, administration and history of the source(s), including any unusual or inconsistent features such as the destruction or separation of parts of the source.
  • Bibliographic references to works that describe the source(s).
  • Details of how the source(s) have been converted to digital form, including: completeness of transcription, sampling and selection methods, standardisation procedures, and the use of mark-up, classification and coding schemes.
  • Details of the relationship between the data collection and the source(s), including a photocopy or image of each source with an example showing how it is represented in the data collection.

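Standardisation and coding decisions are easier to document when the scheme itself is stored as data and the original transcription is kept beside each code. The following is a minimal Python sketch, assuming a hypothetical occupational codebook; the codes, the sample values and the function name `apply_codebook` are invented for illustration.

```python
# Hypothetical codebook: transcribed form (lower-cased) -> standardised code.
CODEBOOK = {
    "ag lab": "AGR01",
    "agricultural labourer": "AGR01",
    "weaver": "TEX02",
}


def apply_codebook(transcriptions, codebook):
    """Pair each transcribed value with its code, keeping the original.

    Returns (coded_rows, uncoded), where uncoded lists the distinct values
    that have no entry in the codebook, so that they can be documented.
    """
    coded_rows = []
    uncoded = []
    for original in transcriptions:
        key = " ".join(original.lower().split())  # normalise case and spacing
        code = codebook.get(key)
        coded_rows.append({"original": original, "code": code})
        if code is None and original not in uncoded:
            uncoded.append(original)
    return coded_rows, uncoded


rows, uncoded = apply_codebook(["Ag  Lab", "Weaver", "Higgler"], CODEBOOK)
```

Keeping the `original` value next to the `code` lets a user trace each coded entry back to the wording of the source, and the `uncoded` list shows where the scheme does not cover the source.
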
5.2.3 Structure

It is essential that the structure, form and organisation of a data collection be described fully. This information should include:

  • List of files and tables with information about their contents, number of records and fields, and the way in which they relate both to each other and to the source.
  • List of the field names used in each file, with the characteristics of each field (name, contents, field length, data type and any codes used) and the way in which the fields relate to each other and to the source, including details of derived variables.
  • Format of the data collection, including the delimiters used in delimited ASCII files.
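
A first draft of the field list for a delimited ASCII file can be generated from the file itself and then completed by hand. The Python sketch below assumes a file with a header row and uses the standard `csv` module. The function name `describe_fields`, the sample data and the crude rule for guessing data types are assumptions made for this example.

```python
import csv
import io


def describe_fields(text, delimiter=","):
    """Summarise each field of a delimited file that has a header row."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    names = list(reader.fieldnames or [])
    summary = {name: {"name": name, "max_length": 0, "data_type": "integer"}
               for name in names}
    record_count = 0
    for row in reader:
        record_count += 1
        for name in names:
            value = row.get(name) or ""  # treat missing values as empty
            entry = summary[name]
            entry["max_length"] = max(entry["max_length"], len(value))
            # Crude guess: a field stays "integer" only while every
            # non-empty value consists of digits; otherwise it is "text".
            if value and not value.isdigit():
                entry["data_type"] = "text"
    return {"delimiter": delimiter, "records": record_count,
            "fields": list(summary.values())}


# Invented sample using "|" as the delimiter.
sample = "surname|age|parish\nSmith|34|St Mary\nO'Neill|7|Holy Trinity\n"
dictionary = describe_fields(sample, delimiter="|")
for entry in dictionary["fields"]:
    print(entry)
```

The output records only what can be observed in the file. The contents of each field, the codes used, derived variables and the relationship to the source still have to be written by someone who knows the data.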

5.2.4 Terms and conditions

It is important that all the terms and conditions that apply to the use of a data collection are fully documented. In particular, copyright and other intellectual property rights must be clearly established, and the name(s) of the copyright holder(s) for both the data collection and the original source material must be specified. If the collection was created in the course of your work as an employee, the copyright holder will normally be your employer under your contract of employment. Give full details if copyright is held jointly, if there are multiple copyrights, or if the collection is covered by Crown copyright. For further information about copyright see AHDS and TASI (1999).

 

© Sean Townsend, Cressida Chappell, Oscar Struijvé 1999

The right of Sean Townsend, Cressida Chappell and Oscar Struijvé to be identified as the Authors of this Work has been asserted by them in accordance with the Copyright, Designs and Patents Act 1988.

All material supplied via the Arts and Humanities Data Service is protected by copyright, and duplication or sale of all or any part of it is not permitted, except that material may be duplicated by you for your personal research use or educational purposes in electronic or print form. Permission for any other use must be obtained from the Arts and Humanities Data Service.

Electronic or print copies may not be offered, whether for sale or otherwise, to any third party.
