OCLC Internet Cataloging Project Colloquium
Position Paper

Catalogers and the Creation of Metadata Systems

A Collaborative Vision at the University of Michigan

by Kevin L. Butterfield
Assistant Librarian, Original Cataloging Unit, Hatcher Graduate Library
University of Michigan, Ann Arbor, MI 48109-1205
Tel: (313) 764-9361 E-mail: kbutterf@umich.edu


Introduction

The term metadata has been bandied about a great deal in the professional literature of late. With the advent of the Dublin Core, the Text Encoding Initiative (TEI) Guidelines, the Government Information Locator Service (GILS), and other such constructs, the concept of data about data, and what to do with it, has again become a discussion point in librarianship. As the methods available for describing information grow beyond MARC, it becomes increasingly apparent that librarians, and more specifically catalogers, have a role to play as mediators and creators in an increasingly diverse landscape of descriptive methods. As the choices for providing access increase, the value of the experience and traditions that the cataloging profession can bring to the creation, standardization, and manipulation of metadata systems becomes obvious. This position paper outlines how that experience and tradition were brought to bear on digital library projects at the University of Michigan and advocates a strong and collaborative role for catalogers in the mediation and creation of metadata systems.

The Cataloger As Mediator

By mediating the use of metadata, catalogers exert a strong influence toward standardization on these developing systems of description and access. Cataloging exists as an often invisible process of order making. As constituencies develop systems for ordering in the digital world, what was once a largely invisible process becomes glaringly apparent.

A working example of this process at the University of Michigan has been the collaborative work done for the Humanities Text Initiative, a partnership between the University Library, the School of Information and Library Studies, and the University of Michigan Press. Cataloging staff were brought on board to develop policies and standardize procedures for the cataloging of texts and the production of TEI headers for the project. The goals were the development of a continuing dialogue among the partners and the creation of a system of headers and MARC records that provided predictable and easy access to the Initiative's electronic texts. The effort is an ongoing collaboration between the humanities text, preservation, and cataloging communities within the library. As the effort progressed toward larger-scale production, the catalogers developed skills in UNIX programming, workflow and systems analysis, and Standard Generalized Markup Language (SGML), while the other collaborators developed a greater appreciation for the value of standardized forms of description and access. As workflow patterns became finalized, another change was noticed as well. Increasingly, catalogers were working with the metadata first, in the form of TEI headers, and deriving MARC-based records for the local catalog from it. This was made possible by the efforts of catalogers to standardize the forms of description within the TEI tagging. By employing AACR2 as a standard and mediating the development of tag content, catalogers played an integral role in the development of a more descriptive and standardized header that allowed for greater predictability and fidelity in searching and retrieval.
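The header-first workflow described above, in which a MARC-based record is derived from a TEI header, can be illustrated with a minimal sketch. It is not the project's actual tooling. The sample header, the XML (rather than SGML) syntax, the function names, and the particular element-to-field mapping are all assumptions made for illustration; only the TEI element names and the MARC field tags (100, 245, 260) are standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical TEI header, written as XML for ease of parsing.
SAMPLE_HEADER = """\
<teiHeader>
  <fileDesc>
    <titleStmt>
      <title>An Example Text</title>
      <author>Doe, Jane</author>
    </titleStmt>
    <publicationStmt>
      <publisher>Example Press</publisher>
      <pubPlace>Ann Arbor</pubPlace>
      <date>1996</date>
    </publicationStmt>
  </fileDesc>
</teiHeader>
"""

# (MARC tag, subfield code, path inside the header).
# Illustrative mapping only; a production mapping would be set by local policy.
FIELD_MAP = [
    ("100", "a", "fileDesc/titleStmt/author"),
    ("245", "a", "fileDesc/titleStmt/title"),
    ("260", "a", "fileDesc/publicationStmt/pubPlace"),
    ("260", "b", "fileDesc/publicationStmt/publisher"),
    ("260", "c", "fileDesc/publicationStmt/date"),
]


def header_to_marc(header_xml):
    """Build a dict of MARC tag -> list of (subfield code, value) pairs."""
    root = ET.fromstring(header_xml)
    record = {}
    for tag, code, path in FIELD_MAP:
        text = root.findtext(path)
        if text and text.strip():
            # Collapse internal whitespace so line breaks in the header
            # do not leak into the derived record.
            record.setdefault(tag, []).append((code, " ".join(text.split())))
    return record


def format_record(record):
    """Render the derived record as one line per MARC field."""
    lines = []
    for tag in sorted(record):
        subfields = " ".join(f"${code} {value}" for code, value in record[tag])
        lines.append(f"{tag} {subfields}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_record(header_to_marc(SAMPLE_HEADER)))
```

Because the mapping reads its values directly from the header, the derived record is only as consistent as the header content, which is why standardizing the forms of description within the TEI tagging comes first.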

The Cataloger As Creator

While the previous project illustrates the cataloger as mediator, another, more expansive project provides an example of the cataloger as an active participant in the creation of a metadata system. In 1994, the University of Michigan was awarded a four-year, four-million-dollar cooperative agreement to conduct coordinated research and development to create, operate, use, and evaluate a test bed of a large-scale, continually evolving multimedia digital library. The NSF/ARPA/NASA Digital Library Project is based on a set of highly specialized agents. The descriptive component of that library, the conspectus, consists of a metadata set for the description of objects, collections, and agents within the digital library. In early 1995, members of the University Library's Original Cataloging Unit were asked to join the Conspectus Working Group to provide expertise in the creation of systems for information organization and in the maintenance of large-scale information systems, and to advise on the ability of metainformational constructs to scale from the research to the production level.

Work has progressed over the last year and will continue into the future to create and refine the conspectus. Catalogers actively participated in the process of defining an attribute set. The questions asked during this process included: for what and for whom are we attempting to describe digital resources, and at what level is it appropriate to describe them? An additional area of exploration involved investigating the best process for registering or cataloging content for the University of Michigan Digital Library (UMDL). How much human intervention is necessary, and what parts can be handled by intelligent agents? Is it possible to design heuristics to inform an agent where and when to stop cataloging, and what information is required from the content provider to make this a workable process? A third component of the process involved the creation of an ontology to describe for the system what it would encounter during searching, and the development of a conspectus search language.

The work to date in answering these questions has resulted in a core attribute set for the description of collections, a prototype registry page to facilitate the process of describing collections, and the identification of aspects of the registry process that can be done automatically by agents. The conspectus seeks to go beyond describing the bibliographic characteristics of an object or collection to encompass the rights, relationships, and services associated with the collection, object, or agent being described. Each of these developments came about from a creative, collaborative process between catalogers from the university library and faculty and students from the university's schools of engineering and library and information science. In the ongoing work to create this metadata component, catalogers have employed talents largely taken for granted in the areas of description, retrieval, and access, and have developed new skills in ontological description.
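As a purely illustrative sketch of the kind of description discussed above, the following models an entry that carries bibliographic, rights, relationship, and service information, and reports which empty attributes would still need a cataloger. The class name, attribute names, and the set of attributes assumed to be agent-fillable are all hypothetical; the source does not publish the actual conspectus attribute set.

```python
from dataclasses import dataclass, field


@dataclass
class ConspectusEntry:
    """Hypothetical description of one collection, object, or agent."""
    kind: str                                           # "collection", "object", or "agent"
    title: str = ""                                     # bibliographic description
    creator: str = ""                                   # bibliographic description
    rights: str = ""                                    # terms of access and use
    relationships: list = field(default_factory=list)   # links to related entries
    services: list = field(default_factory=list)        # e.g. search or delivery


# Assumption for illustration: attributes an agent could supply on its own.
AGENT_FILLABLE = {"title", "services"}


def needs_human_review(entry):
    """Return the empty attributes that an agent is not assumed able to fill."""
    missing = []
    for name, value in vars(entry).items():
        if not value and name not in AGENT_FILLABLE:
            missing.append(name)
    return missing


if __name__ == "__main__":
    entry = ConspectusEntry(kind="collection", title="Example Collection",
                            rights="Campus use only")
    print(needs_human_review(entry))
```

Separating the attributes an agent can supply from those that need human judgment is one simple way to frame the question of where automatic registration should stop.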

(Re)Emergence of Metadata

As text publishing models increasingly incorporate electronic access and delivery, it becomes clear that metadata is becoming part of the editorial decisions involved in the creation of texts. This is a shift from the old model of simply publishing the text and leaving the creation of metadata description in the hands of outside agencies, such as libraries or, more specifically, catalogers. This shift of work from the bibliographic control level to the publishing level is a natural evolution of text production in the electronic information age. Whereas texts previously were gathered into the library and placed under bibliographic control via catalogs, MARC-based or otherwise, now more and more we are providing to clients actual texts whose revision exists beyond the control of the physical library. In addition, through initiatives such as OCLC's Dublin Core and the TEI, among others, we are accessing texts that contain self-describing metainformation. We are now pointing to data in motion, objects that we do not physically hold, whose description resides as a part of the object rather than separately in a library catalog. Metadata itself is not a new thing. Catalogers have been employing it as a descriptive method for decades, as MARC records in OPACs or as cards in catalogs. What is new about metadata today is the emerging multitude of methods that employ it and the arena in which it is being used. TEI, GILS, and Dublin Core metadata each arose from a different community, or from a collaboration of communities, in an attempt to describe a very slippery publication medium. The situation is not unlike the chaotic times when printing was first invented. We are grappling with an emerging and mutable publication medium for which we have few definitive answers because we have not discovered all of the questions yet.

Return to Cataloging As a Process

The opportunities for catalogers in these developments begin with a return to viewing cataloging as a process. The experiences that catalogers have taken from the digital library projects at the University of Michigan illustrate this. From both of the projects discussed earlier in this paper, catalogers took away broadened perspectives on new methods of information organization, fresh looks at old assumptions involving information retrieval and organization, and a return to a basic understanding of the relationship between access, description, and retrieval. The theme underlying each of these lessons is cataloging as a process. Too often we have viewed cataloging as culminating in a record, while the largely invisible intellectual process of creating the standards, vocabularies, and systems of description and classification inherent in that record is taken for granted.

Because of this invisibility, many of those creating libraries and systems for the new digital order assume that they are approaching these issues for the first time. By approaching these developments from a collaborative perspective, we as librarians not only bring to the table our experience in creating metainformational systems but also gain new insight and experience in the processes employed by other disciplines. As these digital endeavors become more and more global, the cataloger's experience in languages, diacritics, and standard making becomes more and more valuable. When catalogers become involved in these creative projects, once invisible processes take on clearer form and purpose. When this happens, both the individual cataloger and the cataloging profession benefit greatly.

Conclusion

As the extent of electronic publishing grows, metadata structures increasingly assume a central focus in describing text and items. Developers of digital libraries often do not realize that librarians have been there before. A role for libraries in providing description and access for these resources will lie with the creation, maintenance and scalability of such metadata descriptions. By bringing these new metadata discussions into libraries and, more specifically, into the roles of catalogers, it becomes clear that the future of universal access will not lie with one universal system of description. The purpose is not to replace existing systems or ignore a century of tradition in description, but to continue to build upon that knowledge in the creation of new metainformational systems, to impart insight into the creation, maintenance and scalability of metadata through collaborative efforts, and to bring expert knowledge of the cataloging process and the traditions of the cataloging community to the discussions regarding metadata and standards for the digital future.

Acknowledgments

Many people in the Monographs Division here at the Hatcher Graduate Library have been very helpful in the creation and presentation of this paper. I would like to thank Jo Ann Lewis, David Richtmyer, Judith Ahronheim, and Lynn Marko for their time and patience. I would also like to thank Erik Jul and OCLC for sponsoring the Intercat project and colloquium.

Bibliography

Caplan, Priscilla. 1995. "You Call It Corn, We Call It Syntax-Independent Metadata for Document-Like Objects." The Public-Access Computer Systems Review 6, no. 4: 19-23.

Crum, Laurie. 1995. "University of Michigan Digital Library Project." Communications of the ACM 38 (April 1995): 63-64.

Levy, David. 1995. "Cataloging in the Digital Order." Proceedings of Digital Libraries '95: The Second Annual Conference on the Theory and Practice of Digital Libraries, June 11-13, 1995, Austin, Texas, USA.

Weibel, Stuart L. 1995. "Metadata: The Foundations of Resource Description." D-Lib Magazine, July 1995.
http://www.dlib.org/dlib/July95/07contents.html

