D-Lib Magazine
March/April 2009

Volume 15 Number 3/4

ISSN 1082-9873

Digitization Education

Courses Taken and Lessons Learned

Mats Dahlström
Swedish School of Library and Information Science <mats.dahlstrom@hb.se>

Alen Doracic
Swedish School of Library and Information Science <alen.doracic@hb.se>

Abstract

The Swedish School of Library and Information Science (SSLIS) has for some years been offering courses in cultural heritage (CH) digitization in cooperation with some major national digitizing agents. During that work, various problematic issues and challenges have arisen, some of them probably familiar to other digitization educators. This article describes the aims and nature of the particular CH digitization education at SSLIS, accompanied by a brief overview of Nordic CH digitization education efforts. Ten particular challenges in launching and managing the course are highlighted. By identifying such challenges and discussing possible ways to tackle them, the authors hope to encourage discussions that can serve future education planning.

Introduction

Over the past few years, several calls have been made for increased space for, and a firmer grounding in, digitization instruction within library and information science (LIS) education [Note 1]. Although the majority of those calls in essence provide overviews and discuss digital library education in general (Spink & Cool, 1999; Saracevic & Dalbello, 2001; Coleman, 2002; Choi & Rasmussen, 2006; Tammaro, 2007), some of them deal specifically with teaching the conversion and digitization of cultural heritage (CH) artefacts (Dalbello, 2002; Manzuch et al., 2005; Perry, 2005; McMenemy, 2007). Such CH artefact digitization education tends to be a meeting place for different and sometimes conflicting interests and aims, involving skills in many diverse areas while also having to address varying issues from several different interest and target groups. The Swedish School of Library and Information Science (SSLIS) in Borås (UC Borås/U. of Gothenburg) has for some years been offering such education at a bachelor's or master's level. During that work, we have come across various problematic issues and challenges probably familiar to several other digitization educators. This article describes the aims and nature of our particular CH digitization education at SSLIS, accompanied by a brief overview of Nordic CH digitization education efforts, and then highlights 10 areas that are particularly challenging to courses such as ours. It is our hope that identifying such challenges and discussing how we have tried to tackle them, whether successfully or not, will encourage further discussions that can serve future education planning, both our own and that of others.

Context

In order to provide some contextual background for the challenges raised in this article, a brief description of educational and training initiatives in CH digitization is needed. Higher education within LIS (and to some extent archive, library and museum (ALM) studies) as well as professional trainee initiatives are considered, both in Sweden and in the neighbouring Nordic countries. While CH digitization education within LIS has been discussed in the overall European context (Manzuch et al., 2005; Perry, 2005; Tammaro, 2007), there is a lack of texts that specifically deal with the efforts within the Nordic countries.

In order to acquire an overview of educational initiatives in the Nordic LIS academic community at the undergraduate and graduate levels, we conducted a brief survey in the first trimester of 2008. Our study included a quantitative yet informal questionnaire, complemented by mail and e-mail correspondence with 12 LIS departments (four in Sweden, three in Finland, one in Iceland, one in Denmark, and three in Norway). In addition, online course descriptions and syllabi were studied. The questions we asked concerned course and module forms, teaching contents, curricula, learning outcomes and collaboration [Note 2].

Hardly any of the surveyed departments offer specific courses within the topic of CH digitization, at least not explicitly connecting digitization and cultural heritage. Several departments nevertheless offer theoretical lectures on the topic as part of larger courses, and one (Swedish) LIS department has begun incorporating the area within a new master's program that started in the fall of 2008. A number of courses related to digitization and electronic publishing in general – e.g., digital text, image and sound editing, web site and server management, and text encoding and style sheet technologies – are taught at different educational levels. While such courses are obviously relevant and needed in the kind of education efforts we are discussing here, the ones we came across were not targeted specifically at CH digitization issues.

The results of our study also indicate that CH digitization courses are not a mandatory part of LIS educational programmes. Two departments teach CH digitization issues as a part of a larger mandatory course, but the most common form is as an elective single subject course. What further characterizes the courses is a strong emphasis on theory, although the particular theoretical framework varies considerably. Legal aspects (such as IPR, privacy and access issues) and preservation aspects are the most common theoretical approaches. Socio-cultural and institutional issues are addressed in the courses by four departments, while project management and economic issues are considered by three. Although individual students certainly can perform in-depth as well as hands-on studies of CH digitization as part of their bachelor's or master's theses, our study clearly suggests that hands-on approaches are not frequently found within CH digitization education in the LIS context. Finally, collaboration with institutions from the ALM professional sector has been implemented in the curriculum and teaching practices by only one LIS department, while collaboration with other ALM educational programmes or with the corporate sector is not practiced by any LIS programme at all.

Our limited study revealed a variety of approaches in CH digitization education within the Nordic LIS educational community. Most surprising perhaps, given the current ALM community buzz on CH digitization and projects such as Google Book Search, is the fact that CH digitization has not become a mandatory module in any LIS educational programme, but rather is invariably an elective and/or single subject course or module. Further, while some (e.g., Manzuch et al., 2005) have identified an international trend towards more technical approaches in CH digitization education, our inquiry suggests a dominance, especially in Swedish LIS schools, of non-technical approaches and the teaching of principles and theory, which corroborates the observations made by Tammaro (2007). We suggest there is a need for further reflection and more thorough empirical inquiry on this issue, at least within the Nordic context.

Finally, some observations on ALM institution trainee efforts in the field of CH digitization in Sweden: Since the 1990s, in-house training initiatives in CH digitization have been offered by ALM institutions for their employees in the form of short and long-term courses, seminars and workshops (Justrell, 2003). In 1997, the national Knowledge Foundation (KK-stiftelsen) decided to devote some of their financing resources to support joint trainee efforts for ALM employees (Justrell, 2002). Those efforts were intensified in 2004 with the establishment of the coordinating Office for Archives, Libraries and Museums (ABM-centrum), which quickly became a major national educational agent within the ALM community with the trainee programmes Future in Access and Future in Access Plus, funded by the governmental employment stimulation programme in 2006. Through the programmes, 43 brief (1-2 day) CH digitization courses were offered in, e.g., the digitization process, text and image capture, and issues of selection, preservation, accessibility and IPR. A few more specialized courses were also offered, such as the digitization of moving images or sound (Slutrapport, 2008).

The scale of this ALM trainee initiative is impressive, and more often than not the courses seem to offer substantive and high-quality contents at a professional level. They remain, however, short-term trainee efforts, with quite brief courses specifically targeted at the ALM profession sector and its pragmatic needs. They do not offer an education that problematizes CH digitization or situates it within a broader perspective, that provides a theoretical or conceptual framework, that enables the participants to work collaboratively and long-term with a real digitization effort in project form, or that teaches not only image-oriented but also text-oriented digitization (more on which below).

CH digitization at SSLIS

At SSLIS we have been offering a bachelor's or master's level distance course on the digitization of cultural heritage material since 2004. The course had two origins. Firstly, we had already offered extensive courses in electronic publishing and XML/HTML text encoding from 1995 onwards, but wanted to expand the courses further to include aspects of versioning, textual criticism, preservation issues and image management. We were also interested in adapting our teaching efforts to material actually in use and to important cultural material such as manuscripts and old prints, and in expanding the text encoding area to include state-of-the-art techniques such as the Text Encoding Initiative (TEI, cf. Sperberg-McQueen & Burnard, 2004), XML, and eXtensible Stylesheet Language (XSL). Secondly, we noted a growing interest in CH digitization within the Swedish ALM community, and we saw an opportunity to try to meet some of their demands by offering digitization education within an academic LIS context. What we aimed for was not a course in digital library technologies (we offer master's programmes to that end), or in programming (although techniques such as XSL transformations (XSLT) can be said to border on programming), or in the digitization of culture, information and literature in general (such as e-books and open access journals), but a course in the digitization of concrete, historically significant CH artefacts. We explicitly did not, however, want to launch a pure training/hands-on education. Other departments, educators and labs can do this as well as, or arguably better than, we can. But neither did we want to offer a purely theoretical and descriptive course, where you get a textbook idea of the process and only get to evaluate digitization projects already performed elsewhere. What we wanted was a combination of these theory and practice strands, as well as a combination of, on the one hand, textual scholarship, bibliography and book history, and, on the other hand, advanced techniques within current information technology, all situated within a context dedicated to librarianship and knowledge organization.

In 2004 we therefore presented a 15-ECTS, distance-education course entitled Digitizing the cultural heritage (running a semester period). The course admitted some 20 students, both those working within ALM professions and those attending the larger campus LIS programme. The course had a second run in 2005 with about the same number of students attending. It also resulted in two derivative 7.5-ECTS courses on text digitization (mainly text encoding using TEI) and image digitization. In 2007, we offered the larger course a third time, but this time we expanded it from 15 to 30 ECTS (website in Swedish available at: <http://www.adm.hb.se/~mad/digarv/>), thus running an entire academic year. In 2008 the course, back at 15 ECTS, was offered for the fourth time, ending in August 2008 (website in Swedish at: <http://www.adm.hb.se/~mad/digarv08/>). The students congregated at SSLIS on four occasions (for the 15-ECTS level), each comprising lectures, hands-on exercises, project tutorials and seminars and lasting three full days (9 AM to 6 PM).

While the course concept addresses critical, textological, legal, historical, managerial and socio-cultural aspects, it is also heavy on the technical side, oriented towards real, live digitization projects managed by the students. Basically, the students are taught the whole digitization chain, from project planning through text and image capture (via scanning, photography and OCR or other transcription technologies) and editing and management (including image editing, XML text encoding, annotation, and database management) to output/publishing on the web (including XSL transformation). At the end of the course, they must submit a critical report on the work they have done. The students work according to the "one input → many outputs" concept, and therefore they manage TIFF-to-JPEG conversions, create rich metadata using TEI Headers, and are also required to encode the texts extensively in TEI-XML and use XSLT to transform that markup to various output formats (such as XHTML).
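The "one input → many outputs" idea can be illustrated with a minimal sketch. The course itself uses XSLT for this step; the sketch below swaps in plain Python with the standard library's `xml.etree.ElementTree`, purely for illustration. The sample document, the function names `to_xhtml` and `to_plain_text`, and the use of the TEI P5 namespace are all assumptions made for this example and are not taken from any student project.

```python
import xml.etree.ElementTree as ET

# Namespace map for the TEI P5 namespace (an assumption for this sketch).
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical, invented TEI document: one archival "input".
SOURCE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Letter to a friend</title></titleStmt>
      <publicationStmt><p>Unpublished sample</p></publicationStmt>
      <sourceDesc><p>Invented manuscript</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>Dear friend, the <hi rend="italic">cat</hi> has arrived.</p>
      <p>Write soon.</p>
    </body>
  </text>
</TEI>"""


def to_xhtml(root):
    """Output 1: an XHTML-style fragment with the title and paragraphs."""
    title = root.findtext(".//tei:titleStmt/tei:title", default="",
                          namespaces=TEI_NS)
    div = ET.Element("div")
    ET.SubElement(div, "h1").text = title
    for p in root.iterfind(".//tei:body/tei:p", TEI_NS):
        out_p = ET.SubElement(div, "p")
        out_p.text = p.text or ""
        for child in p:
            # Simplification: every inline child (here only <hi>) becomes <em>.
            em = ET.SubElement(out_p, "em")
            em.text = "".join(child.itertext())
            em.tail = child.tail or ""
    return ET.tostring(div, encoding="unicode")


def to_plain_text(root):
    """Output 2: plain text, one block per paragraph, markup stripped."""
    paragraphs = [
        " ".join("".join(p.itertext()).split())
        for p in root.iterfind(".//tei:body/tei:p", TEI_NS)
    ]
    return "\n\n".join(paragraphs)


if __name__ == "__main__":
    tei_root = ET.fromstring(SOURCE)
    print(to_xhtml(tei_root))
    print(to_plain_text(tei_root))
```

The point of the design is that the encoded source is never edited to suit a particular output: each delivery format is a separate, repeatable transformation of the same archival file.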

At the start we decided to seek cooperation with several national digitizing and editing institutions, such as the National Archives and The National Library of Sweden. These provide our students with additional lectures and tutorials as well as with unique source material to digitize (unpublished letters and manuscripts by major Swedish authors as well as old photographs and graphics), but students also bring source material from their respective home institutions. In project teams of 2-4 individuals, the students work with the selected materials throughout the course, and their final digitized output on the web is validated and critically examined at the end of the course. We also include a workshop in the course that focuses specifically on preservation and archiving issues (of both the analogue source material and the digital files). Several guest instructors are invited from various national and international institutions: scholarly editing societies, university and national libraries and archives, museums and various information technology centres.

The core idea is thus that students are thoroughly trained, hands-on, in qualitative CH digitization and its whole chain of actions, from planning, design, selection, via image and text capture, conversion, XML encoding over to transformation, web site production around the digitized material, publishing, and evaluation. It's small scale, obviously, but the aim is still to make the students understand how the various actions are necessary and also depend on each other. We put major emphasis on the project concept, explaining the necessities of long-term goals and principles, of using platform independent standards and formats, of looking into open source solutions, in order to support as much as possible the portability, modularity, usability and re-usability of the material in years to come, in new and perhaps unexpected educational, research and ALM contexts. We believe this, along with the emphasis not only on image editing, but on text editing and encoding using TEI and XSLT as well, is where we perhaps differ the most from the majority of the other digitization courses of which we are aware.

Although this is a course in qualitative, critical and selective digitization (as opposed to quantitative mass digitization using a more mass industry model, cf. Coyle, 2006; Dahlström & Hansson, 2008; Dahlström, 2009), the students' project results are mainly quite modest in appearance. This is, to some extent, because our focus has deliberately been on the technical XML/XSLT infrastructure preceding publication and because we have downplayed graphical and layout issues (which tend to be short-term) in favour of text encoding and editing issues (which tend to be long-term efforts) [Note 3].

Challenges

Although to some extent already hinted at, some areas have emerged as particularly challenging and in need of discussion during the emergence, process and evaluation of the CH digitization courses. Let us briefly point to ten of those areas.

1. Dealing with selection and preservation issues

Issues of selection and preservation (of both the analogue source material and the digital files) tend to be neglected in digitization education, particularly in courses for which the emphasis is on technical capture and editing.

Our collaboration with institutions such as the National Library of Sweden provides us with an opportunity to raise discussions about problems like: what counts as appropriate, worthy and relevant CH material urgently in need of CH digitization, and what even counts as cultural heritage at all? Our approach to the last question has been purely pragmatic: rather than prescribing a normative list of CH candidate types, we have pretty much allowed the students themselves to define the concept through the selection of material they themselves have made. We also include a workshop in the course specifically dealing with preservation and conservation issues. In lectures and seminars, we encourage the students to ponder existing ALM digitization projects and examine what kinds of material are actually being digitized as well as what type of material is missing, and to acknowledge and discuss potential cases when and why digitization and text encoding might not be a viable option (cf. Lavagnino, 2006).

2. Making students aware of critical textual scholarship

A desire to teach "critical" rather than "mass" digitization means helping the students recognize this critical dimension and its benefits. This includes showing how, for example, textual scholarship and the skills of analytical and historical bibliography are valuable and sometimes necessary tools in the document analysis and editing phases. The more critical a digitization project turns, the more it in fact becomes a form of scholarly editing based on textual criticism. We acknowledge this potential of digitization and in some cases require the students to use technology to produce different text transcription level outputs (a diplomatic and a normalized text) from one and the same source by making intelligent and critical use of TEI and XSLT, thus also enhancing the flexibility and reusability of the material. The students are also taught the basics of textual criticism, versioning and the managing of variants.
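As a hedged illustration of producing a diplomatic and a normalized text from one and the same source, the sketch below assumes that spelling variants are encoded with the TEI P5 `<choice>`, `<orig>` and `<reg>` elements. It uses plain Python rather than the XSLT used in the course, and the fragment, the function name `render` and the level names are invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical transcription fragment (namespace omitted for brevity).
# <orig> holds the source spelling, <reg> the regularized spelling.
FRAGMENT = (
    "<p>The <choice><orig>olde</orig><reg>old</reg></choice> "
    "<choice><orig>booke</orig><reg>book</reg></choice> is here.</p>"
)


def render(elem, level):
    """Flatten elem to text, keeping <orig> or <reg> inside each <choice>.

    level is "diplomatic" (keep <orig>) or "normalized" (keep <reg>).
    """
    keep = {"diplomatic": "orig", "normalized": "reg"}[level]
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == "choice":
            chosen = child.find(keep)
            if chosen is not None:
                parts.append(render(chosen, level))
        else:
            # Any other inline element is flattened to its text content.
            parts.append(render(child, level))
        parts.append(child.tail or "")
    return "".join(parts)


if __name__ == "__main__":
    paragraph = ET.fromstring(FRAGMENT)
    print(render(paragraph, "diplomatic"))
    print(render(paragraph, "normalized"))
```

Because both readings are recorded in the encoding, the choice of transcription level is deferred to output time instead of being fixed once and for all during transcription.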

3. Targeting and collaboration

The decision to target the course towards both the ALM sector (which meant seeking to attract ALM employees and collaborating with ALM institutions engaged in digitization) as well as LIS master campus students has had both its pros and cons. It has certainly brought realism into the course, and has made the students feel they are working with live material in projects that meet a realistic demand from the field. In addition, it is providing the SSLIS department with valuable impetus and competence from the digitizing communities. But it has also created a tension in the course between pragmatic and theoretical interests, and it has proven a challenge to make students (and teachers) appreciate the importance of each interest and understand the advantage of combining them.

There are more tangible challenges as well: how does one market the course while also differentiating it from ALM trainee courses, and how does one manage the wide variety of student categories and meet their interests within one and the same course framework? The students range between campus LIS students eager to strengthen their regular LIS education, local ALM employees wanting to get their feet wet in technology, experienced image managers or text encoders from large national digitization projects in search of theoretical and conceptual content, and textual critics engaged in ongoing scholarly editing projects. The difference in interest, competence, study habits and experience is considerable between these student categories.

4. Deciding whether to offer the course on-site or online

We found it a little more difficult to attract ALM employees to the course than we had anticipated. One practical obstacle may be that the course runs for such a long period of time and that, although primarily a distance education course, it nevertheless requires students to travel to Borås on several occasions for a number of days each time to attend on-site sessions; Maroso (2005, p. 188) makes a similar observation. ALM institutions in the public sector are reluctant to grant their employees such leaves, not to mention to cover travel and course expenses. A remedy would perhaps be to make the course wholly online, but that would in turn call for new pedagogical solutions to combine discussion seminars, group projects, hands-on exercises, and extensive lab sessions.

5. Managing conflicting interests and needs

Partly as a consequence of the diversity of interests described in #3 above, it has at times been challenging to manage the differences and possible conflicts between student and educator expectations and interests, such as

  • a focus on the interface layout in the digitization project versus on the subsurface infrastructure technology,
  • a narrow conception of digitization as a specific hands-on, text and image capture affair versus a management conception of digitization as a larger chain of events,
  • a focus on practical skills versus on generalized theoretical observations.

The last of these means creating a balance between, on the one hand, project specific, local and short-term solutions, and, on the other hand, project general, global, platform independent and long-term standards. As a case in point, it has proven difficult to find appropriate textbooks that provide a balance between a generalized understanding and hands-on instructions tied to a particular digital library software or platform (such as Greenstone).

6. Avoiding the easy way out

Many students are prone to quick and pragmatic approaches in both the editing and the production process. For instance, they tend to view encoding elements as a trivial presentational means that is preferably hidden beneath a WYSIWYG interface, rather than as an intellectual tool that needs to be tangibly visible in an XML editing mode interface. At times, it also requires considerable effort to ensure that the students understand the blessings and possible pitfalls of working with the model of "one input → many outputs" and its consequence: information-dense archival files → information-poor delivery files.

Educators wanting to emphasize practices that seem cumbersome in the short term but that are most flexible over the long term need to be clear and stay coherent about the practical and pedagogical benefits, both internally within the teaching team and externally, towards students, collaborating partners, and financers.

7. Balancing between text and image digitization strategies

Most ALM CH digitization activities follow an image-oriented strategy, i.e., they treat the source material primarily as graphics to be captured, edited and enhanced using, for example, Adobe Photoshop, and brought to the public as delivery facsimiles. ALM institutions are less active in text-oriented work: text transcription, editing and descriptive encoding, delivering the digitized material to audiences as machine-readable, searchable, transformable and editable text files, and using XML applications for extensive metadata. An explicit aim of our course has been to try to balance this by putting equal emphasis on both text and image strategies.

8. Balancing between artificiality and realism

The aim to provide students with an academic, theoretical and generalized knowledge base runs the risk of having the students perceive the education as artificial and difficult to turn into practically applicable skills. We have tried to meet this challenge by offering, together with CH digitizing institutions, live material in obvious need of digitization, by encouraging the students to bring their own material and to perform their project as part of an existing larger project, and by requiring the students to cooperate and form project groups and to take on specific tasks within the group.

A related difficulty is meeting the need, on the one hand, for expensive, state-of-the-art commercial hardware and software and, on the other hand, open source, generalized standards and freeware.

9. Supporting the idea of reusability

While there is much ado in CH digitization policy statements about making the material usable and re-usable, the latter ideal in particular is hardly ever realized, certainly not in the sense of enabling research and education users (not to mention "deep sharing" with other memory institutions, cf. Seaman, 2003) to download rich archival material, choose segments from it and reuse them in new projects and contexts.

We try to counter this by teaching the students open access, deep access and open source approaches, and we furthermore require the students to work with and to deliver their project materials on an open access basis, encouraging them to apply a Creative Commons license to their intellectual enhancements of the digitized material.

10. Documenting, evaluating and sharing

Finally, digitization projects are infamous for neglecting the documentation and evaluation phase. Once the digitized material is out on the web, it's all over (or at least the funding is). But digitization projects are valuable not only for the digitized documents themselves, but at this point in time perhaps even more for the competence, technology, lessons, and experiences accumulated with each project. Such experiences should, we believe, be documented, evaluated, and shared between memory institutions, patrons, and other interested societal institutions – much more than is currently the case. Much toner ink has been spilled complaining that projects often tend to be isolated, reinvent the wheel, or discover too late that there were other and more rational methods and technologies than the ones they implemented. Furthermore, the particular skills and experiences accumulated in a digitization project tend to be lost when the person or persons responsible for a project leave the job, forcing the institution to do some of its work all over again or find new ways to go about its digitizing business.

We therefore require our students to document their plans, their work, and their results in detail throughout their project and to evaluate it afterwards. The documentation (also encoded in their TEI Headers) and evaluation are handed in as a lengthy co-written report that is discussed and examined at the final seminar along with the digitization material itself. The student teams are also required to critically evaluate each other's projects, both at a discussion seminar and in the form of written comments.

It is not only digitization projects, however, that need to document, share and discuss their work, problems and solutions with other agents much more than is currently the case. At least within the Nordic context, we feel CH digitization educators have the same need and requirement. By providing the description of our course, and in particular by highlighting some of the challenges we faced and how we tried to manage them, we hope to prompt comments and ideas for future work.

Notes

1. This article is based on a presentation at the LIDA (Libraries in the Digital Age) 2008 conference in Dubrovnik, Croatia. Many thanks to the organizers and the participants of the conference for their helpful advice and comments.

2. Those interested are welcome to obtain the collected empirical data from the authors.

3. Three student project examples are: Kattresan <http://karlsson.ownit.nu/kattresan/index.htm>, Den Nyttige Bilder-boken <http://digitalisering.info.se/>, and letters by Gustaf Fröding (a Swedish author contemporary with August Strindberg) <http://bibl4.oru.se/froding/>.

References

Choi, Y. & Rasmussen, E. (2006). What is needed to educate future digital librarians? A study of current practice and staffing patterns in academic and research libraries. D-Lib Magazine, 12(9). Retrieved April 18, 2008, from <doi:10.1045/september2006-choi>.

Coleman, A. (2002). Interdisciplinarity: the road ahead for education in digital libraries. D-Lib Magazine 8(7/8). Retrieved April 18, 2008, from <doi:10.1045/july2002-coleman>.

Coyle, K. (2006). Mass Digitization of Books. Journal of Academic Librarianship, 32(6), 641-645.

Dahlström, M. & Hansson, J. (2008). On the relation between qualitative digitization and library institutional identity. In Proceedings of the International Society for Knowledge Organization 12. (Advances in knowledge organization; Vol. 13). 112-118.

Dahlström, M. (2009). Critical transmission. In E. Thoutenhoofd, A. van der Weel & W. Th. van Peursen (Eds.), Text Comparison and Digital Creativity. Amsterdam: Brill. (Forthcoming)

Dalbello, M. (2002). 'Is there a text in this library?' History of the book and digital continuity. Journal of Education in Library and Information Science, 43(3), 197-204.

Justrell, B. (2002). Sweden. In Coordinating digitalisation in Europe: Progress report of the National Representative Group coordination mechanisms for digitisation policies and programmes 2002. European Commission: The Information Society Directorate-General. Retrieved April 12, 2008, from <http://www.minervaeurope.org/publications/globalreport/globalreppdf02/svezia.pdf>.

Justrell, B. (2003). Sweden. In Coordinating digitalisation in Europe: Progress report of the National Representative Group coordination mechanisms for digitisation policies and programmes 2003, European Commission: The Information Society Directorate-General. Retrieved April 12, 2008, from <http://www.minervaeurope.org/publications/globalreport/globalrepdf03/sweden.pdf>.

Lavagnino, J. (2006). When not to use TEI. In L. Burnard, K. O'Brien O'Keeffe & J. Unsworth (Eds.), Electronic Textual Editing. New York: MLA/TEI. Retrieved April 18, 2008, from <http://www.tei-c.org/About/Archive_new/ETE/Preview/lavagnino.xml>.

Manzuch, Z., Huvila, I. & Aparac-Jelusic, T. (2005). Digitization of cultural heritage. In L. Kajberg & L. Lørring (Eds.), European Curriculum Reflections on Library and Information Science Education. Copenhagen: Royal School of Library and Information Science. 37-64. Retrieved April 18, 2008, from <http://www.asis.org/Bulletin/Dec-06/EuropeanLIS.pdf>.

Maroso, A. L. (2005). Educating Future Digitizers. Library Hi Tech, 23(2), 187-204.

McMenemy, D. (2007). Less conversation, more action: Putting digital content creation at the heart of modern librarianship. Library Review, 56(7), 537-541.

Minerva (2004). Good Practices Handbook. Version 1.3. Retrieved January 4, 2008, from <http://www.minervaeurope.org>.

Perry, C. A. (2005). Education for digitization: How do we prepare? The Journal of Academic Librarianship, 31(6), 523-532.

Saracevic, T. & Dalbello, M. (2001). A survey of digital library education. In Proceedings of the American Society for Information Science and Technology, 38, 209-223.

Seaman, D. (2003). Deep sharing: A case for the federated digital library. Educause Review, 38(4), 10-11. Retrieved January 4, 2008, from <http://www.educause.edu/ir/library/pdf/erm0348.pdf>.

Slutrapport Framtid i Access & Framtid i Access Plus (2008). ABM-centrum. Retrieved April 9, 2008, from <http://www.abm-centrum.se/utbildning/default.asp>.

Sperberg-McQueen, C. M. & Burnard, L. (2004). Text Encoding Initiative: The XML version of the TEI Guidelines. The TEI Consortium. Retrieved January 4, 2008, from <http://www.tei-c.org/P4X/>.

Spink, A. & Cool, C. (1999). Education for digital libraries. D-Lib Magazine, 5(5). Retrieved January 4, 2008, from <doi:10.1045/may99-spink>.

Tammaro, A. M. (2007). A curriculum for digital librarians: a reflection on the European debate. New Library World, 108(5/6), 229-246.

Copyright © 2009 Mats Dahlström and Alen Doracic

doi:10.1045/march2009-dahlstrom