D-Lib Magazine
October 2006

Volume 12 Number 10

ISSN 1082-9873

Strategies and Frameworks for Institutional Repositories and the New Support Infrastructure for Scholarly Communications

 

Tyler O. Walters
Associate Director, Technology and Resource Services
Georgia Institute of Technology Library and Information Center
<tyler.walters@library.gatech.edu>

Introduction

Institutional repositories (IRs) are proliferating as they become an indispensable component for information and knowledge sharing in the scholarly world [1]. As their numbers increase worldwide, a new phase of IR development is emerging. Moving beyond their initial functions, IRs no longer serve solely as a place to store, organize, and access content. With rapidly changing technologies, users now desire and expect transportable content that can be utilized within various digital environments and reused in multiple formats, and they need forums for the rapid exchange of ideas with both on-campus and external communities. In response, universities and the libraries hosting IRs are looking for ways to weave their repositories into the "information fabric" of their campuses' academic and business processes and catalyze changes in scholarly communications more broadly.

This article examines emerging IR developments and explores how IRs can help create a new infrastructure to support scholarly communications and digital research. It reviews, as one university's example, the experiences of the Georgia Institute of Technology (GT) Library and Information Center in building its IR, SMARTech, and designing related services.

In a digital age where "content is king," we must recognize that everyone has content, not just the libraries. To be successful information service providers, libraries need to develop services that allow content creators, content managers, and end users to manipulate the content in ways they desire. Faculty want to create bibliographies from IR holdings, and they need to generate document outputs like faculty promotion and tenure documents, annual faculty profile updates, online resumes, and curriculum vitae. They also want to create "MyRepository" portals where students and colleagues can locate their scholarly works. Students and faculty alike want effective content integration between IRs and course management systems like BlackBoard, WebCT, Desire2Learn, Sakai, and other learning technologies [2]. Faculty and researchers want to expose their content to user communities and see it persist.

In order to support these and other user demands, content managers, such as librarians, sponsored programs administrators, thesis office personnel, communications staff, web site managers, and IT professionals, must be able to send/receive, store, organize, and archive content. Therefore, scholarly content needs to be easy to find and broadcast as well as be interoperable from one system to the next. Content users, creators, and managers increasingly will link, copy, move, integrate, transfer, harvest, and possibly even revise scholarly content in digital environments other than the content's original "digital home." As librarians decide how to enhance IRs with value-added services, they need to do so based on a guiding principle: first determine university goals and faculty needs and then develop products, services, and capabilities with these in mind. If the university community sees the IR as adding an indispensable component to the educational activities of the campus, then IRs will gain support and thrive. The "growth industry" for IRs may very well depend upon identifying and implementing creative ways for researchers, students, and other campus professionals to use the scholarly information these repositories contain.

Developing IR Services

Using one approach, IR managers are engaging other facets of the scholarly communications process and tying services together. This merging occurs when IR services staff become involved in activities such as:

  • Supporting technical production processes and hosting the final intellectual output of scholarly conferences. In this scenario, the IR acts as the host site for the conference proceedings and other forms of conference scholarly output (i.e., audio/video recordings of speaker presentations and discussions with attendees). Conference output is preserved and accessed in the IR [3].
  • Supporting technical production processes for open-access e-journals and hosting the final content. The IR acts as host site for the journal. Preservation and access activities regarding journal content take place within the IR. This is actually not a new activity. One such long-standing example is Dermatology Online Journal. Founded in 1995, it is part of the California Digital Library's eScholarship repository [4].
  • Digitally capturing scholarly discussions at symposia and presentations made by lecture series-sponsored speakers. The IR acts as host site for audio/visual recordings of the event, and they are preserved and accessed in the IR [5].
  • Ingesting blogs, wikis, discussion group forums, and e-mail lists that augment ongoing, digital scholarly "conversations."

IR managers are also exploring ways to make effective links between their IRs and other university information systems that manage or capture digital intellectual output. Creative ways to collect and ingest this information into the IR are beginning to emerge. Adopters of this trend, such as Georgia Tech, seek to institutionalize the library's IR and integrate it into the university's academic and business processes. Through this approach, the IR becomes a daily staple for many academicians and administrative support personnel who depend on access to campus scholarly output and the related information they must manage.

A third emerging approach, representing a 180° turn, involves pushing content out. IR content needs to be syndicated and integrated easily into digital environments containing subject-specific information. These environments include university and personal portals, department web sites, discipline-based repositories, and web sites for newspapers, newsletters, and other online publications that may pull content automatically from digital sources via RSS feeds and other technologies. The world of IRs will begin joining the world of Web 2.0 as users borrow, "mash up," and integrate scholarly content into other Web-based environments.
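As a rough illustration of the "pushing content out" idea, the sketch below builds a minimal RSS 2.0 feed from a list of repository items using only the Python standard library. It is not how SMARTech or DSpace generates feeds; the function name `build_rss`, the sample titles, and the `repository.example.edu` URLs are all hypothetical.

```python
import xml.etree.ElementTree as ET


def build_rss(channel_title, channel_link, description, items):
    """Serialize repository items as a minimal RSS 2.0 feed string."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires a title, link, and description on the channel.
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = description
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["link"]
    return ET.tostring(rss, encoding="unicode")


# Hypothetical sample items; a real feed would be built from repository records.
recent_items = [
    {"title": "Sample technical report",
     "link": "https://repository.example.edu/handle/0000/1"},
    {"title": "Sample conference presentation",
     "link": "https://repository.example.edu/handle/0000/2"},
]

feed = build_rss(
    "Recent repository items",
    "https://repository.example.edu/",
    "Newly deposited scholarly works",
    recent_items,
)
print(feed)
```

A department web site or portal could poll a feed of this kind and display new items without anyone re-entering them by hand.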

Re-examining the Nature of Scholarly Communications

Given all of this growth of IR-related services, perhaps examining the definitions of "scholarly communications" and "institutional repositories" will help us to understand better the kinds of activities and outputs that fall within the former's realm, and to think of the latter's role within it. The overall paradigm for scholarly communications is:

"the system through which research and other scholarly writings are created, evaluated for quality, disseminated to the scholarly community, and preserved for future use. The system includes both formal means of communication, such as publication in peer-reviewed journals, and informal channels such as electronic listservs." [6]

The definition identifies two broad categories of scholarly communication, formal and informal. Historically, librarians have been most concerned with the formal (e.g., journals, technical papers, conference proceedings, white papers, research reports). However, a growing body of informal modes, such as blogs, wikis, listservs, and other social software content, is being utilized by scholars and their students. Increasingly, new knowledge is exchanged through both formal and informal means.

Raym Crow defined an IR as a:

"...digital archive of intellectual product created by the faculty, research staff, and students of an institution and accessible to end users both within and outside of the institution, with few if any barriers to access. The content... is institutionally defined, scholarly, cumulative and perpetual, and open and interoperable." [7]

Crow's definition does not specify the form that "intellectual product" takes and consequently includes myriad methods of scholarly communications – be they formal or informal. It is important to design IRs and their services according to this broad understanding of scholarly production. The nature of scholarly communications, and the products that result from them, should guide the nature of both IR content and services. With this new understanding, libraries will discover ways to enhance their IRs by offering users additional capabilities, such as the ability to:

  • Syndicate learning objects to create courses and integrate with courseware
  • Easily copy bibliographic citations from a faculty member's publications onto the faculty member's web page or into a curriculum vitae or annual profile update
  • Import a researcher's reports and technical papers into a portal or research center web site or link them to a larger, discipline-based repository
  • Syndicate an IR-stored wiki or blog in a new virtual community hosting a multi-university research team so that the team may draw upon its content to continue current, informal discussions or write a new research report.

The potential uses of IR content in other digital environments are countless. IR content does not need to "sit" in the IR only. It should be brought into new digital environments where it can be easily consulted, represented, and integrated with other current research and educational activities. IR content also should be collected in creative ways and not rely solely on being submitted by individual faculty. These new concepts for IRs are being explored at the Georgia Institute of Technology (GT). This university's experiences as it develops its IR into a more meaningful information service will be examined.

Seeking Opportunities to Create a New Infrastructure – Program-building around "Scholarly Communications Services" at Georgia Tech

SMARTech, or Scholarly Materials And Research at Georgia Tech, is an institutional repository for capturing the Institute's intellectual output to support its teaching and research missions. The SMARTech website states that it "connects stockpiles of digital materials currently in existence throughout campus to create a cohesive, useful, sustainable repository available to Georgia Tech and the world." The GT Library publicly launched SMARTech using the DSpace software platform on August 1, 2004 [8]. SMARTech began with 3,000 documents, including dissertations, annual research project reports, final project reports, and a technical papers series from the Institute of Paper Science and Technology, a research center and graduate school that formally merged with GT in 2003 [9]. As of August 2006, SMARTech hosts over 9,000 items in 130 collections, spread across more than 70 communities and subcommunities. During the past fiscal year (July 2005 - June 2006), 1,000,791 item records were viewed, 489,292 items were downloaded, and 50,434 searches were made. SMARTech's holdings have tripled in its first two years; this growth and active use of SMARTech makes it one of the world's largest DSpace-based single-institutional repositories [10].

GT's experience building SMARTech and erecting parts of a new scholarly communications infrastructure is representative of what research libraries are seeing across academe. It is eminently applicable to other universities and therefore provides sound lessons for those embarking on a similar road to develop their own IR services. As the GT Library changed internally to meet management and service needs associated with SMARTech, we began to expand our horizons. Through our observations, interviews, and interactions with faculty, we recognized that they and others (i.e., GT researchers, administrative staff, and students) would benefit from support services that assist them as they create digital intellectual output, prior to depositing it in the IR [11]. Technical assistance to produce intellectual works in digital form and capture live events where scholars exchange ideas became our focus. In 2005/06, we developed a program around a set of services that would meet these needs. Under this program, the Digital Initiatives Work Group began to provide technical support services to produce traditional scholarly communications in digital form. Initially, the Group worked to:

  • Develop SMARTech as the center of program services, building content services and integrating the IR with other information systems containing GT-born digital intellectual output
  • Provide technology/production support for GT-hosted scholarly conferences' intellectual output (i.e., proceedings and conference sessions' audio/video recordings)
  • Provide technology/production support services to create open access e-journals and other e-publications, providing preservation and accessibility via SMARTech
  • Provide technology support to capture audio/video of live scholarly communications from GT-hosted symposia and sponsored lecture series, preserving resulting digital products and providing accessibility via SMARTech

The GT Library has made noteworthy progress, particularly in the areas of conference and e-journal support and the creative integration of SMARTech with other information systems on campus.

Supporting Conference-based Intellectual Output

GT's academic programs are concentrated heavily in the engineering disciplines, which utilize conferences as a major means of communicating research developments. Hence, the Digital Initiatives Work Group saw an opportunity to devise and promote services to support conference-based intellectual output.

The School of Aerospace Engineering (AE), in particular, actively utilizes conferences to share research information. As of August 2006, SMARTech is hosting AE's Space Systems Engineering Conference, along with two other campus-based conferences, the Recycling of Fibrous Textile and Carpet Waste Conference and the Electronic Resources & Libraries Conference (ER&L) [12]. According to U.S. News and World Report, AE is the fourth highest ranked graduate program in its field [13]. AE first interacted with SMARTech by depositing a working paper series to which it retains copyrights. Previously, the library collected technical papers series from both their Space Systems Design and Aerospace Systems Design Laboratories. Working together, the Digital Initiatives staff and the library's subject librarian facilitated this relationship with AE. This partnership between librarians has proven very useful for maximizing existing relationships between academics and librarians to establish the repository and its services.

We envisioned the digital initiatives program not only hosting the final output of conferences but also helping to formulate the terms under which content creators and GT, as keeper of the output, will use it in the future. This involvement would include assisting with technologies to communicate, capture, and produce conference output. The first opportunity came when an enterprising GT librarian began to coordinate the brand new ER&L Conference. Digital Initiatives staff discussed the assistance they could give the conference-coordinating librarian to produce and host the conference output; she was pleased to collaborate. Digital Initiatives staff (and the author) assisted with writing a statement of intellectual property rights. It became the "click through" terms of agreement that presenters accepted when submitting their session proposals. Under the terms, presenters kept their copyrights and the right to use the material in any way they wish, indefinitely. GT would disseminate and preserve the content via SMARTech, non-exclusively. Every presenter accepted this arrangement.

Some of the conference technologies were selected prior to Digital Initiatives' involvement; however, much was learned about conference production from the ER&L experience that is being applied to future projects. Staff reviewed conference production systems such as Open Conference Systems (OCS) [14], open source software developed by the University of British Columbia's Public Knowledge Project, and the DPubs software [15], a joint open source software project of Cornell and Penn State Universities. In the end, ER&L conference organizers used Moodle, an open source course management system that offered features such as online discussion groups. Moodle was originally designed to host courses and had to be adapted to host conference sessions. The online discussion group feature was used to capture pre- and post-conference discussions related to a particular session. Presentations (usually in MS PowerPoint) and MP3 session audio recordings were made and became part of the intellectual output now accessible in SMARTech. While this experience seems a bit incestuous (i.e., librarians offering services to librarians), it gave Digital Initiatives staff a more thorough example of the kinds of conference support services they can offer GT faculty who are organizing scholarly conferences (e.g., supporting the production, dissemination, and preservation of conference-related intellectual output).

Supporting e-Journal Publishing

Similar to its proactive approach regarding conference-based output, the library "advertised" what it could do to aid faculty in publishing electronic journals. The first experience with open-access (OA) e-journal publishing came with the journal, Information Technologies and International Development (ITID) [16]. This opportunity came via the IT manager for the college of arts and sciences, who previously worked as the systems administrator in the library. The Digital Initiatives manager maintained contact with the IT manager when he left the library for the college. Eventually, one of his faculty, Michael Best, co-editor of ITID, began searching for technical support for his journal. Knowing of the library's interest in supporting e-journals, the IT manager referred the professor to the library and a three-way collaboration to support the journal began. The Digital Initiatives staff worked with the GT co-editor, the other co-editor and editorial assistant at the University of Maryland, and MIT Press as the publisher. At the same time, ITID also converted from a print and subscription-based journal to an electronic and open-access journal, the first OA journal offered by MIT Press. While the GT Library is not the host of the published output, it is the technical production and technology support unit for the journal. This provided a great learning opportunity for the Digital Initiatives staff, who now promote their e-journal support services to other faculty and student groups.

"Institutionalizing" Intellectual Output Transfer in IRs

While digitally hosting and capturing conference proceedings and supporting e-journal production both focus on disseminating scholarly output, other IR-related activities at GT have drawn intellectual content into SMARTech. The Digital Initiatives staff is working to integrate GT information systems by transferring intellectual output from other departments or units into SMARTech. After identifying campus systems that handle research output, they pursued discussions regarding automated document transfer. The first such effort is a familiar one – the electronic thesis and dissertation process and transferring the research output from that system into the IR. However, there are many other university information systems used to report faculty and student intellectual output, such as research reports managed through a university's office of sponsored programs, a university's student portfolio system, and college- and department-level means for faculty to inform public relations/communications staff of their scholarly activities.

Integrating Reports from the Office of Sponsored Programs

Collecting scholarly output into an IR need not consist solely of faculty-authored journal articles and technical papers. As pointed out above, intellectual works reside in many areas at a university. Faculty and their research groups typically gain funding for their projects from grants given by government agencies, corporations, and private foundations. In many cases, a project progress report is required, including the status of the research, as well as a final report containing results. The office of sponsored programs (OSP) manages the business aspects of these projects, and their related documents reside in OSP information systems. In 2002, the library and the GT OSP began to discuss moving this largely paper-based project reporting system to an electronic one. Currently, OSP is finalizing this migration and re-engaging discussions with the library about transferring final reports to SMARTech via automated means. Through library review of the OSP report management system, WebWISE, library staff found that basic metadata on each project was being gathered into the system, including the type of report being submitted (research status report, quarterly expenditures statement, financial summary, etc.). OSP and library staff decided to add one field, asking principal investigators (PIs) whether the report they are submitting is the project's final deliverable. When a PI answers yes, the report is marked as a final deliverable and slated for an automated, nightly transfer to SMARTech. This, of course, will occur only when the PI identifies the project as open research, as opposed to a confidential or restricted project. OSP staff review and manage access to confidential/restricted projects as well as progress reports of open projects via the GT Library's campus records management program.
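The selection rule described above (final deliverable, open project) can be sketched in a few lines. This is only an illustration of the logic, not WebWISE's actual data model or interface: the `Report` class, its field names, the access labels, and the sample records are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Report:
    project_id: str
    report_type: str
    is_final_deliverable: bool  # the added "final deliverable" field, as answered by the PI
    access: str                 # assumed labels: "open", "confidential", "restricted"


def select_for_transfer(reports):
    """Return only final deliverables that belong to open research projects."""
    return [r for r in reports if r.is_final_deliverable and r.access == "open"]


# Hypothetical records standing in for one night's batch.
reports = [
    Report("P-001", "final report", True, "open"),
    Report("P-002", "research status report", False, "open"),
    Report("P-003", "final report", True, "confidential"),
    Report("P-004", "financial summary", False, "restricted"),
]

queued = select_for_transfer(reports)
print([r.project_id for r in queued])
```

Keeping the rule as a single filter makes it easy to audit: progress reports and anything from a confidential or restricted project never enter the transfer queue.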

Unburdening the GT Student Portfolio System

CareerTech, a GT student portfolio system begun in 2006, "enables students to build a portfolio gradually over time...posting digital versions of papers, reports, presentations, speeches, and other products of learning activities." Under current policy, the "portfolio will be available online two semesters beyond the last term that the student is actively enrolled in classes" [17]. The Office of Career Services, Office of Information Technology, and the GT Library support an additional policy for transferring inactive portfolio content to SMARTech two semesters after the student's departure from GT. These three units support this archiving approach for two main reasons. Future storage space will become limited in the student portfolio system, a system designed to manage current portfolios only. Inactive portfolios need to be placed into an archival system, such as an IR, to ensure some measure of access to and preservation of student intellectual and creative endeavors without burdening the portfolio system. The library, interested in capturing and preserving this output, has become a strategic partner in the process, providing an archival system for inactive portfolios via the IR, SMARTech.

Augmenting the College of Architecture Faculty Digest

The College of Architecture (CoA) Faculty Digest is a tool used by the faculty of the College of Architecture to inform communications staff of their scholarly activities. The web-based system allows them to describe a scholarly event, assign metadata to it, and send that information to the appropriate communications offices of the CoA and the greater university. The library's Digital Initiatives staff and the CoA's IT staff began comparing notes at the university's Information Technology Advisory Committee (ITAC) meetings and determined that it would be fruitful to work together to augment the CoA's faculty digest system. The goal was to add a feature allowing faculty to attach their scholarly source content to their Faculty Digest announcements, with the content and its metadata submitted automatically to SMARTech. The project's staff mapped the submitted metadata to the qualified Dublin Core fields used by SMARTech. This benefits the CoA faculty because they will not have to learn a new procedure to submit their intellectual works to SMARTech via its DSpace submission routine. They only have to add one step to their faculty digest process, attaching the content to the message, much like attaching a file to an e-mail. This is a "win-win" situation, further integrating the IR into a college's academic processes and making a digital environment already used by faculty even more effective.
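A metadata mapping of this kind might look like the following sketch. The Faculty Digest field names on the left are hypothetical, since the article does not list them, and the mapping is illustrative rather than the project's actual crosswalk. The targets on the right are qualified Dublin Core elements of the sort DSpace uses.

```python
# Hypothetical Faculty Digest field names mapped to qualified Dublin Core.
FIELD_MAP = {
    "faculty_name": "dc.contributor.author",
    "event_title": "dc.title",
    "event_date": "dc.date.issued",
    "summary": "dc.description.abstract",
    "keywords": "dc.subject",
}


def to_dublin_core(announcement):
    """Translate one announcement into a {dc_field: [values]} record."""
    record = {}
    for field, value in announcement.items():
        dc_field = FIELD_MAP.get(field)
        # Skip fields with no mapping and fields left empty.
        if dc_field is None or value in ("", None):
            continue
        values = value if isinstance(value, list) else [value]
        record.setdefault(dc_field, []).extend(values)
    return record


# Hypothetical announcement, including one field with no Dublin Core target.
announcement = {
    "faculty_name": "Doe, Jane",
    "event_title": "Sample lecture on sustainable design",
    "event_date": "2006-09-15",
    "keywords": ["architecture", "sustainability"],
    "room_number": "101",
}

record = to_dublin_core(announcement)
print(record)
```

Storing every value as a list reflects that Dublin Core elements are repeatable, so a single announcement can carry several subjects or authors.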

Conclusion

Additional work is required to weave IRs into the fabric of our institutions and scholarly disciplines. Many other services are being considered, such as media and format migration (e.g., Cornell, Tennessee), data management (e.g., Purdue), personalization/researcher pages (e.g., Rochester), restricted access management for proprietary and embargoed information, citation analysis, statistics displays, improved searching, and more. These services will all play an important role in developing IR functions. Other ongoing work at GT includes personalizing SMARTech via researcher pages as well as syndicating SMARTech content and applying templates to produce tenure documents, faculty profiles, vitae, and bibliographies on departments' and research centers' web sites. These kinds of developments will be necessary for IRs to become scholarly information services that faculty, staff, and students find indispensable. The GT Library engaged other members of its campus to bring added value to its IR. This could not have been achieved by complacently accepting the conventional storing/organizing/accessing model of an IR. To develop each new service, the GT Library needed only to collaborate with one faculty member or campus professional to complete a project and begin marketing the service to others. Perhaps the main lesson in IR development is to build those all-important relationships with faculty, student groups, IT, and other academic professionals, and maintain them diligently. When creating new IR services, we should also be mindful that no single approach works for every university. There is no need to be rigid – adopt an approach that will work on your campus, utilize your existing relationships, and deliver some small projects. These steps will pave the way to engendering a broader understanding of how the library can support modern scholarly communications and help manage digital intellectual output.

Notes

[1] Lynch, Clifford A. and Joan Lippincott, "Institutional Repository Development in the United States as of Early 2005," D-Lib Magazine, 11:9 (September 2005). <doi:10.1045/september2005-lynch> and van Westrienen, Gerard and Clifford A. Lynch, "Academic Institutional Repositories: Deployment Status in 13 Nations as of Mid 2005," D-Lib Magazine, 11:9 (September 2005). <doi:10.1045/september2005-westrienen>.

[2] Lynch, Clifford A. and Neil McLean, "Interoperability between Information and Learning Environments – Bridging the Gaps," a white paper produced jointly by IMS Global Learning Consortium and the Coalition for Networked Information. (PDF file, May 2004).

[3] For examples of conference intellectual outputs, see: <http://smartech.gatech.edu/handle/1853/10159> and <http://smartech.gatech.edu/handle/1853/8028>.

[4] See <http://dermatology.cdlib.org>.

[5] For an example of digital output of a lecture series, see: <http://smartech.gatech.edu/handle/1853/9431>.

[6] See Scholarly Communications definition at: <http://www.ala.org/ala/acrl/acrlpubs/whitepapers/principlesstrategies.htm>.

[7] Crow, Raym, "The Case for Institutional Repositories: A SPARC Position Paper." 2002. <http://www.arl.org/sparc/IR/ir.html>.

[8] Smith, MacKenzie, Mary Barton, Mick Bass, Margret Branschofsky, Greg McClellan, Dave Stuve, Robert Tansley, and Julie Harford Walker, "DSpace: An Open Source Dynamic Digital Repository," D-Lib Magazine, 9:1 (January 2003). <doi:10.1045/january2003-smith>.

[9] Walters, Tyler O. "Moving Libraries into Modern Knowledge Services," 2001: An Information Odyssey: Seizing the Competitive Advantage. Washington, D.C.: Special Libraries Association, 2001. <http://smartech.gatech.edu/handle/1853/7426>.

[10] Thomas, Charles F., Robert H. McDonald, Anthony D. Smith, Tyler O. Walters, "The New Frontier of Institutional Repositories: A Common Destination with Different Paths," New Review of Information Networking 11:1 (May 2005): 65-82.

[11] For an example of how other universities are gathering information on faculty research processes and information use, see Foster, Nancy Fried and Susan Gibbons, "Understanding Faculty to Improve Content Recruitment for Institutional Repositories," D-Lib Magazine, 11:1 (January 2005). <doi:10.1045/january2005-foster>.

[12] <http://smartech.gatech.edu/handle/1853/9435>, <http://smartech.gatech.edu/handle/1853/7946>, <http://smartech.gatech.edu/handle/1853/10063>, respectively.

[13] See references to Georgia Tech's 2007 rankings in U.S. News & World Report: <http://www.gatech.edu/news-room/release.php?id=920>.

[14] For information about OCS, see: <http://pkp.sfu.ca/?q=ocs>.

[15] For information about DPubs, see: <http://dpubs.org/>.

[16] See ITID at: <http://www.mitpressjournals.org/loi/itid>.

[17] See <http://www.oit.gatech.edu/comm_files/nl06s_h.html>.

Copyright © 2006 Tyler O. Walters

doi:10.1045/october2006-walters