D-Lib Magazine
July/August 2005

Volume 11 Number 7/8

ISSN 1082-9873

Funding for Digital Libraries Research

Past and Present

 

Stephen M. Griffin 1
Program Director, Division of Information and Intelligent Systems
National Science Foundation
<sgriffin@nsf.gov>

Introduction

In the past several decades, rapid developments in computing and communications technologies have created a globally distributed information environment that significantly extends the scope of applications to be explored. The emergence and growth of the Internet and high bandwidth connectivity, combined with low cost processors and memory, have encouraged the creation and use of digital content on a grand scale. The situation today is one of information-driven endeavors spanning broad areas of human activities.

The Interagency Digital Libraries Initiative (DLI) was an opportune program, coinciding with transformational changes in the larger technology environment. The DLI proceeded through 10 years of funding projects and activities inspired and articulated by the broad community of researchers and practitioners the program was meant to serve. Projects funded through the DLI were instrumental in inspiring new areas of information technology research and creative project models.

This period of time was also one in which altogether new social networks and organizations were formed as a consequence of new technological environments; diverse communities found common ground and cause to pursue knowledge-making and knowledge management using digital resources. As a result, cross-disciplinary and multi-sector collaboration became a fixture of digital libraries research efforts.

From the mid-90s to the present, all things related to the Internet can only be described in terms of exponential change, from capabilities of base technologies (see Figure 1) to the creation and accumulation of digital content to the number of users and possibilities for use. The cost of components continues to plummet as capabilities increase by orders of magnitude, lowering the barriers to acquisition and operation of high performance computing and communications resources. The ability to create, manipulate, manage, and share digital content is now within the financial means of small organizations and individuals. A self-reinforcing cycle of technology advances and increasing public demand for more is in evidence across many sectors of human activity.

Figure 1. 1993 Metacenter (chart of the NSF Metacenter configuration, circa 1993).

The technology changes in the 1990s and early 2000s were dramatic and have been described in many ways. To illustrate advances in processor, memory, storage, and networking technologies, Figure 1 above shows the NSF Metacenter configuration in 1993. The Supercomputer Centers budget was approximately $60,000,000, with equipment and support a significant fraction of that figure. In 2005, some would argue that similar capabilities can be achieved with networked personal computers costing about $5,000 each.

Digital content has become the driver for Internet growth. New advances are no longer being shaped entirely by the designs of scientific research but by a variety of information producers and user desires. The Internet landscape and features are being drawn as much by consumer and commercial forces as by research and education interests. World citizens continue to express a strong desire for new and more digital content, better means for access to information, and new means for personal expression and entertainment.

Digital Libraries research today is seen as grounded in computer science and engineering research, informed by domain research across disciplines, applicable to a broad set of scientific and non-scientific problem domains and characterized by novel collaborative efforts focused on the creation, collection, organization, use, and preservation of large volumes of digital information in a rapidly changing, globally linked knowledge environment.

The phrase "digital libraries" continues to be a powerful metaphor for discussion in the Internet age – one that continues to inspire innovative thinking about large-scale, distributed information environments, and inventive communication and learning practices. The definition of "digital libraries" continues to evolve in step with research accomplishments, new technologies, and Internet-based capabilities and resources. The goal of making globally distributed information-of-value accessible to large, diverse user populations desiring knowledge for many purposes is steadily being achieved.

Context

In the late 1980s, NSF began concerted funding of networking research and infrastructure, primarily in support of the supercomputing program. The supercomputing program itself was created in response to a series of reports in the early 1980s encouraging rapid development of computational science capabilities. Computational science was seen as a rapidly growing multidisciplinary field that used advanced computing capabilities to understand and solve complex problems which otherwise could not be addressed. National supercomputer centers for large-scale computing resources were viewed as the means to achieve computational science goals.

NSFNET (not yet named) was then viewed as a means to aggregate and share computational resources (processors, memory) at the supercomputer centers in order to bring more capability to bear on Grand Challenge application problems and to expand access to the centers via regional networks. As the NSFNET backbone increased in bandwidth, regional networks multiplied along with the number of users.

With the connection of international networks and transition from government-sponsored systems (ARPANET, SATNET, PRNET, and NSFNET), the concept of the Internet as a world network of networks began to be realized. What was motivated by a demand for secure communication in times of global conflict (early work on packet switched networks sponsored by DARPA in the 1960s and 1970s), and a means for connecting major centers of computational resources in order to harness their collective power (1980s), rapidly became a means for connecting people to each other and to universal stores of knowledge. What had been envisioned as a resource for research and education quickly transformed into a popular new medium for the general public.

By 1991, Tim Berners-Lee, working at CERN, had created a hypertext GUI browser and editor that he called the "World Wide Web" (WWW) [2]. Berners-Lee made the WWW system freely available on the Internet that same year. The first web server outside of Europe was installed at the Stanford Linear Accelerator Center. Shortly thereafter, other US institutions, including the National Center for Supercomputing Applications (NCSA) at the University of Illinois, deployed web servers.

By 1993, fewer than 100 web sites existed, but the potential and need for access to network data collections had been recognized. In early 1993, at a seminar at CERN, Marc Andreessen from NCSA released the first version of Mosaic. Mosaic was the first program to make "browsing" the World Wide Web easy and user-friendly. It was commercialized as Netscape shortly thereafter.

In 1995, Sun Microsystems formally announced Java, and Netscape announced its intention to use Java in the Netscape browser. Java significantly enhanced web page functionality.

Web tools and resources rapidly emerged. Yahoo! – a table of contents for Web sites – was started by two Stanford students. Search engines such as Lycos, AltaVista, and WebCrawler appeared. These involved automatically indexing Web pages using keyword-based techniques for ranking. One of the graduate students funded under the NSF-supported DLI project at Stanford, Larry Page, took an interest in the Web as a "collection". He and Sergey Brin, another Stanford graduate student working on the DLI project and supported by an NSF Graduate Student Fellowship, constructed a prototype search engine based on another technical approach – examining link origins and destinations – reasoning that these might capture collective human judgment. Together, Page and Brin constructed a search engine called BackRub. This work, along with that of the 15 other graduate students supported through the Stanford DLI grant, was described to a DLI site-visit team in 1997. The site review team expressed approval for all the good work being done on the Stanford DLI project. Later, BackRub was renamed Google, and in 1998, Google Inc. opened for business. Google allowed fast and easy indexing, searching and exceptionally accurate ranking of web content – albeit a small percentage of the total web content.
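
The contrast drawn above, between keyword-based ranking and ranking by link origins and destinations, can be illustrated with a small sketch. The Python below is not the BackRub or Google implementation. It is a minimal, simplified link-analysis iteration in the style of the publicly described PageRank idea, run on a made-up four-page web. The function name link_rank, the damping value of 0.85, the iteration count, and the toy data are all assumptions chosen for illustration.

```python
def link_rank(links, damping=0.85, iterations=50):
    """Score pages by their incoming links using a simple power iteration.

    `links` maps each page to the list of pages it links to.
    All names and parameter values here are illustrative assumptions.
    """
    # Collect every page that appears as a link origin or destination.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with equal scores that sum to 1.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on, split evenly over its links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy web: pages a, b, and d all link to c; c links back to a.
toy_web = {
    "a": ["c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
scores = link_rank(toy_web)
best = max(scores, key=scores.get)
```

In this toy web, page c ends up with the highest score because three other pages point to it, regardless of which keywords any page contains. That is the sense in which links can stand in for collective human judgment.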

Brief History of the Digital Libraries Initiative

The Digital Libraries Initiative (DLI) began with a modest investment from NSF, DARPA, and NASA for the period 1994-1998. At this time, the newly enacted Federal High Performance Computing Program (HPCC) was attempting to coordinate and stimulate collaboration among agency computing and networking activities. The DLI jointly funded six university-based projects focusing on fundamental information technology (IT) research, technology testbeds and partnership building (see Table 1). The six projects were similar in size and structure ($1M/yr each), techno-centric, highly collaborative, and heavily leveraged (>100% cost-sharing of the $24M total federal funding). Each addressed different aspects of the DL research agenda as it was understood at the time. Support was not provided for creation or conversion of digital content for the purposes of collection building, or for implementation and evaluation of fully operational systems of scale. Complete information on the projects can be found at <http://www.dli2.nsf.gov>.

Digital Libraries Initiative - Phase 1

University | Project | Research Focus
Carnegie Mellon University | Digital Video Libraries | Speech, image and natural language technologies integration
University of California, Santa Barbara | Geographic Information Systems | Spatially-indexed data; content-based retrieval; image-compression; metadata
University of Michigan | Intelligent Agent Architectures | Software agents; resource federation; artificial service market economies; educational impact
University of Illinois | Intelligent Search and the Net | Large-scale information retrieval across knowledge domains; semantic search; SGML; user/usage studies
Stanford University | Uniform Access | Large-scale information retrieval across knowledge domains; semantic search; SGML; user/usage studies
University of California, Berkeley | Media Integration and Access | New models of "documents"; natural language processing; content-based image retrieval; innovative interface design

Table 1. The six projects funded under DLI-1.

The planning of the DLI program was thorough and inclusive, and working papers and reports were developed that outlined program goals. The working papers and reports are still available at <http://fox.cs.vt.edu/DLSB.html>.

A program kick-off meeting was held immediately following the announcement of the DLI-1 awards. This meeting established a set of directions for research and inquiry, many of which are still guiding efforts today. The report from this meeting is available at <http://diglib.stanford.edu/diglib/pub/reports/iita-dlw/main.html>.

During the course of the Digital Libraries Program (DLI-1), communities of researchers, practitioners and users continually inspired and informed the program's direction and management. The Principal Investigators themselves were instrumental, not only in guiding their own projects but in working with each other to build a single, coordinated and balanced program. The agency managers provided oversight and properly located the program within the larger structure of the Federal HPCC program. They also labored to identify and secure additional funding, and structured project reporting and reviews so as to satisfy agency requirements, avoid unnecessary project overhead and distraction, and produce informative materials for the larger and growing DL community.

It was determined that capturing the intellectual output of the individual projects and disseminating results was particularly important. D-Lib Magazine was funded by DARPA in 1995 and became an integral part of the DLI program's value. D-Lib was established as a professional publication – an indispensable resource and the journal for reporting research results and discussion of broader issues for the DL community. In addition, a DLI web site was established and funded as part of the University of Illinois project. The web site grew to become an accurate record of DLI activities, containing event information, papers, reports, and presentations from the projects and activities in the broader community. Both D-Lib and the University of Illinois web site were integral to community building and program success.

Based on the widely acknowledged success of DLI and the enthusiasm for the rapidly expanding Internet environment, a follow-on program, Digital Libraries Initiative Phase-2 (DLI-2), was planned. To avoid inevitable interruptions in their offices, a small group of agency program managers held early brainstorming sessions at a French bistro within walking distance of NSF and DARPA. The notes from these brainstorming sessions became input to a subsequent planning workshop organized by Dan Atkins that was held in the spring of 1997. The focus of this workshop, which was named The Santa Fe Planning Workshop (http://www.si.umich.edu/SantaFe), was to construct the specific intellectual agenda for DLI-2.

DLI-2 was envisioned as distinctly broader than DLI-1, extending well beyond traditional IT research issues into domain informatics, content development, use and usability in a variety of work contexts. Additional agency sponsorship was solicited and obtained. In addition to NSF, DARPA and NASA, the National Library of Medicine, Library of Congress, and National Endowment for the Humanities pledged support in various forms. The NSF Division of Undergraduate Education was a major contributor, using DLI-2 to begin exploring the resources for the National Science Digital Library. Still other agencies participated as partners, joining in planning and working group discussions and All-Project meetings. These agencies included the Institute for Museum and Library Services, the Smithsonian Institution, and the National Archives and Records Administration. It was becoming increasingly clear that digital resources would be a primary part of these agencies' programs and assets.

DLI-2 (1998-2002) involved two separate competitions. Funding was approximately $8-10 million per year for 5 years and about 50 projects were sponsored, representing a full spectrum of activities, including: fundamental research, content and collections development, domain applications, testbeds, and operational environments. The projects addressed topics spanning the entire information lifecycle – creation, access, dissemination, use, and preservation – and placed additional emphasis on measures of impact. This was to be accomplished via a modular, open program structure that would enable new sponsors to participate, to add new performers or projects at any time, and to build on and enhance existing projects. Program intellectual goals were to enlarge the topical scope and place more emphasis on content issues as well as technology development and applications, to keep pace with advances in the development and the use of distributed, networked information resources of all types throughout the nation and around the world.

A great strength of the DLI-2 program was the interdisciplinary richness of the various projects, the high levels of interaction among them and extensive partnering with private sector corporations and other organizations. On the agencies' side, program management proceeded in an atmosphere of enthusiasm, good will, and cooperation.

Image from a slide showing the history of DL community input

Figure 2. Major Community Planning Input.

 

 

Table 2. DLI Program Implementation History

NSF 93-141: Research on Digital Libraries [Digital Libraries Initiative - Phase 1]
    Agency Sponsors: NSF, DARPA, NASA
    Submission Date: February 4, 1994
    Proposals Received: 76
    Awards: 6 totaling $25M (FY94-FY98)

NSF 98-63: Digital Libraries Initiative - Phase 2
    Agency Sponsors: NSF, DARPA, NASA, NIH/NLM, NEH
    Agency Partners: IMLS, Smithsonian, NARA
    Submission Date 1: July 15, 1998; Date 2: May 17, 1999
    Proposals Received: ~300
    Awards: 34 totaling $48M (FY98-FY03)

NSF 99-6: International Digital Libraries Cooperative Research Initiative
    Submission Dates (target): January 15, 1999, 2000, 2001
    Proposals Received: ~60
    Awards: 16 totaling $6M (1999-2003) [Includes NSF/JISC = 6 awards; NSF/DFG = 4 awards]

NSF 02-085: International Digital Libraries Cooperative Research Initiative and Applications Testbeds
    Submission Dates (target): April 15, 2002, 2003, 2004, ... (Program Terminated 2003)
    Proposals Received: ~50
    Awards: NSF/JISC = 4 awards; NSF/DFG = 2 awards; NSF/ED = 1 award

 

 

Table 3. Agency Program Managers Involvement

Digital Libraries Initiative Phase 1:
YT Chien, NSF
Larry Rosenberg, NSF
Gio Wiederhold, DARPA
Su-Shing Chen, NSF
Steve Griffin, NSF
Priscilla Huston, NSF
Darleen Fisher, NSF
William Bainbridge, NSF
Ron Overman, NSF
John Cherniavsky, NSF
Barry Leiner, DARPA
Bob Neches, DARPA
Paul Hunter, NASA
Eugene Miya, NASA
Nand Lal, NASA
Ron Larsen, DARPA

Digital Libraries Initiative Phase 2:
YT Chien, NSF
Steve Griffin, NSF
Michael Lesk, NSF
William Bainbridge, NSF
Ron Overman, NSF
Norman Fortenberry, NSF
Lee Zia, NSF
James Lightbourne, NSF
Ron Larsen, DARPA
Jean Scholtz, DARPA
Eugene Miya, NASA
Laura Campbell, LoC
Alexa McCray, NLM
Milton Corn, NLM
George Farr, NEH
Joyce Ray, IMLS

 

DLI-2 projects revealed transformative research methods and practices for many subject areas. In addition to continuing and adding to core computer and information science projects, DLI-2 funded projects beyond the sciences. The projects demonstrated conclusively the value of digital facsimiles and globally linked collections. For many areas of study, fundamental assets expanded from rare and mostly inaccessible physical artifacts and collections to high-resolution digital models and representations of these, allowing shared, collaborative use, and non-intrusive analysis not possible otherwise. New interdisciplinary areas of study emerged and became established, such as cultural heritage informatics, computational humanities, digital archaeology, and music information retrieval, to name but a few. Digital imaging transformed other projects to significantly extend their reach: ancient documents were illuminated and made legible; damaged artifacts were digitally restored; and archaeological structures and sites were reconstructed. Other subject areas were transformed and enriched through geographic information systems technology (GIS) and rapidly growing corpora of spatially indexed data. It is seen as essential to locate artifacts, events, and phenomena in space and over time to understand them better. Those fields that have adapted GIS technologies into their core research practices include many of the social sciences and humanities. The Alexandria Digital Library provided inspiration and content for this work. Text analysis and examination of large corpora benefited from new techniques developed by teams at Columbia, the University of Arizona and other projects. These projects have made major new resources available for a wide variety of purposes, from management of medical records to law enforcement and counterterrorism information collection.

Throughout the DLI Initiatives, Division managers Y.T. Chien and Michael Lesk, and their counterparts at the other funding agencies, provided brilliant leadership and oversight of the overall program, giving program managers considerable flexibility to adjust and add to funded projects in response to emerging technologies, and changes in the organizational and social environments in which the program operated.

A third digital libraries program was planned in response to recommendations and guidance offered in insightful reports such as the PITAC report Digital Libraries: Universal Access to Human Knowledge, the National Science Foundation Blue Ribbon Advisory Panel on Cyberinfrastructure report, and the report Knowledge Lost in Information – Research Directions For Digital Libraries (the latter from the Chatham Workshop on digital libraries futures).

There were also a large number of technical workshops and international working groups that continued to define critical emerging research areas and suggest specific research agendas. Prominent among these were the NSF/EU working groups, co-sponsored by the EC-funded DELOS Network of Excellence. (Links to these reports are included in this article.) Among the recommendations emerging from these extensive efforts was to significantly increase federal program investment in Digital Libraries research and applications. There was also unanimous advice to correct a shortcoming of the earlier DLI program models: a lack of sustained, stable support for building large-scale operational systems allowing evaluation along numerous technical and social dimensions.

In the fall of 2003, federal programmatic priorities shifted, and DLI funding became controversial. There was a perception by some that DL activities had matured and therefore did not require research funding. However, proposals for DL-related activities continued to be funded through the Information Technology Research (ITR) program.

In FY2005, a research program on Digital Archiving and Preservation was established with the Library of Congress, and awards were announced in the spring of this year (http://www.digitalpreservation.gov). In addition, a new CISE/IIS solicitation named digital libraries research and archives an application focus area. The proposals received in response to this solicitation are being reviewed, and awards will be announced early in FY2006.

Critique of the DLI Program

The Digital Libraries Initiative can be thought of in terms of a funding program (the agencies' part) and as an intellectual program (the performers' part).

The intellectual program performed well beyond agency expectations – if not always in the directions that the funders had anticipated. The program funding models did not work optimally, particularly for the mid-size, longer-term, interdisciplinary research and testbed projects.

A variety of constraints, agency policies and circumstances resulted in limited funding flexibility. The program could not support all highly rated projects, and could not adequately respond to important new opportunities and discoveries that appeared during the program term, by offering additional support. Securing and maintaining stable funding was a challenge. By FY1999, more than 20 separate agency programs contributed funds. The funds came from the base budgets of programs. Program and agency co-sponsor contributions were far from reliable, due to reprogramming or restructuring within individual agencies (every 3-4 years in some cases), and new staff and shifting priorities at the Division, Directorate, Agency and Executive Branch levels.

In terms of the lifecycle for digital libraries projects, funding was generally limited to the "research and testbed/prototype building" stage. There was little available for the evaluation and testing that would create feedback loops at important stages. In the DLI projects, it was often the case that later stages of the research program – implementation and deployment – revealed important new research questions. It was noted over the term of DLI-1 and DLI-2 that for the larger research and testbed projects, productivity increased with project maturity – stable and organized research environments were able to perform more consistently and creatively, with better results.

Interdisciplinarity

Progress in Digital Libraries research is dependent upon collaboration across the disciplines. DLI was highly interdisciplinary and prospered intellectually as a result.

The Chatham Workshop Report commented that:

"Over the last few years, digital library research has become the most interdisciplinary area at NSF, including researchers from 35 different academic departments. The program has also engaged international partners, with several U.S. projects coordinated with counterpart projects in the United Kingdom and Germany, as well as with broader international projects involving the European Union and Asian countries. Moreover, the kinds of information created and examined has moved well beyond text and book-like objects to include CT-scans of fossils, images of dolphin fins, cuneiform tablets, and videos of human motion, potentially enabling more sophisticated analysis in domains that range from archaeology and paleontology to physiology, while exploring the engineering problems that such investigations expose."

It was shown in many projects that digital representations of real objects could yield much more knowledge than the originals. Once digital, these assets can be copied and distributed with great ease. Projects such as the Digital Michelangelo, Forma Urbis Romae, and the Digital Morphology Project, and others using digital modeling and analysis techniques, demonstrated this convincingly. In the Digital Morphology Project (http://www.digimorph.org/), x-ray CT-scans of fossils allow detailed examination of the external and internal structures of rare specimens. In more than one case, the project either authenticated or proved fake discoveries of "new" creatures. More recently, the same technique is proving to be effective in reading certain ancient codices without opening the covers and in digitally "unrolling" scrolls in order to reveal inscriptions.

The arts and humanities offered challenging new problems for digital libraries research because of the natural complexity of questions posed, the variety and distributed nature of relevant information, the qualitative nature of evidence, and multiplicity of possible interpretations. Information technologies proved to be stressed as much if not more by these types of inquiry as by those relying on scientific methodologies. The work informed information technology research and development, and provided impetus for new technologies. DLI projects such as the Perseus Project (http://www.perseus.tufts.edu/), the National Gallery of the Spoken Word (http://www.ngsw.org/), the Cuneiform Digital Library (http://cdli.ucla.edu/), Variations (http://www.dlib.indiana.edu/variations/), and the Digital Atheneum (http://www.digitalatheneum.org/) are just a few examples. In almost all these cases, the projects involved computer and information scientists and domain scholars.

A larger question being asked is whether knowledge should still be thought of in terms of disciplines. Some say the "digital revolution" of the past two decades has significantly eroded the intellectual division between science, the humanities and the arts. Disciplinary boundaries are seen by many as artificial – drawn by institutional and political concerns as much as by topical substance and intellectual content. In this view, disciplinary communities have become, over time, political communities as well as scholarly and intellectual communities, spawning interest groups to preserve the status-quo. While disciplinary departments in universities are viewed as playing an important role in preserving the larger ideals and practices of the Academy such as academic freedom, tenure, etc., at the same time, departmental research agendas can be overly scripted and rigid, limiting the individual freedoms of faculty (especially untenured junior faculty) to pursue topics outside of departmental boundaries.

International Activities

As the decade of the 1990s progressed, the Internet was making national boundaries transparent, and a nascent global knowledge environment had emerged and was growing by leaps and bounds. International collaboration, always encouraged in the science agencies, became even more attractive and beneficial. The Digital Libraries Initiative placed high value on international collaboration.

The reasons are apparent. Knowledge on a particular topic is not located or neatly organized in one place, and expertise in information technology and domain specialty areas is spread around the globe. Concerted coordinated international efforts to steer the development trajectories of distributed repository architectures, content representation, access frameworks, and delivery services were essential to progress. Otherwise the future global information environment would likely be a larger version of the current one, characterized by increasingly abundant data of many types, chaotic in representation and organization, unstable over time, uneven in quality, and difficult to find, retrieve, and put to productive use. International efforts were begun to build consensus, plan, develop, and implement new technology frameworks and content development practices.

Two modest DL international collaborative programs were launched and became immensely popular. The programs resulted in successful, highly leveraged projects with exceptionally broad participation and positive impact. The arrangements for these were negotiated at the program levels of the funders' organizations. NSF guidelines precluded funding for non-US organizations (except in very specific circumstances), so the strategy used for collaborative projects had two or more projects submit a single proposal to the funders, with a separate budget for each national team. The funding strategy was unusual, and for a time it was held up as an effective model for avoiding the arcane obstacles that routinely appeared as international arrangements moved up the organizational hierarchies, inevitably delaying and at times halting implementation.

The United Kingdom's Joint Information Systems Committee (JISC) was a natural first international partner for the Digital Libraries Initiative. Its activities and management approaches complemented those of the DLI in the US. JISC funded not only research, but academic information infrastructure as well, and placed significant emphasis on project management and evaluation. The JISC/DLI partnership produced two coordinated Calls for Proposals, resulting in 10 research and learning testbed projects. More information on these can be found at <http://www.jisc.ac.uk/>.

The success of the collaboration with JISC prompted additional successful collaborations with the Deutsche Forschungsgemeinschaft (DFG) of Germany, the Cultural Heritage Applications Unit of the European Commission and individual organizations from several Asian and African countries. Of particular note was the funding of more than a dozen NSF/EU working groups that met regularly over 5 years and produced authoritative reports on a variety of technical, social, cultural, and economic aspects of digital libraries. The DELOS Network of Excellence for Digital Libraries coordinated these working groups. The final reports of the working groups can be found at <http://delos-noe.iei.pi.cnr.it/>, and summaries of the reports will be published in a special issue of the International Journal of Digital Libraries this summer. The DLI international program ended in FY2003.

Impact

The impact of the DLI activities is generally viewed as broad and deep. The return on federal investment was high by any measure, and the DLI projects and activities pushed the intellectual boundaries of inquiry well beyond expectations. The question persists as to what metrics should be used to measure the value and impact of such programs. Financial payoff in the form of new products, businesses and services; contributing to the social good in terms of making life better for individuals and communities; advancing scholarship and research capabilities in terms of new tools and resources for knowledge-making; creating new forms of scholarly communication; and capturing and preserving the human record are a few that might be used.

Strong cases could be made that the DLI projects succeeded in all of the above. Innovative achievements in science, humanities, and the arts have demonstrated the potential of digital libraries technologies to advance domain research and scholarship across the disciplinary spectrum. Support for open-access, open-source materials has catalyzed new thinking and approaches for scholarly communication. Support for open systems, metadata development, persistent identifiers, resource description frameworks, ontologies, thesauri, and other related efforts advanced functional capabilities, interoperability, scaling, and federation of Internet collections. Established academic subject areas were transformed and enriched through availability of new resources and tools. One example is geographical information systems technology (GIS) and the rapidly growing corpora of spatially indexed data. Those fields that have adapted GIS technologies into their core research practices include many of the social sciences and humanities where research might depend on locating events and objects in space and observing changes over time. The Alexandria Digital Library project provided inspiration and access to GIS resources. In very recent years, the federal government has been declassifying large amounts of satellite imagery, which allows for new data to be integrated in a variety of subject areas. Several new enterprises such as World Wind (NASA) and Google Earth are demonstrating the potential of some of these data.

A number of the DLI projects captured the interest of the national and world press. A few of these are noted in Figure 3.

Image listing newsclips about DLI projects funded in Phase 2

Figure 3. DLI Project Thumbnails.

Interaction was an important component of the program. All-project meetings were held yearly and hosted by various projects. These were energetic, festive events, celebrating new research accomplishments and creativity. DLI-2 was an open program, reflecting the Internet culture. DLI meetings were often held in conjunction with larger meetings, such as CNI and JCDL, and invited participation from other agency projects, such as those of the Institute of Museum and Library Services, as well as international partners.

Digital Libraries and Cyberinfrastructure

In the late 1990s, the term cyberinfrastructure was coined, referring, initially, to the assemblage of high performance computing and networking resources generally available to researchers and educators. The National Science Foundation convened a Blue Ribbon Panel to articulate what should be meant by cyberinfrastructure in specific terms, and to lay out alternative plans for agency investment in new programs of support. The Panel succeeded extraordinarily in doing both.

A primary question to be answered was how expansively the term should be used. High performance computing and communications systems and services were central to the concept, but it was unclear whether large stores of curated, digital content, new tools and services produced as a result of digital libraries research and activities should also be included. An "information layer" had emerged as a consequence of widespread use of information infrastructure. Another view of cyberinfrastructure offered was one in which knowledge environments in the form of organized, highly-contextualized information would be the most permanent, valued and widely used resources. New generations of computing and communications technologies would flow through knowledge environments, enriching and adding content, enhancing functionality, and revealing new applications and uses which in turn would reveal the next wave of technologies required.

A chart showing how digital content is considered infrastructure in the cyberinfrastructure model

Figure 4. A layered model has often been used to describe "information infrastructure." Originally information infrastructure (sometimes referred to in different terms) included computing hardware, networking and telecommunications, memory media and the operational software to make these work together. In this geological conception of interconnected global information and telecommunications resources, the lower layers of computing systems and networks enable the higher layers of middleware, collections of digital content, and applications. This graphical representation suggests that digital content itself can and should be considered infrastructure of comparable or superior status to that accorded to computing hardware and networking.

Future Directions

The DLI projects addressed issues related to content creation, access, and use of Internet-based digital resources, and suggested and demonstrated altogether new approaches to information technologies and domain research. They have set the stage, through examples, for a renaissance in research methods and practices, scientific and cultural communication and creative representation and expression of ideas.

For this to progress, it is necessary to continue to provide support for digital libraries and domain informatics. The Atkins report also commented on the potential of digital libraries research to transform disciplines:

"NSF digital library initiatives have created new infrastructure and content of value to specific disciplines (including many in the humanities). It is important to continue such efforts through ongoing research, prototyping and experimentation with digital library technologies, development and deployment of proven solutions, and support for specific digital library repositories in disciplines represented at NSF. The potential has been barely tapped, and there is an opportunity to find and implement new mechanisms for sharing, annotating, reviewing, and disseminating knowledge."

In June 2003, the Division of Information and Intelligent Systems sponsored a workshop in Chatham, Massachusetts, involving recognized national and international scholars and researchers to help frame the long-term agenda for future digital libraries research and infrastructure activities. The Chatham workshop, organized by Ron Larsen and Howard Wactlar, produced complementary findings on the status and value of digital libraries research and recommendations on how to proceed:

"While major progress has been made in indexing, searching, streaming, analysis, summarization, and interpretation of multimedia data, the more that is accomplished exposes the more that remains to be done. Interactive environments for knowledge creation, use, and discovery need to move out of the laboratory and be broadly deployed in society. Systems for information access, delivery, and presentation are in a continual state of catch-up as they scale to the ever-increasing generative capabilities of ... information sources. Increasing demands are being placed on knowledge access, creation, use, and discovery across disciplines, and of content interpretation across linguistic, cultural, and geographic boundaries. The opportunities are unlimited, but they will remain only challenges unless a continued commitment by [funders] sustains and accelerates research into the most fundamental of our intellectual assets – information."
"In 8-10 years, a systematic science with strong theoretical underpinnings for digital library knowledge creation can be developed and validated. With appropriate rigor in the underlying science, it should be possible to make major progress toward a multilingual, multi-media, mobile, and semantics-based digital library knowledge network."

Many see it as a disturbing trend in recent years for the science agencies, including NSF, to focus on short-term results measured strictly in terms of economic gain. Both the Atkins and Chatham workshop reports point to a continued need for long-term investment in and support of more complete project lifecycles (beyond research and prototypes), including large-scale implementation and evaluation.

We are just beginning to explore the possibilities for productive use of the Internet. Highly contextualized data objects and annotated collections resulting from automated processing, and sometimes years of concerted, expert human labors, are proliferating, and the capabilities of base technologies of computation, storage, and communication devices continue to increase at exponential rates while costs of acquiring and implementing these decrease. There are qualitatively altogether new types of opportunities associated with creation, access, and use of large-scale, distributed, digital content stores that can be exploited by advanced networking and computing technologies. Better tools and more robust access frameworks are needed to realize these, and discussion and resolution of intellectual, social, and legal issues associated with selecting content and making it available must proceed in a constructive fashion.

The Internet communicates and stores human expression in its many forms: written and spoken languages, imagery, sounds, and mixed and multimedia of all sorts. It also carries the ever-increasing output of scientific instruments and sensors that monitor astronomical events, earth systems, microscopic processes, and people's lives. It does so with notorious indifference to the substance and meaning of the content communicated. All is treated in the same fashion, be it sublime, instructional, practical, inane or offensive.

The Chatham Report states that:

"Digital libraries were envisioned a decade ago as the answer to networked knowledge environments. Much progress has been made towards that goal, but in its pursuit, the goal has transformed into something much more dynamic than was originally envisioned."

Digital libraries research and related activities are key in determining the future content, value, and use of global knowledge infrastructure.

Notes

[1] The views expressed are those of the author and do not reflect NSF Official Policy.

[2] Two technologies already developed had stimulated his thinking. In 1945, Vannevar Bush wrote an article entitled "As We May Think," which discussed the Memex system for storing information and relationships. Others pursued Bush's ideas and developed hypertext systems for new, nonlinear formatted documents. These pioneers included Ted Nelson and Doug Engelbart.

References

HPCC/IITA Digital Libraries Workshop
Interoperability, Scaling, and the Digital Libraries Research Agenda [C. Lynch, H. Garcia-Molina]
[http://diglib.stanford.edu/diglib/pub/reports/iita-dlw/main.html]

Santa Fe Planning Workshop on Research Agenda for Distributed Knowledge Work Environments
Santa Fe, NM [D. Atkins]
[http://www.si.umich.edu/SantaFe/]

Report to the President on Digital Libraries: Universal Access to Human Knowledge (February 9, 2001)
[http://www.nitrd.gov/pubs/pitac/pitac-dl-9feb01.pdf]

Revolutionizing Science and Engineering Through Cyberinfrastructure: Report of the National Science Foundation Blue-Ribbon Advisory Panel on Cyberinfrastructure
[D. Atkins, Chair]
[http://www.communitytechnology.org/nsf_ci_report/]

Knowledge Lost in Information - Research Directions For Digital Libraries
Chatham, MA [R. Larsen, H. Wactlar]
[http://www.sis.pitt.edu/~dlwkshop/report.pdf]

On July 19, 2005, the reference for the PITAC report and the link to the report were corrected.


doi:10.1045/july2005-griffin