New Center at Columbia University for Digital Library Research
Fostering Interdisciplinary Research and Bridging Cultural Clashes
Judith L. Klavans
Director, Center for Research on Information Access
Columbia University
klavans@cs.columbia.edu
D-Lib Magazine, March 1996
ISSN 1082-9873
The purpose of this article on the Center for Research on Information
Access (CRIA) is to present Columbia University's creative solutions
to ongoing problems which, in the view of many, are blocking the
success of digital libraries as an interdisciplinary effort. Columbia
University has provided a unique and, we believe, visionary resolution
to this predicament. The thrust of this article is to relate the reasoning
behind our solutions. Other articles in
D-Lib will give reports on the extensive set of successful digital
library projects at Columbia University. Columbia University houses
one of the largest and most effective production digital library
efforts in the country and among the most advanced technology research
efforts as well. However, the topic of this article is
research on the integration of production-oriented and
research-oriented digital library projects.
Goals of the Center for Research on Information Access
The Center for Research on Information
Access (CRIA) at Columbia
University was founded in January 1995 with the goal of integrating
ongoing digital library projects, both within
the university itself and between the university and its industry,
educational, and foundation partners.
The new center
has been actively involved in building
further integration among existing digital library
projects already underway throughout Columbia University and in
initiating new interdisciplinary projects reaching both within and
outside Columbia University. Participants represent the University
Libraries, the Academic Computing division, the School of Engineering
and Applied Science, the Lamont-Doherty Earth Observatory, the
Columbia-Presbyterian Medical Center, the Office of the Provost, the
Institute for Learning Technologies at Teachers College, and several
academic departments.
We are
currently establishing corporate partners, as well as seeking
government and foundation funding.
The Center for Research on Information Access
(CRIA) is initially funded through the Columbia University Provost's
Investment Fund and the Strategic Initiative Fund as a university-wide
resource to establish collaborations between departments and divisions
working on similar projects.
The Center for Research on Information Access is committed to
facilitating connections between projects for improving instruction,
for developing new electronically available resources, and for
exploring new technologies. CRIA is housed within the Columbia University
Libraries with close links to the
Computer Science Department.
Why Establish a New Center?
One of the most fundamental problems inhibiting the deployment of
digital libraries is lack of integration. This problem is not
surprising, given the fact that a successful digital library must
reach across established disciplines representing different cultures
that have not traditionally been partners. This problem is not a
secret; in fact, it is a well-known although often unacknowledged
issue. Indeed, the 1995 Annual ACM SIGIR
Conference on Research and Development in Information Retrieval
featured a provocative and controversial keynote
address by Professor Terry Winograd, from Stanford University's
Department of Computer Science, entitled ``Digital vs. Libraries: Bridging
the Two Cultures'', which addressed some of these points. Some of the
controversy, as Winograd suggests, is embodied in the dispute underway in
the URC/URN community and the libraries standardization community.
But some of the controversy runs deeper, involving deeply-rooted and
often discrepant attitudes toward the organization of information, the
presentation of information, the development and use of ontologies and
other classification schemata, and the role of the human in the
information access process.
A realistic look at successful digital library projects reveals
this rift. Certain projects have come from communities
functioning independently of the knowledge and resources of other
disciplines, and the result is a narrowly successful project. This is
not the place to list the successes and failures, since most of us
working within the digital library community could easily create a list
for ourselves. Rather, the purpose is to highlight the fact that
the collaborations are, as Winograd stated (op. cit., p. 2), ``fruitful
and also at times frustrating.''
The key question to ask is whether such differences are insurmountable.
If so, then there may be no hope of achieving the goal of making as
much digital information available to as many people as possible, with
effective user interfaces, useful access tools, acceptable networking
speed, and so on. But if such differences are surmountable,
then with some effort at integration between researchers and
practitioners, there is promise of achieving these goals.
The premise of CRIA is that integration between projects, across
cultures, is essential to success. The objectives of CRIA are to
enable seamless coordination, and to minimize cultural differences,
for the purpose of achieving the common goal. As such, CRIA is
organized with four related committees:
- The Research Advisory Committee
The purpose of the Research Advisory Committee is to suggest research
directions for CRIA, to oversee the research, and to participate in
finding funding for this research. This last point involves
identifying appropriate funding sources, identifying people who should
be involved in proposal efforts, and identifying appropriate topics
for funding. The chair of the Research Advisory Committee is
Professor Kathleen McKeown. Members of the committee include representatives
from libraries administration, academic computing, the department of
computer science, the department of electrical engineering, the
Columbia-Presbyterian Medical Center, the Lamont-Doherty Earth Observatory, and
the Teachers College Institute for Learning Technologies.
- The Intellectual Property Committee
The role of the Intellectual Property Committee is to bring together
related projects at Columbia, and to develop a strategy to manage the
difficult questions, both theoretical and practical, arising from
digital data distribution. The Intellectual Property Committee is
chaired by James Hoover, Professor of Law and Law Librarian at Columbia
University. Members represent a wide range of research and applications
interests at Columbia, including legal counsel, libraries, academic departments
such as art history, instructional staff, academic computing, and the
central administration.
- The Internal Advisory Committee
The role of the Internal Advisory Committee is to act as an administrative
guide within the Columbia University Community. This committee is
chaired by Dr. Elaine Sloan, Vice President for Information Services
and University Librarian.
- The External Advisory Committee
The role of the External Advisory Committee is to
provide input on research projects and to
suggest future research directions. An annual one-day meeting is held on the
campus of Columbia University for presentations and discussion of CRIA
activities, and for input on future research directions.
The External Advisory Committee attends along with the other three committees of the Center.
The first meeting will be in June 1996. Members are drawn from academia, industry,
government, and professional organizations with diverse backgrounds and interests
including librarianship, computer science, engineering, and law.
Digital Library Projects at Columbia University
Columbia University houses a range of digital library projects
throughout the university. The University Libraries and the Academic
Computing division have a working digital collection and sets of
production services integrated within the university. Columbia has a
significant operational effort, and is advancing the technology and
applications with a set of digital library research projects; indeed,
this is the function of CRIA within the information services division
of the University. The School of Engineering and Applied Science
also conducts research in the new technologies necessary to achieve the digital
library. Projects span natural language generation for summarization, feature-based image and video
search and retrieval, visualization of complex data, and improved search and retrieval algorithms. CRIA
has taken an active role in initiating new projects
linking key research in the School of Engineering with ongoing
operations-oriented research throughout the University.
Relevant URL's include:
Projects also abound in numerous academic departments; these can be found on
the above pages.
CRIA and Digital Library Projects
Since CRIA is an integrating organization, our role is to ensure
interdisciplinary involvement in both ongoing and newly initiated
projects. CRIA has been involved in a wide range of new and ongoing
research and development projects, many of which bridge the various
cultures that come together in achieving successful digital libraries.
Some of the projects CRIA has taken an active role in
initiating include:
- Columbia Multimedia Presentation System (COMPRESS) - to develop
novel techniques for the efficient extraction, timely delivery, and
effective presentation of information from increasingly large numbers
of heterogeneous knowledge repositories. The initial testbed will be
in a health care delivery domain, with subsequent development for and
application to earth sciences education. (Partners: Department of Computer
Science, Columbia-Presbyterian Medical Center, CRIA)
- JANUS Digital Library - to continue development of a project for
the content development of legal documents. (Partners: CRIA, Law Library,
Academic Computing)
- Using Electronic Library Resources in Education - to develop new
strategies for teachers to utilize the resources now available
electronically; project within high schools in the Harlem
Environmental Access Program. (Partners: CRIA, Lamont-Doherty Earth
Observatory, Teachers College)
- Yiddish Language Data Base - to convert the contents of the
largest archive of recorded data of Ashkenazic Yiddish into
digitally available format. (Partners: Libraries, Academic Computing,
Department of Germanic Languages, CRIA)
- Domain Independent Summarization -
to develop a multi-level process of summarization for the presentation of
information available during browsing or search. (Partners:
Department of Computer Science and CRIA)
- Search engine evaluation and improvement - to establish the
infrastructure for technology developments in algorithms to be
tested within the Columbia University production systems. (Partners:
Department of Computer Science, Academic Computing, CRIA)
Although Columbia University has always enjoyed strong ties between disciplines
and divisions, the unique challenge presented by the digital library has been
effectively addressed by establishing a unifying research center.
For more information, see our web pages at http://www.cs.columbia.edu/~klavans/cria.html.
Copyright © 1996 Judith L. Klavans
hdl://cnri.dlib/march96-klavans