D-Lib Magazine
Enhancing Infrastructure for OAI: the DLGrid
Contributed by:

In the DLGrid project, supported by The Andrew W. Mellon Foundation, we plan to enhance key infrastructure components of the OAI (Open Archives Initiative) framework by building a 'Grid' of digital libraries and a cluster computer to support a high performance OAI-federated search service. The Grid is an emerging infrastructure technology that enables the integrated, collaborative use of high-end computers, networks, and databases owned by multiple organizations. Recently, there has been interest in using the Grid for managing large data sets by creating and storing descriptive metadata, which is used for discovery [Sin03].

Google does an incredible job at providing discovery services of the 'shallow' web to the general public; we envision a similar quality, sustainable, free discovery service for students and researchers for parts of the 'deep' web [Ber01]. The parts of the deep web we refer to in this vision are digital libraries and collections that are exposing their metadata using OAI-PMH (Protocol for Metadata Harvesting). A high performance federated search service that exploits the resources of a Grid will make available a large amount of information that is distributed amongst heterogeneous digital libraries. A search user will be able to access a research paper, a preprint, a technical report, an image of a renowned painting, or a musical performance in a few seconds from thousands of libraries scattered all over the world.

Assuming that a rapid increase (e.g., several orders of magnitude) in the adoption of OAI-PMH occurs, we now have a different problem: how to efficiently discover, harvest and index the burgeoning OAI-PMH corpus. Since grid nodes by definition have unused capacity, no new hardware needs to be acquired and we can, in essence, piggyback the onus of maintaining the infrastructure on the efforts to maintain the Grid. The second advantage of this approach is availability of the service.
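To make the harvesting task mentioned above more concrete, here is a minimal Python sketch of how a harvester might form OAI-PMH ListRecords requests and follow resumption tokens, and how a set of archives could be spread over grid nodes. The ListRecords verb, the metadataPrefix and resumptionToken arguments, and the XML namespace come from the OAI-PMH 2.0 protocol. The function names, the example base URL, and the round-robin assignment are illustrative assumptions, not the DLGrid design.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespace used by OAI-PMH 2.0 responses.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for one archive (hypothetical helper).

    In OAI-PMH, resumptionToken is an exclusive argument: when it is
    present, it is sent with the verb and nothing else.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def next_token(response_xml):
    """Return the resumption token in a response, or None if the list is complete."""
    root = ET.fromstring(response_xml)
    token = root.find(f".//{OAI_NS}resumptionToken")
    if token is None or not (token.text or "").strip():
        return None  # a missing or empty token means there are no more pages
    return token.text.strip()


def assign_archives(archives, nodes):
    """Assumed scheme: spread archives over grid nodes round-robin."""
    plan = {node: [] for node in nodes}
    for i, archive in enumerate(archives):
        plan[nodes[i % len(nodes)]].append(archive)
    return plan
```

A harvester on each node would fetch the URL for each of its assigned archives, index the returned records, and repeat with the token from next_token until it returns None.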
The current Arc, a service federating existing OAI data providers, runs on a single processor without any redundancy, and its performance has deteriorated rapidly over the last year due to an increase in the number of archives it harvests. In the new approach, we plan to use hardware redundancy by exploiting Grid technology for harvesting. For searching, we plan to exploit parallelism by partitioning the indices amongst a cluster of PCs. A user query will be executed in parallel across these partitions, resulting in high performance. To support parallel indexing and searching, we will extend the open source Apache Jakarta Lucene search engine.

This project has just started and is expected to be completed with a feasibility testbed of a dozen grid nodes that will harvest the current Arc members, and a search cluster of 16 nodes that will provide parallel indexing and query merging. More details can be found at our ODU DL research group site (http://dlib.cs.odu.edu) under the DLGrid project.

References

[Ber01] Bergman, "The Deep Web: Surfacing Hidden Value", Journal of Electronic Publishing, 7(1), <http://www.press.umich.edu/jep/07-01/bergman.html>.

[Sin03] G. Singh, S. Bharathi, A. Chervenak, E. Deelman, C. Kesselman, M. Manohar, S. Patil, L. Pearlman, "A Metadata Catalog Service for Data Intensive Applications", to appear in Proceedings of Supercomputing 2003 (SC2003), November 2003.

The Lawpaths Project
Contributed by:

Lawpaths (http://library.kent.ac.uk/library/lawpaths/default.htm) is a three year project (2002-2005) funded by the JISC (http://www.jisc.ac.uk/) under the Exchange for Learning (X4L) programme (http://www.jisc.ac.uk/index.cfm?name=programme_x4l). Lawpaths intends to provide a resource bank of legal information skills materials such as tutorials, workbooks and pathfinders for use by law librarians and law teachers. Lawpaths' project partners are the University of Kent, the Institute of Advanced Legal Studies and the UK Centre for Legal Education, in collaboration with the University of Bristol, Cardiff University and the Institute for Learning and Research Technology. The project director, Sarah Carter, is law librarian at the University of Kent.

In Higher Education (HE) institutions, law librarians are required to teach students legal information skills, which enable students to conduct legal research. In Further Education (FE), librarians or law teachers may undertake this instruction. In particular, students face the problem of knowing which of the many available law resources (hard copy and electronic) to select. Law librarians and law teachers have been creating legal information skills guides (e.g., how to locate cases) for students locally; in effect, their efforts are being duplicated, as staff around the country are creating similar types of guides. These guides may be in hard copy or, increasingly, in electronic format. They need to be updated frequently, as the currency of legal information is vital, and this is time consuming. Further, law librarians may themselves not have a legal background, so they will have to acquire legal knowledge. In many institutions, librarians may also have responsibility for other subjects as well as law.

Lawpaths will support the teaching of legal information skills by providing a resource bank of legal information skills guides.
The materials are aimed at law librarians and law teachers in HE and, to a lesser extent, FE. The project will explore the potential to make these materials customisable so that institutions can adapt them for use within their local learning environments. Customisation could include inserting institutional logos or details of local holdings and subscriptions. This will reduce the need for law librarians and law teachers to produce guides themselves.

The resource bank of guides will be freely accessible through a searchable database (currently in development) on the Lawpaths website. A user community database is also being developed; users of Lawpaths' resources will be required to register on this. We anticipate the user community database will have three benefits. Firstly, registered users will be sent updates on Lawpaths. Secondly, the project will be able to track user statistics, which will allow us to evaluate and remodel our service accordingly. Thirdly, we hope that users themselves will contribute guides they consider to demonstrate good practice in teaching legal information skills.

Finally, electronic datasets, such as LexisNexis and Westlaw, are important resources for law students. Many of the locally produced legal information skills guides that we found through our research were guides to such services. Lawpaths has approached the service providers of three legal datasets to create guides to their products. These would be freely available, maintained by the service providers and accessible without needing to access the datasets. The response from the service providers has been encouraging so far. Our research amongst the user community in HE, in particular, indicates that this development would be a significant output from the Lawpaths project.

SURF/DARE Funding for 'Community Website for SCHOLAR(S)' Project
Contributed by:

The SURF Foundation has awarded funding to the 'Community website for SCHOLAR(S)' project of the Universiteit van Amsterdam (UvA). This is a result of a competitive call for tender for service projects of the national Dutch DARE (Digital Academic Repositories) initiative. See the SURF/DARE website: <http://www.surf.nl/en/themas/index2.php?oid=7>.

A community website for scholars

A community website for SCHOLAR

Building on the Digital Academic Repository of the UvA

The project starts in September 2004 and ends in February 2006 and is directed by the University Library of Amsterdam (UvA).
For more information, please see:

Visual Arts Image Collections Now Online: AHDS Visual Arts
Contributed by:

AHDS Visual Arts has recently added four new image collections to its catalogue of high quality online resources. The collections feature a range of visual arts subject areas including painting, textiles, crafts, printmaking and photography.
These collections are in addition to over 23,000 high-quality images available for educational use at AHDS Visual Arts. Search the AHDS Visual Arts catalogue at <http://vads.ahds.ac.uk/>.

In the News

Excerpts from Recent Press Releases and Announcements

Digital Preservation Program Launches Research Grants Initiative

Library of Congress Partners with National Science Foundation to Fund Advanced Research into Preservation of Digital Materials

"The National Digital Information Infrastructure and Preservation Program of the Library of Congress (NDIIPP) is partnering with the National Science Foundation (NSF) to establish the first research grants program to specifically address the preservation of digital materials. NSF will administer the program, which will fund cutting-edge research to support the long-term management of digital information. This effort is part of the Library's collaborative program to implement a national digital preservation strategy."

"'One of the most critical issues we face in the preservation of digital materials is a need for better technology and methods to manage these objects over long periods of time,' said Associate Librarian for Strategic Initiatives Laura E. Campbell, who is directing this initiative for the Library. 'We are very pleased to be working with the National Science Foundation to encourage important research breakthroughs. This will help the Library of Congress, as well as our network of partners who are working with us, to preserve America's digital heritage for future generations.'"

"The research program announcement coincides with the signing today of a memorandum of understanding between the Library of Congress and NSF to collaborate over the next decade in a broad set of research activities related to digital libraries and digital archives. The formalized collaboration arose from a joint Library of Congress and NSF workshop in April 2002 that developed a research agenda in these areas.
Through their leadership, NSF and the Library will encourage other government agencies to continue research support for improving the state of knowledge and practice of digital libraries and digital archiving."

"The new Digital Archiving and Long-Term Preservation research program, which expects to make approximately $2 million in initial awards using NDIIPP funds, has three main focus areas for which proposals are sought:
The NSF Directorate for Computer and Information Science and Engineering, Division of Information and Intelligent Systems, will issue a call for proposals shortly; check the NSF Web site at <http://www.cise.nsf.gov/div/index.cfm?div=iis> for current information."

For more information, please see the press release at <http://www.loc.gov/today/pr/2004/04-125.html>.

JHU Press Grants Rights to Authors

June 14, 2004 - Johns Hopkins University (JHU) Press

"JHU Press is making revisions to its standard journal agreement, clarifying some personal re-use rights. The new statement will tell authors: You have the following nonexclusive rights: (1) to use the Article in your own teaching activities; (2) to publish the Article, or permit its publication, as a part of any book you may write; (3) to include the Article in your own personal or departmental database or on-line site; (4) to include the Article in your institutional database provided the database does not directly compete with either the Johns Hopkins University Press or Project Muse, is non-commercial, is institution specific and not a repository that is disciplined based and/or accepts contributions from outside the institution. For use (4), you agree to request prior permission from the Press. For all rights granted in this paragraph, you agree to credit the Press as publisher and copyright holder."

For more information, please see <http://openaccess.jhmi.edu/news/index.cfm>.

Library of Congress Announces Joint Digital Preservation Project with Four Universities

Library to Work with Old Dominion, Johns Hopkins, Stanford and Harvard Universities

June 9, 2004 - "The Library of Congress has entered into a joint digital preservation project with Old Dominion University, Department of Computer Science; The Johns Hopkins University, Sheridan Libraries; Stanford University Libraries & Academic Information Resources; and Harvard University Library to explore strategies for the ingest and preservation of digital archives.
The project is supported by Information Systems Support Inc."

"The Archive Ingest and Handling Test (AIHT) is designed to identify, document and disseminate working methods for preserving the nation's increasingly important digital cultural materials, as well as to identify areas that may require further research or development. The AIHT is part of an initiative, led by the Library of Congress, to build a network of preservation partners through the National Digital Information Infrastructure and Preservation Program (NDIIPP)."

"The AIHT participants are investigating and applying various digital preservation strategies, using a digital archive donated to the Library by the Center for History and New Media at George Mason University. The archive is a collection of 57,000 digital images, text, audio and video related to the Sept. 11, 2001 events. The transfer of these 12 gigabytes of digital content is being used to emulate the problems that arise in digital preservation and to test possible solutions."

"Participants in the AIHT range from fully operational repositories to an advanced research project investigating methods for preserving digital objects; additionally one institution is comparing multiple technical solutions within one environment. A broad array of current open-source and proprietary digital-object management and preservation technologies are deployed in the test...."

"...At the end of the 12-month test, the Library and its partners will publish a final report detailing both current practices for digital preservation and future areas of research. Further information about the project will be posted periodically at <http://www.digitalpreservation.gov>."

For more information, please see the press release at <http://www.digitalpreservation.gov/about/pr_060904.html>.
CrossRef and Atypon Announce Forward Linking Service

June 8, 2004 - "CrossRef, the cross-publisher reference linking service, and technology partner Atypon announced today the launch of CrossRef's new Forward Linking service."

"In addition to using CrossRef to create outbound links from their references, CrossRef member publishers can now retrieve "cited-by" links (links to other articles that cite their content). This new service is being offered as an optional tool to allow CrossRef members to display cited-by links in the primary content that they publish."

For more information, please see the full press release at <http://www.crossref.org/01company/pr/press20040608.html>.

ALCTS Presidential Citation awarded to "Pattern Recognition," the 2003 OCLC Environmental Scan

June 8, 2004 - "Cathy De Rosa, Lorcan Dempsey and Alane Wilson have been awarded the Presidential Citation of the Association for Library Collections & Technical Services for "Pattern Recognition," the 2003 OCLC Environmental Scan. The award will be presented by Brian Schottlaender, ALCTS President, at the June 27 ALCTS Membership Meeting and Awards Ceremony during the American Library Association Annual Conference in Orlando, Florida."

"Each year the ALCTS President has the opportunity to award citations to individuals whose contributions do not fit the criteria used for other standing awards. Mr. Schottlaender termed "Pattern Recognition" as 'far more than the 'typical' environmental scan...[it] is destined to become a classic in the field. I join my ALCTS colleagues in extending our congratulations to Lorcan and his OCLC colleagues on a most excellent piece of work!'"

For more information, please see the press release at <http://www.oclc.org/research/announcements/2004-06-08.htm>.

Cataloger's Desktop Now Available on the Web

June 7, 2004 - "The Cataloging Distribution Service (CDS) of the Library of Congress has introduced a Web-accessible version of its widely used cataloging tool, Cataloger's Desktop.
The new fee-for-service subscription product for catalogers can now be accessed 24 hours a day, seven days a week, and provides the most widely used cataloging documentation resources in an integrated, online system. Also, the new product includes the current version of the Anglo-American Cataloguing Rules, 2nd edition (AACR2)."

"'Cataloger's Desktop on the Web represents the Library of Congress' successful initiative to offer catalogers everywhere a highly useful and reliable Web cataloging tool,' said Kathryn Mendenhall, chief of CDS. 'It incorporates all of the indispensable cataloging publications catalogers use daily, but were previously available only in a large number of individual print editions or as a quarterly CD-ROM disc from CDS.'..."

"...The Cataloging Distribution Service of the Library of Congress has provided technical library publications and services to the international cataloging community on a cost-recovery basis for more than 100 years."

For more information, please see <http://www.loc.gov/today/pr/2004/04-118.html>.

ISO Publication of the MPEG Rights Data Dictionary Standard, ISO/IEC 21000-6

June 3, 2004 - "The Contecs:DD Consortium (comprising the International DOI Foundation, Melodies and Memories Global Ltd, the Motion Picture Association, the Recording Industry Association of America and Rightscom Ltd), the International Publishers Association, the Association of American Publishers, The International Scientific, Technical and Medical Publishers, and the Publishers Association (UK) and EDItEUR welcome the publication by ISO of the MPEG Rights Data Dictionary standard, ISO/IEC 21000-6."

"The ISO MPEG Rights Data Dictionary (RDD) provides the basis for a resource to create widely understood, consistent meaning for Digital Rights Management (DRM) systems.
The RDD will be available to all those involved in building and deploying DRM systems to ensure that the terms used in permissions granted by the systems can be interpreted consistently."

"The RDD specification is intended to support the ISO MPEG Rights Expression Language. Together, the two standards equip the DRM world with interoperable ways of expressing rules and interoperable ways of communicating the meaning of these rules. This is to the benefit of rights holders, who can be confident that business rules can be interpreted consistently, and to consumers, who can be sure that usage permissions will not vary unpredictably across different systems. Manufacturers can also benefit because their systems can interoperate without detailed negotiation on the interpretation of permissions."

For more information, please see the full press release at <http://www.doi.org/news/contecsdd_standard_pr.html>.

Elsevier permits postprint archiving

June 2, 2004 - Excerpt from SPARC Open Access Newsletter, Issue 74.

"In a May 27 email to Stevan Harnad, Karen Hunter announced an important change of policy at Elsevier. The world's largest publisher of scientific and scholarly journals now permits postprint archiving. Elsevier authors may now provide open access to the final editions of their full-text articles by posting them to their personal web sites or their institutional repositories. They may not deposit them to repositories elsewhere. The archived or OA edition must be author-made, not Elsevier's PDF or HTML, and must include a link either to the journal's home page or the article's DOI. Hunter is Elsevier's Senior Vice President for Strategy...."

"...See Stevan Harnad's listserv announcement, quoting Karen Hunter's email with comments of his own. <https://mx2.arl.org/Lists/SPARC-OAForum/Message/759.html>."

For more information, please see SPARC Open Access Newsletter, Issue 74 at <http://www.earlham.edu/~peters/fos/newsletter/06-02-04.htm>.
Nancy Davenport Named President of CLIR

June 3, 2004 - "The Board of the Council on Library and Information Resources (CLIR) is pleased to announce the appointment of Nancy Davenport as its president, effective July 5, 2004."

"Ms. Davenport has served for twenty-six years in the Library of Congress, where she has had very broad experience and held several leadership positions. She is currently Director of Acquisitions at the Library. Previously, she was Head of the Congressional Research Service Inquiry Section, Coordinator of Member and Committee Relations for the CRS, and Director of Special Programs. Over the years, the Library has turned to her to direct the divisions of Rare Books and Special Collections, Prints and Photographs, and the CRS while the Library searched for permanent directors. From 1990 to 1997, Nancy directed a training program for librarians in the new democratic states of Central and Eastern Europe which was sponsored by Congress and carried out by the Library of Congress."

For more information, please see the full press release at <http://www.clir.org/pubs/press/04davenportpr.html>.

Copyright 2004 © Corporation for National Research Initiatives
doi:10.1045/june2004-inbrief