D-Lib Magazine
Andreas Rauber
The 7th International Workshop on Web Archiving (IWAW 2007) was held as part of JCDL, the Joint Conference on Digital Libraries, in Vancouver, Canada, on June 29, 2007. This was the first time IWAW was held outside Europe, a move intended to better reflect the needs of the widely distributed international community. Almost 40 participants attended this year, predominantly from the USA, but also from Europe and Asia. The workshop program featured updates on the current state of development of a range of open-source tools for web archiving and best-practice reports, as well as novel research results and work-in-progress presentations.

Following the opening and welcome by workshop co-chair Julien Masanès, Brad Tofel presented detailed background information on the Wayback Machine, software for accessing and navigating web archives. He covered both the software and the internal system set-up for large-scale web archives. Paul Wu presented new results on how to annotate and catalog web archive content: based on the Singapore Web Archive holdings, modules for change detection are integrated and combined with WERA search and WAWI annotation services. Megan Dougherty presented the Wayfinder interface to web archives, showing examples from a web archive on political web campaigning in the US, and Jan Askhoej presented an approach to combining content management systems and records management systems, including the conversions needed to ease the creation of records archiving systems.

Frank McCown presented an extension to the Warrick crawler, which can be used to recover websites from a range of web caches, such as those of various search engines. The extension, called Brass, allows jobs to be queued and scheduled from a central site. Martin Klein presented a case study on how to enrich repository data from the NASA Langley Research Center Atmospheric Sciences Data Center with web material.
This was followed by a presentation by Joan Smith, who proposed a new metadata model for self-describing web resources, based on server-side modules that provide meta-information on request. As web archive holdings tend to be massive, David Minor presented approaches taken at the San Diego Supercomputer Center to improve the indexing of these collections. Different configurations of Wayback instances were evaluated in a case study with the Library of Congress, as was the use of SRB containers rather than ARC files for storage in a project within the National Science Digital Library Program.

In the last session, two case studies of web archives were presented: the UCLA Online Campaign Literature Archive, by Gabriella Gray and Scott Martin, and an approach to preserving videos from the upcoming 2008 presidential election campaign, which tackles the difficult challenge of harvesting videos as well as storing the relevant context.

The scientific sessions of the workshop were followed by a session featuring system demos and updates on projects. Gordon Mohr presented current developments in various open-source tools such as Heritrix, the Wayback Machine and NutchWax, and reported on the current status of the initiative to establish the WARC format as an ISO standard. Tracy Seneca presented the Web Archiving Services developed as part of the WebAtRisk project, while Mark Middleton demonstrated the Web Archiving Services offered by HanzoArchives.

All presentations and papers from the 7th International Workshop on Web Archiving are available at the workshop website (http://www.iwaw.net/07/). The author and his workshop co-chair would like to thank all participants for their active participation and the intensive discussion during the workshop. We also invite those who are interested in web archiving to attend the 8th International Workshop on Web Archiving, which will be held in Europe again, most probably in September 2008.
Details of that workshop will appear on the IWAW website (http://www.iwaw.net/) at a later date.

Copyright © 2007 Andreas Rauber
D-Lib Magazine Access Terms and Conditions doi:10.1045/september2007-rauber