Issues in Science and Technology Librarianship
Winter 1999
Stocking Your GIS Data Library
Jennifer Stone
Geographic Information Systems Librarian
Map Collection and Cartographic Information Services
University of Washington
jnstone@u.washington.edu
Abstract
So, now that you've got the machines and the software, where does the data
come from? Data discovery and acquisition can be the most time-consuming
part of GIS projects, whether hunting down and purchasing already existing
data or creating your own. There are also the issues of documentation and
metadata to consider before making an acquisition. A clear understanding
of your user group is necessary to know what, exactly, to stock your
library with. By using a combination of data from local, state, and
federal government sources, plus data created locally and produced by
vendors, your collection can be rounded out to serve a diverse user base.
The University of Washington will be used as an example. The article will
also look at some of the collection development literature concerning both
traditional and digital formats.
Introduction
The library literature commonly reports strong growth in the use of geographic
information systems (GIS) in libraries. A few libraries have been using GIS or
collecting digital geospatial data for several years, and many of us aspire to
emulate them. There is also an increasing number of librarians who
have been thrown into the "GIS fire," so to speak: those who have purchased or been
given the machinery and software to offer GIS services in the library, and must
begin offering services. The acquisition of hardware and software is the easier
part of establishing GIS services in your library, however. Figuring out what data
to collect for your users can be daunting. There is an overwhelming amount of GIS
data available, with more made available almost daily. This article presents a
series of issues to consider when determining how and what to collect for your
library.
The best place to start is with traditional collection development policies for map
libraries, or libraries collecting cartographic materials. Typically, digital
materials should reflect the traditional collection. At the University of
Washington, the Map Collection is global, with emphasis on the United States, the
Pacific Northwest, Washington State, Western Washington, and the Puget Sound area.
Our digital collection reflects the same geographic emphasis. (Highlights from the
collection development policy for the University of Washington Map Collection are
online at http://www.lib.washington.edu/maps/colldev.html).
In the third edition of Map Librarianship: An Introduction, Mary
Larsgaard talks about materials that can be easily obtained in both digital and
traditional formats: reference and thematic maps of the world, maps of continents
and nations, topographic maps of the world, world atlases, state atlases, aerial
photos, monographs and serials; as well as outline and basemaps (Larsgaard 1998).
When establishing GIS operations, this kind of basic information will provide a
solid foundation.
The User Community
To figure out which of the basic materials mentioned above are a good starting
point, it is helpful to be familiar with the user community. The librarian in
charge of GIS needs to understand who in the institution is using GIS, what they
are trying to accomplish, and what services they need help with. Larsgaard talks
about paying attention to users' needs and use patterns to help determine what data
to collect. Being active in local user groups and talking to individual departments
within the institution will provide an initial understanding of the library's
patrons. Larsgaard recommends writing and updating an acquisition policy or a
collection development policy, "based on the type of clientele (which is dictated
by the kind of institution), the information needs of the users, and miscellaneous
matters such as consortial agreements and the proximity or lack thereof to other
collections" (Larsgaard 1998).
In some institutions, there is also going to be a wide range of users, from
beginners (which may include yourself) to those who have helped write some of the
existing programs on the market. Arc/Info, after all -- one of the industry's key
software packages -- is 18 years old, and the digital data collection will have to
support these expert users as well as the novices.
At the University of Washington, the librarians working with GIS data have for
years been members of the University of Washington Consortium on Geographic
Information and Analysis (UWCGIA). This group includes GIS users from all over the
campus, and meets on a regular basis to discuss various research applications of
GIS. Non-campus users also come to talk to the group, and there is a listserv
available, which has members from campus and around the Seattle area. The listserv
is an excellent way to disseminate information to a widespread group of people -- it
is used for meeting announcements, to poll campus users, to post jobs both on
campus and off, and serves as a technical support outlet for hardware- and
software-related questions as well as for those seeking data. The librarians have
also been involved with individual classes and departments, helping develop class
assignments and giving workshops on finding and using digital geospatial data.
Input from these workshops has helped the library understand users' activities and
needs.
Data Discovery
Once it is understood who is using GIS and what they are trying to accomplish, it
will become clearer where to develop the collection. Melissa Lamont lists several
sources for GIS data, including the United States government, state and local
governments, researchers on campus, local GIS firms, utility companies, real estate
firms, and the Internet (Lamont 1997). U.S. government agencies produce a
wealth of information, much of which is freely available over the Internet, that is
standard for use in GIS projects. Local and regional governments offer the benefit
of larger-scale datasets for the immediate area. Local and regional agencies will
be the ones most likely to produce data showing features such as bus stops or bike
paths, as opposed to smaller-scale, statewide features such as national parks. For
those in an academic environment, it is possible that various campus departments
have been using geographic information systems for years -- these departments and
researchers are likely to have a wealth of information, much of which can be
shared. These are also the people who are well connected in the local GIS
community; talking with them frequently will help keep the GIS staff abreast of
local GIS activity.
Another good source of data and GIS information is through partnerships with other
agencies and groups. Carolyn Argentati states that "Partnerships and grants linking
libraries with governmental and commercial organizations have offered opportunities
for collaboration on service models and the development of large data collections
and new access tools. These extensive collections of digital spatial data are being
organized and made available via the Internet frequently with a regional or local
focus that is relevant to a library's primary constituency." She continues: "One
strong partnership often leads to others and to additional contacts with people and
organizations engaged or interested in GIS" (Argentati 1997). Being involved in
the local community is a good way to start in on these partnerships.
Cost, Delivery, Format
Regardless of the source, cost is always an issue. Federal government sources tend
to provide data less expensively than commercial sources, but this is not always
the case. Different Internet sites offer data for free, for a small fee, or for a
large fee, following the same structure as non-electronic acquisition. If
applicable, request the data as an educational institution and inquire about other
possible discounts. Try negotiating to receive the data for free: the University of
Washington has had success with local and regional governments providing the
Libraries with data for free or at very low cost. The UW Libraries have also
acquired data by providing blank recording media in return. Another issue to
consider is delivery:
can it be sent via post, or do you have to download it? If you must download it, be
prepared to encounter different compression formats, space limitations, and perhaps
a long download time. Having a directory already specified, knowing space
limitations and keeping drives uncluttered will make it easier to retrieve data
from the web.
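The preparation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the headroom factor of 2 (extra space for unpacking a compressed file), and the choice of archive formats to detect are not from any particular library workflow.

```python
import shutil
import tarfile
import zipfile


def has_room_for(download_bytes, target_dir, headroom=2.0):
    """Return True if target_dir has space for the download plus unpacking.

    headroom is an assumed multiplier: compressed archives expand when
    unpacked, so more free space is requested than the download itself needs.
    """
    free_bytes = shutil.disk_usage(target_dir).free
    return free_bytes >= download_bytes * headroom


def compression_format(path):
    """Identify two common archive formats for an already-downloaded file."""
    if zipfile.is_zipfile(path):
        return "zip"
    if tarfile.is_tarfile(path):
        return "tar"  # also matches gzip- or bzip2-compressed tar archives
    return "unknown"
```

Checking space before starting and identifying the archive type afterwards will not shorten a long download, but it can keep a transfer from failing partway through on a full drive.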
Larsgaard (1998) provides a checklist of issues to consider when
acquiring data, regardless of format. She suggests checking whether the
information:
- Is from a database not already available through the library's consortial
agreements;
- Has acceptable licensing and use restrictions;
- Has print counterparts;
- Has reasonable customer support;
- Comes with "clear and thorough" documentation;
- Provides a trial version;
- Includes a tutorial;
- Is relatively easy to use, with menus, prompts, on-screen contextual help, error
messages that actually give the user a clue as to how to get out of some mess,
examples of operation;
- Provides easy installation; and
- Offers easy printing and downloading.
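A checklist like the one above can be recorded for each candidate product so that gaps are easy to compare. The sketch below is one possible encoding, not Larsgaard's own: the class name, field names, and helper function are hypothetical, with one true/false field per checklist item.

```python
from dataclasses import dataclass, fields


@dataclass
class AcquisitionChecklist:
    """One boolean per checklist item; field names are illustrative."""
    not_already_available: bool = False
    acceptable_license: bool = False
    has_print_counterpart: bool = False
    customer_support: bool = False
    clear_documentation: bool = False
    trial_version: bool = False
    tutorial: bool = False
    easy_to_use: bool = False
    easy_installation: bool = False
    easy_printing_and_downloading: bool = False


def unmet_criteria(checklist):
    """Return the names of checklist items the product does not satisfy."""
    return [f.name for f in fields(checklist) if not getattr(checklist, f.name)]


# Example: a product known so far only to have acceptable licensing
# and clear documentation.
candidate = AcquisitionChecklist(acceptable_license=True, clear_documentation=True)
print(unmet_criteria(candidate))
```

A list of unmet items is more useful than a single score here, since some items (licensing, documentation) may matter far more to a given library than others.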
Format is another factor to consider. ESRI and MapInfo formats have largely become
de facto industry standards for data, and the U.S. government has created the
Spatial Data Transfer Standard (SDTS). Many software packages can read other
formats or perform conversions, however, increasing the amount of data available.
Make sure to understand the formats and their differences, and how the importing
functions work before choosing to acquire an unfamiliar format.
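As a first pass when sorting incoming files, a file's extension gives a rough hint of its format. The sketch below assumes a small, partial table of extensions commonly associated with the formats named above; the function name is hypothetical, and an extension alone does not guarantee that the software on hand can import the file.

```python
from pathlib import Path

# Partial, illustrative mapping of file extensions to formats.
KNOWN_FORMATS = {
    ".shp": "ESRI shapefile",
    ".e00": "ESRI Arc/Info export file",
    ".tab": "MapInfo table",
    ".mif": "MapInfo Interchange Format",
    ".ddf": "SDTS transfer file",
}


def identify_format(filename):
    """Guess a dataset's format from its extension (case-insensitive)."""
    suffix = Path(filename).suffix.lower()
    return KNOWN_FORMATS.get(suffix, "unfamiliar format: check import support first")


for name in ("ROADS.SHP", "hydro.e00", "parcels.xyz"):
    print(name, "->", identify_format(name))
```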
Hardware and Software Issues
A library's hardware and software issues will depend on the institution. The
institution may have an infrastructure plan already in place, or the library may be
involved in establishing such a plan. Site licenses, too, must be considered. For
expensive software packages such as GIS, a large institution may find it much more
economical to have a site-wide license for the software, rather than individual
installations in many departments. What a library purchases will also depend on the
staff's experience and education, as well as that of the user base.
Even if a library decides on one major GIS package, there will no doubt be multiple
pieces of software that the library staff will need to be familiar with. The big
GIS packages often work in conjunction with other software; ESRI has a free data
viewer that many users are likely to want to use; and each electronic atlas,
gazetteer or mapping software will be structured differently. One or many of the
library's staff will have to become familiar with these software packages, at least
familiar enough to be able to walk a user through setup and the help files. For the
frequently used or older (perhaps DOS-based) packages, additional user guides may
have to be written.
The University of Washington Map Collection has hundreds of CD databases in-house,
with many other CDs in other library branches, and an abundance of data offered on
the web. There is no way one person can know everything in the collection. Luckily,
standards are emerging, both from the U.S. government, and from commercial,
proprietary sources. These standards cover everything from file formats (SDTS,
mentioned above) and metadata (Federal Geographic Data Committee) to similar
software interfaces (ArcView and MapInfo look very similar, and the next version of
Arc/Info will rely heavily on graphical user interfaces, as opposed to the
traditional command-line interface for which it is both famous and infamous). This is another
reason that written user guides are important, so that the institution's GIS
knowledge is easily shared.
The focus of a library, however, is not on hardware and software but on data
provision. "These infrastructure issues are secondary, however, to the even larger
and more important responsibilities of collection, organization, and dissemination
of geospatial data. ... Beyond hardware and software issues, any management
discussion must address collection, describing, and accessing spatial data"
(Lamont 1997).
License Agreements and Usage Restrictions
License agreements and usage restrictions can stop an acquisition in its tracks.
Many products, including stand-alone software and individual datasets, have very
strict usage rules set by the publisher. Some examples of the
variety of agreements from the University of Washington's collection include:
- The data can be freely distributed to the general public;
- The data are restricted to UW patrons only (faculty, staff and students);
- The data are available to the general public, but can only be loaded on one
machine in the Libraries; and
- The data can be used by anyone, but only UW patrons may take the data out of the
Library; general public patrons can only print maps made with the data.
The University of Washington checks license agreements before purchasing to make
sure the product will be usable by the widest possible range of users. In some
cases the university has been able to negotiate a separate license. Some of those
negotiations involved months of legal wrangling over an official agreement; in
others, the Libraries have promised to refer all non-UW patrons back
to the data provider. To enforce the agreements, the Map Collection has
click-through agreements on its web site, and hands out paper agreements to patrons
coming in to the library.
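Staff at a service desk may find it helpful to have each dataset's restrictions recorded in a uniform way. The following is a minimal sketch of how the four example agreements listed above might be encoded. All names are hypothetical, and the rule that a single-machine license means no one may take a copy away is an assumption made for illustration, not a statement of any actual agreement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LicenseTerms:
    """Hypothetical encoding of one dataset's usage restrictions."""
    public_may_use: bool               # may unaffiliated patrons use the data in the library?
    public_may_copy: bool              # may unaffiliated patrons take a copy away?
    single_machine_only: bool = False  # data may be loaded on one library machine only


def may_use(terms, is_affiliated):
    """Affiliated patrons may always use the data; others only if licensed."""
    return is_affiliated or terms.public_may_use


def may_copy(terms, is_affiliated):
    """Return True if this patron may take the data out of the library."""
    # Assumption: a single-machine license means no copies leave the library.
    if terms.single_machine_only:
        return False
    return is_affiliated or terms.public_may_copy


# The four example agreements, in the order listed above.
FREELY_DISTRIBUTED = LicenseTerms(public_may_use=True, public_may_copy=True)
UW_ONLY = LicenseTerms(public_may_use=False, public_may_copy=False)
ONE_MACHINE = LicenseTerms(public_may_use=True, public_may_copy=False,
                           single_machine_only=True)
IN_LIBRARY_FOR_PUBLIC = LicenseTerms(public_may_use=True, public_may_copy=False)
```

A record like this does not replace the click-through or paper agreement; it only gives staff a quick answer to "who may do what with this dataset?"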
If the dataset is from a local government, for example, then updating and currency
become factors in the acquisition decision. The library may be fine with a one-time
purchase, or may want to receive quarterly updates. Local datasets may also involve
privacy issues, especially when dealing with data from assessors' offices, but
privacy should be settled in the license or usage agreement.
The University of Washington has had good luck with organizations being willing to
sign on to our existing agreements. For example, UW provides a standard
data use agreement (available at http://www.lib.washington.edu/maps/datause.html),
and several providers have supplied their data with the understanding that this
agreement will be given to users, rather than negotiating a separate agreement.
Documentation and Metadata
Good descriptions of data (as well as software and hardware) are essential to
serving as an effective data provider. Patrons will want to know where the data
came from, how it was created, its lineage, when it was last updated, and by whom
-- all of this information and more is necessary to understand the bias and error
inherent in a dataset. The federal government has recognized the need for this
information, and a national standard for digital datasets has been created,
information on which is available at http://www.fgdc.gov/. This standard, although
somewhat daunting at over 200 fields, is the most complete standard for data
description established -- and it has begun to be adopted by other countries as
well. The government has also created the National Spatial Data Infrastructure so
that organizations can share this metadata and data with one another
(http://www.fgdc.gov/nsdi/nsdi.html).
Nodes have been set up all over the globe to host metadata (and in some cases, the
data itself), and the nodes are searchable. The University of Washington worked
with the Washington State Geographic Information Council to establish one of these
nodes for the State of Washington -- for information, see http://wa-node.gis.washington.edu/.
If metadata and documentation aren't available for a dataset, the data loses a
great deal of value. Much as an unattributed quote in an article suggests
laziness or falsehood, a lack of data description suggests error. If metadata or
documentation does not exist for a dataset of interest, the library staff must
decide whether to spend the time creating the metadata, to pursue metadata from
the source, or not to purchase the data at all.
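Before making that decision, it helps to see exactly which descriptive details are missing. The sketch below checks a metadata record for the basics patrons ask about (source, creation method, lineage, date of last update, and who updated it). The field names are illustrative placeholders, not element names from the federal standard, and the sample record is invented for demonstration.

```python
# Hypothetical minimum set of descriptive fields; not the FGDC element names.
REQUIRED_FIELDS = ("source", "creation_method", "lineage", "last_updated", "updated_by")


def missing_metadata(record):
    """Return the required fields that are absent or blank in a record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Invented sample record: lineage is blank and updated_by is absent.
sample = {
    "source": "County planning office",
    "creation_method": "Digitized from paper maps",
    "lineage": "",
    "last_updated": "1998-06",
}
print(missing_metadata(sample))  # prints ['lineage', 'updated_by']
```

A short list of gaps may be worth filling in locally or requesting from the source; a long one is a signal to reconsider the acquisition.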
Conclusion
There are several important factors to consider when stocking your GIS data
library. Library staff in charge of GIS need to understand basic geographic
information, the user community, data discovery, cost, delivery and format,
hardware and software, license agreements and usage restrictions, and
documentation and metadata. Considering each of these issues will
help library staff make the best GIS data acquisition decisions for the library.
References:
Argentati, Carolyn D. 1997. Expanding Horizons for GIS Services in
Academic Libraries. Journal of Academic Librarianship 23(6).
(Special issue on GIS in libraries)
Lamont, Melissa. 1997. Managing Geospatial Data and Services.
Journal of Academic Librarianship 23(6).
Larsgaard, Mary Lynette. 1998. Map Librarianship: An
Introduction. 3rd ed. Libraries Unlimited Inc., Englewood, Colorado.