cartome.org
3 September 2001
Distributed Geolibraries
Spatial Information Resources
Summary of a Workshop
Panel on Distributed Geolibraries
Mapping Science Committee
Board on Earth Sciences and Resources
Commission on Geosciences, Environment, and Resources
National Research Council
National Academy Press
Washington, D.C.
1999
CONTENTS
| Notice | |
| Panel | |
| Acknowledgment | |
| Preface | |
| NAS Statement | |
| EXECUTIVE SUMMARY | |
| Characteristics and Benefits of Distributed Geolibraries | |
| The National Spatial Data Infrastructure | |
| Contents, Services, and Functions of Distributed Geolibraries | |
| Architecture of Distributed Geolibraries | |
| Intellectual Property Issues | |
| Organizational Issues | |
| 1 INTRODUCTION | |
| Examples | |
| Emergency Response | |
| Housing Relocation | |
| Public Health | |
| Natural Resource Planning | |
| A Common Theme | |
| 2 A VISION FOR DISTRIBUTED GEOLIBRARIES | |
| Recent Developments | |
| A Library Vision | |
| Defining a Distributed Geolibrary | |
| A Distributed Library | |
| Geoinformation | |
| Characteristics of a Distributed Geolibrary | |
| Distributed Geolibraries and the NSDI | |
| Distributed Geolibraries and Digital Earth | |
| 3 THE DISTRIBUTED GEOLIBRARY IN SOCIETAL AND INSTITUTIONAL CONTEXT | |
| Local Focus | |
| Library Considerations | |
| The Library as an Institution | |
| Economic Considerations | |
| Distributed Geolibraries and the Existing Library Institution | |
| Data, Information, and Knowledge | |
| Intellectual Property Concerns | |
| Uses of Data, Information, and Knowledge | |
| Access | |
| Summary and Additional Issues | |
| 4 SERVICES AND FUNCTIONS | |
| Library Services | |
| Distributed Geolibrary Services | |
| The Need for Distributed Geolibrary Services | |
| Services as Collections of Function | |
| Necessary Distributed Geolibrary Functions | |
| Search by Geographical Location | |
| Search by Place Name | |
| Search by Subject Theme or Time Period | |
| Item Display and Description | |
| Collection Creation and Maintenance | |
| Searching over Distributed Assets | |
| Integration, Analysis, and Manipulation | |
| Assisting Users | |
| Assessment and Feedback | |
| Options for the Delivery of Distributed Geolibrary Services | |
| 5 BUILDING DISTRIBUTED GEOLIBRARIES | |
| Requirements | |
| Standards and Protocols | |
| Data Sets | |
| Georeferencing | |
| Cataloging | |
| Visualization | |
| Knowledge Construction | |
| Research Needs | |
| Institutional Needs | |
| Measuring Progress | |
| 6 CONCLUSIONS | |
| Revisiting the Rationale for Distributed Geolibraries | |
| Distributed Geolibraries in Context | |
| REFERENCES | |
| APPENDIXES | |
| Appendix A: Workshop Participants | |
| Appendix B: Contributed White Papers | |
| Appendix C: Workshop Agenda | |
| Appendix D: Example Prototypes | |
| Appendix E: Biographical Sketches of Panel Members | |
NOTICE: The project that is the subject of this report was approved by the Governing Board of the National Research Council, whose members are drawn from the councils of the National Academy of Sciences, the National Academy of Engineering, and the Institute of Medicine. The members of the committee responsible for the report were chosen for their special competences and with regard for appropriate balance.
Support specifically for this project was provided by the National Science Foundation and the Defense Advanced Research Projects Agency. The project also utilized resources provided to the Mapping Science Committee by the National Imagery and Mapping Agency, the U.S. Geological Survey and the Federal Geographic Data Committee, the Bureau of Transportation Statistics, the National Oceanic and Atmospheric Administration, the Bureau of Land Management, and the Bureau of the Census. Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the agencies that provided support for this project.
International Standard Book Number (ISBN) 0-309-06540-2
Copies of this report are available from
Mapping Science Committee
Board on Earth Sciences and Resources
National Research Council
2101 Constitution Avenue, NW
Washington, DC 20418
Cover: Backdrop for the collage is a digital orthophoto of the Boston, Massachusetts, area. The figure was downloaded from the Internet from the MIT/MassGIS Digital Orthophoto Project (see Appendix D).
Copyright 1999 by the National Academy of Sciences. All rights reserved.
Printed in the United States of America
PANEL ON DISTRIBUTED GEOLIBRARIES
MICHAEL F. GOODCHILD (Chair) University of California, Santa Barbara
PRUDENCE S. ADLER, Association of Research Libraries, Washington, D.C.
BARBARA P. BUTTENFIELD, University of Colorado, Boulder
ROBERT E. KAHN, Corporation for National Research Initiatives, Reston, Virginia
ANNETTE J. KRYGIEL, National Defense University, Ft. Lesley J. McNair, Washington, D.C.
HARLAN J. ONSRUD, University of Maine, Orono
NRC Staff
THOMAS M. USSELMAN, Senior Staff Officer
JENNIFER T. ESTEP, Administrative Assistant
MAPPING SCIENCE COMMITTEE
MICHAEL F. GOODCHILD (Chair) University of California, Santa Barbara
KAREN C. SIDERELIS (Vice-Chair) North Carolina Center for Geographic Information and Analysis, Raleigh
BRIAN J. L. BERRY, The University of Texas at Dallas
CLIFFORD A. BEHRENS, + Telcordia Technologies, Morristown, New Jersey
BARBARA P. BUTTENFIELD, * University of Colorado, Boulder
NICHOLAS CHRISMAN, University of Washington, Seattle
DAVID J. COLEMAN, University of New Brunswick, Fredericton
MICHAEL J. FOLK, * University of Illinois, Urbana
HENRY L. GARIE, New Jersey Department of Environmental Protection, Trenton
BARRY GLICK, Carillon Consulting, Arlington, Virginia
NINA S-N. LAM, Louisiana State University, Baton Rouge
JOEL L. MORRISON, + Ohio State University, Columbus
HARLAN J. ONSRUD, University of Maine, Orono
C. STEPHEN SMYTH, Microsoft Corporation, Redmond, Washington
REX W. TRACY, GDE Systems, Inc., San Diego, California
A. KEITH TURNER, Colorado School of Mines, Golden
LYNA L. WIGGINS, Rutgers University, New Brunswick, New Jersey
NRC Staff
THOMAS M. USSELMAN, Senior Staff Officer
JENNIFER T. ESTEP, Administrative Assistant
* Term of appointment ended December 31, 1998.
+ Term of appointment began in 1999.
BOARD ON EARTH SCIENCES AND RESOURCES
J. FREEMAN GILBERT (Chair) University of California, San Diego
JOHN J. AMORUSO, Amoruso Petroleum Company, Houston, Texas
PAUL B. BARTON, JR., Emeritus, U.S. Geological Survey, Reston, Virginia
KENNETH I. DAUGHERTY, Marconi Information Systems, Reston, Virginia
BARBARA L. DUTROW, Louisiana State University, Baton Rouge
RICHARD S. FISKE, Smithsonian Institution, Washington, D.C.
JAMES M. FUNK, Shell Continental Companies, Houston, Texas
WILLIAM L. GRAF, Arizona State University, Tempe
RAYMOND JEANLOZ, University of California, Berkeley
SUSAN M. KIDWELL, University of Chicago, Chicago, Illinois
SUSAN KIEFFER, Kieffer & Woo, Inc., Palgrave, Ontario
PAMELA LUTTRELL, Mobil Corporation, Dallas, Texas
ALEXANDRA NAVROTSKY, University of California, Davis
DIANNE R. NIELSON, Utah Department of Environmental Quality, Salt Lake City
JILL D. PASTERIS, Washington University, St. Louis, Missouri
EDWARD M. STOLPER, California Institute of Technology, Pasadena
JOHN R. G. TOWNSHEND, University of Maryland, College Park
MILTON H. WARD, Cyprus Amax Minerals Company, Englewood, Colorado
NRC Staff
ANTHONY R. de SOUZA, Director
TAMARA L. DICKINSON, Senior Program Officer
ANNE M. LINN, Senior Program Officer
THOMAS M. USSELMAN, Senior Program Officer
VERNA J. BOWEN, Administrative Assistant
JENNIFER T. ESTEP, Administrative Assistant
JUDITH L. ESTEP, Administrative Assistant
COMMISSION ON GEOSCIENCES, ENVIRONMENT, AND RESOURCES
GEORGE M. HORNBERGER (Chair) University of Virginia, Charlottesville
RICHARD A. CONWAY, Union Carbide Corporation (Retired), S. Charleston, West Virginia
THOMAS E. GRAEDEL, Yale University, New Haven, Connecticut
THOMAS J. GRAFF, Environmental Defense Fund, Oakland, California
EUGENIA KALNAY, University of Oklahoma, Norman
DEBRA KNOPMAN, Progressive Policy Institute, Washington, D.C.
KAI N. LEE, Williams College, Williamstown, Massachusetts
RICHARD A. MESERVE, Covington & Burling, Washington, D.C.
JOHN B. MOONEY, JR., J. Brad Mooney Associates, Ltd., Arlington, Virginia
HUGH C. MORRIS, El Dorado Gold Corporation, Vancouver, British Columbia
H. RONALD PULLIAM, University of Georgia, Athens
MILTON RUSSELL, University of Tennessee, Knoxville
THOMAS C. SCHELLING, University of Maryland, College Park
ANDREW R. SOLOW, Woods Hole Oceanographic Institution, Woods Hole, Massachusetts
VICTORIA J. TSCHINKEL, Landers and Parsons, Tallahassee, Florida
E-AN ZEN, University of Maryland, College Park
MARY LOU ZOBACK, U.S. Geological Survey, Menlo Park, California
NRC Staff
ROBERT M. HAMILTON, Executive Director
GREGORY H. SYMMES, Associate Executive Director
CRAIG SCHIFFRIES, Associate Executive Director for Special Programs
JEANETTE SPOON, Administrative and Financial Officer
SANDI FITZPATRICK, Administrative Associate
MARQUITA SMITH, Administrative Assistant/Technology Analyst
This report has been reviewed by individuals chosen for their diverse perspectives and technical expertise in accordance with procedures approved by the NRC's Report Review Committee. The purpose of this independent review is to provide candid and critical comments that will assist the authors and the NRC in making their published report as sound as possible and to ensure that the report meets institutional standards for objectivity, evidence, and responsiveness to the study charge. The content of the review comments and draft manuscript remain confidential to protect the integrity of the deliberative process. We wish to thank the following individuals for their participation in the review of this report:
Christine L. Borgman
Presidential Chair in Information Studies
University of California, Los Angeles
Edward A. Fox
Department of Computer Science
Virginia Polytechnic Institute and State University
Blacksburg
Kenneth D. Gardels
Research Program in Environmental Planning
and Geographic Information Systems
College of Environmental Design
University of California, Berkeley
John L. King
Department of Information and Computer Science
University of California, Irvine
Xavier R. Lopez
Spatial Products/Data Server Division
Oracle Corporation
Nashua, New Hampshire
Clifford A. Lynch
Executive Director
Coalition for Networked Information
Washington, D.C.
Hugh C. Morris
El Dorado Gold Corporation
Vancouver, British Columbia
Jane Smith Patterson
Senior Advisor for Science and Technology
Office of the Governor
Raleigh, North Carolina
James F. Williams II
Dean of Libraries
University of Colorado, Boulder
While the individuals listed above have provided many constructive comments and suggestions, responsibility for the final content of this report rests solely with the authoring committee and the NRC.
The Mapping Science Committee serves as a focus for external advice to federal agencies on scientific and technical matters related to spatial data handling and analysis. The purpose of the committee is to provide advice on the development of a robust national spatial data infrastructure for making informed decisions at all levels of government and throughout society in general.
The concept of a national spatial data infrastructure (NSDI) was first advanced by the Mapping Science Committee (MSC) in its 1993 report, Toward a Coordinated Spatial Data Infrastructure for the Nation. Subsequent MSC reports have addressed specific components of the NSDI, including partnerships (Promoting the National Spatial Data Infrastructure Through Partnerships, 1994), basic data types (A Data Foundation for the National Spatial Data Infrastructure, 1995), and future trends (The Future of Spatial Data and Society, 1997).
When the NSDI was defined in 1993, few users or producers of geospatial data* made much use of the Internet or the World-Wide Web (WWW). Although there was emphasis on digital geospatial data, the primary method of dissemination was by magnetic tape. There were virtually no digital online catalogs of geospatial data or methods for searching for data across computer networks. Moreover, since most useful geospatial data were produced by a small number of federal agencies, there was little problem locating the appropriate source. Today, the WWW has grown into an enormously successful tool and has had a profound impact on the entire environment for geospatial data acquisition. At the same time, as the number of potential suppliers has mushroomed, the WWW has presented a growing problem: it cannot deal effectively with the task of discovering what geoinformation exists and of locating an appropriate source.
This report can be understood therefore as an updating of the MSC's concept of the NSDI in the era of the WWW. In organizing this effort and producing this report, the committee is expressing its view that the WWW has added a new and radically different dimension to its earlier conception of NSDI, one that is much more user oriented, much more effective in maximizing the value of the nation's geospatial data assets, and much more cost effective as a data dissemination mechanism. Distributed geolibraries reflect the same basic thinking about the future of geospatial data, which emphasizes sharing, universal access, and productivity but in the context of a technology that was almost impossible to anticipate prior to 1993.
A panel under the aegis of the MSC convened a workshop to explore the following topics:
By clarifying the vision of distributed geolibraries and identifying some of the key issues, it is hoped that the workshop and this report will provide a common focus for the many efforts already under way and will stimulate new and expanded efforts. The workshop was only a first step in this process, and many issues remain to be clarified by further discussions, research, and development of prototypes.
The report makes extensive use of the traditional library as a framework for discussion because it is so familiar and well understood. Undoubtedly, much future work in researching and developing distributed geolibraries will occur within this framework, but the framework will also be constraining in some respects. Exactly how distributed geolibraries develop and how closely they follow the metaphor of the library remain to be seen. Moreover, the metaphor is used selectively: many functions of libraries that may have no equivalent in distributed geolibraries were not discussed at the workshop and may not be relevant.
The workshop began on Monday, June 15, 1998, and followed the agenda given in Appendix C. Workshop participants were selected in such a way that all major sectors of the NSDI community and geospatial data activity were represented by their respective stakeholders, with an appropriate balance among them. Of the participants, 35 percent were from federal and state government, 39 percent were from academia, 12 percent were from the private sector, and 14 percent were from other sectors (e.g., associations). See Appendix A for a list of participants. Another way of considering the participants is by their primary focus--44 percent with a geospatial background, 36 percent from computing science and engineering, 12 percent from the library sciences, and 8 percent "other."
The Panel on Distributed Geolibraries coordinated the preparation of a series of white papers in advance of the workshop to stimulate discussion on certain key issues. These were posted on the WWW several weeks prior to the workshop and were available to participants and others who happened across them. Titles of the white papers for the workshop are given in Appendix B.
This report reflects the consensus of the panel regarding the discussions that took place at the workshop, the issues that arose there and in the white papers, and the workshop's broader context.
* The report follows evolving practice in the NSDI community by adopting the term geospatial to refer to maps and images of the Earth's surface and near surface and their digital equivalents. The terms geographic and spatial are often used almost synonymously but are avoided here.
The National Academy of Sciences is a private, nonprofit, self-perpetuating society of distinguished scholars engaged in scientific and engineering research, dedicated to the furtherance of science and technology and to their use for the general welfare. Upon the authority of the charter granted to it by the Congress in 1863, the Academy has a mandate that requires it to advise the federal government on scientific and technical matters. Dr. Bruce Alberts is president of the National Academy of Sciences.
The National Academy of Engineering was established in 1964, under the charter of the National Academy of Sciences, as a parallel organization of outstanding engineers. It is autonomous in its administration and in the selection of its members, sharing with the National Academy of Sciences the responsibility for advising the federal government. The National Academy of Engineering also sponsors engineering programs aimed at meeting national needs, encourages education and research, and recognizes the superior achievements of engineers. Dr. William A. Wulf is interim president of the National Academy of Engineering.
The Institute of Medicine was established in 1970 by the National Academy of Sciences to secure the services of eminent members of appropriate professions in the examination of policy matters pertaining to the health of the public. The Institute acts under the responsibility given to the National Academy of Sciences by its congressional charter to be an adviser to the federal government and, upon its own initiative, to identify issues of medical care, research, and education. Dr. Kenneth I. Shine is president of the Institute of Medicine.
The National Research Council was organized by the National Academy of Sciences in 1916 to associate the broad community of science and technology with the Academy's purposes of furthering knowledge and advising the federal government. Functioning in accordance with general policies determined by the Academy, the Council has become the principal operating agency of both the National Academy of Sciences and the National Academy of Engineering in providing services to the government, the public, and the scientific and engineering communities. The Council is administered jointly by both Academies and the Institute of Medicine. Dr. Bruce Alberts and Dr. William A. Wulf are chairman and interim vice-chairman, respectively, of the National Research Council.
A distributed geolibrary is a vision for the future. It would permit users to quickly and easily obtain all existing information available about a place that is relevant to a defined need. It is modeled on the operations of a traditional library, updated to a digital networked world, and focused on something that has never been possible in the traditional library: the supply of information in response to a geographically defined need. It would integrate the resources of the Internet and the World Wide Web into a simple mechanism for searching and retrieving information relevant to a wide range of problems, including natural disasters, emergencies, community planning, and environmental quality. A geolibrary is a digital library filled with geoinformation--information associated with a distinct area or footprint on the Earth's surface--and for which the primary search mechanism is place. A geolibrary is distributed if its users, services, metadata, and information assets can be integrated among many distinct locations.
This report presents the findings of the Workshop on Distributed Geolibraries: Spatial Information Resources, convened by the Mapping Science Committee of the National Research Council in June 1998. The report is a vision for distributed geolibraries, not a blueprint. Developing a distributed geolibrary involves a series of technical challenges as well as institutional and social issues, which are addressed relative to the vision.
A wide variety of human activities could benefit from the services of distributed geolibraries. The activities include many for which the timely provision of information could minimize loss of life or result in more timely and effective use of existing information resources.
The contents of a distributed geolibrary are not limited to information normally associated with maps or images of the Earth's surface but include any information that can be associated with a geographic location. In this sense the vision extends far beyond the context of the National Spatial Data Infrastructure (NSDI).
New technological developments make it possible for people to gather data germane to their own needs more readily, extract data from online and other electronic repositories, develop the information products they need, use the products for decision making, and contribute their locally gathered geoinformation and derived products to libraries or other repositories. Developing the technical and institutional means to support incorporation of local knowledge into networked repositories presents a novel challenge.
Although many projects currently exhibit elements of the vision of distributed geolibraries, the lack of a clear statement of that vision impedes coordination and leads to duplication of effort. A clear statement can provide a sense of common purpose.
New technological initiatives such as the Next Generation Internet and Internet II are likely to provide extensions to Internet and World Wide Web (WWW) protocols and orders-of-magnitude increases in bandwidth. Many of these developments are expected to be relevant to distributed geolibraries.
THE NATIONAL SPATIAL DATA INFRASTRUCTURE
The vision of the NSDI as expressed by the Mapping Science Committee in 1993 (NRC, 1993) did not anticipate the enormous impact and potential of the Internet and WWW. By emphasizing the problems of production of digital geoinformation, it underemphasized the importance of effective processes of dissemination to users. User communities are growing rapidly and are likely to grow even more rapidly if current difficulties associated with finding geoinformation on the Internet can be addressed.
Distributed geolibraries provide a useful framework for discussion of the issues of dissemination associated with the NSDI in addition to organization and access issues. The vision is readily extendible to a global context.
An essential component of a distributed geolibrary is a comprehensive gazetteer, linking named places and geographic locations. A national gazetteer would be a valuable addition to the framework data sets of the NSDI. These framework data sets are being coordinated by the Federal Geographic Data Committee (FGDC), which also has the responsibility for associated standards and protocols. Production and maintenance of the national gazetteer could be through the National Mapping Division of the U.S. Geological Survey (USGS) in collaboration with other agencies and could be an extension of the USGS's Geographic Names Information System.
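As a rough illustration of what "linking named places and geographic locations" means in practice, the following minimal sketch models a gazetteer as a lookup from place names to bounding-box footprints. The class names (`Footprint`, `Gazetteer`), the name-normalization rule, and the coordinates are assumptions made for this example; none of them comes from the report or from the Geographic Names Information System.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Footprint:
    """Bounding box in decimal degrees (hypothetical structure)."""
    west: float
    south: float
    east: float
    north: float


class Gazetteer:
    """Toy gazetteer mapping place names to footprints (illustrative only)."""

    def __init__(self) -> None:
        self._entries: dict[str, Footprint] = {}

    @staticmethod
    def _key(name: str) -> str:
        # Normalize case and whitespace so lookups are forgiving.
        return " ".join(name.lower().split())

    def add(self, name: str, footprint: Footprint) -> None:
        self._entries[self._key(name)] = footprint

    def lookup(self, name: str) -> Optional[Footprint]:
        # Returns None when the place name is unknown.
        return self._entries.get(self._key(name))


gazetteer = Gazetteer()
# Rough, hand-entered coordinates used purely for illustration.
gazetteer.add("Boston", Footprint(-71.19, 42.23, -70.92, 42.40))
```

A real gazetteer would also have to handle ambiguous names, alternate spellings, and footprints that are not rectangles; the sketch leaves all of that out.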
CONTENTS, SERVICES, AND FUNCTIONS OF DISTRIBUTED GEOLIBRARIES
A distributed geolibrary would allow users (and computers) to specify a requirement, search across the resources of the Internet for suitable information, assess the fitness of that information for use, retrieve and integrate it with other information, and perform various forms of manipulation and analysis. A distributed geolibrary would thus integrate the browsing functions of the WWW with those of geographic information systems and related technologies.
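The search step described above can be sketched as a query over several independently held catalogs, keeping every item whose footprint overlaps the area of interest. This is a minimal sketch under stated assumptions: the `Box` and `Item` types, the `search` function, and the sample catalog entries are all hypothetical, and a real system would query remote services rather than in-memory lists.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Bounding box in decimal degrees (hypothetical structure)."""
    west: float
    south: float
    east: float
    north: float


class Item(NamedTuple):
    """One catalog record: minimal metadata plus a footprint."""
    title: str
    theme: str
    year: int
    footprint: Box


def intersects(a: Box, b: Box) -> bool:
    """True if two boxes overlap (ignores antimeridian wraparound)."""
    return (a.west <= b.east and b.west <= a.east
            and a.south <= b.north and b.south <= a.north)


def search(catalogs, area, theme=None):
    """Collect items from every catalog that overlap `area`, newest first."""
    hits = [
        item
        for catalog in catalogs
        for item in catalog
        if intersects(item.footprint, area)
        and (theme is None or item.theme == theme)
    ]
    return sorted(hits, key=lambda item: item.year, reverse=True)


# Two made-up catalogs standing in for separately maintained collections.
state_catalog = [
    Item("Digital orthophoto", "imagery", 1997, Box(-71.3, 42.2, -70.9, 42.5)),
    Item("River gauge readings", "hydrology", 1998, Box(-72.0, 42.0, -71.5, 42.3)),
]
federal_catalog = [
    Item("Road network", "transportation", 1995, Box(-73.5, 41.2, -69.9, 42.9)),
    Item("Fault map", "geology", 1996, Box(-120.0, 32.0, -114.0, 36.0)),
]

area_of_interest = Box(-71.2, 42.3, -71.0, 42.4)
results = search([state_catalog, federal_catalog], area_of_interest)
```

Sorting by year is only a stand-in for the "assess fitness for use" step; a fuller treatment would weigh resolution, accuracy, and lineage recorded in the metadata.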
In addition, a distributed geolibrary would support collaborative work, such as multidisciplinary research by teams, decision making by groups of stakeholders, and classroom projects by groups of students. It would provide mechanisms for capturing the knowledge that results from such work and making it accessible to others as appropriate. It could also provide mechanisms for storing and archiving such knowledge.
Many important applications of distributed geolibraries are best located in the field, using portable systems and wireless communications. Delivery of services to the field is important in emergency management, agriculture, natural resource management, and many other applications.
The United States possesses vast archives of information that could be incorporated into distributed geolibraries and made accessible to users whose need for information is defined by geographic location. Linking much of this information to geographic location--in other words, transforming it into geoinformation--would be valuable within a geolibrary context.
Significant research problems will have to be solved to enable the vision of distributed geolibraries. Research needs include problems of indexing, visualization, scaling, automated search and abstracting, and data conflation. In addition, there are a variety of social and institutional issues that need further investigation. Research on these issues targeted to improve access to integrated geoinformation might be pursued by the National Science Foundation and other agencies sponsoring basic science, as well as by the National Mapping Division of the USGS, and the National Imagery and Mapping Agency.
ARCHITECTURE OF DISTRIBUTED GEOLIBRARIES
There are several alternative architectures for distributed geolibraries, including a single enterprise sponsored by a well-resourced agency, analogous to a national library; a network of enterprises with their own sponsors, analogous to a network or federation of libraries; and a loose network held together by shared protocols, analogous to the WWW.
INTELLECTUAL PROPERTY ISSUES
The development of distributed geolibraries will need to consider issues related to intellectual property rights. These need to be considered in the broader international debates about the nature of electronic information and databases as intellectual property. A distinction with respect to intellectual property rights needs to be drawn between raw data and knowledge works, because the two appear very different from the perspective of the functions and services of a library. Strong arguments are presented for focusing distributed geolibraries on knowledge, rather than merely providing access to raw data.
ORGANIZATIONAL ISSUES
While traditional production of geospatial data has been relatively centralized, the vision of distributed geolibraries represents a broadly based restructuring of past institutional arrangements for the dissemination of geospatial data, one that is much more bottom-up, decentralized, and voluntary.
Many prototypes that include elements of a distributed geolibrary already exist, but it will take many years to realize the full vision, and it will be important to be able to measure and monitor progress. The vision of distributed geolibraries has distinct aspects that may not be addressed effectively by current programs aimed at digital libraries in general. The success of a distributed geolibrary is largely dependent on the ability to integrate information available about a place. That ability is severely impeded today by differences in formats and standards, access mechanisms, and organizational structures. Integration is a formidable problem for today's users of geospatial data.
Introduction
The Internet and World Wide Web (WWW) provide users with unprecedented access to information resources. In many ways they emulate the functions of traditional libraries, by making it possible to search and locate information using simple tools. But the potential is far greater in areas such as electronic commerce and in supporting new ways of finding information that go far beyond the services of the traditional library. One such possibility is the distributed geolibrary, the subject of this report. A distributed geolibrary would allow its users to search the resources of the WWW for information about a place, 1 to evaluate the information, and to retrieve and work with it as appropriate.
| A geolibrary is a digital library filled with geoinformation and for which the primary search mechanism is place. Geoinformation is information associated with a distinct area or footprint on the Earth's surface. A geolibrary is distributed if its users, services, metadata, and information assets can be integrated among many distinct locations. Chapter 2 develops a more detailed vision for geolibraries. |
This report begins with a series of four examples to illustrate the range and importance of the practical problems that could be addressed by the services of distributed geolibraries. The following chapters discuss the full vision, social and institutional context, and steps that will need to be taken to make distributed geolibraries a reality. Because this is the first discussion of the topic, it falls short of a complete blueprint, and much more exploration will be needed. But this report is perhaps the first step in that direction.
Place is a common theme in many events, activities, emergencies, and issues. Terrorist acts like the World Trade Center bombing and natural disasters like Hurricane Andrew affect specific locations on the Earth's surface and call for relief efforts that must occur quickly and that are sharply focused in space. Accurate knowledge of the place at which an emergency occurs and of surrounding conditions is of critical importance in dispatching ambulances and other forms of relief. Place is important in learning about the world and in understanding its environment.
Distributed geolibraries are intended to provide new kinds of place-based information services that are not available from the traditional library or from the current WWW. The user of a distributed geolibrary should not be required to be an information retrieval expert, to be proficient in computer technology, or to live in a metropolitan area. The distributed geolibrary envisioned in this report could be an information service for every American--for students and teachers, scientists, community members, government officials, business men and women, and families--by allowing ready access to available information about any place on the Earth's surface. The following hypothetical examples illustrate some of the potential uses and the critical importance of distributed geolibraries.
EXAMPLES
Emergency Response
A tanker truck carrying hazardous chemicals is traveling on the highway around a major metropolitan area. Just as the driver approaches a bridge, his truck collides with the car in front of him. The truck flips, pinning both drivers inside their vehicles and rupturing the tanker. A plume from the chemical spill slowly rises out of the debris and is carried by the wind into the surrounding neighborhood. A liquid chemical drips over the bridge into the water below.
To deal with the emergency, metropolitan officials need to alert schools, residences, and businesses in the neighborhoods nearby. There has been a recent building boom, and new roads have been constructed. Local maps are out of date. Evacuations must be discussed and planned; routes need to be determined and reassessed; and the effects of weather on the plume need to be monitored continuously. Will it drift to the nearby airport as well? Meanwhile the spill must be contained, the traffic rerouted from the accident scene, and the way cleared for medical assistance. What human health hazards might be related to the contaminant? Hospitals and medical centers in the affected area must be put on alert.
Dealing with the potential contamination of the river requires considerable attention as well. What is the current rate of flow and level of the water? Who and what will be affected? Information is immediately required on towns, public and private sites, and beaches and harbors along the river. What access to these sites is possible? How can containment be achieved? The fast-running river passes many small communities and runs between two states. Data from many sources must be integrated and used in order for officials to deal with the effects of the accident. Other needs will emerge after the emergency is contained, such as dealing with the effects on wildlife habitat along the river and the fishing interests that flourish in the area.
But the immediate information needs are critical. Although emergency officials have access to their own local sources, they know some of their own maps are not current, so other data should also be checked. And the small towns along the river have limited information resources. The officials need services that allow them to access and browse available imagery, thematic maps, current public and private data resources, and even services available through commercial subscriptions. They need to reach other libraries and online sites that specialize in key information, including contaminants. They need to identify personnel in other cities who have dealt with similar spills. In short, they need access to the best information available to cope with the emergency.
Information resources through distributed geolibraries could greatly assist rapid response to such emergencies and longer-term efforts aimed at prevention and mitigation. Moreover, it is important that information be available where it is needed most, which in many instances will be at the location of the emergency or in a local command center. The tools to access and work with information may have to operate in difficult environments using specialized field computers (palmtops, portables, or pen computers) and wireless communication. New sensors may be brought to the site, supplying data that will have to be integrated with existing data. Decision makers will want access to powerful aids for decision support and for rapid simulation of future scenarios.
Housing Relocation
A family is relocating to Southern California. They want to find a home in a suitable environment. They are concerned about earthquake hazards and want information that might help them avoid vulnerable areas and fault lines. After having identified several possible home sites, they further refine their search by excluding undesirable areas--such as high-crime districts or hazardous materials storage sites. They have read newspaper stories about brush fires. Has there been a history of such brush fires in any of the neighborhoods they are considering? They look at maps for the locations of churches, schools, shops, and parks. Special medical services are needed for one family member. What services are close? They consider distances to workplaces. They also worry about the wisdom of such a large investment. Will their home retain value? What are the neighborhood's economic trends?
The family wants to know about the place where they will live, work, and play. As responsible citizens they want to be informed about issues affecting their neighborhood. If such information is readily accessible, it could make a significant difference in their choice of where to live. Today they might not have the resources, skills, or special education to find the answers to all of these questions, whereas most of this information would be available through the services of distributed geolibraries. In the future they may even be able to access information using wireless links directly to their vehicle as they explore potential neighborhoods.
Public Health
A researcher begins the task of analyzing the association of environment and disease in a particular urban area. She needs access to housing information and population characteristics, as well as health and medical histories in the geographic area of interest. She needs to examine health care facilities, types of buildings, disease rates, even summer heat fatalities, as well as environmental aspects, all over several decades. Incidents with contaminants and pollutants in the area must be located, assessed, and factored into her research. Finding the information will require searches through countless government institutions, media reports, and scientific journals.
She begins her work by visiting the local library; contacting responsible local, state, and federal agencies; talking with colleagues; and using search engines on the WWW. Finding the appropriate information, dealing with issues of confidentiality of health data, and putting the information into a form that can be integrated with other data about a given place can be time consuming; eventual success depends heavily on her background, technical training, and experience. Paradoxically, a request that can be expressed in very simple terms ("give me everything available about environment and disease in this place") turns out, with the limited tools available today, to be enormously and unreasonably complex and to consume the vast majority of the resources available to the project. Better tools for data access and management would allow more time to be spent on data analysis.
Natural Resource Planning
The year is 2010. More than 1,000 summer homes have been built within 10 miles of the boundaries of Yellowstone National Park, Grand Teton National Park, and the Bridger-Teton Wilderness Area. Numerous pets have been killed by grizzly bears, wolves, and coyotes, particularly in the early summer of 2009, when heavy snowpacks kept many wild animals from moving into the high country. The conflicts were capped by the deaths of a brother and sister, ages 7 and 8, following an attack by a grizzly bear, which was subsequently killed by wildlife authorities.
The National Park Service and the Fish and Wildlife Service are concerned about ever-increasing conflicts between wildlife and humans. Pressure from new residents and from ranchers has led to the death of 20 percent of the reintroduced wolves. Counties, once hungry for the economic growth brought by the construction of luxury summer homes, are now concerned about degradation of water quality and the demands of new residents that their assets be protected from wildlife. Fire management has become an increasing concern at multiple levels of government; officials recognize the need for frequent exposure of forests to fires in order to reduce fuel load, but with greatly increased private property near the forest they have found it increasingly difficult to allow fires to burn without risk to structures.
Local and federal agencies recognize the need to draw on common data resources that describe terrain, vegetation, and wildlife habitat in order to solve common problems of resource management. These data must be integrated across many different themes, topics, and disciplines and must be readily available to users needing to assess and plan effectively based on place.
The distributed geolibraries available to these stakeholders in 2010 allow them to quickly assemble information that is relevant to an issue centered at a particular place on the Earth's surface, drawing on the archives of the various levels of government, nongovernmental organizations, and citizen groups. Through distributed geolibraries, decision makers also may learn quickly what information is not available elsewhere and therefore may need to be collected. Additional tools support the decisions and choices that need to be made. With these new tools, development of long-range plans that allow growth while minimizing conflicts with fire and wildlife is progressing after long delays. Several developments have now been completed in places where fire and wildlife conflicts are minimized and where drainage and sewage management have provided excellent protection of water quality.
A COMMON THEME
A common theme in these examples is the current inability to locate and integrate information quickly and simply based on place. Although place is the definitive element in many issues, it is currently easier to find information about a named individual, an agency, or a field of scientific knowledge than about a place on the Earth's surface. This report explores opportunities that will improve our ability to find, access, integrate, and use information by exploiting the technologies of the Internet, the WWW, geographic information systems, and digital computers.
| Finding 1 |
| A wide variety of human activities could benefit from the services of distributed geolibraries. They include many where the timely provision of information could minimize loss of life or result in more timely and effective use of existing information resources and others where the costs of bad decisions could be avoided. |
Distributed geolibraries could provide information services directed specifically at the needs of communities. In a speech given at the Brookings Institution on September 2, 1998, Vice President Gore argued that increased public access to information through mechanisms such as those discussed in this report will put "more control, more information, more decision-making power into the hands of families, communities, and regions, to give them all the freedom and flexibility they need to reclaim their unique place in the world." The services of distributed geolibraries that are discussed and elaborated in this report could enhance education, improve the quality of day-to-day living, and provide economic benefits. They could support scientific research by furnishing new tools for search, analysis, data fusion, and visualization. They could provide the means by which officials cope with emergencies, address issues of health and social services, troubleshoot crime, and accomplish urban planning. They could help provide economic benefits by enabling people to research, manage, market, and grow their business ventures.
Many of the components of distributed geolibraries already exist or are being developed, and many existing WWW sites offer some limited form of distributed geolibrary services. This report goes beyond the present to articulate a vision of what might be, with the objective of providing a common target and of pulling disparate threads together into a unified effort to achieve that vision in the not too distant future.
Note 1
The term place is used throughout this report to refer to a location of interest on or near the Earth's surface. It might be a single point or an extended area or a volume above or below the surface; it might be defined by name or by coordinates, and it might be exact or ill-defined.
RECENT DEVELOPMENTS
The past two decades have seen rapid developments in information technology. Hardware components have become smaller and more powerful, enabling the development of the personal computer and bringing the ability to process information to field environments that are far removed from the office and desktop. Software has grown more sophisticated, empowering individuals with little technical training to make effective use of computers in ways that would have been inconceivable 25 years ago. Developments in wireless communications allow networked access virtually anywhere. Most recently, applications of the Internet and World Wide Web (WWW) have captured the popular imagination and spawned entire industries of electronic commerce and information dissemination.
These developments have in turn driven massive changes in the way society disseminates and accesses information of various types. The role that information plays in everyday activities is changing, as people come to rely on access to up-to-the-minute information on weather, markets, politics, and entertainment via the Internet. Changes seem especially challenging and profound in the area of information that is tied or related to a geographic place, that is, a location at or near the surface of the Earth. Each day millions of people access WWW sites such as MapQuest or Microsoft's TerraServer, which offer maps, driving directions, satellite images, and other forms of raw or processed information and related services (see Appendix D for examples). Similar changes are reflected in the proliferation of geospatial data clearinghouses, digital spatial data libraries, geographic information system software, and new high-resolution imaging satellites.
Several factors help explain the high level of interest in the Internet and WWW as technologies for disseminating these particular types of information and related services. First, the methods of storage and dissemination of traditional products--paper maps, atlases, and photographic images--are cumbersome in comparison to digital data products and often require special cabinets and awkwardly shaped shipping packages. Digital methods make it as easy to store or send a map as it is to handle text. Second, geoinformation is often related to a specialized interest, and it may be hard to justify maintaining an extensive collection in a local library or bookstore; the WWW is ideally suited to the distribution of such information in response to specialized needs because the costs of maintaining a server are low, and universal access to the Internet means that only one server is needed. Finally, geoinformation needs to be timely, but it can take years for a paper map to be produced, printed, and disseminated; the WWW allows users to access information as soon as it is posted.
At the same time there are potential disadvantages to use of the WWW as a mechanism for storing and disseminating geoinformation that will have to be addressed. Little of the information now available via the WWW has been subjected to the mechanisms that ensure quality in traditional publication and library acquisition: peer review, editing, and proofreading. There are no WWW equivalents of the library's collection specialists who monitor library content. But it is easy to be misled into believing that quality control problems of the WWW and distributed geolibraries are somehow different from conventional ones. Users of distributed geolibraries will tend to trust data that come from reputable institutions, with documented assurances of quality, and to mistrust data of uncertain origins, just as they do today.
A common theme in all of these efforts to exploit the Internet and the WWW has been the enabling role of technology; many people with an interest in geoinformation and an awareness of the potential of the WWW and related technologies like the Java programming language have begun exploring their use. Five years after the first explosion of interest in the WWW is an appropriate time to pause and ask some basic questions:
The Mapping Science Committee convened a workshop 1 in June 1998 to explore these issues. The Workshop on Distributed Geolibraries: Spatial Information Resources was designed to explore long-term visions of how ongoing activities may evolve, to explore possible development strategies, and to identify common needs (see Finding 2). Workshop participants were selected to represent a number of communities with interests in these issues: experts in dissemination of geoinformation; leaders of current activities; specialists in the relevant technologies; and specialists in the associated institutional, legal, social, and economic issues. A list of participants is provided in Appendix A.
| Finding 2 |
| Although many projects currently exhibit elements of the vision of distributed geolibraries, the lack of a clear statement of that vision impedes coordination and leads to duplication of effort. A clear statement can provide a sense of common purpose. |
Prior to the workshop, participants were asked to contribute a "white paper" on issues they found relevant to the topic. These papers, which provided useful background to the meeting, are listed in Appendix B and are available on the WWW.
This report was prepared by the panel that organized the workshop (a list of panel members appears at the beginning of this report). Thus, it reflects the consensus of the panel regarding the discussions that took place at the workshop, the issues that arose there and in the white papers, and the workshop's broader context.
The workshop did not attempt to bound the scope of distributed geolibraries precisely, and even if that were possible it would have been unreasonable to expect it in a workshop of such limited duration. Many basic questions remain unanswered, and this report should be read as a first effort in this area and as a stimulus for further work and discussion, rather than as a precise blueprint.
The workshop participants were almost entirely from the United States, and this report necessarily adopts a U.S. perspective. Nevertheless it is hoped that it will be read by non-U.S. researchers and developers interested in distributed geolibraries and that it will help to achieve a greater degree of convergence in research and development at the international level.
A LIBRARY VISION
The organizers of the workshop chose to frame the discussion by reference to the functions, services, and institutional arrangements of the library, for two major reasons: first, to engage the library community, with its long experience in providing access to information, in the development of a vision for a new kind of library and, second, to provide a familiar and concrete starting point for the discussion. It is possible that libraries will be the principal means whereby citizens gain access to the services of the distributed geolibraries of the future; it is also possible that libraries will play no significant part in that process.
The metaphor of the library is powerful because it immediately suggests a number of important issues. For example, one way to think of a library is as a storehouse of the intellectual works of society, and millions of people from all walks of life have contributed works to our current library system. Can we expect to see a similar diversity of contributors in the distributed geolibraries of our future? What incentives are needed to motivate people to make their works accessible? If a library exists to serve a community, its first responsibility should be to provide the information needed by the community. How important is geospatial information about the community itself, produced perhaps within the community, compared to information about areas outside the community perhaps produced by others? Will a local geolibrary, responsible to a local community, acquire and make available very different works and databases than a university-based geolibrary, state geolibrary, federal agency geolibrary, or a private geolibrary?
There are many types of libraries and much variation in the functions they perform. Some of the comments in this report refer to all types of libraries, and some are more appropriate for the research library, the institution maintained by a university, or similar organization for the use of its community of scholars. In general, it is the research library that provides the model of services discussed in this report.
However, the metaphor of the library should not be taken too far, and not all aspects of the operation of a library will be useful in envisioning distributed geolibraries. Many of these will be generic and of no specific relevance to the geoinformation that is the focus of distributed geolibraries. Such issues have already been discussed at length in the library and digital library literatures, and no attempt is made to replicate those discussions here. For example, it is assumed that distributed geolibraries will need to address issues of archiving and preservation (particularly serious issues given the rate of technological change in the digital world), but these are generic to all libraries and are not discussed at length in this report.
DEFINING A DISTRIBUTED GEOLIBRARY
Three ideas help to define the concept of a distributed geolibrary: it is distributed, modeled on the concept of a library, and concerned with information about the Earth. The next three sections discuss these ideas in detail and build an outline of a vision for distributed geolibraries.
A Distributed Library
The term distributed refers to the locations of the physical and functional parts of the library and the locations of its users. In a traditional library the various stages of putting useful information into the hands of users occur largely in one place, in the physical structure known as the library. Books arrive in an acquisitions department; they are cataloged by specialists employed by the library in a cataloging department, placed on shelves within the library in locations designed to make it easy for patrons to browse through holdings on similar topics, retrieved by librarians and users, and signed out of the library at the circulation desk operated by a circulation department. Because these functions occur in one institution, it is sometimes difficult for an observer to separate them and difficult to distinguish the functions of the library from its physical assets.
In today's digital world it is possible for functions to occur in multiple locations, held together and coordinated by communications networks like the Internet. Catalog staff may work in locations far removed from the reference librarians who eventually use the catalog to help users find the information they need. Moreover, today's technology is advancing to the point where patrons (or users) can employ library services to combine data sets located in different places. For many purposes the Internet provides almost infinite connectivity, such that a user may conceive of a single database that is in reality distributed over many different servers under different jurisdictions. Users have the option of processing data on their own computers or sending data to remote locations where processing capabilities are more powerful. Wireless technologies provide for communication to virtually everywhere, and computing technology can now be packaged into electronic units that are readily transportable and in some cases wearable.
Libraries have responded to this new networked environment by establishing coordinated, collaborative, and multi-institutional relationships. The library building no longer houses all of the services it provides to its users; instead, the institution of the library obtains those services in whatever ways maximize effectiveness and minimize costs, by using resources in the building or from a myriad of sites distributed around the globe.
Traditionally, libraries have made a clear distinction between general and special collections, using the latter term to refer to assets that need special treatment or that are unique in some way to a particular library, such as the papers of a particular literary or scientific figure. Maps and images form special collections in many libraries, in part because they are difficult to handle and in part because much of the collection may be unique. The transition to a digital world will mean that many of the difficulties of handling special media disappear, allowing such collections to become part of a library's information mainstream (although working with maps and images will always demand specially designed interfaces and large monitors because of their visual content, as well as broad bandwidth and powerful processors to deal with voluminous data). But the uniqueness of the special collection will become increasingly important in the digital world, in which any item in any collection is potentially accessible from anywhere.
In this report the term custodian refers to the person or agency responsible for maintenance of a given data set. The custodian may be far removed from the server on which the data set is mounted and from which it is disseminated, but nevertheless it is the custodian who holds the definitive version of the data and updates it to account for changes. The custodian may have some form of responsibility for quality--for example, the custodian may decide which data are to be acquired and held based in part on quality or may provide assurances of quality to users. The function of a custodian is different from that of a repository or archive, which is where data are preserved in static form.
Geoinformation
Geoinformation is information that is specific to some part of the Earth's surface or near surface. It includes maps, of course, which abstract and present information about the locations of phenomena on the surface; it also includes images from the air or space (aerial photos or remotely sensed images) that capture the appearance of the surface using energy (either visible or invisible) radiated from it in some part of the electromagnetic spectrum. Such data were earlier defined as geospatial. In addition, geoinformation includes the contents of guidebooks, reports on specific areas, data sets with a geographic dimension, and any other information assets that serve to differentiate one geographic area from another. Finally, it includes information about the atmosphere above the surface, the geology below the surface, and the oceans that cover two-thirds of the planet's surface.
All of these information assets are characterized by having some form of associated geographic footprint, a boundary defining the geographic extent of the information, which is the defining characteristic of geoinformation as the term is used here. A map sheet has a footprint defined by its edges, whereas a guidebook to Moscow has a footprint of the city limits (or the city and the surrounding region). A photograph might have a footprint, defined as the area shown in the photograph; a piece of music (George Gershwin's "An American in Paris," for example) might also be associated with some particular location on the Earth's surface. Moreover, the footprint provides a useful way of finding information. Just as author, subject, and title are ways of finding information assets in a traditional library, so the footprint of geoinformation gives the library the ability to identify all those assets that fit a given geographic query. For example, if information assets in the library had a footprint, it would be possible to identify those assets relevant to a user wanting information on the state of Missouri, or the Caspian Sea, by determining whether the footprint of the asset matched the footprint of the query in whole or in part. It would be possible to ask the library to provide all available information about a given place that is relevant to a defined need, in other words "everything relevant about there."
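The footprint test described above can be sketched in a few lines. The following is only an illustration under simplifying assumptions: footprints are reduced to latitude/longitude bounding boxes, the class and function names (Footprint, search_by_place) are hypothetical, the catalog entries use rough, hand-picked coordinates, and boxes that cross the 180-degree meridian are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Footprint:
    """Bounding box in decimal degrees (a hypothetical simplification)."""
    west: float
    south: float
    east: float
    north: float

    def intersects(self, other: "Footprint") -> bool:
        # Two boxes overlap unless one lies entirely to one side of the other.
        return not (
            self.east < other.west or other.east < self.west
            or self.north < other.south or other.north < self.south
        )


# Illustrative catalog; titles and coordinates are assumptions, not real holdings.
CATALOG = {
    "Missouri state road map": Footprint(-95.8, 36.0, -89.1, 40.6),
    "Guidebook to Moscow": Footprint(37.3, 55.5, 37.9, 56.0),
    "Caspian Sea bathymetry": Footprint(46.6, 36.5, 54.8, 47.2),
}


def search_by_place(query: Footprint) -> list:
    """Return titles whose footprint matches the query in whole or in part."""
    return sorted(title for title, fp in CATALOG.items() if fp.intersects(query))


st_louis_area = Footprint(-90.8, 38.3, -90.0, 38.9)
print(search_by_place(st_louis_area))
```

A query box drawn around St. Louis returns only the Missouri holding, because it is the only footprint that overlaps the query. A real geolibrary would probably need footprints more precise than rectangles, but the matching principle is the same.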
While the space of a search based on author or subject is discrete, geographic space is continuous and multidimensional, and there is no limit to the number of distinct, unique footprints that exist. Any degree of overlap is possible between a footprint and a query, making search by place inherently more complex than search by other keys. Geographic location is sometimes recorded in the subject fields of library catalogs (for example, the Melvyl catalog of the University of California library system includes a place-related subject in about 30 percent of all records), and it is included in the Dublin Core standard. But distributed geolibraries would prioritize place as the primary key and thus would require that footprints be explicit in all cases.
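Because any degree of overlap is possible, a search by place may need to rank results rather than simply accept or reject them. One possible measure, offered here as an assumption rather than as anything the workshop prescribed, is the area shared by footprint and query divided by the area covered by either. The sketch treats degrees as planar units, which is a rough approximation, and the names Box, area, and overlap_score are hypothetical.

```python
from typing import NamedTuple


class Box(NamedTuple):
    west: float
    south: float
    east: float
    north: float


def area(b: Box) -> float:
    # Degenerate or inverted boxes count as empty.
    return max(0.0, b.east - b.west) * max(0.0, b.north - b.south)


def overlap_score(footprint: Box, query: Box) -> float:
    """Shared area divided by combined area: 0 if disjoint, 1 if identical."""
    shared = Box(
        max(footprint.west, query.west),
        max(footprint.south, query.south),
        min(footprint.east, query.east),
        min(footprint.north, query.north),
    )
    shared_area = area(shared)
    combined_area = area(footprint) + area(query) - shared_area
    return shared_area / combined_area if combined_area > 0 else 0.0


query = Box(0.0, 0.0, 10.0, 10.0)
holdings = {
    "exact match": Box(0.0, 0.0, 10.0, 10.0),
    "half overlap": Box(5.0, 0.0, 15.0, 10.0),
    "elsewhere": Box(20.0, 20.0, 30.0, 30.0),
}
ranked = sorted(holdings, key=lambda t: overlap_score(holdings[t], query), reverse=True)
print(ranked)
```

With this measure a continental map scores low against a neighborhood-sized query even though it technically contains it, which is one way of expressing relevance when footprints differ greatly in extent.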
Two distinct methods are available for specification of footprints. An area of interest may correspond to one or more place names, or recognized terms for describing location. Alternatively, the area may be defined by one or more bounding coordinates, in some recognized system such as latitude and longitude. To be compatible, the two methods require the services of a gazetteer, or an index that relates named places to coordinates. Gazetteers are commonly used to index atlases, though as the name suggests they typically include only places whose names have some level of official recognition.
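The role of the gazetteer as a bridge between the two methods can be illustrated with a small lookup table. This is a sketch only: the entries are hand-made with approximate coordinates, and the function names (name_to_footprint, footprint_to_names) are hypothetical.

```python
from typing import List, Optional, Tuple

BBox = Tuple[float, float, float, float]  # (west, south, east, north)

# Hypothetical gazetteer entries with approximate bounding coordinates.
GAZETTEER = {
    "missouri": (-95.8, 36.0, -89.1, 40.6),
    "caspian sea": (46.6, 36.5, 54.8, 47.2),
    "moscow": (37.3, 55.5, 37.9, 56.0),
}


def name_to_footprint(place_name: str) -> Optional[BBox]:
    """Translate a place name to coordinates; None if the name is not listed."""
    return GAZETTEER.get(place_name.strip().lower())


def footprint_to_names(query: BBox) -> List[str]:
    """Reverse lookup: listed places whose box contains the center of the query."""
    west, south, east, north = query
    cx, cy = (west + east) / 2, (south + north) / 2
    return sorted(
        name for name, (w, s, e, n) in GAZETTEER.items()
        if w <= cx <= e and s <= cy <= n
    )


print(name_to_footprint("Moscow"))
print(name_to_footprint("my neighborhood"))
print(footprint_to_names((-90.8, 38.3, -90.0, 38.9)))
```

The second lookup returns nothing, which mirrors the limitation noted above: a gazetteer built from officially recognized names cannot resolve informal or personal place names.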
The issues surrounding place as a search key are to some extent similar to those surrounding time, or date. All of the examples in Chapter 1 require search by place, in many cases qualified by relevant intervals or points in time; perhaps it is possible to devise parallel examples that would require search by time, possibly qualified by place, to motivate the development of chronolibraries. Similarly, an important but less compelling case can be made for a three-dimensional approach to space, based on examples of data that relate to points substantially above or below the Earth's surface.
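A search by place that is qualified by time might look like the following sketch, in which place remains the primary key and a time interval only narrows the result. The records are invented for illustration, and the names Record, boxes_overlap, and search are assumptions.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple

BBox = Tuple[float, float, float, float]  # (west, south, east, north)


@dataclass(frozen=True)
class Record:
    title: str
    bbox: BBox
    start: date  # first date the record describes
    end: date    # last date the record describes


def boxes_overlap(a: BBox, b: BBox) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(records: List[Record], bbox: BBox, start: date, end: date) -> List[str]:
    """Keep records that overlap the query in both space and time."""
    return [
        r.title for r in records
        if boxes_overlap(r.bbox, bbox) and r.start <= end and start <= r.end
    ]


# Invented records for illustration only.
RECORDS = [
    Record("Brush fire perimeters", (-119.0, 33.5, -117.0, 34.5),
           date(1993, 10, 1), date(1993, 11, 30)),
    Record("Neighborhood crime summary", (-118.5, 33.9, -118.1, 34.2),
           date(1997, 1, 1), date(1997, 12, 31)),
    Record("River flow gauge log", (-91.0, 38.0, -90.0, 39.0),
           date(1993, 1, 1), date(1998, 12, 31)),
]

los_angeles = (-118.7, 33.7, -118.0, 34.4)
print(search(RECORDS, los_angeles, date(1990, 1, 1), date(1995, 12, 31)))
```

Swapping the order of the two tests would give the time-first search of a hypothetical chronolibrary; the data structure would not need to change.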
Spatial keys are not unique to geoinformation, and there are parallels to other domains that may be useful and informative in the development of distributed geolibraries. For example, the HyTime hypermedia document structuring language (Newcombe et al., 1991) includes standards for specification of spatial windows in arbitrary coordinate systems within documents.
Geoinformation can be cumbersome for the traditional library because it comes in many forms, on different media, and because there is no simple basis for cataloging it. Instead, map libraries and other stores of geoinformation have had to maintain expensive and highly trained staffs to help users navigate through their information resources, and users have had to look to numerous sources to meet their geoinformation needs. Users of geoinformation were often highly trained experts, knowledgeable about sources, data quality, acronyms, and other tools of the geoinformation trade. In short, there has been no way for an average person to address a library with the query "tell me everything you have about that place that is relevant to me." Yet such queries are common and immensely important to a wide range of human activities, as the examples in the opening chapter illustrate.
Although it is helpful to think of a distributed geolibrary as a container of the digital equivalent of maps, that metaphor may also be unduly limiting. Geoinformation is not restricted to information that is static, or two-dimensional, but includes information on the dynamic processes and changes happening at a place, and three-dimensional data about the atmosphere and subsurface. But as noted earlier, the two horizontal dimensions are most likely to be the basis for search, possibly refined by time and the vertical dimension.
Characteristics of a Distributed Geolibrary
One way to think about a geolibrary (in a world of paper documents) is to imagine walking into a library building and being confronted not with a card catalog, or its modern digital equivalent, but with a giant physical globe. Suppose what is needed is information about a particular part of Patagonia, the southern extremity of Argentina, for a project on Charles Darwin, who visited Patagonia, or on the people of Welsh descent who live there, or on the works of author Bruce Chatwin, who wrote about his travels there. The library user finds Patagonia on the globe, points to it, and asks a nearby librarian about the relevant assets of the library. Some minutes later the librarian produces a list of those assets, with enough information to allow the user to evaluate their importance to the project. After the user narrows the list, the librarian disappears again, to return with the requested holdings.
Several aspects of this concept ensure that it has remained in the realms of fiction for as long as libraries have existed. Some aspects are technological. There is no way to build a physical globe that can be repositioned at will or magnified on demand to display greater and greater detail. Zooming would need to be possible over several orders of magnitude; a large globe might reasonably be expected to show features on the Earth's surface that are 10 km in size, including large lakes and large cities, but not features as small as a neighborhood; but the user of a geolibrary might well want to consider a single city block, which requires a resolution finer than 10 m, or a factor of 1,000 finer than the initial coarse view. Such resolutions are increasingly common in geospatial data.
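The zoom range in this example can be checked with simple arithmetic. The sketch below uses the 10 km and 10 m figures from the text; the function names and the idea of counting zoom steps as successive doublings are assumptions made for illustration.

```python
import math


def zoom_factor(coarse_m: float, fine_m: float) -> float:
    """Ratio between the coarsest and finest feature sizes, in meters."""
    return coarse_m / fine_m


def doublings_needed(factor: float) -> int:
    """Number of times the view must double in detail to cover the factor."""
    return math.ceil(math.log2(factor))


factor = zoom_factor(10_000.0, 10.0)   # 10 km globe detail down to a 10 m city block
orders_of_magnitude = math.log10(factor)

print(factor)                           # the factor of 1,000 cited in the text
print(round(orders_of_magnitude))       # about three orders of magnitude
print(doublings_needed(factor))         # doubling steps from globe view to city block
```

If each zoom step doubles the level of detail, roughly ten steps separate the whole-globe view from a single city block, a range that no physical globe could offer.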
Beyond the problem of resolution, a physical geolibrary would be difficult to build because many of its users would not be able to find their areas of interest on the globe. Not every user would be able to reposition and zoom to identify his or her own neighborhood without the assistance of an expert. There are not enough resources to support the necessary expert librarians and no way to automatically transform a specified location into a list of assets. Finally, there is no way to shelve the many different types of information so that they can be easily retrieved and so that two sources of information on similar topics or areas are located near each other in the library. In other words, a physical geolibrary cannot be built.
In a digital world, however, all of these objections disappear, apparently without exception. It is possible to present the digital library user with a picture of a globe; search for locations by name, address, or any other suitable and convenient method; allow repositioning and zooming; search distributed archives for information assets whose footprints match the query; present them to the user in sufficient detail to permit evaluation; and deliver them for further examination and analysis. But although a geolibrary is possible in principle, there are countless technical, practical, economic, and institutional problems that will have to be overcome. Moreover, it is unclear how a geolibrary would deal with issues of intellectual property, how it could be paid for, and whether its costs would be outweighed by its benefits. These issues are explored in greater detail in Chapter 3.
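The core operation in this description, searching several archives for assets whose footprints match a query area, can be illustrated with a minimal sketch in Python. It is not drawn from the report: the class and function names are hypothetical, footprints are simplified to latitude/longitude bounding boxes, the sample titles and coordinates are invented placeholders, and boxes that cross the 180° meridian are not handled.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Footprint:
    """Bounding box in decimal degrees (a simplification of a real footprint)."""
    west: float
    south: float
    east: float
    north: float

    def intersects(self, other: "Footprint") -> bool:
        # Two boxes overlap unless one lies entirely to one side of the other.
        return not (self.east < other.west or other.east < self.west
                    or self.north < other.south or other.north < self.south)

@dataclass(frozen=True)
class Asset:
    title: str
    footprint: Footprint

def search(archives: dict[str, list[Asset]], query: Footprint) -> list[tuple[str, str]]:
    """Return (archive name, asset title) for every asset overlapping the query."""
    hits = []
    for name, assets in archives.items():
        for asset in assets:
            if asset.footprint.intersects(query):
                hits.append((name, asset.title))
    return hits

# Hypothetical sample holdings in two separate archives.
archives = {
    "archive_a": [
        Asset("Patagonia travel narrative (sample)", Footprint(-75.0, -52.0, -65.0, -40.0)),
        Asset("Thailand road map (sample)", Footprint(97.0, 5.0, 106.0, 21.0)),
    ],
    "archive_b": [
        Asset("South America land cover (sample)", Footprint(-82.0, -56.0, -34.0, 13.0)),
    ],
}

# Rough, illustrative box around Patagonia (approximate coordinates).
patagonia = Footprint(-76.0, -55.0, -63.0, -39.0)
for archive, title in search(archives, patagonia):
    print(archive, "|", title)
```

A working distributed geolibrary would send the query to remote servers rather than loop over in-memory lists, and would rank and describe the results so the user can evaluate them; the sketch shows only the footprint-matching step.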
FIGURE 2.1. Distributed geolibraries as a third layer of services above the WWW and the Internet.
A distributed geolibrary would provide a much more sophisticated and powerful layer of services above the Internet and the WWW (Figure 2.1). The Internet provides the means of communication between computers, using the TCP/IP standard. The WWW is supported by the Internet, providing services that allow any user to access information provided by any server. But the combination of the two technologies falls far short of the services of a distributed geolibrary:
In other words, a distributed geolibrary would constitute a level of services above those provided by the Internet and the WWW, geared to specific user needs. Distributed geolibrary services offer the potential for more intelligent organization and access, for the creation of new knowledge through analysis of raw data, and for the solution of practical problems. As such, distributed geolibraries are one of a number of new types of Internet services that exploit previously impractical ways of organizing and presenting information.
DISTRIBUTED GEOLIBRARIES AND THE NATIONAL SPATIAL DATA INFRASTRUCTURE
"The National Spatial Data Infrastructure is the means to assemble geographic information 2 that describes the arrangement and attributes of features and phenomena on the Earth. The infrastructure includes the materials, technology, and people necessary to acquire, process, store, and distribute such information to meet a wide variety of needs" (National Research Council, 1993, p. 2, emphasis added). The concept emerged in the early 1990s in response to a number of potentially critical trends that were affecting the nation's supply of geospatial information and related services and institutions:
The Mapping Science Committee's report Toward a Coordinated Spatial Data Infrastructure for the Nation (National Research Council, 1993) and the efforts of many other individuals and agencies led in 1994 to Executive Order 12906, by which President Clinton ordered the development of the National Spatial Data Infrastructure (NSDI). Since then, several other committee reports and extensive efforts by the Federal Geographic Data Committee (FGDC), National States Geographic Information Council (NSGIC), National Association of Counties (NACO), and other groups have refined the concept of the NSDI and demonstrated its power and effectiveness (Tosta and Domaratz, 1997; Moeller, 1998; Rhind, 1999).
Finding 3
The contents of a distributed geolibrary are not limited to information normally associated with maps or images of the Earth's surface but include any information that can be associated with a geographic location. In this sense the vision extends far beyond the context of the NSDI.
When the NSDI was defined in 1993, few users or producers of geospatial data made much use of the Internet, and the WWW was virtually unknown; the first popular browser, Mosaic, was released by the National Center for Supercomputing Applications early that year. Although there was much emphasis on digital geospatial data, the primary method of dissemination was by magnetic tape; there were virtually no digital online catalogs of geospatial data and no methods for searching for data across computer networks. Moreover, since most useful geospatial data were produced by a small number of federal agencies, there was little problem locating the appropriate source. WAIS (Wide Area Information Servers) was the first of several network-based technologies that rapidly changed the nature of geospatial data dissemination over the next few years. Today, applications on the WWW have grown into an enormously successful tool and have had a profound impact on the entire environment for geoinformation acquisition (National Academy of Public Administration, 1998). At the same time, as the number of potential suppliers has mushroomed, the WWW has proved increasingly unable to deal effectively with the problems of discovering what geoinformation exists and locating an appropriate source.
Finding 4
The vision of the NSDI as expressed by the Mapping Science Committee in 1993 (National Research Council, 1993) did not anticipate the enormous impact and potential of the Internet and WWW. By emphasizing the problems of production of digital geoinformation, it underemphasized the importance of effective processes of dissemination to users of geoinformation. User communities are growing rapidly and are likely to grow even more rapidly if the current difficulties associated with finding geoinformation on the Internet can be addressed.
This report, and related efforts in general, can therefore be understood as an updating of the Mapping Science Committee's concept of the NSDI in the era of the WWW. In organizing this effort and producing this report, the committee is expressing its view that the WWW has added a new and radically different dimension to its earlier conception of the NSDI, one that is much more user oriented, much more effective in maximizing the value of the nation's geoinformation assets, and much more cost effective as a data dissemination mechanism. Distributed geolibraries reflect the same basic thinking about the future of geospatial data, with its emphases on sharing, universal access, and productivity, but in the context of a technology that was not widely accessible prior to 1993.
Finding 5
Distributed geolibraries provide a useful framework for discussion of the issues of dissemination associated with the NSDI. The vision is readily extendible to a global context.
The NSDI fits well with the description of infrastructure provided by Star and Ruhleder (1996, pp. 111-112):
"It is both engine and barrier for change; both customizable and rigid; both inside and outside organizational practices. It is product and process. With the rise of decentralized technologies used across wide geographical distance, both the need for common standards and the need for situated, tailorable and flexible technologies grow stronger."
Their defining dimensions of infrastructure provide useful guidance to the development of distributed geolibraries: they would be embedded in other structures, social arrangements, and technologies; their reach or scope would extend beyond a single site or practice; their procedures would be learned as part of membership of an organization or group; they would be linked with conventions or practice of day-to-day work; they would be the embodiment of standards and would build upon an installed base; and they would be visible on breakdown, since we would be most aware of them when they failed to work.
DISTRIBUTED GEOLIBRARIES AND DIGITAL EARTH
Distributed geolibraries bear a strong resemblance to certain aspects of the concept of Digital Earth, a concept that was defined by Vice President Gore in January 1998 and outlined in a speech given in Los Angeles. The vision is aptly summarized in the following extract:
"Imagine, for example, a young child going to a Digital Earth exhibit at a local museum. After donning a head-mounted display, she sees Earth as it appears from space. Using a data glove, she zooms in, using higher and higher levels of resolution, to see continents, then regions, countries, cities, and finally individual houses, trees, and other natural and man-made objects. Having found an area of the planet she is interested in exploring, she takes the equivalent of a 'magic carpet ride' through a 3-D visualization of the terrain. Of course, terrain is only one of the numerous kinds of data with which she can interact. Using the system's voice recognition capabilities, she is able to request information on land cover, distribution of plant and animal species, real-time weather, roads, political boundaries, and population. She can also visualize the environmental information that she and other students all over the world have collected as part of the GLOBE project. This information can be seamlessly fused with the digital map or terrain data. She can get more information on many of the objects she sees by using her data glove to click on a hyperlink. To prepare for her family's vacation to Yellowstone National Park, for example, she plans the perfect hike to the geysers, bison, and bighorn sheep that she has just read about. In fact, she can follow the trail visually from start to finish before she ever leaves the museum in her hometown.
"She is not limited to moving through space, but can also travel through time. After taking a virtual field-trip to Paris to visit the Louvre, she moves backward in time to learn about French history, perusing digitized maps overlaid on the surface of the Digital Earth, newsreel footage, oral history, newspapers and other primary sources. She sends some of this information to her personal e-mail address to study later. The time-line, which stretches off in the distance, can be set for days, years, centuries, or even geological epochs, for those occasions when she wants to learn more about dinosaurs."
Digital Earth is also the title of a project of several years' standing at NASA's Goddard Space Flight Center that contains elements of the Vice President's vision. The name is further associated with a plan to place a satellite (tentatively named "Triana") between the Earth and the Sun to deliver real-time images of the sunlit Earth to a global audience.
Like distributed geolibraries, Digital Earth is about making use of the vast but uncoordinated masses of geoinformation now becoming available via the Internet and about presenting it in a form that is readily accessible to the general user. Like distributed geolibraries, its central metaphor for the organization of information is the surface of the Earth and place as a key to information access. In a similar vein the U.S. Geological Survey is exploring the Earth's surface as the organizing metaphor for public access to its data resources, and similar ideas are surfacing in other agencies (see Appendix D).
Learning about places on the Earth is a strong theme in Vice President Gore's vision for Digital Earth and a strong motivation for distributed geolibraries. While the prevailing metaphor for human-computer interaction is the office or desktop, that metaphor may not be particularly helpful in organizing information about the Earth. Instead, access to a distributed geolibrary could be through the visual metaphor of the Earth's surface itself; a student interested in Thailand would manipulate a globe on screen until it centers on Thailand and then zoom in for more detail, as in the Digital Earth vision. Distributed geolibraries might make a useful contribution to the educational opportunities of digital libraries, as outlined, for example, in previous reports on digital libraries for science, mathematics, engineering, and technical education (see Corporation for National Research Initiatives, 1998; National Research Council, 1998).
The library service model that underlies the concept of distributed geolibraries provides a useful way of structuring discussion and of thinking about the resources and research that will be needed to make the vision a reality. Chapter 3 discusses some of the societal and institutional challenges to realizing distributed geolibraries. Addressing many of these policy issues is crucial to creating a conducive atmosphere for considering the potential services and functions of distributed geolibraries (see Chapter 4) and the technical developments needed to build distributed geolibraries (see Chapter 5).
Note 1
The workshop (and this report) focused on the discovery, access, integration,
and use of geoinformation. Other technical issues (e.g., archiving, quality
control and assurance, standards development, telecommunications and computational
capabilities), although critical in the development of the distributed geolibrary
concept, were not extensively considered.
Note 2
The term geographic information here is synonymous with geospatial data as defined in Chapter 1. But as noted earlier in this chapter, many additional types of information qualify as geoinformation by virtue of having a geographic footprint.
3 THE DISTRIBUTED GEOLIBRARY IN SOCIETAL AND INSTITUTIONAL CONTEXT
Implementation of a distributed geolibrary presents a host of challenges, ranging
from the technical to the societal and institutional. The latter are discussed
in this chapter; technical issues are discussed in Chapter 4.
The policy challenges presented by distributed geolibraries include the following:
This chapter addresses many of these issues from three perspectives: geoinformation at the local level; how distributed geolibraries might build on the library model (and how traditional libraries have handled some of these societal and institutional issues); and some of the additional issues introduced by the digital context of distributed geolibraries. These issues are not necessarily unique to distributed geolibraries, as many have been discussed extensively within the context of recent digital library programs. The intention here is not to review or paraphrase excellent surveys of the social context of digital libraries, such as that of Borgman et al. (1996), which readers interested in a broader perspective should consult.
LOCAL FOCUS
Five years ago discussions regarding geospatial data in the United States focused on the rapidly increasing use of such data throughout society and the need to create a more formal infrastructure to coordinate geospatial data coverage across the nation, minimize redundant data collection at all levels, and create new opportunities for use throughout the nation (National Research Council, 1993). Much has been accomplished. Concepts such as metadata standards, standard framework databases, and thematic databases have been developed and pursued (see www.fgdc.gov). The federal government in cooperation with state and local governments has been and continues to be well positioned to lead the development of the basic concepts and public domain databases upon which the NSDI is being built.
The NSDI now involves many stakeholders as a result of activities over the past five years. Its basic data will be assembled from diverse institutions throughout the nation, with institutions contributing those parts that are most relevant to their roles (Tosta and Domaratz, 1997; Moeller, 1998; Rhind, 1999). At the core of this vision is the concept of local generation of geoinformation. Geoinformation is inherently local in nature and of greatest importance to those in that local area. It makes sense that the tens of thousands of units of local governments in the United States understand their own geoinformation assets and needs far better than do higher levels of government.
New developments in technology make it possible for local people to gather local data germane to their own needs more readily, extract data from online and other electronic repositories, develop the information products they need, use the products for decision making, and contribute their locally gathered geoinformation and derived products to libraries or other repositories. Developing the technical and institutional means to support incorporation of local knowledge into networked repositories presents a novel challenge.
Stakeholders across the nation are beginning to think and act around more common visions for the NSDI. A library service model provides an initial way to consider the organizational and institutional arrangements for finding and accessing the geoinformation assets and digital products being generated by numerous stakeholders across the nation.
LIBRARY CONSIDERATIONS
The Library as an Institution
In considering possible institutional arrangements for distributed geolibraries, we begin with the assumption that libraries are social institutions that will continue to change but will not be made obsolete by the advent of electronic publishing. Indeed, distributed geolibraries and digital libraries in general will complement the traditional activities of libraries and related institutions. Libraries respond to many complex societal needs. They are used for research, teaching, self-learning, and entertainment. They serve as social and activity centers for many communities, whether these be small towns, neighborhoods, or institutions. The opportunities that libraries provide range from learning about practical matters to exploring science, art, history, or literature for the sheer pleasure of doing so. They are places for children to learn how to read and places for disadvantaged members of communities to seek solutions and solace (Crawford and Gorman, 1995, p. 118). The library system serves as a repository and by doing so preserves most aspects of our culture. Libraries range from small to large, urban to rural, and public to private but cooperate through a common professional culture and set of procedures, sharing information for mutual benefit. In short: "libraries exist to acquire, give access to, and safeguard carriers of knowledge and information in all forms and to provide instruction and assistance in the use of the collections to which their users have access" (Crawford and Gorman, 1995, p. 3).
Libraries have incorporated information technologies in all aspects of library services. Most recently, libraries have embraced network-based programs that support collaboration among institutions and the sharing of resources. In addition, consortia have been established on state, regional, and library-type bases throughout the United States to share information, negotiate licenses, engage in collection development, and serve many other purposes. A useful distributed geolibrary of the future will need to participate in these activities as an entity that will accumulate, make available, and conserve electronic carriers of georeferenced knowledge.
Economic Considerations
Existing public libraries do not buy most books or subscribe to most magazines or journals, yet they are highly valued by the estimated two-thirds of American adults who use them (Crawford and Gorman, 1995, p. 127). A typical robust public library will lend out 10 items per person per year based on the population served by the library and will answer two questions per person per year for its service population. Typical circulation of a robust library is twice its content (i.e., a library with a collection of 1 million volumes will lend out 2 million volumes during the year). In-library use of volumes in poor and rural communities often exceeds circulation, and in-library use at academic libraries often exceeds circulation by two to three times.
Public libraries provide these high use and service rates at a cost of approximately five cents per day per capita for their service population, while public libraries in economically healthy areas aspire to 10 cents per day per capita as a reasonable starting point for funding a robust library (Crawford and Gorman, 1995, p. 139). These expenditures appear to be a bargain for the access and services provided, and any proposal for supplanting current library services with electronic services would need to compare costs realistically.
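The per capita figures quoted above can be converted into annual terms with simple arithmetic. The following is a minimal sketch in Python, not part of the original report. The funding and lending rates come from the text; the cost-per-loan figure is our own rough illustration, which attributes all spending to circulation and ignores reference questions and in-library use, so it overstates the true cost of a loan.

```python
# Figures taken from the text (Crawford and Gorman, 1995).
CENTS_PER_DAY_TYPICAL = 5         # approximate current funding per capita
CENTS_PER_DAY_ROBUST = 10         # aspiration for a robust library
LOANS_PER_PERSON_PER_YEAR = 10    # lending rate of a typical robust library

def annual_cost_dollars(cents_per_day: float, days: int = 365) -> float:
    """Convert a per capita daily cost in cents to an annual cost in dollars."""
    return cents_per_day * days / 100

typical = annual_cost_dollars(CENTS_PER_DAY_TYPICAL)
robust = annual_cost_dollars(CENTS_PER_DAY_ROBUST)

# Illustrative only: assumes every dollar of a robust budget goes to lending.
cost_per_loan = robust / LOANS_PER_PERSON_PER_YEAR

print(f"typical funding: ${typical:.2f} per person per year")   # $18.25
print(f"robust funding:  ${robust:.2f} per person per year")    # $36.50
print(f"rough cost per loan at robust funding: ${cost_per_loan:.2f}")  # $3.65
```

Any electronic alternative would need to be compared against annual per capita costs of this order, which is the realistic comparison the text calls for.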
Conversely, would an electronic digital library be available to at least the two-thirds of American adults who currently use existing libraries? Would it serve children and the disadvantaged to the same extent as, or to a greater extent than, existing library facilities and resources?
There is an economic conundrum here: some communities facing proportionately higher demand might not have the resources to support distributed electronic delivery services, even though the delivery technology is dropping in price. For distributed geolibraries this may be an issue, since a recent survey of public libraries in Colorado (Gayon, 1998) indicates that rural libraries receive a larger than expected proportion of requests for geographic information (maps, images, and digital data).
Libraries have the effect, although not a priority purpose, of introducing library users to works, authors, and publishers. Libraries thereby serve the economic function of creating markets for intellectual works. Would a geolibrary have the same effect? These are some of the institutional questions that will need to be addressed as the technological capabilities for distributed geolibraries are built over time.
Distributed Geolibraries and the Existing Library Institution
Might distributed geolibraries develop as part of existing library arrangements or complement them? Although the possibility exists that distributed geolibraries might develop in tandem with libraries and be interconnected with them, the duplication of all the roles of libraries in a new institutional environment would make little sense. A useful analysis of these issues is presented by Hawkins (1994) in the context of digital libraries. Indeed, the way distributed geolibraries evolve will depend in large part on access to resources in existing library institutions.
Some of those things that traditional libraries have never been able to do well might be better done by digital means. One of these functions might be the provision of access to geoinformation. The size and shape of the sheets on which paper maps are produced often depend on the information or the story that the cartographer is attempting to convey graphically, the scale required to present information adequately, and the shape of the geographic area being addressed. Owing to the wide variability in map sizes and the nonstandard placement of information on them, the classification, cataloging, and storage of maps have been far more problematic for librarians than handling books, journals, magazines, and recordings. Thus, in some instances, maps may be ineffective uses of print on paper, and many maps might be better represented, accessed, and used in digital form.
Thus, the advent of distri