BDEI PI’s Workshop: Reporting Results & Research Prospects

EIA-0310659

Principal Investigator

Judith Bayard Cushing
Scientific Inquiry (Computer Science)
The Evergreen State College
2700 Evergreen Parkway
Olympia WA
98505-0002
(360) 867-6652
(360) 867-5430
judyc@evergreen.edu
http://academic.evergreen.edu/j/judyc

Keywords

Biodiversity and Ecosystem Informatics
Semantics (Ontologies and Glossaries)
Database Schema Evolution and Spatial Data Support
Sensing Technologies
Modeling, Visualization and Analysis Infrastructure

Project Summary

In February 2003, Principal Investigators of Biodiversity and Ecosystem Informatics (BDEI CISE-EIA) planning and incubation grants met with agency representatives (NSF, USGS and NASA) and others to report research results and refine the BDEI research agenda. Participants emphasized that the connections of biodiversity and ecosystems to climate, agriculture, resource management, recreation, and public health make ecosystem informatics a high priority for society. They unanimously recommended that funding agencies focus research in the following areas:

 

·         Semantics, metadata and data provenance support, including terminology management with ontologies and glossaries, attaching metadata to data, and other data annotation.

·         Adaptive, flexible database schemas, domain-specific data types and schema management, including specialized support for spatio-temporal data and support for different data models.

·         Data acquisition, documentation and cleaning, including retroactive data capture to speed up digitization of data from static media, in situ and remote sensing technologies, all with automated or semi-automated metadata acquisition and management.

·         Modeling and analysis infrastructure, including better mathematical and statistical models of organisms and of ecological and biological systems, better visualization, integration of data models with models, and uncertainty and missing data management.

 

Workshop participants further emphasized that the social context of technology is critical. Research in the above areas must take into account the needs of a variety of researchers, students, resource and information managers (including the US Forest Service, Bureau of Land Management and National Parks), and policy makers. Furthermore, research products must coalesce into the cyberinfrastructure needed for tomorrow’s ecologists.

 

The BDEI research agenda overlaps with that developed at a National Library of Medicine (NLM) sponsored Workshop on Data Management for Molecular and Cell Biology.  Considerable synergy could be generated by capitalizing on this overlap.

The 15 current BDEI research projects are categorized into four areas:

 

Semantic Data Integration.  Data needed to address critical questions in ecology are scattered, heterogeneous, and complex. Large volumes of diverse data types must be semantically integrated. Early integration efforts generally assumed that terms and formats (syntax) and the meaning of terms (semantics) were the same across sources. In the web environment, and in the decentralized, heterogeneous information collections relevant to ecological research, such assumptions do not hold. Research approaches include thesauri and ontologies, but metadata services are also critical. Existing digital gazetteers (USGS GNIS and DMA) are a good start, but typically lack spatial definition and ecology-specific data structures. Agent architectures show promise for linking ontologies with various metadata services.

 

Spatio-temporal Data & Sensing Technologies.  New sensors – aloft, among, and in situ – yield more and more data. A recent confluence of technologies – orbital sensors complemented by mobile citizen-scientist teams and specialists equipped with geo-referential proximal sensors and robust wireless networks of reactive sensing agents – results in data streams that are rich in dimensionality and span a wide range of spatial, temporal, spectral, radiometric, thematic, and taxonomic scales. Robust representations that capture the complexities of environmental patterns and processes are required. Flexible database schemas and sophisticated queries, algorithms for spatio-temporal analysis, wireless reactive agent networks, and tools for metadata acquisition, management and interpretation are the most critical research areas. Enhanced sensory presentation of environmental data, encompassing visual, sonic, and haptic feedback systems, is needed to explore high-dimensional data spaces.

 

Modeling and Forecasting.  Advanced frameworks, including hardware for adaptive and intelligent systems and high-performance computing, are needed. Memory and speed remain important bottlenecks, so modular models, model coupling, and hardware for grid/distributed computing are required. Model-data interaction is key, since many problems could be informed by existing large data sets, but inference methods for high-dimensional problems are lacking. Data are indirect, massively unbalanced with missing observations, and subject to stochasticity. Processes interact at a range of scales, and hidden ‘parameters’ are often more like variables. Data exhibit variability, uncertainty, and complexity. Meaningful visualization must be part of working models on which measurements may be made and change simulated. Biodiversity science requires more efficient algorithms, ways to deal with incomplete knowledge, better understanding of spanning spatio-temporal scales, and techniques for high-dimensional problems. Derived data products and provisioning data at multiple places are also critical.

 

Putting it into Practice.  Advancement hinges on closing various “digital divides”, shaping an ecoinformatics culture and developing socio-technological partnerships. We must solve social and ethical conundrums among various players: industry and science; bioinformatics and ecosystem informatics; ecologists conducting “big science” and traditional individual ecologists working in relative isolation; scientists and information managers; professional and citizen scientists; and the tripartite group of scientists, resource managers and policy makers. Telecommunications and consumer electronics industry R&D budgets, as well as those in defense, health services and even bioinformatics, dwarf ecosystem informatics budgets; BDEI should look there for technologies, keeping in mind some unique requirements. Stakeholders should adapt open source models, in particular for specialized robots to digitize museum collections. Even with appropriate technology, extensive user training will be required. Finally, technology is not enough: who should contribute information, who should have access, and how should research resources be allocated for information management?

 

BDEI Workshop Co-Organizers and Report Co-Authors:  Kate Beard-Tisdale (U. Maine), Kathleen Bergen (U. Michigan), Jim Clark (Duke), Geof Henebry (U. Nebraska), Eric Landis (Natural Resources Information Management), David Maier (Oregon Graduate Institute), John Schnase (NASA), Rob Stevenson (U. Mass. Boston).

Publications and Products

·         February 11, 2003, Biodiversity and Ecosystem Informatics Workshop

·         May 2003 Panel Report and Birds of a Feather Session at DGO

Project Impact

We aim to inform funding directions at NSF, USGS and NASA by identifying key areas where informatics research will significantly benefit biodiversity and ecology research, resource management and policy. The report will also inform computer science researchers of research opportunities and potential applications, as well as software and hardware vendors seeking opportunities for new products for researchers and citizen ecologists.

Goals, Objectives and Targeted Activities

To organize and conduct the February 2003 workshop, write a report identifying the research agenda for biodiversity and ecosystem informatics, and disseminate these results to funding agencies and the research community. Identifying research needs to the computer science community in technical detail is a high priority.

Area Background

Ecologists and biologists have articulated a need for better technology for biodiversity and ecology research. As a result, in June 2000, an NSF-NASA-USGS sponsored workshop brought biologists, ecologists and resource managers together with computer scientists to identify the CS/IT research issues that impede biodiversity and ecosystem research and ecosystem information infrastructure. Their report prompted the National Science Foundation to issue a call for proposals, and 15 awards were made in late 2001. This project coordinates the results of that research into a cohesive report.

Area References

·         Workshop on Data Management for Molecular and Cell Biology, http://www.lbl.gov/~olken/wdmbio

Project Websites

canopy.evergreen.edu/bdeipi
BDEI Workshop Web Site, including agenda, presentations, and descriptions of BDEI Projects.

www.evergreen.edu/bdei/2003
BDEI 2003 Workshop Report.