DigitalGovernment.org - Home of the Nat'l. Science Foundation Digital Government Research Program

DGRC Energy Data Collection

Project website

Primary Investigator: Arens, Yigal
Email: arens@isi.edu
Institution: USC / ISI

Abstract
The massive amount of statistical and text data available, largely from Federal agencies, poses daunting challenges for both the research and analysis communities. These challenges include heterogeneity, size, distribution, and control of terminology. For expert and novice alike, obtaining information from government data sources can be difficult. We propose solutions to four key problems in accessing large distributed data collections: (1) ontological mappings for terminological control; (2) data integration with high-speed query processing; (3) interfaces for query input across databases and presentation of results; and (4) distributed data analysis and data mining.
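As a rough illustration of the first problem, terminological control, the following minimal Python sketch shows how a mapping from source-specific terms to a shared concept could let one query reach records from several sources. It is not the EDC design: the source names, terms, concept labels, values, and the `query_by_concept` helper are all hypothetical and invented for this example.

```python
from dataclasses import dataclass

# Hypothetical mapping from (source, source-specific term) to a shared concept.
# All names here are invented for illustration, not taken from any agency.
TERM_TO_CONCEPT = {
    ("source_a", "motor gasoline"): "gasoline",
    ("source_b", "gasoline, all grades"): "gasoline",
    ("source_a", "distillate fuel oil"): "diesel",
    ("source_b", "diesel fuel"): "diesel",
}


@dataclass(frozen=True)
class Record:
    source: str   # which data source the record came from
    term: str     # the term as that source spells it
    value: float  # made-up placeholder value


def query_by_concept(records, concept):
    """Return the records whose source-specific term maps to `concept`."""
    return [
        r for r in records
        if TERM_TO_CONCEPT.get((r.source, r.term)) == concept
    ]


# Made-up records standing in for rows from two differently worded sources.
RECORDS = [
    Record("source_a", "motor gasoline", 1.0),
    Record("source_b", "gasoline, all grades", 2.0),
    Record("source_b", "diesel fuel", 3.0),
]

matches = query_by_concept(RECORDS, "gasoline")
for r in matches:
    print(r.source, r.term, r.value)
```

The point of the sketch is only that the user asks for one concept while each source keeps its own wording; a real ontology would carry richer relations than this flat lookup table.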

This work is being performed within the Digital Government Research Center (DGRC) of the Information Sciences Institute of the University of Southern California and Columbia University.

The general mission of DGRC is the design and development of advanced information systems with capabilities for generating, sharing, and interacting with knowledge in a networked environment. The Energy Data Collection (EDC) project develops such technology in the context of the immense amounts of energy-related statistical data already in the possession of, and continually being added to by, the Department of Energy, the Bureau of the Census, the Bureau of Labor Statistics, and others.

Only a portion of energy data is currently generally accessible, and much of that is provided in printed form. The proposed work will result in the development of a new information system that will support real-time integrated viewing and manipulation of energy-related data from government sources. EDC will be usable by industry representatives, environmental groups, policy makers, members of the media, teachers, students, and all other interested parties. Among the results of the proposed effort will be a publicly accessible Web site. Constructing the EDC will require addressing some fundamental problems, among them the proliferation of incomparable terminologies and the difficulty of formulating queries against distributed, unfamiliar data sources.

A deployed pilot EDC system is a central element of this proposal. In addition to helping focus the research, it will demonstrate the potentially dramatic increase in the ability to handle complex (and legacy) statistical data made possible by the application of advanced information systems research.

DGRC brings together a strong team of researchers and developers with interests and experience in information systems. Participants in the project are drawn from Columbia University's Department of Computer Science and from the University of Southern California's Information Sciences Institute. They are aided and supported by technical experts from Federal statistics agencies.


This site and the dg.o Communications Office are maintained by the University of Southern California's Information Sciences Institute, a member of the Digital Government Research Center