UNCERTAINTY IN CYBERINFRASTRUCTURE: RESULTS, ALGORITHMS, CHALLENGES, AND REQUEST FOR COLLABORATION

Ann Gates, Vladik Kreinovich, Paulo Pinheiro da Silva, Craig Tweedie, Leonardo Salayandia, and Christian Servin
Center of Excellence for Sharing resources for the Advancement of Research and Education through Cyberinfrastructure (Cyber-ShARE)
University of Texas at El Paso (UTEP) 500 W. University, El Paso, TX 79968
http://trust.cs.utep.edu/cybershare/
emails: agates@utep.edu, vladik@utep.edu, paulo@utep.edu, ctweedie@utep.edu, leonardo@utep.edu, christians@minersutep.edu

To be presented by
Vladik Kreinovich
Dept. of Computer Science, UTEP
vladik@utep.edu

In many practical projects, ranging from geological to biological to environmental applications, it is necessary to process data gathered by different techniques at different geographic locations. A few decades ago, communications were much slower than computations; as a result, to perform such data processing, it was necessary to first physically bring all the data to the processing computer. At present, with fast Internet connections, it is possible to keep the data where they are stored and to move the necessary parts "on demand" in real time (if necessary, automatically converting the moved data to a uniform format). Special cyberinfrastructure (CI) is being designed and built to enhance and automate such data processing. CI enables researchers and practitioners to access and process all the relevant information without worrying about the geographic location and format of different pieces of information.

In the past, UTEP researchers have actively participated in the CI for the Geosciences (GEON) and Circum-Arctic Environmental Observatory Network (CEON) CI projects. The main objective of our NSF-sponsored center is to use this experience for developing general CI tools and for helping researchers from other disciplines to establish and improve their CI.

One of the most important challenges in the use of CI comes from the need to gauge the uncertainty of the results of data processing -- and also to establish the provenance of, and the degree of trust in, the data. There are several reasons why the existing statistical techniques for gauging uncertainty need to be adjusted and modified for CI applications:

* in CI, we typically process an unusually large amount of data;

* this data is very heterogeneous, with different uncertainty information given for different data points;

* these data points are located at different geographic locations; and

* we need to get the uncertainty (accuracy) estimates in real time.
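
To make the second point concrete, here is a minimal sketch of one textbook way to combine measurements that come with different uncertainties: inverse-variance weighting. It is an illustration only, not one of the techniques developed by the center. It assumes independent, unbiased measurement errors with known standard deviations; the function name combine and the sample values are hypothetical.

```python
import math


def combine(measurements):
    """Combine (value, sigma) pairs by inverse-variance weighting.

    Assumption: errors are independent and unbiased, and each sigma
    is the known standard deviation of the corresponding measurement.
    Returns (estimate, sigma_of_estimate).
    """
    total_weight = 0.0
    weighted_sum = 0.0
    # A single pass over the data: each point only updates two running sums.
    for value, sigma in measurements:
        if sigma <= 0:
            raise ValueError("sigma must be positive")
        weight = 1.0 / sigma ** 2  # more accurate points get larger weights
        total_weight += weight
        weighted_sum += weight * value
    if total_weight == 0.0:
        raise ValueError("at least one measurement is required")
    estimate = weighted_sum / total_weight
    return estimate, math.sqrt(1.0 / total_weight)


# Hypothetical example: an accurate reading and a much less accurate one.
estimate, sigma = combine([(10.0, 1.0), (20.0, 3.0)])
print(estimate, sigma)
```

Because each data point only updates two running sums, the same computation can be carried out incrementally as data arrive from different locations. When the independence assumption does not hold, this simple formula no longer applies.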

When the resulting uncertainty is too high for a given application, we face another challenging problem: to develop and schedule additional measurements (maybe with additional sensors) that will decrease the uncertainty to the desired level.

In this talk, we briefly overview the existing techniques for gauging uncertainty in CI, including techniques developed by our center.

The main objective of our center is to enhance applications of CI. We therefore welcome new practical problems in which there is a need for CI and for the corresponding uncertainty estimation. We expect some of these problems to be similar to the GEON and CEON ones; for such problems, in collaboration with the researchers working on them, we will be able to apply (and, if necessary, adjust and modify) our CI techniques. We also expect that some practical problems will lead to new challenges and thus to the development of new techniques for gauging uncertainty in CI.
