Dissertation DIS-2015-10

Bibliographic data
Benzing, Andreas: Distributed Stream Processing in a Global Sensor Grid for Scientific Simulations.
Universität Stuttgart, Fakultät Informatik, Elektrotechnik und Informationstechnik, Dissertation (2015).
214 pages, English.
CR classification: C.2 (Computer-Communication Networks)
Abstract

With the large number of sensors now deployed around the globe, an enormous number of measurements has become available for integration into applications. Scientific simulations of environmental phenomena in particular can benefit greatly from detailed information about the physical world. The challenge in integrating sensor data into simulations is automating both the monitoring of geographical regions for interesting data and the provision of continuous data streams from the identified regions. Current simulation setups use hard-coded information about sensors, or even manual data transfer via external memory, to bring data from sensors to simulations. This solution is very robust, but adding a new sensor requires manually setting up the sensor interaction and changing the source code of the simulation, which incurs extremely high cost. Manual transmission allows an operator to drop obvious outliers, but it rules out real-time operation because of the long delay between measurement and simulation. For more generic applications that operate on sensor data, these problems have been partially solved by approaches that decouple the sensing from the application and thereby allow the sensing process to be automated. However, these solutions focus on small-scale wireless sensor networks rather than the global scale, and they therefore optimize for the lifetime of these networks instead of providing high-resolution data streams.
Providing sensor data for scientific simulations requires two tasks: i) continuous monitoring of sensors to trigger simulations and ii) high-resolution measurement streams of the simulated area during the simulation. Since a simulation is not aware of the deployed sensors, the sensing interface must work without an explicit specification of individual sensors. Instead, the interface must operate only on the geographical region, the sensor type, and the resolution used by the simulation.
The challenges in these tasks are to efficiently identify relevant sensors among the large number of sources around the globe, to detect when the current measurements are relevant, and to scale data stream distribution to a potentially large number of simulations. Furthermore, the process must adapt to complex network structures and dynamic network conditions as found in the Internet. The Global Sensor Grid (GSG) presented in this thesis attempts to close this gap by addressing three core problems. First, a distributed aggregation scheme has been developed that allows geographic areas to be monitored for sensor data of interest. Reusing partial aggregates keeps this operation highly efficient and relieves the sensor sources of individually supplying numerous clients with measurements. Second, data streams are distributed at different resolutions by a network of brokers that preprocess raw measurements to provide the requested data. The load of high-resolution streams is spread across all brokers in the GSG to achieve scalability. Third, network usage is actively minimized by adapting to the structure of the underlying network. This optimization reduces redundant data transfers on physical links and allows the data streams to be modified dynamically in reaction to changing load situations.
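The reuse of partial aggregates mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the scheme developed in the thesis: the 2x2 block layout, the `Partial` and `Aggregator` classes, and the sample readings are all hypothetical and chosen only to show how a cached, mergeable aggregate lets several clients monitor overlapping regions without each sensor being read once per client.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple

Cell = Tuple[int, int]  # hypothetical grid cell index (column, row)


@dataclass(frozen=True)
class Partial:
    """Mergeable partial aggregate: enough state to derive min, max, and mean."""
    count: int
    total: float
    low: float
    high: float

    @staticmethod
    def of(value: float) -> "Partial":
        return Partial(1, value, value, value)

    def merge(self, other: "Partial") -> "Partial":
        return Partial(
            self.count + other.count,
            self.total + other.total,
            min(self.low, other.low),
            max(self.high, other.high),
        )

    @property
    def mean(self) -> float:
        return self.total / self.count


class Aggregator:
    """Caches one partial aggregate per 2x2 block of cells (assumed layout)."""

    def __init__(self, readings: Dict[Cell, float]) -> None:
        self.readings = readings
        self.cache: Dict[Cell, Optional[Partial]] = {}
        self.sensor_reads = 0  # how often a raw sensor value was touched

    def block(self, bx: int, by: int) -> Optional[Partial]:
        """Return the aggregate of one block, computing it only on first use."""
        key = (bx, by)
        if key not in self.cache:
            partial: Optional[Partial] = None
            for x in (2 * bx, 2 * bx + 1):
                for y in (2 * by, 2 * by + 1):
                    if (x, y) in self.readings:
                        self.sensor_reads += 1
                        p = Partial.of(self.readings[(x, y)])
                        partial = p if partial is None else partial.merge(p)
            self.cache[key] = partial
        return self.cache[key]

    def region(self, blocks: Iterable[Cell]) -> Optional[Partial]:
        """Combine cached block aggregates into an aggregate for a region."""
        result: Optional[Partial] = None
        for bx, by in blocks:
            p = self.block(bx, by)
            if p is not None:
                result = p if result is None else result.merge(p)
        return result


# Made-up readings for five sensors on the grid.
readings = {(0, 0): 10.0, (1, 0): 12.0, (0, 1): 14.0, (2, 0): 20.0, (3, 1): 30.0}
agg = Aggregator(readings)
a = agg.region([(0, 0)])          # client A monitors one block
b = agg.region([(0, 0), (1, 0)])  # client B monitors a larger, overlapping region
print(a.mean, b.high, agg.sensor_reads)
```

In this sketch, client B's query reuses the aggregate already computed for block (0, 0), so the five sensors are read five times in total rather than eight. A distributed deployment would hold such partial aggregates on different nodes, which this single-process example does not attempt to model.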

Full text and other links
PDF (3698007 bytes)
Department(s): Universität Stuttgart, Institut für Parallele und Verteilte Systeme, Verteilte Systeme
Supervisor: Prof. Dr. rer. nat. Dr. h.c. Kurt Rothermel
Entry date: May 13, 2016