Database systems have long used "declarative" languages, in which programmers specify program outcomes (what) rather than implementations (how). In recent years, our group has demonstrated that recursive declarative languages and runtime engines are an excellent match for building distributed and networked systems. Our declarative networking approach provides radically simplified, efficient implementations of tasks as diverse as distributed query processing, statistical inference, distributed agreement, and core networking protocols. We have demonstrated that declarative programs a few dozen lines long are competitive with C++ implementations tens of thousands of lines long. Our software includes the P2 system for declarative overlay networks on the Internet, and the DSN system for declarative programming of wireless sensor networks.
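To make the recursive-declarative idea concrete, here is a minimal sketch in plain Python (not P2's actual Overlog engine) of bottom-up evaluation of a classic recursive rule: computing network reachability as a fixpoint over a set of link facts. The relation names and data are illustrative only.

```python
def reachable(links):
    """Naive bottom-up evaluation of the Datalog-style program:
         reach(X, Y) :- link(X, Y).
         reach(X, Z) :- reach(X, Y), link(Y, Z).
    """
    reach = set(links)          # base rule: every link is reachable
    changed = True
    while changed:              # iterate the recursive rule to a fixpoint
        changed = False
        for (x, y) in list(reach):
            for (a, b) in links:
                if y == a and (x, b) not in reach:
                    reach.add((x, b))
                    changed = True
    return reach

# Hypothetical three-hop network: a -> b -> c -> d
links = {("a", "b"), ("b", "c"), ("c", "d")}
print(sorted(reachable(links)))
```

The declarative program is the two rules in the docstring; the loop below them is one possible (deliberately naive) evaluation strategy, which is exactly the separation of "what" from "how" described above.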
Berkeley is the multidisciplinary leader in wireless sensor network research. Sensor networks, and related technologies like RFID infrastructures, are by their nature tools for data acquisition and management, and Berkeley's database group has played a key role in this space. We developed the TinyDB sensornet query engine, the first system to provide a high-level language and runtime for tasking entire "clouds" of sensors in a simple way. We designed probabilistic methods for energy-efficient approximation of sensornet queries and distributed triggers, as well as statistical methods to clean noisy data coming from unpredictable RFID readers. The Declarative Sensor Network (DSN) project mentioned above investigates the use of deductive database techniques for programming entire sensornet "stacks", from core networking internals to high-level data management.
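A key energy-saving idea behind sensornet query engines like TinyDB is in-network aggregation: nodes merge their children's partial aggregates before forwarding, so each radio link carries one small record rather than every raw reading. The sketch below is a plain-Python simulation of this technique for an AVG query over a hypothetical routing tree, not TinyDB code.

```python
def aggregate(node, readings, children):
    """Return the (sum, count) partial aggregate for the subtree rooted
    at `node`; merging (sum, count) pairs is enough to compute AVG."""
    s, c = readings[node], 1
    for child in children.get(node, []):
        cs, cc = aggregate(child, readings, children)
        s, c = s + cs, c + cc        # merge the child's partial aggregate
    return s, c

# Hypothetical 5-node routing tree rooted at node 0, with one
# temperature reading per node.
readings = {0: 20.0, 1: 22.0, 2: 21.0, 3: 19.0, 4: 23.0}
children = {0: [1, 2], 1: [3, 4]}

s, c = aggregate(0, readings, children)
print(s / c)   # network-wide average temperature
```

Decomposing AVG into a mergeable (sum, count) pair is the design choice that makes the aggregate computable bottom-up: each node transmits two numbers regardless of how many descendants it has.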
Several real-world applications need to effectively manage large amounts of data that are inherently uncertain, employing sophisticated probabilistic modeling tools to accurately reason about complex correlation/causality patterns in the data. Example applications include sensor-rich, "smart-home" environments and bioinformatics databases, where noisy, uncertain data is the norm and probabilistic models are used, e.g., to infer user activities or reason about protein molecule structures. We are working to redefine the algorithms and architecture of a DBMS to effectively manage uncertainty and probabilistic reasoning as "first-class citizens" of the system. This includes novel techniques for (a) exposing statistical modeling structures and inference algorithms to key DBMS components (e.g., query engine, query optimizer), and (b) supporting a uniform, declarative means for higher-level applications to store, query, and learn from such probabilistic data.
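One simple way to see what "probabilistic data as a first-class citizen" means is the tuple-independent model, in which each uncertain tuple carries an existence probability and a boolean query's answer probability is derived from those marginals. The sketch below illustrates that model with hypothetical sensor readings; it is an illustration of the idea, not our system's architecture, which handles far richer correlation patterns.

```python
import math

# Hypothetical uncertain readings: (room, temp, probability the tuple is real).
readings = [
    ("kitchen", 78, 0.9),
    ("kitchen", 80, 0.6),
    ("bedroom", 65, 0.8),
]

def prob_exists(tuples, pred):
    """P(at least one qualifying tuple exists), assuming the tuples are
    independent: 1 - prod(1 - p) over tuples satisfying the predicate."""
    qualifying = [p for (_, temp, p) in tuples if pred(temp)]
    return 1.0 - math.prod(1.0 - p for p in qualifying)

# Probability that some reading exceeds 75 degrees.
print(prob_exists(readings, lambda t: t > 75))
```

Here the answer is 1 - (1 - 0.9)(1 - 0.6) = 0.96. The independence assumption is exactly what real applications often violate, which is why exposing richer statistical models and inference algorithms to the query engine, as described above, matters.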
Traditional data management has assumed a stored repository of information. Recent years have seen a proliferation of streaming data sources, including sensor networks, financial data feeds, and monitors of networks and software services. Stream data management raises a number of new challenges in adaptively processing multiple queries, managing fault tolerance, dealing with archives, and providing approximate answers in overload situations. Berkeley's database group has been a leader in this area, investigating these issues and more in the context of the Telegraph project for adaptive processing of stream queries, and in the YFilter XML message broker.
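A core primitive that distinguishes stream queries from stored-table queries is the sliding window: tuples arrive one at a time, old tuples are evicted, and the full stream is never materialized. The sketch below shows one minimal windowed-average operator in plain Python; it is an illustration of the primitive, not Telegraph or YFilter code.

```python
from collections import deque

class WindowAvg:
    """Average over a sliding time window; O(1) state per live tuple."""

    def __init__(self, window):
        self.window = window            # window length in time units
        self.buf = deque()              # (timestamp, value) pairs, oldest first
        self.total = 0.0

    def insert(self, ts, value):
        self.buf.append((ts, value))
        self.total += value
        # Evict tuples that have fallen out of the window.
        while self.buf and self.buf[0][0] <= ts - self.window:
            _, old = self.buf.popleft()
            self.total -= old

    def average(self):
        return self.total / len(self.buf) if self.buf else None

w = WindowAvg(window=10)
for ts, v in [(1, 4.0), (5, 6.0), (12, 8.0)]:
    w.insert(ts, v)
print(w.average())   # the tuple at ts=1 has expired by ts=12
```

Keeping a running total alongside the buffer makes each insert cheap; the harder challenges named above (multi-query sharing, fault tolerance, overload shedding) arise when many such operators must run adaptively over bursty inputs.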