Data Sharing and Querying for Peer-to-Peer Data
Management Systems
Peer-to-peer computing consists of an open-ended network of distributed
computational peers, where each peer shares data and services with a set
of other peers, called its acquaintances. The peer-to-peer paradigm was
initially popularized by file-sharing systems such as Napster and Gnutella,
but its basic ideas and principles have now found their way into more
critical and complex data-sharing applications like those for electronic
medical records and scientific data. In such environments, data sharing
poses new challenges mainly due to the lack of centralized control, the
transient nature of inter-peer connections, and the limited, ever-changing
cooperation among the peers.
In this seminar, I present new solutions for data sharing and querying in a
peer-to-peer data management system, that is, a peer-to-peer system where
each peer manages its own database. The solutions are motivated by the
data-sharing requirements of independent biological data sources. To
support data sharing in such a setting, I propose the use of
mapping tables containing pairs of corresponding data values that reside
in different peers. I illustrate how automated tools can help manage the
tables by checking their consistency and by inferring new tables from
existing ones. To support structured querying, I propose a framework in
which local user queries are translated, through mapping tables, to a set
of queries over the acquainted peers. Finally, I present optimization
techniques that enable an efficient rewriting even over large mapping
tables. The proposed mechanisms have been implemented and evaluated
experimentally and constitute the foundation of a prototype implementation
of an architecture for peer-to-peer data management.
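
To make the idea of mapping tables concrete, the following is a minimal sketch and not the implementation from the prototype. It assumes a mapping table is simply a set of value pairs between two peers. The function names (compose, translate_selection) and all identifiers in the sample data are hypothetical. It shows how a new table might be inferred from two existing ones, and how the values in a local selection could be rewritten for an acquainted peer.

```python
from typing import Set, Tuple

# Assumption: a mapping table is modeled as a plain set of (local, remote)
# value pairs. The tables in the actual work may be richer than this.
MappingTable = Set[Tuple[str, str]]


def compose(ab: MappingTable, bc: MappingTable) -> MappingTable:
    """Infer a table from peer A to peer C by joining on peer B's values."""
    return {(a, c) for (a, b1) in ab for (b2, c) in bc if b1 == b2}


def translate_selection(values: Set[str], table: MappingTable) -> Set[str]:
    """Rewrite the values of a local selection into the acquainted peer's values."""
    return {remote for (local, remote) in table if local in values}


# Hypothetical identifiers for three biological data sources.
gene_to_protein: MappingTable = {("G1", "P10"), ("G2", "P20"), ("G2", "P21")}
protein_to_structure: MappingTable = {("P10", "S100"), ("P21", "S210")}

# Inferring a new table from existing ones.
gene_to_structure = compose(gene_to_protein, protein_to_structure)

# Translating a local query "gene = G2" into values for the protein peer.
remote_values = translate_selection({"G2"}, gene_to_protein)
```

In this simplified model, inference is a join on the shared peer's values, and query translation is a lookup. Neither step attempts the consistency checking or the optimizations for large tables mentioned above.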