Peer-to-peer computing consists of an open-ended network of distributed computational peers, where each peer shares data and services with a set of other peers, called its acquaintances. The peer-to-peer paradigm was initially popularized by file-sharing systems such as Napster and Gnutella, but its basic ideas and principles have now found their way into more critical and complex data-sharing applications like those for electronic medical records and scientific data. In such environments, data sharing poses new challenges mainly due to the lack of centralized control, the transient nature of inter-peer connections, and the limited, ever-changing cooperation among the peers.
In this seminar, I present new solutions for data sharing and querying in a peer-to-peer data management system, that is, a peer-to-peer system in which each peer manages its own database. The solutions are motivated by the data-sharing requirements of independent biological data sources. To support data sharing in such a setting, I propose the use of mapping tables, which contain pairs of corresponding data values that reside in different peers. I illustrate how automated tools can help manage the tables by checking their consistency and by inferring new tables from existing ones. To support structured querying, I propose a framework in which local user queries are translated, through mapping tables, into a set of queries over the acquainted peers. Finally, I present optimization techniques that enable efficient rewriting even over large mapping tables. The proposed mechanisms have been implemented and evaluated experimentally, and they constitute the foundation of a prototype implementation of an architecture for peer-to-peer data management.
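As a rough illustration of the mapping-table idea, and not the implementation evaluated in this work, the Python sketch below models a mapping table as a set of value pairs. It shows one simple way a new table might be inferred from two existing ones (by composing them) and how the constant in a local selection query might be translated for an acquainted peer. The function names, the identifiers, and the three-peer setup are all hypothetical, and consistency checking is not covered.

```python
from typing import Dict, Iterable, Set, Tuple

# Assumption: a mapping table is a set of (local value, remote value) pairs.
MappingTable = Set[Tuple[str, str]]


def compose(first: MappingTable, second: MappingTable) -> MappingTable:
    """Infer a table from peer A to peer C out of tables A->B and B->C."""
    # Index the second table by its left-hand value for direct lookup.
    by_left: Dict[str, Set[str]] = {}
    for mid, right in second:
        by_left.setdefault(mid, set()).add(right)
    return {
        (left, right)
        for left, mid in first
        for right in by_left.get(mid, ())
    }


def translate_selection(values: Iterable[str], table: MappingTable) -> Set[str]:
    """Translate the constants of a local selection into remote constants."""
    wanted = set(values)
    return {remote for local, remote in table if local in wanted}


# Made-up identifiers standing in for values held by three peers.
peer_a_to_b: MappingTable = {
    ("geneA1", "protB1"),
    ("geneA2", "protB2"),
    ("geneA2", "protB3"),
}
peer_b_to_c: MappingTable = {("protB1", "idC1"), ("protB3", "idC3")}

# A table inferred for a pair of peers that share no table directly.
peer_a_to_c = compose(peer_a_to_b, peer_b_to_c)

# A local query selecting "geneA2" becomes a query over these values at peer B.
remote_values = translate_selection(["geneA2"], peer_a_to_b)
```

In this sketch one local value may correspond to several remote values, so a single local selection can expand into a disjunction at the acquainted peer. Indexing the second table before composing avoids comparing every pair against every other pair, which matters once tables grow large.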