Highly Distributed Queries on Personal Data Management Systems with Strong Privacy Guarantees

Dr. Iulian Sandu Popa
Versailles Saint-Quentin-en-Yvelines University


Abstract

The time of individualized management and control over one’s personal data is upon us. Thanks to smart disclosure initiatives (Blue Button and Green Button in the US, MesInfos in France, Midata in the UK) and the right to data portability in the European GDPR, we can retrieve our personal data from the companies or government agencies that collected it. Concurrently, Personal Data Management System (PDMS) solutions are flourishing. Their goal is to offer a (secure) data platform that lets us easily integrate, use and share all our personal data, and thus empowers us to leverage this data for our own good and for the benefit of the community. This movement produces a significant paradigm shift, since personal data becomes massively distributed. In this context, one of the many important issues is: how can users and applications query this massively distributed data in an efficient and privacy-preserving way? In this talk, we present a distributed protocol capable of identifying the pertinent nodes for a query, querying those nodes, and aggregating their results in an efficient and secure way. Our protocol leverages a classical DHT peer-to-peer network to efficiently index and query users’ metadata and data, while offering strong privacy guarantees to users. The results indicate that the protocol is robust to advanced lab attacks corrupting a significant number of nodes in the network.