Title Differential Privacy and SPARQL
Authors Carlos Buil-Aranda, Jorge Lobo, Federico Olmedo
Publication date 2024
Abstract Differential privacy is a framework that provides formal tools to
develop algorithms to access databases and answer statistical queries with
quantifiable accuracy and privacy guarantees. The notions of differential
privacy are defined independently of the data model and the query language
at stake. Most differential privacy results have been obtained on
aggregation queries such as counting or finding maximum or average values,
and on grouping queries over aggregations such as the creation of
histograms. So far, the data model used by the framework research has
typically been the relational model and the query language SQL. However,
effective realizations of differential privacy for SQL queries that required
joins had been limited. This has imposed severe restrictions on applying
differential privacy in RDF knowledge graphs and SPARQL queries. By the
simple nature of RDF data, most useful queries accessing RDF graphs will
require intensive use of joins. Recently, new differential privacy
techniques have been developed that can be applied to many types of joins in
SQL with reasonable results. This opened the question of whether these new
results carry over to RDF and SPARQL. In this paper we provide a positive
answer to this question by presenting an algorithm that can answer counting
queries over a large class of SPARQL queries that guarantees differential
privacy, if the RDF graph is accompanied by semantic information about its
structure. We have implemented our algorithm and conducted several
experiments, showing the feasibility of our approach for large graph
databases. Our aim has been to present an approach that can be used as a
stepping stone towards extensions and other realizations of differential
privacy for SPARQL and RDF.
Pages 745-773
Volume 15
Journal name Semantic Web
Publisher IOS Press (Amsterdam, The Netherlands)