Title Semantics and Canonicalisation of SPARQL 1.1
Authors Jaime Salas, Aidan Hogan
Publication date November 2022
Abstract We define a procedure for canonicalising SPARQL 1.1 queries.
Specifically, given two input queries that return the same solutions modulo
variable names over any RDF graph (which we call congruent queries), the
canonicalisation procedure aims to rewrite both input queries to a
syntactically canonical query that likewise returns the same results modulo
variable renaming. The use-cases for such canonicalisation include caching,
optimisation, redundancy elimination, question answering, and more besides.
To begin, we formally define the semantics of the SPARQL 1.1 language,
including features often overlooked in the literature. We then propose a
canonicalisation procedure based on mapping a SPARQL query to an RDF graph,
applying algebraic rewritings, removing redundancy, and then using canonical
labelling techniques to produce a canonical form. Unfortunately, a full
canonicalisation procedure for SPARQL 1.1 queries would be undecidable. We
rather propose a procedure that we prove to be sound and complete for a
decidable fragment of monotone queries under both set and bag semantics, and
that is sound but incomplete in the case of the full SPARQL 1.1 query
language. Although the worst case of the procedure is super-exponential, our
experiments show that it is efficient for real-world queries, and that such
difficult cases are rare.
Pages 829-893
Volume 13
Journal name Semantic Web
Publisher IOS Press (Amsterdam, The Netherlands)