Title Similarity Joins and Clustering for SPARQL
Authors Sebastian Ferrada, Benjamin Bustos, Aidan Hogan
Publication date October 2024
Abstract The SPARQL standard provides operators to retrieve exact matches
on data, such as graph patterns, filters and grouping. This work proposes
and evaluates two new algebraic operators for SPARQL 1.1 that return
similarity-based results instead of exact results. First, a similarity join
operator is presented, which brings together similar mappings from two sets
of solution mappings. Second, a clustering solution modifier is introduced,
which instead of grouping solution mappings according to exact values,
brings them together by using similarity criteria. For both cases, a variety
of algorithms are proposed and analysed, and use-case queries that showcase
the relevance and usefulness of the novel operators are presented. For
similarity joins, experimental results are provided by comparing different
physical operators over a set of real-world queries, as well as comparing
our implementation to the closest work found in the literature, DBSimJoin, a
PostgreSQL extension that supports similarity joins. For clustering,
synthetic queries are designed in order to measure the performance of the
different algorithms implemented.
Pages 1701-1732
Volume 15
Journal name Semantic Web
Publisher IOS Press (Amsterdam, The Netherlands)