Title | An Efficient Algorithm for Approximated Self-similarity Joins in Metric Spaces |
Authors | Sebastian Ferrada, Benjamin Bustos, Nora Reyes |
Publication date | July 2020 |
Abstract |
Similarity join is a key operation in metric databases. It retrieves all pairs of elements that are similar. Solving such a problem usually requires comparing every pair of objects in the datasets, even when indexing and ad hoc algorithms are used. We propose a simple and efficient algorithm for computing the approximated nearest neighbor self-similarity join. The algorithm computes O(n^(3/2)) distances, and experiments show that it reaches a precision of 46% on real-world datasets. We provide a comparison to other common techniques such as Quickjoin and Locality-Sensitive Hashing and argue that our proposal has a better execution time and average precision. |
Pages | article 101510 |
Volume | 91 |
Journal name | Information Systems |
Publisher | Elsevier Science (Amsterdam, The Netherlands) |
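
For context on the cost the abstract refers to, the following is a minimal sketch of the exact baseline: a nearest neighbor self-similarity join that compares every pair of objects. It is not the algorithm proposed in the paper, whose details are not given in this record. The function name `brute_force_knn_self_join`, the Euclidean metric, and the sample points are assumptions chosen for illustration only.

```python
import math
from itertools import combinations


def euclidean(a, b):
    # Any metric could be used here; Euclidean distance is an assumption.
    return math.dist(a, b)


def brute_force_knn_self_join(points, k, dist=euclidean):
    """Exact k-nearest-neighbor self-join that evaluates every pair once.

    Returns (neighbors, evaluations), where neighbors[i] lists the indices
    of the k objects closest to points[i], and evaluations counts how many
    distances were computed.
    """
    n = len(points)
    candidates = [[] for _ in range(n)]
    evaluations = 0
    for i, j in combinations(range(n), 2):
        d = dist(points[i], points[j])
        evaluations += 1
        # The metric is symmetric, so one evaluation serves both objects.
        candidates[i].append((d, j))
        candidates[j].append((d, i))
    neighbors = [[j for _, j in sorted(c)[:k]] for c in candidates]
    return neighbors, evaluations


# Hypothetical sample data: five points in the plane.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.5), (10.0, 0.0)]
neighbors, evaluations = brute_force_knn_self_join(points, k=1)
print(neighbors)     # [[1], [0], [3], [2], [2]]
print(evaluations)   # 10, i.e. n(n-1)/2 for n = 5
```

The baseline evaluates n(n-1)/2 distances. For n = 10,000 that is 49,995,000 evaluations, whereas n^(3/2) is 1,000,000, ignoring the constant factor hidden by the O notation.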