Title Binary RDF Representation for Publication and Exchange (HDT)
Authors Javier Fernández, Miguel Martínez-Prieto, Claudio Gutierrez, Axel Polleres
Publication date 2013
Abstract The current Web of Data is producing increasingly large RDF
datasets. Massive publication efforts of RDF data, driven by initiatives like
the Linked Open Data movement, and the need to exchange large datasets
have unveiled the drawbacks of traditional RDF representations, inspired and
designed by a document-centric and human-readable Web. Among the main
problems are high levels of verbosity/redundancy and weak
machine-processable capabilities in the description of these datasets. This
scenario calls for efficient formats for publication and exchange.
This article presents a binary RDF representation addressing these issues.
Based on a set of metrics that characterizes the skewed structure of
real-world RDF data, we develop a proposal of an RDF representation that
modularly partitions and efficiently represents three components of RDF
datasets: Header information, a Dictionary, and the actual Triples structure
(thus called HDT). Our experimental evaluation shows that datasets in HDT
format can be compacted by more than fifteen times as compared to current
naive representations, improving both parsing and processing while keeping
a consistent publication scheme. Specific compression techniques over HDT
further improve these compression rates and prove to outperform existing
compression solutions for efficient RDF exchange.
Pages 22-41
Volume 19
Journal name Journal of Web Semantics
Publisher Elsevier Science (Amsterdam, The Netherlands)