Title Everything You Always Wanted to Know About Blank Nodes
Authors Aidan Hogan, Marcelo Arenas, Alejandro Mallea, Axel Polleres
Publication date August 2014
Abstract In this paper we thoroughly cover the issue of blank nodes, which
have been defined in RDF as "existential variables". We first introduce
the theoretical precedent for existential blank nodes from first-order logic
and incomplete information in database theory. We then cover the different
(and sometimes incompatible) treatment of blank nodes across the W3C stack
of RDF-related standards. We present an empirical survey of the blank nodes
present in a large sample of RDF data published on the Web (the BTC-2012
dataset), where we find that 25.7% of unique RDF terms are blank nodes, that
44.9% of documents and 66.2% of domains featured use of at least one blank
node, and that aside from one Linked Data domain whose RDF data contains
many "blank node cycles", the vast majority of blank nodes form tree
structures that are efficient to compute simple entailment over. With
respect to the RDF-merge of the full data, we show that 6.1% of blank nodes
are redundant under simple entailment. The vast majority of non-lean cases
are isomorphisms resulting from multiple blank nodes with no discriminating
information being given within an RDF document or documents being duplicated
in multiple Web locations. Although simple entailment is NP-complete and
leanness-checking is coNP-complete, in computing this latter result, we
demonstrate that in practice, real-world RDF graphs are sufficiently
"rich" in ground information for problematic cases to be avoided by
non-naive algorithms.
Pages 42-69
Volume 27
Journal name Journal of Web Semantics
Publisher Elsevier Science (Amsterdam, The Netherlands)