Title Merging Web Tables for Relation Extraction with Knowledge Graphs
Authors Jhomara Luzuriaga, Emir Muñoz, Henry Rosales-Méndez, Aidan Hogan
Publication date 2023
Abstract We propose methods for extracting triples from Wikipedia's HTML
tables using a reference knowledge graph. Our methods use a
distant-supervision approach to find existing triples in the knowledge graph
for pairs of entities on the same row of a table, postulating the
corresponding relation for pairs of entities from other rows in the
corresponding columns, thus extracting novel candidate triples. Binary
classifiers are applied on these candidates to detect correct triples and
thus increase the precision of the output triples. We extend this approach
with a preliminary step where we first group and merge similar tables,
thereafter applying extraction on the larger merged tables. More
specifically, we propose an observed schema for individual tables, which is
used to group and merge tables. We compare the precision and number of
triples extracted with and without table merging, where we show that with
merging, we can extract a larger number of triples at a similar precision.
Ultimately, from the tables of English Wikipedia, we extract 5.9 million
novel and unique triples for Wikidata at an estimated precision of
0.718.
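
As a rough illustration of the distant-supervision step described in the abstract, the following minimal sketch finds relations that the knowledge graph already holds between two cells on some row, then postulates the same relation for the corresponding columns on other rows. It is an assumption-laden toy, not the authors' implementation: the function name `extract_candidates`, the representation of a table as lists of already-linked entity identifiers, and the representation of the knowledge graph as a set of (subject, predicate, object) tuples are all hypothetical. Table merging and the binary classifiers that filter candidates are not modeled here.

```python
from collections import defaultdict
from itertools import permutations


def extract_candidates(table, kg):
    """Return novel candidate triples for a table (hypothetical helper).

    table: list of rows; each row is a list of entity identifiers,
           with None for cells that link to no entity (assumed format).
    kg:    iterable of (subject, predicate, object) triples (assumed format).
    """
    kg = set(kg)

    # Index the knowledge graph by (subject, object) pair.
    relations_by_pair = defaultdict(set)
    for s, p, o in kg:
        relations_by_pair[(s, o)].add(p)

    # Step 1: for each ordered pair of columns, collect the relations
    # that the knowledge graph already holds for at least one row.
    column_relations = defaultdict(set)
    for row in table:
        for i, j in permutations(range(len(row)), 2):
            s, o = row[i], row[j]
            if s is None or o is None:
                continue
            column_relations[(i, j)] |= relations_by_pair.get((s, o), set())

    # Step 2: postulate those relations for every row in the same
    # column pair, keeping only triples absent from the knowledge graph.
    candidates = set()
    for row in table:
        for (i, j), relations in column_relations.items():
            if i >= len(row) or j >= len(row):
                continue
            s, o = row[i], row[j]
            if s is None or o is None:
                continue
            for p in relations:
                if (s, p, o) not in kg:
                    candidates.add((s, p, o))
    return candidates


# Toy usage with made-up identifiers.
table = [["Chile", "Santiago"], ["Peru", "Lima"]]
kg = {("Chile", "capital", "Santiago")}
print(extract_candidates(table, kg))
```

In this toy setting, a larger (merged) table gives Step 1 more rows in which an existing triple can be found, which is one intuition for why merging could yield more candidates; the precision of those candidates would still depend on a later filtering step.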
Pages 1803-1816
Volume 35
Journal name IEEE Transactions on Knowledge and Data Engineering
Publisher IEEE Press (Piscataway, NJ, USA)