Title QAWiki: A Knowledge Graph Question Answering & SPARQL Query Generation Dataset for Wikidata
Authors Alberto Moya Loustaunau, Aidan Hogan
Publication date 2025
Abstract In this resource paper, we present QAWiki: a multilingual,
handcrafted, knowledge graph question answering and SPARQL query generation
dataset for Wikidata. QAWiki consists of 526 questions over Wikidata, of
which 518 are associated with SPARQL queries, and 8 are disambiguation
questions. Each question is presented in both English and Spanish, and
includes paraphrased versions of the question, as well as annotations of
entity and relation mentions for Wikidata. The dataset is hosted in a
Wikibase instance, which allows for collaborative editing and refinement of
the dataset by the community, among other features. Further metadata include
tagging questions with issues (e.g., incompleteness, imprecision, ambiguity)
as well as defining relations between questions (e.g., a question whose
answers are contained in those of another question). QAWiki can thus be used as
an evaluation (and training) dataset for knowledge graph question answering
& query generation systems. We provide illustrative experiments over QAWiki
using GPT-4o to generate SPARQL queries over Wikidata, comparing performance
with and without passing entity mentions to the model via the
prompt.
Conference name Wikidata Workshop
Publisher CEUR Publications