Publications

Title Simple Yet Powerful: An Overlooked Architecture for Nested Named Entity Recognition
Authors Matías Rojas, Felipe Bravo-Marquez, Jocelyn Dunstan
Publication date 2022
Abstract Named Entity Recognition (NER) is an important task in Natural Language Processing that aims to identify text spans belonging to predefined categories. Traditional NER systems ignore nested entities, which are entities contained in other entity mentions. Although several methods have been proposed to address this case, most of them rely on complex task-specific structures and ignore potentially useful baselines for the task. We argue that this creates an overly optimistic impression of their performance. This paper revisits the Multiple LSTM-CRF (MLC) model, a simple, overlooked, yet powerful approach based on training independent sequence labeling models for each entity type. Extensive experiments with three nested NER corpora show that, despite its simplicity, the model performs as well as or better than more sophisticated methods. Furthermore, we show that the MLC architecture achieves state-of-the-art results on the Chilean Waiting List corpus when combined with pre-trained language models. In addition, we implemented an open-source library that computes task-specific metrics for nested NER. The results suggest that the metrics used in previous work do not adequately measure a model's ability to detect nested entities, while our metrics provide new evidence on how existing approaches handle the task.
Pages 2108-2117
Conference name International Conference on Computational Linguistics
Publisher Association for Computational Linguistics
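As a hedged illustration of the MLC idea the abstract describes, the sketch below merges the BIO predictions of independent per-entity-type sequence labelers: because each type is decoded separately, spans of different types can overlap or nest. The tag sequences, entity types, and example sentence are illustrative stand-ins for the outputs of hypothetical per-type LSTM-CRF models, not the paper's implementation.

```python
# Minimal sketch of Multiple LSTM-CRF (MLC) decoding: one independent
# sequence labeler per entity type emits BIO tags, and the per-type
# predictions are merged into a single set of typed spans.

from typing import Dict, List, Tuple

def bio_to_spans(tags: List[str]) -> List[Tuple[int, int]]:
    """Convert a BIO tag sequence into (start, end) token spans, end exclusive."""
    spans: List[Tuple[int, int]] = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B" or (tag == "I" and start is None):
            if start is not None:  # a new "B" closes the previous span
                spans.append((start, i))
            start = i
        elif tag == "O" and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans

def merge_predictions(per_type_tags: Dict[str, List[str]]) -> List[Tuple[str, int, int]]:
    """Merge independent per-type BIO predictions into one list of typed spans."""
    entities = []
    for ent_type, tags in per_type_tags.items():
        entities.extend((ent_type, s, e) for s, e in bio_to_spans(tags))
    return sorted(entities, key=lambda ent: (ent[1], -ent[2]))

# "the University of Chile hospital" (tokens 0..4): the ORG span can contain
# a nested LOC span because the two types are predicted by separate models.
per_type_tags = {
    "ORG": ["O", "B", "I", "I", "I"],
    "LOC": ["O", "O", "O", "B", "O"],
}
print(merge_predictions(per_type_tags))  # [('ORG', 1, 5), ('LOC', 3, 4)]
```

Nesting across entity types falls out of the architecture for free: no single tagger ever has to assign two labels to one token.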
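The abstract also mentions an open-source library of task-specific metrics for nested NER. As a hedged illustration of what a nesting-aware metric can look like, the sketch below computes recall restricted to gold entities that are nested inside another gold entity; this definition and the names (is_nested, nested_recall) are assumptions made for illustration, not the library's actual API.

```python
# Hedged illustration of a nesting-aware metric: recall restricted to gold
# entities whose span lies inside another gold entity's span. This is one
# plausible definition, assumed for illustration; the paper's library may
# define its nested NER metrics differently.

from typing import List, Tuple

Entity = Tuple[str, int, int]  # (type, start, end), end exclusive

def is_nested(ent: Entity, gold: List[Entity]) -> bool:
    """True if ent's span lies inside some other gold entity's span."""
    _, s, e = ent
    return any(other != ent and other[1] <= s and e <= other[2] for other in gold)

def nested_recall(gold: List[Entity], pred: List[Entity]) -> float:
    """Fraction of nested gold entities that appear among the predictions."""
    nested = [g for g in gold if is_nested(g, gold)]
    if not nested:
        return float("nan")  # undefined when the gold set has no nesting
    return sum(1 for g in nested if g in pred) / len(nested)

gold = [("ORG", 1, 5), ("LOC", 3, 4)]   # LOC is nested inside ORG
pred = [("ORG", 1, 5)]                  # the model found only the outer span
print(nested_recall(gold, pred))        # 0.0: the nested mention was missed
```

On this toy example, flat micro-F1 is about 0.67 (one true positive, one false negative) even though the model misses every nested mention, which illustrates the abstract's point that standard metrics can mask failures on nesting.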