Title: PolyLM: Learning about Polysemy through Language Modeling
Authors: Alan Ansell, Felipe Bravo-Marquez, Bernhard Pfahringer
Publication date: 2021
Abstract: To avoid the "meaning conflation deficiency" of word
embeddings, a number of models have aimed to embed individual word senses.
These methods at one time performed well on tasks such as word sense
induction (WSI), but they have since been overtaken by task-specific
techniques which exploit contextualized embeddings. However, sense
embeddings and contextualization need not be mutually exclusive. We
introduce PolyLM, a method which formulates the task of learning sense
embeddings as a language modeling problem, allowing contextualization
techniques to be applied. PolyLM is based on two underlying assumptions
about word senses: firstly, that the probability of a word occurring in a
given context is equal to the sum of the probabilities of its individual
senses occurring; and secondly, that for a given occurrence of a word, one
of its senses tends to be much more plausible in the context than the
others. We evaluate PolyLM on WSI, showing that it performs considerably
better than previous sense embedding techniques, and matches the current
state-of-the-art specialized WSI method despite having six times fewer
parameters. Code and pre-trained models are available at
https://github.com/AlanAnsell/PolyLM.
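
The abstract's two assumptions admit a compact formalization. As a minimal sketch (the notation below is illustrative, not taken from the paper), let P(w | c) be the probability of word w occurring in context c, and let s_1, ..., s_K be its candidate senses:

    P(w \mid c) = \sum_{k=1}^{K} P(s_k \mid c)        % a word's probability is the sum over its senses
    \max_{k} P(s_k \mid c) \approx P(w \mid c)        % for a given occurrence, one sense dominates

Under these assumptions, a model trained to predict words can instead predict sense tokens, since the sense probabilities must sum to the word's probability, while the dominance assumption keeps the induced per-occurrence sense distribution sharply peaked.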
Pages: 563-574
Conference: Conference of the European Chapter of the Association for Computational Linguistics (EACL)
Publisher: Association for Computational Linguistics