Title: Attention is Turing-Complete
Authors: Jorge Pérez, Pablo Barceló, Javier Marinkovic
Publication date: April 2021
Abstract: Alternatives to recurrent neural networks, in particular
architectures based on self-attention, are gaining momentum for processing
input sequences. In spite of their relevance, the computational properties
of such networks have not yet been fully explored. We study the computational
power of the Transformer, one of the most paradigmatic architectures
exemplifying self-attention. We show that the Transformer with
hard-attention is Turing complete exclusively based on its capacity to
compute and access internal dense representations of the data. Our study also
reveals some minimal sets of elements needed to obtain this completeness
result.
Pages: 1-35
Volume: 22
Issue: 75
Journal name: Journal of Machine Learning Research
Publisher: Microtome Publishing