| Title | Smaller Self-Indexes for Natural Language |
| Authors | Nieves Brisaboa, Gonzalo Navarro, Alberto Ordóñez Pereira |
| Publication date | 2012 |
| Abstract | Self-indexes for natural-language texts, where these are regarded as token (word or separator) sequences, achieve very attractive space and search time. However, they suffer from a space penalty due to their large vocabulary. In this paper we show that by replacing the Huffman encoding they implicitly use by the slightly weaker Hu-Tucker encoding, which respects the lexical order of the vocabulary, both their space and time are improved. |
| Pages | 372-378 |
| Conference name | International Symposium on String Processing and Information Retrieval |
| Publisher | Springer-Verlag (Berlin/Heidelberg, Germany) |

