Publications

Stats

View publication

Title Improved Compressed Indexes for Full-Text Document Retrieval
Authors Djamal Belazzougui, Gonzalo Navarro
Publication date 2011
Abstract We give new space/time tradeoffs for compressed indexes that answer document retrieval queries on general sequences. On a collection of D documents of total length n, current approaches require at least |CSA|+ O(n lg D / lg lg D) or 2|CSA|+o(n) bits of space, where CSA is a full-text index. Using monotone minimum perfect hash functions, we give new algorithms for document listing with frequencies and top-k document retrieval using just |CSA|+O(n lg lg lg D) bits. We also improve current solutions that use 2|CSA|+o(n) bits, and consider other problems such as colored range listing, top-k most important documents, and computing arbitrary frequencies.
Downloaded 11 times
Pages 286-297
Conference name International Symposium on String Processing and Information Retrieval
Publisher Springer-Verlag (Berlin/Heidelberg, Germany)
PDF View PDF
Reference URL View reference page