|
Title |
Top-k Document Retrieval in Optimal Time and Linear Space |
|
Authors |
Gonzalo Navarro, Yakov Nekrich |
|
Publication date |
2012 |
|
Abstract |
We describe a data structure that uses O(n)-word space and reports k most relevant documents that contain a query pattern P in optimal O(|P|+k) time. Our construction supports an ample set of important relevance measures, such as the frequency of P in a document and the minimal distance between two occurrences of P in a document. We show how to reduce the space of the data structure from O(n log n) bits to O(n(log s + log D + log log n)), where s is the alphabet size and D is the total number of documents. |
|
Downloaded |
11 times |
|
Pages |
1066-1078 |
|
Conference name |
ACM-SIAM Symposium on Discrete Algorithms |
|
Publisher |
SIAM Press (Philadelphia, USA) |
|
PDF |
|
|
Reference URL |
|