In-Place Longest Common Extensions

Nicola Prezza

Università di Udine

Given a text T, the Longest Common Extension (LCE) problem is to build a data structure to efficiently find the length of the longest common prefix between any two T's suffixes. In this talk, I will present a data structure that (i) answers LCE queries in logarithmic time, (ii) replaces the text by giving access to its characters in optimal time, and (iii) takes exactly the same size of the text. The result is achieved by replacing the text with a set of Karp-Rabin fingerprints of some of its prefixes, and then compressing them (to the text's size) using randomization. This is the first known example of in-place self-data structure: a structure replacing the input and using exactly its space. I will conclude by showing important consequences of this result (substantially improving the state of the art): (i) in-place construction of the suffix array and the longest common prefix array (LCP), and (ii) an in-place algorithm to lexicographically sort any subset of text suffixes.