Context-wise construction of the Burrows-Wheeler transform in compressed space

Nicola Prezza

Università di Udine

The Burrows-Wheeler Transform is a text permutation that has revolutionized the fields of pattern matching and text compression, bridging the gap existing between the two. Due to the importance of this technique, a space-and-time efficient BWT construction algorithm would be extremely valuable, permitting more efficient construction of compressed full-text indexes and boosting the efficiency of BWT-based text compressors. To date, the most efficient BWT construction algorithms show a clear space-time trade-off, being able either to build the BWT in compressed n Hk+o(n log sigma) space but superlinear O(nlog n/loglog n) time, or in linear O(n log sigma) space and linear O(n) time. In this seminar I will firstly give a brief introduction to (i) the theory of the Burrows-Wheeler transform, (ii) wavelet trees, and (iii) compressed dynamic strings. Finally, all these elements will be used to describe how we approached the problem of the BWT construction with a solution based on backward search, dynamic strings manipulation, and automata on words. Our solution is particularly efficient on DNA texts, where running time is linear and space compressed.