Fingerprints in compressed strings

Philip Bille, Inge Li Gørtz, Patrick Hagge Cording, Benjamin G Sach, Hjalte Wedel Vildhøj, Søren Vind

Research output: Contribution to journalArticle (Academic Journal)peer-review

13 Citations (Scopus)

Abstract

In this paper we show how to construct a data structure for a string S of size N compressed into a context-free grammar of size n that supports efficient Karp–Rabin fingerprint queries to any substring of S. That is, given indices i and j, the answer to a query is the fingerprint of the substring S [ i , j ] . We present the first O ( n ) space data structures that answer fingerprint queries without decompressing any characters. For Straight Line Programs (SLP) we get O ( log ⁡ N ) query time, and for Linear SLPs (an SLP derivative that captures LZ78 compression and its variations) we get O ( log ⁡ log ⁡ N ) query time. We extend the result to solve the longest common extension problem in query time O ( log ⁡ N log ⁡ ℓ ) and O ( log ⁡ ℓ log ⁡ log ⁡ ℓ + log ⁡ log ⁡ N ) for SLPs and Linear SLPs, respectively. Here, ℓ denotes the length of the LCE.
Original languageEnglish
Pages (from-to)171-180
Number of pages10
JournalJournal of Computer and System Sciences
Volume86
Early online date18 Jan 2017
DOIs
Publication statusPublished - Jun 2017

Keywords

  • Karp–Rabin fingerprints
  • Grammar compressed string
  • Longest common extensions

Fingerprint Dive into the research topics of 'Fingerprints in compressed strings'. Together they form a unique fingerprint.

Cite this