String Processing and Information Retrieval

R Clifford, Efremenko Klim, Porat Benny, Porat Ely, Rothschild Amir

Research output: Contribution to journalArticle (Academic Journal)peer-review

Abstract

We consider the well known problem of pattern matching under the Hamming distance. Previous approaches have shown how to count the number of mismatches efficiently, especially when a bound is known for the maximum Hamming distance. Our interest is different in that we wish collect a random sample of mismatches of fixed size at each position in the text. Given a pattern p of length m and a text t of length n, we show how to sample with high probability c mismatches where possible from every alignment of p and t in O((c + logn)(n + mlogm)logm) time. Further, we guarantee that the mismatches are sampled uniformly and can therefore be seen as representative of the types of mismatches that occur.
Translated title of the contributionString Processing and Information Retrieval
Original languageEnglish
Pages (from-to)99 - 108
Number of pages10
JournalMismatch Sampling
Volume280/2009
DOIs
Publication statusPublished - 2009

Fingerprint

Dive into the research topics of 'String Processing and Information Retrieval'. Together they form a unique fingerprint.

Cite this