Projects per year
Abstract
We revisit the complexity of one of the most basic problems in pattern matching. In the k-mismatch problem we must compute the Hamming distance between a pattern of length m and every m-length substring of a text of length n, as long as that Hamming distance is at most k. Where the Hamming distance is greater than k at some alignment of the pattern and text, we simply output “No”.
We study this problem in both the standard offline setting and also as a streaming problem. In the streaming k-mismatch problem the text arrives one symbol at a time and we must give an output before processing any future symbols. Our main results are as follows:
Our first result is a deterministic O(nk2 log k/m + n polylog m) time offline algorithm for k-mismatch on a text of length n. This is a factor of k improvement over the fastest previous result of this form from SODA 2000 [9, 10].
We then give a randomised and online algorithm which runs in the same time complexity but requires only O(k2 polylog m) space in total.
Next we give a randomised (1 + ∊)-approximation algorithm for the streaming k-mismatch problem which uses O(k2 polylog m/∊2) space and runs in O(polylog m/∊2) worst-case time per arriving symbol.
Finally we combine our new results to derive a randomised O(k2 polylog m) space algorithm for the streaming k-mismatch problem which runs in O(√k log k + polylog m) worst-case time per arriving symbol. This improves the best previous space complexity for streaming k-mismatch from FOCS 2009 [26] by a factor of k. We also improve the time complexity of this previous result by an even greater factor to match the fastest known offline algorithm (up to logarithmic factors).
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms |
| Publisher | Society for Industrial and Applied Mathematics |
| Pages | 2039-2052 |
| Number of pages | 14 |
| Volume | January 2016 |
| ISBN (Electronic) | 9781611974331 |
| DOIs | |
| Publication status | Published - 10 Jan 2016 |
Fingerprint
Dive into the research topics of 'The k-mismatch problem revisited'. Together they form a unique fingerprint.Projects
- 1 Finished
-
Next generation pattern matching
Clifford, R. (Principal Investigator)
1/01/13 → 14/01/19
Project: Research
Profiles
-
Dr Raphael Clifford
- School of Computer Science - Reader in Algorithm Design
- Intelligent Systems Laboratory
- Algorithms and Complexity
Person: Academic , Member, Group lead