Fast Approximate Point Set Matching for Information Retrieval

Raphaël Clifford, Benjamin Sach

Research output: Chapter in Book/Report/Conference proceedingConference Contribution (Conference Proceeding)

Abstract

We investigate randomised algorithms for subset matching with spatial point sets---given two sets of d-dimensional points: a data set T consisting of n points and a pattern P consisting of m points, find the largest match for a subset of the pattern in the data set. This problem is known to be 3-SUM hard and so unlikely to be solvable exactly in subquadratic time. We present an efficient bit-parallel O(nm) time algorithm and an O(n log(m)) time solution based on correlation calculations using fast Fourier transforms. Both methods are shown experimentally to give answers within a few percent of the exact solution and provide a considerable practical speedup over existing deterministic algorithms.
Translated title of the contributionFast Approximate Point Set Matching for Information Retrieval
Original languageEnglish
Title of host publication SOFSEM 2007: Theory and Practice of Computer Science
Subtitle of host publication33rd Conference on Current Trends in Theory and Practice of Computer Science, Harrachov, Czech Republic, January 20-26, 2007. Proceedings
PublisherSpringer Berlin Heidelberg
Pages212-223
Number of pages12
ISBN (Print)9783540695066
DOIs
Publication statusPublished - 2007

Publication series

NameLecture Notes in Computer Science
Volume4362
ISSN (Print)0302-9743

Fingerprint Dive into the research topics of 'Fast Approximate Point Set Matching for Information Retrieval'. Together they form a unique fingerprint.

Cite this