Abstract
This paper brings together two strands of machine learning of increasing importance: kernel methods and highly structured data. We propose a general method for constructing a kernel following the syntactic structure of the data, as defined by its type signature in a higher-order logic. Our main theoretical result is the positive definiteness of any kernel thus defined. We report encouraging experimental results on a range of real-world data sets. By converting our kernel to a distance pseudo-metric for 1-nearest neighbour, we were able to improve the best accuracy from the literature on the Diterpene data set by more than 10%.
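The step of converting a kernel into a distance for 1-nearest neighbour can be illustrated with the standard kernel-induced pseudo-metric, d(x, y) = sqrt(k(x, x) - 2 k(x, y) + k(y, y)). The sketch below is only an illustration under stated assumptions: the toy `set_kernel` (number of shared elements between two sets) stands in for the paper's type-driven kernel, which is not reproduced here, and the names `kernel_distance`, `nearest_neighbour`, and the two-example training set are hypothetical.

```python
import math

def set_kernel(a, b):
    """Toy kernel on finite sets (an assumption, not the paper's kernel):
    the number of shared elements, which is positive semi-definite."""
    return len(set(a) & set(b))

def kernel_distance(k, x, y):
    """Pseudo-metric induced by a kernel k:
    sqrt(k(x, x) - 2 * k(x, y) + k(y, y))."""
    squared = k(x, x) - 2 * k(x, y) + k(y, y)
    # Clamp at zero to guard against small negative values from rounding.
    return math.sqrt(max(squared, 0.0))

def nearest_neighbour(k, train, query):
    """1-nearest neighbour: return the label of the training example
    whose kernel-induced distance to the query is smallest."""
    closest = min(train, key=lambda pair: kernel_distance(k, pair[0], query))
    return closest[1]

# Hypothetical labelled examples, each a (set, label) pair.
train = [({"a", "b", "c"}, "pos"), ({"x", "y"}, "neg")]
prediction = nearest_neighbour(set_kernel, train, {"a", "b"})
```

Because the distance is computed purely from kernel evaluations, any positive definite kernel on structured objects could be substituted for `set_kernel` without changing the classifier code.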
Translated title of the contribution | Kernels and Distances for Structured Data |
---|---|
Original language | English |
Pages (from-to) | 205 - 232 |
Number of pages | 27 |
Journal | Machine Learning |
Volume | 57(3) |
Publication status | Published - Dec 2004 |