Morphology learning using tree of aligned suffix rules

Research output: Chapter in Book/Report/Conference proceeding › Conference Contribution (Conference Proceeding)

Abstract

Linguistic morphology concerns the structure of words, for instance how the plural form of a noun is obtained from its singular form, or how the past tense of a verb is generated from its infinitive. We describe an approach to function learning in morphology where, given a basic form of a word, the goal is to generate a grammatical wordform. This task concerns learning a regular grammar or transformation of the form stem+suffix1 -> stem+suffix2, where suffix1 and suffix2 are the suffixes of the two words in a word pair and stem is the part the two words share. Our approach is based on the tree of aligned suffix rules (TASR), in which suffix rules represent the left-hand and right-hand word suffixes of the input word pairs. The tree is built top-down, from general rules to specific rules, using suffix rule frequency and rule subsumption to decide where each rule goes in the tree. The tree is executed bottom-up, i.e., the most specific rule that fires is chosen. Compared with rule-induction approaches of similar functionality from the literature, the proposed method is faster, generates fewer rules, is relatively simple, achieves high performance and has a clearer linguistic interpretation. We also describe preliminary thoughts on inducing morphological rules that are close to a context-free mechanism.
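The stem+suffix1 -> stem+suffix2 format and the "most specific rule that fires" principle can be illustrated with a minimal sketch. This is not the TASR algorithm itself: the tree construction by frequency and subsumption is replaced here by a flat lookup that tries the longest matching left-hand suffix first, and all names (suffix_rule, aligned_rules, train, apply_rules, PAIRS) as well as the toy word pairs are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def suffix_rule(base, form):
    """Strip the longest common prefix (the stem); return (suffix1, suffix2)."""
    i = 0
    while i < min(len(base), len(form)) and base[i] == form[i]:
        i += 1
    return base[i:], form[i:]


def aligned_rules(base, form):
    """Yield the minimal suffix rule, then more specific versions obtained by
    extending both sides leftwards with the same stem characters."""
    s1, s2 = suffix_rule(base, form)
    stem = base[: len(base) - len(s1)]
    for k in range(len(stem) + 1):
        context = stem[len(stem) - k:]
        yield context + s1, context + s2


def train(pairs):
    """Count every aligned rule; keep the most frequent right-hand side
    for each left-hand suffix (a simplification, not the TASR tree)."""
    counts = defaultdict(Counter)
    for base, form in pairs:
        for lhs, rhs in aligned_rules(base, form):
            counts[lhs][rhs] += 1
    return {lhs: c.most_common(1)[0][0] for lhs, c in counts.items()}


def apply_rules(rules, word):
    """Fire the most specific rule: the longest left-hand suffix matching the word."""
    for start in range(len(word) + 1):
        lhs = word[start:]
        if lhs in rules:
            return word[:start] + rules[lhs]
    return word  # no rule fires at all


# Hypothetical toy training data: infinitive -> past tense.
PAIRS = [
    ("walk", "walked"), ("talk", "talked"),
    ("try", "tried"), ("cry", "cried"),
    ("play", "played"), ("stay", "stayed"),
]
RULES = train(PAIRS)

print(apply_rules(RULES, "fry"))   # fried  (rule ry -> ried)
print(apply_rules(RULES, "pray"))  # prayed (rule ay -> ayed)
print(apply_rules(RULES, "jump"))  # jumped (general rule "" -> ed)
```

In this sketch the empty left-hand side plays the role of the most general rule, and longer left-hand sides act as its more specific descendants; an actual tree would additionally record which rule subsumes which.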
Translated title of the contribution: Morphology learning using tree of aligned suffix rules
Original language: English
Title of host publication: ICML Workshop on Challenges and Applications of Grammar Induction
Publication status: Published - 2007

Bibliographical note

Conference Proceedings/Title of Journal: ICML Workshop on Challenges and Applications of Grammar Induction
Other identifier: 2000997
