NOVELTY - For each word, the relative frequency of occurrence of each distinct word sequence among the word sequences containing that word is calculated. A fuzzy set containing the corresponding fuzzy membership values, calculated from the relative frequencies, is generated. For each pair of words, a probability that the first word of the pair is a semantically suitable replacement for the second word is calculated using the fuzzy sets. USE - For determining the semantic similarity of words selected from documents, for retrieving information in an information system. ADVANTAGE - The more broadly based semantic relationships and the larger number of words represented in the matrix improve the information retrieval of search engines with respect to other information sets. DETAILED DESCRIPTION - INDEPENDENT CLAIMS are also included for the following: (1) an information retrieval apparatus; and (2) an information processing apparatus.
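The three steps in the abstract (relative frequencies of word sequences, fuzzy sets, pairwise replacement probability) can be illustrated with a minimal Python sketch. It is not the claimed method, and every specific choice in it is an assumption: a "word sequence" is taken to be the (previous word, next word) pair around a word, membership is the relative frequency divided by the word's highest relative frequency, and the replacement probability is the expected membership of the second word's contexts in the first word's fuzzy set. The function names are hypothetical.

```python
from collections import Counter, defaultdict


def context_frequencies(tokens):
    """Relative frequency of each (previous, next) context, per word.

    Assumption: a "word sequence containing the word" is modelled as the
    pair of words immediately before and after it.
    """
    counts = defaultdict(Counter)
    for prev, word, nxt in zip(tokens, tokens[1:], tokens[2:]):
        counts[word][(prev, nxt)] += 1
    freqs = {}
    for word, contexts in counts.items():
        total = sum(contexts.values())
        freqs[word] = {c: n / total for c, n in contexts.items()}
    return freqs


def fuzzy_set(rel_freq):
    """Map relative frequencies to memberships in [0, 1].

    Assumption: membership is the frequency scaled by the largest
    frequency, so the most common context has membership 1.0.
    """
    if not rel_freq:
        return {}
    peak = max(rel_freq.values())
    return {c: f / peak for c, f in rel_freq.items()}


def replacement_probability(first, second, freqs):
    """Probability that `first` can stand in for `second`.

    Assumption: computed as the expected membership, in the fuzzy set of
    `first`, of a context drawn from the distribution of `second`.
    """
    mu_first = fuzzy_set(freqs.get(first, {}))
    second_freqs = freqs.get(second, {})
    return sum(mu_first.get(c, 0.0) * p for c, p in second_freqs.items())


tokens = "the quick fox ran the slow fox ran the quick dog sat".split()
freqs = context_frequencies(tokens)
quick_for_slow = replacement_probability("quick", "slow", freqs)
slow_for_quick = replacement_probability("slow", "quick", freqs)
```

On this toy corpus the measure is asymmetric: `quick_for_slow` is 1.0 because "quick" occurs in every context where "slow" occurs, while `slow_for_quick` is 0.5 because "slow" covers only half of the contexts of "quick".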
|Patent number||WO2005041063-A1; EP1668541-A1; US2007016571-A1|
|Publication status||Published - 2005|