Abstract
We describe and evaluate a discriminative clustering approach for
content-based tag recommendation in social bookmarking systems. Our approach
uses a novel and efficient discriminative clustering method that groups posts
based on the textual contents of the posts. The method also generates a ranked
list of discriminating terms for each cluster. We apply the clustering method to
build two clustering models – one based on the tags assigned to posts and the
other based on the content terms of posts. Given a new posting, a ranked list of
tags and content terms is determined from the clustering models. The final tag
recommendation is based on these ranked lists. If the poster’s tagging history is
available, it is also used in the final tag recommendation. The approach
is evaluated on data from BibSonomy, a social bookmarking system. Prediction
results show that the tag-based clustering model is more accurate than the
term-based clustering model. Combining the predictions from both models is more
accurate than using either model alone. Significant improvement in recommendation is
obtained over the baseline method of recommending the most frequent tags for
all posts.
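
The baseline mentioned in the abstract, recommending the most frequent tags for every post, can be illustrated with a minimal sketch. The function name `most_frequent_tags`, the post layout (a dict with a `"tags"` list), the cutoff `k`, and the toy data are assumptions made for illustration only; they are not taken from the paper or from the BibSonomy data format.

```python
from collections import Counter


def most_frequent_tags(training_posts, k=5):
    """Return the k tags that occur most often across all training posts.

    The same list is recommended for every new post, regardless of content.
    The default k=5 is an illustrative assumption, not a value from the paper.
    """
    counts = Counter(tag for post in training_posts for tag in post["tags"])
    return [tag for tag, _ in counts.most_common(k)]


# Hypothetical toy data; real posts would come from the training set.
training_posts = [
    {"tags": ["python", "web"]},
    {"tags": ["python", "ml"]},
    {"tags": ["ml", "python", "clustering"]},
]

baseline = most_frequent_tags(training_posts, k=2)
print(baseline)
```

Because this baseline ignores both the content of a post and the poster's history, it serves as a simple reference point for judging content-based methods.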
Original language | English |
---|---|
Title of host publication | Proceedings of the ECML PKDD Discovery Challenge 2009 (DC09) |
Pages | 85-98 |
Publication status | Published - 7 Sept 2009 |