Abstract
In this paper we consider different representations for relational learning problems, with the aim of making ILP methods more applicable to real-world problems. In the past, ILP tended to concentrate on the term representation, with the flattened Datalog representation as a "poor man's version". There has been relatively little emphasis on database-oriented representations, using e.g. the relational data model or the Entity-Relationship model. On the other hand, much of the available data is stored in multi-relational databases. Even if we don't actually interface our ILP systems with a DBMS, we need to understand the database representation well enough to convert it to an ILP representation. Such conversions and relations between different representations are the subject of this paper. We consider four different representations: the Entity-Relationship model, the relational model, a flattened individual-centred representation based on the so-called ISP declarations that we use for our ILP systems Tertius and 1BC, and the term-based representation. We argue that the term-based representation does not have all the flexibility and expressiveness provided by the other representations. For instance, there is no way to deal with graphs without partly flattening the data (i.e., introducing identifiers). Furthermore, there is no easy way of switching to another individual without converting the data, let alone learning with different individual types. The flattened representation has clear advantages in these respects.
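The contrast between the term-based and the flattened representation can be illustrated with a minimal Python sketch. The toy data (a train with two cars), the predicate names (`has_car`, `car_length`, `has_load`, `load_shape`, `bond`) and the helpers `flatten` and `individuals` are all hypothetical and chosen for illustration only; they are not the ISP declaration syntax or the input format of Tertius or 1BC.

```python
# Hypothetical toy data: a train with two cars, each car carrying loads.
# Term-based representation: one nested term per individual, no identifiers.
train_term = ("train", [
    ("car", "short", [("load", "circle")]),
    ("car", "long", [("load", "square"), ("load", "square")]),
])


def flatten(term, train_id="t1"):
    """Convert the nested term into flat facts, introducing identifiers."""
    facts = []
    _, cars = term
    for i, (_, length, loads) in enumerate(cars, start=1):
        car_id = f"{train_id}_c{i}"
        facts.append(("has_car", train_id, car_id))
        facts.append(("car_length", car_id, length))
        for j, (_, shape) in enumerate(loads, start=1):
            load_id = f"{car_id}_l{j}"
            facts.append(("has_load", car_id, load_id))
            facts.append(("load_shape", load_id, shape))
    return facts


flat_facts = flatten(train_term)

# A graph with a cycle (here a ring of three atoms) cannot be written as a
# finite nested term without naming shared nodes, so identifiers are needed.
bond_facts = [("bond", "a1", "a2"), ("bond", "a2", "a3"), ("bond", "a3", "a1")]


def individuals(facts, predicate, position):
    """Collect the identifiers found at one argument position of a predicate."""
    return sorted({fact[position] for fact in facts if fact[0] == predicate})


# Switching the individual type only means reading a different identifier column.
trains = individuals(flat_facts, "has_car", 1)
cars = individuals(flat_facts, "has_car", 2)
```

In this sketch the nested term fixes the train as the individual, whereas the flat facts allow either trains or cars to be treated as individuals without converting the data again.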
Translated title of the contribution | A First-order Representation for Knowledge Discovery and Bayesian Classification on Relational Data |
---|---|
Original language | English |
Title of host publication | Data Mining, Decision Support, Meta-learning and ILP |
Subtitle of host publication | Forum for Practical Problem Presentation and Prospective Solutions (DDMI-2000), Workshop of 4th International Conference on Principles of Data Mining and Knowledge Discovery (PKDD-2000) |
Pages | 49 - 60 |
Number of pages | 11 |
Publication status | Published - 2000 |
Bibliographical note
Other page information: 49-60
Conference Proceedings/Title of Journal: PKDD2000 workshop on Data Mining, Decision Support, Meta-learning and ILP: Forum for Practical Problem Presentation and Prospective Solutions
Other identifier: 1000521