Refining causality: who copied from whom?

TM Snowsill, NRC Fyson, Tijl De Bie, Nello Cristianini

Research output: Chapter in Book/Report/Conference proceeding › Conference Contribution (Conference Proceeding)

29 Citations (Scopus)


Inferring causal networks behind observed data is an active area of research with wide applicability to areas such as epidemiology, microbiology and social science. In particular, recent research has focused on identifying how information propagates through the Internet. This research has so far used only temporal features of observations, and while reasonable results have been achieved, there is often further information which can be used. In this paper we show that additional features of the observed data can be used very effectively to improve an existing method. Our particular example is one of inferring an underlying network for how text is reused on the Internet, although the general approach is applicable to other inference methods and information sources. We develop a method to identify how a piece of text evolves as it moves through the underlying network, and how substring information can be used to narrow down where in the evolutionary process a particular observation at a node lies and hence to narrow down the number of ways the node could have acquired the infection. Text reuse is detected using a suffix tree, which is also used to identify the substring relations between chunks of reused text. We then use a modification of the NetCover method to infer the underlying network. Experimental results on both synthetic and real-life data show that using more information than just timing leads to greater accuracy in the inferred networks.
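The idea of using substring relations to reduce the number of possible sources for a node can be illustrated with a minimal sketch. It is not the paper's implementation: the `Observation` type, the `candidate_sources` function and the sample data are hypothetical, the rule that a copy must be a substring of its source's text is a simplifying assumption, and Python's plain `in` check stands in for the suffix tree the abstract mentions. The NetCover step is not shown.

```python
from typing import NamedTuple


class Observation(NamedTuple):
    node: str    # hypothetical site identifier
    time: float  # when the text appeared at the node
    text: str    # the chunk of reused text seen there


def candidate_sources(observations, use_substrings=True):
    """Map each node to the set of nodes it could have copied from."""
    candidates = {}
    for target in observations:
        sources = set()
        for other in observations:
            if other.node == target.node or other.time >= target.time:
                continue  # timing: a source must have published earlier
            if use_substrings and target.text not in other.text:
                continue  # assumed rule: a copy is a substring of its source
            sources.add(other.node)
        candidates[target.node] = sources
    return candidates


# Made-up observations, for illustration only.
observations = [
    Observation("A", 1.0, "the quick brown fox jumps over the lazy dog"),
    Observation("B", 2.0, "quick brown fox jumps"),
    Observation("C", 3.0, "over the lazy dog"),
    Observation("D", 4.0, "brown fox"),
]

timing_only = candidate_sources(observations, use_substrings=False)
with_text = candidate_sources(observations, use_substrings=True)
```

With timing alone, node D could have copied from any of A, B or C. Under the assumed substring rule, C is ruled out because its text does not contain D's, which leaves fewer candidate edges for the network inference step to choose between.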
Original language: English
Title of host publication: The 17th ACM SIGKDD conference on Knowledge Discovery and Data Mining (KDD)
Number of pages: 9
Publication status: Published - 2011

