As modern processors can execute instructions at far greater rates than these instructions can be retrieved from main memory, computer systems commonly include caches that reduce access times. While caches improve average execution times, they complicate the determination of the Worst-Case Execution Times that are crucial for Real-Time Systems. In this paper, an approach is presented that utilises Bayesian Networks to estimate the worst-case caching behaviour of programs more accurately. With this method, a Bayesian Network is learned from traces of program execution; the network allows both constructive and destructive dependencies between instructions to be determined and a joint distribution over the number of cache hits to be found. Attention is given to the question of how the accuracy of the network depends on both the number of observations used for learning and the cardinality of the set of potential parents considered by the learning algorithm.
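As an illustration only, and not the learning algorithm of this paper, the idea of learning a network from execution traces and deriving a distribution over the number of cache hits can be sketched as follows. The sketch assumes a fixed chain structure in which the hit/miss outcome of each instruction depends only on that of the preceding instruction, so no search over a set of potential parents is performed. The names `learn_chain` and `hit_count_distribution`, the smoothing parameter `alpha`, and the toy traces are hypothetical.

```python
from collections import defaultdict


def learn_chain(traces, alpha=1.0):
    """Estimate P(X_0) and P(X_i | X_{i-1}) from 0/1 (miss/hit) traces.

    Assumption: a chain-structured network; alpha > 0 is a Laplace
    smoothing count so that unseen parent values do not divide by zero.
    """
    n = len(traces[0])
    # prior[v] = P(X_0 = v)
    first = [alpha, alpha]
    for trace in traces:
        first[trace[0]] += 1
    prior = [c / sum(first) for c in first]
    # cpts[i - 1][p][v] = P(X_i = v | X_{i-1} = p)
    cpts = []
    for i in range(1, n):
        counts = [[alpha, alpha], [alpha, alpha]]
        for trace in traces:
            counts[trace[i - 1]][trace[i]] += 1
        cpts.append([[c / sum(row) for c in row] for row in counts])
    return prior, cpts


def hit_count_distribution(prior, cpts):
    """Return dist with dist[k] = P(total number of hits = k)."""
    n = len(cpts) + 1
    # state[(k, last)] = probability of k hits so far, last outcome 'last'
    state = {(0, 0): prior[0], (1, 1): prior[1]}
    for cpt in cpts:
        nxt = defaultdict(float)
        for (k, last), p in state.items():
            for v in (0, 1):
                nxt[(k + v, v)] += p * cpt[last][v]
        state = nxt
    dist = [0.0] * (n + 1)
    for (k, _), p in state.items():
        dist[k] += p
    return dist


# Toy traces (hypothetical): one row per observed run, one column per
# instruction, 1 = cache hit, 0 = cache miss.
traces = [[1, 1, 0], [1, 1, 1], [0, 0, 0], [1, 1, 0]]
prior, cpts = learn_chain(traces)
dist = hit_count_distribution(prior, cpts)
print(dist)
```

In such a sketch, the lower tail of `dist` (few hits) is the part relevant to worst-case behaviour, and the conditional tables in `cpts` show whether a hit on one instruction makes a hit on the next more or less likely.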