In this chapter, we outline a general modus operandi under which to perform intrusion detection at scale. The over-arching principle is this: A network monitoring tool has access to large stores of data on which it can learn 'normal' network behaviour. On the other hand, data on intrusions are relatively rare. This imbalance invites us to frame intrusion detection as an anomaly detection problem where, under the null hypothesis that there is no intrusion, the data follow a machine-learnt model of behaviour, and, under the alternative that there is some form of intrusion, certain anomalies in that model will be apparent. This approach to cyber security poses some important statistical challenges. One is simply modelling and doing inference with such large-scale and heterogeneous data. Another is performing anomaly detection when the null hypothesis comprises a complex model. Finally, a key problem is combining different anomalies through time and across the network.
|Title of host publication||Dynamic Networks and Cyber-Security|
|Publisher||World Scientific Publishing Co.|
|Number of pages||20|
|Publication status||Published - 1 May 2016|