Separating Wheat from Chaff
17 February 2014
An IMDEA Networks pre-doctorate researcher, Andra Lutu, has co-authored an article to be published at IEEE INFOCOM 2014 (33rd Annual IEEE International Conference on Computer Communications), a key conference in computer networking research.
The work, Separating Wheat from Chaff: Winnowing Unintended Prefixes using Machine Learning, addresses an operational problem in Internet inter-domain routing: the visibility, and hence the reachability, of a network address block as propagated by the Border Gateway Protocol (BGP) may not match the expectations of the Internet Service Provider (ISP) that originates the prefix. An ISP may intend a prefix to be globally visible, yet misconfiguration, policy disputes, and other BGP-related operational issues can turn it into a Limited Visibility Prefix (LVP), visible and reachable from only a portion of the Internet. Detecting and correcting such issues is difficult, and that is the task this publication sets out to address.
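To make the notion of a Limited Visibility Prefix concrete, the following is a minimal sketch of how visibility could be measured, assuming a set of monitors that each report the prefixes in their routing table. The monitor names, the toy tables, and the 0.95 cut-off are illustrative assumptions; the article does not state how the BGP Visibility Scanner defines or computes visibility.

```python
from typing import Dict, List, Set


def visibility(tables: Dict[str, Set[str]]) -> Dict[str, float]:
    """Return, for each prefix, the fraction of monitors that carry it."""
    counts: Dict[str, int] = {}
    for prefixes in tables.values():
        for prefix in prefixes:
            counts[prefix] = counts.get(prefix, 0) + 1
    total_monitors = len(tables)
    return {prefix: seen / total_monitors for prefix, seen in counts.items()}


def limited_visibility(tables: Dict[str, Set[str]],
                       threshold: float = 0.95) -> List[str]:
    """List prefixes seen by fewer than `threshold` of the monitors.

    The threshold is an assumed parameter chosen for illustration.
    """
    return sorted(p for p, v in visibility(tables).items() if v < threshold)


# Hypothetical routing tables, using documentation address ranges.
TABLES = {
    "monitor_a": {"192.0.2.0/24", "198.51.100.0/24"},
    "monitor_b": {"192.0.2.0/24", "198.51.100.0/24"},
    "monitor_c": {"192.0.2.0/24"},
    "monitor_d": {"192.0.2.0/24", "198.51.100.0/24"},
}

if __name__ == "__main__":
    print(visibility(TABLES))
    print(limited_visibility(TABLES))
```

In this toy data, one prefix is missing from a single monitor and is therefore flagged. Whether that limited visibility is intended by the origin network is the separate question the paper tackles.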
The authors have built a novel tool, the BGP Visibility Scanner, which they have made freely available to the Internet community. The machine-learning solution proposed in the study for detecting unintended LVPs relies on direct feedback from network operators who have already used the BGP Visibility Scanner to resolve real issues arising in day-to-day Internet operations.
This scientific advance is the result of collaborative work by Andra Lutu, PhD student at IMDEA Networks Institute and University Carlos III of Madrid (UC3M); Marcelo Bagnulo and Jesús Cid-Sueiro, both from UC3M; and Olaf Maennel of Loughborough University, UK. INFOCOM 2014 will take place in Toronto, Canada, from April 27th to May 2nd, 2014.
In this paper, we propose the use of prefix visibility at the interdomain level as an early symptom of anomalous events in the Internet. We focus on detecting anomalies which, despite their significant impact on the routing system, remain concealed from state-of-the-art tools. We design a machine learning system to winnow the prefixes with unintended limited visibility – symptomatic of anomalous events – from the prefixes with intended limited visibility – resulting from legitimate routing operations. We train a winnowing algorithm with ground-truth data on 20,000 operational limited visibility prefixes (LVPs) already classified by the operators of the origin networks. The ground truth was collected using the BGP Visibility Scanner, a tool we developed to provide operators with a multi-angle view on the efficacy of their routing policies. We build a dataset with the pre-classified prefixes and the features describing their visibility status dynamics. We further use this dataset to derive a boosted decision tree which winnows unintended LVPs with an accuracy of 95%.
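As an illustration of the general technique named in the abstract, here is a minimal sketch of boosting over decision trees: AdaBoost with one-level trees (decision stumps), written in plain Python. It is a stand-in, not the authors' classifier. The two features, the toy samples, and the labels (+1 for unintended, -1 for intended) are invented for the example and carry no empirical meaning.

```python
import math
from typing import List, Tuple

# A stump is (feature index, threshold, polarity): it predicts `polarity`
# when the feature value is <= threshold, and `-polarity` otherwise.
Stump = Tuple[int, float, int]


def stump_predict(stump: Stump, x: List[float]) -> int:
    feat, thresh, polarity = stump
    return polarity if x[feat] <= thresh else -polarity


def best_stump(X, y, w) -> Tuple[Stump, float]:
    """Exhaustively pick the stump with the lowest weighted error."""
    best, best_err = None, float("inf")
    for feat in range(len(X[0])):
        for thresh in sorted({x[feat] for x in X}):
            for polarity in (1, -1):
                stump = (feat, thresh, polarity)
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(stump, xi) != yi)
                if err < best_err:
                    best, best_err = stump, err
    return best, best_err


def train_adaboost(X, y, rounds: int = 5):
    """Return a list of (alpha, stump) pairs forming the boosted ensemble."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump, err = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified samples, then renormalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, x: List[float]) -> int:
    score = sum(alpha * stump_predict(s, x) for alpha, s in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical features: [fraction of monitors seeing the prefix,
#                         fraction of observed days it was announced].
X = [[0.10, 0.9], [0.20, 0.8], [0.15, 1.0],
     [0.70, 0.4], [0.85, 0.3], [0.60, 0.5]]
y = [-1, -1, -1, 1, 1, 1]  # invented labels: +1 unintended, -1 intended
MODEL = train_adaboost(X, y)

if __name__ == "__main__":
    print(predict(MODEL, [0.75, 0.35]))
```

In practice one would train on operator-labelled prefixes such as the ground truth described above and evaluate on held-out data. The toy set here is trivially separable and says nothing about the 95% accuracy reported in the paper.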