BitTorrent is the most successful Peer-to-Peer (P2P) application and is responsible for a major portion of Internet traffic. It has been widely studied using simulations, models and real measurements. Although simulations and modelling are easier to perform, they typically simplify the problems under analysis, and in the case of BitTorrent they are likely to miss some of the effects that occur in real swarms. Thus, in this thesis we rely on real measurements. In the first part of the thesis we summarise the measurement techniques used so far and use this summary as a basis for designing tools that allow us to perform different types of analysis at different levels of resolution. Using these tools we collect several large-scale datasets to study different aspects of BitTorrent, with a special focus on socio-economic aspects. Using our datasets, we first investigate the topology of real BitTorrent swarms and how traffic is actually exchanged among peers. Our analysis shows that real BitTorrent swarms are less resilient than the corresponding random graphs.
We also observe that ISP policies, locality-aware clients and network events (e.g., network congestion) lead to a locality-biased composition of neighbourhoods in the swarms. This means that a peer has more neighbours from its local provider than would be expected from a purely random neighbour selection process. These results are of interest to companies that use BitTorrent for daily operations as well as to ISPs that carry BitTorrent traffic. In the next part of the thesis we look at BitTorrent from the perspective of the content and the content publishers in major BitTorrent portals. We focus on the factors that seem to drive the popularity of BitTorrent and, as a result, could affect its associated Internet traffic. We show that a small fraction of publishers (around 100 users) is responsible for more than two-thirds of the published content. These publishers can be divided into two groups: (i) profit-driven publishers and (ii) fake publishers. The former group leverages the copyrighted content (typically very popular) that it publishes on BitTorrent portals to attract content consumers to its web sites for financial gain. Removing this group may have a significant impact on the popularity of BitTorrent portals and, as a result, may affect a large portion of the Internet traffic associated with BitTorrent.
The latter group is responsible for fake content, which is mostly linked to malicious activity and poses a serious threat to the BitTorrent ecosystem and to the Internet in general. To mitigate this threat, in the last part of the thesis we present a new tool named TorrentGuard for the early detection of fake content, which could help to significantly reduce the number of computer infections and scams suffered by BitTorrent users. This tool is available through a web portal and as a plugin for Vuze, a popular BitTorrent client. Finally, we present MYPROBE, a web portal that allows users to query our database and gather different pieces of information about BitTorrent content publishers.
Who is Michal Kryczka?
I received my Master's degree in Computer Science in 2008 from the Technical University of Lodz (Poland) and in 2009 from University Carlos III of Madrid (Spain). Since 2008 I have been a Research Assistant (financed by an FPU scholarship from the Spanish Ministry of Education) at IMDEA Networks, and a PhD candidate at University Carlos III of Madrid.
PhD Thesis Advisors: Prof. Dr. Arturo Azcorra, Institute IMDEA Networks & University Carlos III of Madrid; Dr. Rubén Cuevas, University Carlos III of Madrid, Spain