We present a Federated Learning (FL) based solution for building a distributed classifier that detects URLs containing sensitive content, i.e., content related to categories such as health, political beliefs, and sexual orientation. Although such a classifier addresses the limitations of previous offline/centralised classifiers, it is still vulnerable to poisoning attacks from malicious users who may attempt to reduce accuracy for benign users by disseminating faulty model updates. To guard against this, we develop a robust aggregation scheme based on subjective logic and residual-based attack detection. Using a combination of theoretical analysis, trace-driven simulation, and experimental validation with a prototype and real users, we show that our classifier can detect sensitive content with high accuracy, learn new labels quickly, and remain robust against poisoning attacks from malicious users as well as imperfect input from non-malicious ones.
Tianyue Chu has been a PhD student in the Data Transparency Group (DTG) at IMDEA Networks since October 2020. She received her Bachelor’s in Applied Mathematics and her Master’s degree in Statistics from Shenzhen University. During her master’s thesis, she collaborated with the Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences (CAS), where she focused on developing nonparametric statistical methods and machine learning algorithms for a wearable Intelligent Health Monitoring system. Her current research interests include machine learning, federated learning, and graph neural networks.
This event will be held in English.