Enabling the open circulation of labeled training data in industrial environments could improve many machine-learning-based processes that, due to privacy concerns, currently rely only on internally gathered data. To reach this scenario, the data exchanged among parties must be properly anonymized to prevent confidentiality and privacy issues. The same problem also arises in another context, this time for unlabeled data at inference time: Machine-Learning-as-a-Service (MLaaS).
Usually, in an MLaaS scenario, the client has no choice but to trust the data gatherer to use the data correctly. Ideally, however, the client would want to submit for inference a dataset in which every bit of extra information (i.e., information not strictly needed for the intended task) is either masked or removed. To address these problems, we are researching and designing deep learning systems that create anonymized encodings of the data while keeping accuracy on the target tasks high.
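To make the idea of anonymized encodings more concrete, here is a minimal illustrative sketch, not the speaker's actual method: an encoder is trained jointly with a task classifier (utility) and against an adversary that tries to recover a sensitive attribute from the same encoding (privacy). All names, dimensions, and the synthetic data below are hypothetical, chosen only to show one common way such a utility-privacy trade-off can be optimized.

```python
# Hypothetical adversarial-anonymization sketch (PyTorch), for illustration only.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic data: x is the raw input, y_task the label the service needs,
# y_sensitive an attribute the encoding should hide.
n, x_dim, z_dim = 512, 16, 8
x = torch.randn(n, x_dim)
y_task = (x[:, 0] > 0).long()
y_sensitive = (x[:, 1] > 0).long()

encoder = nn.Sequential(nn.Linear(x_dim, 32), nn.ReLU(), nn.Linear(32, z_dim))
task_head = nn.Linear(z_dim, 2)   # predicts the target task from the encoding
adversary = nn.Linear(z_dim, 2)   # tries to recover the sensitive attribute

opt_main = torch.optim.Adam(list(encoder.parameters()) + list(task_head.parameters()), lr=1e-3)
opt_adv = torch.optim.Adam(adversary.parameters(), lr=1e-3)
ce = nn.CrossEntropyLoss()
lam = 1.0  # weight of the privacy term: tunes the utility-privacy trade-off

for step in range(200):
    # 1) Train the adversary on the current (frozen) encodings.
    z = encoder(x).detach()
    adv_loss = ce(adversary(z), y_sensitive)
    opt_adv.zero_grad(); adv_loss.backward(); opt_adv.step()

    # 2) Train encoder + task head: keep task accuracy high while
    #    maximizing the adversary's error on the sensitive attribute.
    z = encoder(x)
    utility_loss = ce(task_head(z), y_task)
    privacy_loss = -ce(adversary(z), y_sensitive)
    loss = utility_loss + lam * privacy_loss
    opt_main.zero_grad(); loss.backward(); opt_main.step()
```

In a setup like this, the client would send only the encoder output to the service, and the weight `lam` (a hypothetical knob) controls how much task accuracy is traded for hiding the sensitive attribute.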
Vittorio Prodomo obtained his BSc (2016) and MSc (2019) degrees in Computer Engineering at the University of Naples Federico II. As part of his Master's thesis work, he did a 6-month internship at NEC Labs Europe GmbH in Heidelberg, Germany. The thesis focused on the use of NLP approaches, such as Word2Vec, for malicious domain detection.
In early 2020, he began his PhD in Telematic Engineering at Carlos III University of Madrid. The initial focus of his PhD research was the use of Machine Learning for smart resource allocation in Mobile Networks, but he currently works on a new topic: Privacy in Machine Learning, and more specifically the inherent utility-privacy trade-off in data anonymization approaches. His main interests are Machine Learning, Deep Learning, and Data Analysis.
This event will be held in Spanish.
You can also follow the seminar online: https://zoom.us/j/99448700647?pwd=VnhoVjJOL0RFU0RQZ3N5cjd6Y2Z6dz09