SGD on Distributed Systems
Ali Ramezani-Kebrya, Postdoctoral Fellow, University of Toronto and Vector Institute, Canada
External Presentation (External Speaker)
Deep learning is booming thanks to enormous datasets, very large models, and the widespread availability of supercomputing via GPUs. The key algorithm underlying the deep learning revolution is stochastic gradient descent (SGD), which fits large models by exploiting GPU architectures to deliver excellent computational efficiency. However, SGD was not designed to be distributed. Fitting models in parallel requires communication-efficient variants of SGD to realize the promise of deep learning. Furthermore, implementations of SGD on large-scale and distributed systems create new vulnerabilities, which can be identified and exploited by one or more adversarial agents. We will present efficient gradient compression and robust aggregation schemes to reduce communication costs and enhance Byzantine resilience. Our schemes can be used in federated learning settings, where a deep model is trained on data distributed among multiple owners without exposing that data.
About Ali Ramezani-Kebrya
Ali Ramezani-Kebrya is a Postdoctoral Fellow at the University of Toronto and Vector Institute. He works in machine learning, studying the communication, optimization, privacy/security, generalization, and stability aspects of machine learning algorithms. He will join EPFL as a Senior Scientific Collaborator in March 2021. Ali received his Ph.D. from the University of Toronto, where his research focused on developing theory and practices for next-generation large-scale distributed and heterogeneous networks. He is a recipient of the Natural Sciences and Engineering Research Council of Canada Postdoctoral Fellowship, which is equivalent to an NSF fellowship in the US.
This event will be conducted in English.