Machine learning, and deep learning in particular, is truly computationally demanding. This becomes even more critical in real-time settings, where latency is a determining factor. When working with mmWave, the challenge is greater still: throughput is high and latency budgets are tight, placing even heavier demands on the inference environment. FPGAs are an ideal ally in such environments thanks to the performance they offer, but they come with very limited resources. Designing ML models that operate in constrained ecosystems and meet demanding latency requirements is a challenge that calls for model simplification techniques, with quantization standing out as a key approach. The presentation will showcase ongoing work aimed at evolving an ML-based receiver to operate in real time on an FPGA, using quantization as the foundation for model reduction and the stack proposed by Xilinx/AMD for HLS compilation.
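As a minimal sketch of the kind of quantization the abstract refers to, the snippet below uses Brevitas, the PyTorch quantization-aware training library in the Xilinx/AMD FPGA flow (Brevitas → FINN → HLS). The model, layer sizes, and bit widths are illustrative assumptions, not the speaker's actual receiver design.

```python
# Hypothetical example: a tiny fully connected model trained with
# low-bit weights and activations so it fits an FPGA resource budget.
import torch
import torch.nn as nn
from brevitas.nn import QuantIdentity, QuantLinear, QuantReLU

class TinyQuantReceiver(nn.Module):
    # n_in/n_hidden/n_out and bit_width=4 are assumed values for illustration.
    def __init__(self, n_in=64, n_hidden=32, n_out=16, bit_width=4):
        super().__init__()
        # Quantize the input activations so the whole datapath is fixed-point.
        self.inp = QuantIdentity(bit_width=bit_width, return_quant_tensor=True)
        # Low-bit weights shrink the model toward the FPGA's LUT/BRAM budget.
        self.fc1 = QuantLinear(n_in, n_hidden, bias=True, weight_bit_width=bit_width)
        self.act = QuantReLU(bit_width=bit_width, return_quant_tensor=True)
        self.fc2 = QuantLinear(n_hidden, n_out, bias=True, weight_bit_width=bit_width)

    def forward(self, x):
        return self.fc2(self.act(self.fc1(self.inp(x))))

model = TinyQuantReceiver()
y = model(torch.randn(1, 64))  # trained as usual; quantization is simulated in the forward pass
```

The idea is that quantization is applied during training rather than after it, so accuracy can be recovered at low bit widths before the network is compiled to hardware.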
Pablo Saucedo is a first-year PhD student in the Wireless Networking Group at IMDEA Networks Institute, supervised by Joerg Widmer. His research focuses on the development and deployment of Machine Learning models for mmWave applications, mainly sensing and scene understanding, with the bandwidth, latency, and performance implications this entails.
This event will be conducted in English.