We consider the problem of outlier detection in functional data analysis (FDA). Observations in the FDA context are curves (or functions), so outlying functions can take different forms. Consequently, it is desirable not only to identify an outlying curve but also to understand why such a curve is an outlier. To this end, we first discuss the types of outliers commonly recognized in the FDA literature. Then, we present the fdaoutlier R package, which provides implementations of some state-of-the-art outlier detection methods for functional data. We then propose the Fast-MUOD and Semifast-MUOD methods for detecting and classifying outliers in functional data. These methods compute, for each curve, three indices that measure its outlyingness in terms of shape, magnitude, and amplitude relative to the data mass. The classical boxplot is used to separate the indices of the outlying curves from those of the typical curves. Finally, we provide some theoretical properties of the Fast-MUOD indices and propose some techniques for extending Fast-MUOD to outlier detection for multivariate functional data. Comparisons with other outlier detection methods for functional data on various simulated datasets show superior or comparable outlier detection accuracy for the proposed methods. One of the methods is especially well suited to big and dense functional datasets, requiring very little computational time compared to other methods. We apply the proposed methods to weather, population growth, and video data.
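As a rough illustration of the index-then-boxplot idea described in the abstract, the Python sketch below computes three indices per curve and flags large values with the classical boxplot rule. It is not the fdaoutlier implementation. The index definitions are assumptions made for illustration: each curve is compared with the pointwise median curve, with shape taken as one minus the correlation, amplitude as the deviation of the least-squares slope from one, and magnitude as the absolute intercept. The function names outlyingness_indices and boxplot_flags are hypothetical.

```python
import numpy as np


def outlyingness_indices(curves):
    """Return shape, amplitude and magnitude indices for each curve.

    curves: array of shape (n_curves, n_points), all curves observed on
    the same grid. The reference curve is the pointwise median (an
    assumption of this sketch). Constant curves are not handled, since
    their correlation with the reference is undefined.
    """
    curves = np.asarray(curves, dtype=float)
    n_points = curves.shape[1]

    ref = np.median(curves, axis=0)                 # reference curve
    ref_centered = ref - ref.mean()
    cur_centered = curves - curves.mean(axis=1, keepdims=True)

    # Sample covariance of each curve with the reference, and variances.
    cov = cur_centered @ ref_centered / (n_points - 1)
    var_ref = ref_centered @ ref_centered / (n_points - 1)
    var_cur = (cur_centered ** 2).sum(axis=1) / (n_points - 1)

    corr = cov / np.sqrt(var_cur * var_ref)         # similarity in shape
    slope = cov / var_ref                           # relative amplitude
    intercept = curves.mean(axis=1) - slope * ref.mean()  # vertical shift

    return {
        "shape": np.abs(1.0 - corr),
        "amplitude": np.abs(slope - 1.0),
        "magnitude": np.abs(intercept),
    }


def boxplot_flags(index, factor=1.5):
    """Flag values above the upper whisker Q3 + factor * IQR."""
    q1, q3 = np.percentile(index, [25, 75])
    return index > q3 + factor * (q3 - q1)
```

Applying boxplot_flags to each index separately gives both the detected outliers and a label for the kind of outlyingness (shape, amplitude or magnitude), which mirrors the classification goal stated in the abstract.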
About Oluwasegun Ojo
Oluwasegun Ojo is currently a Research Assistant in the Global Computing Group at IMDEA Networks Institute and a Mathematical Engineering PhD student at Carlos III University of Madrid. He completed a master’s degree in mathematical sciences at the African Institute for Mathematical Sciences (AIMS), Cameroon. Prior to that, he obtained his bachelor’s degree in statistics in 2014 from the Federal University of Technology, Akure, Nigeria. His current research interest is in the analysis of functional data.
This event will be conducted in English