A functional data analysis of disease outbreak data (Ana-Maria Staicu)

Prerequisites: Linear algebra, basic statistics.

Outline: Since the onset of the COVID-19 pandemic, curves depicting the number of infected members of a population have become a familiar sight in everyday life. Functional data analysis (FDA) (1) is a field of statistics that models the distribution of functions, particularly curves. FDA methods are well suited to extracting low-dimensional summaries of functional data. These summaries can be used for visualization, to cluster similar curves, and to determine the factors that contribute to the curves’ shape and magnitude. In this project, we will apply FDA methods to freely available daily county-level COVID-19 data.
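
To make the idea of a low-dimensional summary concrete, here is a minimal sketch of one common FDA tool, functional principal component analysis, applied to curves observed on a shared, equally spaced grid. It is an illustration only, not the project's prescribed method: the function name `fpca`, the synthetic "county" curves, and all parameter values are assumptions made for the example, and no real COVID-19 data are used. On an equally spaced grid, the right singular vectors of the centered data matrix approximate the principal component curves up to a scaling constant.

```python
import numpy as np


def fpca(curves, n_components=2):
    """Functional PCA for curves sampled on a common, equally spaced grid.

    curves: array of shape (n_curves, n_timepoints).
    Returns the mean curve, the leading component curves, the per-curve
    scores, and the fraction of variance explained by each component.
    """
    curves = np.asarray(curves, dtype=float)
    mean_curve = curves.mean(axis=0)
    centered = curves - mean_curve
    # Rows of vt are orthonormal discretized component curves.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    # Each curve is summarized by its coordinates along the components.
    scores = centered @ components.T
    explained = (s ** 2 / np.sum(s ** 2))[:n_components]
    return mean_curve, components, scores, explained


# Synthetic stand-in for county-level outbreak curves (hypothetical data).
rng = np.random.default_rng(0)
days = np.linspace(0.0, 1.0, 100)
n_counties = 50
bump = np.exp(-((days - 0.5) ** 2) / 0.02)   # a single outbreak peak
wave = np.sin(2 * np.pi * days)              # an early-versus-late contrast
magnitude = rng.normal(0.0, 2.0, n_counties)
timing = rng.normal(0.0, 0.5, n_counties)
curves = (
    5.0 * bump
    + np.outer(magnitude, bump)
    + np.outer(timing, wave)
    + rng.normal(0.0, 0.05, (n_counties, days.size))
)

mean_curve, components, scores, explained = fpca(curves, n_components=2)
print("fraction of variance explained:", explained)
```

Each curve of 100 values is reduced to two scores, which could then be plotted, clustered, or related to covariates. Real daily counts are noisy and irregular, so smoothing the curves first would usually be advisable.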

Research objectives: Gain new insights into the factors that determine the spread of the disease, contributing to the development of mitigation strategies. Analyze the effects of demographic, socioeconomic, and environmental variables on the shape and magnitude of the disease outbreak curves, and investigate the effects of government interventions, such as school closings and shelter-in-place orders, as well as citizen mobility data provided by Google.

Outcomes: Estimates of intervention effects and improved understanding of the conditions that are indicative of a disease hot spot.

  (1) Ramsay J, Silverman B. Applied functional data analysis: methods and case studies. New York, NY: Springer; 2002.