Isolating causes of improved generalization in deep learning (Ryan Murray)

Prerequisites: Probability, basic programming, and algorithms

Outline: A central concern in statistical machine learning is avoiding overfitting so that learned models generalize to newly observed data (1). In deep learning, many methods have been proposed toward this goal, including dropout, early stopping, regularization, and batch-based descent methods. However, the exact efficacy and effect of each of these methods are not fully understood. This project seeks to identify theoretical and practical aspects of these generalization techniques, with a particular focus on isolating their individual effects.
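
For concreteness, the sketch below (illustrative only, not the project's code) shows how these four methods typically appear together in a small PyTorch training loop; the synthetic data, network size, and hyperparameters are placeholder assumptions.

```python
# Illustrative sketch: dropout, weight regularization, mini-batch descent,
# and early stopping combined in one minimal PyTorch regression setup.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
# Synthetic 1-D regression data: y = sin(x) + noise, randomly split train/val.
x = torch.linspace(-3, 3, 400).unsqueeze(1)
y = torch.sin(x) + 0.3 * torch.randn_like(x)
perm = torch.randperm(400)
train_idx, val_idx = perm[:300], perm[300:]
train_ds = TensorDataset(x[train_idx], y[train_idx])
val_x, val_y = x[val_idx], y[val_idx]

model = nn.Sequential(
    nn.Linear(1, 64), nn.ReLU(),
    nn.Dropout(p=0.2),                      # dropout
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1),
)
# weight_decay adds an L2 penalty on the weights (weight regularization);
# a DataLoader with a small batch_size gives batch-based (mini-batch) descent.
opt = torch.optim.SGD(model.parameters(), lr=1e-2, weight_decay=1e-4)
loader = DataLoader(train_ds, batch_size=32, shuffle=True)
loss_fn = nn.MSELoss()

best_val, patience, bad_epochs = float("inf"), 10, 0
for epoch in range(500):
    model.train()                           # enables dropout during training
    for xb, yb in loader:
        opt.zero_grad()
        loss_fn(model(xb), yb).backward()
        opt.step()
    model.eval()                            # disables dropout for evaluation
    with torch.no_grad():
        val_loss = loss_fn(model(val_x), val_y).item()
    # Early stopping: halt once validation loss stops improving.
    if val_loss < best_val - 1e-4:
        best_val, bad_epochs = val_loss, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break
print(f"stopped after epoch {epoch}, best validation MSE {best_val:.4f}")
```

Because all four mechanisms act at once in a loop like this, their effects on generalization are confounded, which is precisely the difficulty this project aims to untangle.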

Research objectives: To investigate the individual effects of dropout, early stopping, weight regularizers, and batch-based descent on a simplified non-parametric statistical problem (previously proposed by Murray to study dropout) meant to tractably model the fitting power of deep neural networks. The investigation will begin with computational experiments, and then will seek to identify mathematical models that describe the observed computational results.
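
One natural way to isolate individual effects computationally is a full factorial ablation over the four techniques. The sketch below is a hypothetical outline of such a design; `fit_and_score` is a placeholder name, not defined in this proposal, standing in for fitting the simplified non-parametric problem with the selected techniques enabled.

```python
# Hypothetical sketch of a factorial ablation: toggle each technique
# independently and record the generalization gap for each configuration.
from itertools import product
from typing import Dict, Tuple

def fit_and_score(dropout: bool, early_stopping: bool,
                  weight_decay: bool, mini_batches: bool) -> Tuple[float, float]:
    """Fit the simplified model with the selected techniques enabled and
    return (train_error, test_error). Placeholder only; the actual
    experiment is part of the proposed project."""
    raise NotImplementedError("replace with the actual experiment")

gaps: Dict[Tuple[bool, bool, bool, bool], float] = {}
for flags in product([False, True], repeat=4):
    train_err, test_err = fit_and_score(*flags)
    # The test-train gap isolates each technique's marginal effect
    # on generalization across all 2**4 on/off combinations.
    gaps[flags] = test_err - train_err
```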

Outcomes: Computational examples, in the form of shareable code, that identify the distinct effects of these methods on generalization. Mathematical models explaining these results will also be studied.

  1. Murray RW, Fokoué E. Dropout Fails to Regularize Nonparametric Learners. J Stat Theory Pract. 2021;15:1–20.