Statistical Physics of Learning

Jan 1, 2021

A two-layer neural network with N inputs, K hidden neurons and activation function g(.).

In the deep learning era, many great results are achieved by empirical study. New deep learning methods for typical tasks in, for instance, computer vision, natural language processing and sound processing are tested on publicly available benchmark datasets that have become recognised in their respective fields. This facilitates the comparison of different approaches on these tasks. However, to make further fundamental progress, the practical successes of deep learning should also be supported by more theoretical results.

One theoretical approach that has seen renewed interest in recent years is the statistical physics of learning. As the name suggests, it uses statistical physics theory to analyse machine learning scenarios. Statistical physics is concerned with modelling the overall properties (macroscopics) of systems that consist of many microscopic entities or particles. A neural network is a machine learning system that consists of many weights (microscopics) that connect neurons, and a configuration of weights realises a certain model with specific properties (macroscopics). The hallmark of statistical physics of learning approaches is average-case results for the main descriptive quantities in machine learning scenarios, i.e. typical values of the training error, the generalisation/test error and other parameters that summarise the state of the model, known as order parameters.
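As a rough illustration of the step from microscopics to macroscopics, the sketch below assumes a student-teacher setup that is not spelled out on this page: a two-layer network of the kind shown in the figure learns a rule defined by a second network of the same shape. The function names, the 1/sqrt(N) scaling, the fixed unit hidden-to-output weights, the erf activation and the Monte Carlo estimate are all illustrative assumptions, not the method of the publications listed below. The overlap matrices play the role of order parameters: a few numbers that summarise K x N weights.

```python
import math
import random


def g(z):
    """Sigmoidal activation chosen for illustration (assumption)."""
    return math.erf(z / math.sqrt(2.0))


def network_output(weights, x, activation):
    """Two-layer network: sum of the activation over K hidden units.

    Assumes unit hidden-to-output weights and 1/sqrt(N) input scaling.
    """
    n = len(x)
    total = 0.0
    for w in weights:
        pre_activation = sum(wi * xi for wi, xi in zip(w, x)) / math.sqrt(n)
        total += activation(pre_activation)
    return total


def overlaps(a, b):
    """Matrix of normalised dot products a_i . b_j / N (order parameters)."""
    n = len(a[0])
    return [[sum(p * q for p, q in zip(u, v)) / n for v in b] for u in a]


def random_weights(k, n, rng):
    """K weight vectors of dimension N with standard normal entries."""
    return [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(k)]


def estimate_generalisation_error(student, teacher, activation, rng, samples=500):
    """Monte Carlo estimate of half the mean squared student-teacher deviation."""
    n = len(student[0])
    total = 0.0
    for _ in range(samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        diff = network_output(student, x, activation) - network_output(teacher, x, activation)
        total += 0.5 * diff * diff
    return total / samples


rng = random.Random(0)
N, K = 50, 2  # input dimension and number of hidden units
teacher = random_weights(K, N, rng)
student = random_weights(K, N, rng)

Q = overlaps(student, student)  # student-student overlaps
R = overlaps(student, teacher)  # student-teacher overlaps
eps_g = estimate_generalisation_error(student, teacher, g, rng)
print("Q =", Q)
print("R =", R)
print("estimated generalisation error =", eps_g)
```

In this sketch the 2 x 50 student weights are the microscopic description, while Q, R and the estimated error are macroscopic summaries of the same state.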

We are using the statistical physics of learning to study recent phenomena in machine learning and also to study potential new approaches for machine learning practice. We are studying the typical learning behaviour of models with modern activation functions (ReLU, Leaky ReLU, GELU) and of models in dynamic environments.
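For reference, the three activation functions named above can be written as the following minimal sketch. The function names and the Leaky ReLU slope of 0.01 are assumptions chosen for illustration. GELU is written in its exact form, x times the standard normal cumulative distribution function, rather than the tanh approximation.

```python
import math


def relu(x):
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return max(0.0, x)


def leaky_relu(x, slope=0.01):
    """Leaky ReLU: small linear slope for negative inputs (slope is an assumption)."""
    return x if x > 0 else slope * x


def gelu(x):
    """GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


for z in (-2.0, 0.0, 2.0):
    print(z, relu(z), leaky_relu(z), gelu(z))
```

Unlike ReLU and Leaky ReLU, GELU is smooth at zero and slightly negative for moderately negative inputs.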

Deep Learning, Statistical Physics

Michiel Straat

Postdoctoral Research Group Leader “Lifelong Machine Learning for Physical Systems”

My research interests include Machine Learning for Physical Systems and the theory of Neural Networks.

Publications

Layered Neural Networks with GELU Activation, a Statistical Mechanics Analysis

In this work we show theoretically that using GELU activation in neural networks induces continuous phase transitions and we analyse …

Frederieke Richert, Michiel Straat, Elisa Oostwal, Michael Biehl


Off-line Learning Analysis for Soft Committee Machines with GELU Activation

In this work we show that the GELU activation function in two-layer neural networks causes a continuous phase transition, independent …

Frederieke Richert, Michiel Straat, Elisa Oostwal, Michael Biehl


Supervised Learning in the Presence of Concept Drift: A modelling framework

A statistical physics based modelling framework is developed in which we study standard training algorithms (SGD, LVQ1) under concept …

Michiel Straat, Fthi Abadi, Zhuoyun Kan, Christina Göpfert, Barbara Hammer, Michael Biehl


Hidden Unit Specialization in Layered Neural Networks: ReLU vs. Sigmoidal Activation

Our systematic comparison of networks with ReLU and sigmoidal units in model situations reveals surprising differences in their …

Elisa Oostwal, Michiel Straat, Michael Biehl


On-line learning dynamics of ReLU neural networks using statistical physics techniques

We introduce exact macroscopic on-line learning dynamics of two-layer neural networks with ReLU units in the form of a system of …

Michiel Straat, Michael Biehl


Statistical Mechanics of On-Line Learning Under Concept Drift

We introduce a modeling framework for the investigation of on-line machine learning processes in non-stationary environments. We …

Michiel Straat, Fthi Abadi, Christina Göpfert, Barbara Hammer, Michael Biehl


Talks

Statistical Physics of Learning

In this talk I discuss the main principles behind the statistical physics of learning.

Jun 27, 2023, Group seminar Machine Learning, Bielefeld
