Hermite Polynomials Facilitating On-line Learning Analysis of Layered Neural Networks with Arbitrary Activation Function

Abstract

Following standard statistical-mechanics methods, we analyze training by online stochastic gradient descent of two-layer neural networks in a student–teacher scenario. We focus on understanding the role that different activation functions play in these learning scenarios, in particular the effect of mismatches between the student and teacher activations. By expanding the activation functions in the Hermite polynomial basis, we can approximate the relevant integrals effectively and with far less computational effort than naive numerical integration. Moreover, we extend the framework to scenarios with concept drift and weight decay, again for arbitrary activation functions. These extensions constitute relevant advances in the field, allowing analytical results for more realistic scenarios.
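To make the expansion concrete, the sketch below shows one way such an approximation can be set up; it is a minimal illustration under stated assumptions, not the paper's implementation. It assumes probabilists' Hermite polynomials He_n, coefficients a_n = E[g(x) He_n(x)] / n! computed by Gauss–Hermite quadrature, and the orthogonality identity E[He_n(x) He_m(y)] = n! ρ^n δ_{nm} for standard Gaussians with correlation ρ, which reduces the two-dimensional Gaussian integrals appearing in the order-parameter dynamics to a rapidly converging series. The choice of g(x) = erf(x/√2) and the truncation order are purely illustrative.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval
from scipy.special import erf, factorial

def hermite_coefficients(g, n_max, n_quad=100):
    """Coefficients a_n of g(x) = sum_n a_n He_n(x) under the standard
    Gaussian measure, computed by Gauss-Hermite(E) quadrature."""
    x, w = hermegauss(n_quad)           # nodes/weights for weight exp(-x^2 / 2)
    w = w / np.sqrt(2.0 * np.pi)        # normalize to the standard Gaussian measure
    gx = g(x)
    coeffs = np.empty(n_max + 1)
    for n in range(n_max + 1):
        basis = np.zeros(n + 1)
        basis[n] = 1.0                  # select He_n
        coeffs[n] = np.sum(w * gx * hermeval(x, basis)) / factorial(n)
    return coeffs

def gaussian_covariance(coeffs, rho):
    """E[g(x) g(y)] for standard Gaussians x, y with correlation rho,
    using E[He_n(x) He_m(y)] = n! * rho**n * delta_{nm}."""
    n = np.arange(len(coeffs))
    return np.sum(coeffs**2 * factorial(n) * rho**n)

# Sanity check against the closed form known for g(x) = erf(x / sqrt(2)):
# E[g(x) g(y)] = (2 / pi) * arcsin(rho / 2).
g = lambda x: erf(x / np.sqrt(2.0))
a = hermite_coefficients(g, n_max=25)
rho = 0.4
print(gaussian_covariance(a, rho))           # truncated Hermite series
print((2.0 / np.pi) * np.arcsin(rho / 2.0))  # exact reference value
```

Because the series in ρ converges geometrically for |ρ| < 1, a modest truncation order already matches the closed-form erf result to many digits, while the same routine accepts any activation g for which no closed form is available.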
