7 Optimization techniques you may want to know in machine learning (Part 1)
Overview:
Introduction
1- Batch Gradient descent
2- Mini-Batch Gradient descent
3- Stochastic Gradient descent
4- Example of gradient aggregation methods: SAGA
5- Example of diagonal scaling: RMSProp
6- Momentum
7- Adam
Introduction
In many machine learning problems, you try to minimize some objective function f with respect to some parameters x, as in equation (1). This objective function is also called the cost function. It often characterizes how well your machine learning algorithm is learning the mapping between your training examples and the desired outputs:

\min_{x} f(x) \qquad (1)
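To make equation (1) concrete, here is a minimal sketch (not from the original article) of the generic first-order update that all the methods listed above build on: repeatedly step against the gradient of f. The names `f_grad`, `x0`, `lr`, and `n_iters` are hypothetical placeholders.

```python
import numpy as np

def gradient_descent(f_grad, x0, lr=0.1, n_iters=100):
    """Minimize f by repeatedly stepping against its gradient.

    f_grad  -- callable returning the gradient of f at x (assumed given)
    x0      -- initial parameter vector
    lr      -- step size (learning rate); 0.1 is a hypothetical default
    n_iters -- number of update steps
    """
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_iters):
        x -= lr * f_grad(x)  # move x in the direction that decreases f
    return x
```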
One classical problem is:
LINEAR REGRESSION:
Given some matrix of features A, composed of n examples as rows and d features as columns, and some label vector y, composed of n real values corresponding to the labels of the n examples of A, we try to approach each value y_i (i-th value of…
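The paragraph above is cut off, but assuming it heads toward the standard least-squares formulation (an assumption on my part, not the article's text), here is a small Python sketch of the setup it describes: a feature matrix A with n examples and d features, a label vector y, the squared-error cost, and a plain gradient-descent fit. The sizes, seed, and step size are hypothetical choices.

```python
import numpy as np

# Toy data: n = 50 examples, d = 3 features (hypothetical sizes).
rng = np.random.default_rng(0)
A = rng.normal(size=(50, 3))          # feature matrix, one example per row
y = A @ np.array([2.0, -1.0, 0.5])    # labels generated from a known x

def f(x):
    """Least-squares cost: average squared gap between A @ x and y."""
    r = A @ x - y
    return 0.5 * np.mean(r ** 2)

def f_grad(x):
    """Gradient of the least-squares cost above."""
    return A.T @ (A @ x - y) / len(y)

# Minimize with plain gradient descent (step size is a hypothetical choice).
x = np.zeros(3)
for _ in range(500):
    x -= 0.1 * f_grad(x)
print(x)  # should approach [2.0, -1.0, 0.5]
```

Because the labels here are generated without noise from a known parameter vector, the iterates should recover that vector, which makes the sketch easy to sanity-check.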