Training a network is the process of minimizing a loss function. In other words, training optimizes the weights and biases so that the loss becomes as small as possible.
- Backpropagation: a technique for computing the gradients of the loss with respect to the weights and biases.
- Gradient descent: repeatedly evaluating the gradient and then performing a parameter update.
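- Mini-batch Gradient Descent: each parameter update uses the gradient computed on a small batch of training examples rather than on the full training set.
- Stochastic Gradient Descent (SGD): the extreme case in which each parameter update uses a single training example.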
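
The update loop described in the list above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a reference implementation: the model is a single linear layer with a mean-squared-error loss, so the gradients are written out by hand with the chain rule (which is what backpropagation would compute for this one-layer case). The names `minibatch_sgd` and `mse`, the learning rate, the batch size, and the synthetic data are all hypothetical choices made for the example.

```python
import numpy as np


def mse(X, y, w, b):
    """Mean-squared-error loss of the linear model X @ w + b."""
    return float(np.mean((X @ w + b - y) ** 2))


def minibatch_sgd(X, y, lr=0.1, batch_size=16, epochs=200, seed=0):
    """Fit weights w and bias b by mini-batch gradient descent (illustrative)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        order = rng.permutation(n)  # reshuffle the examples every epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            Xb, yb = X[idx], y[idx]
            err = Xb @ w + b - yb                    # forward pass: prediction error
            grad_w = 2.0 * Xb.T @ err / len(idx)     # dL/dw on this batch
            grad_b = 2.0 * err.mean()                # dL/db on this batch
            w -= lr * grad_w                         # parameter update
            b -= lr * grad_b
    return w, b


# Synthetic, noise-free data (an assumption made only for this example).
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(200, 2))
y = X @ np.array([2.0, -3.0]) + 0.5

loss_before = mse(X, y, np.zeros(2), 0.0)
w, b = minibatch_sgd(X, y)
loss_after = mse(X, y, w, b)
```

In this sketch, setting `batch_size=1` gives the stochastic gradient descent update, and setting `batch_size=len(X)` gives plain (full-batch) gradient descent; the loop itself does not change.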