Understanding Optimizers: From SGD to Adam