News

The closure should return a loss or an iterator with its first element as the loss. Momentum here is the moving average of gradient so that its setting is decoupled from the learning rate, which is ...