
dc.date.accessioned: 2020-09-10T17:55:32Z
dc.date.available: 2020-09-10T17:55:32Z
dc.date.created: 2020-09-06T09:01:00Z
dc.date.issued: 2020
dc.identifier.citation: Truong, Tuyen Trung; Nguyen, Hang-Tuan. Backtracking Gradient Descent Method and Some Applications in Large Scale Optimisation. Part 2: Algorithms and Experiments. Applied Mathematics and Optimization. 2020
dc.identifier.uri: http://hdl.handle.net/10852/79322
dc.description.abstract: In this paper, we provide new results and algorithms (including backtracking versions of Nesterov accelerated gradient and Momentum) which are more applicable to large scale optimisation as in Deep Neural Networks. We also demonstrate that Backtracking Gradient Descent (Backtracking GD) can obtain good upper bound estimates for local Lipschitz constants for the gradient, and that the convergence rate of Backtracking GD is similar to that in the classical work of Armijo. Experiments with the datasets CIFAR10 and CIFAR100 on various popular architectures verify a heuristic argument that Backtracking GD stabilises to a finite union of sequences constructed from Standard GD in the mini-batch practice, and show that our new algorithms (while automatically fine-tuning learning rates) perform better than current state-of-the-art methods such as Adam, Adagrad, Adadelta, RMSProp, Momentum and Nesterov accelerated gradient. To help readers avoid confusion between heuristics and more rigorously justified algorithms, we also provide a review of the current state of convergence results for gradient descent methods. Accompanying source code is available on GitHub.
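The Backtracking GD method named in the abstract builds on Armijo's classical backtracking line search: at each step the learning rate is shrunk until a sufficient-decrease condition holds. The following minimal Python sketch illustrates that general technique; the function names, parameter choices, and the quadratic example are illustrative assumptions, not the authors' actual implementation (which, per the abstract, is available on GitHub).

```python
import numpy as np

def backtracking_gd(f, grad_f, x0, delta0=1.0, alpha=0.5, beta=0.5,
                    tol=1e-8, max_iter=1000):
    """Gradient descent with Armijo backtracking line search (sketch).

    At each iteration the learning rate delta is reduced by the factor
    beta until the Armijo sufficient-decrease condition
        f(x - delta*g) <= f(x) - alpha*delta*||g||^2
    is satisfied, then a descent step is taken.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) < tol:  # gradient small enough: stop
            break
        delta = delta0
        # Backtrack: shrink delta until sufficient decrease holds.
        while f(x - delta * g) > f(x) - alpha * delta * np.dot(g, g):
            delta *= beta
        x = x - delta * g
    return x

# Illustrative use on a simple quadratic f(x) = ||x||^2 / 2,
# whose unique minimiser is the origin.
f = lambda x: 0.5 * np.dot(x, x)
grad_f = lambda x: x
x_min = backtracking_gd(f, grad_f, np.array([3.0, -4.0]))
```

The accepted delta can also serve as a rough local estimate related to the Lipschitz constant of the gradient, which is the kind of upper-bound estimate the abstract refers to.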
dc.language: EN
dc.rights: Attribution 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.title: Backtracking Gradient Descent Method and Some Applications in Large Scale Optimisation. Part 2: Algorithms and Experiments
dc.type: Journal article
dc.creator.author: Truong, Tuyen Trung
dc.creator.author: Nguyen, Hang-Tuan
cristin.unitcode: 185,15,13,65
cristin.unitname: Flere komplekse variable, logikk og operatoralgebraer (Several Complex Variables, Logic and Operator Algebras)
cristin.ispublished: true
cristin.fulltext: original
cristin.qualitycode: 2
dc.identifier.cristin: 1827552
dc.identifier.bibliographiccitation: info:ofi/fmt:kev:mtx:ctx&ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.jtitle=Applied Mathematics and Optimization&rft.volume=&rft.spage=&rft.date=2020
dc.identifier.jtitle: Applied Mathematics and Optimization
dc.identifier.doi: https://doi.org/10.1007/s00245-020-09718-8
dc.identifier.urn: URN:NBN:no-82431
dc.type.document: Tidsskriftartikkel (journal article)
dc.type.peerreviewed: Peer reviewed
dc.source.issn: 0095-4616
dc.identifier.fulltext: Fulltext https://www.duo.uio.no/bitstream/handle/10852/79322/1/Truong-Nguyen2020_Article_BacktrackingGradientDescentMet.pdf
dc.type.version: PublishedVersion

