Comparison of neural network optimization methods using the image classification problem.
Keywords:
neural networks, stochastic gradient descent, optimization methods, training of neural networks, distributed computing, asynchronous server
Abstract
The article analyzes existing optimization methods and types of distributed computing for neural network training. Based on the conducted experiments, the feasibility of applying these methods to different data types and neural network architectures is assessed.
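As a concrete illustration of the kind of comparison the abstract describes, the sketch below trains the same small convolutional network with two of the optimizers covered in the references (SGD and Adam) and reports test accuracy for each. This is a minimal TensorFlow sketch, not the authors' experimental code: the dataset (MNIST), the toy architecture, and the hyperparameters are assumptions chosen for brevity, whereas the cited benchmarks target ImageNet-scale models such as Inception, ResNet, and VGG.

import tensorflow as tf

def build_model():
    # Toy CNN stand-in for the larger architectures cited in the references.
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(28, 28, 1)),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

# MNIST is an assumption here; the cited experiments use ImageNet.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train[..., None] / 255.0, x_test[..., None] / 255.0

for name, optimizer in [("SGD", tf.keras.optimizers.SGD(learning_rate=0.01)),
                        ("Adam", tf.keras.optimizers.Adam(learning_rate=0.001))]:
    model = build_model()  # fresh weights so each optimizer starts equally
    model.compile(optimizer=optimizer,
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_train, y_train, epochs=3, batch_size=128, verbose=0)
    _, acc = model.evaluate(x_test, y_test, verbose=0)
    print(f"{name}: test accuracy = {acc:.4f}")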
References
Wilson, D. R. and Martinez, T. R. (2003). The general inefficiency of batch training for gradient descent learning. Neural Networks, 16(10), 1429–1451.
Rao, C. (1945). Information and the accuracy attainable in the estimation of statistical parameters. Bulletin of the Calcutta Mathematical Society, 37, 81–89.
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna, Z. (2015). Rethinking the Inception Architecture for Computer Vision. arXiv preprint arXiv:1512.00567. URL: http://arxiv.org/abs/1512.00567.
He, K., Zhang, X., Ren, S., and Sun, J. (2015). Deep Residual Learning for Image Recognition. arXiv preprint arXiv:1512.03385. URL: http://arxiv.org/abs/1512.03385.
Simonyan, K. and Zisserman, A. (2014). Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv preprint arXiv:1409.1556. URL: http://arxiv.org/abs/1409.1556.
ImageNet [Online]. Available at: http://www.image-net.org/ (accessed 25.10.2019).
TensorFlow benchmarks (2018). URL: https://github.com/tensorflow/benchmarks/tree/master/scripts/tf_cnn_benchmarks.
Lemaréchal, C. (2012). Cauchy and the Gradient Method. Documenta Mathematica, Extra Volume ISMP. URL: math.uni-bielefeld.de/documenta/vol-ismp/40_lemarechal-claude.pdf.
Russell, S. J. and Norvig, P. (2003). Artificial Intelligence: A Modern Approach. Prentice Hall.
Duchi, J., Hazan, E., and Singer, Y. (2011). Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12, 2121–2159.
Tang, Y., Salakhutdinov, R., and Hinton, G. (2012). Deep mixtures of factor analysers. arXiv preprint arXiv:1206.4635.
Kingma, D. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
Published
2019-12-28
How to Cite
Polishchuk, M., Kostiuchko, S., & Khrystynets, M. (2019). Comparison of neural network optimization methods using the image classification problem. COMPUTER-INTEGRATED TECHNOLOGIES: EDUCATION, SCIENCE, PRODUCTION, (37), 43-52. https://doi.org/10.36910/6775-2524-0560-2019-37-7
Section
Computer science and computer engineering