Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/11296
Title: CaaGrad: an adaptive approach for optimization in deep learning
Authors: Kumar, Saurav
Supervisors: Ahuja, Kapil; Tiwari, Aruna
Keywords: Computer Science and Engineering
Issue Date: 8-Feb-2023
Publisher: Department of Computer Science and Engineering, IIT Indore
Series/Report no.: MSR031
Abstract: Deep neural networks (DNNs) are widely used and have proven useful in many applications, such as computer vision and pattern recognition. However, training and inference with these networks can be time-consuming. Efficient optimizers can alleviate this problem. Adagrad, one of the most commonly used optimizers, adaptively selects the learning rate using gradient information. Although Adagrad works reasonably well, it suffers from overshoot and oscillations around the minima, which slow its convergence. To alleviate this overshoot problem and accelerate the convergence of DNN optimization, we combine the adaptiveness of Adagrad with the change-in-gradient information used in the PID optimizer. We call the new optimizer the Control parameter-based gradient descent optimizer, or CaaGrad. Using a Multi-Layered Perceptron (MLP) on the MNIST and Fashion MNIST datasets, we show that the average accuracy of CaaGrad in training, validation, and testing is 9.96%, 8.67%, and 4.19% better than that of Adagrad, the PID optimizer, and Adam (another state-of-the-art optimizer), respectively. To help overcome the oscillations in CaaGrad, we add a controlling factor to it and call the resulting optimizer CaaGrad2. Using an MLP on the MNIST and Fashion MNIST datasets, the average accuracy (in training, validation, and testing) of CaaGrad is 9.11% better than that of Adam.
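Illustrative note: the abstract does not state the CaaGrad update rule, so the following is only a minimal sketch of the general idea it describes. The first function is the standard Adagrad step. The second is a hypothetical variant that adds a PID-style derivative term (the change in gradient between consecutive steps) before applying the Adagrad scaling. The function names, the gain `kd`, and the way the two terms are combined are assumptions made for illustration and should not be read as the thesis's method.

```python
import math


def adagrad_step(theta, grad, accum, lr=0.1, eps=1e-8):
    """One standard Adagrad update on a list of parameters.

    accum holds the running sum of squared gradients per parameter.
    """
    new_accum = [a + g * g for a, g in zip(accum, grad)]
    new_theta = [
        t - lr * g / (math.sqrt(a) + eps)
        for t, g, a in zip(theta, grad, new_accum)
    ]
    return new_theta, new_accum


def adagrad_with_gradient_change_step(theta, grad, prev_grad, accum,
                                      lr=0.1, kd=0.1, eps=1e-8):
    """Hypothetical variant: Adagrad scaling applied to the gradient plus
    a derivative-like term kd * (grad - prev_grad).

    This is an assumed combination for illustration only; it is NOT the
    CaaGrad rule from the thesis. With kd = 0 it reduces to adagrad_step.
    """
    new_accum = [a + g * g for a, g in zip(accum, grad)]
    new_theta = [
        t - lr * (g + kd * (g - pg)) / (math.sqrt(a) + eps)
        for t, g, pg, a in zip(theta, grad, prev_grad, new_accum)
    ]
    return new_theta, new_accum


# Usage on f(x) = x^2, whose gradient is 2x, starting from x = 1.
theta, accum = [1.0], [0.0]
grad = [2.0 * theta[0]]
theta, accum = adagrad_with_gradient_change_step(
    theta, grad, prev_grad=grad, accum=accum
)
```

Passing the current gradient as `prev_grad` on the first step makes the derivative term zero, so the first update matches plain Adagrad.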
URI: https://dspace.iiti.ac.in/handle/123456789/11296
Type of Material: Thesis_MS Research
Appears in Collections: Department of Computer Science and Engineering_ETD

Files in This Item:
File: MSR031_Saurav_Kumar_2004101009.pdf
Size: 2.61 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
