Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/11296
Full metadata record
DC Field                    Value                                             Language
dc.contributor.advisor      Ahuja, Kapil                                      -
dc.contributor.advisor      Tiwari, Aruna                                     -
dc.contributor.author       Kumar, Saurav                                     -
dc.date.accessioned         2023-02-22T11:48:02Z                              -
dc.date.available           2023-02-22T11:48:02Z                              -
dc.date.issued              2023-02-08                                        -
dc.identifier.uri           https://dspace.iiti.ac.in/handle/123456789/11296  -
dc.description.abstract     Deep neural networks (DNNs) are widely used and have demonstrated their usefulness in many applications, such as computer vision and pattern recognition. However, training and inference with these networks can be time-consuming. This problem can be alleviated by using efficient optimizers. As one of the most commonly used optimizers, Adagrad adaptively selects the learning rate using gradient information. Although Adagrad works reasonably well, it suffers from overshoot and oscillations around the minima, which slow its convergence. To alleviate this overshoot problem and accelerate the convergence of DNN optimization, we combine the adaptiveness of Adagrad with the change-in-gradient information used in the PID optimizer. We term our new optimizer the Control parameter-based gradient descent optimizer, or CaaGrad. Using a Multi-Layered Perceptron (MLP) on the MNIST and Fashion MNIST datasets, we show that the accuracy of CaaGrad, averaged over training, validation, and testing, is 9.96%, 8.67%, and 4.19% better than that of Adagrad, the PID optimizer, and Adam (another state-of-the-art optimizer), respectively. To help overcome the oscillations in CaaGrad, we add a controlling factor to it, yielding a new optimizer we term CaaGrad2. Using an MLP on the MNIST and Fashion MNIST datasets, the average accuracy (in training, validation, and testing) of CaaGrad2 is 9.11% better than that of Adam.  en_US
dc.language.iso             en                                                en_US
dc.publisher                Department of Computer Science and Engineering, IIT Indore  en_US
dc.relation.ispartofseries  MSR031;                                           -
dc.subject                  Computer Science and Engineering                  en_US
dc.title                    CaaGrad: an adaptive approach for optimization in deep learning  en_US
dc.type                     Thesis_MS Research                                en_US
Appears in Collections: Department of Computer Science and Engineering_ETD
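
The abstract above describes combining Adagrad's per-parameter adaptive scaling with the change-in-gradient (derivative) term of the PID optimizer. The thesis's exact update rule is not given in this record, so the following Python sketch is an assumption only: it adds a weighted derivative term to the gradient before Adagrad's usual scaling. The function name caagrad_step, the gain kd, and the state layout are all hypothetical.

    import numpy as np

    def caagrad_step(param, grad, state, lr=0.01, kd=0.5, eps=1e-8):
        # Hypothetical CaaGrad-style step; the thesis's actual rule may differ.
        # Derivative term: change in gradient since the previous step
        # (the "D" of a PID controller).
        d_grad = grad - state.get("prev_grad", np.zeros_like(grad))
        state["prev_grad"] = grad.copy()

        # Combined signal: raw gradient plus weighted change-in-gradient.
        signal = grad + kd * d_grad

        # Adagrad accumulator: running sum of squared signals, per parameter.
        state["accum"] = state.get("accum", np.zeros_like(grad)) + signal ** 2

        # Per-parameter step size shrinks as the accumulator grows (Adagrad scaling).
        return param - lr * signal / (np.sqrt(state["accum"]) + eps)

    # Toy usage on f(w) = ||w||^2, whose gradient is 2w.
    state = {}
    w = np.array([1.0, -2.0])
    for _ in range(200):
        w = caagrad_step(w, 2 * w, state)

With kd = 0 the sketch reduces to plain Adagrad, which makes the PID-style derivative term the only addition; the damping against overshoot comes from that term reacting to how fast the gradient is changing.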

Files in This Item:
File                                 Description   Size     Format
MSR031_Saurav_Kumar_2004101009.pdf                 2.61 MB  Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
