Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/17499
Title: Sequential decision making under uncertainty: efficient Q-learning frameworks
Authors: Shreyas SR
Supervisors: Vijesh, Antony
Keywords: Mathematics
Issue Date: 3-Dec-2025
Publisher: Department of Mathematics, IIT Indore
Series/Report no.: TH781
Abstract: This thesis focuses on the development of efficient, convergent algorithms for solving problems in dynamic programming, reinforcement learning, and multi-agent learning. The work begins with novel first-order iterative schemes derived from a computationally expensive second-order method that approximates the Bellman equation using smooth functions. These new schemes retain the global convergence property while being more computationally efficient and easier to implement. Next, the thesis proposes a Weighted Smooth Q-Learning (WSQL) algorithm to address overestimation and underestimation biases in Q-learning and double Q-learning, respectively. By incorporating a weighted blend of mellowmax and log-sum-exp operators, WSQL achieves stability and theoretical convergence guarantees. The third part of the thesis introduces off-policy two-step Q-learning algorithms (both standard and smooth variants) that improve convergence and robustness without relying on importance sampling. Finally, the thesis extends these techniques to the multi-agent setting, proposing a multi-step minimax Q-learning algorithm for solving two-player zero-sum Markov games. Theoretical analysis ensures boundedness and almost sure convergence of the algorithms under suitable assumptions. Across all contributions, the proposed methods are validated through comprehensive numerical experiments on benchmark problems, demonstrating their effectiveness, robustness, and practical utility. Keywords: Reinforcement Learning, Q-learning, Bellman Equation, Value Iteration, Two-Player Zero-Sum Games, Stochastic Approximation, Smooth Operators.
URI: https://dspace.iiti.ac.in:8080/jspui/handle/123456789/17499
Type of Material: Thesis_Ph.D
Appears in Collections: Department of Mathematics_ETD

Files in This Item:
File: TH_781_Shreyas_S_R_1901241006.pdf
Size: 4.85 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
