Please use this identifier to cite or link to this item:
https://dspace.iiti.ac.in/handle/123456789/17499
| Title: | Sequential decision making under uncertainty: efficient Q-learning frameworks |
| Authors: | Shreyas SR |
| Supervisors: | Vijesh, Antony |
| Keywords: | Mathematics |
| Issue Date: | 3-Dec-2025 |
| Publisher: | Department of Mathematics, IIT Indore |
| Series/Report no.: | TH781 |
| Abstract: | This thesis focuses on the development of efficient, convergent algorithms for solving problems in dynamic programming, reinforcement learning, and multi-agent learning. The work begins with novel first-order iterative schemes derived from a computationally expensive second-order method that approximates the Bellman equation using smooth functions. These new schemes retain the global convergence property while being more computationally efficient and easier to implement. Next, the thesis proposes a Weighted Smooth Q-Learning (WSQL) algorithm to address overestimation and underestimation biases in Q-learning and double Q-learning, respectively. By incorporating a weighted blend of mellowmax and log-sum-exp operators, WSQL achieves stability and theoretical convergence guarantees. The third part of the thesis introduces off-policy two-step Q-learning algorithms, in both standard and smooth variants, that improve convergence and robustness without relying on importance sampling. Finally, the thesis extends these techniques to the multi-agent setting, proposing a multi-step minimax Q-learning algorithm for solving two-player zero-sum Markov games. Theoretical analysis ensures boundedness and almost sure convergence of the algorithms under suitable assumptions. Across all contributions, the proposed methods are validated through comprehensive numerical experiments on benchmark problems, demonstrating their effectiveness, robustness, and practical utility. Keywords: Reinforcement Learning, Q-learning, Bellman Equation, Value Iteration, Two-Player Zero-Sum Games, Stochastic Approximation, Smooth Operators. |
| URI: | https://dspace.iiti.ac.in:8080/jspui/handle/123456789/17499 |
| Type of Material: | Thesis_Ph.D |
| Appears in Collections: | Department of Mathematics_ETD |
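
The abstract mentions a weighted blend of mellowmax and log-sum-exp operators. This record does not give the exact WSQL update, so the following is only a rough sketch of the two standard operators and one possible convex combination of them. The function names, the temperature `omega`, and the blending parameter `weight` are illustrative assumptions, not the thesis's notation or method.

```python
import math


def log_sum_exp(values, omega):
    """Return (1/omega) * log(sum(exp(omega * v))).

    For omega > 0 this is an upper bound on max(values).
    The maximum is subtracted first for numerical stability.
    """
    m = max(values)
    return m + math.log(sum(math.exp(omega * (v - m)) for v in values)) / omega


def mellowmax(values, omega):
    """Return (1/omega) * log(mean(exp(omega * v))).

    For omega > 0 this is a lower bound on max(values).
    """
    return log_sum_exp(values, omega) - math.log(len(values)) / omega


def weighted_smooth_max(values, omega, weight):
    """Convex combination of mellowmax and log-sum-exp.

    `weight` in [0, 1] is a hypothetical blending parameter
    (an assumption for illustration, not taken from the thesis).
    """
    return weight * mellowmax(values, omega) + (1.0 - weight) * log_sum_exp(values, omega)


if __name__ == "__main__":
    q_values = [1.0, 2.0, 3.0]  # made-up action values for a single state
    print(mellowmax(q_values, 5.0))
    print(weighted_smooth_max(q_values, 5.0, 0.5))
    print(log_sum_exp(q_values, 5.0))
```

Because mellowmax sits below the hard maximum and log-sum-exp sits above it, a convex combination of the two lies between an under-estimate and an over-estimate of `max(values)`, which is one plausible reading of how such a blend could counter both biases.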
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| TH_781_Shreyas_S_R_1901241006.pdf | | 4.85 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.