Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/16432
Full metadata record
DC Field | Value | Language
dc.contributor.author | Sumithra Rudresha, Shreyas | en_US
dc.date.accessioned | 2025-07-09T13:48:02Z | -
dc.date.available | 2025-07-09T13:48:02Z | -
dc.date.issued | 2025 | -
dc.identifier.citation | Shreyas, S. R. (2025). Double Successive Over-Relaxation Q-Learning With an Extension to Deep Reinforcement Learning. IEEE Transactions on Neural Networks and Learning Systems. https://doi.org/10.1109/TNNLS.2025.3576581 | en_US
dc.identifier.issn | 2162-237X | -
dc.identifier.other | EID(2-s2.0-105009055087) | -
dc.identifier.uri | https://dx.doi.org/10.1109/TNNLS.2025.3576581 | -
dc.identifier.uri | https://dspace.iiti.ac.in:8080/jspui/handle/123456789/16432 | -
dc.description.abstract | Q-learning (QL) is a widely used algorithm in reinforcement learning (RL), but its convergence can be slow, especially when the discount factor is close to one. Successive over-relaxation (SOR) QL, which introduces a relaxation factor to speed up convergence, addresses this issue but has two major limitations: in the tabular setting, the relaxation parameter depends on the transition probabilities, so the method is not entirely model-free, and it suffers from overestimation bias. To overcome these limitations, we propose a sample-based, model-free double SOR QL (MF-DSORQL) algorithm. Theoretically and empirically, this algorithm is shown to be less biased than SOR QL. Furthermore, in the tabular setting, a convergence analysis under boundedness assumptions on the iterates is provided. The proposed algorithm is extended to large-scale problems using deep RL. Finally, both the tabular version of the proposed algorithm and its deep RL extension are tested on benchmark examples. © 2012 IEEE. | en_US
dc.language.iso | en | en_US
dc.publisher | Institute of Electrical and Electronics Engineers Inc. | en_US
dc.source | IEEE Transactions on Neural Networks and Learning Systems | en_US
dc.subject | Deep reinforcement learning (RL) | en_US
dc.subject | Markov decision processes (MDPs) | en_US
dc.subject | overestimation bias | en_US
dc.subject | successive over-relaxation (SOR) | en_US
dc.title | Double Successive Over-Relaxation Q-Learning With an Extension to Deep Reinforcement Learning | en_US
dc.type | Journal Article | en_US
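The abstract describes combining an SOR-relaxed Q-learning target with a double estimator to curb overestimation bias. The exact MF-DSORQL update rule is given in the paper itself; the following is only a minimal illustrative sketch, assuming the standard SOR Q-learning target (which mixes the bootstrapped target with weight w and the current-state greedy value with weight 1 - w) and the cross-evaluation trick of Double Q-learning. The toy MDP, hyperparameters, and helper names are hypothetical.

```python
import random

# Hypothetical toy MDP (illustrative only, not from the paper):
# 2 states, 2 actions, deterministic transitions, reward 1 for reaching state 1.
N_S, N_A = 2, 2

def step(s, a):
    s_next = (s + a) % N_S               # toy transition
    r = 1.0 if s_next == 1 else 0.0      # toy reward
    return s_next, r

gamma = 0.7    # discount factor (assumed)
w     = 1.1    # relaxation factor; w = 1 recovers plain Q-learning (assumed)
alpha = 0.1    # learning rate (assumed)

# Double estimator: two independent Q-tables, as in Double Q-learning.
QA = [[0.0] * N_A for _ in range(N_S)]
QB = [[0.0] * N_A for _ in range(N_S)]

def argmax(row):
    return max(range(N_A), key=lambda a: row[a])

random.seed(0)
s = 0
for _ in range(5000):
    a = random.randrange(N_A)            # uniform exploration for simplicity
    s_next, r = step(s, a)
    # Randomly pick which table to update; the other supplies value estimates.
    if random.random() < 0.5:
        Q_upd, Q_eval = QA, QB
    else:
        Q_upd, Q_eval = QB, QA
    # SOR target: weight w on the bootstrapped target, weight (1 - w) on the
    # current-state greedy value; actions are selected by Q_upd but evaluated
    # by Q_eval to reduce overestimation bias.
    a_next = argmax(Q_upd[s_next])
    a_curr = argmax(Q_upd[s])
    target = w * (r + gamma * Q_eval[s_next][a_next]) + (1 - w) * Q_eval[s][a_curr]
    Q_upd[s][a] += alpha * (target - Q_upd[s][a])
    s = s_next

print(QA, QB)
```

Note that with rewards in [0, 1] the true optimal values here are bounded by 1 / (1 - gamma); the SOR relaxation changes the fixed-point iteration, not the fixed point itself, so the learned tables should settle near the same values as plain Q-learning, only faster.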
Appears in Collections: Department of Mathematics

Files in This Item:
There are no files associated with this item.


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
