Causality-driven RL-based Scheduling Policies for Diverse Delay Constraints

Roy, Dibbendu

Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/16971

Title:	Causality-driven RL-based Scheduling Policies for Diverse Delay Constraints
Authors:	Roy, Dibbendu
Keywords:	Causality;Counterfactuals;Diverse Delay Constraints;Reinforcement Learning;Scheduling Policies;Calcium Compounds;Optimization;Stochastic Systems;Causal Modeling;Delay Constraints;Delay Violation;Multiusers;Wireless Channel
Issue Date:	2025
Publisher:	Institute of Electrical and Electronics Engineers Inc.
Citation:	Roy, D., & Gross, J. J. (2025). Causality-driven RL-based Scheduling Policies for Diverse Delay Constraints. https://doi.org/10.1109/ICMLCN64995.2025.11140535
Abstract:	We investigate the role of causal models in obtaining scheduling policies that minimize delay violations. We consider multi-user queuing systems with random delay constraints, packet sizes, and arrivals over Gilbert-Elliott wireless channels. Building on Judea Pearl's landmark work on causality as a route to a higher level of cognitive ability, we demonstrate the role of counterfactual reasoning, which leads to the well-investigated optimal EDF policy for wired channels. Because of the randomness of wireless channels, finding an optimal policy is not straightforward, which motivates RL-based approaches. We present the CPPO (counterfactual-PPO) and CA2C (counterfactual-A2C) algorithms, which use counterfactual examples generated from causal models during training. We argue that stochastic-gradient-based policy gradient RL algorithms benefit from incorporating counterfactuals during training. We show that these algorithms provably lead to lower variance, indicating robust learning performance. Our results demonstrate a ~60% increase in the number of cases where CA2C and CPPO outperform their non-counterfactual counterparts, with reduced variance and negligible computation overhead. © 2025 Elsevier B.V., All rights reserved.
URI:	https://dx.doi.org/10.1109/ICMLCN64995.2025.11140535 https://dspace.iiti.ac.in:8080/jspui/handle/123456789/16971
ISBN:	979-8331520427
Type of Material:	Conference Paper
Appears in Collections:	Department of Electrical Engineering
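
Illustrative note: the two baseline ingredients named in the abstract, earliest-deadline-first (EDF) scheduling and a two-state Gilbert-Elliott channel, can be sketched as below. This is a minimal sketch under stated assumptions, not the authors' implementation, and it does not reproduce CPPO or CA2C. The names `Packet`, `edf_pick`, `gilbert_elliott_step`, and `simulate` are hypothetical, as are the assumptions that one packet is served per slot, that transmission succeeds only in the good channel state, and that a packet must be sent in a slot strictly before its deadline.

```python
import random
from dataclasses import dataclass


@dataclass
class Packet:
    user: int
    deadline: int  # absolute slot index; the packet must be sent in a slot t < deadline


def edf_pick(queue):
    """Return the index of the queued packet with the earliest deadline, or None."""
    if not queue:
        return None
    return min(range(len(queue)), key=lambda i: queue[i].deadline)


def gilbert_elliott_step(good, p_gb, p_bg, rng):
    """Advance the two-state Markov channel by one slot (True = good state).

    p_gb: probability of moving good -> bad; p_bg: probability of moving bad -> good.
    """
    if good:
        return rng.random() >= p_gb
    return rng.random() < p_bg


def simulate(arrivals, slots, p_gb=0.0, p_bg=1.0, seed=0):
    """Run EDF for `slots` slots and return the number of deadline violations.

    arrivals maps a slot index to the list of Packets arriving in that slot.
    Assumptions (not from the source): the channel starts in the good state,
    at most one packet is served per slot, and service fails in the bad state.
    """
    rng = random.Random(seed)
    queue, good, violations = [], True, 0
    for t in range(slots):
        queue.extend(arrivals.get(t, []))
        # Packets whose deadline has been reached count as violations and are dropped.
        violations += sum(1 for p in queue if p.deadline <= t)
        queue = [p for p in queue if p.deadline > t]
        i = edf_pick(queue)
        if i is not None and good:
            queue.pop(i)  # successful transmission of the most urgent packet
        good = gilbert_elliott_step(good, p_gb, p_bg, rng)
    return violations
```

With the default parameters the channel never leaves the good state, which mimics the wired setting in which the abstract describes EDF as optimal. Raising `p_gb` makes service unreliable, which is the regime where the abstract turns to RL-based policies.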

Files in This Item:

There are no files associated with this item.
