UTCS Artificial Intelligence
Adaptive Packet Routing: The Confidence-Based Dual Reinforcement Q-Learning Algorithm
Active from 1998 - 2000
Standard reinforcement learning (TD or Q-learning) is based on forward exploration: later estimates are used to update earlier ones. Dual Reinforcement Learning also uses backward exploration: earlier estimates are used to update later ones. The quality of the estimates can be improved further by keeping track of how recently each one was updated. In this project, these ideas are applied to the Q-routing algorithm for adaptive packet routing in communication networks, improving both the speed of learning and the quality of the final routing policy.
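The three ideas above can be illustrated with a small Python sketch. It is not the project's implementation: the class and function names, the zero initial estimates, the decay constant, and the learning-rate rule max(est_conf, 1 - old_conf) are all assumptions chosen for illustration. Each node keeps an estimated delivery time per (destination, neighbor) pair plus a confidence value in [0, 1]. One packet hop triggers a forward update at the sender (toward the destination) and a backward update at the receiver (toward the packet's source), and confidences decay when estimates are not refreshed.

```python
from collections import defaultdict


class Node:
    """One router with Q-values and confidences (illustrative names only)."""

    def __init__(self, name, neighbors, decay=0.9):
        self.name = name
        self.neighbors = list(neighbors)
        self.decay = decay  # assumed per-step confidence decay
        # q[(dest, neighbor)]: estimated delivery time to dest via neighbor
        self.q = defaultdict(float)
        # c[(dest, neighbor)]: confidence in that estimate, in [0, 1]
        self.c = defaultdict(float)

    def best(self, dest):
        """Return (estimate, confidence) for the lowest-cost neighbor."""
        nbr = min(self.neighbors, key=lambda n: self.q[(dest, n)])
        return self.q[(dest, nbr)], self.c[(dest, nbr)]

    def update(self, dest, via, hop_cost, estimate, est_conf):
        """Move q[(dest, via)] toward hop_cost + estimate.

        Assumed rule: learn quickly when the incoming estimate is
        trusted or the stored estimate is not.
        """
        key = (dest, via)
        rate = max(est_conf, 1.0 - self.c[key])
        self.q[key] += rate * (hop_cost + estimate - self.q[key])
        self.c[key] += rate * (est_conf - self.c[key])

    def age(self):
        """Decay every confidence to reflect that estimates grow stale."""
        for key in list(self.c):
            self.c[key] *= self.decay


def remaining(node, target):
    """Best (estimate, confidence) from node to target; exact at target."""
    if node.name == target:
        return 0.0, 1.0
    return node.best(target)


def send(sender, receiver, source, dest, hop_cost):
    """Process one hop of a packet travelling from source to dest."""
    fwd_est, fwd_conf = remaining(receiver, dest)   # what lies ahead
    bwd_est, bwd_conf = remaining(sender, source)   # what lies behind
    # Forward exploration: the sender learns about the route to dest.
    sender.update(dest, receiver.name, hop_cost, fwd_est, fwd_conf)
    # Backward exploration: the receiver learns about the route to source.
    receiver.update(source, sender.name, hop_cost, bwd_est, bwd_conf)
```

A usage example on a three-node line network A, B, C: create `a = Node("A", ["B"])`, `b = Node("B", ["A", "C"])`, `c = Node("C", ["B"])`, then call `send(a, b, "A", "C", 1.0)` followed by `send(b, c, "A", "C", 1.0)`. A single packet updates estimates in both directions at every hop, which is the intuition behind the faster learning described above.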
People
Shailesh Kumar
Masters Alumni
Publications
Topographic Receptive Fields and Patterned Lateral Interaction in a Self-Organizing Model of the Primary Visual Cortex
Joseph Sirosh and Risto Miikkulainen, Neural Computation, Vol. 9 (1996), pp. 577-594.
Related Areas
Reinforcement Learning
Labs
Neural Networks