Peter Stone's Selected Publications

Online Kernel Selection for Bayesian Reinforcement Learning

Joseph Reisinger, Peter Stone, and Risto Miikkulainen.
In Proceedings of the Twenty-Fifth International Conference on Machine Learning, July 2008.
ICML 2008

Abstract

Kernel-based Bayesian methods for Reinforcement Learning (RL) such as Gaussian Process Temporal Difference (GPTD) are particularly promising because they rigorously treat uncertainty in the value function and make it easy to specify prior knowledge. However, the choice of prior distribution significantly affects the empirical performance of the learning agent, and little work has been done extending existing methods for prior model selection to the online setting. This paper develops Replacing-Kernel RL, an online model selection method for GPTD using sequential Monte-Carlo methods. Replacing-Kernel RL is compared to standard GPTD and tile-coding on several RL domains, and is shown to yield significantly better asymptotic performance for many different kernel families. Furthermore, the resulting kernels capture an intuitively useful notion of prior state covariance that may nevertheless be difficult to capture manually.

BibTeX Entry

@InProceedings{ICML08-reisinger,
    author="Joseph Reisinger and Peter Stone and Risto Miikkulainen",
    title="Online Kernel Selection for Bayesian Reinforcement Learning",
    booktitle="Proceedings of the Twenty-Fifth International Conference on Machine Learning",
    month="July",
    year="2008",
    abstract={ Kernel-based Bayesian methods for Reinforcement Learning
		(RL) such as Gaussian Process Temporal Difference (GPTD) are
		particularly promising because they rigorously treat
		uncertainty in the value function and make it easy to specify
		prior knowledge. However, the choice of prior distribution
		significantly affects the empirical performance of the learning
		agent, and little work has been done extending existing methods
		for prior model selection to the online setting. This paper
		develops Replacing-Kernel RL, an online model selection method
		for GPTD using sequential Monte-Carlo methods. Replacing-Kernel
		RL is compared to standard GPTD and tile-coding on several RL
		domains, and is shown to yield significantly better asymptotic
		performance for many different kernel families. Furthermore, the
		resulting kernels capture an intuitively useful notion of prior
		state covariance that may nevertheless be difficult to capture
		manually. },
    wwwnote={<a href="http://icml2008.cs.helsinki.fi/">ICML 2008</a>},
}
