10- Armed Bandit Test bed using greedy algorithm

Version 1.0.0.0 (1.35 KB) by Sai Sandeep Damera

A script that creates a 10-armed bandit testbed and runs a greedy algorithm on it.

Updated 12 Mar 2018

The testbed is a set of 2000 randomly generated k-armed bandit
problems with k = 10. For each bandit problem, the action values
q*(a), a = 1, 2, ..., 10, are drawn from a normal (Gaussian) distribution with mean 0 and
variance 1. When a learning method applied to that problem selects action A_t at time step t,
the actual reward R_t is drawn from a normal distribution with mean q*(A_t) and variance 1.
For any learning method, performance and behavior can be measured as the method improves with
experience over 1000 time steps on one of the bandit problems. This makes up one run. Repeating
this for 2000 independent runs, each with a different bandit problem, gives a measure of the
learning algorithm's average behavior.
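
The testbed construction described above can be sketched in a few lines. This sketch uses Python's standard library rather than the MATLAB of the submitted script, and it is not taken from that script. The function names `make_bandit` and `pull` and the fixed seed are assumptions made here for illustration.

```python
import random

def make_bandit(k=10, rng=random):
    """Draw the true action values q*(a), one per arm, from N(0, 1)."""
    return [rng.gauss(0.0, 1.0) for _ in range(k)]

def pull(q_star, action, rng=random):
    """Return a reward drawn from N(q*(action), 1)."""
    return rng.gauss(q_star[action], 1.0)

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed: an assumption, for repeatability only
    q_star = make_bandit(10, rng)
    # The optimal arm is the one with the largest true action value.
    best = max(range(len(q_star)), key=lambda a: q_star[a])
    rewards = [pull(q_star, best, rng) for _ in range(5)]
    print("optimal arm:", best)
    print("sample rewards:", rewards)
```

Each call to `make_bandit` yields one bandit problem, so one run of the experiment corresponds to one call followed by 1000 calls to `pull`.
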
The script uses sample-average action-value estimates and evaluates the greedy algorithm by plotting the reward at each time step, averaged over the 2000 runs. The code can also be modified for a non-greedy algorithm.
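
A minimal sketch of the greedy procedure with sample-average estimates follows, again in standard-library Python rather than MATLAB and not copied from the submitted script. The names `greedy_run` and `average_reward`, the zero initial estimates, the random tie-breaking, and the seed are all assumptions made here for illustration.

```python
import random

def greedy_run(k=10, steps=1000, rng=random):
    """One run: greedy action selection with sample-average estimates."""
    q_star = [rng.gauss(0.0, 1.0) for _ in range(k)]  # true values q*(a)
    q_est = [0.0] * k  # estimates Q(a), initialised to 0 (an assumption)
    counts = [0] * k   # number of times each action has been taken
    rewards = []
    for _ in range(steps):
        best = max(q_est)
        # Greedy choice; ties are broken at random (an assumption).
        action = rng.choice([a for a in range(k) if q_est[a] == best])
        reward = rng.gauss(q_star[action], 1.0)
        counts[action] += 1
        # Incremental form of the sample average of rewards for this action.
        q_est[action] += (reward - q_est[action]) / counts[action]
        rewards.append(reward)
    return rewards

def average_reward(runs=2000, k=10, steps=1000, seed=0):
    """Average the reward at each time step over independent runs."""
    rng = random.Random(seed)
    totals = [0.0] * steps
    for _ in range(runs):
        for t, r in enumerate(greedy_run(k, steps, rng)):
            totals[t] += r
    return [total / runs for total in totals]

if __name__ == "__main__":
    # Smaller than the 2000 x 1000 experiment so the demo finishes quickly.
    curve = average_reward(runs=200, steps=200)
    print("average reward at first step:", curve[0])
    print("average reward at last step:", curve[-1])
```

Calling `average_reward()` with its defaults reproduces the 2000-run, 1000-step setting, and the returned list can be passed to any plotting tool. A non-greedy variant would change only the line that picks `action`.
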

Cite As

Sai Sandeep Damera (2024). 10- Armed Bandit Test bed using greedy algorithm (https://www.mathworks.com/matlabcentral/fileexchange/66467-10-armed-bandit-test-bed-using-greedy-algorithm), MATLAB Central File Exchange. Retrieved April 19, 2024.

MATLAB Release Compatibility

Created with R2017b

Compatible with any release

Platform Compatibility

Windows macOS Linux

Armed_Bandit_Testbed_Greedy_Sutton.m

Version	Published	Release Notes
1.0.0.0	12 Mar 2018