This is the "Data Stream Mining" project of Yuyan Zhao and Bérénice Jaulmes.
- bayes_ucb.py
It implements the BayesUCB algorithm [1] for the multi-armed bandit problem using the River library. We use a Beta distribution as the posterior for each arm and take its p-th quantile as that arm's upper confidence bound (UCB). At each step, the arm with the highest UCB is pulled, and the posterior of the pulled arm is then updated with the observed reward.
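The loop described above (compute a quantile per arm, pull the best arm, update its posterior) can be sketched in plain Python. This is a minimal illustration and not the code in bayes_ucb.py or River's API: the names `BayesUCBSketch`, `beta_cdf`, `beta_quantile` and `simulate` are hypothetical, and the sketch assumes Bernoulli (0/1) rewards, a uniform Beta(1, 1) prior and a fixed quantile level `p = 0.95`. To stay within the standard library, the Beta quantile is found by bisection on a CDF identity that only holds for integer parameters, rather than with a library quantile function.

```python
import math
import random


def beta_cdf(x, a, b):
    """CDF of Beta(a, b) at 0 < x < 1, for positive integers a and b.

    Uses the identity I_x(a, b) = P(Binomial(a + b - 1, x) >= a),
    with each term evaluated in log space to avoid overflow.
    """
    n = a + b - 1
    log_x, log_1mx = math.log(x), math.log1p(-x)
    total = 0.0
    for j in range(a, n + 1):
        log_binom = math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
        total += math.exp(log_binom + j * log_x + (n - j) * log_1mx)
    return min(total, 1.0)


def beta_quantile(p, a, b, iters=40):
    """p-th quantile of Beta(a, b), found by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if beta_cdf(mid, a, b) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


class BayesUCBSketch:
    """Hypothetical BayesUCB policy for Bernoulli rewards (assumption)."""

    def __init__(self, n_arms, p=0.95):
        self.p = p                 # quantile level used as the UCB (assumed fixed)
        self.alpha = [1] * n_arms  # 1 + number of observed successes
        self.beta = [1] * n_arms   # 1 + number of observed failures

    def pull(self):
        # UCB of each arm is the p-th quantile of its Beta posterior.
        ucbs = [beta_quantile(self.p, a, b) for a, b in zip(self.alpha, self.beta)]
        # Ties are broken in favour of the lowest arm index.
        return max(range(len(ucbs)), key=ucbs.__getitem__)

    def update(self, arm, reward):
        # Only the pulled arm's posterior changes.
        if reward:
            self.alpha[arm] += 1
        else:
            self.beta[arm] += 1


def simulate(probs, rounds, seed=0):
    """Run the policy on simulated Bernoulli arms; return pull counts per arm."""
    rng = random.Random(seed)
    policy = BayesUCBSketch(len(probs))
    counts = [0] * len(probs)
    for _ in range(rounds):
        arm = policy.pull()
        reward = 1 if rng.random() < probs[arm] else 0
        policy.update(arm, reward)
        counts[arm] += 1
    return counts
```

Calling `simulate([0.2, 0.5, 0.8], rounds=300)` returns how often each arm was pulled. The integer-parameter identity keeps the sketch dependency-free; with non-binary rewards or a non-integer prior, a general Beta quantile routine would be needed instead.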
- test.py
It evaluates the performance of our BayesUCB policy and compares it with the bandit algorithms already available in River. The result is shown in the figure below:
- /previous_version
Our current code was reorganized by Max via pull requests to River. The previous versions remain available in /previous_version.
[1] Kaufmann, Emilie, Olivier Cappé, and Aurélien Garivier. "On Bayesian upper confidence bounds for bandit problems." Artificial Intelligence and Statistics. PMLR, 2012.