Code to Accompany the Book “Bandit Algorithms for Website Optimization”
This repo contains code in several languages implementing standard algorithms for the Multi-Armed Bandit Problem, including:
It also contains code that provides a testing framework for bandit algorithms based around simple Monte Carlo simulations.
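As a rough illustration of what a Monte Carlo test of a bandit algorithm involves, here is a minimal Python sketch. It is not the repo's framework: the names `monte_carlo`, `UniformRandom`, and `BernoulliArm`, and the `select_arm`/`update`/`draw` method names, are assumptions made for this example. The harness runs many independent simulations and averages the reward observed at each time step.

```python
import random


class BernoulliArm:
    """Hypothetical arm: pays 1.0 with probability p, otherwise 0.0."""

    def __init__(self, p, rng):
        self.p = p
        self.rng = rng

    def draw(self):
        return 1.0 if self.rng.random() < self.p else 0.0


class UniformRandom:
    """Baseline 'algorithm' that picks arms at random and ignores feedback."""

    def __init__(self, n_arms, rng):
        self.n_arms = n_arms
        self.rng = rng

    def select_arm(self):
        return self.rng.randrange(self.n_arms)

    def update(self, chosen_arm, reward):
        pass  # a real algorithm would update its estimates here


def monte_carlo(make_algo, arms, num_sims, horizon):
    """Return the average reward at each time step across num_sims runs."""
    totals = [0.0] * horizon
    for _ in range(num_sims):
        algo = make_algo()  # fresh algorithm state for every simulation
        for t in range(horizon):
            chosen = algo.select_arm()
            reward = arms[chosen].draw()
            algo.update(chosen, reward)
            totals[t] += reward
    return [total / num_sims for total in totals]


rng = random.Random(0)  # seeded so runs are reproducible
arms = [BernoulliArm(0.1, rng), BernoulliArm(0.9, rng)]
curve = monte_carlo(lambda: UniformRandom(len(arms), rng), arms,
                    num_sims=2000, horizon=5)
```

With a random baseline and arms paying 0.1 and 0.9, each entry of `curve` should sit near 0.5. A useful algorithm should push later entries above that baseline.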
This codebase is split up by language. In most languages, there are parallel implementations of the core algorithms and infrastructure for testing the algorithms:
In R, there is also a body of code for visualizing and analyzing the results of simulations. The R code would benefit from some refactoring to make it DRYer.
To try out this code, you can go into the Python or Julia directories and then run the demo script.
In Python, that looks like:
In Julia, that looks like:
You should step through that code line-by-line to understand what the functions are doing. The book provides more in-depth explanations of how the algorithms work.
Adding New Algorithms: API Expectations
As described in the book, a Bandit algorithm should implement two methods:
As described in the book, an Arm simulator should implement:
The Bandit algorithms are also designed to implement one further method, used only in simulations:
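A minimal Python sketch of an algorithm and an arm that follow this shape is given below. The method names `select_arm`, `update`, `initialize`, and `draw` are assumptions made for illustration, as are the class names; check the code in each language directory for the actual signatures. The algorithm shown is a simple epsilon-greedy rule with an incremental running average.

```python
import random


class BernoulliArm:
    """Arm simulator (assumed interface): draw() returns one reward."""

    def __init__(self, p, rng=random):
        self.p = p
        self.rng = rng

    def draw(self):
        return 1.0 if self.rng.random() < self.p else 0.0


class EpsilonGreedy:
    """Explore a random arm with probability epsilon, else exploit."""

    def __init__(self, epsilon, rng=random):
        self.epsilon = epsilon
        self.rng = rng
        self.counts = []
        self.values = []

    def initialize(self, n_arms):
        # Extra method for simulations: reset all state before a new run.
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms

    def select_arm(self):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        # Ties go to the lowest-numbered arm.
        return self.values.index(max(self.values))

    def update(self, chosen_arm, reward):
        self.counts[chosen_arm] += 1
        n = self.counts[chosen_arm]
        # Incremental mean: new = old + (reward - old) / n
        self.values[chosen_arm] += (reward - self.values[chosen_arm]) / n


rng = random.Random(1)
arms = [BernoulliArm(0.2, rng), BernoulliArm(0.8, rng)]
algo = EpsilonGreedy(0.1, rng)
algo.initialize(len(arms))
for _ in range(1000):
    arm = algo.select_arm()
    algo.update(arm, arms[arm].draw())
```

Keeping `initialize` separate from the constructor lets a simulation reuse one algorithm object across many runs without carrying over learned estimates.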
In a future iteration, this code should be extended to provide Environment objects, which encapsulate not only a set of Arms, but also a mechanism for having those Arms change over time.
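One way such an Environment object might look is sketched below in Python. Nothing like this exists in the codebase yet, so every name here (`SwitchingEnvironment`, `changes`, `BernoulliArm`) is hypothetical. The sketch wraps a set of arms and swaps their reward probabilities at scheduled time steps.

```python
import random


class BernoulliArm:
    """Hypothetical arm: pays 1.0 with probability p, otherwise 0.0."""

    def __init__(self, p, rng):
        self.p = p
        self.rng = rng

    def draw(self):
        return 1.0 if self.rng.random() < self.p else 0.0


class SwitchingEnvironment:
    """Hypothetical Environment: a set of arms plus a schedule of changes."""

    def __init__(self, arms, changes):
        self.arms = arms
        self.changes = changes  # {time step: [new p for each arm]}
        self.t = 0

    def draw(self, arm_index):
        # Apply any change scheduled for the current time step first.
        new_ps = self.changes.get(self.t)
        if new_ps is not None:
            for arm, p in zip(self.arms, new_ps):
                arm.p = p
        self.t += 1
        return self.arms[arm_index].draw()


rng = random.Random(0)
env = SwitchingEnvironment(
    [BernoulliArm(0.0, rng), BernoulliArm(1.0, rng)],
    changes={3: [1.0, 0.0]},  # the two arms trade places at step 3
)
rewards = [env.draw(0) for _ in range(6)]
# rewards == [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
```

Because the probabilities used here are exactly 0.0 and 1.0, the output is deterministic: arm 0 never pays before step 3 and always pays afterwards. An algorithm that stops exploring would miss a switch like this one.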