TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.


Monte Carlo Methods for Solving Reinforcement Learning Problems

Dissecting “Reinforcement Learning” by Richard S. Sutton with Custom Python Implementations, Episode III

18 min read · Sep 4, 2024


We continue our deep dive into Sutton’s great book about RL [1] and focus here on Monte Carlo (MC) methods. These learn from experience alone: they do not require a model of the environment, unlike the dynamic programming (DP) methods we introduced in the previous post.

This is extremely tempting, because the model is often unknown or the transition probabilities are hard to specify. Consider the game of Blackjack: even though we fully understand the game and its rules, solving it via DP methods would be very tedious. We would have to compute all kinds of probabilities, e.g. given the cards played so far, how likely is a “blackjack”, how likely is it that another seven is dealt, and so on. With MC methods we don’t have to deal with any of this; we simply play and learn from experience.
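To make "learning from experience" concrete, here is a minimal sketch of the sampling idea, not an RL algorithm yet and not code from the original post. It estimates how likely a player with a hard 16 is to go bust when drawing one more card, simply by drawing many cards and counting. The function name `estimate_bust_probability` and the infinite-deck assumption (cards drawn with replacement, ace counted as 1) are illustrative choices made here.

```python
import random

# Assumption: infinite deck, so every draw is independent.
# Ace counts as 1 here; 10, J, Q and K all count as 10.
CARDS = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 10, 10]


def estimate_bust_probability(player_sum, n_samples=100_000, seed=0):
    """Estimate P(bust after one hit) by sampling cards, without any formula."""
    rng = random.Random(seed)
    busts = 0
    for _ in range(n_samples):
        # Draw one card and check whether the new sum exceeds 21.
        if player_sum + rng.choice(CARDS) > 21:
            busts += 1
    return busts / n_samples


estimate = estimate_bust_probability(16)
exact = 8 / 13  # cards 6, 7, 8, 9 and the four ten-valued cards bust a hard 16
print(f"sampled: {estimate:.3f}, exact: {exact:.3f}")
```

For this single question the exact answer is easy to write down, which makes it a convenient sanity check. The appeal of sampling is that the same loop keeps working when the exact computation becomes unwieldy.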

Photo by Jannis Lucas on Unsplash

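Because MC methods estimate values by averaging complete sampled returns rather than building on other estimates, their estimates are unbiased. They are conceptually simple and easy to understand, but exhibit high variance and do not bootstrap: an estimate is never updated from other estimates, so learning has to wait until an episode has finished.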
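A small sketch may help show what working without bootstrapping looks like in code. The following is a hedged, minimal version of first-visit MC prediction as commonly described for Chapter 5 of the book; it is not the author's implementation. The function name `first_visit_mc_prediction` and the episode format (a list of `(state, reward)` pairs, where the reward is the one received after leaving that state) are assumptions made for this example.

```python
from collections import defaultdict


def first_visit_mc_prediction(episodes, gamma=1.0):
    """Average the return following the first visit to each state.

    episodes: list of episodes, each a list of (state, reward) pairs
              (hypothetical format chosen for this sketch).
    """
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)

    for episode in episodes:
        # Remember the time step at which each state is first visited.
        first_visit = {}
        for t, (state, _) in enumerate(episode):
            first_visit.setdefault(state, t)

        # Walk backwards through the finished episode, accumulating the return.
        g = 0.0
        for t in range(len(episode) - 1, -1, -1):
            state, reward = episode[t]
            g = gamma * g + reward
            if first_visit[state] == t:
                returns_sum[state] += g
                returns_count[state] += 1

    # The value estimate is the plain average of the observed returns.
    return {s: returns_sum[s] / returns_count[s] for s in returns_sum}


# Three tiny hand-written episodes over two states "A" and "B".
episodes = [
    [("A", 0.0), ("B", 1.0)],
    [("A", 0.0), ("B", 0.0)],
    [("B", 1.0)],
]
values = first_visit_mc_prediction(episodes)
print(values)
```

Note that the return `g` can only be computed once the episode is over, and that each estimate depends only on observed rewards, never on the estimate of another state. The few returns per state in this toy data also hint at why the variance of such averages can be high.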

As mentioned, here we will introduce these methods following Chapter 5 of Sutton’s book…


Published in TDS Archive

Written by Oliver S

PhD in ML, working as research / software engineer