A course in eleven chapters

You built a Pokémon AI.
Here's how it works.

This repository contains a system that taught itself to play competitive Pokémon doubles by playing millions of turns against itself — and became a draft-and-play recommender. You steered every step of it. This course explains, from zero, what each piece actually does and why it exists.

No machine-learning background is assumed. Every term is defined the first time it appears, every idea is grounded in something that actually happened in this project, and where an analogy helps, it comes from data engineering — pipelines, caches, schemas — because that's the world this course's reader lives in.

The whole project in one picture

The loop the whole repo exists to run. The AI plays batches of games against itself, learns from the outcomes, gets measured, and — if it passes — becomes both the new opponent and the product. Every chapter in this course explains one part of this picture.
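As a rough sketch of that loop's control flow only (not the repo's actual code), the toy below plays a batch, "learns", measures, and promotes the candidate if it clears a gate. Every name here (`play_game`, `training_loop`, the `gate` threshold of 0.55, the single "skill" number standing in for a neural network) is a hypothetical stand-in chosen for illustration.

```python
import random

def play_game(learner_skill, opponent_skill, rng):
    """Toy stand-in for one battle: True if the learner wins."""
    p_win = learner_skill / (learner_skill + opponent_skill)
    return rng.random() < p_win

def training_loop(generations=5, batch_size=200, gate=0.55, seed=0):
    rng = random.Random(seed)
    champion = 1.0  # the current opponent, which is also the shipped "product"
    history = []
    for gen in range(generations):
        candidate = champion  # the learner starts as a copy of the champion
        # Play + learn: a fake "update" after every game (assumption: real
        # learning adjusts network weights, not a single number).
        for _ in range(batch_size):
            won = play_game(candidate, champion, rng)
            candidate += 0.01 if won else 0.005
        # Measure: a fresh batch of evaluation games against the champion.
        wins = sum(play_game(candidate, champion, rng) for _ in range(batch_size))
        win_rate = wins / batch_size
        # Gate: only a candidate that passes replaces the champion.
        promoted = win_rate >= gate
        if promoted:
            champion = candidate
        history.append((gen, win_rate, promoted))
    return champion, history

if __name__ == "__main__":
    final, log = training_loop()
    for gen, rate, promoted in log:
        print(f"generation {gen}: win rate {rate:.2f}, promoted={promoted}")
```

The design point the sketch tries to capture is the gate: a candidate that fails measurement is discarded, so the opponent and the product only ever move forward together.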

The story in one paragraph

Pokémon Champions Regulation M-B is a doubles format: bring six Pokémon, pick four, lead with two. The lucky break that made this project feasible is that the format already exists inside Pokémon Showdown, an open-source battle simulator — so instead of building a game engine, we wrapped one (Chapter 2). On top of it we trained a small neural network by reinforcement learning: it starts clicking random buttons, and every win or loss nudges it toward better clicking (Chapters 3–7). It grew eyes (features for HP, speed, "does this move KO?"), a memory for identities (embeddings for species, items, moves), and eventually took over drafting the team itself. A search procedure that thinks one turn ahead made it stronger still — and then quietly destroyed its ability to play Trick Room, teaching us the project's biggest lesson: a win-rate number can hide a strategy lobotomy (Chapter 10). The final system recommends your pick-four, your leads, and your first moves against any opponent (Chapter 11).
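To make "every win or loss nudges it toward better clicking" concrete, here is a minimal sketch under heavy assumptions: three imaginary buttons with made-up win probabilities, and a textbook policy-gradient (REINFORCE-style) update on a softmax policy. It is not the project's training code, and `train_policy` and its parameters are hypothetical names for illustration.

```python
import math
import random

def softmax(logits):
    """Turn raw scores into click probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train_policy(win_prob, steps=5000, lr=0.1, seed=0):
    """win_prob[i] is the (made-up) chance that clicking button i wins."""
    rng = random.Random(seed)
    logits = [0.0] * len(win_prob)  # all zeros: starts by clicking at random
    for _ in range(steps):
        probs = softmax(logits)
        action = rng.choices(range(len(probs)), weights=probs)[0]
        reward = 1.0 if rng.random() < win_prob[action] else -1.0
        # Nudge: a win raises the chosen button's score, a loss lowers it.
        for i in range(len(logits)):
            grad = (1.0 if i == action else 0.0) - probs[i]
            logits[i] += lr * reward * grad
    return softmax(logits)

if __name__ == "__main__":
    print(train_policy([0.1, 0.3, 0.9]))
```

With these assumed win probabilities, the policy should end up strongly preferring the third button, even though nothing ever told it which button was best; it only saw wins and losses.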

The course

Part I — The setup

Part II — The learner

Part III — The craft

Part IV — The payoff

How to read this course

Chapters build on each other, so the first time through, go in order; Part II especially is a staircase. Along the way you'll meet four recurring signposts:

In plain terms (analogy)

The concept restated with no math, usually in data-engineering language.

War story (pitfall)

Something that actually went wrong in this project, and what it taught us. These are the best parts.

Key point (takeaway)

The one sentence to remember if you remember nothing else from the section.

Check yourself (quiz)

Two or three questions at the end of each chapter. Click to reveal the answer.

Numbers in this course are the project's real numbers — win rates, Elo ratings, feature dimensions — taken from the repo's own docs (docs/roadmap.md, docs/evaluate.md, docs/regression.md, docs/architecture.md and the archive). Where a chapter refers to code, it names the file, so you can go read the real thing.

Key point

Nothing in this system was designed by a genius in one sitting. It's a loop — play, learn, measure — plus two years' worth of lessons compressed into a few months of asking "why is it doing that?" and fixing what the answer revealed. This course is the map of those fixes.
