Autochess started as a personal experiment: can I build a chess engine
from scratch that plays purely through neural network inference?
No Stockfish, no handcrafted evaluation, no opening books — just a neural
network trained on high-Elo games, endgame tablebases, and self-play reinforcement
learning.
The whole thing is an exercise in curiosity. I wanted to understand how
AlphaZero-style systems work — not by reading papers, but by building one
myself, debugging every gradient, and watching it slowly go from random moves to
something that actually plays chess. It’s equal parts frustrating and magical.
Goals

- Learn by building — deeply understand neural chess from data preparation through training to inference
- Full pipeline — supervised pretraining, endgame fine-tuning with Syzygy tablebases, and self-play RL with search distillation
- Pure neural inference — during play the model uses only its policy and value heads with a shallow search, no traditional engine
- Ship it — make it playable in the browser so anyone can test it and see how a neural-only engine feels
How it works
The model is a residual CNN with a policy head (4672 possible moves) and a value
head. It takes an AlphaZero-style 19-plane 8×8 board representation as input.
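A minimal sketch of that shape in PyTorch — the block count and channel width here are my assumptions, not the project's actual configuration, but the input/output dimensions match the description (19×8×8 in, 4672 policy logits and a scalar value out; 4672 comes from AlphaZero's 73 move types per square × 64 squares):

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.conv1 = nn.Conv2d(ch, ch, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(ch)
        self.conv2 = nn.Conv2d(ch, ch, 3, padding=1)
        self.bn2 = nn.BatchNorm2d(ch)

    def forward(self, x):
        y = torch.relu(self.bn1(self.conv1(x)))
        y = self.bn2(self.conv2(y))
        return torch.relu(x + y)  # residual connection

class ChessNet(nn.Module):
    def __init__(self, planes=19, ch=128, blocks=6, move_types=73):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(planes, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.ReLU())
        self.tower = nn.Sequential(*[ResBlock(ch) for _ in range(blocks)])
        # Policy head: 73 move-type planes over 64 squares -> 4672 logits
        self.policy = nn.Sequential(nn.Conv2d(ch, move_types, 1), nn.Flatten())
        # Value head: scalar evaluation squashed to [-1, 1]
        self.value = nn.Sequential(
            nn.Conv2d(ch, 1, 1), nn.Flatten(),
            nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 1), nn.Tanh())

    def forward(self, x):  # x: (batch, 19, 8, 8)
        h = self.tower(self.stem(x))
        return self.policy(h), self.value(h)
```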
Training happens in three phases:

- Phase 1 — Supervised pretraining on 2200+ Elo Lichess games. The model learns to predict human expert moves.
- Phase 2 — Endgame fine-tuning using Syzygy tablebases (3-4-5 piece). Perfect WDL/DTZ labels teach the model precise endgame play.
- Phase 3 — Self-play RL with search distillation. The model plays against itself, using a 1-ply search to generate soft policy targets that are distilled into the policy with a KL-divergence loss.
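The distillation objective in phase 3 can be sketched like this — the function name, the MSE value term, and the weighting are my assumptions; the source only specifies the KL loss against search-derived soft targets:

```python
import torch
import torch.nn.functional as F

def distillation_loss(policy_logits, search_policy, value_pred, game_outcome,
                      value_weight=1.0):
    """KL divergence between the 1-ply search's soft move distribution and the
    model's policy, plus a value term on the game outcome (weighting is a guess)."""
    log_probs = F.log_softmax(policy_logits, dim=-1)
    # KL(search || model): pulls the policy toward the search targets
    policy_loss = F.kl_div(log_probs, search_policy, reduction="batchmean")
    value_loss = F.mse_loss(value_pred.squeeze(-1), game_outcome)
    return policy_loss + value_weight * value_loss
```

Soft targets are the point here: unlike one-hot best-move labels, the search distribution tells the model *how much* better one candidate move is than another.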
The latest model also uses learned thought tokens — a small
set of trainable embeddings that the model attends to via a transformer layer before
making its move prediction. Think of it as a brief internal “pause to think”
that lets the model shift probability mass between candidate moves.
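A hypothetical sketch of the thought-token mechanism as described — the dimensions, head count, and residual/norm placement are my guesses, not the project's actual layer:

```python
import torch
import torch.nn as nn

class ThoughtTokens(nn.Module):
    """Board features attend to a small set of learned 'thought' embeddings
    through one transformer-style attention layer before the move prediction."""
    def __init__(self, dim=128, n_thoughts=8, heads=4):
        super().__init__()
        # Trainable embeddings, shared across all positions
        self.thoughts = nn.Parameter(torch.randn(n_thoughts, dim) * 0.02)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, board_feats):  # (batch, 64 squares, dim)
        b = board_feats.size(0)
        t = self.thoughts.unsqueeze(0).expand(b, -1, -1)
        # Each square queries the thought tokens; residual keeps CNN features
        out, _ = self.attn(board_feats, t, t, need_weights=False)
        return self.norm(board_feats + out)
```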