RT @burkov
This paper shows how a 7-million-parameter network can beat 671-billion-parameter models on hard puzzle tasks. On the ARC-AGI-1 reasoning benchmark, the tiny model achieves 45% accuracy compared to 15.8% for DeepSeek R1 and 34.5% for OpenAI's o3-mini. On Sudoku-Extreme, a set of extremely difficult Sudoku puzzles with only 1,000 training examples, it reaches 87% accuracy where the previous best method managed 55%.
The method treats problem-solving as iterative refinement. The network maintains a current answer and a separate "reasoning" state, then repeatedly updates both: reason about the problem, propose a better answer, reason again, improve further. A Sudoku solver might start with a partially filled grid, update some cells, then reason again about the new constraints—gradually correcting mistakes across up to 16 refinement cycles.
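For intuition, here is a minimal PyTorch sketch of that refinement loop. It is an illustration under assumptions, not the paper's exact architecture: the class name TinyRecursiveModel, the embedding size, and the inner-step count are all hypothetical, and the real model is a tiny two-layer network trained with supervision at each cycle.

    import torch
    import torch.nn as nn

    class TinyRecursiveModel(nn.Module):
        """Hypothetical sketch: one small network refines a latent
        reasoning state z, then proposes an improved answer y."""
        def __init__(self, dim=64):
            super().__init__()
            # a single tiny network reused for both update steps
            self.net = nn.Sequential(
                nn.Linear(3 * dim, dim), nn.ReLU(), nn.Linear(dim, dim))

        def forward(self, x, y, z, n_inner=6):
            # reason: update the latent state several times
            for _ in range(n_inner):
                z = self.net(torch.cat([x, y, z], dim=-1))
            # propose: refine the current answer using the new state
            y = y + self.net(torch.cat([x, y, z], dim=-1))
            return y, z

    model = TinyRecursiveModel()
    x = torch.randn(1, 64)   # problem embedding (e.g., a Sudoku grid)
    y = torch.zeros(1, 64)   # initial answer guess
    z = torch.zeros(1, 64)   # initial reasoning state
    for step in range(16):   # up to 16 refinement cycles
        y, z = model(x, y, z)

This only shows the forward recursion; in training, each refinement cycle also receives a supervision signal so the network learns to improve its own previous answers.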
The core insight is that large models fail because they generate answers in one pass, so early mistakes cascade. Breaking the problem into iterative steps, even with far less computational power, proves more effective. Training takes hours on a single GPU instead of weeks on clusters, and the model fits in 28MB instead of terabytes.
Let the article talk to you on ChapterPal: https://chapterpal.com/s/e2942994/less-is-more-recursive-reasoning-with-tiny-networks
Download the PDF: https://arxiv.org/pdf/2510.04871