I looked at just one of those multi-agent systems, called MAS-ZERO. In its minimal setting, it dispatches 5 concurrent queries to a model such as gpt-5 and repeats that loop 10 times.
The system prompt turns every one of these queries into tens of thousands of tokens, and each query routinely dispatches secondary queries to 5 other models. So if your initial query was 100 tokens, this turns it into 10,000 x 5 x 50 tokens: 2.5 million at the low end, so several million.
And that is before even turning on "reasoning".
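
A quick back-of-the-envelope check of that multiplication, as a rough sketch. The variable names are mine, and I am assuming the 50 is 5 concurrent queries times 10 loops and that 10,000 stands for the low end of "tens of thousands"; neither reading is confirmed by the MAS-ZERO code itself.

```python
# Rough estimate only; all figures are the round numbers from the post.
tokens_per_query = 10_000   # after system-prompt expansion (assumed low end)
concurrent_queries = 5      # minimal setting
loops = 10                  # minimal setting
secondary_models = 5        # each query fans out to 5 other models

# Assumption: the "50" in the post is concurrent queries times loops.
primary_calls = concurrent_queries * loops

total_tokens = tokens_per_query * secondary_models * primary_calls
print(f"{total_tokens:,} tokens")  # 2,500,000 tokens
```

The original 100-token query is negligible next to the system-prompt overhead, which is why it does not appear in the product.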
(2/3)
With reasoning, there is yet another explosion of at least 10x, often 100x.
That quickly takes us into tens to hundreds of millions of tokens for a single query that started out as about a hundred tokens.
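
To put rough numbers on that range, here is a small sketch that applies the 10x and 100x multipliers to the non-reasoning estimate of 10,000 x 5 x 50 tokens. The function name `with_reasoning` is hypothetical, and the baseline is the post's round-number estimate, not a measurement.

```python
def with_reasoning(baseline_tokens: int, multiplier: int) -> int:
    """Scale a non-reasoning token estimate by a reasoning multiplier."""
    return baseline_tokens * multiplier


# Assumed baseline: the round-number estimate without reasoning.
baseline = 10_000 * 5 * 50  # 2,500,000 tokens

for multiplier in (10, 100):
    print(f"{multiplier}x: {with_reasoning(baseline, multiplier):,} tokens")
# 10x: 25,000,000 tokens
# 100x: 250,000,000 tokens
```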
This is a big part of why Google now reports processing more than a quadrillion tokens monthly.
(3/3)
#FrugalComputing #AgenticAI #GenAI