It takes a lot of tokens to replace a human
Businesses are getting excited about replacing workers with AI agents that can work around the clock and never tire, but replacing (or even augmenting) workers with AI consumes a lot of tokens.
For example, some users report burning through over 100M tokens / day with their OpenClaw agents.
One reason OpenClaw tears through tokens is its “heartbeat”: a check that fires every 30 minutes and often transmits over 100K tokens of basic context about the system (NotebookCheck, Feb 2026).
If you burned through 100M input tokens on Claude Opus 4.6, that would cost $500 at Anthropic’s current prices, and output tokens cost five times as much. There’s now a cottage industry helping you reduce your OpenClaw token usage, but the underlying issue remains: supercharging your workers is expensive. Jensen Huang recently said that top engineers should be burning through $250K / year in tokens. At current prices of $5-$25 per million tokens, that’s roughly 27-137M tokens / day.
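The daily-token figure falls out of simple division. A minimal sketch, using only the $250K / year budget and the $5-$25 per-million-token prices quoted above (the 365-day year is my assumption):

```python
# Back-of-envelope: daily token budget implied by a $250K/year token spend,
# at the $5-$25 per million token prices quoted above.
ANNUAL_BUDGET = 250_000          # dollars per year (Huang's figure)
DAYS_PER_YEAR = 365              # assumption: agents run every day

daily_budget = ANNUAL_BUDGET / DAYS_PER_YEAR   # ~$685/day

for price_per_m in (25, 5):      # dollars per 1M tokens
    tokens_per_day = daily_budget / price_per_m * 1_000_000
    print(f"${price_per_m}/M tokens -> {tokens_per_day / 1e6:.0f}M tokens/day")
```

At the expensive end ($25/M) the budget buys about 27M tokens a day; at the cheap end ($5/M), about 137M.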
This means that even if we wanted to replace all our engineers with AI, we’d need a lot of compute. But how much?
If we’re essentially replacing workers with GPUs, we need to know how many tokens per second a modern GPU can generate. For frontier-class models (e.g. Llama 3.1 405B), current Nvidia GPUs (e.g. H100s or H200s) produce about 500 tokens / sec, according to Nvidia benchmarks. Nvidia’s newer B200s can produce almost 1,800 tokens / sec.
To support a worker using 100M tokens / day, you would need about 2.3 H100s. H100s currently cost about $25K each, so that’s a roughly $58K one-time cost, which seems better than Jensen Huang’s estimate of $250K / year. But of course, if your company wants to run its own models, you’re stuck with open-source models, which are far behind Opus 4.6 in capabilities.
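The per-worker GPU count is just the required token rate divided by per-GPU throughput, using the 100M tokens / day and ~500 tokens / sec figures from the text:

```python
# How many H100s does one heavy user need?
# Inputs from the text: 100M tokens/day, ~500 tokens/sec per H100, $25K per GPU.
TOKENS_PER_DAY = 100_000_000
TOKENS_PER_SEC_H100 = 500
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400
COST_PER_H100 = 25_000           # dollars

required_rate = TOKENS_PER_DAY / SECONDS_PER_DAY    # ~1,157 tokens/sec
gpus_needed = required_rate / TOKENS_PER_SEC_H100   # ~2.3 H100s
hardware_cost = gpus_needed * COST_PER_H100         # ~$58K up front

print(f"{gpus_needed:.1f} H100s, ~${hardware_cost / 1000:.0f}K one-time")
```

Note this assumes the GPUs are saturated around the clock; bursty usage would need more headroom.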
No matter who’s running the GPUs, if we wanted to augment our knowledge workers with AI, we’d need a lot of them. There are about 63 million knowledge workers in the US. If you wanted to replace (or augment) them all with 2.3 H100s each, that would cost about $3.6T. We’d need about 7 Stargates to make that happen.
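Scaling the per-worker estimate to the whole US knowledge workforce is one multiplication; the ~$500B per-Stargate price tag below is my assumption, not stated in the text:

```python
# National scale: 63M knowledge workers, each backed by 2.3 H100s.
KNOWLEDGE_WORKERS = 63_000_000
GPUS_PER_WORKER = 2.3            # from the per-worker estimate above
COST_PER_H100 = 25_000           # dollars
STARGATE_COST = 500_000_000_000  # assumption: ~$500B per Stargate

total_cost = KNOWLEDGE_WORKERS * GPUS_PER_WORKER * COST_PER_H100
print(f"${total_cost / 1e12:.1f}T in GPUs")            # ~$3.6T
print(f"{total_cost / STARGATE_COST:.1f} Stargates")   # ~7.2
```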
But maybe average workers don’t need as many tokens. Maybe Jensen Huang was just saying that companies should spend about half of each worker’s salary on AI. Suppose companies’ total expenditure on employees stays about the same, but every remaining employee costs 50% more (salary plus tokens); then companies would have to cut their workforce by a third to keep total expenditures flat. Glassdoor suggests the average knowledge worker makes about $84-152K / year. After the cuts there would be about 42 million knowledge workers, and at half of those salaries spent on tokens, companies would collectively be spending roughly $1.8T - $3.2T annually.
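The aggregate figure follows from the remaining headcount times half the salary range. A sketch under those assumptions:

```python
# Aggregate token spend if every remaining worker consumes tokens worth
# half their salary (salary range from the Glassdoor figure above).
KNOWLEDGE_WORKERS = 63_000_000
WORKERS_AFTER_CUTS = KNOWLEDGE_WORKERS * 2 / 3   # ~42M after cutting 1/3
SALARY_LOW, SALARY_HIGH = 84_000, 152_000        # dollars/year
TOKEN_SHARE = 0.5                                # half of salary on tokens

low = WORKERS_AFTER_CUTS * SALARY_LOW * TOKEN_SHARE
high = WORKERS_AFTER_CUTS * SALARY_HIGH * TOKEN_SHARE
print(f"${low / 1e12:.1f}T - ${high / 1e12:.1f}T per year on tokens")
```

Interestingly, even after cutting a third of the workforce, the annual token bill alone lands in the same ballpark as the one-time ~$3.6T GPU buildout above.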