Coding agents run in autonomous loops that compound token costs in ways classic chat never does. This article breaks down how tokens accumulate, why agent mode is expensive by default, and four practical strategies to cut your bill significantly: prompt caching, planning first, output compression with the Caveman skill, and input compression with RTK.
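To make the compounding concrete, here is a minimal sketch of how input tokens add up when every turn of an agent loop resends the full conversation history. The function name `cumulative_input_tokens` and all the numbers (a 2,000-token starting context, 1,500 tokens appended per turn) are hypothetical, chosen only for illustration. The model deliberately ignores output tokens and any caching discount.

```python
def cumulative_input_tokens(system_tokens: int, tokens_per_turn: int, turns: int) -> int:
    """Total input tokens sent across a loop that resends the full history each turn.

    Hypothetical model: the context starts at `system_tokens` and grows by
    `tokens_per_turn` (tool output plus model reply) after every turn.
    """
    total = 0
    context = system_tokens
    for _ in range(turns):
        total += context            # the whole context is sent as input on this turn
        context += tokens_per_turn  # new tool output and reply are appended to history
    return total


# Illustrative numbers only, not measurements.
single_chat_turn = cumulative_input_tokens(2_000, 1_500, turns=1)
agent_loop = cumulative_input_tokens(2_000, 1_500, turns=20)
print(single_chat_turn, agent_loop)
```

Under these assumptions the total grows roughly with the square of the number of turns, because each turn pays again for everything that came before it. That is why a long autonomous loop costs far more than the same number of isolated chat messages.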