Source: The Register
As OpenAI, Anthropic, and other API providers tighten rate limits and raise token costs, developers are shifting to self-hosted open models such as Llama and Mistral to avoid metering altogether, trading convenience for control. Companies with in-house ML expertise now have a concrete incentive to absorb the operational complexity themselves rather than pay per-token rent to cloud providers. The move mirrors earlier patterns in databases and compute, but it matters here because coding agents are where enterprises first see measurable ROI from LLMs, which makes local deployment a business decision rather than a technical one.