Sunday, May 31, 2026 ~5 min

The week the AI stack billed differently and broke differently

Anthropic killed the third-party harness discount, ran out of capacity, and started serving Claude off a hostile competitor's GPUs — while NGINX, Traefik, and MOVEit all shipped pre-auth bugs on the same edge your agents now sit behind.

On June 15, every Claude subscription converts into a dollar-matched API credit pool. The 70-90% implicit discount that Cursor, Cline, OpenCode, Zed, and the rest of the third-party harness ecosystem has been quietly riding for the last year — gone. Programmatic usage now bills at API rates. Overflow at API rates. No rollover.

If you've been modeling per-developer Claude cost off a $200 Max plan, your number is wrong by roughly an order of magnitude. ServiceNow's CDIO already lived the preview: full-year Anthropic budget exhausted by May, with no per-user telemetry from Anthropic to explain which workloads ate it. National Life Group's CIO put it on the record — "great for consumer usage but not great for companies."

The pricing change isn't the interesting part. The pricing change is what an IPO-bound company does to its margin profile four months before the S-1. The interesting part is what's underneath it.

An 80x miss, papered over with a hostile lease

Dario Amodei said it on stage: Anthropic planned for 10x growth and got 80x. The eight-fold forecast error is what users were experiencing for weeks as silent quality degradation in Claude Code — not a model regression, a capacity wall. Monitoring didn't catch it because upstreams that degrade without returning 5xx don't trigger fallbacks. The signal was users saying the output got worse.

The relief valve is the part that should make every security and procurement lead sit up: Anthropic leased xAI's entire Colossus 1 cluster — 220,000+ GPUs across H100, H200, and GB200 — from the CEO who called them "misanthropic and evil" three months ago. Read that contract before the next vendor review. Customer prompts, source code, and completions now transit infrastructure operated by a direct competitor with documented hostility to the counterparty. Most sub-processor registers don't list it yet. Most DPIAs were written before it was true.

And Anthropic just passed OpenAI in B2B share — 34.4% to 32.3% per Ramp. Which means most DLP, CASB, and egress monitoring stacks, calibrated when ChatGPT was the synonym for "LLM risk," are now blind to the larger exfiltration channel. Parity monitoring for api.anthropic.com, the Claude Code CLI, and MCP server traffic is a two-week sprint, not a quarterly initiative.

OpenAI's two-month window is a free benchmark, not a sales pitch

Hours after the metering announcement, OpenAI dropped two months of free Codex for enterprise switchers within a 30-day window. Treat it as what it is: a zero-cost head-to-head evaluation on workloads you actually run. Worst case is data. Best case is a credible second source the next time Anthropic decides to A/B test access revocation on your account — which they have done, to paying customers, this quarter.

The multi-provider abstraction layer was a nice-to-have when one vendor was 4x cheaper than the API. It is now the difference between a config change and a rewrite when the next pricing letter arrives.

Meanwhile, the edge is on fire

While the AI cost story dominated the inboxes, three pre-auth bugs landed on the layer every request hits before it gets anywhere near a model:

NGINX rewrite module RCE — unauthenticated, pre-auth, internet-facing, 18 years in the codebase. Affects every deployment using rewrite rules, which is roughly all of them. The bug is older than half the engineers who'll have to patch it.
Traefik CVE-2026-35051/39858 — CVSS 10.0 auth bypass. ForwardAuth, BasicAuth, every middleware in the chain is decorative until patched. Internal services are effectively internet-facing with no auth.
MOVEit Automation CVE-2026-4670 — CVSS 9.8, same product line and bug class as the 2023 Cl0p campaign that ran for months before most victims noticed.

Pair that with PraisonAI, which went from disclosure to working exploit in four hours, and Argo CD's 9.6 letting any authenticated user pull plaintext Kubernetes secrets — including the model-registry tokens, HuggingFace PATs, and provider API keys most teams park there.

The four-hour weaponization tempo is the number that retires the 30-day patch SLA as a planning assumption. UK AISI confirmed the why: Anthropic's Mythos cleared both end-to-end attack ranges autonomously — full network takeover, not advanced persistence. GPT-5.5-cyber cleared one. The evidence that detection rules tuned to human-paced lateral movement will produce false negatives against agentic chains is no longer theoretical.

And because Congress is routing Mythos access through NSA rather than CISA, the civilian uplift schedule is later than the threat curve. Plan as if no government help arrives at parity with adversaries.

What to do this week

Three things, in order, before Friday.

One: model the June 15 cost impact and ship the abstraction layer. Tag every Claude-backed workload — Agent SDK, GitHub Actions, batch evals, harness traffic — and project the June bill at API rates. Stand up an LLM gateway (LiteLLM with the recent CVE patched, Portkey, your own thin shim) with per-user, per-feature attribution and daily budget alerts. Anthropic ships no native cost telemetry. The observability gap is yours to close, and ServiceNow proved what happens if you don't.

Two: patch the edge tonight. NGINX, Traefik, MOVEit — in that order. Then rotate every secret Argo CD could read in any namespace it touched. Patching without rotation leaves the credentials exposed; the second one is the one most teams skip and regret.

Three: instrument trajectory-level evals. Vercel's gateway data across 200K teams says 59% of token volume is now agentic. If your eval harness still scores single-turn responses against a reference answer, you are benchmarking the minority of your traffic. Add cost-per-successful-task, tool-call F1, steps-to-completion, and recovery-from-error rate this sprint. Pass@1 stops measuring reliability at exactly the capability level you're shipping into.

The pricing letter, the capacity miss, the hostile lease, the edge CVEs, and the agentic traffic share are the same story told from five seats. The story is that the assumptions underneath last year's AI roadmap — cheap inference via harness arbitrage, single-vendor dependency, 30-day patch windows, single-turn evals — all expired in the same week. The teams that recalculate this week keep their margin and their renewals. The teams that don't will discover all five problems simultaneously, and probably from the same incident.

◆ Behind the synthesis

Six specialist takes that fed this piece.

The piece above is one stream in my voice. Below are the six lenses my pipeline produced upstream — each tuned for a different reader. Use them when you want the angle that matters most to your role.

The week the AI stack billed differently and broke differently

An 80x miss, papered over with a hostile lease

OpenAI's two-month window is a free benchmark, not a sales pitch

Meanwhile, the edge is on fire

What to do this week

Six specialist takes that fed this piece.

NGINX shipped an unauthenticated RCE in the rewrite module.

Two pre-auth bugs dropped on the same day: an 18-year-old unauthenticated RCE in the NGINX rewrite module, and a CVSS 10.0 auth bypass in Traefik.

Anthropic's June 15 pricing restructure eliminates the 70-90% implicit discount third-party harness users (Cursor, Cline, OpenCode) have been building unit economics around — your per-developer AI cost assumption is wrong by roughly an order of magnitude.

Anthropic's Mythos became the first AI model to fully take over both UK AISI attack ranges autonomously, and a parallel study showed AI reverse-engineering all five major EDR products in days rather than weeks.

Anthropic's June 15 pricing change closed the seventy-to-ninety percent subscription arbitrage the third-party Claude tools were quietly running on, which is to say every Claude-dependent wrapper in the portfolio woke up last Friday with a different unit economics deck.