Edition 2026-05-06 · read as Investor
GPT-5.5 Price Hike and Agent Tolls Break the AI ROI Math
- Sources: 37
- Words: 1,770
- Read: 9 min
Topics: AI Capital · LLM Inference · Agentic AI
◆ The signal
GPT-5.5 raised net API costs somewhere between forty-nine and ninety-two percent this week, five SaaS incumbents announced per-action tolls on external AI agents, Uber's CTO conceded the AI budget is blown, and KKR's Stavros put the actual portfolio earnings uplift from AI at five percent rather than the fifty that underwrites trillion-dollar lab marks. Two landlords can now raise rent independently of each other. The ROI math finally has an honest denominator.
◆ INTELLIGENCE MAP
01 Enterprise AI Cost Stack Breaks Open
act now · Three cost shocks hit the enterprise AI P&L simultaneously: GPT-5.5 costs 49-92% more net, Anthropic killed flat-rate subs, and GitHub is rebuilding for 30x load. Uber's CTO publicly said they 'blew through' the AI budget. KKR's Stavros admitted a 5% actual EBITDA lift, not 50%. Every AI application ARR needs a gross margin waterfall before the next mark.
- GPT-5.5 net cost hike: 49-92%
- GitHub load rebuild: 30x
- KKR actual AI uplift: 5%
- Assumed uplift in decks: 50%
- Chart: Pitch Deck AI Uplift 50% vs. KKR Actual AI Uplift 5%
02 Enterprise SaaS Incumbents Build Agent Tollgates
act now · ServiceNow launched Action Fabric (per-action metered), SAP bans unapproved agent access, Workday and HubSpot announced usage-based agent pricing, and DataDog capped MCP at 5K/day. Agent startups now face two landlords, model providers AND data incumbents, both raising rent independently. AWS is positioning as the 'open' alternative. The neutral MCP control-plane layer is the asymmetric bet to source.
- ServiceNow YTD: -40%
- DataDog MCP cap/day: 5K
- SAP posture: Full ban without endorsement
- Workday CEO quote
- 01 SAP: Full ban without endorsement
- 02 ServiceNow: Per-action metered (Action Fabric)
- 03 Workday: Usage-based (details TBD)
- 04 DataDog: 5K/day cap on MCP
- 05 HubSpot: Charging for agent access
03 Palantir's 85% Growth Creates Bifurcation Benchmark
monitor · Palantir printed 85% revenue growth and $892M quarterly FCF, exceeding all of Q1 2025 revenue, while the enterprise software index fell 30%+ YTD. At 43x forward sales (2x the well-positioned cyber cohort), the stock fell after-hours anyway. Karp named OpenAI's Deployment Company as a direct competitor. The market is repricing legacy SaaS as structurally impaired, not cyclically weak: one company grew 85% and everyone else is melting.
- Revenue growth: 85%
- US commercial growth: 133%
- Forward sales multiple: 43x
- SaaS peers YTD: -30%+
04 OpenAI-Microsoft Restructuring Terms + Foundation Model Price War
monitor · Microsoft-OpenAI exclusivity formally died: replaced by nonexclusive IP license through 2032, Microsoft retains ~27% equity, AGI kill switch removed. OpenAI can now serve AWS freely. Same week, xAI launched Grok 4.3 at ~50% below Sonnet pricing and OpenAI weaponized Claude Cowork import tools. Anthropic is being squeezed on price and lock-in simultaneously while targeting a $900B round.
- MSFT license term: through 2032
- Grok vs Sonnet price: ~50% below
- Anthropic target val: $900B
- OpenAI implied val: $833-852B
05 Stripe Consolidates Stablecoin Stack as Crypto VC Craters
background · Stripe assembled the most complete stablecoin payments stack in existence (Bridge with its OCC charter, Privy with 110M wallets, Valora, and Tempo, a $5B permissioned L1) atop $1.9T in payment volume. Meanwhile crypto VC collapsed 74% MoM to $659M, a 2-year low. Standalone crypto payments startups are now uninvestable without a Stripe-defense thesis. Ethereum's proposed issuance curve would kill LSTs; Lido's lead conceded as much publicly.
- Stripe payment volume: $1.9T
- Crypto VC decline MoM: 74%
- Privy wallets: 110M
- Stripe stablecoin M&A
- Crypto VC Prior Month: $2,534M
- Crypto VC Current: $659M
◆ DEEP DIVES
01 The Three-Layer AI Cost Crisis — Your Gross Margins Are About to Get Audited
The Cost Stack Just Became Three Landlords Deep
The enterprise AI cost thesis broke this week, and it broke from three directions at once. The mechanism matters, because the portfolio response differs by layer.
Layer 1: model provider costs are rising, not falling. GPT-5.5 ships a headline two-times price increase. The honest number, once completion-token effects are included, is 49-92% net cost inflation. OpenAI is explicitly pivoting from market share to margin. Anthropic killed flat-rate subscriptions the same week. GitHub is rebuilding its platform for 30x more load, which is what metered pricing produces at scale. A twenty-dollar Claude subscription burning hundreds to thousands in compute was never going to last. The subsidy ended.
Layer 2: the enterprise incumbents are imposing per-action tolls. ServiceNow's Action Fabric meters every action an external AI agent takes against its data. SAP now bans external agent access without endorsement. Workday and HubSpot are both moving to usage-based agent pricing. DataDog caps MCP requests at 5,000 daily. JPMorgan's Mark Murphy named it correctly: a tax on customers using outside AI agents.
Layer 3: the actual ROI does not cover the bill. KKR's Pete Stavros told Milken the AI earnings uplift across the portfolio is 5%, not 50%. That is the first honest number from a tier-one GP with every incentive to round up. Uber's CTO admitted they 'blew through' the AI budget after turning on agentic tools. Sierra's Bret Taylor called the ramp 'pricey before ROI.'
What This Means for Your Book
Every AI application company now faces a cost stack with two independent landlords who can raise rent unilaterally — the model provider and the data incumbent — selling into a buyer measuring single-digit ROI against double-digit cost increases. The math:
A company paying 49-92% more for inference, plus new per-action fees from ServiceNow or SAP, selling into a buyer measuring 5% uplift, is running negative gross margin on AI features that were supposed to defend ARR.
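The arithmetic behind that claim can be sketched under stated assumptions. The baseline below (revenue of 100, inference COGS of 40, other COGS of 10) is hypothetical, and the toll is modeled as a fraction of post-inflation inference spend, which is one possible reading of a 20-40% toll stack rather than any vendor's published schedule. Only the 49-92% inflation range and the 20-40% toll range come from the text.

```python
from dataclasses import dataclass


@dataclass
class CostBase:
    """Hypothetical per-period P&L for one AI feature (illustrative units)."""
    revenue: float
    inference_cogs: float
    other_cogs: float


def gross_margin(base: CostBase, inference_inflation: float = 0.0,
                 toll_rate: float = 0.0) -> float:
    """Gross margin after inference price inflation and a platform toll.

    Assumption: the toll is charged as a fraction of post-inflation
    inference spend. Real per-action pricing may scale differently.
    """
    inference = base.inference_cogs * (1 + inference_inflation)
    toll = inference * toll_rate
    total_cogs = inference + toll + base.other_cogs
    return (base.revenue - total_cogs) / base.revenue


# Hypothetical baseline: 50% gross margin before any shock.
base = CostBase(revenue=100.0, inference_cogs=40.0, other_cogs=10.0)

scenarios = {
    "baseline": (0.00, 0.00),
    "low end (49% inflation, 20% toll)": (0.49, 0.20),
    "high end (92% inflation, 40% toll)": (0.92, 0.40),
}

for name, (inflation, toll) in scenarios.items():
    print(f"{name}: gross margin {gross_margin(base, inflation, toll):.1%}")
```

On this made-up baseline the low end of both ranges leaves margin positive but sharply compressed, while the high end turns it negative. The result depends heavily on the starting inference share of COGS, so each company needs its own inputs.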
The a16z speedrun cohort confirms it at seed: founders are spending $300K/year on agents in lieu of engineers, with Cursor bills exceeding engineer salaries. Token spend and headcount are both rising at once. The 'AI replaces headcount' thesis is empirically dead at the company level while platform costs compound.
Where Sources Diverge
There is a reasonable counter-thesis, and it is worth stating cleanly before dismissing it: inference costs fall fast enough that the flat-rate model looks retroactively sensible by late 2026. This is the $700B hyperscaler capex bet, the wager that raw compute supply bends the cost curve. Several sources cite it as possible. None underwrite it as base case. The less flattering framing is that usage-based pricing sticks, wrappers pass costs through, and a meaningful cohort of AI-feature companies discovers that the feature meant to defend ARR is the one compressing the margin that ARR was supposed to protect.
Action items
- Run a pricing-model stress test across every AI-exposed portfolio company this week — model gross margin under metered pricing vs. current flat-rate, with 20-40% platform toll stack added
- Mandate multi-model routing architecture for any portco with >30% API COGS from a single provider within 90 days
- Re-underwrite every active LBO model to ≤5% AI EBITDA lift over 24 months; present adjusted returns at next IC
- Source 3-5 MCP gateway / agent-governance startups for active diligence at pre-A to Series A
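The single-provider concentration screen in the action items above can be sketched as a simple filter. This is a minimal illustration, assuming the 30% threshold is measured as one provider's API spend divided by total COGS. The company names, provider names, and dollar figures are placeholders, not data from any portfolio.

```python
def concentrated_providers(api_spend: dict[str, float], total_cogs: float,
                           threshold: float = 0.30) -> dict[str, float]:
    """Return providers whose API spend exceeds `threshold` of total COGS.

    Assumption: concentration = provider API spend / total COGS.
    """
    if total_cogs <= 0:
        raise ValueError("total_cogs must be positive")
    shares = {provider: spend / total_cogs for provider, spend in api_spend.items()}
    return {provider: share for provider, share in shares.items() if share > threshold}


# Hypothetical portfolio data: (API spend by provider, total COGS).
portcos = {
    "PortcoA": ({"provider_x": 450_000, "provider_y": 50_000}, 1_000_000),
    "PortcoB": ({"provider_x": 150_000, "provider_y": 140_000}, 1_000_000),
}

# Keep only companies with at least one provider over the threshold.
flagged = {}
for name, (spend, cogs) in portcos.items():
    hits = concentrated_providers(spend, cogs)
    if hits:
        flagged[name] = hits

for name, hits in flagged.items():
    for provider, share in hits.items():
        print(f"{name}: {provider} is {share:.0%} of COGS, routing review needed")
```

In this sketch only PortcoA is flagged. PortcoB spends nearly as much on APIs in total but splits it across two providers, which is the outcome the routing mandate is meant to produce.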
Sources: Benedict Evans · Laura Bratton · Cory Weinberg · TLDR AI · a16z speedrun · StrictlyVC
02 Palantir 85% vs. SaaS -30%: The Market Is Pricing a Permanent Split
One Quarter Does Not Make a Category, But It Might End One
Palantir printed eighty-five percent revenue growth and eight hundred ninety-two million dollars in quarterly free cash flow, a single-quarter FCF figure larger than all of Q1 2025 revenue. Management raised full-year growth guidance from sixty to seventy-one percent and guided Q2 to seventy-nine percent. US commercial grew a hundred and thirty-three percent. US government grew eighty-four percent. The stock fell after-hours anyway, because at forty-three times forward sales, roughly double the better-positioned cybersecurity comps, the company has to keep accelerating to hold the line it is already standing on.
Meanwhile the enterprise software index is down more than thirty percent year to date. ServiceNow is down forty percent with an unproven agent pivot and a twenty-to-thirty percent price hike. The rest of the cohort growing at twenty percent is being marked not as cyclically weak but as structurally impaired. Karp's on-call jab, 'if you want old-school software that actually doesn't work and probably will disappear, there are a lot of names,' is aggressive enough to look embarrassing if the tape turns.
The Bifurcation Thesis
Multiple sources converge on the same read: the market has stopped treating enterprise software as a valuation debate and started treating it as a category mortality question. The mechanism is specific. Palantir's data orchestration model, with forward-deployed engineers, government clearances, and a proprietary data position, compounds with AI. The median SaaS incumbent's seat-license model does the opposite, because AI agents eliminate seats.
| Metric | Palantir | SaaS Peers | Cyber Leaders |
|---|---|---|---|
| Revenue growth | 85% | ~20% | 20-30% |
| FCF quality | $892M/qtr | Compressing | Strong |
| Forward sales multiple | 43x | 4-8x (compressing) | ~20x |
| AI positioning | Category winner | Defensive/disrupted | Neutral beneficiary |

The less comfortable reading, and the one worth holding as a hedge: Palantir at forty-three times forward sales with management taking a victory lap is the classic top-of-cycle setup. The fundamentals are real. The multiple is not durable if growth decelerates even modestly. And seventy-one percent full-year guidance is a bar that compounds quarter on quarter.
The Karp Line Worth Noting
Karp named OpenAI's Deployment Company as a direct competitor on the earnings call, the first public acknowledgement that frontier labs are moving onto Palantir's enterprise deployment turf. That changes who writes the SOW. Two better-capitalized entrants with superior models just copied the GTM. Government is fine. Commercial renewals are the exposed surface.
The market is telling you legacy SaaS growing below thirty percent without a credible AI-native roadmap belongs at four to six times forward revenue, not ten to twelve. That is not a forecast. It is a description of where the flows are already going.
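The size of that repricing follows directly from the two multiple ranges. The sketch below holds forward revenue constant, which is a simplifying assumption, and the function name is illustrative. A falling revenue estimate would deepen the markdown further.

```python
def markdown_from_multiple_compression(old_multiple: float, new_multiple: float) -> float:
    """Fractional decline in value when the forward-revenue multiple
    compresses and forward revenue itself is held constant (assumption)."""
    if old_multiple <= 0:
        raise ValueError("old_multiple must be positive")
    return 1 - new_multiple / old_multiple


# Ranges from the text: 10-12x compressing to 4-6x forward revenue.
mildest = markdown_from_multiple_compression(old_multiple=10, new_multiple=6)
harshest = markdown_from_multiple_compression(old_multiple=12, new_multiple=4)

print(f"Mildest case (10x to 6x): {mildest:.0%} markdown")
print(f"Harshest case (12x to 4x): {harshest:.0%} markdown")
```

Even the mildest pairing of the two ranges implies a 40% markdown, and the harshest implies roughly two thirds.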
Action items
- Reprice every SaaS position in the book: sub-30% growers get marked to distressed multiples this quarter; benchmark AI-native platforms to Palantir's growth/FCF profile
- Screen for the 'second Palantir' in sanctioned enterprise data access — SAP-native, Salesforce-native integration layers with proprietary data position
- Trim Palantir public exposure or buy protection above 43x forward sales; the risk/reward is asymmetric to the downside despite fundamentals
Sources: Martin Peers · The Information AM · Morning Brew · Cory Weinberg · Mindstream
03 OpenAI-Microsoft Terms Are Public — The Cloud AI Chessboard Moved
What the Restructuring Actually Says
The Microsoft-OpenAI renegotiation is now specific enough to underwrite against, which is the first useful thing anyone has been able to say about it in months. The terms: Azure exclusivity is replaced by a nonexclusive IP license through 2032, OpenAI can serve any cloud including AWS, Microsoft retains roughly 27% equity in the for-profit entity and keeps "primary cloud" status, the AGI-triggered kill switch is gone, Microsoft stops collecting revenue share while OpenAI continues paying one through 2030, and the fifty-billion-dollar OpenAI-Amazon deal is now unblocked. That last item is the one that actually moves.
This is not incremental news. It is a structural change in the cloud AI competitive landscape, and any thesis underwriting Azure lock-in was quietly repriced the afternoon the ink dried. AWS responded inside the same news cycle by distributing ChatGPT APIs on Bedrock alongside its Frontier agent tool and Stateful Runtime Environment. The optionality Microsoft gave up is now AWS inventory.
The Foundation Model Price War Escalates
In the same window, xAI launched Grok 4.3 at $1.25/$2.50 per million tokens, roughly fifty percent below Anthropic's Sonnet 4.6 with claimed performance parity, a 1M context window, and reasoning. OpenAI shipped an import feature in Codex that pulls settings, plugins, agents, and project configuration directly out of Claude Cowork, which collapses switching costs to approximately zero. Anthropic is being squeezed on price at the API layer and on lock-in at the workspace layer in the same news cycle.
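To see what the price gap means for a bill, here is a minimal token-cost sketch. It assumes the quoted $1.25/$2.50 refers to input and output prices per million tokens, and it derives a reference price by doubling the Grok rates, which follows from "roughly fifty percent below" and is not a published Sonnet price. The monthly token volumes are hypothetical.

```python
def monthly_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Dollar cost of a workload given per-million-token prices."""
    return ((input_tokens / 1_000_000) * input_price_per_m
            + (output_tokens / 1_000_000) * output_price_per_m)


# From the text, assuming input/output order: $1.25 / $2.50 per million tokens.
GROK_INPUT, GROK_OUTPUT = 1.25, 2.50

# Assumption: "roughly fifty percent below" implies a reference price of about 2x.
REFERENCE_MULTIPLIER = 2.0

# Hypothetical monthly workload.
INPUT_TOKENS = 200_000_000
OUTPUT_TOKENS = 50_000_000

grok_cost = monthly_cost(INPUT_TOKENS, OUTPUT_TOKENS, GROK_INPUT, GROK_OUTPUT)
reference_cost = monthly_cost(INPUT_TOKENS, OUTPUT_TOKENS,
                              GROK_INPUT * REFERENCE_MULTIPLIER,
                              GROK_OUTPUT * REFERENCE_MULTIPLIER)

print(f"Grok-tier pricing:       ${grok_cost:,.2f}/month")
print(f"Implied reference price: ${reference_cost:,.2f}/month")
print(f"Difference:              ${reference_cost - grok_cost:,.2f}/month")
```

The same function can take actual published rates and a company's real input/output mix, which matters because output-heavy workloads shift the comparison.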
Meanwhile Opus 4.7 shipped with 43% more user frustration than its predecessor on Base44's experiential benchmark. Quality perception is degrading at the worst possible moment for a round.
The Uncomfortable Valuation Math
OpenAI carries an $833-852B implied valuation. Anthropic is targeting $900B. Both numbers price monopoly-style capture of enterprise revenue through the PE channels discussed last week. The moat they imply is being filled from below by DeepSeek V4 and open-weights frontier parity, and from above by regulation and pre-release vetting. The labs are raising at peak-cycle marks into peak-cycle moat compression. That read may prove wrong, but it is the shape of the trade.
The trade is to rebalance toward application-layer data moats, AWS-native infra, and ARM inference before the IPO comps arrive. Buy secondaries only at meaningful discounts to primary marks; avoid primary at these valuations.
Meta's multi-million-unit AWS Graviton ARM CPU deal for inference is the first credible dent in Nvidia's inference dominance at hyperscaler scale. It also signals the training-to-inference pivot that converts AWS's architectural bet from liability to moat. That is where the capital should go.
Action items
- Audit portfolio exposure to Azure-OpenAI exclusivity narratives and re-score any company whose moat was 'Azure-OpenAI co-sell partner' — present findings at next partner meeting
- Re-underwrite Anthropic secondary positions with 20-30% Claude Cowork user churn scenario and Grok-tier pricing as baseline gross margin assumption
- Accelerate diligence on AWS Bedrock-native agentic tooling and ARM/Graviton-optimized inference plays at Series A/B
Sources: Last Week in AI · ben's bites · Benedict Evans · Ben Thompson · TLDR AI · Simplifying AI
04 Amazon's Inference Moat + Logistics Platform: The Re-Rating Case
Three Amazon Signals Converging Into One Thesis
Amazon produced three signals this week that the street is pricing as three separate stories but that are more usefully read as one: convert marginal costs into capital costs, then rent the capex. First, ASCS formally productizes logistics-as-a-service with P&G and 3M as anchor customers, and FedEx and UPS traded accordingly, with UPS closing down 10.47% in a single session. Second, the AI compute market shifted from training-dominant to inference-dominant, at which point AWS's disaggregated architecture (Nitro plus custom Trainium silicon), until recently described as a weakness, starts looking like a structural cost moat. Third, Amazon Leo's 50-plus planned launches quietly supply drone-delivery connectivity.
The unifying read: this is the third run of the retail-to-FBA-to-AWS platform playbook. If pattern recognition is alpha, ASCS is it.
The Cloud Neutrality Trade
The underappreciated reframing is compute neutrality as moat. Microsoft missed Azure growth earlier in 2026 because it diverted compute to internal OpenAI workloads, which is a polite way of saying the AI factory and the neutral cloud are the same building. Google has the same problem dressed up as search defense. Amazon, whose core business is moving physical goods, has no comparable internal compute sink competing with customer workloads.
| Hyperscaler | AI Compute Conflict | Inference Position | Neutrality |
|---|---|---|---|
| AWS | Minimal | Architecturally favored | High |
| Azure | High (OpenAI crowds out) | Training-optimized | Low |
| GCP | Medium (Search + Gemini) | TPU (captive) | Medium |

The logistics unbundling is less glamorous and probably the larger repricing. Amazon entering the $1.3 trillion third-party logistics TAM as a formal platform participant, rather than the accidental one it has been for years, turns FedEx and UPS into structurally impaired acquirers. Any portfolio company holding a parcel-logistics exit thesis needs that thesis rewritten this quarter.
Thesis-Kill Risk
The entire AWS inference-moat argument rests on one line item: the Trainium 3 trajectory. If the custom silicon stalls, AWS reverts to being an Nvidia reseller with worse training economics than Oracle, which is a sentence nobody wanted to type. Track it quarterly with a hard kill threshold. The secondary risk is simpler: Jensen Huang's "won't make that mistake again" line means Nvidia intends to compete with its own customers for model-layer value.
Action items
- Reweight public cloud exposure: increase AMZN, reduce MSFT on Azure-neutrality risk; model 100-200bps AWS margin expansion from inference mix shift
- Stress-test logistics and 3PL-adjacent portfolio companies for Amazon ASCS displacement over 2-4 quarter horizon
- Build a Trainium generation-over-generation benchmark tracker as the single-point-of-failure monitor for the AWS inference thesis
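The tracker in the last action item can be sketched as a generation-over-generation ratio check against a hard threshold. Everything numeric here is a placeholder: the generation labels, the benchmark scores, and the 1.5x minimum gain are illustrative assumptions, not measured Trainium results or a recommended threshold.

```python
def generation_gains(scores: list[tuple[str, float]]) -> list[tuple[str, float]]:
    """Ratio of each generation's benchmark score to the previous one.

    `scores` is ordered oldest to newest as (generation label, score),
    where a higher score is better (for example, inference throughput per dollar).
    """
    return [
        (cur_name, cur / prev)
        for (_, prev), (cur_name, cur) in zip(scores, scores[1:])
    ]


def kill_triggered(scores: list[tuple[str, float]], min_gain: float) -> bool:
    """True when the latest generation's gain falls below the threshold."""
    gains = generation_gains(scores)
    return bool(gains) and gains[-1][1] < min_gain


# Placeholder scores, normalized so the oldest generation equals 1.0.
scores = [("gen1", 1.0), ("gen2", 2.0), ("gen3", 2.6)]
MIN_GAIN = 1.5  # hypothetical hard threshold

gains = generation_gains(scores)
for name, gain in gains:
    print(f"{name}: {gain:.2f}x over prior generation")
print("Kill threshold breached:", kill_triggered(scores, MIN_GAIN))
```

With these placeholder numbers the latest generation improves only 1.3x, so the 1.5x threshold is breached. Fixing the metric and the threshold before the next data point arrives is what makes the threshold hard.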
Sources: Ben Thompson · TLDR · Bloomberg Technology · Morning Brew
◆ QUICK HITS
Update: Sierra valuation hit $15.8B post-money ($950M raise, Tiger/GV led) — ~100x February ARR; sets ceiling comp for every agent-layer deal in pipeline but Uber's budget-blowout admission undermines the unit economics beneath it
StrictlyVC
Anthropic acquired Bun (JavaScript runtime) — first datapoint that foundation-model labs will vertically integrate into developer runtimes, not just IDEs; compress entry timelines on remaining independent runtime/build-tool assets (Deno, pnpm, Rolldown)
JavaScript Weekly
Apple toured Samsung's Taylor fab and Qualcomm is multi-sourcing Snapdragon 8 Elite Gen 6: the crack in TSMC concentration gives a repricing of the Intel Foundry Services thesis a non-zero probability
Bloomberg Technology
NVD announced it will no longer universally enrich CVEs — creates immediate $1B+ wedge for AI-native vulnerability intelligence startups doing reachability-based prioritization; Series A pipeline in 60 days
SANS NewsBites
Supply chain worm Mini Shai-Hulud crossed PyPI→npm→PHP, hitting SAP CAP-JS (500K+ weekly downloads) and PyTorch Lightning with 8.3M compromised downloads — pull portfolio exposure list and require rotation attestation this week
SANS NewsBites
Instacart migrated off Elasticsearch+FAISS to unified Postgres/pgvector: 10x write reduction, 2x latency improvement, 6% zero-result-search drop — stress-test Pinecone-class vector DB positions against pgvector substitution scenario
ByteByteGo
OpenAI voice AI now serves 900M+ users on custom WebRTC architecture — validates real-time voice as mass-consumer primitive; expand Series A pipeline in voice-native B2B verticals (support, sales, clinical)
TLDR Dev
Image generation drives 6.5x more downloads than model upgrades (Gemini +22M, ChatGPT +12M) but only ChatGPT converts to consumer revenue — consumer AI wrapper deals benchmarked on MAU without paid conversion data are mispriced
TLDR Marketing
◆ Bottom line
The take.
Enterprise AI just hit a three-layer cost crisis — model providers raising prices 49-92%, SaaS incumbents imposing per-action tolls, and KKR admitting the actual earnings uplift is 5% not 50% — which means every AI application in your portfolio faces compressing gross margins, two landlords who can independently raise rent, and customers whose ROI doesn't cover the bill. The repricing happens this quarter, not next.
Frequently asked
- How should I stress-test portfolio gross margins given the new AI cost stack?
- Model each AI-exposed portco's gross margin under metered pricing instead of flat-rate, then layer a 20-40% platform toll on top for ServiceNow/SAP-style per-action fees. With GPT-5.5 driving 49-92% net inference cost inflation and KKR's Stavros putting actual EBITDA uplift at 5%, companies discovering this at the next board cycle will be marking down by Q3. Mandate multi-model routing for any portco with more than 30% API COGS concentrated in one provider.
- What does the Microsoft-OpenAI restructuring actually change for cloud positioning?
- Azure exclusivity is replaced by a nonexclusive IP license through 2032, Microsoft retains roughly 27% equity but loses revenue share, the AGI kill switch is gone, and the $50B OpenAI-Amazon deal is unblocked. AWS responded inside the same news cycle by distributing ChatGPT APIs on Bedrock. Any thesis underwriting Azure lock-in or 'Azure-OpenAI co-sell' moats was quietly repriced — those portcos are pitching a 2024 story into a 2026 market.
- Is Palantir at 43x forward sales a buy, hold, or trim?
- Trim or buy downside protection despite the fundamentals. Eighty-five percent revenue growth and $892M quarterly FCF are real, but at roughly double cybersecurity comps the multiple requires continued acceleration, and Karp just named OpenAI's Deployment Company as a direct competitor moving onto Palantir's enterprise GTM turf. A deceleration from 85% to 60% growth at this multiple produces a 30%+ drawdown — the risk/reward is asymmetric.
- Where should new capital go if legacy SaaS is being structurally repriced?
- Application-layer data moats with sanctioned enterprise access, AWS Bedrock-native agentic tooling, ARM/Graviton inference infrastructure, and MCP gateway/agent-governance startups at pre-A to Series A. SAP's external-agent ban and ServiceNow's Action Fabric create a premium for sanctioned integration, while Meta's multi-million-unit Graviton deal signals the inference workload migration that turns AWS architecture from liability into moat.
- What's the single-point-of-failure to monitor on the AWS inference-moat thesis?
- Trainium 3 generation-over-generation performance is the kill switch. If custom silicon stalls, AWS reverts to being an Nvidia reseller with worse training economics than Oracle, and the entire neutral-cloud-plus-inference-leadership argument collapses. Track it quarterly with a hard threshold and contract position sizing if the trajectory slips — the secondary risk is Nvidia itself competing with hyperscaler customers for model-layer value.