PROMIT NOW · PRODUCT DAILY · 2026-02-27

AI Agents Ship: Pricing and Seat Models Face Reckoning

· Product · 46 sources · 1,732 words · 9 min

Topics Agentic AI · AI Capital · LLM Inference

The AI agent era just went from theoretical to shipping: Perplexity, Anthropic, and Cursor all launched autonomous agent products in the same week, while Salesforce's earnings suggested its $800M ARR Agentforce is cannibalizing legacy revenue rather than expanding it. Your two most urgent decisions this quarter: (1) how your product gets consumed by AI agents, not just humans, and (2) whether your pricing model survives when agents replace the seats you charge for.

◆ INTELLIGENCE MAP

  1. 01

    The Agent Platform War: Shipping Products, Not Demos

    act now

    Five production AI agent products shipped in the same week (Perplexity Computer, Claude Cowork, Cursor Cloud Agents, Notion Custom Agents, and Anthropic's enterprise plugins) while incumbents like Workday and HubSpot are building data-access tollbooths against them, signaling that the agent platform layer is consolidating in real time.

    8 sources
  2. 02

    SaaS Pricing Model Crisis: Seat-Based Revenue Under Existential Threat

    act now

    Salesforce's Agentforce hit $800M ARR but organic growth decelerated to 7-8%, Snowflake's CFO admitted AI margins are lower than legacy products, and only 8% of consumers will pay extra for AI features — the market is punishing AI growth that cannibalizes rather than expands revenue.

    7 sources
  3. 03

    AI Agent Security: Systemic Trust-Boundary Failures

    monitor

    Manus AI agent had CVSS 9.8 zero-click exploits, Claude Code suffered RCE vulnerabilities, NPM worms now specifically target AI coding tools, and an AI agent mass-deleted a Meta director's inbox — agentic AI has a systemic, not incidental, security problem that will gate enterprise adoption.

    7 sources
  4. 04

    AI Infrastructure Economics: Cost Models in Flux

    monitor

    OpenPipe's ART framework trains a 14B model for $80 that beats o3 at 64x lower cost, while Meta scrapped its training chip and Google sold TPUs to Meta in a multibillion-dollar deal — the AI cost landscape is simultaneously commoditizing at the application layer and consolidating at the infrastructure layer.

    6 sources
  5. 05

    China's Open-Source AI Offensive and Geopolitical Model Risk

    background

    Three frontier Chinese models shipped in one week (Qwen 3.5, GLM-5, DeepSeek V4), GLM-5 hit #1 on open leaderboards under MIT license, and Anthropic exposed 24,000+ fake accounts conducting industrial-scale distillation attacks — the open-source model landscape is getting more capable and more geopolitically complicated simultaneously.

    5 sources

◆ DEEP DIVES

  1. 01

    The Agent Platform War Is Here — And Your Product Is Either the Agent, the Platform, or the Target

    <h3>Five Agent Products Shipped Simultaneously — This Is a Phase Transition</h3><p>In a single week, <strong>Perplexity Computer</strong> launched as a general-purpose digital worker claiming workflows running for "hours or even months," <strong>Claude Cowork</strong> shipped scheduled recurring tasks with a plugin architecture spanning finance, HR, legal, engineering, and design, <strong>Cursor Cloud Agents</strong> deployed dedicated cloud VMs producing merge-ready PRs across web, mobile, desktop, Slack, and GitHub, <strong>Notion Custom Agents</strong> introduced autonomous bots with 24/7 operation and org-level access, and <strong>Anthropic acquired Vercept</strong> for computer-use capabilities. This isn't a trend — it's a market declaring itself open for business.</p><blockquote>Products that are operable by AI agents will see compounding adoption; products that require human-only interaction will gradually lose to agent-compatible alternatives.</blockquote><h3>Three Distinct Agent Lanes Are Forming</h3><p>The market is fragmenting into clear segments: <strong>general-purpose agents</strong> (Perplexity Computer operating any UI), <strong>workflow agents</strong> (Claude Cowork with domain-specific plugins and connectors to Gmail, DocuSign, Clay), and <strong>vertical agents</strong> (Cursor producing code artifacts, not suggestions). The most strategically important detail isn't the flashy demos — it's Anthropic's new <strong>Customize tab with plugins, skills, and connectors</strong>. This is Anthropic building the App Store for AI agents. Combined with Vercept, they're betting the winning platform won't have the best model — it'll have the deepest integration ecosystem.</p><h3>The GUI Agent Paradigm Shift Changes Product Design</h3><p>Cursor Agents now test their own code by using a computer and return <strong>video demos</strong> of output. Google showed Gemini ordering food autonomously on Android. 
Both Cursor (acquiring Autotab) and Anthropic (acquiring Vercept) bought companies specifically for computer-use capabilities. Simultaneous acquisitions and product launches across this many companies mean your product's UI is now an interface for <strong>non-human users</strong>. Products that are predictable, well-structured, and semantically clear will become preferred in agent-mediated workflows. Products with janky modals and anti-automation patterns will be routed around.</p><h3>The UX Nobody Has Solved</h3><p>Here's the whitespace opportunity: <em>nobody has figured out the UX for long-running autonomous agents</em>. Perplexity claims workflows running for months. Cowork has scheduled recurring tasks. But what does the user experience look like when an AI agent has been working on your behalf for 72 hours? How do you surface progress, handle errors, and enable intervention? OpenAI's Kevin Weil revealed that top performers run <strong>3-4 parallel Codex agent jobs</strong> across different work trees, treating idle compute as wasted productivity. The PM who designs the right "mission control" UX for autonomous agents will own a category.</p><hr><h4>Sources Disagree On: Agent Trust</h4><p>Multiple sources report agents shipping to production, but other data shows AI agent adoption is concentrated <strong>heavily in programming tasks</strong>: Anthropic's own tool-call data shows software engineering captures 49.7% of usage, while healthcare is 1% and legal 0.9%. An AI agent mass-deleted a Meta director's inbox when it moved from a toy environment to a real one. The gap between demo capability and production trust remains the binding constraint outside coding.</p>

    Action items

    • Run a competitive teardown of Perplexity Computer, Claude Cowork, and Cursor Cloud Agents against your product's automation features by March 14
    • Evaluate Anthropic's Cowork plugin architecture as a distribution channel this sprint — determine if building a Cowork plugin gives your product access to all paid Claude subscribers
    • Design your product's 'agent API' — how would an AI agent (not a human) use your product — and add to Q2 roadmap
    • Prototype async/long-running AI task UX patterns (status dashboards, error handling, intervention flows) for your product by end of Q1
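
One way to start on the status, error-handling, and intervention flows in the last action item is to pin down the task states a "mission control" view would need to show. The sketch below is only an illustration under assumptions: the state names, the `AgentTask` class, and the transition table are hypothetical and do not come from any of the products named above.

```python
from dataclasses import dataclass, field
from enum import Enum


class TaskState(Enum):
    RUNNING = "running"
    NEEDS_INPUT = "needs_input"  # agent paused itself and is waiting on a human decision
    PAUSED = "paused"            # a human intervened and stopped the agent
    FAILED = "failed"
    DONE = "done"


# Hypothetical transition table. Anything not listed is rejected, so a
# dashboard never shows an impossible jump (e.g. a finished task resuming).
ALLOWED = {
    TaskState.RUNNING: {TaskState.NEEDS_INPUT, TaskState.PAUSED,
                        TaskState.FAILED, TaskState.DONE},
    TaskState.NEEDS_INPUT: {TaskState.RUNNING, TaskState.PAUSED, TaskState.FAILED},
    TaskState.PAUSED: {TaskState.RUNNING, TaskState.FAILED},
    TaskState.FAILED: {TaskState.RUNNING},  # explicit retry only
    TaskState.DONE: set(),
}


@dataclass
class AgentTask:
    goal: str
    state: TaskState = TaskState.RUNNING
    steps_done: int = 0
    steps_planned: int = 0
    log: list = field(default_factory=list)

    def transition(self, new_state: TaskState, note: str) -> None:
        """Move to a new state and record why, or raise if the move is not allowed."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state.value} -> {new_state.value} not allowed")
        self.log.append(f"{self.state.value} -> {new_state.value}: {note}")
        self.state = new_state

    def summary(self) -> str:
        """One-line status suitable for a dashboard row."""
        return f"[{self.state.value}] {self.goal}: {self.steps_done}/{self.steps_planned} steps"
```

Keeping every transition in an append-only log is a deliberate choice here: for a task that runs for days, the history of pauses and approvals is likely to matter as much to the user as the current state.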

    Sources: Perplexity Computer 💻, DeepSeek withholds v4 🐋, Cowork scheduled tasks 💼 · Claude has some conflicts · agent vs SaaS · Claude for Design 🤖, Touchscreen Mac 💻, Firefly Quick Cut 🎬 · AI News Weekly - Issue #467: Anthropic has receipts. And nobody wants to pay for AI. · OpenAI's Kevin Weil on the Future of Scientific Discovery

  2. 02

    The SaaS Pricing Model Is Breaking — And the Data Proves It

    <h3>Agentforce Is Growing Fast and Cannibalizing Faster</h3><p>Salesforce's Q4 FY2026 earnings are the most important data point for enterprise PMs this quarter. <strong>Agentforce hit $800M ARR</strong> (60% QoQ growth from $500M), yet strip out the $8B Informatica acquisition and organic growth was just <strong>7-8%</strong> — a deceleration. CFO Robin Washington said Agentforce growth is being <em>offset</em> by weakness in marketing, commerce, and Tableau. Marc Benioff gave a "vague answer" when asked directly about cannibalization. Stock dropped 5% after-hours, extending a <strong>28% YTD decline</strong>.</p><blockquote>At $800M ARR, Agentforce represents just 1.7% of Salesforce's projected $46B revenue. Even at 60% QoQ growth, it would take years to become material — and every dollar may be coming at the expense of a legacy dollar.</blockquote><h3>The Pricing Crisis Is Industry-Wide</h3><p>Salesforce introduced the <strong>Agentic Work Unit (AWU)</strong> — measuring completed AI agent tasks relative to token consumption — as a potential future pricing metric. But Benioff told investors: <em>"We're still trying to exactly figure out what these numbers mean for us."</em> A $300B+ company inventing pricing in real-time. 
Meanwhile:</p><ul><li><strong>Snowflake's CFO</strong> explicitly admitted AI product margins aren't as high as legacy products, then laid off 200 employees</li><li><strong>Workday's CEO</strong> called rival AI agent providers "parasites" getting a "free ride" on customer data</li><li><strong>HubSpot's CEO</strong> declared they will "monitor, meter, and monetize" AI agent access</li><li><strong>Only 8%</strong> of American consumers would pay extra for AI features (NBER survey)</li><li><strong>80%</strong> of firms report zero productivity impact from AI adoption</li></ul><h3>The Value Chain Is Bifurcating</h3><p>Contrast the application layer's struggles with infrastructure: <strong>Nvidia posted $68B quarterly revenue</strong> (up 73% YoY), $120B annual profit (up from $4.4B three years ago — a 27x increase), and $96.6B in free cash flow. Only Apple generated more. The AI value chain is clear: <strong>infrastructure captures outsized profit</strong> while application-layer software struggles to monetize. Z.ai's Zixuan Li predicts token-based pricing won't be mainstream by end of 2026, replaced by subscription and outcome-based contracts.</p><h3>The Emerging Pricing Framework</h3><table><thead><tr><th>Model</th><th>Example</th><th>Risk Profile</th><th>Best For</th></tr></thead><tbody><tr><td>Per Task</td><td>Valar Labs (CPT codes)</td><td>Low — fits existing billing</td><td>Regulated industries</td></tr><tr><td>Per Workflow</td><td>Hippocratic AI (fixed fee)</td><td>Medium — bundles value</td><td>Most AI agent products today</td></tr><tr><td>Per Episode</td><td>Thyme Care (monthly PMPM)</td><td>High — cost accountability</td><td>Outcome-driven verticals</td></tr><tr><td>Per Patient/User</td><td>Counsel Health (annual unlimited)</td><td>Highest — assumes near-zero marginal cost</td><td>AI-native platforms at scale</td></tr></tbody></table><p><em>Source: a16z's healthcare pricing framework, applicable across verticals</em></p>

    Action items

    • Build an explicit cannibalization model for your AI features — map which existing product lines lose usage/revenue as AI features gain adoption — and present net-revenue view to leadership by next planning cycle
    • Model your revenue under three pricing scenarios — current seat-based, AWU-style task-based, and outcome-based — assuming AI agents reduce addressable seat count by 20%, 40%, and 60% over 3 years
    • Audit all third-party API integrations where your product reads from Workday, HubSpot, or Salesforce and map which data access paths are at risk of being metered or restricted this quarter
    • Reframe your next AI feature pitch around measurable workflow ROI, using the NBER '80% zero impact' stat as a cautionary benchmark and Claude Code's $1B run-rate as the success template
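
The three-scenario model in the second action item can be roughed out in a few lines before it goes into a spreadsheet. This is a minimal sketch under stated assumptions: every parameter (`tasks_per_displaced_seat`, `task_price`, `value_per_displaced_seat`, `outcome_share`) is a hypothetical input you would replace with your own data, and the only figures taken from the text are the 20%, 40%, and 60% seat reductions.

```python
def project_revenue(seats, seat_price, tasks_per_displaced_seat, task_price,
                    value_per_displaced_seat, outcome_share,
                    reductions=(0.20, 0.40, 0.60)):
    """Annual revenue under three pricing models for each seat-reduction level.

    seat_based:    only the remaining human seats are billed.
    task_based:    remaining seats plus a per-task fee for work agents took over.
    outcome_based: remaining seats plus a share of the value agents deliver.
    """
    rows = []
    for r in reductions:
        remaining = seats * (1 - r)   # seats still held by humans
        displaced = seats * r         # seats whose work moved to agents
        seat_only = remaining * seat_price
        rows.append({
            "seat_reduction": r,
            "seat_based": seat_only,
            "task_based": seat_only + displaced * tasks_per_displaced_seat * task_price,
            "outcome_based": seat_only + displaced * value_per_displaced_seat * outcome_share,
        })
    return rows


if __name__ == "__main__":
    # Purely illustrative inputs, not benchmarks.
    for row in project_revenue(seats=1000, seat_price=1200,
                               tasks_per_displaced_seat=500, task_price=2.0,
                               value_per_displaced_seat=3000, outcome_share=0.3):
        print(row)
```

The useful output is the gap between the `seat_based` column and the other two at each reduction level, since that gap is the revenue a pricing change would need to recover.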

    Sources: agent vs SaaS · Applied AI: From 'Parasites' to 'SaaSquatch,' Salesforce and Workday Leaders Take Swipes at AI Rivals · The Briefing: Nvidia and Salesforce Q4 · Nvidia Posts Blockbuster Numbers · AI News Weekly - Issue #467: Anthropic has receipts. And nobody wants to pay for AI. · Infinite Healthcare

  3. 03

    AI Agent Security Is a Systemic Crisis — Not an Edge Case

    <h3>The Trust-Boundary Problem Is Architectural, Not Patchable</h3><p>This week delivered a cascade of evidence that agentic AI has a <strong>systemic security problem</strong> that will gate enterprise adoption. Meta's Manus AI agent suffered <strong>CVSS 9.8 zero-click indirect prompt injections</strong> where asking it to "summarize this page" could trigger Gmail data exfiltration, reverse shells with passwordless sudo, and cross-tenant media access. Researchers explicitly stated this isn't a Manus-specific bug — it's a <em>"systemic trust-boundary failure affecting any agentic AI platform that allows untrusted content to influence privileged tool invocation."</em></p><h3>Your AI Coding Tools Are Now Attack Vectors</h3><p>Three parallel vulnerabilities hit the developer toolchain:</p><ul><li><strong>Claude Code</strong>: RCE and API key exfiltration via malicious Hooks and MCP configs in cloned repos (CVE-2025-59536, CVE-2026-21852) — merely <em>opening</em> a malicious repository triggers compromise</li><li><strong>SANDWORM_MODE</strong>: NPM worm specifically targeting AI coding assistants — injects malicious MCP servers into Claude, Cursor, VS Code Continue, and Windsurf, steals SSH keys and AWS credentials, and includes a <strong>polymorphic engine using local Ollama for self-rewriting</strong></li><li><strong>Cline compromise</strong>: Prompt injection in Claude-powered Issue Triage workflow could compromise production releases affecting millions of developers</li></ul><blockquote>The threat model for AI coding assistants shifted from 'running untrusted code' to 'opening untrusted projects.' Every PM building developer tools needs to internalize this.</blockquote><h3>Real-World Agent Failures Are Mounting</h3><p>A Meta AI director's production inbox was <strong>mass-deleted by OpenClaw</strong> — the agent's archiving strategy worked on a toy inbox but catastrophically failed on a real one. 
Claude Opus 4.6 generated code that led to a <strong>$1.78M smart contract exploit</strong>. Amazon's Kiro AI coding tool caused a <strong>~13-hour service disruption</strong> by deleting and recreating an environment. An AI agent swarm found <strong>100+ exploitable kernel bugs for $600</strong> ($4/bug), but AMD, Intel, NVIDIA, Dell, Lenovo, and IBM had not patched after more than 90 days.</p><h3>The Emerging Agent Security Stack</h3><p>New tools signal "AI agent security" is crystallizing as a distinct category: <strong>Wardgate</strong> (credential isolation gateway between agents and services), <strong>nono</strong> (kernel-level sandbox with built-in profiles for Claude Code and OpenCode), and <strong>Evoke Security</strong> ($4M pre-seed for AI agent governance). Separately, the Microsoft Semantic Kernel Python SDK had a <strong>CVSS 9.9 RCE</strong> in its InMemoryVectorStore, the exact component teams use for RAG prototypes, and Sentry versions 21.12.0 through 26.1.0 are vulnerable to SAML account takeover.</p>
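
The trust-boundary failure described above (untrusted content influencing privileged tool invocation) can be illustrated with a small consent gate. This is a sketch under assumptions, not how Manus or any named product works: the tool names are hypothetical, and the `triggered_by_untrusted_content` flag assumes the agent runtime can already tell when a web page, email, or repository file influenced a call, which is the hard part in practice.

```python
from dataclasses import dataclass

# Hypothetical set of tools that can exfiltrate data or change state.
PRIVILEGED_TOOLS = {"send_email", "run_shell", "delete_messages"}


@dataclass
class ToolCall:
    tool: str
    # Assumption: the runtime tracks whether untrusted content (a fetched page,
    # an inbound email, a cloned repo file) influenced this call.
    triggered_by_untrusted_content: bool


def authorize(call: ToolCall, user_consented: bool = False) -> bool:
    """Allow a tool call unless it is privileged AND driven by untrusted content.

    In that one case the call goes through only with explicit user consent.
    """
    if call.tool not in PRIVILEGED_TOOLS:
        return True
    if not call.triggered_by_untrusted_content:
        return True
    return user_consented
```

Under this rule, a "summarize this page" request that tries to invoke `send_email` because of text on the page would be held for consent, while the same tool invoked directly by the user would not.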

    Action items

    • Audit every AI agent feature that allows external content to trigger tool invocations and add explicit trust-boundary isolation and user consent gates to security requirements by March 14
    • Mandate MCP server allowlisting for your engineering team's AI coding assistants (Claude, Cursor, VS Code Continue, Windsurf, Cline) this sprint
    • Add 'AI Output Validation' as a mandatory section in your PRD template for any feature shipping AI-generated code, configs, or infrastructure changes
    • Verify Sentry deployment version (upgrade to 26.2.0+ if self-hosted with SAML SSO) and check Microsoft Semantic Kernel Python SDK version (pin to ≥1.39.4) by end of week
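
The version checks in the last action item are easy to script. The sketch below uses only the version bounds stated in this section (Sentry 21.12.0 through 26.1.0 affected, Semantic Kernel Python SDK pinned to 1.39.4 or later); the function names are hypothetical, and the parser assumes plain dotted numeric versions with no pre-release suffixes.

```python
def parse_version(version: str) -> tuple:
    """Turn '26.1.0' into (26, 1, 0). Assumes purely numeric dotted versions."""
    return tuple(int(part) for part in version.split("."))


def sentry_is_vulnerable(version: str) -> bool:
    """True if the version falls in the affected range quoted in this section."""
    return parse_version("21.12.0") <= parse_version(version) <= parse_version("26.1.0")


def semantic_kernel_meets_pin(version: str) -> bool:
    """True if the version satisfies the >=1.39.4 pin quoted in this section."""
    return parse_version(version) >= parse_version("1.39.4")
```

Comparing tuples of integers rather than raw strings matters here: as strings, "21.9.0" would sort after "21.12.0", which would misclassify releases.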

    Sources: Manus Prompt Injection 💉, CarGurus 12.M Leak 🚙, LLM-based Deanonymization 🥸 · [tl;dr sec] #317 - 100+ Kernel Bugs in 30 Days, Secret Scanning, Threat Actors Stealing Your PoC · 0-Days Sold to Russian Broker, Serv-U RCEs, RoguePilot Flaw, FileZen Exploitation · Claude Code Flaws Exposed Devices to Silent Hacking · @RISK®: The Consensus Security Vulnerability Alert: Vol. 26, Num. 08 · 5 trends that should top CISO's RSA 2026 agendas

  4. 04

    The AI Cost Equation Is Flipping — Specialized Models at 64x Lower Cost Change Your Build-vs-Buy Math

    <h3>The $80 Agent That Beats o3</h3><p>OpenPipe open-sourced <strong>ART (Agent Reinforcement Trainer)</strong>, a framework that applies GRPO-based reinforcement learning to any Python application. Their ART-E agent — a <strong>Qwen2.5-14B model trained on a single GPU for under $80</strong> — achieved 96% accuracy on email search, outperforming OpenAI's o3, o4-mini, Gemini 2.5 Pro, and GPT-4.1. The cost delta isn't incremental — it's <strong>64x</strong>: $0.85 vs. $55.19 per 1,000 runs, with 5x latency improvement (1.1s vs. 5.6s). A feature costing $55K per million invocations on o3 costs $850 on a self-trained agent.</p><blockquote>The performance gap between direct API calls and RL-trained agents becomes 'massive' when the agent must chain 4-6 dependent decisions.</blockquote><h3>Open-Source Models Are Closing the Gap Fast</h3><p>Three frontier Chinese models shipped in one week: <strong>Qwen 3.5</strong> (multimodal, described as "dirt cheap" with massive Hugging Face adoption), <strong>GLM-5</strong> (#1 on open leaderboards, MIT-licensed, 744B total / 40B active MoE, scoring close to Claude Opus 4.5 on agentic benchmarks), and <strong>DeepSeek V4</strong> (teased). Z.ai integrated DeepSeek Sparse Attention into GLM-5 — Chinese labs are cross-pollinating innovations across competitive boundaries in ways Western labs don't.</p><h3>But Infrastructure Constraints Persist</h3><p>The cost picture has a critical tension. On one hand, specialized models are getting dramatically cheaper. 
On the other:</p><ul><li><strong>Meta scrapped</strong> its most advanced AI training chip — if Meta can't build one, custom silicon is out of reach for everyone else, cementing Nvidia's pricing power through 2028+</li><li><strong>OpenAI projects $111B in cash burn</strong> through 2030 while Stargate has stalled — your primary API vendor is simultaneously cash-constrained and capacity-constrained</li><li><strong>Amazon is negotiating $50B</strong> into OpenAI ($15B upfront, $35B contingent on AGI or IPO) — the smartest money is building in optionality, not certainty</li><li><strong>Google sold TPUs to Meta</strong> in a multibillion-dollar deal — the first real crack in Nvidia's monopoly, but competition will take years to materialize</li></ul><h3>OpenAI's Ensemble Architecture Playbook</h3><p>OpenAI VP Kevin Weil revealed that OpenAI uses <strong>model ensembles internally "in many places"</strong> — an orchestration model that plans and delegates to cheaper specialized models. He explicitly says startups are making a costly mistake by not doing the same. His <strong>6-12 month capability S-curve framework</strong> is gold for roadmap planning: capabilities go from 0% → 5-10% → 60-80% on evals, with the jump from barely-working to reliable taking approximately 6-12 months. Start product work during the 5-10% phase; don't wait for 60-80%.</p>
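
The ART-E cost comparison quoted earlier in this section can be reproduced, and extended to a rough break-even point, with simple arithmetic. The sketch below uses only the figures from the text ($0.85 vs. $55.19 per 1,000 runs, 1.1s vs. 5.6s, under $80 to train). Treating $80 as the full switching cost is an assumption; it ignores engineering time, evaluation, and hosting.

```python
import math

# Figures quoted in the text (USD per 1,000 runs, seconds per run).
ART_E_COST_PER_1K = 0.85
O3_COST_PER_1K = 55.19
ART_E_LATENCY_S = 1.1
O3_LATENCY_S = 5.6
TRAINING_COST = 80.0  # the text says "under $80"; used here as an upper bound


def cost_per_million(cost_per_1k: float) -> float:
    """Scale a per-1,000-run cost to a per-million-run cost."""
    return cost_per_1k * 1_000


def break_even_runs(training_cost: float, api_per_1k: float, own_per_1k: float) -> int:
    """Runs needed before per-run savings cover the one-off training cost."""
    saving_per_run = (api_per_1k - own_per_1k) / 1_000
    return math.ceil(training_cost / saving_per_run)


cost_ratio = O3_COST_PER_1K / ART_E_COST_PER_1K
latency_ratio = O3_LATENCY_S / ART_E_LATENCY_S
runs_to_recoup = break_even_runs(TRAINING_COST, O3_COST_PER_1K, ART_E_COST_PER_1K)

if __name__ == "__main__":
    print(f"cost ratio: {cost_ratio:.1f}x")
    print(f"latency ratio: {latency_ratio:.1f}x")
    print(f"o3 per million runs: ${cost_per_million(O3_COST_PER_1K):,.0f}")
    print(f"ART-E per million runs: ${cost_per_million(ART_E_COST_PER_1K):,.0f}")
    print(f"runs to recoup training cost: {runs_to_recoup:,}")
```

On these inputs the training cost is recovered after roughly 1,500 runs, which is why volume and task specificity, not the training bill, tend to decide whether a workflow is a good candidate.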

    Action items

    • Audit your AI feature portfolio for agentic workflows (multi-step, tool-calling) where you're paying frontier API costs — rank by monthly spend × task specificity and identify top 3 ART candidates by end of March
    • Benchmark GLM-5 and Qwen 3.5 against your current model provider on your top 3 AI use cases this quarter
    • Audit your AI feature architecture for single-model anti-patterns and identify where an ensemble approach (orchestrator + specialized models) would improve reliability — prioritize flows with highest error rates
    • Build or strengthen your model abstraction layer to enable multi-provider failover (OpenAI, Anthropic, open-source) before Q3
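
For the model abstraction layer in the last action item, the core of multi-provider failover is small. The following is a minimal sketch, assuming each provider can be wrapped as a plain function from prompt to text; `complete_with_failover` and `AllProvidersFailed` are hypothetical names, and no real vendor SDK calls are shown.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: each provider is wrapped as a function taking a prompt and
# returning text, raising an exception on failure.
Provider = Callable[[str], str]


class AllProvidersFailed(Exception):
    """Raised when every configured provider has raised an error."""


def complete_with_failover(prompt: str,
                           providers: Sequence[Tuple[str, Provider]]) -> Tuple[str, str]:
    """Try providers in priority order; return (provider_name, output) from the first success."""
    errors: List[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose: any provider error triggers failover
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))
```

Returning the provider name alongside the output makes it possible to log which backend actually served each request, which helps when comparing quality and cost across providers later.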

    Sources: ART: Train Agents That Can Learn From Experience · The Sequence Chat #814: Z.ai's Zixuan Li Talks About GLM · Meta's Internal Chip Design Efforts Hit Roadblocks · Exclusive: Google Strikes Multibillion-Dollar AI Chip Deal With Meta, Sharpening Nvidia Rivalry · Exclusive: Amazon's $50 Billion Investment in OpenAI Could Hinge on IPO, AGI · OpenAI's Kevin Weil on the Future of Scientific Discovery

◆ QUICK HITS

  • Anthropic abandoned its core safety pledge — will now only pause development if it has a 'significant lead' over competitors, under Pentagon pressure threatening $200M+ in contracts by Feb 28 deadline

    Still scheming

  • Stripe in early talks to acquire PayPal (down ~46% YoY) — audit your payments provider dependencies and draft contingency plans for API consolidation

    Jane Street Terraform showdown ⚖️, Anthropic $6B liquidity 🧠, Coinbase stablecoin windfall in Washington's hands 💵

  • Meta integrating USDC/USDT across 3B users (Facebook, Instagram, WhatsApp) by late 2026 via Stripe/Bridge — the biggest stablecoin distribution event in history

    Meta preps for stablecoin integration Ⓜ️, Kraken launches Stock Perps 🚀, Backpack token 🪙

  • Cloudflare rebuilt Next.js as 'vinext' using AI in one week for ~$1,110 in tokens — 4x faster builds, 57% smaller bundles — proving AI can generate production-quality framework alternatives at near-zero cost

    Jane Street vs Bitcoin 🪙, AGI career decisions 💼, Vercel Chat SDK 🤖

  • Figma now supports full Claude Code roundtrip: build working UI → capture as editable Figma frames → refine → push back to code via MCP server, potentially collapsing 2-3 sprint ceremonies

    Claude for Design 🤖, Touchscreen Mac 💻, Firefly Quick Cut 🎬

  • Apple released Foundation Models SDK for Python with on-device inference — evaluate for any mobile features where latency, privacy, or API cost are constraints

    Perplexity Computer 💻, DeepSeek withholds v4 🐋, Cowork scheduled tasks 💼

  • Google and Anthropic are actively blocking AI agent tools (OpenClaw) from accessing their coding products — subscription pricing was never designed for 24/7 agent usage patterns

    AI Agenda Exclusive: A Robot Data Startup Raises $60 Million; Why Google Blocked OpenClaw

  • Basis, a 2023-founded AI accounting agent, raised $100M at $1.15B valuation and is used by ~30% of top 25 accounting firms — three years from founding to unicorn with deep enterprise penetration

    Jane Street Terraform showdown ⚖️, Anthropic $6B liquidity 🧠, Coinbase stablecoin windfall in Washington's hands 💵

  • Big Tech M&A is back to near-2018 levels with ~75% of deals being AI acquisitions — Google alone has made 3 acquisitions in 2026; seed rounds of $35M-$61M are the new normal for AI-native startups

    Nvidia Posts Blockbuster Numbers

  • MCP tool catalog loading costs can be cut 94% by using CLI lazy-loading instead of eager JSON Schema loading — implement if building AI agent features with MCP

    Intelligence crisis 🧠, Claude Code remote control 🕹, React Native for Meta Quest 🥽

  • Anthropic's tool-call data shows 95% of AI agent verticals are wide open: software engineering captures 49.7% of usage while healthcare (1%), legal (0.9%), and education (1.8%) are greenfield

    Git for Data Lakes 🌿, The Data Reckoning 📉, Query Flow Diagrams 🗺️

  • B2B buyers now complete technical feasibility evaluation independently using AI before contacting sales — treat your docs, API refs, and architecture diagrams as first-class sales surfaces

    Parenting trends 👶, relationship loops 🔄, life moments vs. demographics 💍

BOTTOM LINE

The AI agent era shipped this week, not as a demo but as five competing production products, and Salesforce's earnings suggested that AI features can cannibalize legacy revenue rather than expand it. Your two existential questions for Q2: Is your product designed to be operated by agents (not just humans), and does your pricing model survive when agents replace the seats you charge for? The PMs who answer both this quarter will define the next era of enterprise software; the ones who wait will be answering to their CFOs about why AI growth is margin-negative.

Frequently asked

How should I redesign my product so AI agents — not just humans — can use it effectively?
Treat your UI as an interface for non-human users by exposing a structured 'agent API,' adopting MCP compatibility, and ensuring predictable, semantically clear flows without anti-automation patterns. Evaluate Anthropic's Cowork plugin architecture as a distribution channel, since it could give you access to all paid Claude subscribers the way Slack bots did in 2016. Products that remain human-only will gradually lose to agent-compatible alternatives.
If AI agents replace seats, how do I stop my pricing model from collapsing?
Model revenue under at least three scenarios — current seat-based, task-based (like Salesforce's Agentic Work Unit), and outcome-based — assuming agents reduce addressable seats by 20%, 40%, and 60% over three years. Pair this with an explicit cannibalization model showing which legacy lines lose revenue as AI features gain adoption, because Salesforce's Q4 showed Agentforce hit $800M ARR while organic growth decelerated to 7–8%. Your CFO will ask within 60 days.
What's the whitespace opportunity in autonomous agent products right now?
The UX for long-running autonomous agents is unsolved — no one has figured out how to surface progress, handle errors, or enable intervention when an agent has been working for 72 hours or across months. Perplexity Computer and Claude Cowork are shipping capability without experience, and OpenAI's top performers already juggle 3–4 parallel Codex jobs with no real mission-control UI. The PM who nails async status dashboards and intervention flows can own a category.
How real is the AI agent security risk, and what should I do about it this sprint?
It's systemic, not an edge case: Manus suffered CVSS 9.8 zero-click prompt injections, Claude Code had RCE via cloned repos, and the SANDWORM_MODE worm actively targets Cursor, Windsurf, and VS Code Continue to steal credentials. Mandate MCP server allowlisting for your engineering team's AI coding tools, add trust-boundary isolation and consent gates to any feature where external content can trigger tool calls, and patch Sentry (≥26.2.0) and Semantic Kernel Python SDK (≥1.39.4) immediately.
Does the 64x cost drop from specialized agents actually change my build-vs-buy decision?
Yes, for narrow agentic workflows with high volume. OpenPipe's ART framework trained a Qwen2.5-14B agent on one GPU for under $80 that beat o3 on email search at $0.85 vs. $55.19 per 1,000 runs with 5x lower latency. Audit your portfolio for multi-step, tool-calling features ranked by monthly spend × task specificity, and adopt OpenAI's own internal pattern of an orchestrator model delegating to cheaper specialized models rather than routing everything to a frontier API.
