PROMIT NOW · INVESTOR DAILY · 2026-04-13

Open-Source AI Tops SWE-Bench as Cyber Selloff Hits Margins

· Investor · 12 sources · 1,320 words · 7 min

Topics: Agentic AI · LLM Inference · AI Capital

Open-source AI just claimed the #1 position on SWE-Bench Pro under an MIT license, in the same week that UBS reported that over 50% of enterprise buyer conversations now mention 'containing' non-AI software spend and the selloff breached cybersecurity stocks for the first time (Palo Alto -6.7%, CrowdStrike -4%). The base model layer is commoditizing while application-layer budgets are being cut. If your portfolio is caught between these two forces, whether by charging proprietary API margins or by selling seats to enterprises now capping non-AI spend, the compression window just shortened to 2-3 quarters.

◆ INTELLIGENCE MAP

  1. 01

    Open-Source AI Claims Benchmark Crown — Proprietary Moats Compress

    act now

    GLM-5.1 (MIT license) scored 58.4 on SWE-Bench Pro, dethroning GPT-5.4 and Claude Opus 4.6. Google's Gemma 4 under Apache 2.0 runs on mobile devices. The base model layer is now a commodity — value capture migrates to orchestration, edge deployment, and proprietary data layers.

    58.4 · SWE-Bench Pro #1 (open) · 4 sources

    SWE-Bench Pro scores:
    1. GLM-5.1 (MIT): 58.4
    2. GPT-5.4 (Proprietary): 56
    3. Claude Opus 4.6 (Proprietary): 55
    4. Gemma 4 31B (Apache): 52
  2. 02

    Enterprise SaaS Selloff Breaches Cybersecurity Safe Haven

    act now

    UBS confirms >50% of enterprise buyer conversations now mention 'containing' non-AI software spend. ServiceNow and Snowflake dropped ~8%, but the key break is cybersecurity: Palo Alto -6.7%, CrowdStrike -4%. Short sellers are building positions in VM pure-plays (QLYS, RPD, TENB). Figma at $7.9B is 60% below Adobe's 2022 bid.

    50%+ · enterprises containing spend · 3 sources

    Share-price moves:
    1. Snowflake: -8%
    2. ServiceNow: -8%
    3. Palo Alto: -6.7%
    4. CrowdStrike: -4%
    5. Salesforce: -3.5%
  3. 03

    Agent Revenue vs. Agent Reality — Usage Data Creates a Contradiction

    monitor

    Large-scale ChatGPT research shows decision support and writing dominate actual usage; autonomous execution barely registers. Yet Perplexity's agent pivot drove a 50% MoM revenue jump to $450M ARR. Meanwhile, LaunchDarkly data shows AI code ships faster but reliability hasn't improved. The market may be overpricing pure autonomy while underpricing copilot/middleware plays.

    $450M · Perplexity ARR (agent) · 3 sources

    ChatGPT usage mix (chart values):
    1. Decision Support: 35
    2. Writing/Drafting: 30
    3. Info Seeking: 20
    4. Coding: 10
    5. Autonomous Exec.: 5
  4. 04

    Diffusion LLMs Could Unlock 100x GPU Efficiency

    monitor

    Autoregressive LLM inference uses ~1% of A100 compute capacity. Diffusion LLMs generate tokens in parallel, shifting inference to compute-bound — where GPUs actually excel. Three models (LLaDA, Dream 7B, BD3-LM) are approaching quality parity, and Dream 7B is already in production. If this scales, it compresses inference costs and strands current serving infrastructure investments.

    100x · GPU utilization gap · 1 source

    GPU compute utilization (chart values):
    1. Autoregressive: 1
    2. Diffusion LLM: 100
  5. 05

    Gen Z Capital Reallocation — Structural Fintech TAM Shift

    background

    Investment participation among 26-year-olds jumped 5x, from 8% to 40%, in a decade as homeownership declined. A third of Gen Z is allocating to prediction markets and sports betting. Crypto ownership grew 8.5x to 17% of US investors. Finfluencers drive 55% of new investors but rank as the least trusted source. The housing-to-markets capital shift is permanent and structural.

    5x · Gen Z investing surge · 1 source

    Investment participation at age 26:
    1. 2015: 8%
    2. 2025: 40%
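
The diffusion-LLM item above turns on the difference between memory-bound and compute-bound inference. A minimal roofline-style sketch of that argument follows. The hardware figures are NVIDIA's published A100 40GB specifications; the workload model (FP16 weights re-read from memory on every pass, two FLOPs per weight per token, no batching, no KV-cache or attention cost) is a simplifying assumption, and `compute_utilization` is a hypothetical helper name.

```python
# Roofline-style estimate of GPU compute utilization during LLM decoding.
# Hardware numbers: NVIDIA A100 40GB published specs.
# Workload model: an illustrative assumption, not a measurement.

PEAK_FLOPS = 312e12       # dense FP16 tensor-core peak, FLOP/s
MEM_BANDWIDTH = 1.555e12  # HBM2 bandwidth, bytes/s
BYTES_PER_WEIGHT = 2      # FP16 weights
FLOPS_PER_WEIGHT = 2      # one multiply and one add per weight per token


def compute_utilization(tokens_per_pass: int) -> float:
    """Fraction of peak FLOP/s attainable when one pass over the
    weights produces `tokens_per_pass` tokens."""
    # Arithmetic intensity: FLOPs performed per byte of weights read.
    intensity = FLOPS_PER_WEIGHT * tokens_per_pass / BYTES_PER_WEIGHT
    # Roofline: throughput is capped by bandwidth * intensity or by peak.
    attainable = min(PEAK_FLOPS, MEM_BANDWIDTH * intensity)
    return attainable / PEAK_FLOPS


if __name__ == "__main__":
    for n in (1, 100, 256):
        print(f"{n:>3} tokens per pass: {compute_utilization(n):.1%} of peak")
```

Under these assumptions, one token per pass reaches about 0.5% of peak, the same order of magnitude as the ~1% figure cited, and the ceiling is hit at roughly 200 tokens per pass. Real diffusion models run several denoising passes per block of tokens, so the realized gain would be smaller than this upper bound suggests.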

◆ DEEP DIVES

  1. 01

    Open-Source Just Crossed the Moat — The Proprietary AI Premium Is Evaporating

    <h3>The Benchmark Crossover Is Here</h3><p>For the first time, an open-source model under a <strong>fully permissive MIT license</strong> holds the #1 position on SWE-Bench Pro — the industry's gold-standard coding evaluation. Z.AI's GLM-5.1, a 754-billion parameter Mixture-of-Experts model, scored <strong>58.4</strong>, dethroning both OpenAI's GPT-5.4 and Anthropic's Claude Opus 4.6. This isn't a narrow benchmark quirk — it's a direct challenge to the revenue models of every company charging premium API margins for proprietary model access.</p><p>Simultaneously, Google released <strong>Gemma 4</strong> under Apache 2.0, built on the same technology powering Gemini 3. The E2B and E4B variants run multimodal AI inference on <strong>mobile devices and Raspberry Pis</strong>. Two of the world's largest AI players just made frontier-class capabilities free.</p><blockquote>The competitive axis in AI has shifted from model intelligence to deployment geometry. Anthropic bets on restricted security distribution, Meta on ambient consumer embedding, and Z.AI on open-source developer capture — none of them are competing on 'smartest model' anymore.</blockquote><h4>What Makes GLM-5.1 Different</h4><p>Z.AI optimized for <strong>endurance over speed</strong>. GLM-5.1 operates autonomously for up to 8 hours, executing <strong>1,700 tool calls</strong> without strategy drift. In demonstrations, it autonomously built an entire Linux-style desktop environment — writing code, compiling, running in Docker, diagnosing bottlenecks, and <strong>rewriting its own architecture</strong> to fix problems. This is a qualitative shift from 'AI coding assistant' to 'AI software engineer that works overnight.'</p><h4>The Investment Implications Are Immediate</h4><p>Four sources converge on the same conclusion: the base model layer is becoming a commodity. The proprietary moat window is compressing to <strong>2-3 quarters</strong> in coding-adjacent capabilities. 
Portfolio companies whose competitive advantage rests on API margin arbitrage — wrapping GPT/Claude and charging a markup — face margin compression as MIT-licensed alternatives reach parity.</p><p>However, this commoditization creates investable whitespace. Value capture is migrating to three layers:</p><ol><li><strong>Agentic orchestration and infrastructure</strong> — observability, guardrails, and lifecycle management for long-horizon autonomous agents (pre-consensus, equivalent to cloud monitoring in 2012)</li><li><strong>Edge deployment stack</strong> — Gemma 4 running on phones means on-device AI is viable now; edge MLOps and privacy-preserving local inference move from niche to mainstream</li><li><strong>MCP-native developer tools</strong> — Model Context Protocol is emerging as the integration standard across Cursor, Codex, VS Code, and Windsurf; protocol-level distribution advantage is forming</li></ol><p><em>The model is the new database. The value is in the application layer built on top — and specifically in the orchestration middleware that makes agentic workflows reliable and secure.</em></p>

    Action items

    • Audit all portfolio companies whose moat relies on proprietary model API margin and present findings at next IC meeting
    • Build a pipeline of 5-10 agentic infrastructure startups (orchestration, observability, guardrails) for Q3 deployment
    • Reassess any RAG-centric portfolio companies for architectural risk

    Sources: Three frontier labs, three divergent moats · Open-source models just dethroned GPT-5.4 and Claude Opus · Anthropic is waging a six-front war · ChatGPT usage data just undermined the autonomous agent thesis

  2. 02

    The SaaS Selloff Just Breached Cybersecurity — And the UBS Data Says It's Structural

    <h3>The Budget Containment Has Become Procurement Policy</h3><p>UBS Securities published data confirming what channel checks had been whispering: <strong>over 50% of enterprise customer conversations</strong> now explicitly mention 'containing' non-AI software spend, a trend building since December 2025. This isn't sentiment; it's procurement policy. And last Friday, for the first time, the selloff breached what had been the market's safe haven.</p><table><thead><tr><th>Category</th><th>Company</th><th>Price Move</th><th>Key Signal</th></tr></thead><tbody><tr><td><strong>Previously insulated</strong></td><td>Palo Alto Networks</td><td>-6.7% (Friday)</td><td>Security safe haven premium evaporating</td></tr><tr><td></td><td>CrowdStrike</td><td>-4.0% (Friday)</td><td>Endpoint security moat questioned</td></tr><tr><td><strong>Core enterprise</strong></td><td>ServiceNow</td><td>-8.0% (Friday)</td><td>Budget containment hits seat expansion</td></tr><tr><td></td><td>Snowflake</td><td>-8.0% (Friday)</td><td>AI-native data platforms emerging</td></tr><tr><td><strong>Most AI-vulnerable</strong></td><td>Figma</td><td>-50% YTD</td><td>$7.9B EV vs. $20B Adobe bid (2022)</td></tr><tr><td></td><td>Asana</td><td>-60% YTD</td><td>Value trap or takeover target</td></tr></tbody></table><h4>The Two-Front War on Vulnerability Management</h4><p>A separate but compounding signal: short sellers are now <strong>actively building positions</strong> in vulnerability management pure-plays Qualys (QLYS), Rapid7 (RPD), and Tenable (TENB). These companies face a pincer: from above, platform vendors like CrowdStrike and Palo Alto absorb VM into broader suites; from below, AI models commoditize vulnerability detection to near-zero marginal cost. One trader called the trade in <strong>February 2026</strong>, naming RPD and TENB as structurally impaired by AI progress. 
The market hasn't fully absorbed this repricing.</p><h4>The Thesis Shift Is Subtle But Seismic</h4><p>The market previously treated cybersecurity as an <em>AI beneficiary</em> — more AI means more attack surface, more spending. Now it's pricing a different scenario: <strong>AI platforms internalizing security capabilities themselves</strong>, making standalone vendors redundant rather than essential. This creates a barbell:</p><ul><li><strong>AI-native security startups</strong> (building from scratch with AI) become high-conviction targets — the category formation is analogous to cloud security post-AWS</li><li><strong>Legacy VM vendors</strong> with no AI-native roadmap face structural impairment regardless of near-term earnings</li></ul><h4>The Distressed Opportunity</h4><p>Figma at <strong>$7.9B enterprise value</strong> — a 60% discount to Adobe's attempted $20B acquisition — is the headline, but the entire collaboration/productivity category is in a valuation trough. The critical diligence question: is AI displacement of design tools real (category shrinks permanently) or is the market overshooting (you're buying a durable workflow at a discount)? <em>Figma's continued heavy R&D spending suggests management believes the latter.</em></p>

    Action items

    • Stress-test growth assumptions across all SaaS portfolio holdings against a 10-15% reduction in non-AI enterprise software budgets this week
    • Evaluate Figma ($7.9B) and Asana as potential distressed acquisition targets in Q2 diligence cycle
    • Short-list or avoid VM pure-plays (QLYS, RPD, TENB) — reassess any active pipeline deals in standalone vulnerability management
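
The first action item above can be roughed out in a few lines. This is a sketch under stated assumptions, not a valuation model: the holdings, growth rates, and exposure shares are hypothetical placeholders, and it assumes a budget cut passes through proportionally to the share of revenue funded from non-AI software budgets.

```python
from dataclasses import dataclass


@dataclass
class Holding:
    name: str               # hypothetical portfolio company
    base_growth: float      # forecast revenue growth, e.g. 0.25 for 25%
    non_ai_exposure: float  # share of revenue funded from non-AI budgets


def stressed_growth(h: Holding, budget_cut: float) -> float:
    """Revenue growth after customers cut non-AI software budgets by
    `budget_cut`, assuming proportional pass-through to exposed revenue."""
    return (1 + h.base_growth) * (1 - h.non_ai_exposure * budget_cut) - 1


# Placeholder inputs for illustration only; replace with real forecasts.
holdings = [
    Holding("ExampleCo A", base_growth=0.25, non_ai_exposure=0.80),
    Holding("ExampleCo B", base_growth=0.40, non_ai_exposure=0.30),
]

for h in holdings:
    mild = stressed_growth(h, 0.10)    # 10% budget reduction
    severe = stressed_growth(h, 0.15)  # 15% budget reduction
    print(f"{h.name}: base {h.base_growth:.0%}, "
          f"stressed {severe:.1%} to {mild:.1%}")
```

The proportional pass-through is the weakest assumption here. Seat-based contracts with annual renewals would absorb cuts with a lag, so a per-cohort renewal model would be the natural next refinement.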

    Sources: AI budget cannibalization just broke the cybersecurity firewall · Anthropic just declared war on your cybersecurity portfolio · Anthropic's Mythos forces critical infrastructure repricing

  3. 03

    The Agent Paradox: $450M ARR vs. the Data That Says Autonomy Isn't What Users Want

    <h3>The Contradiction That Defines This Cycle</h3><p>Two data points arrived this week that shouldn't both be true — but they are, and the tension between them is the most important thesis signal in AI right now.</p><p><strong>Data Point 1:</strong> Large-scale analysis of millions of ChatGPT conversations reveals that <strong>decision support, writing, and information seeking</strong> account for the overwhelming majority of real-world usage. Coding is a <strong>surprisingly small share</strong>. Autonomous task execution? Barely registers. Non-work usage is growing faster than work usage.</p><p><strong>Data Point 2:</strong> Perplexity's pivot from AI search to AI agents drove a <strong>50% single-month revenue jump to $450M ARR</strong> with 100M monthly active users — the fastest validation of agent-based monetization at scale we've seen.</p><blockquote>The market is overpricing autonomous AI agents and underpricing decision-support copilots — and the first large-scale usage data just proved it. But Perplexity's $450M ARR proves agent-based business models can work. The resolution: agents that augment decisions win; agents that promise full autonomy are building for a use case that doesn't exist at scale.</blockquote><h4>Where the Reliability Data Compounds the Picture</h4><p>LaunchDarkly survey data adds a third dimension: AI-generated code is shipping <strong>faster than ever, but production reliability has not improved proportionally</strong>. The velocity-reliability gap is widening. This matters because autonomous agents operating for 8 hours (like GLM-5.1) amplify both the speed <em>and</em> the reliability risk. 
The market needs new infrastructure layers — runtime control, AI-code observability, deployment safety nets — before autonomous agents become enterprise-ready.</p><h4>The Enterprise Adoption Blockers Are the Investment Opportunity</h4><p>Five specific enterprise blockers have been identified that map directly to fundable categories:</p><ol><li><strong>Integration</strong> — agents need reliable, secure connections to enterprise APIs (Jentic's exact positioning, led by a serial founder with two exits)</li><li><strong>Security</strong> — fine-grained permissions, audit logs, sandboxing for agent actions</li><li><strong>Reliability</strong> — agents that work in demos fail in production; simulation sandboxes are emerging as requirements</li><li><strong>Compliance</strong> — regulatory frameworks haven't caught up; OpenAI's Stargate UK pause shows copyright uncertainty is already killing projects</li><li><strong>Maintainability</strong> — self-improving agents raise governance questions no existing tooling can answer</li></ol><p>Each blocker represents a <strong>$1B+ category opportunity</strong> if enterprise agent deployment scales at the pace Visa's 106M-dispute deployment suggests. The category is pre-consensus, which means valuations are still reasonable. This is the picks-and-shovels layer for the agentic era — and it's where the copilot thesis and the autonomy thesis converge.</p><h4>What This Means for Portfolio Construction</h4><p>Companies positioning AI as <strong>'augmentation'</strong> (making experts better) sustain premium pricing. Companies positioning as <strong>'replacement'</strong> (eliminating grunt work) enter a race to prove ROI through headcount reduction — a value prop that compresses margins. If you're evaluating AI-ops startups, <em>the language they use in their pitch deck tells you which pricing trajectory they're on.</em></p>

    Action items

    • Audit portfolio exposure to 'autonomous agent' thesis — stress-test each company's value prop against the ChatGPT usage data showing decision-support dominance
    • Deep-dive Jentic and 2-3 comparable agent-infrastructure startups for potential investment or watchlist placement
    • Add augmentation-vs-replacement positioning language to standard AI-ops diligence framework

    Sources: ChatGPT usage data just undermined the autonomous agent thesis · Perplexity's $450M ARR at 50% MoM growth · AI's velocity-reliability gap is opening a new infrastructure investment cycle

◆ QUICK HITS

  • Update: Anthropic Claude Code source leak exposed 512,000 lines including a hidden background agent (KAIROS) — 50,000 copies distributed before containment. Compound risk with pay-as-you-go pricing change. Any portfolio company with Claude Code dependency needs a board-level contingency conversation.

    Perplexity's $450M ARR at 50% MoM growth

  • Anthropic paid $400M+ (all-stock) for Coefficient Bio — an 8-month-old, sub-10-person ex-Genentech stealth biotech startup. New acqui-hire benchmark: ~$40M+ per head for domain experts unlocking frontier lab vertical expansion. Retention risk for your AI-healthcare portfolio companies just spiked.

    Three frontier labs, three divergent moats

  • OpenAI paused Stargate UK data center citing highest electricity costs globally and copyright policy uncertainty — leading indicator of capital reallocation away from UK AI infrastructure toward Nordic, Middle East, and US corridors.

    Perplexity's $450M ARR at 50% MoM growth

  • D-Wave Quantum ($5.27B market cap) faces insider whistleblower allegations of misleading metrics and fabricated AI narratives — catalytic webcast April 15 via Coherence.Report. Stress-test any quantum computing portfolio exposure before then.

    Anthropic just declared war on your cybersecurity portfolio

  • xAI spending pushed SpaceX to a nearly $5 billion loss, revealing dangerous financial contagion across Musk's corporate portfolio — any space startup raising on SpaceX comps should be tested against actual unit economics, not narrative.

    AI budget cannibalization just broke the cybersecurity firewall

  • Constellation Software's 29.9% 20-year CAGR via 500+ VMS acquisitions is the single biggest permanent-capital competitor to PE software roll-ups — map your VMS deal pipeline against CSU's six operating groups to avoid bidding blind.

    Constellation Software's 29.9% CAGR is repricing vertical SaaS M&A

  • Brookfield Corporation trades at $42 vs. estimated $68 intrinsic value — a historically wide 38% NAV discount that may signal broader LP sentiment deterioration toward alternative asset managers mid-fundraise.

    Constellation Software's 29.9% CAGR is repricing vertical SaaS M&A

  • GLP-1 pharmacogenomic variation identified — genetic testing before prescription could become standard of care for 1B+ obesity patients. Companion diagnostics TAM forming at intersection of pharmacogenomics and metabolic medicine.

    Anthropic's Mythos forces critical infrastructure repricing

  • Linux Kernel mandated AI code provenance tracking (Assisted-by tags, human-only sign-off) — creates greenfield compliance tooling market as the standard propagates across OSS projects within 12-18 months.

    ChatGPT usage data just undermined the autonomous agent thesis
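
Several headline figures in this issue are simple ratios, and it is worth checking them before quoting them to an investment committee. The sketch below recomputes them from the numbers stated in the text; `discount` is a hypothetical helper, and the per-head figure assumes the upper bound of ten people implied by "sub-10-person".

```python
def discount(current: float, reference: float) -> float:
    """Fractional discount of `current` relative to `reference`."""
    return 1.0 - current / reference


# Figma: $7.9B enterprise value vs. Adobe's $20B bid (2022).
figma = discount(7.9, 20.0)

# Brookfield: $42 share price vs. $68 estimated intrinsic value.
brookfield = discount(42.0, 68.0)

# Coefficient Bio: $400M+ over at most 10 people gives a floor per head ($M).
per_head_floor = 400.0 / 10

# Gen Z participation: 8% to 40% over a decade, expressed as a multiple.
participation_multiple = 40.0 / 8.0

print(f"Figma discount to Adobe bid: {figma:.1%}")
print(f"Brookfield discount to intrinsic value: {brookfield:.1%}")
print(f"Acqui-hire floor per head: ${per_head_floor:.0f}M")
print(f"Participation multiple: {participation_multiple:.0f}x")
```

The results line up with the rounded figures quoted above: roughly 60% for Figma, 38% for Brookfield, at least $40M per head, and 5x for participation.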

BOTTOM LINE

Open-source AI just claimed the frontier benchmark crown under an MIT license while UBS reported that more than half of enterprise buyer conversations now mention containing non-AI software spend. The model layer is commoditizing while application-layer budgets are being cut, which concentrates value capture in three layers: agentic infrastructure middleware, edge deployment, and AI-native security. If your portfolio sits between these pincers, charging proprietary API margins or selling seats to enterprises now containing non-AI spend, the repricing has already started and you have 2-3 quarters before it becomes consensus.

Frequently asked

Which portfolio positions are most exposed to the open-source model crossover?
Companies whose moat rests on reselling proprietary model API access at a markup are most exposed. With Z.AI's GLM-5.1 taking #1 on SWE-Bench Pro under MIT license and Gemma 4 shipping under Apache 2.0, coding-adjacent API margins face compression within 2-3 quarters. Multi-model contingency plans and migration toward orchestration, observability, or edge deployment value layers are the defensible responses.
Is the Figma valuation at $7.9B EV a genuine distressed opportunity or a value trap?
It's a high-conviction diligence target, not an automatic buy. The 60% discount to Adobe's 2022 $20B bid is real, and continued heavy R&D suggests management believes the workflow is durable. The decisive question is whether AI is permanently shrinking the design-tool category or the market is overshooting — resolve that before the takeover window closes in Q2.
Why did cybersecurity stocks sell off if AI is supposed to expand the attack surface?
The market is repricing cyber from AI beneficiary to AI-displaced. Palo Alto (-6.7%) and CrowdStrike (-4%) dropped because investors now expect AI platforms to internalize security capabilities, making standalone vendors redundant. Combined with UBS data showing that over 50% of enterprise buyer conversations now mention containing non-AI software spend, even the safe-haven premium is evaporating.
How do you reconcile Perplexity's $450M ARR with data showing users don't want autonomous agents?
Agents that augment decisions monetize; agents that promise full autonomy are building for demand that doesn't exist at scale. Perplexity's 50% MoM jump came from agent-assisted search and research workflows — augmentation framed as agency. ChatGPT usage data confirms decision support, writing, and information seeking dominate, while autonomous task execution barely registers.
Which agentic infrastructure categories are still pre-consensus enough to deploy into this quarter?
Agent-to-API integration middleware, runtime observability and guardrails, simulation sandboxes for reliability testing, and MCP-native developer tooling remain pre-consensus with reasonable valuations. Each maps to a specific enterprise deployment blocker — integration, security, reliability, compliance, maintainability — and each has $1B+ category potential if enterprise agent adoption tracks the pace signaled by Visa's 106M-dispute deployment.
