PROMIT NOW · INVESTOR DAILY · 2026-02-20

AI Capital Reprices as Mega-Seeds Collide With GPU Physics

· Investor · 23 sources · 1,931 words · 10 min

Topics AI Capital · Agentic AI · LLM Inference

AI capital is repricing at every layer simultaneously: $5B+ in mega-seed rounds dropped this week (Ineffable Intelligence at $4B, World Labs at $1B, Entire at $300M), while inference economics reveal a structural memory-bandwidth wall that makes current GPU infrastructure 99% wasteful for the workloads that matter most. The funds that win the next decade will be those that can underwrite both the 'coconut round' founder-pedigree premium at entry AND the physics-constrained unit economics that determine which AI companies actually survive to scale.

◆ INTELLIGENCE MAP

  01

    AI Mega-Rounds and the Death of Stage Taxonomy

    act now

    First rounds in AI now span a 2,000x range ($1M to $2B); capital is fragmenting from monolithic LLM scaling into spatial AI, reinforcement learning, and creative tooling; and the only entry point for category-defining companies is pre-announcement relationship access. The traditional staged-financing model is broken.

    4 sources
  02

    AI Infrastructure Economics: The Memory Wall and Inference Reality

    monitor

    Inference runs at ~1% GPU utilization due to memory bandwidth constraints that worsen every chip generation, creating investable wedges in inference-specific silicon, edge AI (11x cheaper than cloud at high usage), and utilization optimization — while simple RAG outperforms complex methods at 3-5x lower cost, threatening over-engineered tooling valuations.

    4 sources
  03

    Platform Commoditization Crushing the Application Layer

    act now

    Google bundling Lyria 3 into Gemini, GitHub embedding agentic workflows for 100M+ developers, AI-generated 'disposable interfaces' bypassing SaaS UIs, and 100+ finance copilots fragmenting ARR all confirm that application-layer AI startups without distribution moats or proprietary data are entering the kill zone.

    5 sources
  04

    Fintech AI Operating Leverage and BNPL Expansion

    monitor

    Klarna halved headcount while boosting pay 50%, Ramp hit $1B revenue in 7 years with AI-first ops, Affirm posted 62.5% net income growth with BNPL at just 1.5% US retail penetration — while the crypto treasury model died publicly (only 1 of dozens beat the S&P 500, Thiel exited ETHZilla).

    2 sources
  05

    Regulatory and Geopolitical Risk Repricing

    background

    Meta's bellwether addiction trial exposes internal contradictions under oath, Fed minutes confirm rate cuts months away, Iran military escalation threatens energy prices, Anthropic faces Pentagon blacklisting, and the password manager trust model just broke across 60M users — multiple sectors face simultaneous regulatory repricing.

    5 sources

◆ DEEP DIVES

  01

    The Coconut Round Era: $5B+ in AI Mega-Seeds Demands a New Portfolio Playbook

    <h3>The Structural Shift</h3><p>The venture capital stage taxonomy broke this week. When <strong>Thomas Dohmke</strong> (former GitHub CEO) raised a <strong>$60M seed at $300M valuation</strong> for Entire, it was the smallest of a cluster: <strong>Ineffable Intelligence</strong> (David Silver, AlphaGo architect) is targeting <strong>$1B at a $4B valuation</strong> — reportedly Europe's largest-ever seed. <strong>World Labs</strong> (Fei-Fei Li) closed <strong>$1B</strong> with AMD, NVIDIA, and Fidelity co-investing alongside a <strong>$200M Autodesk strategic</strong>. Mira Murati's <strong>Thinking Machines raised $2B</strong>. Saudi Arabia's Humain pumped <strong>$3B into xAI</strong> with a bizarre SpaceX share conversion mechanism. That's over $7B in announced AI capital in a single cycle.</p><p>The market has started calling these <strong>"coconut rounds"</strong> — a term coined by Bloomberg's Ed Ludlow — because the old vocabulary can't contain what's happening. The seed label now spans a <strong>2,000x range</strong> ($1M to $2B), and the sole underwriting criterion for the mega end is founder pedigree.</p><hr><h3>Where Capital Is Fragmenting</h3><p>Critically, this capital is <strong>not</strong> all flowing to the same thesis. 
The market is diversifying away from monolithic LLM scaling into distinct sub-categories, each attracting billion-dollar conviction bets:</p><table><thead><tr><th>Company</th><th>Founder Pedigree</th><th>Raise</th><th>Valuation</th><th>Thesis</th></tr></thead><tbody><tr><td><strong>Thinking Machines</strong></td><td>Mira Murati (OpenAI CTO)</td><td>$2B</td><td>Undisclosed</td><td>Frontier AI research</td></tr><tr><td><strong>Ineffable Intelligence</strong></td><td>David Silver (AlphaGo)</td><td>$1B</td><td>$4B</td><td>RL-first AI, explicitly not LLM scaling</td></tr><tr><td><strong>World Labs</strong></td><td>Fei-Fei Li (Stanford/Google)</td><td>$1B</td><td>Undisclosed</td><td>Spatial intelligence (MARBLE)</td></tr><tr><td><strong>Apptronik</strong></td><td>Robotics team</td><td>$935M</td><td>Undisclosed</td><td>Humanoid robotics</td></tr><tr><td><strong>Entire</strong></td><td>Thomas Dohmke (GitHub CEO)</td><td>$60M</td><td>$300M</td><td>AI-native developer tools</td></tr></tbody></table><p>Two patterns demand attention. First, the <strong>valuation floor for coconut rounds is ~$300M</strong> and the ceiling exceeds $4B — territory that used to require $50M+ ARR. Second, Silver's explicit framing — <strong>RL, not incremental LLM updates</strong> — is a thesis statement that the next capability discontinuity comes from a different paradigm than what 90% of AI capital currently funds. When competing chip companies (AMD and NVIDIA) both back World Labs, they're signaling spatial AI compute demand will be large enough to grow the pie for everyone.</p><blockquote>Dohmke had the option to raise at $700M but deliberately settled at $300M — optimizing for dilution management and downstream optionality rather than maximizing Day 1 paper value. That discipline is rare and investable.</blockquote><h3>Portfolio Construction Implications</h3><p>The elimination of staged financing removes the traditional risk-reduction mechanism that protected LPs. 
When companies compress 4-5 rounds over 7-10 years into 2-3 rounds over 3-5 years, your <strong>reserve strategy, follow-on cadence, and expected hold period</strong> all change. Benchmark's 2020 fund sitting at <strong>10x+ invested capital</strong> (driven by Legora, Mercor, Sierra, Cursor) and Jack Altman winding down a <strong>$275M fund</strong> to join as GP signal that the top of the market believes AI-first bets are the generational trade.</p><p>The alpha is entirely in <strong>pre-announcement access</strong>. By the time a coconut round is reported, allocation is oversubscribed. The next 20 most likely coconut founders are senior leaders at OpenAI, Anthropic, Google DeepMind, Meta FAIR, and xAI who haven't yet left. The "disciplined coconut" tier ($50-100M seeds at $250-500M valuations) offers better risk-adjusted returns than $1-2B mega-seeds, where the path to a venture-returning outcome requires $20B+ exits.</p>

    Action items

    • Audit fund stage-based allocation model against coconut round dynamics by end of Q1 — determine whether check size and ownership targets allow participation or require explicit opt-out to sub-$10M seeds
    • Build a target list of the 20 most likely next coconut founders from OpenAI, Anthropic, DeepMind, Meta FAIR, and xAI leadership — begin relationship-building this month
    • Update LP communications this quarter to address coconut round portfolio construction — explain concentrated, high-conviction bets at $300M-$4.5B entry valuations
    • Reassess AI developer tools portfolio companies against Entire's $60M war chest and Dohmke's GitHub network within 2 weeks

    Sources: Capital-Intensive 'Coconut' Rounds Upend the Traditional Venture Funding Model · Gemini music gen 🎵, World Labs $1B 🌍, Spec-driven AI dev 🧱 · 🎶 Google's play for the AI music mainstream · X crypto & stock trading 🪙, AI will shrink workforce 🤖, Affirm expands BNPL 💸

  02

    The Memory Wall Is the Real AI Moat — and It's Repricing the Entire Infrastructure Stack

    <h3>The Physics You Can't Ignore</h3><p>While billions pour into AI mega-rounds, the unit economics of actually <em>running</em> AI are governed by a physics constraint the market hasn't fully priced. During decode (token generation), the H100 runs at roughly <strong>1% of its theoretical peak compute</strong>. Not because of bad software — because of memory bandwidth. Decode arithmetic intensity collapses to 1-4 FLOPs/byte, far below the H100's 295 FLOPs/byte threshold. And here's what makes this structurally investable: <strong>compute grows 3x every two years while bandwidth grows at half that rate</strong>. The problem gets worse with every GPU generation.</p><p>The KV cache is the binding constraint on AI unit economics. At 4K context, a 7B model serves <strong>278 concurrent users per H100</strong>. At 128K context, that drops to <strong>~8 users</strong> — a <strong>35x cost escalation</strong> that most financial models ignore. Agentic systems, where agents share traces and context compounds, push every deployment toward the expensive end of this curve.</p><table><thead><tr><th>Layer</th><th>Cost per Million Tokens</th><th>Multiple vs. Floor</th></tr></thead><tbody><tr><td>Raw compute floor (full utilization)</td><td>$0.004</td><td>1x</td></tr><tr><td>Self-hosted (30% utilization)</td><td>$0.013</td><td>3x</td></tr><tr><td>Google Gemini Flash-Lite</td><td>$0.30</td><td>75x</td></tr><tr><td>OpenAI GPT-4o-mini</td><td>$0.60</td><td>150x</td></tr><tr><td>Anthropic Claude Haiku</td><td>$1.25</td><td>312x</td></tr></tbody></table><p>The <strong>75-312x gap</strong> between compute floor and API pricing isn't margin — it's operational overhead. This gap is the addressable market for inference optimization companies.</p><hr><h3>Three Investable Wedges</h3><h4>1. Inference-Specific Silicon</h4><p>Groq (massive on-chip SRAM), Cerebras (wafer-scale silicon), Etched (transformer-specific ASICs), and AMD's MI300A (92 FLOPs/byte vs. 
H100's 295) are attacking the structural memory-bandwidth bottleneck. The irony: <strong>Qualcomm's Snapdragon at ~1 FLOP/byte is actually better matched</strong> to decode's 1-4 FLOPs/byte arithmetic intensity than NVIDIA's H100. The compute gap is 20,000x, but the bandwidth gap is only 66x.</p><h4>2. Edge AI Infrastructure</h4><p>At 100M MAU with 500 requests/user/month, cloud API costs <strong>$11.25M/month</strong> versus <strong>$1.0M on-device</strong> — an <strong>11x premium</strong>. On-device cost is flat regardless of usage because users already own the hardware. This means <strong>always-on AI features are economically impossible on cloud metering</strong> but viable on-device. Three independent developments are converging to make this real: YOLO26 eliminates post-processing complexity for edge object detection, SLMs use quantization for local inference, and Python 3.14's GIL removal unlocks true parallelism for edge workloads.</p><h4>3. Utilization Optimization</h4><p>Moving from 10% to 30% GPU utilization delivers a <strong>3x cost improvement</strong> — more impactful than most architectural innovations. LangChain's coding agent jumped from <strong>Top 30 to Top 5 on Terminal Bench 2.0</strong> with nothing more than a harness change — no model improvement. This proves the value layer is shifting from model providers to <strong>orchestration and harness engineering</strong>. Meanwhile, FloTorch's 2026 benchmark showed simple RAG chunking beating complex methods at <strong>3-5x lower infrastructure cost</strong>, threatening the valuation thesis for over-engineered RAG tooling startups.</p><blockquote>The memory wall, not the compute ceiling, determines who wins AI infrastructure; every dollar invested in faster FLOPs without proportional bandwidth is a dollar wasted on decode workloads.</blockquote>
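The headline numbers above are simple enough to sanity-check with back-of-envelope arithmetic. A minimal sketch, assuming H100 SXM spec-sheet figures (~989 TFLOPS FP16 dense, ~3.35 TB/s HBM3 bandwidth — our assumption, not stated in the piece) combined with the cost figures quoted above:

```python
# Roofline-style sanity check of the memory-wall claims.
# Assumed H100 SXM specs (not stated in the article):
PEAK_FLOPS = 989e12   # FP16 dense peak, FLOPs/s
BANDWIDTH = 3.35e12   # HBM3 memory bandwidth, bytes/s

# Ridge point: arithmetic intensity needed to become compute-bound.
ridge_point = PEAK_FLOPS / BANDWIDTH          # ~295 FLOPs/byte
decode_intensity = 3.0                        # mid-range of the cited 1-4 FLOPs/byte
utilization = decode_intensity / ridge_point  # fraction of peak compute used in decode

# API markup over the raw compute floor (figures from the cost table above).
floor = 0.004                                 # $/M tokens at full utilization
markups = {p: p / floor for p in (0.30, 0.60, 1.25)}  # Flash-Lite, 4o-mini, Haiku

# Cloud vs. on-device at the cited scale: 100M MAU x 500 requests/user/month.
cloud_premium = 11.25e6 / 1.0e6               # monthly cost premium

print(f"ridge point: {ridge_point:.0f} FLOPs/byte")
print(f"decode utilization: {utilization:.1%}")
print(f"API markups over floor: {sorted(round(m) for m in markups.values())}")
print(f"cloud premium: {cloud_premium:.2f}x")
```

The ridge point lands on the 295 FLOPs/byte threshold cited above, and a mid-range decode intensity of 3 FLOPs/byte implies roughly 1% of peak compute — consistent with the utilization claim this thesis rests on.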

    Action items

    • Screen all portfolio companies building on long-context or agentic workflows for KV cache exposure — model the 8-35x cost escalation from 4K to 32K-128K context by end of month
    • Reassess inference chip startup valuations (Groq, Cerebras, Etched) as structural physics plays, not incremental optimizations — build thesis memo this quarter
    • Stress-test any active RAG infrastructure deals against FloTorch's simplicity benchmark — challenge chunking complexity as a moat
    • Source Series A/B deals in agent harness and orchestration companies this quarter

    Sources: The Real Cost of Running AI · Gemini music gen 🎵, World Labs $1B 🌍, Spec-driven AI dev 🧱 · Trust Through Data Lineage 🕸️, Auto-Healing Spark Memory ⚙️, BI Built in SQL 📊 · Researchers Solved a Decade-old Problem in Object Detection

  03

    Platform Kill Zone Expanding: Hyperscalers Are Commoditizing the AI Application Layer

    <h3>The Convergence</h3><p>Five independent signals this cycle point to the same conclusion: <strong>the AI application layer is being systematically commoditized</strong> by platform incumbents, and startups without distribution moats, proprietary data, or deep workflow integration are entering the kill zone.</p><p><strong>Signal 1: Google's Lyria 3.</strong> Text-to-music, photo-to-music, and video-to-music generation integrated directly into Gemini with SynthID watermarking. Standalone AI music platforms like Suno and Udio now face a distribution problem they cannot solve — Google ships this to hundreds of millions of users at zero marginal cost.</p><p><strong>Signal 2: GitHub Agentic Workflows.</strong> Microsoft is embedding agentic AI directly into the developer infrastructure layer where <strong>100M+ developers</strong> already live. Developers describe desired outcomes in plain Markdown; agents handle triage, documentation, code quality, and more. This is textbook platform bundling that compresses the TAM for every standalone AI coding tool.</p><p><strong>Signal 3: Disposable Interfaces.</strong> AI coding tools now enable end-users to generate custom interfaces that bypass traditional SaaS UIs entirely, accessing product value directly through APIs. If your portfolio company's moat is "beautiful UI" rather than proprietary data, that moat is dissolving.</p><p><strong>Signal 4: Finance Copilot Graveyard.</strong> With <strong>100+ LLM-enabled entrants</strong> fragmenting ARR, questionable unit economics from API dependence on Bloomberg/FactSet data, and LLM reliability gaps that institutional workflows cannot tolerate, the standalone finance copilot is a value trap. The real unlock — seamless two-way Excel integration — hasn't been cracked yet.</p><p><strong>Signal 5: AI Monetization Fork.</strong> Perplexity pulled sponsored answers entirely, betting that trust drives subscriptions. OpenAI and Google are testing ads in AI responses. 
This divergence determines where marketing dollars flow — and whether AI ad-tech has a structural ceiling.</p><table><thead><tr><th>Category</th><th>Platform Threat</th><th>Survival Criteria</th></tr></thead><tbody><tr><td>AI Music Generation</td><td>Google Lyria 3 in Gemini</td><td>Enterprise licensing, proprietary IP deals</td></tr><tr><td>AI Coding Tools</td><td>GitHub Agentic Workflows</td><td>Enterprise workflow depth, non-GitHub distribution</td></tr><tr><td>SaaS UI Layer</td><td>AI-generated disposable interfaces</td><td>API-first architecture, data moats</td></tr><tr><td>Finance Copilots</td><td>Incumbent terminal data lock-in</td><td>Excel bridge, proprietary data</td></tr><tr><td>AI Search</td><td>Google AI Overviews, ChatGPT Search</td><td>Subscription trust premium (Perplexity bet)</td></tr></tbody></table><blockquote>When Google gives away AI music generation for free inside a product with hundreds of millions of MAUs, quality parity is no longer a moat. The same logic applies to every AI feature that can be bundled into an existing platform.</blockquote><h3>Where Value Migrates</h3><p>The value is migrating to three places: <strong>platform owners</strong> (Google, OpenAI, Microsoft), <strong>infrastructure providers</strong> (content provenance, agentic commerce rails, AI agent authorization), and <strong>frontier research labs</strong> with breakthrough capabilities platforms can't easily replicate. AI agent authorization is emerging as a particularly compelling new category — ReBAC systems like SpiceDB are structurally superior to static policy engines for agentic workloads, and the TAM expands with every enterprise AI agent deployment. OpenAI's hire of <strong>Charles Porch</strong> (15-year Meta/Instagram veteran) as VP of Global Creative Partnerships signals they're building cultural legitimacy as a moat — not just technology.</p>

    Action items

    • Conduct a portfolio-wide 'kill zone audit' this week — flag every holding where >50% of competitive positioning derives from UI/UX rather than proprietary data, APIs, or workflow lock-in
    • Reassess exposure to standalone AI music generation startups (Suno, Udio) and AI coding tools without enterprise lock-in by end of month
    • Map the AI agent authorization landscape (Authzed/SpiceDB, Oso, Permit.io) as a new investment category this quarter
    • Avoid new investments in standalone finance copilots — redirect attention to companies building the Excel integration bridge or the context layer (systems of record)

    Sources: 🎶 Google's play for the AI music mainstream · Meta smartwatch ⌚, Zuckerberg testifies ⚖️, GitHub Agentic Workflows 🤖 · PostgreSQL bloat 🐼, React Doctor 🧑‍⚕️, disposable interfaces ⚡️ · X crypto & stock trading 🪙, AI will shrink workforce 🤖, Affirm expands BNPL 💸 · Reddit creative trends 🖼️, B2B carousel formula ✅, find AI queries in GSC 🔍

  04

    Regulatory and Macro Risk Repricing: Social Media Liability, Fed Hold, and Supply Chain Gaps

    <h3>Meta's Bellwether Trial: The Most Underpriced Risk in Social Media</h3><p>Mark Zuckerberg took the stand in a bellwether addiction trial where <strong>internal emails directly contradict his testimony</strong>. In 2015, Zuckerberg emailed teams to boost time spent by 12% as a 2016 goal — then testified Meta doesn't "give teams goals on time spent." Internal estimates showed <strong>4 million children under 13</strong> on Instagram while the company claimed they weren't allowed. Adam Mosseri said on a 2020 podcast that "there's such a thing as being addicted to a social media platform" — then testified he disagrees addiction exists.</p><p>TikTok and Snap <strong>settled before trial</strong>, the strongest signal that the industry views litigation risk as material. This is a bellwether case shaping thousands of pending lawsuits. If you have any portfolio exposure to ad-supported social platforms, you need a litigation scenario in your model — <em>specifically, model 15-25% engagement reduction mandates</em>.</p><hr><h3>Fed: Higher for Longer Is Now the Base Case</h3><p>February 18 Fed minutes revealed a <strong>divided committee</strong> with rate cuts months away. The 10-Year Treasury ticked to <strong>4.079%</strong>. Any DCF model assuming H1 2026 rate relief is stale. The implication for growth equity is direct: discount rates stay elevated, compressing present values of future cash flows across your portfolio.</p><hr><h3>Supply Chain Single Points of Failure</h3><p>Trump invoked the <strong>Defense Production Act</strong> for glyphosate and elemental phosphorus, revealing the US has <strong>exactly one domestic producer</strong> of both chemicals — with China as the only alternative. Elemental phosphorus feeds into semiconductors, batteries, and military applications. 
Simultaneously, Bayer faces a three-body problem: a rumored <strong>$7.25B Roundup settlement</strong>, a pending Supreme Court case, and the DPA order that validates glyphosate as strategically essential. The Supreme Court ruling is the binary trigger — if for Bayer, the settlement closes and the stock re-rates; if against, liability widens.</p><hr><h3>Cybersecurity: Detection Moats Eroding</h3><p>ETH Zurich demonstrated <strong>25 attacks across Bitwarden (12), LastPass (7), and Dashlane (6)</strong> — breaking zero-knowledge encryption guarantees across <strong>60M users</strong>. Separately, ADWSDomainDump bypasses both <strong>CrowdStrike Falcon and Microsoft Defender</strong> using ADWS (port 9389) instead of LDAP. ShinyHunters executed <strong>15 breaches in 7 weeks</strong> of 2026 via social engineering and SSO abuse. The pattern: <strong>identity is the primary attack surface</strong>, and detection-based defenses are hitting structural limits. This creates displacement opportunities in next-gen credential management, ITDR, and AI agent security (Nono's kernel-enforced sandbox for MCP/LLM workloads).</p><blockquote>The market is telling you AI infrastructure is investable, AI applications are fragile, social media faces a litigation repricing nobody's modeling, and the Fed just took H1 rate cuts off the table — adjust your discount rates and your deal flow filters accordingly.</blockquote>

    Action items

    • Stress-test portfolio companies with ad-supported social media revenue against 15-25% engagement reduction mandates — complete scenario analysis by end of Q1
    • Update DCF models across growth-stage portfolio to reflect rate cuts no earlier than Q3 2026 this month
    • Identify the sole US domestic glyphosate/phosphorus producer as a potential investment target this week
    • Screen deal flow for next-gen credential management platforms positioned to displace Bitwarden/LastPass/Dashlane's 60M user base — USENIX Security 2026 publication will be the procurement catalyst

    Sources: ☕️ Just one glitch · ☕️ ROUNDED UP ☘ Thursday, February 19, 2026 ☘ C&C NEWS 🦠 · Android Firmware Malware 🚨, Dell Zero-Day Exploited 🖧, Password Manager Lies 🔓 · Meta smartwatch ⌚, Zuckerberg testifies ⚖️, GitHub Agentic Workflows 🤖 · Today in Politics, Bulletin 311. 2/19/26

◆ QUICK HITS

  • Klarna halved headcount since 2022, targets another 33% cut by 2030 — AI chatbot replaces 800 support agents while remaining employee pay rises 50%

    X crypto & stock trading 🪙, AI will shrink workforce 🤖, Affirm expands BNPL 💸

  • BNPL at just 1.5% of US retail volume — Affirm posted $1.1B quarterly revenue (+30%) and $130M net income (+62.5%) while expanding into travel, tax, rent, and debit

    X crypto & stock trading 🪙, AI will shrink workforce 🤖, Affirm expands BNPL 💸

  • Crypto treasury model dead: only 1 of dozens of Strategy copycats beat the S&P 500; Thiel exited ETHZilla (down 97%); Nakamoto Holdings at $0.29 after raising $710M at $1.12

    Web 4.0 & Automatons 🤖, Theil Exits EthZilla 🏃, The Nakamoto Heist 🦹

  • Anthropic faces potential Pentagon supply chain blacklisting — defense AI contracts could redistribute to Palantir, Scale AI, Anduril, and OpenAI

    Meta smartwatch ⌚, Zuckerberg testifies ⚖️, GitHub Agentic Workflows 🤖

  • 50%+ of all ETH supply (80.95M ETH) now locked in proof-of-stake deposit contract — structural supply compression reducing free float

    Web 4.0 & Automatons 🤖, Theil Exits EthZilla 🏃, The Nakamoto Heist 🦹

  • Aave revenue grew from $5.2M to $142M — but the ~$51M ACI budget vote is a DeFi governance stress test worth monitoring

    Web 4.0 & Automatons 🤖, Theil Exits EthZilla 🏃, The Nakamoto Heist 🦹

  • eBay acquiring Depop from Etsy for $1.2B in cash — benchmarks exit multiples for recommerce and signals Etsy's strategic retreat from adjacencies

    ☕️ Just one glitch

  • X-Money payments beta expected within 2 months alongside Smart Cashtags for stock/crypto trading — competitive threat to Robinhood and Cash App pending regulatory clearance

    X crypto & stock trading 🪙, AI will shrink workforce 🤖, Affirm expands BNPL 💸

  • Humain's $3B xAI investment converts to SpaceX shares — a governance structure no traditional investor should underwrite without significant protections

    Gemini music gen 🎵, World Labs $1B 🌍, Spec-driven AI dev 🧱

  • Fintech VC surged 35% to $40.8B across 2,126 deals in 2025 — but fewer, bigger deals mean capital is concentrating in winners while the Series A/B window narrows

    X crypto & stock trading 🪙, AI will shrink workforce 🤖, Affirm expands BNPL 💸

BOTTOM LINE

AI capital is simultaneously repricing at the top (coconut rounds hitting $2B for pre-product companies) and being constrained at the bottom (inference runs at 1% GPU utilization due to memory physics that worsen every chip generation), while hyperscalers are commoditizing the application layer in between — the funds that win will be those that can access elite founders before announcement, underwrite the infrastructure physics that determine survival, and ruthlessly exit portfolio companies sitting in the platform kill zone.

Frequently asked

What is a 'coconut round' and why does it matter for portfolio construction?
A 'coconut round' is the new category of $50M–$2B seed rounds at $300M–$4B+ valuations, underwritten almost entirely on founder pedigree rather than traction. The term, coined by Bloomberg's Ed Ludlow, reflects that seed labels now span a 2,000x range. This breaks traditional staged financing: reserve strategies, follow-on cadence, and expected hold periods all need to be rebuilt, and LPs expecting classic seed risk profiles must be re-educated on concentrated, high-entry-price bets.
Why is GPU memory bandwidth, not compute, the real constraint on AI unit economics?
During token generation (decode), workloads have an arithmetic intensity of just 1–4 FLOPs/byte, far below the H100's 295 FLOPs/byte threshold — meaning GPUs run at roughly 1% of theoretical peak. Compute is growing 3x every two years while bandwidth grows at half that rate, so the gap widens each generation. This is why inference-specific silicon (Groq, Cerebras, Etched) and even Qualcomm's Snapdragon (~1 FLOP/byte) are structurally better matched to decode than NVIDIA's flagship GPUs.
How should I model KV cache costs for portfolio companies building agentic or long-context products?
Model an 8–35x COGS escalation as context length grows. A 7B model on an H100 serves ~278 concurrent users at 4K context but only ~8 users at 128K context. Most founder financial models project unit economics at 4K while product roadmaps require 32K–128K for agentic workflows, meaning reported gross margins are materially overstated. Any long-context or multi-agent company needs its inference cost curve re-underwritten before follow-on.
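That re-underwriting can start from a parameterized KV-cache model. A sketch under illustrative assumptions (a 7B-class model with 32 layers, grouped-query attention with 8 KV heads of dimension 128, an FP16 cache, and ~14 GB of weights on an 80 GB H100 — the exact configuration behind the 278-user figure is not specified):

```python
def kv_bytes_per_token(layers: int, kv_heads: int, head_dim: int,
                       bytes_per_elem: int = 2) -> int:
    """KV-cache bytes stored per token: K and V tensors across all layers."""
    return 2 * layers * kv_heads * head_dim * bytes_per_elem

def concurrent_users(gpu_mem_gb: float, weights_gb: float,
                     context_len: int, per_token_bytes: int) -> float:
    """Users servable when the KV cache fills memory left after weights."""
    kv_budget = (gpu_mem_gb - weights_gb) * 1e9
    return kv_budget / (context_len * per_token_bytes)

# Illustrative 7B-class config (assumed, not from the source figures).
per_tok = kv_bytes_per_token(layers=32, kv_heads=8, head_dim=128)  # 128 KiB/token
u_4k = concurrent_users(80, 14, 4_096, per_tok)
u_128k = concurrent_users(80, 14, 131_072, per_tok)
print(f"{u_4k:.0f} users at 4K vs {u_128k:.1f} at 128K "
      f"-> {u_4k / u_128k:.0f}x concurrency drop")
```

Pure KV scaling yields an exact 32x concurrency drop from 4K to 128K; the 278-to-8 (~35x) figure presumably reflects a different model or serving configuration, but the shape of the curve — and hence the COGS escalation — is the same.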
Which AI application categories are entering the platform kill zone right now?
Standalone AI music (threatened by Google Lyria 3 in Gemini), AI coding tools without enterprise lock-in (GitHub Agentic Workflows), pure-UI SaaS layers (disposable AI-generated interfaces), and finance copilots (100+ fragmented entrants against Bloomberg/FactSet lock-in). Survival requires proprietary data, deep workflow integration, non-incumbent distribution, or frontier capability — quality parity alone is no longer a moat when hyperscalers bundle for free.
What macro and regulatory shifts should I reprice into the portfolio this quarter?
Three: Fed minutes make rate cuts no earlier than Q3 2026 the base case, so DCF discount rates must rise across growth-stage holdings. The Meta addiction bellwether trial — with internal emails contradicting sworn testimony and TikTok/Snap settling pre-trial — warrants modeling a 15–25% engagement reduction scenario for ad-supported social exposure. And Trump's Defense Production Act invocation for glyphosate and elemental phosphorus reveals single-producer supply chain vulnerabilities that will attract federal subsidies and national-security premiums.
