A two-person company hit $1.8B because the moat moved
Medvi's $20K-to-$1.8B run rate isn't an outlier. It landed the same week Gemma 4 went Apache 2.0, GitHub fell to 90% uptime under agent load, and chain-of-thought prompting stopped paying off on reasoning models. The boring layers are where the fight is now.
Matthew Gallagher spent $20,000, hired his brother, wired up ChatGPT, Claude, Midjourney, ElevenLabs, and a couple of regulated-services APIs, and built a GLP-1 telehealth business doing $401M in year one and tracking $1.8B in year two at 16.2% net margins — triple Hims, with two employees. Replit's CEO confirmed the one-person billion-dollar company has shipped. Nine independent newsletters covered the same story this week. The breadth of attention is itself the signal.
At the same time, Google released Gemma 4 under Apache 2.0: a 31B dense model tied with Kimi K2.5 (1T) and GLM-5 (744B) on Arena, and a 26B MoE that activates 3.8B parameters and runs at 162 tok/s on a single 4090. OpenAI killed Sora after burning a million dollars a day and watching DAUs halve. GitHub's effective uptime cratered to about 90% (roughly 2.5 hours of daily degradation) because Claude Code traffic grew 6x in three months and the platform's stateful infrastructure was built for humans. Wharton, Apple ML, and Anthropic all published the same uncomfortable finding: chain-of-thought prompting on reasoning models buys you 2.9% accuracy at a 20–80% latency cost, and on Gemini 2.5 Flash it's net negative.
None of these are isolated stories. They are the same story told from six angles: the moat moved, and most teams haven't noticed where it went.
What stopped being defensible
The model layer. Gemma 4 under Apache 2.0 means the next Medvi-style founder doesn't even pay for inference if they're willing to self-host. Qwen3.6-Plus is matching Opus 4.5 on SWE-bench. Sebastian Raschka's reverse-engineering shows Gemma 4 31B is architecturally near-identical to Gemma 3 27B — the jump came from training recipe, not architecture. Apple's Simple Self-Distillation paper got a 12.9pp gain on LiveCodeBench by fine-tuning Qwen3-30B on its own outputs with no filtering, no RL, no verifier. Sample, train, ship. That's the whole method.
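A minimal sketch of the "sample" step in sample, train, ship, assuming nothing about the paper's actual sampling settings. The `generate` callable is a hypothetical stand-in for whatever inference call you already have, and the record field names are illustrative. The fine-tuning step itself is out of scope here.

```python
import json
from typing import Callable, Iterable


def build_self_distillation_set(
    prompts: Iterable[str],
    generate: Callable[[str], str],  # hypothetical: wraps your own model call
    samples_per_prompt: int = 1,
) -> list[dict]:
    """Sample the model on each prompt and keep every output, unfiltered."""
    records = []
    for prompt in prompts:
        for _ in range(samples_per_prompt):
            # No verifier, no reward model, no filtering: every sample is kept.
            records.append({"prompt": prompt, "completion": generate(prompt)})
    return records


def to_jsonl(records: list[dict]) -> str:
    """Serialize records as JSON Lines, one training example per line."""
    return "\n".join(json.dumps(r) for r in records)
```

The point of the sketch is how little machinery sits between raw samples and a training file. Whether the gain transfers to your model is something only the experiment can tell you.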
Headcount as a proxy for capability. Two people did $401M in revenue. Chatbase hit $9M ARR with 18 people and zero outside capital. RevenueCat saw 40%+ growth in net-new developers shipping production apps in March alone. The minimum viable team for a competitive business has collapsed into single digits in any vertical where the regulated layer is API-accessible.
Prompt-engineering folklore. If you're still telling reasoning models to think step by step, you're paying 30–70x per query for accuracy that's flat or negative. Reasoning traces hide shortcut usage 61–75% of the time, and unfaithful traces are longer than faithful ones — the most polished output is the one most likely to be wrong. Stop trusting the trace. Verify the answer.
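A prompt audit can start as a plain text pass. The sketch below flags and strips common chain-of-thought instructions from a prompt library. The phrase list and the function names are assumptions for illustration, so extend the patterns from your own prompts before trusting the result.

```python
import re

# Phrases treated as chain-of-thought instructions. Illustrative, not exhaustive.
COT_PATTERNS = [
    r"let'?s think step[- ]by[- ]step\.?",
    r"think (?:through this )?step[- ]by[- ]step\.?",
    r"show your (?:reasoning|work)\.?",
    r"explain your reasoning before answering\.?",
]
_COT_RE = re.compile("|".join(COT_PATTERNS), flags=re.IGNORECASE)


def strip_cot(prompt: str) -> str:
    """Remove chain-of-thought instructions and collapse leftover spaces."""
    cleaned = _COT_RE.sub("", prompt)
    return re.sub(r"[ \t]{2,}", " ", cleaned).strip()


def audit(prompts: dict[str, str]) -> dict[str, str]:
    """Return the stripped form of every prompt that contained a CoT phrase."""
    return {
        name: strip_cot(text)
        for name, text in prompts.items()
        if _COT_RE.search(text)
    }
```

Run `audit` over the prompts that hit reasoning endpoints only, then compare answer accuracy before and after on a held-out set rather than reading the traces.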
What started mattering more
The harness. Hermes Agent's pluggable memory across seven backends. LangChain shipping Claude Code → LangSmith tracing. Cursor 3 rebuilt as a multi-agent fleet manager. The model-harness training loop — capture traces, fine-tune an open model on your domain, deploy, repeat — is the actual flywheel. Apache 2.0 makes those traces yours to train on. If you're not logging structured execution traces from every agent run today, you're burning training signal you'll wish you had next quarter.
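Structured trace logging does not need a platform to get started. Here is a minimal sketch that appends one JSON line per agent run. The schema (`task`, `model`, `steps`, `outcome`) and the class names are assumptions for illustration, not the format of any tool named above.

```python
import json
import time
import uuid
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class TraceStep:
    kind: str      # e.g. "model_call", "tool_call", "tool_result"
    payload: dict  # must be JSON-serializable
    ts: float = field(default_factory=time.time)


@dataclass
class AgentTrace:
    task: str
    model: str
    run_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    steps: list = field(default_factory=list)
    outcome: Optional[str] = None

    def record(self, kind: str, **payload) -> None:
        """Append one step to the in-memory trace."""
        self.steps.append(TraceStep(kind, payload))

    def to_json(self) -> str:
        """Serialize the whole run, nested steps included, as one JSON object."""
        return json.dumps(asdict(self))

    def append_to(self, path: str) -> None:
        """Append the run as a single line to a JSONL file."""
        with open(path, "a", encoding="utf-8") as f:
            f.write(self.to_json() + "\n")
```

One line per run keeps the file trivially appendable and easy to filter later by `outcome` when you assemble a fine-tuning set.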
Routing. The inverted-U Apple ML documented is real: standard models beat reasoning models on easy tasks, reasoning wins the middle, both collapse on the hardest problems. A complexity classifier in front of your endpoints is now a six-figure infrastructure decision. DeepSeek R1 at $2.19/M tokens versus o3-mini at $4.40/M is a 2x gap with comparable quality on AIME. Teams that route well will quietly free up budget to outspend the teams that don't on everything else.
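To make the routing idea concrete, here is a rough sketch with a keyword heuristic standing in for a trained complexity classifier. The hint list, the thresholds, and the endpoint labels are all hypothetical. The only figures taken from the text are the two per-million-token prices.

```python
from dataclasses import dataclass

# Keywords that hint at multi-step work. Purely illustrative.
HARD_HINTS = ("prove", "derive", "optimize", "multi-step", "debug")


@dataclass(frozen=True)
class Route:
    tier: str      # "standard", "reasoning", or "escalate"
    endpoint: str  # hypothetical endpoint label, not a real model ID


def complexity_score(prompt: str) -> int:
    """Crude heuristic score; a placeholder for a real classifier."""
    text = prompt.lower()
    score = len(text.split()) // 100              # one point per 100 words
    score += sum(hint in text for hint in HARD_HINTS)
    score += 1 if text.count("?") > 1 else 0      # several questions at once
    return score


def route(prompt: str, easy_max: int = 0, medium_max: int = 2) -> Route:
    """Follow the inverted-U: standard for easy, reasoning for the middle."""
    score = complexity_score(prompt)
    if score <= easy_max:
        return Route("standard", "standard-model")
    if score <= medium_max:
        return Route("reasoning", "reasoning-model")
    # Both model classes degrade on the hardest problems, so escalate instead.
    return Route("escalate", "human-review")


def token_cost(tokens: int, usd_per_million: float) -> float:
    """Cost in USD for a token volume at a per-million-token price."""
    return tokens / 1_000_000 * usd_per_million
```

At a billion tokens a month, `token_cost` puts the two quoted prices at roughly $2,190 versus $4,400, which is the gap a router is competing for on the reasoning tier alone.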
Proprietary data and the regulated edge. Medvi outsourced doctors and pharmacy to CareValidate and OpenLoop. The interesting investment isn't Medvi — it's the picks-and-shovels companies that make the regulated layer API-accessible for the next hundred Medvis. Anthropic paid $400M for an 8-month-old, pre-revenue bio startup. That's the number to remember when someone tells you vertical AI is overpriced.
Resilience. GitHub at 90% is your CI/CD's new ceiling. Microsoft absorbed the platform into its AI group, eliminated the CEO role, and let Copilot fall to third behind Claude Code and Cursor. Three of the February–March incidents were failover paths that worked in testing and broke in production — the classic distributed-systems failure mode. Mirror your top repos to a second remote this week. Stand up self-hosted runners for deployment-critical pipelines. Git is distributed by design; you've been using it as if it weren't.
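Mirroring is a few git commands per repository. The sketch below wraps them in Python so the same loop can run over a list of repos. The remote name and URL are placeholders, and `dry_run` defaults to printing the commands instead of executing them.

```python
import subprocess


def mirror_commands(remote_name: str, remote_url: str) -> list[list[str]]:
    """Build the git commands that push all branches and tags to a second remote."""
    return [
        # Fails if the remote already exists; remove this step on later runs.
        ["git", "remote", "add", remote_name, remote_url],
        ["git", "push", remote_name, "--all"],
        ["git", "push", remote_name, "--tags"],
    ]


def mirror_repo(
    repo_dir: str,
    remote_name: str,
    remote_url: str,
    dry_run: bool = True,
) -> list[list[str]]:
    """Print the commands (dry run) or execute them inside repo_dir."""
    commands = mirror_commands(remote_name, remote_url)
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(cmd, cwd=repo_dir, check=True)
    return commands
```

Branches and tags are pushed separately because older git versions reject `--all` combined with `--tags`. Schedule the push steps so the mirror stays current, otherwise it only helps on the day you set it up.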
The macro that breaks the optimistic model
Three major inference providers (Google, Amazon, and Anthropic) all throttled simultaneously. Kent Beck's read is correct: the binding constraint is investor patience, not silicon. Andreessen says the AI supply chain is sold out for three to four years and old Nvidia chips are appreciating. Power users are spending $1,000/day on Claude tokens. Meta committed $27B to a single 7.5GW data center. Oil hit $111. Blue Owl gated private credit redemptions on software-company exposure.
The assumption that inference costs decline on a steep curve — the assumption underneath most 2026 AI roadmaps — is fragile. Build for the case where costs plateau. Stress-test pricing against flat unit economics for two years.
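One way to run that stress test is to compute gross margin per unit under two cost paths: the decline your roadmap assumes and a flat line. The sketch below does that over eight quarters. Every input in the usage note is a made-up placeholder, not a figure from this piece.

```python
def margin_path(
    price: float,
    unit_cost: float,
    annual_cost_decline: float,
    quarters: int = 8,
) -> list[float]:
    """Gross margin per quarter, with the annual cost decline compounded quarterly."""
    quarterly_factor = (1.0 - annual_cost_decline) ** 0.25
    margins = []
    cost = unit_cost
    for _ in range(quarters):
        margins.append((price - cost) / price)
        cost *= quarterly_factor
    return margins


# Hypothetical inputs: $10 price, $6 inference cost per unit.
planned = margin_path(10.0, 6.0, annual_cost_decline=0.5)  # roadmap assumption
flat = margin_path(10.0, 6.0, annual_cost_decline=0.0)     # plateau case
```

With these placeholder inputs the planned path climbs from a 40% margin to about 70% after four quarters, while the flat path stays at 40% throughout. If your pricing only works on the first path, the plateau case is the one to plan around.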
What to do this week
One exercise. Pick your top three revenue lines. For each, write a one-page Medvi threat model: what could a 5-person AI-native team build against this with $50K and 60 days, and which of your defenses survive? If the answer is "most of what we do, at 1/100th the cost," you have a quarter — not a year — to decide what's actually proprietary and double down on it. Everything else is a legacy tax a competitor will arbitrage.
Then audit your prompts. Strip chain-of-thought from anything hitting a reasoning endpoint. Ship a complexity router in front of your model selection. Run Apple's self-distillation method on your best fine-tuned model — the experiment costs hours and the upside is double-digit accuracy gain. Mirror your critical repos. Log every agent trace. Pick one vertical where your regulatory or data moat is genuine and invest there like it's the only thing you own.
Because it is.
◆ Behind the synthesis
Six specialist takes that fed this piece.
The piece above is one stream in my voice. Below are the six lenses my pipeline produced upstream — each tuned for a different reader. Use them when you want the angle that matters most to your role.
-
GitHub's availability has cratered to roughly one nine (~90%) — about 2.5 hours of degradation per day — driven by a 6x surge in AI agent traffic over three months.
GitHub is now your riskiest infrastructure dependency at ~90% effective uptime from AI agent traffic — map your blast radius and build mirrors this sprint. Meanwhile, chain-of-thou…
-
AI-powered offensive operations crossed from theoretical to operational: a Chinese state group ran the first documented autonomous AI espionage campaign — executing 80-90% of tactical operations against 30 global targets via Claude Code — while CyberStrikeAI breached 600+ FortiGates across 55 countries and Google reported attacker dwell time has collapsed to 22 seconds.
AI-powered offensive operations are now operational — a Chinese state group autonomously espionaged 30 targets with AI executing 80-90% of the work, CyberStrikeAI breached 600+ For…
-
Google's Gemma 4 31B matches trillion-parameter models at 1/30th the size under Apache 2.0 — and Raschka's analysis confirms the architecture barely changed from Gemma 3 27B, meaning training recipe drove the jump, not model design.
Gemma 4 proves training recipe beats architecture (31B matching trillion-parameter models under Apache 2.0), Apple proves self-distillation beats model swaps (+12.9pp for free), an…
-
A solo founder spent $20K, hired his brother, and built a $1.8B-run-rate telehealth company using AI for every function — code, ads, customer service, analytics.
A 2-person startup hit $1.8B in revenue using $20K of AI tools while three major inference providers throttled simultaneously — proving build costs have collapsed to near zero but…
-
A 2-person company just hit $1.8B in revenue using a $20K AI tool stack — and Google releasing frontier-competitive Gemma 4 under Apache 2.0 this week means the cost to replicate this model dropped to zero licensing.
A two-person company hit $1.8B in revenue this year using a $20K AI tool stack, and Google just made frontier-competitive models free under Apache 2.0 — collapsing the cost to repl…
-
A telehealth company built for $20K with 2 employees is on pace for $1.8B in 2026 revenue — the same week OpenAI shut down Sora after burning $1M/day with halving DAUs and killed a $1B Disney partnership.
The AI industry violently sorted itself this week: a $20K telehealth startup is hitting $1.8B in revenue with 2 employees while OpenAI burned $1M/day on Sora before killing it, Ant…