When you do the math, humans still rule

A team of 11 prominent mathematicians led by Harvard’s Lauren Williams has launched First Proof, an independent challenge. On Feb. 5 it released 10 recently solved but unpublished research problems spanning number theory, algebraic combinatorics, spectral graph theory, symplectic topology and numerical linear algebra; the encrypted solutions are to be revealed on Feb. 13. In preliminary testing, GPT-5.2 Pro and Gemini 3.0 Deepthink solved only two of the 10 problems. The result underscores the limits of current AI in making creative conceptual advances, even as it performs well on algorithmic or well-trodden tasks. It is likely to temper claims that AI will imminently replace top research talent, and to modestly affect narratives around AI capability rather than near-term market fundamentals.

Analysis

Market structure: Near-term winners remain AI infrastructure and cloud incumbents (NVDA, GOOGL, MSFT), because research-level failures by LLMs blunt any near-term revenue re-ramp for app-layer vendors. Expect chip pricing power to persist, with NVDA benefiting from constrained supply and multi-year data-center demand. Losers are small-cap pure-play “AI replaces experts” vendors and consultancies that priced steep revenue growth into valuations; those business models need human-in-the-loop spend to remain viable.

Risk assessment: Tail risks include a rapid breakthrough (10–15% probability in 12 months) that materially accelerates automation, and regulatory/patent litigation risk (20–25% over 2 years) that could slow model deployment.

Hidden dependencies: Progress hinges on compute availability, labeled data quality, and specialized researcher talent. Model-release cadence (e.g., GPT/Gemini upgrades) and the Feb. 13 reveal are high-information catalysts that could move sentiment within days.

Trade implications: Tactical overweight in infrastructure for 6–12 months: NVDA (2–3% of portfolio), GOOGL (1–2%), MSFT (1–2%). Hedge narrative risk with small put protection or short exposure to demonstrably overpromised names (example: PLTR), sized below 1% notional. Use options to express asymmetric views: buy 3-month debit call spreads on NVDA sized at 0.5–1% of the portfolio if the IV percentile is below 70; target a +25–40% return and stop out at a loss of 50% of the premium.

Contrarian angles: Consensus underestimates durable demand for human-in-the-loop services (legal, biotech, consulting). Consider a long position in ACN (1% weight, 6–12 months) as a play on outsourced expert workflows. Historical parallels (the automation hype cycles of 2000 and 2012) show infrastructure winners consolidating while many app-layer names disappoint, so prefer balance-sheet-strong platform bets over speculative AI narratives.
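The sizing rules in the trade implications can be turned into dollar figures with a few lines of arithmetic. The sketch below is illustrative only: the portfolio value, the function names, and the reading of the +25–40% target and the 50% stop as percentages of the premium paid are all assumptions, not details given in the analysis.

```python
from dataclasses import dataclass


@dataclass
class SpreadPlan:
    premium_budget: float  # dollars paid for the debit call spread
    stop_value: float      # spread value at which the stop triggers
    target_low: float      # spread value at the lower return target
    target_high: float     # spread value at the upper return target


def equity_allocation(portfolio_value, weight_range):
    """Dollar range for a position given a (low, high) weight range."""
    low, high = weight_range
    return portfolio_value * low, portfolio_value * high


def debit_spread_plan(portfolio_value, size=0.005,
                      stop_loss=0.50, target=(0.25, 0.40)):
    """Premium budget plus exit levels, all measured against premium paid.

    Assumption: the stop and the targets are percentages of the premium,
    which is one plausible reading of the rule in the text.
    """
    premium = portfolio_value * size
    return SpreadPlan(
        premium_budget=premium,
        stop_value=premium * (1 - stop_loss),
        target_low=premium * (1 + target[0]),
        target_high=premium * (1 + target[1]),
    )


def spread_allowed(iv_percentile, ceiling=70.0):
    """Entry filter: only buy the spread when IV percentile is below the ceiling."""
    return iv_percentile < ceiling


if __name__ == "__main__":
    portfolio = 1_000_000  # hypothetical portfolio value in dollars
    print(equity_allocation(portfolio, (0.02, 0.03)))  # NVDA weight range
    if spread_allowed(iv_percentile=55.0):             # hypothetical IV reading
        print(debit_spread_plan(portfolio, size=0.005))
```

At the 0.5% sizing, the most the spread can lose is the premium itself, so the 50% stop limits the planned loss to roughly 0.25% of the portfolio under these assumptions.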
