Market Impact: 0.55

MU, WDC, SNDK fall: Why Google’s TurboQuant is rattling memory stocks

Tickers: SNDK, MU, WDC, STX, GOOGL, GOOG, WFC
Tags: Artificial Intelligence, Technology & Innovation, Product Launches, Analyst Insights, Company Fundamentals, Analyst Estimates, Investor Sentiment & Positioning

Memory stocks sold off after Google's TurboQuant announcement: SanDisk fell 5.7%, Micron 3%, Western Digital 4.7%, and Seagate 4%, even as the Nasdaq 100 rose. Google claims TurboQuant can compress a model's key-value (KV) cache to 3 bits (a 6x reduction in tested models) and deliver up to 8x inference performance on H100 GPUs without retraining. TurboQuant will be presented at ICLR 2026, and the related PolarQuant at AISTATS 2026. Analysts are split: Wells Fargo calls the technology bullish for the AI cost curve if broadly adopted, while Lynx argues it may not materially reduce DRAM/flash demand over the next 3–5 years given supply constraints, and Citrini Research and others likewise downplay immediate structural demand destruction. Lynx reiterated its $700 Micron 2028 view and flagged buyer interest on the pullback.
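Google has not published TurboQuant's algorithm, so the sketch below is a generic low-bit KV-cache quantization scheme (per-group absmax scaling to 3-bit integers), intended only to illustrate the kind of compression the announcement describes. All function names, the group size, and the toy tensor shape are illustrative assumptions, not TurboQuant's actual design.

```python
import numpy as np

def quantize_3bit(x, group_size=64):
    """Quantize a float tensor to 3-bit codes (0..7) with per-group
    absmax scaling. A generic sketch, NOT TurboQuant's actual method."""
    flat = x.reshape(-1, group_size)
    # Map each group's range [-absmax, absmax] onto [-3.5, 3.5] so that
    # rounding lands on the 8 representable 3-bit levels.
    scale = np.abs(flat).max(axis=1, keepdims=True) / 3.5
    scale[scale == 0] = 1.0
    q = np.clip(np.round(flat / scale + 3.5), 0, 7).astype(np.uint8)
    return q, scale

def dequantize_3bit(q, scale, shape):
    """Reconstruct an approximate float tensor from 3-bit codes."""
    return ((q.astype(np.float32) - 3.5) * scale).reshape(shape)

# Toy KV-cache slice: 4 heads x 128 cached positions (hypothetical sizes).
kv = np.random.randn(4, 128).astype(np.float32)
q, s = quantize_3bit(kv)
kv_hat = dequantize_3bit(q, s, kv.shape)
err = np.abs(kv - kv_hat).max()  # bounded by half a quantization step
```

In a real system the 3-bit codes would also be bit-packed (e.g. eight codes per three bytes) to realize the storage saving; this sketch stops at the codebook arithmetic.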

Analysis

Google (and other large cloud/inference operators) is the asymmetric beneficiary: a software-led reduction in per-inference memory cost acts like a non-dilutive margin-expansion lever for inference at scale and can be deployed incrementally within existing data centers. The advantage is not binary; it compounds over millions of daily inferences, letting operators either cut prices to win share or bank higher incremental margins to fund more model scale, with meaningful P&L impact visible within 6–12 months if adoption is broad.

Memory suppliers face a two-path dynamic. In the near term (days to months), headline uncertainty can compress multiples and trigger volatile inventory markdowns. Over the medium term (12–36 months), constrained wafer and fab lead times mean even a successful compression technique may not fully eliminate DRAM/NAND demand, because training workloads, model proliferation, and non-KV use cases keep growing. That sets up a scenario where prices fall first, capex is deferred, and a supply pinch then re-emerges: a classic boom/bust capex cycle for memory makers.

Second-order winners include cloud cost-optimization vendors, inference-optimized ASIC designers, and software-layer middleware that monetizes reduced memory per token. Losers include mid-cycle DRAM/NAND OEMs and third-party hardware resellers who sell on capacity rather than differentiated performance.

Key catalysts to watch are adoption signals from hyperscalers (product rollouts, pricing moves), independent benchmarks in production-like environments, and patent/licensing stances that determine whether this becomes an industry standard or a Google-specific moat. Tail risks cut both ways: rapid, broad adoption could structurally lower long-term memory ASPs and accelerate consolidation, while slow or fragile real-world results, or proprietary lock-in, would restore the status quo and spark a mean-reversion rally in memory names.
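The per-inference memory saving behind that margin argument can be sized with back-of-envelope arithmetic. The model dimensions below are hypothetical (the article does not name the tested models); only the roughly 6x reduction figure comes from the announcement.

```python
# Back-of-envelope KV-cache sizing for one long-context request, under
# assumed (hypothetical) model dimensions; only the ~6x reduction claim
# comes from Google's announcement.
layers, kv_heads, head_dim, seq_len = 32, 8, 128, 8192   # assumptions
elems = 2 * layers * kv_heads * head_dim * seq_len       # K and V tensors
fp16_gib = elems * 2 / 2**30   # 16-bit baseline, 2 bytes per element
q3_gib = fp16_gib / 6          # at the article's claimed ~6x reduction
print(f"fp16 KV cache: {fp16_gib:.2f} GiB -> ~{q3_gib:.2f} GiB compressed")
```

Per request the saving looks small, but it scales linearly with concurrent sequences: a GPU that fit N long-context requests in HBM could fit roughly 6N, which is the mechanism behind the claimed throughput and cost gains.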
Market reaction will likely unfold in sequence: short-term headline repricing, analyst model updates over 1–3 quarters, and supply-side responses over 12–36 months.