
Large language models (LLMs) are increasingly deployed as therapists, but the author argues that they can handle only simple support tasks and cannot replicate the continuous assessment, expert judgment and intervention skills required for rare, complex or high‑risk psychological cases. Even small failure rates are unacceptable in clinical care; the piece draws an analogy to a 0.01% failure gap in driving safety. The article contends that LLMs prioritize plausibility over accuracy and that closing the remaining gap toward full effectiveness would demand exponentially more training data, making it effectively intractable. It also cites documented harms, including rises in suicidal ideation and suspicion of psychosis in some users. As a result, LLMs may be useful adjuncts but cannot replace professional psychologists, and training and service models should refocus on identification, prevention and treatment of the small subset of cases LLMs cannot safely manage.
The article argues that large language models (LLMs) are being deployed as therapists and can provide simple support but cannot replicate the continuous assessment, expert judgment and intervention skills of professional psychologists; it cites documented harms, including increases in suicidal ideation and suspicion of psychosis linked to LLM therapy (Regehr et al., 2022). The author emphasizes that most clinical value resides in identifying and managing a small subset of high‑risk or complex cases that require intensive training, and that clients often cannot observe this ongoing risk management. Technically, the piece asserts that LLMs prioritize plausibility over accuracy (van Rooij and Guest, 2025) and that closing the remaining gap toward full effectiveness would require exponentially more training data, an asymptotic barrier described by Guerzhoy (2024) and framed via a 0.01% failure analogy from driving safety. A distinction is drawn between actuarial models, which focus on accuracy, and LLMs, which are optimized for naturalistic language generation, implying persistent reliability limits for clinical use. The implications for the market and care models are that LLMs may serve as adjuncts but not replacements, that clinician training shifts toward rare, complex cases, and that regulation, liability and validation become key risk vectors for AI mental‑health vendors. Sentiment signals are moderately negative (score -0.5) with a modest market impact score (0.28) and no direct per‑ticker bias for UBER (0.0), suggesting limited immediate equity disruption but meaningful long‑term operational and reputational risk for pure‑play LLM therapy providers.
Overall Sentiment: moderately negative
Sentiment Score: -0.50
Ticker Sentiment