AI vs Human: Bernstein on how to use LLMs

Bernstein's latest research indicates that Large Language Models (LLMs) are highly effective for information synthesis and standardized financial tasks, such as summarizing earnings calls and processing data, and that iterative, structured prompting significantly improves their performance. However, LLMs remain limited in judgment-intensive work such as building investment theses or company models, where they frequently produce errors and lack analytical depth. Complex financial analysis therefore still requires human oversight and qualitative judgment.

Analysis

A new Bernstein research report draws a clear functional boundary for the current use of Large Language Models (LLMs) in financial analysis. The models perform well on standardized information-synthesis tasks, such as summarizing multiple years of earnings calls, where iterative prompting raised average performance scores from 3.8 to 4.3 out of 5. They consistently underperform on tasks that require deep analytical judgment and qualitative assessment. Attempts to use AI to build investment theses or company models produced outputs with factual errors and little analytical depth, and structured prompting lifted scores only marginally, from 3.0 to 3.4. The research also stresses how heavily results depend on prompt quality, citing a University of Southern California study in which minor phrasing changes altered up to 8.5% of responses. AI has surpassed human performance in structured domains such as IT helpdesk support (AI SelfScore of 29.4 vs. 23.1 for humans) and specific medical diagnostics (F1 score of 0.886 vs. 0.838 for humans). In finance, however, its application remains constrained by its inability to effectively process proprietary "walled data" and to replicate nuanced human judgment, which reinforces the continued need for human oversight of complex investment decisions.
