Apple’s upgraded AI models underwhelm on performance

Apple's newly announced AI models, which power Apple Intelligence features, match or trail existing models from competitors such as OpenAI, Google, and Meta, according to Apple's own benchmarks. The on-device model was rated comparable to models from Google and Alibaba, while the server model lagged behind OpenAI's GPT-4o and, in image analysis, Meta's Llama 4 Scout. Apple reports improvements in tool use, efficiency, and language understanding from an expanded training dataset, but the benchmark results reinforce concerns about its ability to compete in the rapidly evolving AI landscape, especially given past delays and customer lawsuits related to AI capabilities.

Analysis

Apple (AAPL) recently unveiled two AI models for its Apple Intelligence suite, "Apple On-Device" and "Apple Server," and the company's own published benchmarks show a performance deficit relative to established competitors. Human testers rated the text generation quality of the offline "Apple On-Device" model (approximately 3 billion parameters) as "comparable" to, but not better than, similarly sized models from Google (GOOGL, GOOG) and Alibaba (BABA). More critically, Apple's data center model, "Apple Server," underperformed OpenAI's year-old GPT-4o. In image analysis, human evaluators preferred Meta's (META) Llama 4 Scout over "Apple Server," even though Llama 4 Scout itself generally lags behind top-tier models from other AI labs.

These results support concerns that Apple's AI research division is struggling to keep pace in a highly competitive field. Those concerns are compounded by earlier underwhelming AI features, an indefinitely delayed Siri upgrade, and customer lawsuits alleging unfulfilled AI promises. Apple notes improvements in tool use, efficiency, and multilingual capabilities (around 15 languages) due to an expanded training dataset, but the comparative benchmark performance suggests a persistent gap with leading AI developers.
