Market Impact: 0.35

Microsoft takes on AI rivals with three new foundational models

MSFT, GOOGL, GOOG
Artificial Intelligence, Technology & Innovation, Product Launches, Antitrust & Competition, Company Fundamentals, Management & Governance

Microsoft AI released three foundational models: MAI-Transcribe-1 (speech-to-text), MAI-Voice-1 (audio generation) and MAI-Image-2 (image generation). All three are now available on Microsoft Foundry, and two are also on MAI Playground. MAI-Transcribe-1 is marketed as 2.5x faster than Azure Fast and starts at $0.36 per hour; MAI-Voice-1 can generate 60 seconds of audio in 1 second and is priced at $22 per 1M characters; MAI-Image-2 is priced at $5 per 1M text-input tokens and $33 per 1M image-output tokens. The models were built by the MAI Superintelligence team led by Mustafa Suleyman (formed Nov 2025). Microsoft has invested more than $13B in OpenAI and says it remains committed to that partnership while pursuing an in-house stack to compete with Google and OpenAI.
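To make the pricing concrete, the list prices above can be turned into a rough cost estimate. The Python sketch below is illustrative only: the function names and workload sizes are hypothetical, the transcription figure is a starting price, and actual bills may differ with tiers, discounts, or billing granularity that the article does not describe.

```python
# Rough cost estimator using the list prices quoted in the article.
# All function names and workload sizes are hypothetical illustrations.

TRANSCRIBE_USD_PER_AUDIO_HOUR = 0.36          # MAI-Transcribe-1 ("starts at")
VOICE_USD_PER_MILLION_CHARS = 22.0            # MAI-Voice-1
IMAGE_USD_PER_MILLION_INPUT_TOKENS = 5.0      # MAI-Image-2, text input
IMAGE_USD_PER_MILLION_OUTPUT_TOKENS = 33.0    # MAI-Image-2, image output


def transcribe_cost(audio_hours: float) -> float:
    """Estimated USD cost of transcribing the given hours of audio."""
    return audio_hours * TRANSCRIBE_USD_PER_AUDIO_HOUR


def voice_cost(characters: int) -> float:
    """Estimated USD cost of synthesizing speech from the given characters."""
    return characters / 1_000_000 * VOICE_USD_PER_MILLION_CHARS


def image_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of image generation for the given token counts."""
    return (
        input_tokens / 1_000_000 * IMAGE_USD_PER_MILLION_INPUT_TOKENS
        + output_tokens / 1_000_000 * IMAGE_USD_PER_MILLION_OUTPUT_TOKENS
    )


if __name__ == "__main__":
    # Hypothetical monthly workload, chosen only to show the arithmetic.
    print(f"Transcription, 1,000 audio hours: ${transcribe_cost(1_000):,.2f}")
    print(f"Voice, 2M characters: ${voice_cost(2_000_000):,.2f}")
    print(f"Images, 1M input + 1M output tokens: ${image_cost(1_000_000, 1_000_000):,.2f}")
```

At these list prices, 1,000 hours of transcription comes to about $360, 2M characters of synthesized speech to about $44, and 1M input plus 1M output image tokens to about $38.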

Analysis

Microsoft's push into a proprietary multimodal stack is less about immediate revenue than about control over margins and distribution over the next 12–24 months. Owning both the models and the cloud path lets Microsoft internalize a larger share of incremental AI spend and compress the take-rates of independent model hosts. Expect Azure consumption to get a non-linear lift from integrated product hooks (Office/Teams/Foundry) that can convert trial usage into paid enterprise contracts, materially improving blended ARR growth and gross margins if adoption follows through.

The competitive second-order hit falls on OpenAI/Google economics and on third-party model marketplaces: aggressive pricing and bundling from Microsoft will force rivals into either cheaper wholesale deals or higher sales and marketing spend, compressing their free-cash-flow conversion. Hardware winners and losers hinge on mix: higher Azure-hosted inference lifts demand for datacenter GPUs, but Microsoft's push toward custom silicon and closer supply partnerships creates a durable advantage in controlling cost per token and service latency.

Key risks are regulatory scrutiny of bundling and exclusivity, a material technical shortfall in MAI model quality versus incumbents, or a re-tightening of the OpenAI contractual relationship; any of these could reverse the narrative quickly (days to months). Watch adoption telemetry (Foundry ARR cadence, Azure AI consumption growth, enterprise pilot-to-paid conversion) over the next 3–9 months. Failure to accelerate should flatten the re-rating and amplify downside for MSFT while removing pricing pressure on GOOG/GOOGL.