Market Impact: 0.55

Microsoft stitches transactional databases to Fabric analytics system

MSFT, IT, PLTR
Artificial Intelligence, Technology & Innovation, Cybersecurity & Data Privacy, Product Launches, Company Fundamentals

Microsoft is integrating its Cosmos DB and SQL Server transactional databases into its Fabric analytics platform, aiming to streamline AI and analytics workloads by bringing them closer to transactional data. This integration leverages the Apache Parquet file format within Microsoft's OneLake environment, eliminating the need for data replication and facilitating real-time machine learning model development directly on top of the data. Gartner notes this move simplifies integration within data management infrastructure, enabling near real-time reporting on operational data and supporting vector indexes for GenAI applications.

Analysis

Microsoft is extending its Fabric analytics and data lake platform by integrating its core transactional databases, SQL Server and Cosmos DB, a move designed to streamline the development of AI features within transactional systems. The integration, unveiled at Microsoft's Build conference, centralizes data from diverse sources (including SQL Server, Cosmos DB, data warehouses, and data lakes) within the OneLake environment, using the open-source Apache Parquet and Delta Lake formats.

Arun Ulag, Azure's corporate vice president for data, emphasized that co-locating transactional data with analytics workloads removes the need for data replication and allows machine learning models to be developed directly on current data, improving efficiency and speed. For instance, incorporating Cosmos DB's global secondary index into Fabric is expected to accelerate queries and reduce latency without adversely affecting transactional performance.

Gartner Senior Director Aaron Rosenbaum framed the development as part of a broader industry trend toward simplified and automated data management integration, noting that the 'near real-time' replication allows direct reporting on operational data in Power BI alongside other Fabric assets. Rosenbaum also underscored Cosmos DB's pivotal role in Azure-based GenAI applications, citing its support for real-time interactions and vector indexes, whose embeddings can be transferred to Fabric for analytics.

While Microsoft continues to support PostgreSQL extensions for document databases to offer customer choice, Ulag reiterated that SQL Server and Cosmos DB remain the company's primary offerings in their respective domains, a strategy that reinforces its core data services while preserving flexibility.