Q1 2024 NVIDIA Corp Earnings Call

May 24, 2023

Speaker 1: May 24, 2023, based on information currently available to us. Except as required by law, we assume no obligation to update any such statements. During this call, we will discuss non-GAAP financial measures. You can find a reconciliation of these non-GAAP financial measures to GAAP financial measures in our CFO commentary, which is posted on our website. And with that, let me turn the call over to Colette.

Colette M. Kress: Thanks, Simona. Q1 revenue was $7.19 billion, up 19% sequentially and down 13% year on year. Strong sequential growth was driven by record data center revenue, with our gaming and professional visualization platforms emerging from channel inventory corrections. Starting with Data Center, record revenue of $4.28 billion was up 18% sequentially and up 14% year on year on strong growth of our accelerated computing platform worldwide. Generative AI is driving exponential growth in compute requirements and a fast transition to NVIDIA accelerated computing, which is the most versatile, most energy-efficient, and lowest-TCO approach to train and deploy AI. Generative AI drove significant upside in demand for our products, creating opportunities and broad-based global growth across our markets. Let me give you some color across our three major customer categories.

Colette M. Kress: Cloud service providers, or CSPs, consumer internet companies, and enterprises. First, CSPs around the world are racing to deploy our flagship Hopper and Ampere architecture GPUs to meet the surge in interest from both enterprise and consumer AI applications for training and inference. Multiple CSPs announced the availability of H100 on their platforms, including private previews at Microsoft Azure, Google Cloud, and Oracle Cloud Infrastructure, upcoming offerings at AWS, and general availability at emerging GPU-specialized cloud providers like CoreWeave and Lambda. In addition to enterprise AI adoption, these CSPs are serving strong demand for H100 from generative AI pioneers. Second, consumer internet companies are also at the forefront of adopting generative AI and deep learning-based recommendation systems, driving strong growth. For example, Meta has now deployed its H100-powered Grand Teton AI supercomputer for its AI production and research teams.

Colette M. Kress: Third, enterprise demand for AI and accelerated computing is strong. We are seeing momentum in verticals such as automotive, financial services, healthcare, and telecom, where AI and accelerated computing are quickly becoming integral to customers' innovation roadmaps and competitive positioning. For example, Bloomberg announced it has a 50 billion parameter model, BloombergGPT, to help with financial natural language processing tasks such as sentiment analysis, named entity recognition, news classification, and question answering. Auto insurance company CCC Intelligent Solutions is using AI for estimating repairs, and AT&T is working with us on AI to improve fleet dispatches so their field technicians can better serve customers.

Colette M. Kress: Among other enterprise customers using NVIDIA AI are Deloitte for logistics and customer service and Amgen for drug discovery and protein engineering. This quarter, we started shipping DGX H100, our Hopper generation AI system, which customers can deploy on-premises. And with the launch of DGX Cloud through our partnership with Microsoft Azure, Google Cloud, and Oracle Cloud Infrastructure, we deliver the promise of NVIDIA DGX to customers from the cloud.

Colette M. Kress: Whether customers deploy DGX on-premises or via DGX Cloud, they get access to NVIDIA AI software, including NVIDIA Base Command, end-to-end AI frameworks, and pre-trained models. We provide them with a blueprint for building and operating AI, spanning our expertise across systems, algorithms, data processing, and training methods. We also announced NVIDIA AI Foundations, which are model foundry services available on DGX Cloud that enable businesses to build, refine, and operate custom large language models and generative AI models trained with their own proprietary data, created for unique domain-specific tasks. They include NVIDIA NeMo for large language models, NVIDIA Picasso for images, video, and 3D, and NVIDIA BioNeMo for life sciences.

Colette M. Kress: Each service has six elements: pre-trained models, frameworks for data processing and curation, proprietary knowledge base vector databases, systems for fine-tuning, aligning, and guardrailing, optimized inference engines, and support from NVIDIA experts to help enterprises fine-tune models for their custom use cases. ServiceNow, a leading enterprise services platform, is an early adopter of DGX Cloud and NeMo. They're developing custom large language models trained on data specifically for the ServiceNow platform. Our collaboration will let ServiceNow create new enterprise-grade generative AI offerings for the thousands of enterprises worldwide running on the ServiceNow platform, including for IT departments, customer service teams, employees, and developers. Generative AI is also driving a step function increase in inference workloads.

Colette M. Kress: Because of their size and complexities, these workloads require acceleration. The latest MLPerf industry benchmark released in April showed NVIDIA's inference platforms deliver performance that is orders of magnitude ahead of the industry, with unmatched versatility across diverse workloads. To help customers deploy generative AI applications at scale, at GTC, we announced four major new inference platforms that leverage the NVIDIA AI software stack. These include the L4 Tensor Core GPU for AI video, L40 for Omniverse and graphics rendering, H100 NVL for large language models, and the Grace Hopper Superchip for LLMs and also recommendation systems and vector databases.

Colette M. Kress: Google Cloud is the first CSP to adopt our L4 inference platform with the launch of its G2 virtual machines for generative AI inference and other workloads, such as Google Cloud Dataproc, Google AlphaFold, and Google Cloud's Immersive Stream, which renders 3D and AR experiences. In addition, Google is integrating our Triton Inference Server with Google Kubernetes Engine and its cloud-based Vertex AI platform. In networking, we saw strong demand at both CSPs and enterprise customers for generative AI and accelerated computing, which require high-performance networking like NVIDIA's Mellanox networking platforms. However, demand relating to general purpose CPU infrastructure remains soft.

Colette M. Kress: As generative AI applications grow in size and complexity, high-performance networks become essential for delivering accelerated computing at data center scale to meet the enormous demands of both training and inferencing. Our 400-gig Quantum-2 InfiniBand platform is the gold standard for AI-dedicated infrastructure, with broad adoption across major cloud and consumer internet platforms such as Microsoft Azure. With the combination of in-network computing technology and the industry's only end-to-end data center scale optimized software stack, customers routinely enjoy a 20% increase in throughput for their sizable infrastructure investment. For multi-tenant clouds transitioning to support generative AI, our high-speed Ethernet platform with BlueField-3 DPUs and Spectrum-4 Ethernet switching offers the highest available Ethernet network performance. BlueField-3 is in production and has been adopted by multiple hyperscale and CSP customers, including Microsoft Azure, Oracle Cloud, CoreWeave, Baidu, and others.

Colette M. Kress: We look forward to sharing more about our 400-gig Spectrum-4 accelerated AI networking platform next week at the Computex conference in Taiwan. Lastly, our Grace data center CPU is sampling with customers. At this week's International Supercomputing Conference in Germany, the University of Bristol announced a new supercomputer based on the NVIDIA Grace CPU Superchip, which is 6x more energy efficient than the previous supercomputer. This adds to the growing momentum for Grace with both CPU-only and CPU-GPU opportunities across AI, cloud, and supercomputing applications. The coming wave of BlueField-3, Grace, and Grace Hopper Superchips will enable a new generation of super energy-efficient accelerated data centers. Now, let's move to gaming. Gaming revenue of $2.24 billion was up 22% sequentially and down 38% year on year.

Colette M. Kress: Strong sequential growth was driven by sales of the 40 Series GeForce RTX GPUs for both notebooks and desktops. Overall, end demand was solid and consistent with seasonality, demonstrating resilience against a challenging consumer spending backdrop. The GeForce RTX 40 Series GPU laptops are off to a great start, featuring four NVIDIA inventions: RTX path tracing, DLSS 3 AI rendering, Reflex ultra-low latency rendering, and Max-Q energy-efficient technologies. They deliver tremendous gains in industrial design, performance, and battery life for gamers and creators. And like our desktop offerings, the 40 Series laptops support the NVIDIA Studio platform of software technologies, including acceleration for creative, data science, and AI workflows, and Omniverse, giving content creators unmatched tools and capabilities. In desktop, we ramped the RTX 4070, which joined the previously launched RTX 4090, 4080, and 4070 Ti GPUs. The RTX 4070 is nearly 3x faster than the RTX 2070 and offers our large installed base a spectacular upgrade.

Colette M. Kress: Last week, we launched the 60 family, RTX 4060 and 4060 Ti, bringing our newest architecture to the world's core gamers starting at just $299. These GPUs, for the first time, provide 2x the performance of the latest gaming console at mainstream price points. The 4060 Ti is available starting today, while the 4060 will be available in July.

Colette M. Kress: Generative AI will be transformative for gaming and content creation, from development to runtime. At the Microsoft Build developer conference earlier this week, we showcased how Windows PCs and workstations with NVIDIA RTX GPUs will be AI-powered at their core. NVIDIA and Microsoft have collaborated on end-to-end software engineering, spanning from the Windows operating system to the NVIDIA graphics drivers and NeMo LLM framework, to help make Windows on NVIDIA RTX Tensor Core GPUs a supercharged platform for generative AI. Last quarter, we announced a partnership with Microsoft to bring Xbox PC games to GeForce NOW. The first game from this partnership, Gears 5, is now available, with more set to be released in the coming months.

Colette M. Kress: There are now over 1,600 games on GeForce NOW, the richest content available on any cloud gaming service. Moving to Pro Visualization, revenue of $295 million was up 31% sequentially and down 53% year on year.

Colette M. Kress: Sequential growth was driven by stronger workstation demand across both mobile and desktop form factors, with strength in key verticals such as public sector, healthcare, and automotive. We believe the channel inventory correction is behind us. The ramp of our Ada Lovelace GPU architecture in workstations kicked off a major product cycle. At GTC, we announced six new RTX GPUs for laptops and desktop workstations, with further rollouts planned in the coming quarters. Generative AI is a major new workload for NVIDIA-powered workstations. Our collaboration with Microsoft transforms Windows into the ideal platform for creators and designers harnessing generative AI to elevate their creativity and productivity. At GTC, we announced NVIDIA Omniverse Cloud, an NVIDIA fully managed service running in Microsoft Azure that includes the full suite of Omniverse applications and NVIDIA OVX infrastructure. Using this full-stack cloud environment, customers can design, develop, deploy, and manage industrial metaverse applications. NVIDIA Omniverse Cloud will be available starting in the second half of this year. Microsoft will also connect Office 365 applications with Omniverse.

Colette M. Kress: Omniverse Cloud is being used by companies to digitalize their workflows, from design and engineering to smart factories and 3D content generation for marketing. The automotive industry has been a leading early adopter of Omniverse, including companies such as BMW Group, Geely Lotus, General Motors, and Jaguar Land Rover. Moving to automotive.

Colette M. Kress: Revenue was $296 million, up 1% sequentially and up 114% from a year ago. Our strong year-on-year growth was driven by the ramp of NVIDIA DRIVE Orin across a number of new energy vehicles. As we announced in March, our automotive design win pipeline over the next six years now stands at $14 billion, up from $11 billion a year ago, giving us visibility into continued growth over the coming years. Sequentially, growth moderated as some NEV customers in China are adjusting their production schedules to reflect slower-than-expected demand growth.

Colette M. Kress: We expect this dynamic to linger for the rest of the calendar year. During the quarter, we expanded our partnership with BYD, the world's leading manufacturer of NEVs. Our new design win will extend BYD's use of DRIVE Orin to its next-generation, high-volume Dynasty and Ocean series of vehicles, set to start production in calendar 2024. Moving to the rest of the P&L. GAAP gross margins were 64.6%, and non-GAAP gross margins were 66.8%. Gross margins have now largely recovered to prior peak levels as we have absorbed higher costs and offset them by innovating and delivering higher-valued products, as well as products incorporating more and more software. Sequentially, GAAP operating expenses were down 3%, and non-GAAP operating expenses were down 1%.

Colette M. Kress: We have held OpEx at roughly the same level over the past four quarters while working through the inventory corrections in gaming and professional visualization. We now expect to increase investments in the business while also delivering operating leverage. We returned $99 million to shareholders in the form of cash dividends.

Colette M. Kress: At the end of Q1, we have approximately $7 billion remaining under our share repurchase authorization through December 2023. Now, let me turn to the outlook for the second quarter of fiscal 2024. Total revenue is expected to be $11 billion, plus or minus 2%. We expect this sequential growth to largely be driven by data center, reflecting a steep increase in demand related to generative AI and large language models. This demand has extended our data center visibility out a few quarters, and we have procured substantially higher supply for the second half of the year. GAAP and non-GAAP gross margins are expected to be 68.6% and 70%, respectively, plus or minus 50 basis points.

Colette M. Kress: GAAP and non-GAAP operating expenses are expected to be approximately $2.71 billion and $1.9 billion, respectively. GAAP and non-GAAP other income and expenses are expected to be an income of approximately $90 million, excluding gains and losses from non-affiliated investments. GAAP and non-GAAP tax rates are expected to be 14%, plus or minus 1%, excluding any discrete items.

Colette M. Kress: Capital expenditures are expected to be approximately $300 to $350 million. Further financial details are included in the CFO commentary and other information available on our IR website. In closing, let me highlight some of the upcoming event. Jensen will give the Computex keynote address in person in Taipei this coming Monday, May 29th local time, which will be Sunday evening in the US. In addition, we will be attending the BofA Global Technology Conference in San Francisco on June 6, and the Rosenblatt Virtual Technology Summit on the Age of AI on June 7th, and the New Street Future of Transportation Virtual Conference on June 12th. Our earnings call to discuss the results of our second quarter, fiscal 24, is scheduled for Wednesday, August 23rd.

Speaker 2: Capital expenditures are expected to be approximately $300 to $350 million.

Speaker 2: Further financial details are included in the CFO commentary and other information available on our IR website. In closing, let me highlight some of the upcoming events. Jensen will give the Computech keynote address in person in Taipei this coming Monday, May 29 local time, which will be Sunday evening in the US.

Speaker 2: In addition, we will be attending the B of A Global Technology Conference in San Francisco on June 6.

Speaker 2: and the Rosenblatt Virtual Technology Summit on the Age of AI on June 7th, and the New Street Future of Transportation Virtual Conference on June 12th. Our earnings call to discuss the results of our second quarter of fiscal 2024 is scheduled for Wednesday, August 23rd.

Operator: Well, that covers our opening remarks. We're now going to open the call for questions. Operator, would you please poll for questions? Thank you. At this time, I'd like to remind everyone that in order to ask a question, press star then the number 1 on your telephone keypad. We ask that you please limit yourself to one question.

Speaker 2: Well, that covers our opening remarks. We're now going to open the call for questions. Operator, would you please poll for questions?

Speaker 3: Thank you. At this time, I'd like to remind everyone that in order to ask a question, press star then the number one on your telephone keypad. We ask that you please limit yourself to one question. We'll pause for just a moment to compile the Q&A roster. We'll take our first question from Toshiya Hari with Goldman Sachs. Your line's open.

Toshiya Hari: We'll pause for just a moment to compile the Q&A roster. We'll take our first question from Toshiya Hari with Goldman Sachs. Your line's open. Hi, good afternoon.

Colette M. Kress: Thank you so much for taking the question, and congratulations on the strong results and incredible outlook. Just one question on the data center. Colette, you mentioned the vast majority of the sequential increase in revenue this quarter will come from data centers. I was curious what the construct is there, and if you could speak to what the key drivers are from April to July. And perhaps more importantly, you talked about visibility into the second half of the year. I'm guessing it's more of a supply problem at this point. What kind of sequential growth beyond the July quarter can your supply chain support at this point? Thank you. Okay, so there are a lot of different questions there. So let me see if I can start.

Speaker 4: Hi, good afternoon. Thank you so much for taking the question and congrats on the strong results and incredible outlook. Just one question on data center. Colette, you mentioned the vast majority of the sequential increase in revenue this quarter will come from data center. I was curious what the construct is there, if you can speak to

Speaker 4: supply chain support at this point. Thank you.

Colette M. Kress: And I'm sure Jensen will have some follow-up comments. So when we talk about the sequential growth that is expected between Q1 and Q2, generative AI and large language models are driving this surge in demand. And it's broad-based across our consumer internet companies, our CSPs, our enterprises, and our AI startups. There is also interest in both of our architectures, both our latest Hopper architecture and our Ampere architecture. This is not surprising, as we generally sell both of our architectures at the same time.

Speaker 2: Okay, so a lot of different questions there, so let me see if I can start, and I'm sure Jensen will have some follow-up comments. So, when we talk about our sequential growth that is expected between Q1 and Q2, generative AI and large language models are driving this surge in demand, and it's a broad...

Speaker 2: This is not surprising, as we generally sell both of our architectures at the same time. This is also a key area where deep recommenders are driving growth, and we also expect to see growth both in our computing as well as in our networking business.

Colette M. Kress: This is also a key area where deep recommenders are driving growth. And we also expect to see growth both in our computing business and in our networking business. So those are some of the key things that we have baked in when we think about the guidance that we've provided for Q2.

Speaker 2: So those are some of the key things that we have baked in when we think about the guidance that we have provided for Q2. We also surfaced in our opening remarks that we are working on both supply today for this quarter, but we have also procured a substantial amount of supply for the second half.

Colette M. Kress: We also mentioned in our opening remarks that we are working on both supply today for this quarter, but we have also procured a substantial amount of supply for the second half. We have some significant supply chain flow to serve our significant customer demand that we see. And this is demand that we see across a wide range of different customers. They are building platforms for some of the largest enterprises but also setting things up at CSPs and the large consumer Internet companies. So we have visibility right now for our data center demand that has probably extended out a few quarters, and this led us to working on quickly procuring that substantial supply for the second half. I'm going to pause there and see if Jensen wants to add a little bit more. I thought that was great.

Speaker 2: We have some significant supply chain flow to serve our significant customer demand that we see, and this is demand that we see across a wide range of different customers. They are building platforms for some of the largest enterprises.

Speaker 2: but also setting things up at the CSPs and the large consumer Internet companies. So we have visibility right now for our data center demand that has probably extended out a few quarters. And this led us to working on quickly procuring that substantial supply for the second half.

Speaker 2: I'm going to pause there and see if Jensen wants to add a little bit more.

Speaker 3: I thought that was great. Thank you. Next we'll go to CJ Muse with Evercore ISI. Your line is open.

Next, we'll go to CJ Muse with Evercore ISI. Your line is open. Yeah, good afternoon. Thank you for taking the question. You know, I guess with data center essentially doubling quarter on quarter, two natural kinds of questions that relate to one another come to mind. Number one, where are we in terms of driving acceleration into servers to support, you know, AI? And as part of that, as you deal with longer cycle times with TSMC and your other partners, how are you thinking about managing the commitments there with, you know, where you want to manage your lead times in the coming years, to best match that supply and demand? Thanks so much.

Speaker 3: Yeah, good afternoon. Thank you for taking the question. You know, I guess with Data Center, you know, essentially doubling quarter on quarter, you know, two natural kind of questions that relate to one another come to mind. Number one, you know, where are we in terms of driving acceleration into servers to support?

Speaker 3: you know, AI. And as part of that, as you deal with longer cycle times with TSMC and your other partners, how are you thinking about managing, you know, the commitments there with, you know, where you want to manage your lead times, you know, in the coming years, you know, to best kind of match that supply and demand. Thanks so much.

Jensen Huang: Yes, CJ, thanks for the question. I'll start backwards. Remember, we were in full production for both Ampere and Hopper when the ChatGPT moment came, and it helped everybody crystallize how to transition from the technology of large language models to a product and service based on a chatbot. The integration of guardrails and alignment systems with reinforcement learning human feedback,

Speaker 5: Yes, CJ, thanks for the question. I'll start backwards. We were in full production of both Ampere and Hopper.

Speaker 5: when the ChatGPT moment came.

Speaker 5: And it helped everybody crystallize how to

Speaker 5: transition from the technology of large language models

Speaker 5: to a product and service based on a chatbot.

Speaker 5: the integration of

Speaker 5: guardrails and alignment systems with reinforcement learning human feedback.

Jensen Huang: knowledge vector databases for proprietary knowledge, connection to search, all of that came together in a really wonderful way, and it's, you know, the reason why I call it the iPhone moment. All the technology came together and helped everybody realize what an amazing product it could be and what capabilities it could have. And so we were already in full production. NVIDIA's supply chain flow and our supply chain are very significant, as you know. And we build supercomputers in volume, and these are giant systems, and we build them in volume. It includes, of course, the GPUs, but on our GPUs, the system boards have 35,000 other components, and the networking and the fiber optics and the incredible transceivers and, you know, the NICs, the smart NICs, the switches, all of that has to come together in order for us to stand up a data center. And so we were already in full production, and when the moment came, we had to significantly increase our procurement for the second half, as Colette said.

Speaker 5: knowledge vector databases for proprietary knowledge.

Speaker 5: connection to search, all of that came together in a really wonderful way. And the reason why I call it the iPhone moment, all the technology came together and helped everybody realize what an amazing product it can be and what capabilities it can have.

Speaker 5: And so we were already in full production.

Speaker 5: supply chain flow and our supply chain is very significant as you know. And we build supercomputers in volume and these are giant systems and we build them in volume. It includes of course the GPUs, but....

Speaker 5: on our GPUs, the system boards have 35,000 other components. And the networking and the fiber optics and the incredible transceivers and, you know, the NICs, the smart NICs, the switches, all of that has to come together in order for us to stand up a data center. And so we were already in full production when the moment came.

Speaker 5: we had to significantly increase our procurement for the second half, as Colette said.

Jensen Huang: Now, let me talk about the bigger picture and why the entire world's data centers are moving towards accelerated computing. It's been known for some time, and you've heard me talk about it, that accelerated computing is a full-stack problem, a full-stack challenge. But if you could successfully do it in a large number of application domains, and it's taken us 15 years, sufficient that almost the entire data center's major applications could be accelerated, you could reduce the amount of energy consumed and the cost of a data center by an order of magnitude. It takes a lot of money to do it because you have to do all the software and everything, and you have to build all the systems and so on and so forth. But we've been at it for 15 years, and what happened is that when generative AI came along, it triggered a killer app for this computing platform that's been in preparation for some time. And so now we see ourselves in two simultaneous transitions.

Speaker 5: Now let me talk about the bigger picture and why the entire world's data centers are moving towards accelerated computing. It's been known for some time, and you've heard me talk about it, that accelerated computing is a full-stack problem, but it's a full-stack challenge. But if you could successfully do it in a large number of application domains, it's taken the last 15 years.

Speaker 5: You have to build all the systems and so on and so forth, but we've been at it for 15 years.

Speaker 5: And what happened is when generative AI came along.

Speaker 5: It triggered a killer app for this computing platform that's been in preparation for some time. And so now we see ourselves in two simultaneous transitions.

Jensen Huang: The world's $1 trillion data center is nearly populated entirely by CPUs today. And, you know, $1 trillion, $250 billion a year, it's growing, of course. But over the last four years, call it a trillion dollars' worth of infrastructure installed. And it's all completely based on CPUs and dumb NICs. It's basically unaccelerated.

Speaker 5: The world's $1 trillion data center is...

Speaker 5: nearly populated entirely by CPUs today.

Speaker 5: And, you know, $1 trillion, $250 billion a year. It's growing, of course, but over the last four years, you know, call it a trillion dollars' worth of infrastructure installed. And it's all completely based on CPUs and dumb NICs.

Speaker 5: It's basically un-accelerated. In the future, it's fairly clear now with generative AI becoming the primary workload of most of the world's data centers generating information, it is very clear now that, and the fact that accelerated computing is so energy efficient, that the budget of a data center.

Jensen Huang: In the future, it's fairly clear now that, with generative AI becoming the primary workload of most of the world's data centers generating information, and the fact that accelerated computing is so energy efficient, the budget of a data center will shift very dramatically towards accelerated computing, and you're seeing that now. We're going through that moment right now as we speak. While the world's data center CapEx budget is limited, at the same time, we're seeing incredible orders to retool the world's data centers. So I think you're seeing the beginning of, you know, call it a 10-year transition to basically recycle or reclaim the world's data centers and build them out as accelerated computing. You'll have a pretty dramatic shift in the spend of a data center from traditional computing to accelerated computing with smart NICs, smart switches, and, of course, GPUs, and the workload is going to be predominantly generative AI. Okay, we'll move on to our next question. Vivek Arya with BofA Securities. Your line is open.

Speaker 5: will shift very dramatically towards accelerated computing. And you're seeing that now. We're going through that moment right now as we speak. While the world's data center CapEx budget is limited, at the same time, we're seeing incredible orders

Speaker 5: to retool the world's data centers.

Speaker 5: So I think you're starting, you're seeing the beginning of, you know, call it a 10-year transition to

Speaker 5: basically recycle or reclaim the world's data centers and build it out as accelerated computing. You'll have a pretty dramatic shift in the...

Speaker 5: spend of a data center from traditional computing to accelerated computing with smart NICs, smart switches, you know, of course, GPUs, and the workload is going to be predominantly generative AI.

Speaker 5: Okay, we'll move to our next question. Vivek Arya with BofA Securities. Your line is open.

Vivek Arya: Thanks for the question. Colette, I just wanted to clarify: does visibility mean data center sales can continue to grow sequentially in Q3 and Q4, or do they sustain at Q2 levels? I just wanted to clarify that.

Speaker 6: Thanks for the question. I just wanted to clarify, does visibility mean data center sales can continue to grow sequentially in Q3 and Q4, or do they sustain at Q2 levels? I just wanted to clarify that. And then, Jensen, my question is that given this very strong demand environment, what does it do to the competitive landscape? Does it invite more competition in terms of custom ASICs?

Vivek Arya: And then Jensen, my question is that, you know, given this very strong demand environment, what does it do to the competitive landscape? Do you think it invites more competition in terms of custom ASICs? You know, does it invite more competition in terms of other GPU solutions or other kinds of solutions? You know, how do you see the competitive landscape changing over the next two to three years? Yeah, Vivek, thanks for the question. Let me see if I can add a little bit more color.

Speaker 6: Does it invite more competition in terms of other GPU solutions or other kinds of solutions? How do you see the competitive landscape change over the next 2-3 years?

Speaker 2: Yeah, Vivek, thanks for the question. Let me see if I can add a little bit more color. We believe that the supply that we will have for the second half of the year will be substantially larger than H1. So we are expecting not only the demand

Colette M. Kress: We believe that the supply that we will have for the second half of the year will be substantially larger than H1. So we are expecting not only the demand that we just saw in this last quarter, the demand that we have in Q2 for our forecast, but also planning on seeing something in the second half of the year. We just have to be careful here that we're not here to guide the second half.

Speaker 2: that we just saw in this last quarter, the demand that we have in Q2 for our forecast, but also planning on seeing something in the second half of the year. We just have to be careful here, but we're not here to guide the second half.

Colette M. Kress: But yes, we do plan a substantial increase in the second half compared to the first half. Regarding competition, we have competition from every direction. Startups, really, really well-funded and innovative startups.

Speaker 2: But yes, we do plan a substantial increase in the second half compared to the first half.

Speaker 2: Regarding competition...

Speaker 5: We have competition from every direction. Startups, really, really well-funded and innovative startups, countless of them all over the world. We have competition from existing

Jensen Huang: We have competition from existing semiconductor companies, we have competition from CSPs with internal projects, and many of you know about most of these, and so we're mindful of competition all the time, and we face competition all the time. NVIDIA's value proposition at the core is that we are the lowest cost solution. We're the lowest TCO solution, and the reason for that is because accelerated computing is two things that I talk about often, which are it's a full stack problem. It's a full stack challenge.

Speaker 5: existing semiconductor companies. We have competition from CSPs with internal projects. And many of you know about most of these. And so we're mindful of competition all the time and we get competition all the time.

Speaker 5: NVIDIA's value proposition at the core is we are the lowest cost solution. We are the lowest TCO solution, and the reason for that is because accelerated computing is...

Speaker 5: There are two things that I talk about often, which is it's a full-stack problem. It's a full-stack challenge. You have to engineer all of the software and all the libraries and all the algorithms, integrate them into and optimize the frameworks, and optimize it for the architecture of not just one chip, but the architecture of an entire data center.

Jensen Huang: You have to engineer all of the software and all the libraries and all the algorithms, integrate them into and optimize the frameworks, and optimize them for the architecture of not just one chip but the architecture of an entire data center, all the way into the frameworks, all the way into the models. And the amount of engineering and distributed computing, fundamental computer science work, is really quite extraordinary. It is the hardest computing challenge we know, and so number one, it's a full-stack challenge, and you have to optimize it across the whole thing and across just the mind-blowing number of stacks.

Speaker 5: It is the hardest computing challenge we know.

Speaker 5: And so number one, it's a full-stack challenge, and you have to optimize it across the whole thing and across just a...

Jensen Huang: We have 400 acceleration libraries. As you know, the amount of libraries and frameworks that we accelerate is pretty mind-blowing. The second part is that generative AI is a large-scale problem, and it's a data center-scale problem. It's another way of thinking that the computer is the data center, or the data center is the computer. It's not the chip.

Speaker 5: mind-blowing number of stacks. We have 400 acceleration libraries. As you know, the amount of libraries and frameworks that we accelerate is pretty mind-blowing. The second part is that generative AI is a large-scale problem and it's a data center scale problem. It's another way of thinking.

Jensen Huang: It's the data center, and it's never happened like this before. And in this particular environment, your networking operating system, your distributed computing engines, your understanding of the architecture of the networking gear, the switches, and the computing systems, the computing fabric, that entire system is your computer, and that's what you're trying to operate. And so in order to get the best performance, you have to understand the full stack and understand data center scale, and that's what accelerated computing is. The second thing is...

Speaker 5: that the computer is the data center, or the data center is the computer. It's not the chip, it's the data center, and it's never happened like this before.

Speaker 5: And in this particular environment...

Speaker 5: Your networking operating system, your distributed computing engines, your understanding of the architecture of the networking gear, the switches, and the computing systems, the computing fabric, that entire system is your computer and that's what you're trying to operate.

Speaker 5: And so in order to get the best performance, you have to understand full stack and understand data center scale. And that's what accelerated computing is. The second thing is utilization, which talks about the number of types of applications that you can accelerate,

Jensen Huang: Utilization, which talks about the number of types of applications that you can accelerate and the versatility of your architecture, keeps that utilization high. If you can do one thing and do one thing only incredibly fast, then your data center is largely underutilized. It's hard to scale that out.

Speaker 5: and the versatility of your architecture keeps the utilization high. If you can do one thing and do one thing only incredibly fast, then your data center is largely underutilized,

Speaker 5: and it's hard to scale that out. NVIDIA's universal GPU, the fact that we accelerate so many stacks, makes our utilization incredibly high. And so number one is throughput, and that's a software-intensive problem and a data center architecture problem. The second is utilization, a versatility problem. And the third is just data center expertise.

Jensen Huang: NVIDIA's universal GPU and the fact that we accelerate so many stacks make our utilization incredibly high. Number one is throughput. That's a software-intensive problem. It's a data center architecture problem. The second is utilization, or versatility.

Speaker 5: You know, we've built five data centers of our own and we've helped companies all over the world build data centers.

Speaker 5: and we integrate our architecture into all the world's clouds. From the moment of delivery of the product to standing up and the deployment, the time to operations of a data center is measured. If you're not good at it and you're not proficient at it, it could take months.

Jensen Huang: And the third is just data center expertise. We've built five data centers of our own, and we've helped companies all over the world build data centers. And we integrate our architecture into all the world's clouds. From the moment of delivery of the product to standing up and deployment, the time to operations of a data center is measured. If you're not good at it and you're not proficient at it, it could take months.

Speaker 5: You know, standing up a supercomputer, let's see, some of the largest supercomputers in the world were installed about a year and a half ago and now they're coming online. And so it's not, you know,

Speaker 5: unheard of to see a delivery to operations of about a year. Our delivery to operations is measured in weeks, and we've taken data centers and supercomputers and we've turned them into products, and the expertise of the team in doing that is incredible. And so our value proposition is, in the final analysis, all of this technology

Jensen Huang: You know, setting up a supercomputer, let's see, some of the largest supercomputers in the world were installed about a year and a half ago, and now they're coming online. And so it's not unheard of to see a delivery to operations of about a year. Our delivery to operations is measured in weeks, and we've taken data centers and supercomputers, and we've turned them into products. And the expertise of the team in doing that is incredible. And so our value proposition is, in the final analysis, all of this technology translates into infrastructure, the highest throughput and the lowest possible cost. And so I think our market is, of course, very competitive, very large. But the challenge is really, really great. Next, we go to Aaron Rakers with Wells Fargo.

Speaker 5: translates into infrastructure, the highest throughput and lowest possible cost. And so I think our market is, of course, very competitive, very large, but the challenge is really, really great.

Speaker 5: Next we go to Aaron Rakers with Wells Fargo. Your line is open.

Speaker 7: Yeah, thank you for taking the question and congrats on the quarter. As we kind of think about unpacking the various different growth drivers of the data center business going forward, I'm curious, Colette, of just how we should think about the monetization effect of software, considering that the expansion of your cloud service agreements continues to grow.

Aaron Rakers: Yeah, thank you for taking the question and congrats on the quarter. As we kind of think about unpacking the various different growth drivers of the data center business going forward, I'm curious, Colette, about just how we should think about the monetization effect of software, considering that the expansion of your cloud service agreements continues to grow. I'm curious about where we are at in terms of that approach, in terms of the AI Enterprise software suite and other, you know, drivers of software-only revenue going forward. Thanks for the question.

Speaker 2: Software is really important to our accelerated platforms. Not only do we have a substantial amount of software that we are including in our newest architecture and essentially all products that we have, we are now with many different models to help customers

Speaker 2: start their work in generative AI and accelerated computing. So anything that we have here from DGX Cloud on providing those services, helping them build models, or as you've discussed, the importance of NVIDIA AI Enterprise, essentially that operating system for AI.

Colette M. Kress: Software is really important to our accelerated platforms. Not only do we have a substantial amount of software that we are including in our newest architecture and, essentially, all products that we have, but we are now with many different models to help customers start their work in generative AI and accelerated computing. So anything that we have here from DGX Cloud for providing those services, helping them build models, or, as you've discussed, the importance of NVIDIA AI Enterprise, essentially that operating system for AI. So all things should continue to grow as we go forward, both the architecture and the infrastructure, as well as the availability of the software and our ability to monetize it as well. I'll turn it over to Jensen, see if he needs to... Yeah, we can see, in real time, the growth of generative AI and CSPs, both for training the models, refining the models, as well as deploying the models.

Speaker 2: So, all things should continue to grow as we go forward, both the architecture and the infrastructure, as well as both the availability of the software and our ability to monetize it.

I'll turn it over to Jensen to see if he needs to... Yeah, we can see in real time the growth of generative AI and CSPs, both for training the models, refining the models, as well as deploying the models. So, as Colette said earlier...

inference is now a major driver of accelerated computing because generative AI is used so capably in so many applications already. There are two segments that require

a new stack of software. And the two segments are enterprise and industrials. Enterprise requires a new stack of software because many enterprises need to have all the capabilities that we've talked about, whether it's large language models, the ability to adapt them for your proprietary use case and your proprietary data.

Jensen Huang: As Colette said earlier, inference is now a major driver of accelerated computing because generative AI is used so capably in so many applications already. There are two segments that require a new stack of software, and the two segments are enterprise and industrial. Enterprises need a new stack of software because many enterprises need all the capabilities that we've talked about, whether it's large language models, the ability to adapt them for your proprietary use case and your proprietary data, and alignment to your own principles and your own operating domains. You want to have the ability to do that in a high-performance computing sandbox, and we call that DGX Cloud, and to create that model.

and align it to your own principles and your own operating domains.

You want to have the ability to be able to do that in a high performance computing sandbox, and we call that DGX Cloud.

and to create that model. Then you want to deploy your chatbot or your AI in any cloud because you have services and agreements with multiple cloud vendors and depending on the applications you want to deploy it on various clouds.

For the enterprise, we have NVIDIA AI Foundation for helping you create custom models, and we have NVIDIA AI Enterprise. NVIDIA AI Enterprise is the only GPU accelerated stack in the world that is enterprise safe and enterprise supported.

Jensen Huang: Then you want to deploy your chatbot or your AI in any cloud because you have services and you have agreements with multiple cloud vendors, and depending on the applications, you might deploy it on various clouds. And for the enterprise, we have NVIDIA AI Foundation for helping you create custom models, and we have NVIDIA AI Enterprise. NVIDIA AI Enterprise is the only GPU-accelerated stack in the world that is enterprise safe and enterprise supported. There is constant patching that you have to do.

There is constant patching that you have to do. There are 4,000 different packages that build up NVIDIA AI Enterprise and represent the operating engine, the end-to-end operating engine, of the entire AI workflow. It's the only one of its kind, from data ingestion, data processing.

Obviously, in order to train an AI model, you have a lot of data you have to process and package up and curate and align. And there's just a whole bunch of stuff that you have to do to the data to prepare it for training. That amount of data could consume some 40, 50, 60% of your computing time. And so data processing is a very big deal. And then the second aspect of it is training the model, refining the model, and the third is deploying the model

Jensen Huang: There are 4,000 different packages that build up NVIDIA AI Enterprise and represent the operating engine, end-to-end operating engine of the entire AI workflow. It's the only one of its kind from data ingestion to data processing. You know, obviously, in order to train an AI model, you have a lot of data you have to process and package up and curate and align and there's just a whole bunch of stuff that you have to do to the data to prepare it for training. That amount of data could consume some 40, 50, 60 percent of your computing time, and so data processing is a very big deal. And then the second aspect of it is training the model, refining the model, and the third is deploying the model for inferencing.

for inferencing.

NVIDIA AI Enterprise supports and patches and security patches continuously all of those 4,000 packages of software.

And for an enterprise that wants to deploy their engine, just like they want to deploy Red Hat Linux, this is incredibly complicated software. In order to deploy that in every cloud, as well as on-prem, it has to be secure, it has to be supported. And so NVIDIA AI Enterprise

is the second part. The third is Omniverse. Just as people are starting to realize that you need to align an AI for ethics, the same for robotics, you need to align the AI for physics.

And aligning an AI for ethics includes a technology called reinforcement learning human feedback. In the case of industrial applications and robotics, it's reinforcement learning Omniverse feedback.

Jensen Huang: NVIDIA AI Enterprise supports and patches and security patches continuously all of those 4,000 packages of software, and for an enterprise that wants to deploy their engines, just like they want to deploy Red Hat Linux, this is incredibly complicated software. In order to deploy that in every cloud, as well as on-premises, it has to be secure, and it has to be supported, and so NVIDIA AI Enterprise is the second part. The third is Omniverse.

And Omniverse is a vital engine for software-defined and robotic applications and industries. And so Omniverse also needs to be a cloud service platform. And so our software stack, the three software stacks, AI Foundation, AI Enterprise,

and Omniverse, runs in all of the world's clouds that we have partnerships, DGX Cloud partnerships with. With Azure, we have partnerships on both AI as well as Omniverse. With GCP and Oracle, we have great partnerships in DGX Cloud for AI,

Jensen Huang: Just as people are starting to realize that you need to align an AI for ethics, the same for robotics, you need to align the AI for physics. And aligning an AI for ethics includes a technology called reinforcement learning human feedback. In the case of industrial applications and robotics, it's reinforcement learning Omniverse feedback, and Omniverse is a vital engine for software-defined and robotic applications and industries.

and AI Enterprise is integrated into all three of them. And so I think, in order for us to extend the reach of AI beyond the cloud and into the world's enterprise and into the world's industries,

Jensen Huang: And so Omniverse also needs to be a cloud service platform. And so our software stack, the three software stacks, AI Foundation, AI Enterprise, and Omniverse, runs on all of the world's clouds that we have partnerships with, DGX Cloud partnerships with. With Azure, we have partnerships on both AI as well as Omniverse.

you need new software stacks in order to make that happen. And by putting it in the cloud, integrated into the world's CSP clouds, it's a great way for us to partner with the sales and the marketing team and the leadership team of all the cloud vendors. Next, we'll go to Timothy Arcuri with UBS.

Jensen Huang: With GCP and Oracle, we have great partnerships in the DGX cloud for AI, and AI Enterprise is integrated into all three of them. And so, I think in order for us to extend the reach of AI beyond the cloud and into the world's enterprise and into the world's industries, you need new software stacks in order to make that happen. And by putting it in the cloud, integrated into the world's CSP clouds, it's a great way for us to partner with the sales and marketing team and the leadership team of all the cloud vendors. Next, we'll go to Timothy Arcuri with UBS.

low latency of InfiniBand for AI, but can you sort of talk about the attach rate of your InfiniBand solutions to what you're shipping on the core compute side, and maybe whether that's similarly crowding out Ethernet like you are on the compute side? And then the clarification, Colette, is that there wasn't a share buyback despite you still having about $7 billion in the share repo authorization.

Timothy Michael Arcuri: Thanks a lot. I had a question, and then I had a clarification as well. So the first question is, Jensen, on the InfiniBand versus Ethernet argument, can you sort of speak to that debate and maybe how you see it playing out? I know you need the low latency of InfiniBand for AI, but can you sort of talk about the attach rate of your InfiniBand solutions to what you're shipping on the core compute side, and maybe whether that's similarly crowding out Ethernet like you are on the compute side? And then the clarification, Colette, is that there wasn't a share buyback despite you still having about $7 billion in the share repo authorization. Was that just timing?

We did not repurchase anything in this last quarter, but we do repurchase opportunistically, and we'll consider that as we go forward as well. Thank you.

InfiniBand and Ethernet target different applications in a data center.

They both have their place. InfiniBand had a record quarter. We're going to have a giant record year. InfiniBand has an exceptional roadmap.

Jensen Huang: Thanks. Colette, how about you go first and take this question?

Colette M. Kress: Sure. That is correct. We have $7 billion available in our current authorization for repurchases. We did not repurchase anything in this last quarter.

it's going to be really incredible. The two networks are very different. InfiniBand is designed for an AI factory, if you will.

Colette M. Kress: But we do repurchase opportunistically, and we'll consider that as we go forward as well. Thank you.

Jensen Huang: InfiniBand and Ethernet target different applications in a data center. They both have their place.

If that data center is running a few applications for a few people for a specific use case, and it's doing it continuously, and that infrastructure costs you, you know, pick a number, $500 million.

is running a few applications for a few people for a specific use case, and it's doing it continuously. And that infrastructure costs you, pick a number, $500 million. The difference between

Jensen Huang: InfiniBand had a record quarter. We're going to have a giant record year, and InfiniBand has a really amazing, Quantum InfiniBand has an exceptional roadmap. It's going to be really incredible.

InfiniBand and Ethernet could be 15-20% in overall throughput. And if you spent $500 million

on infrastructure and the difference is 10 to 20 percent, it's $100 million. InfiniBand is basically free. That's the reason why people use it. InfiniBand is effectively free.

Jensen Huang: The two networks are very different. InfiniBand is designed for an AI factory, if you will, where the data center is running a few applications for a few people for a specific, you know, use case, and it's doing it continuously. And that infrastructure costs you, you know, pick a number, $500 million. The difference between InfiniBand and Ethernet could be 15, 20% in overall throughput. And if you've spent $500 million on infrastructure, and the difference is 10 to 20 percent.

The difference in data center throughput is just, you know, it's too great to ignore. And you're using it for that one application. And so, however, if your data center is a cloud data center and it's multi-tenant, it's a bunch of little jobs, a bunch of little jobs and it's shared by millions of people, then Ethernet is really the right answer. There's a new segment in the middle where the cloud is becoming a...

generative AI cloud. It's not an AI factory per se, but it's still a multi-tenant cloud, but it wants to run generative AI workloads. This new segment is a wonderful opportunity. And at Computex, I referred to it at the last GTC, at Computex, we're going to announce a major product line for this segment, which is for...

And it's $100 million. InfiniBand is basically free. That's the reason why people use it. InfiniBand is effectively, the difference in data center throughput is just, you know, it's too great to ignore, and you're using it for that one application. And so, however, if your data center is a cloud data center, and it's multi-tenant, it's a bunch of little jobs, a bunch of little jobs, and it's shared by millions of people, then Ethernet is really the right answer. There's a new segment in the middle where the cloud is becoming a generative AI cloud. It's not an AI factory per se. It's still a multi-tenant cloud, but it wants to run generative AI workloads. This new segment is a wonderful opportunity, and at Computex, I referred to it at the last GTC. At Computex, we're going to announce a major product line for this segment, which is for Ethernet-focused generative AI application type clouds, but InfiniBand is doing fantastically, and we're doing record numbers quarter on quarter, year on year. Next, we'll go to Stacy Rasgon with Bernstein Research. Your line is open.

Ethernet-focused generative AI application type of clouds. But InfiniBand is doing fantastically, and we're doing record numbers quarter on quarter, year on year. Next, we'll go to Stacy Rasgon with Bernstein Research.
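
The "basically free" arithmetic in the answer above can be sketched in a few lines. This is only an illustration, not NVIDIA's own model: it assumes the value of extra throughput scales linearly with what the infrastructure cost, and the function name `throughput_value` is hypothetical. The $500 million figure and the 10 to 20 percent range are the "pick a number" values from the call.

```python
def throughput_value(infrastructure_cost: float, throughput_gain: float) -> float:
    """Dollar value of a fractional throughput gain.

    Assumption (not from the call): a gain of x in throughput is worth
    x times the cost of the infrastructure that delivers it.
    """
    return infrastructure_cost * throughput_gain


# "Pick a number" values quoted in the answer.
cost = 500_000_000
for gain in (0.10, 0.15, 0.20):
    value = throughput_value(cost, gain)
    print(f"{gain:.0%} more throughput on ${cost:,} is worth about ${value:,.0f}")
```

Under that assumption, the top of the quoted range on a $500 million build is worth about $100 million, which is the comparison being made against the cost of the network.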

because inference basically scales

with like the usage versus like training is more of a one and done. And can you give us some sort of, even if it's just like a, you know, qualitatively, like, if you think inference is bigger than training or vice versa, like if it's bigger, how much bigger? Like the opportunity, is it 5x, is it 10x?

Anything you can give us on those two workloads within generative AI would be helpful. Yeah, I'll work backwards. You're never done with training.

Stacy Rasgon: Hi guys, thanks for taking my question. I had a question on inference versus training for generative AI. So you're talking about inference being a very large opportunity, I guess. There are two sub-parts to that. Is that because inference basically scales with the usage versus like training, which is more of a one and done? And can you give us some sort of, even if it's just qualitatively, like if inference is bigger than training or vice versa, like if it's bigger, how much bigger is it, like the opportunity? Is it 5x, is it 10x? Anything you can give us on those two workloads within generative AI would be helpful.

Jensen Huang: Yeah, I'll work backwards.

You're always, every time you deploy you're collecting new data. When you collect new data you train with the new data. And so you're never done with vectorizing all of the collected, unstructured data that you have.

And so whether you're building a recommender system, a large language model, a vector database, these are probably the three major applications of the three core engines, if you will, of the future of computing.

There's a whole bunch of other stuff, but obviously these are three very important ones. They're always, always running. You're going to see that more and more companies realize they have a factory for intelligence.

an intelligence factory. And in that particular case, it's largely dedicated to training and processing data and vectorizing data and learning representation of the data, so on and so forth.

Jensen Huang: You're never done with training. You're always, you're always, every time you deploy, you're collecting new data. When you're collecting data, you train with that, with the new data. And so you're never done training.

The inference part of it is APIs that are either open APIs that can be connected to all kinds of applications, APIs that are integrated into workflows, but APIs of all kinds. There will be hundreds of APIs in a company. Some of them they built themselves, some of them,

Jensen Huang: You're never done producing and processing a vector database that augments the large language model. You're never done vectorizing all of the collected unstructured data that you have. And so whether you're building a recommender system, a large language model, or a vector database, these are probably the three major applications of the three core engines, if you will, of the future of computing. And there is a whole bunch of other stuff. But obviously, these are three very important ones. They're always, always running.

many of them could come from companies like ServiceNow and Adobe that we partner with in AI Foundations. And they'll create a whole bunch of generative AI APIs that companies can then connect into their workflows or use as an application. And of course, there'd be a whole bunch of internet service companies.

And so I think you're seeing for the very first time simultaneously a very significant growth in the segment of AI factories.

Jensen Huang: You're going to see that more and more companies realize they have a factory for intelligence, an intelligence factory. And in that particular case, it's largely dedicated to training and processing data and vectorizing data and, you know, learning the representation of the data, so on, so forth. The inference part of it is APIs that are either open APIs that can be connected to all kinds of applications; they are integrated into workflows, but APIs of all kinds. There would be hundreds of APIs in a company. Some of them they built themselves.

as well as a market that, a segment that really didn't exist before, but now it's growing exponentially, practically by the week.

for AI inference with APIs. The simple way to think about it in the end is that the world has a trillion dollars of data center installed and it used to be 100% CPUs. In the future, we know we've heard it in enough places.

And I think this year's ISC keynote was actually about the end of Moore's Law. We've seen it in a lot of places now that you can't reasonably scale out data centers with general purpose computing and that accelerated computing is the path forward. And now

Jensen Huang: Some of them, many of them could come from companies like ServiceNow and Adobe that we're partnering with in AI Foundations, and they'll create a whole bunch of generative AI APIs that companies can then connect to their workflows or use as an application. And, of course, there'll be a whole bunch of internet service companies. And so I think you're seeing for the very first time, simultaneously, a very significant growth in the segment of AI factories, as well as, you know, a market that, a segment that really didn't exist before, but now it's growing exponentially, practically by the week, for AI inference with APIs.

it's got a killer app, it's called Generative AI. And so the easiest way to think about that is your trillion dollar infrastructure. Every quarter's capital, CapEx budget would lean very heavily into Generative AI, into accelerated computing infrastructure everywhere from

the number of GPUs that would be used in the CapEx budget to the accelerated switches and accelerated networking chips that connect them all. The easiest way to think about that is over the next four, five, ten years, most of that $1 trillion and then compensating, adjusting for all the growth in data centers still.

Jensen Huang: The simple way to think about it, in the end, is that the world has a trillion dollars of data centers installed, and they used to be 100% CPUs. In the future, we know, we've heard it in enough places, and I think this year's ISC keynote was actually about the end of Moore's Law. We've seen it in a lot of places now that you can't reasonably scale out data centers with general purpose computing and that accelerated computing is the path forward. And now it's got a killer app.

this is a really big opportunity around large language models, but the cloud customers are also talking about trying to reduce cost per query by very significant amounts. Can you talk about the ramifications of that for you guys? Is that where some of the specialty inference products that you launched at GTC come in? And just, you know, how are you gonna help your customers get the cost per query down? Yeah, that's a great question.

Jensen Huang: It's called Generative AI. And so the easiest way to think about that is your trillion-dollar infrastructure. Every quarter's capital, or CapEx budget, would lean very heavily into Generative AI, into accelerated computing infrastructure, everywhere from the number of GPUs that would be used in the CapEx budget to the accelerated switches and accelerated networking chips that connect them all. The easiest way to think about that is that over the next four, five, ten years, most of that trillion dollars would be spent on accelerating, and then, compensating for all the growth in data centers still, it will be largely Generative AI, so that's probably the easiest way to think about that. And that's training as well as inference. Next, we'll go to Joseph Moore with Morgan Stanley. Your line is open.

whether you start by building a large language model, and you use that large language model, very large version, and you could distill them into medium, small, and tiny size. And the tiny size ones you could put in your phone, in your PC, and so on and so forth. And they all have good, you know, they all have...

have, it seems surprising, but they all can do the same thing. But obviously, obviously the zero shot or the generalizability of the large language model, the biggest one, is much more versatile and it can do a lot more amazing things. And...

And the large one would teach the smaller ones how to be good AIs. And so you use the large one to generate prompts to align the smaller ones and so on and so forth. And so you start by building very large ones and then you also have to train a whole bunch of smaller ones. Now that's exactly the reason why we have so many different sizes of our inference. You saw that I announced

Joseph Moore: Great, thank you. I wanted to follow up on that. In terms of the focus on inference, it's pretty clear that this is a really big opportunity around large language models. But the cloud customers are also talking about trying to reduce the cost per query by very significant amounts. Can you talk about the ramifications of that for you guys?

L4, L40, H100 NVL, which also has H100, and then we have H100 HGX, and then we have H100 multi-node with NVLink. So you could have model sizes of any kind that you'd like.

Joseph Moore: Is that where some of the specialty inference products that you launched at GTC come in and just, you know, how are you going to help your customers get the cost per query down?

Jensen Huang: Yeah, that's a great question. Whether you start by building a large language model, and you use that large language model, a very large version, and you could distill it into medium, small, and tiny sizes.

The other thing that's important is these are models, but they're connected ultimately to applications. And the applications could have image in, video out, video in, text out, image in, proteins out, text in, 3D out, video in in the future, 3D graphics out.

Jensen Huang: And the tiny sized ones you could put in your phone and your PC and so on and so forth. And they all have, it seems surprising, but they can all do the same thing. But obviously, the zero shot or the generalizability of the large language model, the biggest one, is much more versatile, and it can do a lot more amazing things. And the large one would teach the smaller ones how to be good AIs. And so you use the large one to generate prompts to align the smaller ones, and so on and so forth. And so you start by building very large ones, and then you also have to train a whole bunch of smaller ones. That's exactly the reason why we have so many different sizes of our inference.

So the input and the output requires a lot of pre and post processing. The pre and post processing can't be ignored. This is one of the things that most of the specialized chip arguments fall apart. And it's because the model itself is only, call it 25% of the overall processing of inference.

The rest of it is about pre-processing, post-processing, security, decoding, all kinds of things like that.

And so I think the multimodality aspect of inference, the multidiversity of inference, that it's going to be done in the cloud, on-prem, it's going to be done in multi-cloud, that's the reason why we have NVIDIA AI Enterprise in all the clouds. It's going to be done on-prem, it's the reason why we have a great partnership with Dell, which we just announced the other day, called Project Helix.

Jensen Huang: You saw that I announced L4, L40, H100 NVL, which also has H100, and then we have H100 HGX, and then we have H100 multi-node with NVLink. And so you could have model sizes of any kind that you like. The other thing that's important is these are models, but they're connected ultimately to applications. And the applications could have image in, video out, video in, text out, you know, image in, proteins out, text in, 3D out, video in, in the future, 3D graphics out, you know, so that the input and the output require a lot of pre- and post-processing. The pre- and post-processing can't be ignored, and this is one of the things that most of the specialized chip arguments fall apart, and it's because the model itself is only, you know, call it 25% of the overall processing of inference.

AI is so broad, you need to have some very fundamental capabilities like what I just described in order to really address the whole space of it.

Next, we'll go to Harlan Sur with J.P. Morgan. Your line is open.

Hi, good afternoon and congratulations on the strong results and execution. I really appreciate more of the focus or some of the focus today on your networking products. It's really an integral part to maximize the full performance of your compute platforms. I think your data center networking business is driving up

about a billion dollars of revenues per quarter plus or minus. Now that's two and a half X growth from three years ago, right, when you guys acquired Mellanox, so very strong growth. But given the very high attach of your InfiniBand, Ethernet solutions, your accelerated compute platforms, is the networking run rate stepping up in line with your compute shipment? And then what is the team doing to further unlock more networking bandwidth going forward, just to keep pace with the significant increase in compute complexity, data sets?

Jensen Huang: The rest of it is about pre-processing, post-processing, security, you know, decoding, all kinds of things like that. And so I think the multi-modality aspect of inference, the multi-diversity of inference, that it's going to be done in the cloud, on-prem, it's going to be done in the multi-cloud, that's the reason why we have NVIDIA AI Enterprise in all the clouds. It's going to be done on-premises, it's the reason why we have a great partnership with Dell, which we just announced the other day, called Project Helix, that's going to be integrated into third-party services, and it's the reason why we have a great partnership with ServiceNow and Adobe, because they're going to be creating a whole bunch of generative AI capabilities. And so the diversity and the reach of generative AI is so broad, you need to have some very fundamental capabilities like what I just described in order to really address the whole space of it. Next, we'll go to Harlan Sur with J.P. Morgan. Your line is open.

requirements for lower latency, better traffic predictability, and so on. Yeah, Harlan, I really appreciate that. Nearly everybody who thinks about AI, they think about that chip, that accelerator chip, and in fact it misses the whole point nearly completely. And I've mentioned before that accelerated computing is about the stack.

tens of thousands of GPUs. How do you connect tens of thousands of GPUs if the operating system of the data center, which is the infrastructure, is not insanely great? And so that's the reason why we're so obsessed about networking in the company.

Harlan Sur: Hi, good afternoon, and congratulations on the strong results and execution. You know, I really appreciate more of the focus, or some of the focus today on your networking products. I mean, it's really an integral part to sort of maximize the full performance of your compute platforms. I think your data center networking business is driving up about a billion dollars of revenues per quarter, plus or minus. That's two and a half X growth from three years ago, right, when you guys acquired Mellanox.

And one of the great things that we have, we have, you know, Mellanox, as you know quite well, was the world's highest performance and the unambiguous leader in high-performance networking. It's the reason why our two companies are together. You also see that our network

Harlan Sur: So very strong growth. But given the very high attach rate of your InfiniBand, Ethernet solutions, and your accelerated compute platforms, is the networking run rate stepping up in line with your compute shipments? And then what is the team doing to further unlock more networking bandwidth going forward, just to keep pace with the significant increase in compute complexity, data sets, requirements for lower latency, better traffic predictability, and so on?

Jensen Huang: Yeah, Harlan, I really appreciate that. Nearly everybody who thinks about AI thinks about that chip, that accelerator chip. And it, in fact, misses the whole point nearly completely.

expands starting from NVLink, which is a computing fabric with really super low latency. And it communicates using memory references, not network packets. And then we take NVLink, we connect it inside multiple GPUs, and I've described

going beyond the GPU. And I'll talk a lot more about that at Computex in a few days. And then that gets connected to InfiniBand, which includes the NIC, the Smart NIC, BlueField 3 that we're in full production with, and the switches. All of the fiber optics that are optimized end to end.

Jensen Huang: And I mentioned before that accelerated computing is about the stack, about the software, and networking. Remember, we announced a very, very early version of this networking stack called DOCA. And we have the acceleration library called Magnum IO. These two pieces of software are some of the crown jewels of our company. Nobody ever talks about it because it's hard to understand, but it makes it possible for us to connect, you know, tens of thousands of GPUs. But how do you connect tens and tens of thousands of GPUs if the operating system of the data center, which is the infrastructure, is not insanely great? And so, that's the reason why we're so, so obsessed with networking in the company.

These things are running at incredible line rates. And then beyond that, if you want to connect this smart AI factory, this AI factory into your computing fabric, we have a brand new type of Ethernet that we'll be announcing.

And so this whole area of the computing fabric extending, connecting all of these GPUs and computing units together all the way through the networking, through the switches, the software stack is insanely complicated. And so I'm delighted you understand it. We don't break it out particularly.

because we think of the whole thing as a computing platform, as it should be. We sell it to all of the world's data centers as components so that they can integrate it into whatever style or architecture that they would like, and we can still run our software stack. That's the reason why we break it up.

Jensen Huang: And one of the great things that we have, we have Mellanox, as you know quite well, which is the world's highest performance and the unambiguous leader in high-performance networking. It is the reason why our two companies are together. You also see that our network expands starting from NVLink, which is a computing fabric with really super-low latency, and it communicates using memory references, not network packets.

It's way more complicated the way that we do it, but it makes it possible for NVIDIA's computing architecture to be integrated into anybody's data center in the world, from cloud of all different kinds to on-prem of all different kinds, all the way out to the edge, to 5G. And so this way of doing it is really, really complicated, but it gives us incredible reach.

Jensen Huang: And then we take NVLink, we connect it inside multiple GPUs, and I've described going beyond the GPU, and I'll talk a lot more about that at Computex in a few days. And then that gets connected to InfiniBand, which includes the NIC and the SmartNIC BlueField 3 that we're in full production with, and the switches, all of the fiber optics that are optimized end-to-end. These things are running at incredible line rates.

And our last question will come from Matt Ramsay with TD Cowen. Your line is open. Thank you very much. Congratulations, Jensen, and to the whole team. One of the things I wanted to dig into a little bit is the DGX cloud offering. You guys have been working on this for some time behind the scenes where you sell in the hardware to your hyperscale partners

Jensen Huang: And then beyond that, if you want to connect this smart AI factory to your computing fabric, we have a brand new type of Ethernet that we'll be announcing at Computex. And so this whole area of the computing fabric expanding, connecting all of these GPUs and computing units together all the way through the networking, through the switches, the software stack is insanely complicated. And so I'm delighted you understand it.

Jensen Huang: But we don't break it out particularly because we think of the whole thing as a computing platform as it should be. We sell it to all of the world's data centers as components so that they can integrate it into whatever style or architecture that they would like, and we can still run our software stack. That's the reason why we break it up.

buying for their own first party internal workloads versus their own sort of third party, their own customers versus what of that big upside in data center going forward is systems that you're selling in with potential to support your DGX cloud offerings and what you've learned since you've launched it about the potential of that business. Thanks

Jensen Huang: It's way more complicated the way that we do it, but it makes it possible for NVIDIA's computing architecture to be integrated into anybody's data center in the world, from clouds of all different kinds to on-prem of all different kinds all the way out to the edge to 5G. And so this way of doing it is really, really complicated, but it gives us incredible reach. And our last question will come from Matt Ramsay with TD Cowen. Your line is open.

Yeah, thanks, Matt. Without being too specific about numbers, but the ideal scenario, the ideal mix is something like 10% NVIDIA DGX cloud and 90% the CSPs clouds.

And the reason, and our DGX cloud is the NVIDIA stack, it's the pure NVIDIA stack. It is architected the way we like and achieves the best possible performance. It gives us the ability to partner very deeply with the CSPs to create the highest performing.

Matt Ramsay: Thank you very much. Congratulations, Jensen, and to the whole team. One of the things I wanted to dig into a little bit is the DGX cloud offering. You guys have been working on this for some time behind the scenes where you sell the hardware to your hyperscale partners and then lease it back for your own business. The rest of us kind of found out about it publicly a few months ago, and as we look forward over the next number of quarters, as Colette discussed, to high visibility in the data center business, maybe you could talk a little bit about the mix you're seeing of hyperscale customers buying for their own first party internal workloads versus their own sort of third party, their own customers versus what has that big upside in data centers going forward, which is systems Thanks. Yeah, thanks, Matt.

Jensen Huang: Yeah, thanks, Matt. Without being too specific about numbers, the ideal scenario, the ideal mix, is something like 10% NVIDIA DGX Cloud and 90% the CSPs' clouds. And our DGX Cloud is the NVIDIA stack. It's the pure NVIDIA stack.

Jensen Huang: It is architected the way we like and achieves the best possible performance. It gives us the ability to partner very deeply with the CSPs to create the highest-performing infrastructure, number one. Number two, it allows us to partner with the CSPs to create markets. Like, for example, we're partnering with Azure to bring Omniverse Cloud to the world's industries. And the world's never had a system like that.

Jensen Huang: The computing stack with all the generative AI stuff and all the 3D stuff and the physics stuff, incredibly large databases and really high-speed networks and low-latency networks, that kind of virtual industrial world has never existed before. And so we partnered with Microsoft to create Omniverse Cloud inside Azure Cloud. And so it allows us, number two, to create new applications together and develop new markets together. And we go to market as one team, and we benefit by getting customers on our computing platform, and they benefit by having us in their cloud. But number two, the amount of data and services and security services and all of the amazing things that Azure and GCP and OCI have, they can instantly have access to that through Omniverse Cloud. And so it's a huge win-win.

Jensen Huang: And for customers, the way that NVIDIA's cloud works, for these early applications, they could do it anywhere. So one standard stack runs in all the clouds. And if they would like to take their software and run it on the CSP's cloud themselves and manage it themselves, we're delighted by that, because NVIDIA AI Enterprise, NVIDIA AI Foundations, and, long term, this is going to take a little longer, but NVIDIA Omniverse will run in the CSPs' clouds. So our goal really is to drive architecture, to partner deeply in creating new markets and the new applications that we're doing, and provide our customers with the flexibility to run NVIDIA everywhere, including on premises. And so those were the primary reasons for it. And it's worked out incredibly. Our partnership with the three CSPs that we currently have DGX Cloud in, and their sales force and marketing teams, their leadership teams, is really quite spectacular. It works great.

Operator: Thank you. I'll now turn it back over to Jensen Huang for closing remarks.

Jensen Huang: The computer industry is going through two simultaneous transitions: accelerated computing and generative AI. CPU scaling has slowed.

Yet computing demand is strong and now, with generative AI, supercharged.

Accelerated computing, a full-stack and data-center-scale approach that NVIDIA pioneered, is the best path forward. There's a trillion dollars installed in the global data center infrastructure based on the general-purpose computing method of the last era. Companies are now racing to deploy accelerated computing for the generative AI era. Over the next decade, most of the world's data centers will be accelerated. We are significantly increasing our supply to meet their surging demand. Large language models can learn information encoded in many forms.

Guided by large language models, generative AI models can generate amazing content. And with models to fine-tune, guardrail, align to guiding principles, and ground to facts, generative AI is emerging from labs and is on its way to industrial applications. As we scale with cloud and internet service providers, we are also building platforms for the world's largest enterprises. Whether within one of our CSP partners or on premises with Dell Helix, whether on a leading enterprise platform like ServiceNow and Adobe, or bespoke with NVIDIA AI Foundations, we can help enterprises leverage their domain expertise and data to harness generative AI securely and safely. We are ramping a wave of products in the coming quarters, including H100, our Grace and Grace Hopper Superchips, and our BlueField-3 and Spectrum-4 networking platforms. They are all in production. They will help deliver data-center-scale computing that is also energy-efficient and sustainable computing. Join us next week at Computex, and we'll show you what's next. Thank you.

Operator: This concludes today's conference call. You may now disconnect.
