Nvidia at Siggraph outlined an AI vision in which developers create, test and optimize generative AI models and large language models (LLMs) on a PC or workstation and then scale them in data centers or the cloud.

Not surprisingly, this vision includes a heavy dose of Nvidia GPUs. PC makers have already highlighted systems on deck for generative AI training and workloads.

The two headliners of CEO Jensen Huang's keynote were Nvidia RTX workstations and Nvidia AI Workbench. AI Workbench is a toolkit that enables developers to create, test and customize models on a PC or workstation and then deploy them in data centers, public clouds or Nvidia DGX Cloud.

AI Workbench provides a simplified interface for pulling models from Hugging Face, GitHub and Nvidia NGC, combining them with custom data and sharing the results. AI Workbench will be included in systems from Dell Technologies, Hewlett Packard Enterprise, HP Inc., Lambda, Lenovo and Supermicro.
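
AI Workbench itself is a packaged workflow, but the underlying pattern it streamlines, pulling a model from Hugging Face onto a local workstation and iterating on it, looks roughly like this minimal sketch using the Hugging Face transformers library (the model name and prompt are illustrative, not from Nvidia's announcement):

```python
# Minimal sketch: pull a causal LM from Hugging Face and run a quick
# local test, the kind of iteration loop AI Workbench packages up.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gpt2"  # placeholder; any causal LM hosted on Hugging Face
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Generative AI on a workstation", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```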

To go along with AI Workbench, Nvidia launched Nvidia AI Enterprise 4.0, its enterprise software platform for production deployments. AI Enterprise 4.0 includes Nvidia NeMo, Triton Management Service and Base Command Manager Essentials, as well as integration with the public cloud marketplaces of Google Cloud, Microsoft Azure and Oracle Cloud.
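
Triton Management Service automates the deployment of Triton Inference Server instances; the client-side pattern those servers ultimately support looks like the following minimal sketch using Nvidia's open source tritonclient library. The server URL, model name and tensor names and shapes are illustrative assumptions, not details from the announcement.

```python
# Minimal sketch: query a model served by Nvidia Triton Inference Server
# over HTTP. Real input/output names and shapes come from the model config.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Build a dummy input tensor matching an assumed model signature.
data = np.zeros((1, 16), dtype=np.float32)
inp = httpclient.InferInput("INPUT0", list(data.shape), "FP32")
inp.set_data_from_numpy(data)

result = client.infer(model_name="my_model", inputs=[inp])
print(result.as_numpy("OUTPUT0"))
```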

As for the Nvidia RTX workstations, the systems will pair Nvidia's RTX 6000 Ada Generation GPUs with AI Enterprise and Omniverse Enterprise software. Configurations will scale to four RTX 6000 Ada Generation GPUs, each with 48GB of memory, for up to 5,828 TFLOPS of AI performance and 192GB of GPU memory per system. OEMs will announce these systems in the fall.

Among other Nvidia items from Siggraph:

  • The company announced Nvidia OVX servers with the new Nvidia L40S GPU, which is designed for AI training and inference, 3D designs, visualization and video processing. The Nvidia L40S will be available starting in the fall. ASUS, Dell Technologies, GIGABYTE, HPE, Lenovo, QCT and Supermicro will offer OVX systems with L40S GPUs.
  • Nvidia launched a new release of Nvidia Omniverse for developers and enterprises using 3D tools and applications. Omniverse uses the OpenUSD framework and adds generative AI features. Additions include modular app building, new templates, better efficiency and native RTX spatial integration. Nvidia also launched new Omniverse Cloud APIs.
  • The company also rolled out frameworks, resources and services to speed up adoption of OpenUSD (Universal Scene Description). OpenUSD is a 3D framework that connects software tools, data types and APIs for building virtual worlds (a minimal example of OpenUSD authoring follows this list). The APIs include ChatUSD, an LLM copilot that lets developers ask questions and generate code; RunUSD, which translates OpenUSD files into rendered images; DeepSearch, an LLM service for semantic search through untagged assets; and USD-GDN Publisher, which publishes OpenUSD experiences to Omniverse Cloud in one click.
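
To make the OpenUSD framework concrete, here is a minimal sketch of scene authoring with OpenUSD's Python API (the pxr module, installable via the usd-core package); the file name and prim paths are illustrative:

```python
# Minimal OpenUSD authoring sketch: create a stage, define a transform
# with a sphere under it, then save the layer to disk.
from pxr import Usd, UsdGeom

stage = Usd.Stage.CreateNew("hello_world.usda")  # illustrative file name
world = UsdGeom.Xform.Define(stage, "/World")
sphere = UsdGeom.Sphere.Define(stage, "/World/Sphere")
sphere.GetRadiusAttr().Set(2.0)

stage.SetDefaultPrim(world.GetPrim())
stage.GetRootLayer().Save()
```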

Constellation Research’s take

Constellation Research analyst Andy Thurai said:

“Nvidia NeMo is an end-to-end framework for building foundation models, which can be a pain to build. Nvidia AI Workbench will allow for the cloning of AI projects and give developers a workspace to build LLMs.

The new service and interface will allow users to train or retrain models from Hugging Face in the Nvidia DGX Cloud, whether on a public cloud platform such as GCP or Azure or on an Nvidia private cloud. Notably missing was AWS.

Nvidia claims the compute power deployed today is built for older technologies and workloads. According to Nvidia, modern workloads must run on newer chips such as Nvidia GPUs and Grace Hopper superchips. To that end, the dual GH200, which combines a Grace CPU and a Hopper GPU into a single Grace Hopper superchip, is one of the most powerful processors ever. With that combination, Nvidia aims to lower the capital expense for a given amount of processing power (capex) and to cut energy consumption while delivering much faster inference, reducing operational costs (opex). If this dream were to come true, Nvidia could kill the mighty Intel's x86 business by demonstrating that Grace Hopper can process AI technologies, particularly AI training workloads, with 20x less power and at 12x less cost than comparable CPU-based processing technologies.

In short, Nvidia claims that AI workloads, both training and inferencing, run more efficiently on Nvidia-based chips than on offerings from rivals such as Intel and AI chip companies like SambaNova Systems.”
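
Taken at face value, the analyst's 20x power and 12x cost figures reduce to a simple back-of-the-envelope comparison; the baseline numbers below are made-up assumptions, not figures from Nvidia or Constellation Research.

```python
# Illustrative arithmetic only: what "20x less power, 12x less cost"
# would mean against a hypothetical CPU-based baseline.
cpu_power_kw = 1000.0  # assumed CPU-cluster power draw for a workload
cpu_cost_musd = 12.0   # assumed CPU-cluster cost, in millions of dollars

gh200_power_kw = cpu_power_kw / 20    # claim implies 50 kW
gh200_cost_musd = cpu_cost_musd / 12  # claim implies $1.0M
print(f"Implied GH200 footprint: {gh200_power_kw:.0f} kW, "
      f"${gh200_cost_musd:.1f}M")
```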
