GPU virtualization for private AI: Broadcom and AMD enable shared inferencing, trusted model sharing, vector DB and model manager atop VMware Cloud to meet SLAs and cut latency • DigiBanker

Broadcom Inc. and Advanced Micro Devices Inc. are engineering infrastructure purpose-built for private AI, with GPU virtualization at its core. Running on AMD’s latest processors and GPUs, the joint platform is designed to deliver an open, streamlined stack that meets enterprise demands for both scale and trust, according to Kumaran Siva, corporate vice president of strategic business development at AMD.

The partnership is enabling advanced virtualization across AI workloads. GPU virtualization allows multiple application servers to access trusted models and inference engines simultaneously, improving scalability across the AI private cloud, according to Paul Turner, vice president of products, VMware Cloud Foundation Division, at Broadcom.

“We are building an ability for our inferencing engine,” he said. “You can actually do shared inferencing at the same model. We already have model sharing that we can actually manage as trusted models that you share across your enterprise. Now you’re going to be able to deploy them onto your GPUs and actually have multiple application servers actually share those.”

The partnership is also developing a streamlined software stack to simplify private AI deployment. With components such as a vector database and model manager, the system supports GPU virtualization for rapid deployment and improved workload performance in the AI private cloud, according to Siva.

A key goal of the collaboration between Broadcom and AMD is to help enterprises extract more value from their hardware. With AI workloads running at scale, virtualization is essential to meeting SLAs, reducing latency and ensuring dynamic workload placement. These capabilities are foundational to a responsive private cloud environment, according to Turner.

The companies are also focused on optimizing the hardware-software stack for enterprise AI. Deep integration between AMD hardware and VMware Cloud Foundation enables seamless workload mobility and high concurrency across AI models. That level of infrastructure harmony is only possible with GPU virtualization at the core, according to Siva.
