Google kicked off Google I/O this afternoon by speaking for more than an hour about its numerous advances in artificial intelligence. The company discussed its new PaLM 2 large language model (LLM) for generative AI, which powers the Bard chatbot tool. It is a foundational pillar for adding AI-infused features across Google’s product portfolio, including Google Maps, Google Photos, and Gmail (among others).
With that in mind, there’s a need for some serious horsepower in the cloud to power models in the wild, as millions (and eventually billions) of users send requests for operations ranging from the mundane, like removing a person lingering in the background of a picture, to composing an entire email for you based on a short text prompt. This is where Google’s new A3 GPU supercomputer comes into focus. Google says the new A3 supercomputers are “purpose-built to train and serve the most demanding AI models that power today’s generative AI and large language model innovation” while delivering 26 exaFlops of AI performance.
Each A3 supercomputer is packed with 4th generation Intel Xeon Scalable processors backed by 2TB of DDR5-4800 memory. But the real “brains” of the operation come from the eight Nvidia H100 “Hopper” GPUs, which have access to 3.6 TBps of bisectional bandwidth by leveraging NVLink 4.0 and NVSwitch.
According to Google, A3 represents the first production-level deployment of its GPU-to-GPU data interface, which allows for sharing data at 200 Gbps while bypassing the host CPU. This interface, which Google calls the Infrastructure Processing Unit (IPU), results in a 10x uplift in available network bandwidth for A3 virtual machines (VMs) compared to A2 VMs.
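The two bandwidth figures quoted above use different units: the NVLink/NVSwitch number is given in terabytes per second (TBps), while the IPU number is in gigabits per second (Gbps). As a rough sketch, the short Python snippet below converts both to a common unit. It assumes decimal prefixes (1 TB = 1,000 GB) and ignores protocol overhead; the function names are hypothetical helpers written for illustration, not part of any Google or Nvidia tooling.

```python
# Back-of-the-envelope unit conversion for the bandwidth figures quoted in the article.
# Assumptions (not from Google): decimal SI prefixes, no protocol overhead.

BITS_PER_BYTE = 8


def gigabits_to_gigabytes_per_s(gigabits_per_s: float) -> float:
    """Convert a rate in gigabits per second to gigabytes per second."""
    return gigabits_per_s / BITS_PER_BYTE


def terabytes_to_gigabits_per_s(terabytes_per_s: float) -> float:
    """Convert a rate in terabytes per second to gigabits per second."""
    return terabytes_per_s * 1000 * BITS_PER_BYTE


# Figures as quoted in the article.
ipu_gbps = 200.0      # GPU-to-GPU data interface, gigabits per second
nvlink_tbytes = 3.6   # NVLink 4.0 / NVSwitch bisectional bandwidth, terabytes per second

ipu_gbytes = gigabits_to_gigabytes_per_s(ipu_gbps)
nvlink_gbps = terabytes_to_gigabits_per_s(nvlink_tbytes)

print(f"IPU interface: {ipu_gbps:.0f} Gbps = {ipu_gbytes:.0f} GB/s")
print(f"NVLink/NVSwitch: {nvlink_tbytes} TBps = {nvlink_gbps:.0f} Gbps")
```

Under these assumptions, 200 Gbps works out to 25 gigabytes per second, and 3.6 TBps to 28,800 Gbps. The two figures describe different links, so the conversion only puts them on the same scale and is not a like-for-like performance comparison.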
“Google Cloud’s A3 VMs, powered by next-generation NVIDIA H100 GPUs, will accelerate training and serving of generative AI applications,” said Ian Buck, VP of hyperscale and high-performance computing at NVIDIA. “On the heels of Google Cloud’s recently launched G2 instances, we’re proud to continue our work with Google Cloud to help transform enterprises around the world with purpose-built AI infrastructure.”
If your business wants to leverage A3 virtual machines, the only way to gain access is by filling out Google’s A3 Preview Interest Form to join the Early Access Program. But as Google clearly states, plugging in your information doesn’t guarantee a spot in the program.