A custom-built rack for the Maia 100 AI Accelerator and its “sidekick” inside a thermal chamber at a Microsoft lab in Redmond, Washington. The sidekick acts like an automotive radiator, cycling liquid to and from the rack to cool the chips as they handle the computational demands of AI workloads.
John Brecher | Microsoft
Microsoft unveiled two chips at its Ignite conference in Seattle on Wednesday.
The first, its Maia 100 artificial intelligence chip, could compete with Nvidia’s highly sought-after AI graphics processing units. The second, a Cobalt 100 Arm chip, is aimed at general computing tasks and could compete with Intel processors.
Cash-rich technology companies have begun giving their clients more options for the cloud infrastructure they can use to run applications. Alibaba, Amazon and Google have done this for years. Microsoft, with about $144 billion in cash at the end of October, had 21.5% cloud market share in 2022, behind only Amazon, according to one estimate.
Virtual-machine instances running on the Cobalt chips will become commercially available through Microsoft’s Azure cloud in 2024, Rani Borkar, a corporate vice president, told CNBC in an interview. She did not provide a timeline for releasing the Maia 100.
Google announced its original tensor processing unit for AI in 2016. Amazon Web Services revealed its Graviton Arm-based chip and Inferentia AI processor in 2018, and it announced Trainium, for training models, in 2020.
Specialized AI chips from cloud providers might be able to help meet demand when there’s a GPU shortage. But Microsoft and its peers in cloud computing aren’t planning to let companies buy servers containing their chips, unlike Nvidia or AMD.
The company built its chip for AI computing based on customer feedback, Borkar explained.
Microsoft is testing how Maia 100 stands up to the needs of its Bing search engine’s AI chatbot, the GitHub Copilot coding assistant and GPT-3.5-Turbo, a large language model from Microsoft-backed OpenAI, Borkar said. OpenAI has fed its language models with large quantities of information from the internet, and they can generate email messages, summarize documents and answer questions with a few words of human instruction.
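As a rough illustration of that “few words of human instruction” workflow, here is a minimal sketch using OpenAI’s Python client; the model name is the one the article mentions, but the prompt, document and code are illustrative assumptions, not anything Microsoft or OpenAI published:

    # Minimal sketch: a few words of instruction to GPT-3.5-Turbo.
    # Assumes the openai package and an OPENAI_API_KEY environment variable.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    document = "..."  # any long text to be summarized

    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user",
                   "content": f"Summarize this in two sentences:\n{document}"}],
    )
    print(response.choices[0].message.content)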
The GPT-3.5-Turbo model works in OpenAI’s ChatGPT assistant, which became popular soon after it was released last year. Then companies moved quickly to add similar chat capabilities to their software, increasing demand for GPUs.
“We have been working across the board and [with] all of our different suppliers to help improve our supply position and support many of our customers and the demand that they’ve put in front of us,” Colette Kress, Nvidia’s finance chief, said at an Evercore conference in New York in September.
OpenAI has previously trained its models on Nvidia GPUs in Azure.
In addition to designing the Maia chip, Microsoft has devised custom liquid-cooled hardware called Sidekicks that fit in racks right next to racks containing Maia servers. The company can install the server racks and the Sidekick racks without the need for retrofitting, a spokesperson said.
With GPUs, making the most of limited data center space can pose challenges. Companies sometimes put a few servers containing GPUs at the bottom of a rack like “orphans” to prevent overheating, rather than filling up the rack from top to bottom, said Steve Tuck, co-founder and CEO of server startup Oxide Computer. Companies sometimes add cooling systems to reduce temperatures, Tuck said.
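A back-of-the-envelope calculation shows why those “orphan” racks happen; every figure below is an assumed round number for illustration, not one from the article:

    # Illustrative arithmetic only: power, not floor space, fills a GPU rack.
    RACK_POWER_BUDGET_KW = 15.0  # assumed budget for one air-cooled rack
    GPU_SERVER_KW = 6.0          # assumed draw of one multi-GPU server
    CPU_SERVER_KW = 0.7          # assumed draw of one 1U general-purpose server

    print(int(RACK_POWER_BUDGET_KW // GPU_SERVER_KW))  # 2  -> rack mostly empty
    print(int(RACK_POWER_BUDGET_KW // CPU_SERVER_KW))  # 21 -> rack can be filled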
Microsoft might see faster adoption of the Cobalt processors than the Maia AI chips if Amazon’s experience is a guide. Microsoft is testing its Teams app and Azure SQL Database service on Cobalt. So far, they’ve performed 40% better than on Azure’s existing Arm-based chips, which come from startup Ampere, Microsoft said.
In the past year and a half, as prices and interest rates have moved higher, many companies have sought out methods of making their cloud spending more efficient, and for AWS customers, Graviton has been one of them. All of AWS’ top 100 customers are now using the Arm-based chips, which can yield a 40% price-performance improvement, Vice President Dave Brown said.
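As a worked example of what a 40% price-performance improvement means in practice (the dollar and throughput figures here are assumptions for illustration):

    # Price-performance = work done per dollar; +40% means 1.4x work per dollar.
    baseline_cost_per_hour = 1.00   # assumed x86 instance price
    baseline_reqs_per_hour = 1000   # assumed x86 throughput

    baseline_ppp = baseline_reqs_per_hour / baseline_cost_per_hour  # 1000 req/$
    graviton_ppp = baseline_ppp * 1.4                               # 1400 req/$

    # Equivalently, the same workload costs about 29% less: 1 - 1/1.4
    print(f"{graviton_ppp:.0f} req/$, {1 - 1/1.4:.0%} cheaper")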
Moving from GPUs to AWS Trainium AI chips can be more complicated than migrating from Intel Xeons to Gravitons, though. Each AI model has its own quirks. Many people have worked to make a wide variety of tools run on Arm because of its prevalence in mobile devices, and that’s less true in silicon for AI, Brown said. But over time, he said, he would expect organizations to see similar price-performance gains with Trainium in comparison with GPUs.
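For a sense of what that migration involves in code, here is a minimal sketch of pointing a standard PyTorch training step at an XLA device, the mechanism that PyTorch support for Trainium builds on; it is a hedged illustration under those assumptions, not AWS’s documented recipe:

    # Minimal sketch: a PyTorch step on an XLA device instead of a CUDA GPU.
    import torch
    import torch.nn as nn
    import torch_xla.core.xla_model as xm  # requires the torch-xla package

    device = xm.xla_device()  # replaces torch.device("cuda")

    model = nn.Linear(128, 10).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    x = torch.randn(32, 128, device=device)
    y = torch.randint(0, 10, (32,), device=device)

    optimizer.zero_grad()
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    xm.optimizer_step(optimizer)  # steps the optimizer and syncs the XLA graph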
“We have shared these specs with the ecosystem and with a lot of our partners in the ecosystem, which benefits all of our Azure customers,” Borkar said.
Borkar said she didn’t have details on Maia’s performance compared with alternatives such as Nvidia’s H100. On Monday, Nvidia said its H200 will start shipping in the second quarter of 2024.