Microsoft has developed its own custom AI chip capable of training large language models, potentially sparing it a costly dependency on Nvidia, along with its own Arm-based processor for cloud workloads. The two custom silicon chips are intended to power Azure data centers and prepare the company and its enterprise customers for an AI-driven future; Microsoft has stated that they are therefore not for sale. Like other technology companies, Microsoft faces the high cost of providing AI services, which can run ten times more than conventional services such as search engines.
On Wednesday, at the Microsoft Ignite conference, Microsoft announced two custom chips designed to accelerate AI workloads internally across its Azure cloud computing service: Microsoft Azure Maia 100 AI Accelerator and the Microsoft Azure Cobalt 100 processor.
Microsoft designed Maia specifically to run large language models such as GPT-3.5 Turbo and GPT-4, which underlie the Azure OpenAI and Microsoft Copilot (formerly Bing Chat) services. Maia has 105 billion transistors and is manufactured on a 5 nm TSMC process. Meanwhile, Cobalt is a 128-core Arm processor designed to run conventional computing tasks, such as powering Microsoft Teams. Microsoft has no plans to sell either chip, preferring to keep them for internal use:
Statement from Microsoft
The chips will be deployed in Microsoft’s data centers early next year and will initially power the company’s services such as Microsoft Copilot or Azure OpenAI Service. They will join a growing portfolio of products from industry partners to help meet the growing demand for efficient, scalable and sustainable computing power, as well as the needs of customers seeking to benefit from the latest advances in cloud and AI.
The chips represent the final piece of the puzzle enabling Microsoft to deliver infrastructure systems – covering everything from chips, software and servers to racks and cooling systems – designed from the ground up so that internal and customer workloads can be optimized.
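For readers who want to see what the Azure OpenAI Service mentioned above looks like from the developer side, here is a minimal sketch of querying a deployed GPT model with the official openai Python SDK. The endpoint, deployment name and API version are placeholder assumptions, not values from the announcement:

```python
# Minimal sketch: calling a GPT deployment on Azure OpenAI Service.
# The endpoint, deployment name and API version below are hypothetical
# placeholders; substitute the values from your own Azure resource.
import os

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://my-resource.openai.azure.com",  # hypothetical resource
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",  # assumed version string
)

response = client.chat.completions.create(
    model="gpt-35-turbo",  # the name of your Azure deployment, not the raw model ID
    messages=[{"role": "user", "content": "Summarize Microsoft's Maia 100 announcement."}],
)
print(response.choices[0].message.content)
```

Whatever silicon ends up serving the request, the API surface stays the same; that is what lets Microsoft swap Maia in underneath services like this without customers changing code.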
A strategic decision…
Announced last year, the H100 is Nvidia’s latest flagship AI chip and follows the A100, a roughly $10,000 chip considered a workhorse for AI applications.
Developers use the H100 to build large language models (LLMs), which are at the heart of AI applications like OpenAI’s ChatGPT. These systems are expensive to operate, requiring powerful computers to process terabytes of data over days or weeks. They also rely on significant computing power so that the AI model can generate text, images or predictions.
Training AI models, especially large models like GPT, requires hundreds of high-end Nvidia GPUs working together.
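To make that claim concrete, here is a minimal sketch of how multi-GPU training is commonly wired up with PyTorch’s DistributedDataParallel. The toy model and data are illustrative stand-ins, not anything from Microsoft’s or OpenAI’s actual stacks, where hundreds of GPUs are sharded across many nodes:

```python
# Minimal sketch of multi-GPU data-parallel training with PyTorch DDP.
# Launch with: torchrun --nproc_per_node=<num_gpus> train.py
# The linear "model" and random data are illustrative placeholders.
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")  # NCCL handles GPU-to-GPU gradient sync
    device = dist.get_rank() % torch.cuda.device_count()
    torch.cuda.set_device(device)

    model = torch.nn.Linear(1024, 1024).to(device)  # stand-in for a transformer
    ddp_model = DDP(model, device_ids=[device])
    optimizer = torch.optim.AdamW(ddp_model.parameters(), lr=1e-4)

    for step in range(10):
        x = torch.randn(32, 1024, device=device)
        loss = ddp_model(x).square().mean()  # dummy loss for illustration
        loss.backward()  # gradients are all-reduced across every GPU here
        optimizer.step()
        optimizer.zero_grad()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Every GPU holds a full copy of the model and synchronizes gradients each step, which is why training scales with the number (and price) of accelerators available.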
Microsoft’s Azure Maia AI chip and Arm-powered Azure Cobalt processor will launch in 2024 after a surge in demand for Nvidia’s H100 GPUs this year. The demand for these GPUs is so high that some have even fetched over $40,000 on eBay.
Nvidia H100 GPUs go for $40,000 on eBay. pic.twitter.com/7NOBI8cn3k
— John Carmack (@ID_AA_Carmack) April 14, 2023
Microsoft actually has a long history in silicon development, says Rani Borkar, head of Azure hardware systems and infrastructure at Microsoft.
Microsoft helped develop silicon for the Xbox more than 20 years ago and even co-designed chips for its Surface devices. This effort builds on that experience, Borkar explains: “In 2017, we began architecting the cloud hardware stack and began this journey that put us on the right path to developing our new custom chips.”
The new Azure Maia AI chip and Azure Cobalt processor were both developed in-house at Microsoft, combined with a deep overhaul of the entire cloud server stack to optimize power, performance and cost. “We are rethinking cloud infrastructure for the AI era and optimizing literally every layer of that infrastructure,” says Borkar.
Rani Borkar, corporate vice president for Azure Hardware Systems and Infrastructure (AHSI) at Microsoft
…which also takes into account the chip shortage
With chip shortages driving up prices for Nvidia’s coveted AI GPUs, several companies have developed or are considering developing their own AI accelerator chips, including Amazon, OpenAI, IBM and AMD. Microsoft also felt the need to develop custom chips to bring its own services to the fore.
During its announcement, the company stated:
Statement from Microsoft
Chips are the workhorses of the cloud. They control billions of transistors that process the huge streams of ones and zeros flowing through data centers. This work ultimately allows you to do almost anything on your screen, from sending an email to generating an image in Bing with a simple phrase.
Just as you have control over every design choice and detail when building a home, Microsoft sees the addition of internally developed chips as a way to ensure every part is ready for Microsoft Cloud and AI workloads. The chips are housed on custom server boards and placed in custom racks that easily integrate into existing Microsoft data centers. Hardware will work hand in hand with co-developed software to unlock new capabilities and opportunities.
Develop hardware and software together
The company’s new Maia 100 AI accelerator will power some of the largest internal AI workloads running on Microsoft Azure. Additionally, OpenAI has provided feedback on Azure Maia, and Microsoft’s detailed insights into how OpenAI’s workloads run on infrastructure tailored to its large language models will help inform Microsoft’s future designs.
“Since our initial partnership with Microsoft, we have worked together to shape Azure’s AI infrastructure at every level to support our models and our unparalleled training needs,” said Sam Altman, CEO of OpenAI. “We were excited when Microsoft first revealed its designs for the Maia chip, and we worked together to refine and test it with our models. Azure’s end-to-end AI architecture, now optimized all the way down to the silicon with Maia, paves the way for training better models and making those models more cost-effective for our customers.”
The Maia 100 AI Accelerator is also designed specifically for the Azure hardware stack, said Brian Harry, a Microsoft engineer who leads the Azure Maia team. This vertical integration and alignment of chip design with the broader AI infrastructure designed specifically for Microsoft’s workloads could lead to huge performance and efficiency gains, he said.
“Azure Maia is designed specifically for AI and to achieve absolute maximum hardware utilization,” he said.
Meanwhile, the Cobalt 100 processor is based on the Arm architecture, an energy-efficient chip design, and is optimized to deliver superior efficiency and performance in cloud-native offerings, said Wes McCullough, corporate vice president of hardware product development. The choice of Arm technology was an important part of Microsoft’s sustainability goal: to optimize performance per watt across all its data centers, which essentially means delivering more computing power for each unit of energy consumed.
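Performance per watt is a simple ratio, but it is the metric that drives data-center economics. A back-of-the-envelope sketch, with entirely invented numbers (Microsoft has published no benchmarks for Maia or Cobalt):

```python
# Back-of-the-envelope performance-per-watt comparison.
# All figures below are invented for illustration only.
def perf_per_watt(throughput: float, power_watts: float) -> float:
    """Useful work per unit of energy, e.g. requests/s per watt."""
    return throughput / power_watts

chip_a = perf_per_watt(throughput=10_000, power_watts=400)  # 25.0 req/s per W
chip_b = perf_per_watt(throughput=12_000, power_watts=350)  # ~34.3 req/s per W

# Same workload, same rack power budget: the more efficient chip serves
# roughly 37% more requests for every watt the data center draws.
print(f"chip A: {chip_a:.1f} req/s/W, chip B: {chip_b:.1f} req/s/W")
print(f"efficiency gain: {chip_b / chip_a - 1:.0%}")
```

At data-center scale, an efficiency gain of that order compounds across thousands of servers, which is why Microsoft frames the Arm choice as a sustainability decision rather than just a cost one.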
Conclusion
No technology company is an island, and Microsoft is no exception. The company plans to continue relying on third-party chips, both to meet supply needs and, likely, to satisfy its tangled web of business partnerships. Microsoft will also add the latest Nvidia H200 Tensor Core GPU to its fleet next year to support larger model inference [sic] without an increase in latency, the company says, pointing to Nvidia’s recently announced AI processing GPU. It will also add AMD MI300X-accelerated virtual machines to Azure.
How do the new chips perform? Microsoft hasn’t released any benchmarks yet, but the company seems happy with the chips’ performance-per-watt ratios, particularly Cobalt’s. “We believe this will enable us to provide our customers with better, faster, lower cost and higher quality solutions,” said Scott Guthrie, executive vice president of Microsoft’s cloud and AI group.
Source: Microsoft
And you?
What are the advantages and disadvantages of developing your own AI chips compared to buying chips from external suppliers?
How could the Maia AI Accelerator be a game-changer for Microsoft in the AI space, especially against rivals like Google, Amazon and Meta?
What are the challenges and risks associated with producing custom AI chips, particularly in terms of cost, quality and security?
What are the potential application areas of the Maia AI Accelerator processor, both for developers and end users?
What are the environmental impacts of the production and use of high-performance AI chips? How can Microsoft reduce its carbon footprint in this area?