Today, MLCommons published results of its MLPerf Inference v3.1 performance benchmark for GPT-J, the 6 billion parameter large language model, as well as computer vision and natural language processing models.
Intel submitted results for Habana Gaudi2 accelerators, 4th Gen Intel Xeon Scalable processors, and Intel Xeon CPU Max Series. The results show Intel’s competitive performance for AI inference and reinforce the company’s commitment to making artificial intelligence more accessible at scale across the continuum of AI workloads, from client and edge to the network and cloud.
‘As demonstrated through the recent MLCommons results, we have a strong, competitive AI product portfolio, designed to meet our customers’ needs for high-performance, high-efficiency deep learning inference and training, for the full spectrum of AI models, from the smallest to the largest, with leading price/performance.’
Sandra Rivera, Intel executive vice president and general manager of the Data Center and AI Group
Why It Matters: Building on the MLCommons AI training update from June and the Hugging Face performance benchmarks that validate that Gaudi2 can outperform Nvidia’s H100 on a state-of-the-art vision language model, today’s results further reinforce that Intel offers the only viable alternative to Nvidia’s H100 and A100 for AI compute needs.
Every customer has unique considerations, and Intel is bringing AI everywhere with products that can address inference and training across the continuum of AI workloads. Intel’s AI products give customers flexibility and choice when selecting an optimal AI solution based on their own performance, efficiency and cost targets, while helping them break from closed ecosystems.
About the Habana Gaudi2 Results: The Habana Gaudi2 inference performance results for GPT-J provide strong validation of its competitive performance.
Gaudi2 inference performance on GPT-J-99 and GPT-J-99.9 is 78.58 queries per second in the server scenario and 84.08 samples per second in the offline scenario.
Gaudi2 delivers compelling performance vs. Nvidia’s H100, with H100 showing a slight advantage of 1.09x (server) and 1.28x (offline) performance relative to Gaudi2.
Gaudi2 outperforms Nvidia’s A100 by 2.4x (server) and 2x (offline).
The Gaudi2 submission employed FP8 and reached 99.9% accuracy on this new data type.
With Gaudi2 software updates released every six to eight weeks, Intel expects to continue delivering performance advancements and expanded model coverage in MLPerf benchmarks.
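The ratios quoted above can be turned into approximate absolute figures with simple arithmetic. The sketch below is illustrative only: it assumes 78.58 is the server figure and 84.08 the offline figure, and it backs out implied H100 and A100 throughput from the rounded ratios in this release. The variable and function names are hypothetical, and the derived values are estimates, not official MLPerf results.

```python
# Gaudi2 GPT-J throughput as quoted in the release (per second).
# Assumption: 78.58 is the server figure and 84.08 is the offline figure.
GAUDI2 = {"server": 78.58, "offline": 84.08}

# Ratios quoted in the release (rounded, so derived values are approximate).
H100_OVER_GAUDI2 = {"server": 1.09, "offline": 1.28}  # H100 relative to Gaudi2
GAUDI2_OVER_A100 = {"server": 2.4, "offline": 2.0}    # Gaudi2 relative to A100


def implied_throughput():
    """Back out approximate H100 and A100 throughput from the quoted ratios."""
    table = {}
    for scenario, gaudi2 in GAUDI2.items():
        table[scenario] = {
            "gaudi2": gaudi2,
            "h100": gaudi2 * H100_OVER_GAUDI2[scenario],
            "a100": gaudi2 / GAUDI2_OVER_A100[scenario],
        }
    return table


if __name__ == "__main__":
    for scenario, row in implied_throughput().items():
        cells = "  ".join(f"{name}={value:.2f}" for name, value in row.items())
        print(f"{scenario:8s} {cells}")
```

Because the published ratios are rounded to two or three significant figures, the implied H100 and A100 numbers should be read as rough estimates rather than measured results.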
About the Intel Xeon Results
Intel submitted all seven inference benchmarks, including GPT-J, on 4th Gen Intel Xeon Scalable processors. These results show great performance for general-purpose AI workloads, including vision, language processing, speech and audio translation models, as well as the much larger DLRM v2 recommendation and GPT-J models. Additionally, Intel remains the only vendor to submit public CPU results with industry-standard deep learning ecosystem software.
The 4th Gen Intel Xeon Scalable processor is ideal for building and deploying general-purpose AI workloads with the most popular AI frameworks and libraries. For the GPT-J 100-word summarization task of a news article of approximately 1,000 to 1,500 words, 4th Gen Intel Xeon processors summarized two paragraphs per second in offline mode and one paragraph per second in real-time server mode.
For the first time, Intel submitted MLPerf results for Intel Xeon CPU Max Series, which provides up to 64 gigabytes (GB) of high-bandwidth memory. For GPT-J, it was the only CPU able to achieve 99.9% accuracy, which is critical for applications in which the highest accuracy is of paramount importance.
Intel collaborated with its original equipment manufacturer (OEM) customers to deliver their own submissions, further showcasing AI performance scalability and wide availability of general-purpose servers powered by Intel Xeon processors that can meet customer service level agreements (SLAs).
About Intel
Intel (Nasdaq: INTC) is an industry leader, creating world-changing technology that enables global progress and enriches lives. Inspired by Moore’s Law, we continuously work to advance the design and manufacturing of semiconductors to help address our customers’ greatest challenges. By embedding intelligence in the cloud, network, edge and every kind of computing device, we unleash the potential of data to transform business and society for the better.
Contact:
Tel: 312-360-5123
Fax: 312-601-4332
Email: internet.queries@computershare.com
Web: www.computershare.com
(C) 2023 Electronic News Publishing, source ENP Newswire
