MLCommons releases results from its latest MLPerf AI inference benchmark test


MLCommons today released the latest results of its MLPerf Inference benchmark test, which compares the speed of artificial intelligence systems from different hardware manufacturers.

MLCommons is an industry organization that develops open-source AI tools. As part of that work, it runs benchmark tests that compare the speed of AI-optimized hardware systems. The results help data center operators evaluate products from different vendors when purchasing new hardware.

The results released today come from the latest edition of the MLPerf Inference test. MLPerf Inference is designed to measure how well a data center system performs inference, the task of running an already trained AI model.

More than 20 companies participated in the latest edition of the test. Participants included Nvidia Corp., the leading provider of data center graphics processors, as well as Intel Corp. and several other major chipmakers.

The companies compared the speed of their AI systems by having them perform inference using six neural networks. Each of the six focuses on a different use case: image classification, object detection, medical image segmentation, speech-to-text, natural language processing and e-commerce recommendations.
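As a rough illustration of what "comparing speed" means here, the sketch below times a stand-in model over a batch of inputs and reports samples per second. It is not the MLPerf harness. The names `measure_throughput` and `dummy_model` are hypothetical and exist only for this example.

```python
import time

def measure_throughput(run_inference, samples):
    """Return samples processed per second for a given inference callable.

    Hypothetical helper for illustration only; real benchmark harnesses
    control load patterns, warm-up and latency constraints as well.
    """
    start = time.perf_counter()
    for sample in samples:
        run_inference(sample)
    elapsed = time.perf_counter() - start
    return len(samples) / elapsed

def dummy_model(x):
    # Stand-in for a trained neural network: a fixed arithmetic transform.
    return sum(v * 0.5 for v in x)

inputs = [[float(i)] * 64 for i in range(1000)]
throughput = measure_throughput(dummy_model, inputs)
print(f"{throughput:.0f} samples/second")
```

Running the same routine against two different systems with the same model and inputs gives two throughput numbers that can be compared directly.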

Participants in the MLPerf Inference test generated 5,300 individual performance results, 37% more than in the previous round. They also submitted 2,400 measurements of the power their systems consumed while performing inference.

Nvidia’s flagship data center GPU, the H100, set several performance records during testing. The H100 (pictured) can perform certain inference tasks up to 30 times faster than Nvidia’s previous flagship data center GPU. It has more than 80 billion transistors, as well as a number of machine learning optimizations not found in the company’s previous products.

“Making its debut in the MLPerf industry-standard AI benchmarks, NVIDIA H100 Tensor Core GPUs set world records in inference across all workloads and delivered up to 4.5x the performance of previous generation GPUs,” Dave Salvator, senior product marketing manager at Nvidia, wrote in a blog post today. “The H100, also known as Hopper, raised the bar for performance per accelerator across all six neural networks in the round.”

Compared with Nvidia’s previous-generation flagship GPU, the H100 delivered its most significant performance increase when running the BERT-Large neural network. BERT-Large is a neural network optimized for natural language processing. It is based on the Transformer architecture, an approach to designing AI models that is widely used in that field.

Nvidia’s H100 chip contains a module specially optimized for running AI models based on the Transformer architecture. According to Nvidia, the module reduces the amount of data that neural networks have to process to produce results. The less data a neural network has to process to complete a calculation, the faster it can make decisions.

The H100 isn’t the only product that Nvidia evaluated in the MLPerf Inference test. The company also tested the speed of its Jetson Orin system-on-chip, an energy-efficient processor designed to power robots. The processor delivered five times the peak performance of its predecessor while consuming half the power.
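Taken together, the two Jetson Orin figures imply a gain in performance per watt. The short sketch below works through that arithmetic. It assumes the "five times" and "half the power" figures apply to the same workload, which the article does not state explicitly, and the function name `relative_efficiency` is hypothetical.

```python
def relative_efficiency(performance_ratio, power_ratio):
    """Performance-per-watt gain of a new chip relative to an old one.

    performance_ratio: new performance divided by old performance
    power_ratio: new power draw divided by old power draw
    Assumes both ratios were measured on the same workload.
    """
    return performance_ratio / power_ratio

# Figures reported in the article: 5x the performance at half the power.
gain = relative_efficiency(performance_ratio=5.0, power_ratio=0.5)
print(f"Performance per watt: {gain:.0f}x the predecessor")
```

Under that assumption, the chip would do roughly 10 times as much work per watt as its predecessor.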

Image: Nvidia
