
MLPerf: Google's Cloud TPUs and Nvidia's Tesla V100 set AI training records



Nvidia and Google Cloud set performance records in AI training time, according to the latest results from the MLPerf benchmark group. The benchmarks give AI practitioners common standards for measuring the performance and speed of the hardware used to train AI models.

MLPerf v0.6 examines the training performance of machine learning acceleration hardware in six common usage categories. Among the results announced today: Nvidia's Tesla V100 Tensor Core GPUs, deployed in an Nvidia DGX SuperPOD, completed ResNet-50 training for image classification in 80 seconds. By contrast, the same task took 8 hours to complete on a DGX-1 system in 2017. Reinforcement learning with Minigo, an open source implementation of the AlphaGo Zero model, completed in 13.5 minutes, also a new record.
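MLPerf scores training hardware by "time to train": the wall-clock time needed to reach a fixed quality target on a given task. As an illustration only (not the actual MLPerf harness), the toy sketch below times a pure-Python perceptron until it hits an invented accuracy target; the data, model, and threshold are all made up for the example.

```python
import random
import time

random.seed(0)

# Synthetic, linearly separable data: label is 1 when x0 + x1 > 0.
X = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(200)]
y = [1 if x0 + x1 > 0 else 0 for x0, x1 in X]

w = [0.0, 0.0]          # perceptron weights
b = 0.0                 # bias
lr = 0.1                # learning rate
TARGET_ACCURACY = 0.95  # the "quality target", in MLPerf terms

def accuracy():
    """Fraction of training points the current model classifies correctly."""
    correct = sum(
        (1 if w[0] * x0 + w[1] * x1 + b > 0 else 0) == label
        for (x0, x1), label in zip(X, y)
    )
    return correct / len(X)

# Time-to-train: run epochs until the quality target is reached,
# and report the elapsed wall-clock time.
start = time.perf_counter()
epochs = 0
while accuracy() < TARGET_ACCURACY:
    epochs += 1
    for (x0, x1), label in zip(X, y):
        pred = 1 if w[0] * x0 + w[1] * x1 + b > 0 else 0
        err = label - pred  # classic perceptron update
        w[0] += lr * err * x0
        w[1] += lr * err * x1
        b += lr * err
elapsed = time.perf_counter() - start

print(f"reached {accuracy():.0%} accuracy in {epochs} epoch(s), {elapsed:.4f}s")
```

The real benchmark applies the same idea at vastly larger scale: each MLPerf task (ResNet-50 on ImageNet, Transformer translation, Minigo, and so on) fixes a reference quality threshold, and submitters compete on how fast their hardware and software stack reaches it.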

Nvidia attributes its latest training benchmark results primarily to advances in software.

"In the space of just seven months on the same DGX-2 station, our customers can now benefit from performance up to 80% better, which is explained by all the software improvements. , all the work done by our ecosystem, "said a spokesman for the company. says in a phone call.

Google Cloud TPU v3 Pods also demonstrated record results, training the Transformer model for English-to-German machine translation in 51 seconds. The TPU pods also achieved record performance in the ResNet-50 image classification benchmark with the ImageNet dataset, and trained in another object detection category in 1 minute and 12 seconds.

Google Cloud TPU v3 Pods, which can harness the power of more than 1,000 TPU chips, first became available in public beta in May.

Submissions to the latest round of training benchmarks came from Intel, Google, and Nvidia. Nvidia and Google also demonstrated that they make some of the world's fastest hardware for training AI models when MLPerf shared its first training benchmark results in December 2018.

This news follows last month's launch of MLPerf's inference benchmarks for computer vision and language translation. Results of the first MLPerf inference benchmark will be reviewed in September and released to the public in October, MLPerf inference working group co-chair David Kanter told VentureBeat in a phone interview.

MLPerf is a group of 40 organizations that play key roles in the creation of AI hardware and models, including Amazon, Arm, Baidu, Google, and Microsoft.

