GPU computing: Accelerating the deep learning curve




Artificial Intelligence (AI) may be what everyone is talking about, but getting involved is far from simple. You will need a more than decent grounding in mathematics and theoretical data science, an understanding of neural networks and the fundamentals of deep learning, and a good working knowledge of the tools needed to turn those theories into practical models and applications.

You will also need an abundance of processing power – beyond what is required by even the most demanding standard applications. One way to get this is via the cloud but, because deep learning models can take days or even weeks to deliver the goods, this can prove extremely expensive. In this article, therefore, we examine local alternatives and explain why the humble graphics controller is now an essential accessory for the would-be AI developer.

Enter the GPU

If you're reading this, it's safe to assume you know what a Central Processing Unit (CPU) is and how powerful the latest Intel and AMD chips are. But if you're an AI developer, CPUs alone are not enough. They can do the processing, but the sheer amount of unstructured data that needs to be analysed to build and train deep learning models can leave them maxed out for weeks. Even multi-core CPUs struggle with deep learning, which is where the Graphics Processing Unit (GPU) comes in.

Again, you probably know GPUs well. But to summarise, we're talking about specialised processors originally developed to handle complex image processing – for example, to let us watch high-definition movies, play 3D multiplayer games or enjoy virtual reality simulations. GPUs are particularly suited to processing arrays of data – something CPUs have a hard time with – which also makes them a good fit for specialised workloads such as deep learning. Moreover, many more of these specialised GPU cores can be crammed onto a die than CPU cores: while an Intel Xeon currently tops out at 28 cores per socket, a GPU can have thousands – all capable of working on AI data simultaneously.

Because all these cores are highly specialised, they can't run an operating system or handle general business logic, so you will always need one or more CPUs as well. What GPUs can do, however, is massively accelerate processes such as deep learning, offloading the heavy number-crunching from the CPU onto thousands of GPU cores.
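To make that concrete, here is a minimal CUDA sketch (the names and sizes are our own, purely for illustration) of the data-parallel pattern that deep learning frameworks rely on: the CPU stages the data and launches the work, while each GPU thread processes a single array element.

```cuda
#include <cuda_runtime.h>

// Each of potentially thousands of concurrent threads handles one element.
__global__ void vectorAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global element index
    if (i < n) c[i] = a[i] + b[i];
}

// Host (CPU) side: copy inputs to GPU memory, launch enough 256-thread
// blocks to cover all n elements, then copy the result back.
void addOnGpu(const float *a, const float *b, float *c, int n) {
    float *da, *db, *dc;
    size_t bytes = n * sizeof(float);
    cudaMalloc(&da, bytes); cudaMalloc(&db, bytes); cudaMalloc(&dc, bytes);
    cudaMemcpy(da, a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(db, b, bytes, cudaMemcpyHostToDevice);
    vectorAdd<<<(n + 255) / 256, 256>>>(da, db, dc, n);
    cudaMemcpy(c, dc, bytes, cudaMemcpyDeviceToHost);
    cudaFree(da); cudaFree(db); cudaFree(dc);
}
```

Real deep learning workloads replace the trivial addition with large matrix multiplications, but the division of labour between CPU and GPU is the same.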

The GPU in practice

So much for the theory. When it comes to practice, a number of GPU vendors offer products aimed at everything from gaming to the specialist High Performance Computing (HPC) and AI markets. This market has largely been shaped by Nvidia, whose Pascal GPU architecture has long been the model for others to follow.

In terms of actual products, you can get into AI for a relatively modest outlay using a gaming GPU. An Nvidia GeForce GTX 1060, for example, can be had for just £270 (including VAT) and offers 1,280 CUDA cores – CUDA being Nvidia's core GPU technology. That sounds like a lot, but in reality it's far from enough to meet the needs of serious AI developers.

For professional AI use, Nvidia has much more powerful and scalable GPUs based on Pascal and on its newer architecture, Volta, which supplements CUDA cores with Nvidia's new Tensor Core technology, designed specifically for deep learning. Tensor Cores can deliver up to 12 times the peak teraflops (TFLOPS) of their CUDA equivalents for deep learning training, and 6 times the throughput for inference – when deep learning models are actually put to use.
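For developers, Volta's Tensor Cores are exposed directly through the WMMA (warp matrix multiply-accumulate) API introduced in CUDA 9. The sketch below shows the basic pattern – a single warp multiplying a 16×16 half-precision tile with FP32 accumulation – although in practice most people reach Tensor Cores indirectly, via libraries such as cuDNN and cuBLAS.

```cuda
#include <cuda_fp16.h>
#include <mma.h>
using namespace nvcuda;

// One warp computes acc = A * B + acc on a 16x16x16 tile: FP16 inputs with
// FP32 accumulation, the mixed-precision mode the Tensor Cores execute.
__global__ void tensorCoreTile(const half *a, const half *b, float *c) {
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> aFrag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> bFrag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> accFrag;

    wmma::fill_fragment(accFrag, 0.0f);
    wmma::load_matrix_sync(aFrag, a, 16);            // leading dimension 16
    wmma::load_matrix_sync(bFrag, b, 16);
    wmma::mma_sync(accFrag, aFrag, bFrag, accFrag);  // runs on Tensor Cores
    wmma::store_matrix_sync(c, accFrag, 16, wmma::mem_row_major);
}

// Launch with one warp, e.g. tensorCoreTile<<<1, 32>>>(dA, dB, dC);
// and compile for Volta: nvcc -arch=sm_70
```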

The first Volta-based product is the Tesla V100, which features 640 of the new AI-specific Tensor Cores in addition to 5,120 general-purpose CUDA cores, all supported by 16GB or 32GB of second-generation HBM2 memory.


In addition to a PCIe adapter, the Tesla V100 is available as an SXM module to plug into NVIDIA's high-speed NVLink bus.


Image: Nvidia

The V100 is available as a standard PCIe adapter (starting at around £7,500) or as a smaller SXM module designed to plug into a special motherboard socket which, alongside PCIe connectivity, allows multiple GPUs to be connected together using Nvidia's high-speed NVLink bus technology. Originally developed for the first-generation (Pascal-based) Tesla GPU products, NVLink has since been upgraded to support up to six links per GPU with a combined bandwidth of 300GB/sec. NVLink is also available on the new Quadro adapter and other Volta-based products; and, such is the pace of change in this market, there is now a switched interconnect – NVSwitch – allowing up to 16 GPUs to be linked with a bandwidth of 2.4TB/sec.
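From the software side, these interconnects show up through CUDA's peer-to-peer API, which lets one GPU read and write another's memory directly rather than staging transfers through host RAM. A minimal sketch (assuming a two-GPU system; whether the traffic actually travels over NVLink or PCIe depends on the machine's topology):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int canAccess = 0;
    cudaDeviceCanAccessPeer(&canAccess, 0, 1);  // can GPU 0 reach GPU 1?
    if (canAccess) {
        cudaSetDevice(0);
        cudaDeviceEnablePeerAccess(1, 0);       // flags arg is reserved: 0
        // cudaMemcpyPeer() and direct kernel access now bypass host memory.
        printf("Peer-to-peer enabled between GPU 0 and GPU 1\n");
    } else {
        printf("No peer-to-peer path between GPU 0 and GPU 1\n");
    }
    return 0;
}
```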

AI

Of course, GPUs by themselves are of little use, and when it comes to serious AI and other HPC applications there are several ways to deploy them. The first is to buy the individual GPUs plus all the other components needed to build a complete system, and assemble it yourself. However, few professional buyers will be happy to take the DIY route, most preferring a ready-made – and, more importantly, vendor-supported – solution, either from Nvidia or one of its partners.

Naturally, all the ready-made solutions use the same GPU technology, but they package it in different ways. So, to get an idea of what's on offer, we looked at what Nvidia sells and at a Supermicro-based alternative from Boston.


Take your pick: Nvidia (bottom) and Boston (top) deep learning servers in the same rack.


Image: Alan Stevens / ZDNet

The Nvidia AI family

Nvidia wants to be known as 'the AI computing company', and under its DGX brand sells a pair of servers (the DGX-1 and the newer, more powerful DGX-2) plus an AI workstation (the DGX Station), all built around Tesla V100 GPUs.


The sleek Nvidia DGX range of AI-ready platforms is powered by Tesla V100 GPUs.


Image: Nvidia

Delivered in distinctive gold-trimmed housings, the DGX servers and workstation are ready-to-use solutions that combine a standard hardware configuration with an integrated DGX software stack – a preloaded Ubuntu Linux operating system plus the mix of frameworks and development tools required to build AI models.

We looked first at the DGX-1 (recommended price $149,000), which is housed in a 3U rackmount chassis. Unfortunately, the one in the Boston lab was busy building real models so, apart from an external shot, we couldn't take pictures of our own. From others we have seen, however, we know the DGX-1 to be a fairly standard rackmount server with four redundant power supplies. It's also standard on the inside, with a conventional dual-socket motherboard equipped with a pair of 20-core Intel Xeon E5-2698 v4 processors plus 512GB of DDR4 RAM.

A 480GB SSD is used for the operating system and DGX software stack, with a storage array of four 1.92TB SSDs for data. Additional storage can be added if needed, while network connectivity comes courtesy of four Mellanox InfiniBand EDR adapters plus a pair of 10GbE NICs. There's also a dedicated Gigabit Ethernet interface for IPMI remote management.


We couldn't open up the DGX-1 as it was busy training, but here it is hard at work in Boston Limited's labs.


Image: Alan Stevens / ZDNet

The all-important GPUs have a home of their own, on an NVLink board with eight sockets fully populated with Tesla V100 SXM2 modules. The first version came with only 16GB of dedicated HBM2 memory per module, but the DGX-1 can now be specified with 32GB modules.

Whatever the memory configuration, with its eight GPUs the DGX-1 has 40,960 CUDA cores to put to work, plus 5,120 AI-specific Tensor Cores. According to Nvidia, this equates to 960 teraflops of AI computing power, which is said to make the DGX-1 the equivalent of 25 racks of conventional CPU-only servers.
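Those headline numbers are simply the per-GPU figures multiplied up: eight V100s at 5,120 CUDA cores and 640 Tensor Cores each. A quick sketch of how you might tally this from software, assuming the Volta figures of 64 FP32 CUDA cores and 8 Tensor Cores per streaming multiprocessor (the CUDA runtime reports SM counts, not core counts):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    int cudaCores = 0, tensorCores = 0;
    for (int d = 0; d < count; ++d) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, d);
        // Cores per SM are architecture-specific; 64 and 8 apply to Volta.
        cudaCores   += prop.multiProcessorCount * 64;
        tensorCores += prop.multiProcessorCount * 8;
        printf("GPU %d: %s, %d SMs\n", d, prop.name, prop.multiProcessorCount);
    }
    printf("Total: %d CUDA cores, %d Tensor Cores\n", cudaCores, tensorCores);
    return 0;
}
```

On a DGX-1 this should report eight 80-SM V100s, giving the 40,960 and 5,120 totals quoted above.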

It should also be noted that Tensor Cores aren't the only advance: even on their CUDA cores alone, Tesla V100 GPUs are up to three times faster than the Pascal-based P100 products.

DGX-1 buyers also benefit from 24/7 support, updates and on-site maintenance direct from Nvidia, although this doesn't come cheap at $23,300 for one year or $66,500 for three. Still, given the complex demands of AI, many will see that as good value, and in the UK customers should expect to pay around £123,000 (ex. VAT) for a fully equipped DGX-1 with a year's support.

AI becomes personal


The elegant DGX Station on a bench in Boston Limited's labs.


Image: Alan Stevens / ZDNet

Unfortunately, the new DGX-2, with its 16 GPUs and the new NVSwitch, wasn't available in time for our review, but we did get to look at the DGX Station, which is designed to provide a more affordable platform on which to develop and iterate deep neural networks. This HPC workstation will also appeal to businesses wanting an AI development platform before moving up to on-premises or cloud-based DGX servers.

Housed in a tower chassis, the DGX Station is based on an Asus motherboard with a single 20-core Xeon E5-2698 v4 rather than the two in the DGX-1 server. System memory is also halved, at 256GB and, instead of eight GPUs, the DGX Station has four Tesla V100 modules, implemented as PCIe cards but with a full NVLink interconnect linking them together.

Storage is split between a 1.92TB system SSD and an array of three similar drives for data. Two 10GbE ports provide the necessary network connectivity, and there are three DisplayPort interfaces for local displays at up to 4K resolution. Water cooling comes as standard, and the end result is a very quiet and extremely impressive workstation.


We did get to see inside the elegant DGX Station, where there's a single Xeon processor, 256GB of RAM, four Tesla V100 GPUs and plenty of plumbing for the water cooling.


Image: Alan Stevens / ZDNet

With half the number of GPUs, the DGX Station delivers 480 teraflops of AI computing power. That's half of what you get from the DGX-1 server, but it's a lot more affordable, priced at $69,000 plus $10,800 for a year of 24/7 support or $30,800 for three years.

British buyers will need to find around £59,000 (ex. VAT) for the hardware from an Nvidia partner with a one-year support contract, although we have seen a number of promotions – including a 'buy one, get one free' offer! – that are worth looking out for. Educational discounts are also available.

Boston Anna Volta XL

The third product we looked at was the recently launched Anna Volta XL from Boston. This is effectively the equivalent of the Nvidia DGX-1 and is similarly powered by two Xeons plus eight Tesla V100 SXM2 modules, all configured inside a Supermicro rack-mounted server with many more customisation options than the DGX-1.


Boston's Anna Volta XL packs two Xeon processors and eight Tesla V100 GPUs into a customisable Supermicro server platform.


Image: Supermicro

A little larger than the Nvidia server, the Anna Volta XL is a 4U platform with redundant (2+2) power supplies and separate drawers for the conventional CPU server and its GPU subsystem. Any Xeon with a TDP of 205W or less can be specified – including the latest Skylake processors, which Nvidia has yet to offer in its DGX-1 product.


Alongside the Xeons there are 24 DIMM slots, taking up to 3TB of DDR4 system memory and, for storage, sixteen 2.5-inch drive bays that can accommodate 16 SATA/SAS drives or 8 NVMe disks. Network connectivity is via two 10GbE ports, with a dedicated port for IPMI remote management. You also get six PCIe slots (four in the GPU tray and two in the CPU tray), so you can add InfiniBand or Omni-Path connectivity as required.

The GPU drawer itself is quite spartan, occupied by a Supermicro NVLink motherboard with sockets for the Tesla V100 SXM2 modules, each topped with a large heatsink. GPU performance is naturally the same as on the DGX-1, although overall system throughput will depend on the Xeon CPU/RAM configuration.


The all-important Tesla V100 modules are mounted on an NVLink board in the top of the Boston Anna Volta server (one of the heatsinks has been removed for the photo).


Image: Alan Stevens / ZDNet

Anna Volta pricing is much lower than the Nvidia server's: Boston quotes $119,000 for a similar specification to the DGX-1 (a saving of $30,000 on the current price). For UK buyers, that works out at around £91,000 (excluding VAT). The AI software stack isn't included in the Boston price, but most of what's required is open source; Boston also offers a range of competitively priced maintenance and support services.

And that's pretty much the state of this booming market. In terms of GPU hardware there's really little to choose between the products we reviewed, so it comes down to preference and budget. And with other vendors preparing to join the fray, prices are already starting to fall as demand for these specialised AI platforms grows.

RECENT AND RELATED CONTENT

Nvidia reveals 32GB Titan V
Nvidia makes a special 32GB edition of its most powerful PC graphics card, the Titan V.

Google Cloud extends its GPU portfolio with Nvidia Tesla V100
Nvidia Tesla V100 GPUs are now publicly available in beta on Google Compute Engine and Kubernetes Engine.

Nvidia Expands the New GPU Cloud to HPC Applications
With more than 500 high-performance computing applications integrating GPU acceleration, Nvidia aims to make them easier to access.

The NVIDIA HGX-2 combines AI and HPC for next-generation computing (TechRepublic)
NVIDIA's new GPU computing appliance.

Nvidia brings its fastest GPU accelerator to IBM Cloud to boost AI, HPC workloads (TechRepublic)
This combination can help businesses and data scientists create cloud-native applications that generate new business value.
