NVIDIA DGX-1 (Pascal) Professional Computing Solution

8x Tesla P100

Product status: Official | Last Update: 2020-05-14
Overview
Manufacturer
NVIDIA
Original Series
Tesla Pascal
Release Date
April 5th, 2016
Graphics Processing Unit
GPU Model
8× GP100
Architecture
Pascal
Fabrication Process
16 nm
Die Size
8× 610 mm2
Transistor Count
8× 15.3B
Transistor Density
25.1M transistors/mm2
CUDA Cores
8× 3584 (28672)
SMs
8× 56 (448)
GPCs
8× 6 (48)
TMUs
8× 224 (1792)
ROPs
8× 96 (768)
Clocks
Base Clock
1328 MHz
Boost Clock
1480 MHz
Memory Clock
704 MHz
Effective Memory Clock
1408 Mbps
Memory Configuration
Memory Size
8× 16384 (131072) MB
Memory Type
HBM2
Memory Bus Width
8× 4096 (32768)-bit
Memory Bandwidth
8× 720.9 (5767.2) GB/s
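
The bandwidth figures above follow directly from the bus width and the effective data rate. A minimal CUDA/C++ sketch of that arithmetic, with constants simply mirroring the spec entries on this page:

    #include <cstdio>

    int main() {
        // Peak bandwidth = bus width (bits) / 8 * effective per-pin rate (Gbps).
        const double bus_width_bits = 4096.0;  // HBM2 bus per GP100
        const double effective_gbps = 1.408;   // 704 MHz DDR -> 1408 Mbps per pin
        const int    gpu_count      = 8;

        double per_gpu_gbs = bus_width_bits / 8.0 * effective_gbps;        // ~720.9 GB/s
        std::printf("Per GPU : %.1f GB/s\n", per_gpu_gbs);
        std::printf("System  : %.1f GB/s\n", per_gpu_gbs * gpu_count);     // ~5767.2 GB/s
        return 0;
    }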

Physical
Interface
SXM2
TDP/TBP
3200 W
API Support
DirectX
12.0
Vulkan
1.0
OpenGL
4.5
OpenCL
3.0

Performance
Pixel Fillrate
1.1 TPixel/s
Texture Fillrate
2.7 TTexel/s
Peak FP32
84.9 TFLOPS
Peak FP64
42.4 TFLOPS
FP32 Perf. per Watt
26.5 GFLOPS/W
FP32 Perf. per mm2
17.4 GFLOPS/mm2
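
The peak-rate and efficiency figures are likewise derived values: CUDA cores times two FLOPs per clock (one fused multiply-add) times the boost clock, with GP100 running FP64 at half the FP32 rate. A short CUDA/C++ sketch of the arithmetic, using only numbers from this page:

    #include <cstdio>

    int main() {
        const int    cores_per_gpu = 3584;
        const int    gpus          = 8;
        const double boost_ghz     = 1.480;
        const double tdp_watts     = 3200.0;     // whole-system TDP from the spec sheet
        const double die_mm2       = 8 * 610.0;  // combined GP100 die area

        // Cores * 2 FLOPs/clock * clock (GHz) gives GFLOPS; divide by 1e3 for TFLOPS.
        double fp32_tflops = cores_per_gpu * gpus * 2.0 * boost_ghz / 1e3;       // ~84.9
        std::printf("Peak FP32     : %.1f TFLOPS\n", fp32_tflops);
        std::printf("Peak FP64     : %.1f TFLOPS\n", fp32_tflops / 2.0);         // ~42.4
        std::printf("FP32 per watt : %.1f GFLOPS/W\n", fp32_tflops * 1e3 / tdp_watts);
        std::printf("FP32 per mm^2 : %.1f GFLOPS/mm^2\n", fp32_tflops * 1e3 / die_mm2);
        return 0;
    }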




Model                      Cores   Boost Clock   Memory Clock   Memory Config.
NVIDIA DGX-1 (Pascal)      28672   1480 MHz      1.4 Gbps       1024 GB HBM2 4096-bit
NVIDIA Tesla P40            3840   1531 MHz      7.2 Gbps       24 GB GDDR5 384-bit
NVIDIA Tesla P100 SXM2      3584   1480 MHz      1.4 Gbps       16 GB HBM2 4096-bit
NVIDIA Tesla P100 PCIe      3584   1328 MHz      1.4 Gbps       16 GB HBM2 4096-bit
NVIDIA Tesla P4             2560   1075 MHz      6 Gbps         8 GB GDDR5 256-bit

Model                      Cores   Boost Clock   Memory Clock   Memory Config.
NVIDIA DGX-1 (Pascal)      28672   1480 MHz      1.4 Gbps       1024 GB HBM2 4096-bit
NVIDIA Quadro GP100         3584   1442 MHz      1.4 Gbps       16 GB HBM2 4096-bit
NVIDIA Tesla P100 SXM2      3584   1480 MHz      1.4 Gbps       16 GB HBM2 4096-bit
NVIDIA Tesla P100 PCIe      3584   1328 MHz      1.4 Gbps       16 GB HBM2 4096-bit

NVIDIA today unveiled the NVIDIA DGX-1, the world’s first deep learning supercomputer to meet the unlimited computing demands of artificial intelligence. The NVIDIA DGX-1 is the first system designed specifically for deep learning — it comes fully integrated with hardware, deep learning software and development tools for quick, easy deployment. It is a turnkey system that contains a new generation of GPU accelerators, delivering the equivalent throughput of 250 x86 servers.

The DGX-1 deep learning system enables researchers and data scientists to easily harness the power of GPU-accelerated computing to create a new class of intelligent machines that learn, see and perceive the world as humans do. It delivers unprecedented levels of computing power to drive next-generation AI applications, allowing researchers to dramatically reduce the time to train larger, more sophisticated deep neural networks.

NVIDIA designed the DGX-1 for a new computing model to power the AI revolution that is sweeping across science, enterprises and increasingly all aspects of daily life. Powerful deep neural networks are driving a new kind of software created with massive amounts of data, which require considerably higher levels of computational performance. “Artificial intelligence is the most far-reaching technological advancement in our lifetime,” said Jen-Hsun Huang, CEO and co-founder of NVIDIA. “It changes every industry, every company, everything. It will open up markets to benefit everyone. Data scientists and AI researchers today spend far too much time on home-brewed high performance computing solutions. The DGX-1 is easy to deploy and was created for one purpose: to unlock the powers of superhuman capabilities and apply them to problems that were once unsolvable.”

Powered by Five Breakthroughs
The NVIDIA DGX-1 deep learning system is built on NVIDIA Tesla P100 GPUs, based on the new NVIDIA Pascal GPU architecture. It provides the throughput of 250 CPU-based servers, networking, cables and racks — all in a single box.

The DGX-1 features four other breakthrough technologies that maximize performance and ease of use. These include the NVIDIA NVLink high-speed interconnect for maximum application scalability; 16nm FinFET fabrication technology for unprecedented energy efficiency; Chip on Wafer on Substrate with HBM2 for big data workloads; and new half-precision instructions to deliver more than 21 teraflops of peak performance for deep learning.
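
The half-precision figure reflects GP100's paired FP16 arithmetic: each CUDA core can execute one fused multiply-add on a two-element half2 vector per clock, doubling the FP32 rate to roughly 21.2 TFLOPS per GPU, which is the "more than 21 teraflops" cited above. A minimal, hypothetical CUDA kernel (not taken from the DGX-1 software stack) showing the half2 intrinsics this relies on:

    #include <cuda_fp16.h>

    // Hypothetical example: one fused multiply-add on packed half2 values per thread.
    // Each __hfma2 instruction operates on two FP16 lanes at once, which is where
    // GP100's 2x-over-FP32 half-precision peak rate comes from.
    // Compile with: nvcc -arch=sm_60 (GP100 is compute capability 6.0).
    __global__ void fma_half2(const __half2* a, const __half2* b,
                              const __half2* c, __half2* out, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) {
            out[i] = __hfma2(a[i], b[i], c[i]);  // out = a * b + c, two FP16 lanes
        }
    }

Eight GPUs at roughly 21.2 TFLOPS each is also where the 170 teraflops system figure quoted in the specifications below comes from.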

Together, these major technological advancements enable DGX-1 systems equipped with Tesla P100 GPUs to deliver over 12x faster training than four-way NVIDIA Maxwell architecture-based solutions from just one year ago.

The Pascal architecture has strong support from the artificial intelligence ecosystem.
“NVIDIA GPU is accelerating progress in AI. As neural nets become larger and larger, we not only need faster GPUs with larger and faster memory, but also much faster GPU-to-GPU communication, as well as hardware that can take advantage of reduced-precision arithmetic. This is precisely what Pascal delivers,” said Yann LeCun, director of AI Research at Facebook.

Andrew Ng, chief scientist at Baidu, said: “AI computers are like space rockets: The bigger the better. Pascal’s throughput and interconnect will make the biggest rocket we’ve seen yet.”

“Microsoft is developing super deep neural networks that are more than 1,000 layers,” said Xuedong Huang, chief speech scientist at Microsoft Research. “NVIDIA Tesla P100’s impressive horsepower will enable Microsoft’s CNTK to accelerate AI breakthroughs.”

Comprehensive Deep Learning Software Suite
The NVIDIA DGX-1 system includes a complete suite of optimized deep learning software that allows researchers and data scientists to quickly and easily train deep neural networks.

The DGX-1 software includes the NVIDIA Deep Learning GPU Training System (DIGITS), a complete, interactive system for designing deep neural networks (DNNs). It also includes the newly released NVIDIA CUDA Deep Neural Network library (cuDNN) version 5, a GPU-accelerated library of primitives for designing DNNs.
It also includes optimized versions of several widely used deep learning frameworks — Caffe, Theano and Torch. The DGX-1 additionally provides access to cloud management tools, software updates and a repository for containerized applications.
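
As a rough illustration of what sits at the bottom of that stack, the sketch below shows the shape of a direct cuDNN call sequence: create a context, describe a tensor, and tear down. It is a minimal example only; on a DGX-1 these primitives are normally driven through DIGITS or one of the bundled frameworks rather than by hand.

    #include <cudnn.h>
    #include <cstdio>

    int main() {
        // Create a cuDNN context and describe an input tensor; a real workload
        // would go on to configure convolution, activation, and pooling primitives.
        cudnnHandle_t handle;
        cudnnCreate(&handle);

        cudnnTensorDescriptor_t x_desc;
        cudnnCreateTensorDescriptor(&x_desc);
        // A batch of 32 RGB images at 224x224, stored NCHW in FP32.
        cudnnSetTensor4dDescriptor(x_desc, CUDNN_TENSOR_NCHW, CUDNN_DATA_FLOAT,
                                   32, 3, 224, 224);

        // ... primitive calls (e.g. cudnnConvolutionForward) would go here ...

        cudnnDestroyTensorDescriptor(x_desc);
        cudnnDestroy(handle);
        std::printf("cuDNN context created and released\n");
        return 0;
    }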

System Specifications
The NVIDIA DGX-1 system specifications include:

  • Up to 170 teraflops of half-precision (FP16) peak performance
  • Eight Tesla P100 GPU accelerators, 16GB memory per GPU
  • NVLink Hybrid Cube Mesh
  • 7TB SSD DL Cache
  • Dual 10GbE, Quad InfiniBand 100Gb networking
  • 3U – 3200W

Optional support services for the NVIDIA DGX-1 improve productivity and reduce downtime for production systems. Hardware and software support provides access to NVIDIA deep learning expertise, and includes cloud management services, software upgrades and updates, and priority resolution of critical issues.

Availability
General availability for the NVIDIA DGX-1 deep learning system begins in June in the United States, and in other regions in the third quarter, direct from NVIDIA and select systems integrators.