NVIDIA RTX A5000 Professional Graphics Card

Product status: Official | Last Update: 2021-04-30
Overview
Manufacturer
NVIDIA
Original Series
Workstation Ampere
Launch Date
April 13th, 2021
Board Model
NVIDIA PG136/PG132
Graphics Processing Unit
GPU Model
GA102
Architecture
Ampere
Fabrication Process
8 nm (SAMSUNG 8N)
Die Size
628 mm²
Transistors Count
28.3B
Transistors Density
45.1M transistors/mm²
CUDA Cores
8192
Tensor Cores
256
RT Cores
64
SM
64
TMUs
256
ROPs
64
Clocks
Base Clock
1697 MHz
Boost Clock
1697 MHz
Memory Clock
2000 MHz
Effective Memory Clock
16000 Mbps
Memory Configuration
Memory Size
24576 MB
Memory Type
GDDR6
Memory Bus Width
384-bit
Memory Bandwidth
768.0 GB/s
Physical
Interface
PCI-Express 4.0 x16
Width
11.17 cm
Length
26.67 cm
Height
2-slot
Power Connectors
1× 8-pin
TDP/TBP
230 W
Recommended PSU
600 W
Multi-GPU Support
2-way
Display Outputs
DisplayPort
4 × (DP 1.4a)
API Support
DirectX
12.2
Vulkan
1.2
OpenGL
4.6
OpenCL
3.0

Performance
Pixel Fillrate
108.6 GPixels/s
Texture Fillrate
434.4 GTexel/s
Peak FP32
27.8 TFLOPS
FP32 Perf. per Watt
120.9 GFLOPS/W
FP32 Perf. per mm²
44.3 GFLOPS/mm²
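
The performance figures above appear to follow directly from the unit counts and the boost clock in the listing. The sketch below reproduces them under one assumption that is not stated in the listing: each CUDA core retires one fused multiply-add (2 FLOPs) per clock. The variable names are illustrative only.

```python
# Values copied from the spec listing above.
cuda_cores = 8192
tmus = 256
rops = 64
boost_clock_mhz = 1697
tbp_watts = 230
die_area_mm2 = 628

# Assumption (not in the listing): one fused multiply-add,
# counted as 2 FLOPs, per CUDA core per clock.
FLOPS_PER_CORE_PER_CLOCK = 2

boost_clock_ghz = boost_clock_mhz / 1000

pixel_fillrate = rops * boost_clock_ghz        # GPixels/s
texture_fillrate = tmus * boost_clock_ghz      # GTexels/s
peak_fp32_gflops = cuda_cores * FLOPS_PER_CORE_PER_CLOCK * boost_clock_ghz
peak_fp32_tflops = peak_fp32_gflops / 1000
fp32_per_watt = peak_fp32_gflops / tbp_watts   # GFLOPS/W
fp32_per_mm2 = peak_fp32_gflops / die_area_mm2 # GFLOPS per square mm

print(f"Pixel fillrate:   {pixel_fillrate:.1f} GPixels/s")
print(f"Texture fillrate: {texture_fillrate:.1f} GTexels/s")
print(f"Peak FP32:        {peak_fp32_tflops:.1f} TFLOPS")
print(f"FP32 per watt:    {fp32_per_watt:.1f} GFLOPS/W")
print(f"FP32 per mm2:     {fp32_per_mm2:.1f} GFLOPS/mm2")
```

These are theoretical peaks at the listed boost clock; sustained throughput in real workloads will generally be lower.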




Model              Cores   Boost Clock   Memory Clock   Memory Config.
NVIDIA RTX A6000   10752   1800 MHz      16 Gbps        48 GB G6 384b
NVIDIA RTX A5000   8192    1697 MHz      16 Gbps        24 GB G6 384b
NVIDIA RTX A4000   6144    1563 MHz      14 Gbps        16 GB G6 256b

Model                         Cores   Boost Clock   Memory Clock   Memory Config.
NVIDIA A40                    TBC     TBC           TBC            TBC
NVIDIA RTX A6000              10752   1800 MHz      16 Gbps        48 GB G6 384b
NVIDIA GeForce RTX 3090       10496   1695 MHz      19.5 Gbps      24 GB G6X 384b
NVIDIA GeForce RTX 3080 Ti    TBC     TBC           TBC            TBC
NVIDIA A10                    9216    1695 MHz      12.5 Gbps      24 GB G6 384b
NVIDIA GeForce RTX 3080       8704    1710 MHz      19 Gbps        10 GB G6X 320b
NVIDIA GeForce RTX 3080 LHR   8704    1710 MHz      19 Gbps        10 GB G6X 320b
NVIDIA CMP 90HX               TBC     TBC           TBC            TBC
NVIDIA RTX A5000              8192    1697 MHz      16 Gbps        24 GB G6 384b

NVIDIA Ampere Architecture

The NVIDIA RTX A5000 is one of the most powerful workstation GPUs NVIDIA offers, bringing high-performance real-time ray tracing, AI-accelerated compute, and professional graphics rendering to demanding professionals. Building upon the major SM (Streaming Multiprocessor) enhancements from the Turing GPU, the NVIDIA Ampere architecture enhances ray tracing operations, tensor matrix operations, and concurrent execution of FP32 and INT32 operations.

CUDA Cores

The NVIDIA Ampere architecture-based CUDA cores bring up to 2x the single-precision floating point (FP32) throughput compared to the previous generation, providing significant performance improvements for graphics workflows such as 3D model development and compute for workloads such as desktop simulation for computer-aided engineering (CAE). The RTX A5000 enables two FP32 primary data paths, doubling the peak FP32 operations.

Second Generation RT Cores

Incorporating second generation ray tracing engines, NVIDIA Ampere architecture-based GPUs provide incredible ray traced rendering performance. A single RTX A5000 board can render complex professional models with physically accurate shadows, reflections, and refractions to empower users with instant insight. Working in concert with applications leveraging APIs such as NVIDIA OptiX, Microsoft DXR and Vulkan ray tracing, systems based on the RTX A5000 will power truly interactive design workflows to provide immediate feedback for unprecedented levels of productivity. The RTX A5000 is up to 2x faster in ray tracing compared to the previous generation. This technology also speeds up the rendering of ray-traced motion blur for faster results with greater visual accuracy.

Third Generation Tensor Cores

Purpose-built for the deep learning matrix arithmetic at the heart of neural network training and inference, the RTX A5000 includes enhanced Tensor Cores that accelerate more data types and a new Fine-Grained Structured Sparsity feature that delivers up to 2x the throughput for tensor matrix operations compared to the previous generation. The new Tensor Cores accelerate two new precision modes, TF32 and BFloat16. Independent floating-point and integer data paths allow more efficient execution of workloads that mix computation with addressing calculations.
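
To make the structured sparsity idea concrete, here is a minimal sketch of the 2:4 pattern that NVIDIA documents for Ampere Tensor Cores: in every group of four weights, at most two are non-zero. The text above does not name the pattern, so treat the 2:4 ratio, the magnitude-based selection rule, and the function name `prune_2_of_4` as assumptions made for illustration, not as NVIDIA's pruning procedure.

```python
def prune_2_of_4(row):
    """Zero the two smallest-magnitude values in every group of four."""
    if len(row) % 4 != 0:
        raise ValueError("row length must be a multiple of 4")
    pruned = []
    for start in range(0, len(row), 4):
        group = row[start:start + 4]
        # Indices of the two largest-magnitude entries in this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


weights = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.01]
sparse = prune_2_of_4(weights)
print(sparse)  # [0.9, 0.0, 0.0, -0.7, 0.0, 0.3, -0.4, 0.0]
```

Because half of each group is known to be zero, the hardware can skip those multiplications, which is where the "up to 2x" figure comes from. In practice a pruned network is usually fine-tuned afterwards to recover accuracy.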

PCIe Gen 4

The RTX A5000 supports PCI Express Gen 4, which provides double the bandwidth of PCIe Gen 3, improving data-transfer speeds from CPU memory for data-intensive tasks like AI and data science.
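
The doubling can be checked with a short calculation. The sketch below assumes the published per-lane transfer rates of 8 GT/s (Gen 3) and 16 GT/s (Gen 4) and the 128b/130b line encoding both generations use; these figures come from the PCI Express specifications rather than from this page, and the function name is illustrative.

```python
def pcie_bandwidth_gb_s(transfer_rate_gt_s, lanes=16):
    """Theoretical one-direction bandwidth in GB/s for a PCIe Gen 3/4 link."""
    # 128b/130b encoding: 128 payload bits for every 130 bits on the wire.
    encoding_efficiency = 128 / 130
    return transfer_rate_gt_s * encoding_efficiency * lanes / 8


gen3_x16 = pcie_bandwidth_gb_s(8.0)    # roughly 15.75 GB/s
gen4_x16 = pcie_bandwidth_gb_s(16.0)   # roughly 31.5 GB/s

print(f"PCIe Gen 3 x16: {gen3_x16:.2f} GB/s per direction")
print(f"PCIe Gen 4 x16: {gen4_x16:.2f} GB/s per direction")
print(f"Gen 4 / Gen 3:  {gen4_x16 / gen3_x16:.1f}x")
```

These are link-level maxima; protocol overhead and the host platform will reduce what an application actually observes.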

Higher Speed GDDR6 Memory

The RTX A5000 is built with 24 GB of GDDR6 memory, delivering up to 71% greater throughput for ray tracing, rendering, and AI workloads than the previous generation. It provides a capacious graphics memory footprint to address the largest datasets and models in latency-sensitive professional applications.
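
As a rough check on these numbers, the sketch below derives the listed 768 GB/s from the 16 Gbps effective data rate and the 384-bit bus in the spec listing. The comparison card is an assumption: it takes "the previous generation" to mean the Quadro RTX 5000 with 14 Gbps GDDR6 on a 256-bit bus, which this page does not state.

```python
def memory_bandwidth_gb_s(data_rate_gbps, bus_width_bits):
    """Peak memory bandwidth: per-pin data rate times bus width, in bytes."""
    return data_rate_gbps * bus_width_bits / 8


# From the spec listing: 2000 MHz memory clock, 16000 Mbps effective rate.
memory_clock_mhz = 2000
effective_rate_gbps = memory_clock_mhz * 8 / 1000   # factor of 8 implied by the listing

a5000_bw = memory_bandwidth_gb_s(effective_rate_gbps, 384)

# Assumption: previous generation = Quadro RTX 5000 (14 Gbps, 256-bit).
previous_bw = memory_bandwidth_gb_s(14, 256)

uplift_percent = (a5000_bw / previous_bw - 1) * 100

print(f"RTX A5000:        {a5000_bw:.1f} GB/s")
print(f"Assumed previous: {previous_bw:.1f} GB/s")
print(f"Uplift:           {uplift_percent:.0f}%")
```

Under that assumption the uplift works out to about 71%, which matches the figure quoted above.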

Error Correcting Code (ECC) on Graphics Memory

Meet strict data integrity requirements for mission critical applications with uncompromised computing accuracy and reliability for workstations.

Fifth Generation NVDEC Engine

NVDEC is well suited for transcoding and video playback applications for real-time decoding. The following video codecs are supported for hardware-accelerated decoding: MPEG-2, VC-1, H.264 (AVCHD), H.265 (HEVC), VP8, VP9, and AV1.

Seventh Generation NVENC Engine

NVENC can take on the most demanding 4K or 8K video encoding tasks to free up the graphics engine and the CPU for other operations. The RTX A5000 provides better encoding quality than software-based x264 encoders.

Graphics Preemption

Pixel-level preemption provides more granular control to better support time-sensitive tasks such as VR motion tracking.

Compute Preemption

Preemption at the instruction-level provides finer grain control over compute tasks to prevent long-running applications from either monopolizing system resources or timing out.

NVIDIA RTX IO

RTX IO accelerates GPU-based lossless decompression by up to 100x and lowers CPU utilization by up to 20x compared to traditional storage APIs, using Microsoft’s new DirectStorage for Windows API. It moves data from storage to the GPU in a more efficient, compressed form, improving I/O performance.