NVIDIA RTX A4000
Sleek Design. Powerful Performance.
The NVIDIA Ampere architecture builds on the power of NVIDIA RTX to deliver the next generation of accelerated visual computing. As millions of professionals continue to work from anywhere, they rely on a wide range of devices to deliver the power and performance they need to work effectively.
The NVIDIA RTX A4000 is the most powerful single-slot GPU for professionals, delivering real-time ray tracing, AI-accelerated compute, and high-performance graphics to your desktop. Built on the NVIDIA Ampere architecture, the RTX A4000 combines 48 second-generation RT Cores, 192 third-generation Tensor Cores, and 6144 CUDA cores with 16 GB of graphics memory, so you can engineer next-generation products, design cityscapes of the future, and create the immersive entertainment experiences of tomorrow, today, from your desktop workstation. And with a power-efficient, single-slot PCIe form factor that fits into a wide range of workstation chassis, you can do exceptional work without limits.
Incredible Application Performance
Experience fast, interactive performance powered by the latest NVIDIA Ampere architecture-based GPU with ultra-fast, on-board graphics memory technology and optimized software drivers for professional applications.
The NVIDIA RTX A4000 includes 48 RT Cores to accelerate photorealistic ray-traced rendering up to 2x faster than the previous generation. Hardware-accelerated Motion BVH (bounding volume hierarchy) improves motion blur rendering performance by up to 7x compared to the previous generation.
With 192 Tensor Cores to accelerate AI workflows, the RTX A4000 provides the compute power necessary for AI development and training workloads, as well as inferencing deployments.
Ensure hardware compatibility and stability through NVIDIA support of the latest OpenGL, DirectX, Vulkan, and CUDA standards, deep independent software vendor (ISV) developer engagements, and certification with over 100 professional software applications.
NGC support gives engineers, researchers, and data scientists access to NVIDIA-tuned, tested, certified, and maintained containers for the top deep learning frameworks, as well as third-party managed high-performance computing (HPC) containers, NVIDIA HPC visualization containers, and partner applications.
Rich, Expansive Visual Workspace
Experience stunning imagery through movie-quality anti-aliasing techniques, high-dynamic range (HDR) color support, higher refresh rates, and up to 8K screen resolution at 60 Hz from a single cable with the DisplayPort 1.4a standard using Display Stream Compression (DSC).
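To illustrate why Display Stream Compression is needed for 8K at 60 Hz over one cable, the following rough sketch compares the uncompressed pixel data rate with the payload capacity of a DisplayPort 1.4a link. The link figures (four lanes at 8.1 Gbit/s with 8b/10b coding, known as HBR3) come from the DisplayPort standard rather than this datasheet, and blanking intervals are ignored, so treat the result as an approximation.

```python
# Rough estimate: uncompressed 8K60 data rate vs. DisplayPort 1.4a capacity.
# Assumption: blanking intervals are ignored, so the true requirement is
# slightly higher than the figure computed here.

WIDTH, HEIGHT = 7680, 4320   # 8K resolution quoted in the datasheet
REFRESH_HZ = 60
BITS_PER_PIXEL = 36          # 36 bpp, as listed in the specification table

# DisplayPort 1.4a HBR3 link parameters (from the DisplayPort standard,
# not from this datasheet).
LANES = 4
GBPS_PER_LANE = 8.1
ENCODING_EFFICIENCY = 8 / 10  # 8b/10b line coding


def uncompressed_gbps(width, height, refresh_hz, bits_per_pixel):
    """Raw pixel data rate in Gbit/s, without blanking."""
    return width * height * refresh_hz * bits_per_pixel / 1e9


def link_payload_gbps(lanes, gbps_per_lane, efficiency):
    """Usable link payload in Gbit/s after line coding overhead."""
    return lanes * gbps_per_lane * efficiency


required = uncompressed_gbps(WIDTH, HEIGHT, REFRESH_HZ, BITS_PER_PIXEL)
available = link_payload_gbps(LANES, GBPS_PER_LANE, ENCODING_EFFICIENCY)

print(f"Uncompressed 8K60 at 36 bpp: {required:.2f} Gbit/s")
print(f"DisplayPort 1.4a payload:    {available:.2f} Gbit/s")
print(f"Minimum compression ratio:   {required / available:.2f}:1")
```

Under these assumptions the uncompressed stream needs roughly 2.8 times the available link payload, which is the gap DSC closes.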
Enhance your desktop workspace experience with NVIDIA RTX Desktop Manager and NVIDIA Mosaic technology. Work across four displays on every NVIDIA RTX professional card with intuitive placement of windows, multiple virtual desktops, and user profiles.
Use advanced multi-display technologies like Quadro Sync II, NVIDIA Mosaic, and Warp and Blend to synchronize images and scale resolution on a display surface with multiple projectors or screens.
Value for IT Administrators
Experience higher-quality products driven by power-efficient hardware and components selected for optimum operational performance, durability, and longevity.
Remotely monitor and manage NVIDIA professional products in your enterprise by integrating the NVIDIA Enterprise Management Toolkit (NVWMI) in your IT asset management framework.
Scale up NVIDIA RTX Enterprise driver deployment to hundreds of workstations using NVWMI's powerful driver installer.
Simplify software driver deployment through a regular cadence of long-life, stable driver releases based on a robust feature-development and quality-assurance process.
NVIDIA Ampere Architecture
The NVIDIA RTX A4000 is one of the most powerful workstation GPUs NVIDIA offers, bringing high-performance real-time ray tracing, AI-accelerated compute, and professional graphics rendering to demanding professionals. Building upon the major SM (Streaming Multiprocessor) enhancements from the Turing GPU, the NVIDIA Ampere architecture enhances ray tracing operations, tensor matrix operations, and concurrent execution of FP32 and INT32 operations.
The NVIDIA Ampere architecture-based CUDA cores bring up to 2x the single-precision floating point (FP32) throughput compared to the previous generation, providing significant performance improvements for graphics workflows such as 3D model development and compute for workloads such as desktop simulation for computer-aided engineering (CAE). The RTX A4000 enables two FP32 primary data paths, doubling the peak FP32 operations.
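As a back-of-the-envelope check on the FP32 figures, the sketch below relates the CUDA core count to the peak single-precision number listed in the specification table (6144 cores, 19.2 TFLOPS). It assumes each CUDA core retires one fused multiply-add (two floating-point operations) per clock, which is the usual convention for quoting peak throughput. The datasheet does not state a clock speed, so the clock printed here is only the value implied by those two numbers.

```python
# Relating CUDA core count to peak FP32 throughput.
# Assumption: one fused multiply-add (2 FLOPs) per CUDA core per clock.

CUDA_CORES = 6144          # from the specification table
PEAK_FP32_TFLOPS = 19.2    # from the specification table
FLOPS_PER_CORE_PER_CLOCK = 2


def peak_tflops(cores, clock_ghz):
    """Peak FP32 throughput in TFLOPS for a given core count and clock."""
    return cores * FLOPS_PER_CORE_PER_CLOCK * clock_ghz * 1e9 / 1e12


def implied_clock_ghz(cores, tflops):
    """Clock (GHz) that would yield the quoted peak under the FMA assumption."""
    return tflops * 1e12 / (cores * FLOPS_PER_CORE_PER_CLOCK) / 1e9


clock = implied_clock_ghz(CUDA_CORES, PEAK_FP32_TFLOPS)
print(f"Implied clock: {clock:.4f} GHz")
print(f"Round trip:    {peak_tflops(CUDA_CORES, clock):.1f} TFLOPS")
```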
Second Generation RT Cores
Incorporating second generation ray tracing engines, NVIDIA Ampere architecture-based GPUs provide incredible ray traced rendering performance. A single RTX A4000 board can render complex professional models with physically accurate shadows, reflections, and refractions to empower users with instant insight. Working in concert with applications leveraging APIs such as NVIDIA OptiX, Microsoft DXR and Vulkan ray tracing, systems based on the RTX A4000 will power truly interactive design workflows to provide immediate feedback for unprecedented levels of productivity. The RTX A4000 is up to 2x faster in ray tracing compared to the previous generation. This technology also speeds up the rendering of ray-traced motion blur for faster results with greater visual accuracy.
Third Generation Tensor Cores
Purpose-built for the deep learning matrix arithmetic at the heart of neural network training and inferencing functions, the RTX A4000 includes enhanced Tensor Cores that accelerate more datatypes and adds a new Fine-Grained Structured Sparsity feature that delivers up to 2x throughput for tensor matrix operations compared to the previous generation. The new Tensor Cores also accelerate two new precision modes, TF32 and BFloat16. Independent floating-point and integer data paths allow more efficient execution of workloads using a mix of computation and addressing calculations.
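The idea behind fine-grained structured sparsity can be sketched in a few lines. NVIDIA's Ampere documentation describes the pattern as 2:4, meaning at most two non-zero values in every group of four consecutive weights; that detail is not stated in this datasheet. The helper below, `prune_2_of_4`, is a hypothetical illustration of magnitude-based pruning to that pattern, not NVIDIA's tooling.

```python
# Illustrative 2:4 structured pruning: in every group of four weights,
# keep the two with the largest magnitude and zero the other two.
# `prune_2_of_4` is a hypothetical helper written for this example.

def prune_2_of_4(weights):
    """Return a copy of `weights` pruned to a 2:4 sparsity pattern."""
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    pruned = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Indices of the two largest-magnitude entries in this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


row = [0.1, -0.9, 0.3, 0.05, 1.0, 2.0, -3.0, 0.5]
print(prune_2_of_4(row))
# prints [0.0, -0.9, 0.3, 0.0, 0.0, 2.0, -3.0, 0.0]
```

Because exactly half of each group is zero, hardware can skip those operands, which is where the "up to 2x" throughput figure for sparse tensor operations comes from.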
PCIe Gen 4
The RTX A4000 supports PCI Express Gen 4, which provides double the bandwidth of PCIe Gen 3, improving data-transfer speeds from CPU memory for data-intensive tasks like AI and data science.
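The doubling can be checked with a short calculation. The per-lane transfer rates (8 GT/s for Gen 3, 16 GT/s for Gen 4) and the 128b/130b line coding are taken from the PCI Express specifications, not from this datasheet. The results are theoretical per-direction maxima for an x16 link, so real transfers will be somewhat lower.

```python
# Theoretical per-direction bandwidth of an x16 PCIe link.
# Transfer rates and 128b/130b coding come from the PCIe specifications;
# protocol overhead above the line coding is not modeled.

def pcie_gbytes_per_s(gt_per_s, lanes=16, encoding=128 / 130):
    """Usable bandwidth in GB/s for one direction of a PCIe link."""
    return gt_per_s * encoding * lanes / 8  # 8 bits per byte


gen3 = pcie_gbytes_per_s(8.0)    # PCIe Gen 3: 8 GT/s per lane
gen4 = pcie_gbytes_per_s(16.0)   # PCIe Gen 4: 16 GT/s per lane

print(f"PCIe Gen 3 x16: {gen3:.2f} GB/s")
print(f"PCIe Gen 4 x16: {gen4:.2f} GB/s")
print(f"Ratio:          {gen4 / gen3:.1f}x")
```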
Higher Speed GDDR6 Memory
The RTX A4000 is built with 16 GB of GDDR6 memory, delivering up to 23% greater throughput for ray tracing, rendering, and AI workloads than the previous generation. It provides a large graphics memory footprint to address the largest datasets and models in latency-sensitive professional applications.
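The memory figures in the specification table (a 256-bit interface and 448 GB/sec of bandwidth) can be related with simple arithmetic. The sketch below derives the per-pin data rate those two numbers imply. The datasheet does not quote the per-pin rate directly, so the result is an inference from the table values.

```python
# Relating memory bus width to total bandwidth.
# Inputs are from the specification table; the per-pin rate is derived.

BUS_WIDTH_BITS = 256       # memory interface width
BANDWIDTH_GB_PER_S = 448   # quoted memory bandwidth


def per_pin_gbps(bandwidth_gb_per_s, bus_width_bits):
    """Data rate per pin in Gbit/s implied by bandwidth and bus width."""
    return bandwidth_gb_per_s * 8 / bus_width_bits


def bandwidth_gb_per_s(bus_width_bits, pin_gbps):
    """Total bandwidth in GB/s for a given bus width and per-pin rate."""
    return bus_width_bits * pin_gbps / 8


rate = per_pin_gbps(BANDWIDTH_GB_PER_S, BUS_WIDTH_BITS)
print(f"Implied data rate: {rate:.0f} Gbit/s per pin")
print(f"Round trip:        {bandwidth_gb_per_s(BUS_WIDTH_BITS, rate):.0f} GB/s")
```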
Error Correcting Code (ECC) on Graphics Memory
Meet strict data integrity requirements for mission critical applications with uncompromised computing accuracy and reliability for workstations.
Fifth Generation NVDEC Engine
NVDEC is well suited for transcoding and video playback applications for real-time decoding. The following video codecs are supported for hardware-accelerated decoding: MPEG-2, VC-1, H.264 (AVCHD), H.265 (HEVC), VP8, VP9, and AV1.
Seventh Generation NVENC Engine
NVENC can take on the most demanding 4K or 8K video encoding tasks to free up the graphics engine and the CPU for other operations. The RTX A4000 provides better encoding quality than software-based x264 encoders.
Graphics Preemption
Pixel-level preemption provides more granular control to better support time-sensitive tasks such as VR motion tracking.
Compute Preemption
Preemption at the instruction level provides finer-grained control over compute tasks to prevent long-running applications from either monopolizing system resources or timing out.
NVIDIA RTX IO
RTX IO accelerates GPU-based lossless decompression performance by up to 100x, with 20x lower CPU utilization compared to traditional storage APIs, using Microsoft's new DirectStorage for Windows API. RTX IO moves data from storage to the GPU in a more efficient, compressed form, improving I/O performance.
Compatible with all systems that accept an NVIDIA RTX A4000
Architecture NVIDIA Ampere Architecture
Process Size 8nm
Transistors 17.4 Billion
Die Size 392.5 mm²
CUDA Cores 6144
Tensor Cores 192
RT Cores 48
Single Precision Performance 19.2 TFLOPS
RT Core Performance 37.4 TFLOPS
Tensor Performance 153.4 TFLOPS
GPU Memory 16 GB GDDR6 with ECC
Memory Interface 256-bit
Memory Bandwidth 448 GB/sec
Display Connectors 4x DisplayPort 1.4a
NVENC | NVDEC 1x | 1x (+ AV1 decode)
System Interface PCI Express 4.0 x16
Form Factor 4.4" H x 9.5" L, Single Slot
Thermal Solution Active Fansink
Maximum Power Consumption 140 W
NVIDIA Quadro and RTX Power Guidelines
Power Connector 1x 6-pin PCIe
Max Digital Resolution 7680 x 4320 x 36 bpp at 60 Hz
NVIDIA Quadro and RTX Display Resolution Support
NVIDIA 3D Vision and 3D Vision Pro Support via 3-pin mini DIN
Frame Lock Optional NVIDIA Quadro Sync II