Nvidia’s Ada Lovelace GPU generation has arrived for visual computing professionals. Nvidia has begun its transition to the next generation of GPU microarchitecture and products, again named after a technologist of yesteryear. Referred to by some as the world’s first computer programmer, Ada Lovelace was a 19th century mathematician who contributed to the evolution of the Babbage mechanical computer.
Succeeding the Ampere generation, Nvidia’s Ada Lovelace GPU microarchitecture focuses on traditional visual markets like CAD, heavily employing 3D graphics and rendering. Nvidia’s first product SKU for a new generation of GPU typically enters at the top end of the market, and Ada’s debut was no exception, showcased by the RTX 6000 Ada Generation GPU: a 300 watt, double-width full-length PCI Express add-in card supported by 48GB of onboard graphics memory.
Available initially in the $8,000 range, consuming two PCI Express slots and up to 300W of power, it’s not a product to fit the budgets and systems for the vast majority of the CAD user base. But as is the norm, a GPU like the RTX 6000 Ada Generation represents just the first incarnation of a new architecture and refreshed product line, and its capabilities will foreshadow the visual processing gains the mainstream of that user base will experience as Ada technology propagates down the RTX GPU line.
The dual-slot RTX 6000 Ada Generation, the first professional RTX GPU built on Ada Lovelace. Image source: Nvidia.
The Ada Generation Architecture and Technology
Big picture, Nvidia’s Ada Generation implements about 2X the raw hardware capabilities of Ampere, the company’s previous generation focused on visual processing, a figure now measured across three hardware metrics, not just one. As described in depth with the introduction of Turing (the predecessor of Ampere, which in turn preceded Ada), Nvidia’s GPUs allocate dedicated hardware acceleration not just for traditional raster-based 3D graphics, but also for physically based rendering via the tracing of rays through a scene. Prior to Turing, GPUs had overwhelmingly been measured by the ability of their shader architecture (supported by things like memory and I/O, of course) to crank through conventional 3D graphics processing. But Turing introduced two new engines to specifically accelerate ray tracing and machine learning: RT cores (with those two initials the foundation of the RTX brand) and Tensor cores, respectively. It’s worth emphasizing that those Tensor cores also serve an impactful role in speeding the resolve and refinement stages in rendering, as first covered in this column here.
Specifically, like all generations before it, Ada products are built on a foundation of not just one but several silicon chips implementing the architecture. The first chip designed and fabricated is typically the biggest (most transistors and on-chip data storage and processing units) and most powerful, and that’s the case with Ada, as the flagship AD102 chip is the engine upon which the RTX 6000 Ada Generation add-in card is built.
| GPU | RTX 6000 Ada Generation | RTX A6000 |
| --- | --- | --- |
| Architecture | Ada | Ampere |
| Flagship* GPU chip | AD102 | GA102 |
| Process Size | TSMC 4N NVIDIA Custom Process | Samsung 8 nm 8N NVIDIA Custom Process |
| Transistors | 76.3 billion | 28.3 billion |
| Die Size | 608.4 mm² | 628.4 mm² |
| CUDA Parallel Processing Cores | 18,176 | 10,752 |
| Nvidia Tensor Cores | 568 | 336 |
| Nvidia RT Cores | 142 | 84 |
| GPU Memory | 48GB GDDR6 with ECC | 48GB GDDR6 with ECC |
| Memory Interface | 384-bit | 384-bit |
| Memory Bandwidth | 960GB/s | 768GB/s |
| Max Power Consumption | 300W | 300W |
| I/O Interface | PCI Express 4.0 x16 | PCI Express 4.0 x16 |
| Display Connectors | DP 1.4 (4) | DP 1.4 (4) |
| Form Factor | 4.4” H x 10.5” L Dual Slot | 4.4” H x 10.5” L Dual Slot |
* Usually the biggest, most powerful and first-to-market, with cost-reduced derivatives following.
Salient hardware metrics for the flagship GPU chips behind the RTX 6000 Ada Generation GPU and its predecessor, the RTX A6000. Data source: Nvidia.
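To put the table's raw figures in generational terms, a minimal Python sketch can compute the Ada-to-Ampere ratio for several rows. The numbers are copied from the table above; the dictionary and function names are illustrative only. These are unit-count ratios, so they do not capture clock speeds or per-unit improvements and should not be expected to match the "about 2X" figure exactly.

```python
# Figures copied from the comparison table above:
# (RTX 6000 Ada Generation, RTX A6000).
SPECS = {
    "CUDA cores": (18_176, 10_752),
    "Tensor cores": (568, 336),
    "RT cores": (142, 84),
    "Memory bandwidth (GB/s)": (960, 768),
    "Transistors (billions)": (76.3, 28.3),
}


def generational_ratio(ada: float, ampere: float) -> float:
    """Return how many times larger the Ada figure is than the Ampere figure."""
    return ada / ampere


ratios = {
    name: generational_ratio(ada, ampere)
    for name, (ada, ampere) in SPECS.items()
}

for name, ratio in ratios.items():
    print(f"{name:<26} {ratio:.2f}x")
```

The three core counts grow by the same factor, while memory bandwidth grows by a smaller one, which is one reason a single multiplier cannot summarize the generational change.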
Ada Propagates Across the RTX Product Lines for Fixed and Mobile Workstations
Nvidia has at least four more Ada chip derivatives already in production, trimmed down from the flagship AD102: the AD103, AD104, AD106, and AD107, with resources scaled down from the full AD102’s 18,432 shader cores (of which the RTX 6000 Ada Generation enables 18,176) to 10,240, 7,680, 4,608, and 3,072, respectively. In fact, one of those was recently tapped to create a lower-cost RTX add-in card sibling, the RTX 4000 SFF (indicating a card size optimized for Small Form Factor fixed workstations).
Built around the AD104 and available with a street price around $1,400 currently, the RTX 4000 SFF — like its 4000-class predecessors — sits just above the mainstream mid-range of the market, though I can guarantee its sales volume will show it’s far from some boutique SKU. The RTX 4000 SFF price will probably drift closer to $1,000 in coming months, while Nvidia likely adds some more SKUs below (such as 3000, 2000, or 1000) and probably a 5000 between the 6000 and 4000. Ada for fixed workstations should be available to the breadth of the CAD community in relatively short order.
Benchmarking the RTX 6000 Ada Generation — An Indication of What’s to Come for the CAD Mass Market
Of course, remember that while a simple blanket statement like “2X the raw hardware performance” gives a general indication of how much more real-world performance a component can deliver for the workloads the end user cares about, that indication is by no means definitive. First off, how close any component gets to its theoretical maximum throughput depends on how well its internal hardware resources are utilized. So while a device may have 100% more aggregate capability (processing cores, cache, and registers, for example), in reality that 100% might translate to closer to 50% higher performance running actual data and applications (and, worth adding, I would deem a measurable 50% gain a very commendable generation-to-generation achievement). Second, even if the GPU can manage to run its visual processing at a 50% higher clip, that gain won’t mean much if other links in the computing chain (CPU, memory, storage, or OS, for example) are bogging down.
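The second caveat can be illustrated with Amdahl's law, which limits the overall speedup to the share of the work that actually gets faster. The sketch below is a simplified model, not a measurement: the function name and the GPU-bound fractions are hypothetical, chosen only to show how a 50% faster GPU plays out when the rest of the system sets the pace.

```python
def overall_speedup(gpu_fraction: float, gpu_speedup: float) -> float:
    """Amdahl's law: only the GPU-bound share of the work is accelerated.

    gpu_fraction -- share of total run time spent waiting on the GPU (0..1)
    gpu_speedup  -- how much faster the new GPU runs that share (1.5 = +50%)
    """
    if not 0.0 <= gpu_fraction <= 1.0:
        raise ValueError("gpu_fraction must be between 0 and 1")
    return 1.0 / ((1.0 - gpu_fraction) + gpu_fraction / gpu_speedup)


# Hypothetical workloads: fully GPU-bound, mostly GPU-bound, and CPU/storage-bound.
for fraction in (1.0, 0.9, 0.5, 0.2):
    gain = overall_speedup(fraction, gpu_speedup=1.5) - 1.0
    print(f"GPU-bound share {fraction:4.0%}: overall gain {gain:5.1%}")
```

In this model, a workload that spends only half its time on the GPU sees the 50% GPU gain shrink to roughly 20% overall.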
Still, with those caveats in mind, the best way to ascertain a new GPU’s potential, particularly in comparison to previous generations, remains benchmarking. When it comes to more traditional, interactive 3D graphics, SPEC’s Viewperf (in latest version 2020) is the go-to benchmark for CAD and other applications heavy in professional visual processing. It generates real-time 3D graphics scenes typical of interactive design, using sample viewsets from applications including CATIA, SolidWorks, Creo, Siemens NX, and 3ds Max, pulled directly from real-world projects in manufacturing, design, engineering, and architecture.
A sampling of SPECviewperf 2020’s CAD-oriented viewsets, including from CATIA and SolidWorks. Image source: SPEC.
Testing with SPECViewperf2020 yielded the following results, run on the same high-performance system, swapping in both the RTX 6000 Ada Generation and previous Ampere-generation RTX A6000, while including the A6000’s predecessor Quadro RTX 6000 for context. The RTX 6000 Ada Generation GPU ran through the CAD-oriented viewsets on average 37% faster than the RTX A6000, with individual speedups ranging from 12% up to 68%.
SPECviewperf 2020 results for CAD-relevant viewsets, normalized to the Ampere Generation RTX A6000.
To reiterate, the RTX 6000 Ada Generation won’t end up in many CAD users’ machines, but it’s very reasonable to expect that upcoming lower-cost Ada-spawned GPUs, like the just-released RTX 4000 SFF and inevitable SKUs to come below on the price curve, will exhibit very similar performance improvements over their respective Ampere predecessor SKUs. That is, a solid gain in the neighborhood of the RTX 6000 Ada Generation’s 37% bump in 3D graphics throughput is something all CAD users will likely see with new workstations or GPU upgrades in the coming months as well.
3D Graphics is No Longer the Only, or Even Most Critical, Visual Processing Workload: See How Much the Ada Generation Improves GPU Rendering, Compute, and Machine-Learning
While still the foundation of most visual computing workflows, 3D graphics is no longer the only GPU function to assess. With the advent of on-chip ray-tracing hardware, along with the GPU’s ever-improving aptitude in GPU computation and machine learning, there’s more value to be exploited in a next-generation product.
Let’s start with arguably the next most valued function, GPU rendering. With additional processing resources across the board (programmable shaders, RT cores, and Tensor cores), we should expect a healthy increase in rendering throughput. Results from testing both the RTX 6000 Ada Generation and previous-gen RTX A6000 confirm as much, with the former outperforming the latter by 79% on the popular OctaneBench benchmark (from Otoy’s OctaneRender). Moreover, with OctaneRender able to exploit multiple GPUs, I was able to test the performance of a pair of the A6000s, finding that one RTX 6000 Ada Generation nearly matches the capabilities of two of its predecessors.
OctaneBench rendering benchmark results for the RTX 6000 Ada generation, normalized to the RTX A6000.
A Timely Datapoint: CPU vs. GPU Rendering
Those relying on true physically based rendering to achieve the ultimate image quality for virtual prototypes may not be aware of the hardware that is generating those images. Rendering options and add-ons in common CAD applications don’t make it clear whether it’s the CPU or the GPU doing the work behind the scenes. Looking back to the origins of computational rendering, it’s always been the CPU, while the “fixed function” GPUs of the ’90s and ’00s focused exclusively on 3D graphics patterned after APIs like OpenGL and DirectX. Those GPUs couldn’t render, so popular studio-quality applications like RenderMan performed everything in software, running on the CPU.
But over time, the GPU gradually evolved into a more programmable device, letting it take on non-graphics workloads, including rendering. Nvidia’s subsequent move (first in Turing) to incorporate dedicated RT and Tensor cores took that rendering capability to dramatic new levels, positioning an RTX-class GPU as the premier engine for optimized ray-traced processing. Today, insight into which device does the rendering, and the performance levels it’s capable of, is front and center when configuring a CAD machine that will be rendering more often. Do you still render fairly rarely, with modest-complexity scenes? Or has HQ rendering become a must-have and frequent staple in your workflow? For a CAD buyer looking to select the appropriate combination of CPU and GPU in a new workstation, it sure would be nice to have some bearing on their respective rendering capabilities.
Cue the benchmark provided for Blender, a popular open-source visual processing application that incorporates a high-quality renderer capable of harnessing either the CPU or a compatible GPU, if available in the system. With access to Intel’s latest 56-core Xeon W9-3495X, the vendor’s fastest processor for multi-threaded applications (like rendering), I had the ideal opportunity to compare the rendering performance of a new, top-end GPU like the RTX 6000 Ada Generation with that of a new, top-end CPU.
Blender Cycles rendering benchmark results for the RTX 6000 Ada Generation and RTX A6000 GPUs, normalized to the 56-core Xeon W9-3495X CPU.
Remember of course that CPUs and GPUs are apples and oranges, with different computing responsibilities, and we shouldn’t expect a general-purpose device like a CPU to outperform a device optimized for a specific algorithm. The value of the CPU is that, as the essential heart of any machine, it can do anything, not that it necessarily should do everything the best. If you buy a workstation with a minimal GPU, you still have a CPU that can render should the need arise. However, the above testing illustrates why the selection criteria for a GPU in a modern CAD workflow, one that leans more heavily on rendering than in the past, should weigh its superior rendering performance.
Another Datapoint on GPU Compute Performance with Ada
It might seem counter-intuitive to refer to rendering above as a “non-graphics” workload. But despite the similar objective of producing 3D images, physically based (typically ray-traced) rendering and raster-based 3D graphics are quite different in their base algorithms. Accordingly, rendering is better classified as a “GPU compute” workload, one of many that have emerged to take advantage of the GPU’s aptitude in processing highly-parallel, floating-point intensive algorithms. For a more detailed dive into the dramatic differences in 3D rendering versus 3D graphics approaches, check out this previous column covering the launch of RTX in Nvidia’s Volta generation GPU, as well as this follow-up introducing RT cores in Turing.
Other key CAD-relevant algorithms amenable to GPU Compute processing (and covered in this column several times in the past) include those common in simulation, including FEA and CFD. To ascertain how much faster Ada can tackle such workloads compared to the previous Ampere, I turned again to SPECworkstation, which incorporates a set of GPU Compute sub-tests, including sequences of rendering (LuxRender), machine learning (Caffe), and molecular biology (Folding@home). On those tests, the RTX 6000 Ada Generation outperformed its RTX A6000 predecessor by 52%, 40% and 38% (respectively), all ample performance gains. (Worth noting, LuxRender does not use Nvidia’s CUDA GPU Compute library for optimal performance, but in this context, its performance gain relative to the previous RTX SKU is a reasonable metric to consider.)
SPECworkstation 3.0.4 GPU Compute benchmark results, normalized to the RTX A6000.
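For readers who want a single summary number from the three GPU Compute gains quoted above, one common approach (an illustration here, not part of the published results) is the geometric mean of the speedup ratios, the usual choice for averaging normalized benchmark scores. The sketch below uses only the 52%, 40%, and 38% figures from the text; the variable names are illustrative.

```python
import math

# Gains of the RTX 6000 Ada Generation over the RTX A6000, as quoted in the text.
GAINS_PERCENT = {
    "LuxRender (rendering)": 52,
    "Caffe (machine learning)": 40,
    "Folding@home (molecular biology)": 38,
}

# Convert percentage gains to speedup ratios, e.g. 52% -> 1.52x.
speedups = [1 + gain / 100 for gain in GAINS_PERCENT.values()]

# The geometric mean treats each sub-test's ratio symmetrically; the
# arithmetic mean is shown alongside for comparison.
geometric_mean = math.prod(speedups) ** (1 / len(speedups))
arithmetic_mean = sum(speedups) / len(speedups)

print(f"Geometric mean speedup:  {geometric_mean:.2f}x")
print(f"Arithmetic mean speedup: {arithmetic_mean:.2f}x")
```

With results this tightly clustered, the two means nearly coincide. They diverge more when individual speedups vary widely, as with the 12% to 68% range in the 3D graphics tests.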
Ada Quick to Roll Out for Mobile Workstations
It turns out that buyers of mobile workstations didn’t have to wait long at all for Ada. In a more recent twist to Nvidia’s GPU rollout routine, Ada has found its way across mobile GPU offerings more quickly than across its add-in cards targeting fixed workstations. Checking the specifications for the new Ada mobile RTX GPUs, it’s clear that all the new parts, from the RTX A5500 down, are built (as expected) not from the biggest chip, the flagship AD102, but from the size-, cost-, and power-reduced AD103, AD104, AD106, and AD107. The bottom two price/performance points are filled out by the previous-generation Ampere.
| | RTX A5500 | RTX A4500 | RTX A4000 | RTX 3000 | RTX 2000 | RTX A1000 | RTX A500 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| GPU Architecture | Ada | Ada | Ada | Ada | Ada | Ampere | Ampere |
| CUDA Cores | 9,728 | 7,424 | 5,120 | 4,608 | 3,072 | 2,560 | 2,048 |
| Tensor Cores | 304 | 232 | 160 | 144 | 96 | 80 | 64 |
| RT Cores | 76 | 58 | 40 | 36 | 24 | 20 | 16 |
| Graphics Memory (GDDR6 with ECC) | 16GB | 12GB | 12GB | 8GB | 8GB | 6GB | 4GB |
| Memory Bandwidth (peak) | 576GB/s | 432GB/s | 432GB/s | 256GB/s | 256GB/s | 168GB/s | 112GB/s |
Nvidia’s 2023 line-up of RTX mobile workstation GPUs. Data source: Nvidia.
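The line-up table lends itself to a quick sanity check. The Python sketch below transcribes the table's values (SKU names as they appear there) and computes, for each part, the CUDA and Tensor cores per RT core and the CUDA core count relative to the top SKU. The class and function names are illustrative only. The constant per-RT-core ratios are consistent with each SKU being scaled by enabling more or fewer identical building blocks, though the table itself does not say so.

```python
from typing import NamedTuple


class MobileGpu(NamedTuple):
    name: str
    architecture: str
    cuda_cores: int
    tensor_cores: int
    rt_cores: int
    bandwidth_gbs: int  # peak memory bandwidth in GB/s


# Values transcribed from the mobile line-up table above.
LINEUP = [
    MobileGpu("RTX A5500", "Ada", 9728, 304, 76, 576),
    MobileGpu("RTX A4500", "Ada", 7424, 232, 58, 432),
    MobileGpu("RTX A4000", "Ada", 5120, 160, 40, 432),
    MobileGpu("RTX 3000", "Ada", 4608, 144, 36, 256),
    MobileGpu("RTX 2000", "Ada", 3072, 96, 24, 256),
    MobileGpu("RTX A1000", "Ampere", 2560, 80, 20, 168),
    MobileGpu("RTX A500", "Ampere", 2048, 64, 16, 112),
]


def core_ratios(gpu: MobileGpu) -> tuple[float, float]:
    """Return (CUDA cores per RT core, Tensor cores per RT core)."""
    return gpu.cuda_cores / gpu.rt_cores, gpu.tensor_cores / gpu.rt_cores


for gpu in LINEUP:
    cuda_per_rt, tensor_per_rt = core_ratios(gpu)
    relative_size = gpu.cuda_cores / LINEUP[0].cuda_cores
    print(f"{gpu.name:<10} {gpu.architecture:<7} "
          f"{cuda_per_rt:5.0f} CUDA/RT  {tensor_per_rt:3.0f} Tensor/RT  "
          f"{relative_size:6.1%} of top SKU's CUDA cores")
```

Every row works out to 128 CUDA cores and 4 Tensor cores per RT core, so the three core counts rise and fall together across the line-up.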
The High-End RTX 6000 Ada Generation Portends Good Things for CAD Visualization Opportunities Across the Board
The RTX 6000 Ada Generation add-in card for fixed workstations, along with the RTX 4000 SFF and the breadth of the Ada generation mobile RTX line, are being rolled into OEMs’ workstation models (many in conjunction with the upgrade to Sapphire Rapids Xeon covered in the last two columns). Paired with a price tag in the neighborhood of $8,000, the RTX 6000 Ada Generation benchmarked here will find homes in only a few of the highest-demand workstation applications.
However, its benchmark scores, compared in an apples-to-apples context with the previous-generation RTX A6000, clearly indicate a very healthy performance gain measured across all key metrics: not only for the 3D graphics that remains the core of CAD, but also in the GPU rendering and computation that are continuing the transition from nicety to must-have in modern workflows. The percentage gains one could expect going from 6000 Ampere to 6000 Ada will likely be reflected in the same upgrades in the current 4000 and (presumably) future 3000 and 2000 tiers, priced at mainstream price points and shipping broadly across all workstation OEMs’ mobile and fixed models. The incessant pace of visual computing performance marches on, and CAD users and workflows will reap the rewards.
Alex Herrera
With more than 30 years of engineering, marketing, and management experience in the semiconductor industry, Alex Herrera is a consultant focusing on high-performance graphics and workstations. Author of frequent articles covering both the business and technology of graphics, he is also responsible for the Workstation Report series, published by Jon Peddie Research.