As I’ve touched on several times in this column, the era of multicore processors has not delivered universal benefits across all computing applications. That transition from a near-exclusive strategy of speeding execution of a single thread to one that accelerates aggregate throughput by deploying a multitude of processing cores to execute multiple threads in parallel has not been the tide that raises all boats. Among the most notable boats left behind are key workloads in CAD computing, which tends to exhibit more single-thread-limited code than many other high-demand computing applications. Parametric modeling, the cornerstone of many a workflow, is inherently sequential and not amenable to much speedup on more than one processing core.
That reality is one reason, though neither the only nor the dominant one, that Intel has held onto its position as the premier supplier of CPUs in the workstation and gaming segments of the market. The company has long managed, through both architecture and silicon manufacturing, to consistently deliver higher single-thread performance than its rivals, most notably AMD. As also covered recently and often in this column, though, Intel’s de facto monopoly on workstation-caliber computing is now being challenged (more aggressively and effectively than it has been in over a decade) by a revitalized AMD riding the success of its Zen processor technology.
Still, the last of Intel's advantages to mitigate (and the one proving to be the most elusive) was the one that inordinately affects CAD performance: single-thread throughput. With the emergence of AMD’s Zen 3 microarchitecture on the Ryzen 5000–generation CPUs, it appears that final advantage is finally ready to fall … and in the process, should open the door even further to renewed acceptance for AMD CPUs for use in CAD environments.
Three Buckets of Workstation-Focused CPUs
For the purposes of this discussion, I’m going to lump workstation-caliber CPUs into three buckets, as a function of their core counts and clock frequencies. Here, it’s worth remembering the inverse relationship of those two metrics: Driving one up pushes the other down, all else being equal. That relationship makes for an unfortunate reality for CAD users, as their workflows commonly employ some workloads that lend themselves well to execution on a multitude of cores and those that are inherently constrained to single-to-few-thread execution. (This inconvenient truth, and its impact, were explored in “Workstation CPU Cores and Clocks: An Inconvenient Tradeoff.”)
All-out maximum core count, at the cost of reduced single-thread performance. This option is the biggest in physical footprint and power consumption, but it accounts for a very small volume of the market. This class of CPU accounts for only about 10–12% of fixed workstations (that is, not counting mobile workstations), and is currently dominated in the market by Intel’s dual-socket-capable Xeon Scalable processors.
All-out maximum GHz. Harnessing more cores is a good thing, but only if it doesn’t mean giving up much in the way of GHz (a metric that, on the same architecture, correlates to single-thread performance) to get those extra cores. In the past few years, that’s generally meant quad-core CPUs were the sweet spot, though manufacturing progress (i.e., moving to denser 10-nm and 7-nm processes) has allowed that knee in the curve to shift to six-core (6C) and eight-core (8C) parts. That is, you can typically find a SKU with the highest base frequency (and usually, turbo modes as well) at 6C or 8C, so there’s no need to opt for fewer cores.
What makes this CPU category interesting in the context of this column is that it’s a popular bucket for many computer users, foremost among them gamers and CAD users. Gaming and CAD are among the preeminent examples of computing workloads that are both highly demanding and frequently limited to single-thread execution.
The middle ground: heftier core counts, but without throwing in the towel on single-thread performance. For our third bucket, let’s define a middle ground: those CPUs that sit in between, offering more cores than the baseline (again, call it 4–8C today), but not counts so large that users have to accept severe reductions in GHz in exchange.
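The tradeoff behind these three buckets can be made concrete with a rough model. The sketch below is illustrative only: it assumes run time scales inversely with clock frequency, that the parallel portion of a job scales perfectly across cores (an Amdahl's-law idealization), and that IPC is identical across chips. The core counts, clocks, and workload splits are made-up values, not real SKUs or measurements.

```python
def run_time(serial_work, parallel_work, cores, ghz):
    """Idealized run time: serial work uses one core, parallel work uses all.

    Work is in arbitrary units. Assumes identical IPC and perfect parallel
    scaling, which real workloads will not achieve.
    """
    return serial_work / ghz + parallel_work / (cores * ghz)


# Hypothetical CPUs, one per bucket: name -> (cores, base GHz).
cpus = {
    "max-GHz, 8 cores": (8, 4.0),
    "middle ground, 16 cores": (16, 3.5),
    "max-cores, 64 cores": (64, 2.5),
}

# Hypothetical workloads: name -> (serial work, parallel work).
workloads = {
    "parametric modeling (mostly serial)": (95.0, 5.0),
    "rendering (mostly parallel)": (5.0, 95.0),
}


def fastest_cpu(serial_work, parallel_work):
    """Return the name of the hypothetical CPU with the lowest modeled run time."""
    return min(
        cpus,
        key=lambda name: run_time(serial_work, parallel_work, *cpus[name]),
    )


for name, (serial, parallel) in workloads.items():
    print(f"{name}: fastest on {fastest_cpu(serial, parallel)}")
```

Under these assumed numbers, the mostly serial job favors the high-clock part and the mostly parallel job favors the high-core-count part, which is the dilemma facing CAD users whose workflows include both kinds of work.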
The first box AMD ticked off was that top one: maxing out core counts. By tapping an approach to chiplet integration AMD calls the Infinity Architecture, engineers were able to stitch together two to several 8C silicon die, allowing for EPYC CPUs to scale up to 64 cores (for more information, see “Chiplet Architectures Emerge as One Arrow in Industry Quiver of Technologies Extending Compute Performance”). That core count eclipses Intel’s current top-end second-generation Xeon Scalable line at 56 cores.
Stitching multiple 8-core “chiplet” die in a single CPU package lets AMD gracefully scale up to massive core counts. Image source: AMD.
This past year saw AMD deliver a compelling challenger in the middle-ground category described above. The company launched Threadripper PRO, offering a range of moderate-to-hefty core counts, but with far less-severe compromises in frequencies. Although it bears the Threadripper parent brand name, Threadripper PRO shares its DNA more closely with EPYC. Like its big server-oriented sibling, Threadripper PRO leverages the same Infinity Architecture chiplet approach to drive up GHz at given core counts (or drive up core counts at given GHz, depending on your perspective).
The initial lineup of four Threadripper PRO SKUs in the 3900WX processor family ranges from the 4.0-GHz (base frequency), 12-core Threadripper PRO 3945WX all the way up to the 3995WX’s 64 cores. The 12C and 16C SKUs stand out in particular, with the 12C 3945WX being the first professional-caliber CPU with 12 or more cores to break the 4.0-GHz barrier (as covered in “Two New CPUs Will Open More Options for the CAD Workstation Platform”). Lenovo created the tower ThinkStation P620 around Threadripper PRO, becoming the first top-tier vendor to ship a branded workstation based on an AMD CPU since the late 2000s.
Threadripper PRO allows those looking for higher core counts a gentler tradeoff of base GHz to maintain performance for single-thread workloads. Source: Alex Herrera and AMD.
With the moderate-to-max core count EPYC and Threadripper PRO, consider two of the three categories checked off. What was left outstanding was the remaining bucket: CPUs with leading single-thread performance. Well, with the November launch of Ryzen 5000, it looks like the company may have checked that box as well.
In contrast to EPYC and Threadripper, AMD’s Ryzen line is focused on lower core counts for PCs with a range of performance, denoted by SKU number and ranging all the way up to maximum GHz for top-end single-thread performance. Ryzen 5000 marks the first appearance of the third-generation Zen core, aptly referred to as Zen 3.
The common backbone to all three lines is Zen, the company’s ground-up microarchitecture that ushered in the renewed age of AMD competitiveness a few years back. Described as the company’s “most comprehensive design overhaul of the Zen era,” Zen 3 promises an impressive 19% increase in instructions per cycle (IPC) over second-generation Zen. At a given clock frequency, IPC generally dictates single-thread throughput. Now, take any broad-brush performance metric like this one with the usual grain of salt: It doesn’t apply to every application or use case, but rather aggregates results from a range of select workloads. And while I tend to give vendors of the caliber of AMD (and its system partners) the benefit of the doubt that they don’t intentionally lie or deceive outright, they will certainly put their best foot forward, as any company would. So yes, your mileage will definitely vary.
The progression of AMD’s Zen CPU core microarchitecture. Image source: AMD.
Still, even measured generically across workloads, 19% is a level that supports AMD’s claim that Zen 3 delivers a “historically large generational improvement in IPC.” As touched on many times in this column, the low-hanging fruit in superscalar design is long gone, and it’s far from easy to eliminate any idle time in execution or extract more fine-grain parallelism out of a single thread. Over the past few decades, just about everything’s been tried and refined, and then refined many times more. And we’ve gotten to a place where something like 10% is a more typical generation-to-generation gain in IPC. (Bear in mind that the “+52%” disclosed by AMD above for the initial Zen incarnation was relative to a completely different microarchitecture, and was what vaulted AMD back to a competitive position.)
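To put those percentages in perspective, recall that single-thread throughput is, to a first order, IPC multiplied by clock frequency. The sketch below applies that relationship to the 19% and 10% figures quoted above. The function name and the equal-clock assumption are mine, for illustration; real results depend on the workload, memory behavior, and boost clocks.

```python
def relative_single_thread(ipc_gain, clock_ratio=1.0):
    """First-order estimate: throughput ratio = (1 + IPC gain) * clock ratio."""
    return (1.0 + ipc_gain) * clock_ratio


# IPC gains quoted in the column; equal clocks are assumed for simplicity.
zen3_claimed = relative_single_thread(0.19)  # AMD's claimed Zen 3 uplift
typical_gen = relative_single_thread(0.10)   # a "more typical" generational gain

print(f"Zen 3 claim: {zen3_claimed:.2f}x, typical generation: {typical_gen:.2f}x")

# Clock increase a 10%-IPC design would need to match a 19%-IPC design,
# if both started from the same baseline IPC and clock.
clock_needed = zen3_claimed / typical_gen
print(f"Clock ratio needed to close the gap: {clock_needed:.3f}")
```

In this simplified view, a design gaining only 10% in IPC would need roughly 8% more clock speed to keep pace with one gaining 19%, which helps explain why a gain of that size is notable.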
Looking beyond blanket claims of 19% improvement, can Ryzen 5000 support a claim to the single-thread crown with real-world performance metrics, especially when it comes to CAD? If so, by how much? So far, what we have to judge are benchmarks from both AMD and third parties, which tend to focus on gaming and content creation applications rather than CAD-oriented workloads. CPU Monkey, an oft-quoted independent site that doesn’t focus on qualitative assessments but simply runs benchmarks, shows the Ryzen 5000 series shooting to the top of its Cinebench R23 scores for single-thread testing. At this point, while the exact size of Ryzen 5000’s advantage is hard to ascertain, particularly for the workloads this audience cares about most, it would appear AMD has achieved its long-awaited goal of putting single-thread performance on equal footing with Intel’s best, if not outright ahead. I’ll be doing my own benchmarking on Ryzen 5000 with special attention to common, performance-critical CAD workloads, so look for that in a coming column.
Ah, but you might point out that Intel isn’t standing still either, so even if Ryzen 5000 moves ahead, superior Core brand processors should be imminent as well. And you’d be right. This column covered the latest 11th Gen Core mobile-focused Tiger Lake processors from Intel, hinting at performance desktop–oriented siblings to come. Intel has confirmed that its Rocket Lake S 11th Gen Core desktop CPUs should arrive in the first quarter of 2021. And while Rocket Lake S isn’t quite a DNA-equivalent twin to Tiger Lake — which is often the case in serving the disparate requirements of mobile and desktop applications — it promises a healthy bump in IPC over the previous flagship 10th Gen Core desktop SKUs as well.
How much bang for the buck should we anticipate from Rocket Lake S? Intel’s statements set expectations at “double-digit percentage IPC performance improvement (gen over gen).” We’ll have to see precisely what that means, but the phrasing to me indicates something just over 10%, so it’s probably fair to assume a more modest advance than AMD achieves with Zen 3 and Ryzen 5000.
But even if Rocket Lake S pushes past Ryzen 5000, AMD’s climb back to workstation-caliber CPU relevance (punctuated by the final single-thread box Ryzen 5000 has checked) looks to have ushered in a renewed egalitarian era in processors: a shift from the status quo of AMD being the perpetual laggard in single-thread performance to one characterized by relative parity, quite possibly followed by a long-term game of generation-to-generation leapfrog. And that is a healthy situation for the CAD workstation industry (one long dominated by Intel CPUs) as well as the consumers of those products: two vendors duking it out with highly competitive products on relatively even footing.