Technology always moves fast, particularly so in the domain of high-performance computing. But the past year’s pace has proven especially frenetic in the CPU industry, with rivals AMD and Intel hogging more than their usual share of attention with an accelerated rollout of competing CPUs for both mobile and fixed client computing applications. That quickened pace has been reflected in this column, which has dedicated more ink to CPU advancements relevant to CAD than ever. This month’s installment covers another high-profile launch, one that will not only enter into decisions on the hardware best suited for CAD in the coming year (at least), but also introduces a novel architectural concept that addresses the increasing urgency of mitigating power consumption in the pursuit of performance.
Credit that quickened pace, at least in part, to AMD’s resurgence in the market for high-performance CPUs. From the tail end of the 2000s to the latter half of the 2010s, AMD was essentially absent from the high-performance computing markets. Yes, it sold tens of millions of processors during that time, but primarily into mainstream, price-focused, consumer-oriented segments of the market. In high-performance workstation segments, however, it held virtually no share and posed little threat to Intel’s de facto monopoly, giving Intel the freedom to roll out new products on its own timetable.
Things have changed. AMD’s rise in the past several years, combined with hiccups in Intel’s own execution on the silicon manufacturing front, has placed Intel, the long-time CPU leader, under tremendous pressure to hold off AMD’s gains. That in turn has led Intel to speed up its own product cycles, arguably forcing it to launch some products with more modest, incremental improvements in order to get something “new” out earlier, rather than wait a bit to focus on products with a more pronounced edge over the previous generation.
Consider Rocket Lake, covered here last summer. Would Intel have bothered with Rocket Lake without AMD on its heels (stated generously, as AMD has been forging ahead as of late)? The 11th Generation Core CPUs based on Rocket Lake and targeting fixed clients, like deskside CAD workstations, look to have a short life, even by silicon standards. Intel has already launched the 12th Generation Core family, one that represents a far more substantial push forward architecturally. Alder Lake, the basis of that line, not only exemplifies the recognized urgency to address power consumption as much as performance, but should also mark a return to a more unified architecture across Intel’s processor business.
A Novel Hybrid Architecture to Optimally Balance Performance and Power: P-Cores and E-Cores
The truth is that, despite the unified Core branding of Intel’s processors, CPUs targeting mobile applications and those serving fixed/wired applications have not shared a common architectural foundation for a while now. Consider Intel’s 11th Generation Core family. Its mobile-focused parts are based on a design called Tiger Lake, while the desktop parts are derived from Rocket Lake. The reason is justifiable: the two have different priorities, one focused more on power efficiency and the other more on all-out performance. It’s been difficult to employ the same architectural foundation to achieve such disparate objectives.
Alder Lake will help unify the base technology for a wide range of performance and power levels. It will usher in a new era with a hybrid approach that allows a single processor to include heterogeneous cores, one type focused on performance (a P-core) and another on efficiency (an E-core). Exploiting that common foundation, a design looking for higher performance with an ample power budget, like a line-powered fixed CAD workstation, will lean more heavily on P-cores. On the flip side, a battery-powered client, like a mobile workstation, might forgo a few P-cores in favor of more of the power-stingy E-cores. Ultimately, Intel plans for this new heterogeneous approach to serve all client segments, from 9-W CPU power for “ultra-mobile” devices like handhelds, through 45-W (the norm for a mobile workstation), and all the way up to 125-W, typical for a high-performance desktop/deskside CPU.
Alder Lake introduces a heterogeneous, hybrid architecture employing both Performance-oriented cores (P-cores) and Efficiency-oriented cores (E-cores) in the same processor. Image source: Intel.
In the case of the just-released Alder Lake, the P-core is code-named Golden Cove and the E-core Gracemont. For Golden Cove, Intel promises an architecture that’s wider (for more concurrent processing), deeper (for improved pipelining, allowing for higher frequencies for the same work), and smarter (e.g., better instruction prediction). Wider, deeper, and smarter all combine to improve the ultimate metric, the number of instructions executed per cycle (IPC). Gracemont, by contrast, focuses on minimizing watts and silicon footprint to keep its impact on overall power consumption and cost as small as possible. Worth adding, Alder Lake also boasts a new cache architecture, what Intel calls Smart Cache, with up to 30 MB of L3 cache shared among all on-chip P-cores and E-cores.
To Achieve Best Possible Performance, Fixed Applications Care About Power Too
While it would be natural to assume that the P-core would play the more significant role in workstation applications, Intel plans to deploy both cores in its CPU products so as to best serve the range of workloads, varying both in demand and priority. Often, we tend to think that mobile platforms care about power efficiency, while fixed, line-powered platforms don’t. That’s because we focus more on power delivery than the thermal dissipation resulting from power consumption. Line-powered fixed systems don’t have the delivery issues that battery-constrained mobiles do. But, with respect to thermal dissipation, fixed platforms do care about power consumption, because saving power where possible allows for expending more where needed. Remember that reducing on-chip heat output improves the ability for silicon to run faster, by retaining higher clock rates and/or voltage levels. That is, paying attention to power also reaps dividends in improved performance, regardless of whether the machine is plugged into the wall or not.
As such, E-cores aren’t just for battery-powered devices, and a design intent on maximum all-out performance isn’t likely to forgo E-cores altogether. Yes, tasks the user wants done as fast as possible — a CAD simulation, model update, or render — will want the services of a P-core (or multiple P-cores). But, there are many tasks that would better warrant the use of an E-core, like the myriad, low-demand OS tasks that happen in the background and which users typically never see.
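The trade-off described above can be put in rough numbers. The sketch below is an illustration only: it uses the common first-order approximation that a core's dynamic power scales with voltage squared times clock frequency, and every ratio and per-task wattage in it is a hypothetical value chosen for the arithmetic, not an Intel figure. Only the 125-W package level comes from the range Intel has stated.

```python
# First-order model: dynamic power is proportional to V^2 * f.
# All ratios and per-task wattages are hypothetical, for illustration only.

def relative_dynamic_power(freq_ratio: float, volt_ratio: float) -> float:
    """Power relative to a baseline after scaling clock and voltage."""
    return volt_ratio ** 2 * freq_ratio


def foreground_headroom_w(package_budget_w: float, background_w: float) -> float:
    """Watts left for foreground work once background tasks are paid for."""
    return package_budget_w - background_w


PACKAGE_BUDGET_W = 125.0  # high-performance desktop level cited in the article

# Hypothetical boost: a 10% higher clock that needs 5% more voltage.
boost_cost = relative_dynamic_power(freq_ratio=1.10, volt_ratio=1.05)

# Hypothetical: background OS tasks cost 10 W on a P-core but 3 W on an E-core.
headroom_on_p = foreground_headroom_w(PACKAGE_BUDGET_W, background_w=10.0)
headroom_on_e = foreground_headroom_w(PACKAGE_BUDGET_W, background_w=3.0)

print(f"relative power after boost: {boost_cost:.3f}")
print(f"foreground headroom: {headroom_on_p:.0f} W vs {headroom_on_e:.0f} W")
```

Under these assumed numbers, the boost costs roughly 21% more power for 10% more clock, which is why watts saved on background work are worth having: they can be spent where the user is actually waiting.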
Allocating the Right Cores to the Right Tasks
That all sounds sensible in theory, but it raises the question of how. How will the CPU know whether to dispatch a thread of execution to a P-core or an E-core? It requires some real-time, autonomous intelligence. To enable such hybrid Efficiency/Performance implementations, the company introduced Intel Thread Director, a dynamic, real-time, adaptive thread scheduler that relies on both embedded hardware control and software exposure to allocate threads to the most appropriate available cores. In hardware, Thread Director will monitor real-time telemetry data (instruction mix and state of cores) to make autonomous scheduling decisions, essentially through educated guesses.
But it won’t always have to guess, however accurate that speculation might be, as it also exposes that data to Windows 11 (with Microsoft’s support) and its PowerThrottling API, allowing both the OS and the application to explicitly indicate the desired priority of execution. The latter will specifically allow developers to assign QoS (Quality of Service) levels to their threads. By way of example, startup enumeration will identify the properties of all processing cores, enabling the simple allocation of foreground/priority threads to performance cores and background threads to efficiency cores. In case of contention, a thread on a P-core can get kicked back to an E-core in real time if a higher-priority thread is initiated while all P-cores are busy. The hoped-for end result is that every thread that “deserves” the performance (and the associated watts) gets it, while power is conserved at every opportunity.
Example of real-time Thread Director thread allocation. Image source: Intel.
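The allocation policy described above (foreground threads to P-cores, background threads to E-cores, and demotion from a P-core under contention) can be sketched as a toy model. This is not Intel's Thread Director algorithm or any Windows API; the class, method names, and priority scale are all hypothetical, and the real scheduler works from hardware telemetry rather than a simple flag.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    name: str
    priority: int      # hypothetical scale: higher number = more urgent
    foreground: bool   # True if the user is waiting on this work


class ToyHybridScheduler:
    """Toy model of hybrid core allocation; not Intel's actual logic."""

    def __init__(self, p_cores: int, e_cores: int):
        self.free = {"P": p_cores, "E": e_cores}
        self.running = {"P": [], "E": []}

    def _place(self, thread: Thread, kind: str) -> str:
        self.free[kind] -= 1
        self.running[kind].append(thread)
        return kind

    def start(self, thread: Thread) -> str:
        preferred, fallback = ("P", "E") if thread.foreground else ("E", "P")
        if self.free[preferred]:
            return self._place(thread, preferred)
        # Contention: all P-cores busy and a foreground thread arrives.
        if thread.foreground and self.running["P"] and self.free["E"]:
            victim = min(self.running["P"], key=lambda t: t.priority)
            if victim.priority < thread.priority:
                # Kick the least urgent P-core thread back to an E-core.
                self.running["P"].remove(victim)
                self.free["P"] += 1
                self._place(victim, "E")
                return self._place(thread, "P")
        if self.free[fallback]:
            return self._place(thread, fallback)
        return "queued"  # no core of either type is free


sched = ToyHybridScheduler(p_cores=2, e_cores=2)
placements = [
    sched.start(Thread("render", 5, True)),
    sched.start(Thread("model-update", 3, True)),
    sched.start(Thread("indexer", 1, False)),
    sched.start(Thread("simulation", 9, True)),  # demotes model-update
]
print(placements)
print([t.name for t in sched.running["E"]])
```

In this run the late-arriving, higher-priority simulation thread takes a P-core, and the model update it displaces continues on an E-core alongside the background indexer.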
No More Doubling Physical Core Counts to Calculate Thread Counts
This might be a tough habit to break. For years, ever since Intel introduced HyperThreading (where one physical core can juggle execution of two threads at once), we’ve been able to simply double the physical core count to determine the total number of threads a processor can execute concurrently. That shortcut no longer works, because while P-cores like Golden Cove will continue to support HyperThreading as it has generally been defined, E-cores like Gracemont will not (which makes sense, given those cores’ objective to minimize power). As such, the equation to determine concurrent threads now looks like:
Threads = P x 2 + E
In addition, we’re going to have to specify base and Turbo clocks individually for each core type, as the E-core clocks will be lower (again, sensibly, since faster clocks chew up more power, at least linearly with frequency).
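As a quick check of the thread-count equation, the short sketch below applies it to the two top core configurations quoted in this article (6P + 8E for mobile, 8P + 8E for desktop). The function name is just an illustrative choice.

```python
def concurrent_threads(p_cores: int, e_cores: int) -> int:
    """Threads = P x 2 + E: two per HyperThreaded P-core, one per E-core."""
    return p_cores * 2 + e_cores


mobile_top = concurrent_threads(p_cores=6, e_cores=8)    # 14 cores in total
desktop_top = concurrent_threads(p_cores=8, e_cores=8)   # 16 cores in total

print(mobile_top, desktop_top)  # prints "20 24"
```

The old habit of doubling the core count would have given 28 and 32 threads, respectively, overstating both.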
12th Generation Core CPUs Built on Alder Lake
Intel has just announced its initial round of 12th Generation Core brand CPUs based on Alder Lake, with the first SKUs for both mobile and desktop. This is the first time in recent memory we’ve seen the two sides of the Core brand launched simultaneously, bearing witness to Alder Lake’s more unified, hybrid architecture. On the mobile side, the 45-watt H-series targets high-performance mobile applications, like mobile workstations for CAD or laptops for gaming. 12th Generation Core H-series SKUs range up to 14 cores (8E + 6P), while the fixed/deskbound SKUs lean a bit more heavily on the P-cores, with up to 16 cores (eight of each).
12th Generation Core desktop processors most relevant to fixed/deskbound CAD computing. Image source: Intel.
12th Generation Core 45-W H-series processors most relevant to mobile CAD computing. Image source: Intel.
Expect to see 12th Generation Core available on virtually all mobile workstations and every entry-level fixed workstation in the very near future. HP has already announced support for some 12th Generation Core i5/i7/i9 SKUs in its Z2 entry desktop workstation and ZBook mobile workstations serving the CAD community.
What Performance Can CAD Users Expect?
All in all, Intel promises a rough and very general 19% IPC improvement across workloads. While such broad-stroke figures don’t apply to any workload specifically, and therefore should be taken with a grain of salt, it’s worth noting that 19% exactly matches the advertised figure for both Intel’s prior Rocket Lake S microarchitecture and AMD’s Zen 3 generation (e.g., in the Ryzen 5000 CPU family).
Also worth referencing are Intel-supplied metrics for Alder Lake (12th Generation Core) relative to Rocket Lake (11th Generation Core), which show the top-end Alder Lake CPU running about 14% faster on Cadalyst’s AutoCAD benchmark than a top-end Rocket Lake SKU. (As with all vendor-referenced testing data, I emphasize “Intel-supplied” because while the vast majority of vendors won’t deceive or outright lie, all will justifiably try to put their best foot forward and show off their products in the best light. As always, I hope to benchmark real Alder Lake hardware to better ascertain its performance improvement on CAD-relevant workloads.)
Intel-supplied performance metrics on mobile-oriented 12th Generation Core, relative to 11th Generation Core. Image source: Intel.
Upgrading in 2022 Looks Promising
The CPU industry serving high-performance Windows and Linux clients (a.k.a. the Intel/AMD duopoly) is moving as fast as ever. With AMD applying extra pressure on Intel to accelerate its product introductions, 2021 saw a frenzy of new processors serving CAD computing markets. Intel launched its 11th Generation Core desktop family based on Rocket Lake in the summer, yet managed to get 12th Generation Core out as well, for both mobile and fixed applications, before the year closed. Alder Lake’s novel hybrid multi-core architecture should not only boost Core’s performance to help fend off AMD’s Ryzen and Threadripper families, but will also allow Intel to streamline development by unifying efforts across application spaces.
I plan to follow up with some real-world, CAD-focused testing of Alder Lake, relative both to Rocket Lake and AMD’s best in the space. For now, I’d argue the biggest beneficiaries of this extremely competitive market are those demanding the best in high-performance computing at the most aggressive prices. Buyers looking to upgrade their CAD machine in 2022 should be rewarded with excellent bang for the buck, regardless of which vendor they choose.
Alex Herrera
With more than 30 years of engineering, marketing, and management experience in the semiconductor industry, Alex Herrera is a consultant focusing on high-performance graphics and workstations. Author of frequent articles covering both the business and technology of graphics, he is also responsible for the Workstation Report series, published by Jon Peddie Research.