If you have been paying attention to the technology press over the past 12-18 months, you may have noticed a rather large number of negative stories about Intel’s processor business. Close monitoring of the hardware enthusiast community, including many of the most respected hardware analysts and reviewers, paints an even more dire picture of Intel in the server processor space.
Despite all of this, Intel is not going to lose their entire server processor business any time soon. However, I am firmly convinced that Intel will lose significant market share during the next 12-18 months after the release of the upcoming 7nm AMD EPYC “Rome” server processors. By significant market share, I am talking in the 10-15% range during that time period. The previous AMD EPYC “Naples” processors have “primed the pump” in the server space, and the major server vendors are now much more receptive to AMD.
For many years, I explicitly advised people not to run their SQL Server workloads on AMD hardware because of the much lower single-threaded CPU performance and consequently higher SQL Server core license costs. Now, I am advising people to strongly consider AMD for SQL Server workloads as the AMD EPYC “Rome” processors are released in Q3 of 2019. So, what has changed my mind?
The Death of Tick-Tock
From 2007 until 2016, Intel was able to successfully execute their Tick-Tock release strategy, where they would introduce a new processor microarchitecture roughly every two years (a Tock release). One year after a Tock release, Intel would take that same microarchitecture (with some minor improvements) and use a shrink of the manufacturing process to create a Tick release.
This created a predictable release cadence, and also delivered significant performance gains and other improvements with each release, especially Tock releases. This made it easier for database professionals to make the case for a hardware upgrade, and made the typical upgrade more worthwhile.
The Tick-Tock release cycle basically fell apart by about 2015, as Intel was unable to move from a 14nm manufacturing process to a 10nm manufacturing process. Intel has been stuck at 14nm in the server space since the Broadwell release in Q4 of 2016. Intel officially moved on to what they call “Process-Architecture-Optimization (PAO)” in early 2016.
This has led to a very noticeable reduction in generational performance increases since Broadwell-EP, as shown in Figure 1. These numbers are estimated TPC-E scores for a two-socket server with two eight-core processors, using the fastest eight-core processor from each generation.
Figure 1: Generational Intel Xeon Performance Increases
Lack of Competition in the Server Space
Intel server processors have historically delivered significantly better single-threaded CPU performance and lower power consumption than competing AMD processors since the Intel Nehalem microarchitecture in 2008. This situation was so bad that Microsoft offered a 25% discount on the cost of SQL Server processor core licenses for SQL Server 2012 and SQL Server 2014, if you ran on qualifying AMD Opteron processors with six or more cores.
Even with this 25% license discount, it was not really cost-effective to use AMD Opteron processors for SQL Server usage, because of their extremely poor single-threaded performance. You could easily get more total CPU capacity, better single-threaded CPU performance, and much lower SQL Server licensing costs with an appropriate, modern Intel Xeon E5 or E7 processor during that time frame.
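To see why a 25% discount was not enough, here is a minimal sketch of the licensing arithmetic. The price per core and the core counts are hypothetical placeholders chosen for illustration (they are not actual Microsoft pricing or real server configurations); the only number taken from the discussion above is the 25% discount.

```python
# Hypothetical figures for illustration only; not actual Microsoft pricing
# and not core counts of any specific server.
PRICE_PER_CORE = 7000.0  # assumed placeholder price for one core license


def license_cost(cores: int, price_per_core: float, discount: float = 0.0) -> float:
    """Total core-license cost after applying a fractional discount."""
    return cores * price_per_core * (1.0 - discount)


# Assumed scenario: the AMD server needs twice as many cores as the Intel
# server to deliver comparable total CPU capacity.
amd_cost = license_cost(cores=32, price_per_core=PRICE_PER_CORE, discount=0.25)
intel_cost = license_cost(cores=16, price_per_core=PRICE_PER_CORE)

print(f"AMD   (32 cores, 25% discount): ${amd_cost:,.0f}")
print(f"Intel (16 cores, no discount):  ${intel_cost:,.0f}")
```

Under these assumptions, a 25% discount only breaks even when the AMD server needs no more than about 1.33 times as many cores as the Intel server (1 / 0.75). Any larger gap in per-core performance means the Intel server costs less to license.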
Since Intel had no viable competition from a performance perspective, they had little incentive to continue to innovate at the same pace. Intel became complacent over the past ten years, and ended up opening up a large opportunity for AMD. AMD has capitalized on this with their Zen architecture and the new Zen 2 architecture, which combines a modular design with a 7nm manufacturing process from Taiwan Semiconductor Manufacturing Company (TSMC).
Intel Processor Security Vulnerabilities
Adding to Intel’s woes is a series of processor vulnerabilities that have been discovered and publicized over the past 18 months. These include Spectre, Meltdown, Foreshadow and their variants, along with newer exploits such as Zombieload. Generally speaking, modern Intel processors are more vulnerable to these types of attacks than modern AMD processors are.
Older Intel processors are more vulnerable to these exploits, and they suffer more of a performance decrease from existing software and firmware-level fixes. The latest Intel Cascade Lake-SP processors do have hardware-level mitigations for some of the Spectre and Meltdown exploits, which reduces the performance impact compared to previous firmware or software-level mitigation measures.
I wrote a number of blog posts about this back in January 2018, including these:
- Checking Your SQL Server Instance for Spectre/Meltdown Patches
- Checking Your Meltdown and Spectre Mitigation Status in Windows
Microsoft’s current SQL Server specific guidance about this subject is here.
AMD EPYC 7002 Series “Rome” Highlights
The 7nm AMD EPYC 7002 “Rome” processors will have between 8 and 64 physical cores, plus Simultaneous Multi-Threading (SMT), which is the AMD equivalent of Intel Hyper-Threading. They will also have up to 256MB of L3 cache per processor.
AMD claims a 15% Instructions Per Clock (IPC) increase between the desktop Zen+ and Zen 2 generations, and we are likely to see a similar increase between the previous AMD EPYC 7001 “Naples” and the AMD EPYC 7002 series processors.
So far, we don’t know the official base and turbo clock speeds, but there was a recent leak of partial specifications and pricing by a European retailer that listed max boost clock speeds of up to 3.4 GHz. We won’t know the actual single-threaded performance of these processors until they have been released and benchmarked by neutral third-party testers. I am optimistic that they will have higher single-threaded CPU performance than Intel Cascade Lake-SP processors.
These Rome processors will have eight memory channels that will support DDR4-3200 memory, with up to 4TB of RAM per socket. The processor will also support 128 PCIe 4.0 lanes (which have double the bandwidth of PCIe 3.0 lanes). This much memory and I/O bandwidth will make this processor an excellent choice for data warehouse (DW) workloads.
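For a rough sense of scale, the memory and PCIe figures above can be turned into theoretical peak bandwidth numbers. This is a back-of-the-envelope sketch, not a benchmark: it assumes a standard 64-bit (8-byte) DDR4 channel and the 128b/130b encoding used by PCIe 3.0 and 4.0, and the function names are my own. Real-world throughput will be lower than these theoretical peaks.

```python
def ddr4_peak_gb_per_s(transfer_rate_mt_s: int, channels: int,
                       bus_width_bytes: int = 8) -> float:
    """Theoretical peak memory bandwidth in GB/s (decimal gigabytes).

    Assumes each DDR4 channel is 64 bits (8 bytes) wide.
    """
    return transfer_rate_mt_s * 1e6 * bus_width_bytes * channels / 1e9


def pcie_lane_gb_per_s(gigatransfers_per_s: float) -> float:
    """Theoretical per-lane, per-direction PCIe 3.0/4.0 bandwidth in GB/s.

    Both generations use 128b/130b encoding, so 128 of every 130 bits
    transferred are payload; dividing by 8 converts bits to bytes.
    """
    return gigatransfers_per_s * (128 / 130) / 8


memory_bw = ddr4_peak_gb_per_s(3200, channels=8)  # eight DDR4-3200 channels
pcie3_lane = pcie_lane_gb_per_s(8.0)              # PCIe 3.0: 8 GT/s per lane
pcie4_lane = pcie_lane_gb_per_s(16.0)             # PCIe 4.0: 16 GT/s per lane

print(f"Peak memory bandwidth per socket: {memory_bw:.1f} GB/s")
print(f"PCIe 3.0 per lane: {pcie3_lane:.3f} GB/s")
print(f"PCIe 4.0 per lane: {pcie4_lane:.3f} GB/s")
print(f"128 PCIe 4.0 lanes: {128 * pcie4_lane:.1f} GB/s per direction")
```

The per-lane numbers show where the "double the bandwidth" claim comes from: PCIe 4.0 doubles the transfer rate per lane while keeping the same encoding as PCIe 3.0.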
Hardware vendors are quite enthusiastic about Rome, with Dell planning on tripling the number of AMD-based server models it offers by the end of 2019. If the leaked pricing information is accurate, AMD Rome processors will be significantly less expensive than Intel Cascade Lake-SP processors.
Figure 2: AMD EPYC Rome Processor
How is This Relevant for SQL Server?
You might be asking why you should care about all of this as a SQL Server database professional. There are many reasons! These include your overall server CPU capacity, your single-threaded CPU performance, your memory density and capacity, your total I/O capacity, and your SQL Server 2017/2019 license costs.
I think there are a large number of existing SQL Server instances out there running on older versions of SQL Server, on older versions of Windows Server, perhaps on older versions of a hypervisor, running mainly on older generations of Intel Xeon processors. Many organizations have been keeping their legacy environments running for a number of years, waiting for a worthwhile set of reasons to finally do a complete data platform refresh. For many of these organizations, the second half of 2019 into the first half of 2020 will be a window where it will make sense to finally upgrade.
Once you have made the decision to upgrade, you should think about whether you want to run your SQL Server instances on an AMD platform or an Intel platform. Because of the advantages of the AMD Zen 2 architecture, choosing an AMD platform for your new server(s) may be the best choice, from multiple perspectives. These include probable better single-threaded CPU performance, better multi-threaded CPU performance, higher memory density and capacity, higher memory bandwidth, higher I/O bandwidth, better hardware-level security, and lower processor pricing.
AMD's biggest problem over the years has been having enough manufacturing capacity to make their product. Big OEMs like Dell have stayed away from them because of problems meeting demand.
Making a few demo CPUs and making tens of millions of them profitably with good yields are two different things. I've read some really good technical articles over the years about manufacturing, and it's an art of its own, and the process starts with design. Your CPU engineers need to work with the manufacturing team during the design phase or else you're going to have yield issues.
Most of AMD's actual CPU manufacturing is handled by TSMC (with some work still being done by GlobalFoundries). Intel used to have a huge advantage when it came to manufacturing compared to AMD, but they have pretty much lost that now. Intel has the capital and market share to recover, but it will take a decent amount of time. As I said in the article, Intel is not doomed, but they are going to lose some market share in the server space.
I read a comment that, for large databases, AMD's memory latency will be a problem, and hence it is better to stick with Intel.
What is your opinion on this?