Evolution of CPU efficiency over time

Workstation performance is usually a somewhat neglected topic for us. Since we mainly work with interpreted languages like PHP and Python (and, of course, JS), it doesn’t much matter how fast developer machines are: code has to run fast in production, not on a developer’s laptop. As such, our developers get a random assortment of laptops to work with, which are only replaced when they break.

Recently we’ve begun to dabble in Go, however, and now not only does workstation performance suddenly matter, it’s also significantly affecting deploy times: each recompile with PIE and/or Memory Sanitizer enabled takes a minute or so. “Fine”, I thought. “We have those huge Xeon workstations collecting dust in our archive, let’s put them to use.”

Surprisingly, performance did not change much – we barely shaved 10 seconds off a one-minute job.

The problem: Those Xeon workstations date from 2009, the first Core i7 generation. With much of our compile time going to a single (6 MiB) C file, it doesn’t matter that they have twice the RAM and CPU cores, thrice the thermal reserves, more cache and more of everything than our laptops. Pure, raw per-core CPU efficiency is all that counts.

And how did that develop over the past few generations? Depressingly well:

Device             | CPU            | Build time (simple.go) | Build time (own project)
-------------------+----------------+------------------------+-------------------------
Thinkstation E20   | Xeon X3440     | 50 seconds             | 52 seconds
ThinkPad X230      | Core i5-3320M  | 45 seconds             | 64 seconds
ThinkPad L450      | Core i5-5200U  | 40 seconds             | 47 seconds
Venue 7140 Pro     | Core M-5Y10c   | 53 seconds             | 91 seconds
Self-built desktop | Core i7-6700   | 24 seconds             | 27 seconds

The command tested was CC=clang go build --ldflags '-extldflags "-static -fPIC"' -buildmode=pie, run on go-sqlite3’s simple example, which is purely limited by the compile time for sqlite3-binding.c (a 6.23 MiB C file), and on our current project, which uses go-sqlite3 and a handful of other libraries (to compare multi-core performance). All machines were running current Arch x64.

And yes, that’s not a typo: A(n already outdated) 4.5W tablet CPU is within 10% of the per-core performance of a first-generation Core i7-equivalent 95W Xeon. (The current m5-6Y57 would have a 40% higher clock speed.) I only benchmarked the tablet as a joke initially, but damn. Due to its lower thermal limits, it quickly clocks down when all four virtual cores are needed (and the Xeon’s four physical cores can finally shine a little), but the speed is still impressive.
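The "within 10%" figure follows directly from the simple.go column of the table. As a quick sanity check, the sketch below expresses each machine's build time relative to the Xeon; the helper name slowdownPercent is made up for this example, and the timings are copied from the table.

```go
package main

import "fmt"

// slowdownPercent (hypothetical helper) reports how much longer `seconds`
// is than `baseline`, in percent. Negative values mean a faster build.
func slowdownPercent(seconds, baseline float64) float64 {
	return (seconds - baseline) / baseline * 100
}

func main() {
	const xeon = 50.0 // Xeon X3440, simple.go build time in seconds

	// simple.go build times from the table, in seconds.
	results := []struct {
		cpu     string
		seconds float64
	}{
		{"Core i5-3320M", 45},
		{"Core i5-5200U", 40},
		{"Core M-5Y10c", 53},
		{"Core i7-6700", 24},
	}

	for _, r := range results {
		fmt.Printf("%-14s %+6.1f%% build time vs. Xeon X3440\n",
			r.cpu, slowdownPercent(r.seconds, xeon))
	}
}
```

The tablet's 53 seconds against the Xeon's 50 works out to about 6% more build time, while the Core i7-6700 needs roughly half the time.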

There seems to have been a growing, disgruntled “why should I bother upgrading, my CPU is still fast enough” consensus over the last few years (me included, admittedly), but despite the naysaying, CPUs did improve significantly in that time.