Unixbench Results - post free upgrade
I wrote a blog post back in 2011 with UnixBench performance numbers for a number of virtual and physical servers that I use in my day-to-day chores.
In short, the Linode 512 came in at 495, beating RIMU at 290 and even my physical IBM x345 at 387.
I upgraded from a Linode 512 to a Linode 1024 back in October 2012.
On April 5 I ran Unixbench again only to discover that my performance had dropped to 272. I ran the test again on April 7 and received a score of 358.
I then performed the migration from a Linode 1024 to a Linode 2048 and ran the benchmark again, only to get a score of 189. I contacted support, they migrated me to another host, and there I got a score of 119. I ran the test one last time on April 12 and got 178. The host I landed on after the Linode 1024 to Linode 2048 migration has a very different CPU, an Intel(R) Xeon(R) CPU E5-2630L 0 @ 2.00GHz (4000.1 bogomips), compared to the previous Intel(R) Xeon(R) CPU L5520 @ 2.27GHz (4522.0 bogomips).
As I explained to Linode support, I certainly understand that benchmarking virtual guests (servers) is a very tricky subject, and while I expect some variance, I'm at a loss to explain the delta I'm observing here.
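For anyone who wants to compare apples to apples, here's roughly how I'd expect a UnixBench 5.1.3 run to go. This is just a sketch; the tarball name assumes you've already downloaded the 5.1.3 release from the project page:

```
# Rough sketch of a UnixBench 5.1.3 run (tarball name assumes the release
# has already been downloaded from the project page).
tar xzf UnixBench5.1.3.tgz
cd UnixBench
make      # build the benchmark binaries
./Run     # runs the suite once single-copy, then once with one copy per CPU
```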
I've gone from extolling Linode to wondering if I need to move to another provider.
Cheers!
30 Replies
I've been chalking it up to early upgraders being resource hogs (or "omg, new shiny, have to benchmark it!").
For now I've just made the apps cope; the extra RAM helped quite a bit in that regard. Hopefully things will get sorted out once everything settles down a bit.
I want to test some stuff myself, and prefer to use the same version.
I am testing a CPU-intensive web app, but performance is 3x slower than on my old desktop Core 2 Duo E6500, and that's while I'm on a new E5-2670 host that shows as idle in the manager. I expected at least similar performance. (I'm on the 1024 plan.)
Also, do I need to pass some command-line option to run it on only one core or so? Or does that even out by itself when I compare it with systems that have fewer cores?
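(If pinning to one core turns out to be necessary, my understanding is that taskset is the usual tool; the commands below are just a sketch with placeholder names.)

```
# Pin the app to CPU 0 so it behaves like a single-core machine.
taskset -c 0 ./mywebapp        # ./mywebapp is a placeholder for the real start command

# Or pin an already-running process (12345 is a placeholder PID):
taskset -cp 0 12345
```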
Update
Turns out to be a Mono issue. The same binary takes 3x the running time it does on the original Windows .NET framework. That's a huge disappointment. I guess I need to look for a Windows host for this project.
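For what it's worth, the comparison doesn't need anything fancier than timing the same build on each runtime; a sketch (MyApp.exe is a placeholder name):

```
# On the Linode, under Mono:
time mono MyApp.exe

# On Windows, under the stock .NET runtime (PowerShell):
#   Measure-Command { .\MyApp.exe }
```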
Linode 1024 - Xeon(R) CPU E5-2630L 0 @ 2.00GHz
CentOS 5.9 - 32 bit
Result: 164.5 / 143.9
Linode 1024 - Xeon(R) CPU E5-2670 0 @ 2.60GHz
Ubuntu 12.04.2 - 64 bit
Result: 1105.6 / 1326.6
Linode 1024 - Xeon(R) CPU E5-2630L 0 @ 2.00GHz
Debian 6.0.7 - 32 bit
Result: 199.3 / 184.5
Home workstation - AMD Phenom™ II X4 955
Ubuntu 12.04.2 - 64 bit
Result: 1449.4 / 1442.4
AWS micro instance - Xeon(R) CPU E5-2650 0 @ 2.00GHz
Ubuntu 12.04.2 - 64 bit
Result: 260.1 / 99.0
I don't get how the E5-2670 Linode can get a score 5 times higher than the other two. Maybe it's on an underused host. Maybe they upgraded more than just the CPU.
The AWS micro instance is particularly impressive, as Amazon makes it clear that that type of instance isn't suitable for anything but very light usage. Its root device is a network block device, it only has about 600 MB of memory, and it has only one CPU core.
EDIT: I ran the tests again. The E5-2670 Linode got the same amazingly high result, the AWS micro instance scored a lot lower the second time around.
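Since the score seems to depend heavily on which host you land on, it's probably worth noting the host CPU and load alongside each run; something simple like this (a sketch):

```
# Record the CPU model the host exposes and the guest's load before each run.
grep -m1 'model name' /proc/cpuinfo
uptime
```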
BYTE UNIX Benchmarks (Version 5.1.3)
System: poseidon: GNU/Linux
OS: GNU/Linux – 2.6.32-344-ec2 -- #46-Ubuntu SMP Wed Mar 7 13:48:15 UTC 2012
Machine: i686 (unknown)
Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
CPU 0: Intel(R) Xeon(R) CPU L5520 @ 2.27GHz (4536.0 bogomips)
Hyper-Threading, MMX, Physical Address Ext, SYSENTER/SYSEXIT
CPU 1: Intel(R) Xeon(R) CPU L5520 @ 2.27GHz (4536.0 bogomips)
Hyper-Threading, MMX, Physical Address Ext, SYSENTER/SYSEXIT
CPU 2: Intel(R) Xeon(R) CPU L5520 @ 2.27GHz (4536.0 bogomips)
Hyper-Threading, MMX, Physical Address Ext, SYSENTER/SYSEXIT
CPU 3: Intel(R) Xeon(R) CPU L5520 @ 2.27GHz (4536.0 bogomips)
Hyper-Threading, MMX, Physical Address Ext, SYSENTER/SYSEXIT
00:43:31 up 19 days, 7:36, 2 users, load average: 1.98, 2.68, 2.00; runlevel 2
Single test:
System Benchmarks Index Score 682.6
Parallel tests:
System Benchmarks Index Score 1703.8
BYTE UNIX Benchmarks (Version 5.1.3)
OS: GNU/Linux -- 3.7.10-x86_64-linode30 -- #1 SMP Wed Feb 27 14:29:31 EST 2013
Machine: x86_64 (GenuineIntel)
Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
CPU 0: Intel(R) Xeon(R) CPU E5-2630L 0 @ 2.00GHz (4000.1 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
...
CPU 7: Intel(R) Xeon(R) CPU E5-2630L 0 @ 2.00GHz (4000.1 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
16:36:38 up 8 days, 23 min, 2 users, load average: 0.16, 0.08, 0.06; runlevel
------------------------------------------------------------------------
Benchmark Run: Sat Apr 20 2013 16:36:38 - 17:05:03
8 CPUs in system; running 1 parallel copy of tests
System Benchmarks Index Score 93.9
Benchmark Run: Sat Apr 20 2013 17:05:03 - 17:34:39
8 CPUs in system; running 8 parallel copies of tests
System Benchmarks Index Score 296.4
2x Intel(R) Xeon(R) CPU E5-2650L 0 @ 1.80GHz
1x Intel(R) Xeon(R) CPU L5630 @ 2.13GHz
4x Intel(R) Xeon(R) CPU E5-2630L 0 @ 2.00GHz
1x Intel(R) Xeon(R) CPU L5520 @ 2.27GHz
Of those, two are old hardware (L5*). I've not landed on any E5-2670s. I've had people question the performance of two of the E5-2630L hosts, but I've not had a chance to verify any performance changes yet. I also had to migrate some servers twice, and one three times, due to a Xen bug that prevented 32-bit guests from booting. At this point I'm recommending people don't upgrade unless the RAM can really help them. It's just too much potential hassle.
Although I suspect Linode has lowered the single-threaded CPU allotment in favor of more CPUs…
I was actually expecting them to get a rubbish score.
I contacted Linode support about the poorly performing E5-2630L and got migrated to an E5-2670 0 @ 2.60GHz; it still doesn't perform as well as the old L5520. I've asked Linode to look into it, so we'll see what they come up with.
UnixBench 5.1.3:
380 (single) / 1808 (parallel)
BYTE UNIX Benchmarks (Version 5.1.3)
System: tsa: GNU/Linux
OS: GNU/Linux -- 3.8.4-linode50 -- #1 SMP Mon Mar 25 15:50:29 EDT 2013
Machine: i686 (i386)
Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
CPU 0: Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz (5200.1 bogomips)
Hyper-Threading, MMX, Physical Address Ext
CPU 1: Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz (5200.1 bogomips)
Hyper-Threading, MMX, Physical Address Ext
CPU 2: Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz (5200.1 bogomips)
Hyper-Threading, MMX, Physical Address Ext
CPU 3: Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz (5200.1 bogomips)
Hyper-Threading, MMX, Physical Address Ext
CPU 4: Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz (5200.1 bogomips)
Hyper-Threading, MMX, Physical Address Ext
CPU 5: Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz (5200.1 bogomips)
Hyper-Threading, MMX, Physical Address Ext
CPU 6: Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz (5200.1 bogomips)
Hyper-Threading, MMX, Physical Address Ext
CPU 7: Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz (5200.1 bogomips)
Hyper-Threading, MMX, Physical Address Ext
16:52:24 up 6 days, 17:39, 2 users, load average: 0.70, 2.59, 2.29; runlevel 2
------------------------------------------------------------------------
Benchmark Run: Sat Apr 20 2013 16:52:24 - 17:20:41
8 CPUs in system; running 1 parallel copy of tests
Dhrystone 2 using register variables 17501126.2 lps (10.0 s, 7 samples)
Double-Precision Whetstone 2672.2 MWIPS (10.0 s, 7 samples)
Execl Throughput 1296.3 lps (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 99418.0 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 26767.1 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 769607.5 KBps (30.0 s, 2 samples)
Pipe Throughput 123387.6 lps (10.0 s, 7 samples)
Pipe-based Context Switching 18226.4 lps (10.0 s, 7 samples)
Process Creation 2459.9 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 4279.6 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 1665.1 lpm (60.0 s, 2 samples)
System Call Overhead 465484.2 lps (10.0 s, 7 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 17501126.2 1499.7
Double-Precision Whetstone 55.0 2672.2 485.9
Execl Throughput 43.0 1296.3 301.5
File Copy 1024 bufsize 2000 maxblocks 3960.0 99418.0 251.1
File Copy 256 bufsize 500 maxblocks 1655.0 26767.1 161.7
File Copy 4096 bufsize 8000 maxblocks 5800.0 769607.5 1326.9
Pipe Throughput 12440.0 123387.6 99.2
Pipe-based Context Switching 4000.0 18226.4 45.6
Process Creation 126.0 2459.9 195.2
Shell Scripts (1 concurrent) 42.4 4279.6 1009.3
Shell Scripts (8 concurrent) 6.0 1665.1 2775.1
System Call Overhead 15000.0 465484.2 310.3
========
System Benchmarks Index Score 380.0
------------------------------------------------------------------------
Benchmark Run: Sat Apr 20 2013 17:20:41 - 17:49:07
8 CPUs in system; running 8 parallel copies of tests
Dhrystone 2 using register variables 100086132.2 lps (10.0 s, 7 samples)
Double-Precision Whetstone 19927.2 MWIPS (10.0 s, 7 samples)
Execl Throughput 7327.2 lps (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 496250.3 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 108040.8 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 1610986.5 KBps (30.0 s, 2 samples)
Pipe Throughput 962477.2 lps (10.0 s, 7 samples)
Pipe-based Context Switching 227418.8 lps (10.0 s, 7 samples)
Process Creation 11781.3 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 15480.2 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 2061.2 lpm (60.2 s, 2 samples)
System Call Overhead 2952615.5 lps (10.0 s, 7 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 100086132.2 8576.4
Double-Precision Whetstone 55.0 19927.2 3623.1
Execl Throughput 43.0 7327.2 1704.0
File Copy 1024 bufsize 2000 maxblocks 3960.0 496250.3 1253.2
File Copy 256 bufsize 500 maxblocks 1655.0 108040.8 652.8
File Copy 4096 bufsize 8000 maxblocks 5800.0 1610986.5 2777.6
Pipe Throughput 12440.0 962477.2 773.7
Pipe-based Context Switching 4000.0 227418.8 568.5
Process Creation 126.0 11781.3 935.0
Shell Scripts (1 concurrent) 42.4 15480.2 3651.0
Shell Scripts (8 concurrent) 6.0 2061.2 3435.3
System Call Overhead 15000.0 2952615.5 1968.4
========
System Benchmarks Index Score 1808.2
![](http://www.ntsel.com/R4J2DX7P.png)
Can you guess when I migrated to an E5-2630L to get my free double RAM?
@Stever:
Maybe more telling than running benchmarks is some real-world data - time for clamd to reload its database.
That clearly illustrates a serious problem here. I'm wondering what changed; has Linode started overselling RAM or something?
Anyone know if the problems happened with the RAM upgrade or the CPU upgrade? Maybe this is due to some flaw in Xen that's killing CPU caching or something of that nature.
One of them was on a node where another user was severely abusing disk I/O, so I migrated to another host and it's helped a lot: tasks that were taking 7 minutes are now taking 3 (which is how long they should be). I timed a MySQL database dump; it was taking ~60 seconds, and now it's taking ~20.
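The dump timing is nothing clever; roughly this, with a placeholder database name and output path:

```
# Time a dump of one database to local disk (mydb and the path are placeholders).
time mysqldump --single-transaction mydb > /tmp/mydb.sql
```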
One important note is that both of these were on E5-2630L and the new ones are E5-2670. Either there is a noticeable difference between the two or the E5-2630Ls were heavily loaded hosts.
I'll post more details about the UnixBench scores later.
Here's my Munin processing time over the past month or so. This machine (now a Linode 2048) moved from an L5520 to an E5-2630L somewhere in the middle of the month:
![](http://drop.hoopycat.com/munin-proctime-20130423.png)
Also, here's CPU usage for another machine (now a Linode 1024), which moved from an L5420 to an E5-2650L…
![](http://drop.hoopycat.com/munin-rocwiki-cpu-20130423.png)
And the pingdom page load time over a similar period:
![](http://drop.hoopycat.com/rocwiki-pingdom-20130423.png)
I'm not saying the problem doesn't exist, but it doesn't seem universal.
BTW: On the unixbench tests, y'all are running that from a completely idle system (e.g. Finnix), right?
I can only guess at this point that it's CPU-related, even though my own guest CPU load is only 6-10% on average. But with the memory increase, the working set (including the database) can now pretty much fit into memory, something I do see reflected in my I/O graphs, which have dropped significantly on average.
I just don't get it, since by the specs it's hard to see how the new node can't at least match the prior performance. I guess it could be a busy host, but I've had that in the past (and even migrated once) and it didn't produce average application level results as bad as I'm currently seeing. And then it was generally I/O wait that was causing my problems, but I see very little of that right now, especially with my lower I/O rate.
I do see significantly higher CPU steal percentages than I recall on older nodes, or currently see on non-upgraded nodes. I don't know if that's real or if it's just being reported more accurately, nor how much of an impact it's having. Though I wonder if it wasn't a mistake to increase guests to 8 cores: maybe the overhead of dealing with the extra contention among all the guests across the larger set of cores actually hurts more, at least in the average case, than access to the extra cores helps.
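For anyone who wants to put a number on the steal rather than eyeball it, the stolen time shows up in the standard tools; a quick sketch:

```
# "st" is the percentage of time the hypervisor ran other guests
# while this guest had runnable work.
vmstat 5 12                    # sample for about a minute; st is the last column
top -bn1 | grep -i 'cpu(s)'    # the "st" field here shows the same thing
```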
It's unfortunate, since I was really enthusiastic about the upgrades, but at the moment, I sort of wish I hadn't done any - things were more stable and predictable before. Maybe that's just due to changing hosts and could have happened previously too - it just seemed so unlikely that moving to a new node could end up being a step backwards. I just hope whatever is going on settles down and I can stop focusing on it and having to worry about the change in my application performance.
– David
What's going on with Linode these days?
An upgrade that feels like a downgrade…
@hoopycat:
BTW: On the unixbench tests, y'all are running that from a completely idle system (e.g. Finnix), right?
In my case, close enough: Ubuntu with all the services stopped (even cron!) except ssh.
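Concretely, something like the following before kicking off the run; the service names are just examples for whatever happens to be installed:

```
# Stop non-essential services before benchmarking (names are examples).
sudo service cron stop
sudo service apache2 stop
sudo service mysql stop
uptime    # sanity-check that the box really is idle before starting
```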
I've resolved my problems with two servers thanks to Linode support.
One server had an abusive user hitting the disk; for the other I don't know the reason for the poor performance. Both were on E5-2630L hosts and are now on E5-2670s.
After the upgrade, reports and backups that used to take about half an hour increased to an hour. After moving to the E5-2670 hosts, they're back down to half an hour.
Here is a Munin graph of the server load for the month:
You can see an increase around week 16, which is when I first upgraded (ignore the big spike; that was a test). After the spike you can see it has dropped back down to normal; this is the new E5-2670 host.
The UnixBench scores for the E5-2630L were:
UnixBench (w/ all processors) 375.5
UnixBench (w/ one processor) 153.3
The UnixBench scores for the E5-2670 are:
UnixBench (w/ all processors) 1062.4
UnixBench (w/ one processor) 361.4
Now, this isn't the best UnixBench score I've ever seen on a Linode. The best I've seen is:
UnixBench (w/ all processors) 1431.4
UnixBench (w/ one processor) 524.5
which was on an Intel(R) Xeon(R) CPU L5630 @ 2.13GHz.
However, the reason seems to be that the UnixBench file copy tests perform worse on the new E5 hardware than on the old hardware, even though the Dhrystone and Whetstone tests show a good improvement on the new hardware. UnixBench seems to penalize the overall score because of the file copy tests.
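You can see this by running the disk-sensitive tests and the CPU tests separately; UnixBench's Run script accepts individual test names, so a single-copy comparison looks roughly like this (a sketch):

```
# Just the three File Copy tests (the ones dragging the index down):
./Run -c 1 fstime fsbuffer fsdisk

# Just the pure CPU tests, for comparison:
./Run -c 1 dhry2reg whetstone-double
```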
Another point: I've not had reports of performance problems on larger nodes (2GB/4GB), but I've not inspected them closely enough to be sure they're not affected; it could be that they're just affected less. The two servers in this post were 512s (now 1GBs).
The servers in question were a high-availability pair, so downtime for one server wasn't a problem. For those of you with problems and a non-HA setup, you'll probably have to migrate to a new host, which means downtime.
If you're interested, the benchmarks can be viewed in detail here:
E5-2630L
E5-2670
L5630
Linode support were very polite and helpful during the diagnosis and were apologetic about having to migrate again. All in all my faith in Linode's quality support has been retained!
tl;dr
UnixBench appears to be less accurate on the new processors. The E5-2670 > E5-2630L. Ask to be migrated to an E5-2670 host. Linode support is still awesome.
Migrated for 2xRAM on 4/12, maintenance occurred on 5/7.
I have moved two of those non-mission-critical nodes to RamNode and I'm pleased. However, I'm hosting my production sites on Linode's trusty old 512MB plan; I won't upgrade that.
Here's what my Google Webmaster Tools graph looks like after I migrated to RamNode. Performance is consistent, and the last drop happened after I installed PHP-APC.
@adergaard:
I'm also seeing performance hits.
What's going on with Linode these days?
An upgrade that feels like a downgrade…
I so wish I'd seen this whole thread earlier.
Never would have gone the "free upgrade" road.
Performance has been awful ever since, and support keeps saying it must be my fault, some configuration change or whatnot…
It would probably have helped many people if this info had been shared more widely. This topic/thread clearly shows that the upgrade was a downgrade for some/many of those who did it…
Haven't changed any application settings.
Anyway, I requested a switch and landed on a server where I'm back to ~30ms pings or so. So I'm happy for now, but I don't understand how I ever had 1-second nginx pings in the first place. Methinks there is some grave misconfiguration that Linode needs to suss out.
I seem to recall that in the past each host node had Linodes of a particular size, e.g. you wouldn't have 512MB Linodes on the same node as 2048MB Linodes. Does anyone know if that is still the case? If not, that change, along with the 8-core bump, could explain smaller Linodes getting squeezed.
I don't believe that's true anymore. I think you "could" run into a situation where there are multiple sizes of Linodes on the same box.
Edit: I don't know that there's been any official word on this from Linode.