[OT] What CPU would you use in a personal file server?
I'm thinking of building a home server primarily for backup, media streaming, etc., but I'm having trouble deciding on a CPU to use. At some point I may want to run a few KVM virtual machines on it, so I want something that supports the full range of Intel virtualisation technologies. I was thinking an E3 Xeon (something like the E3 1275 v3) would be a good fit.
I'll probably be running 2 or 4 hard drives in either RAID 1 or RAID 10.
10 Replies
It supports ECC RAM and I'm using a Supermicro server board. Since you plan on running some VMs, I think an E3 would be great. However, I'm not sure that the 1275 is worth it - a media server probably needs the graphics, but the 1245 is significantly cheaper, only 100 MHz slower, and uses less power.
I have a colo server with an E3 1230 v3 running 11 VMs and it runs around 20-30% utilization on average, so I think anything of that caliber will be able to easily handle what you are looking to do. I don't have experience with running a media server on such a platform, but I think it would be fine.
Also, I use RAIDZ with ZFS on Linux. You might consider that instead of the standard RAID 1 or 10.
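To put rough numbers on that choice, here is a minimal sketch comparing usable space for RAID 1, RAID 10 and single-parity RAIDZ. The drive count and the 3 TB drive size in the example are assumptions for illustration (the original post doesn't say which drives are planned), and the figures ignore filesystem overhead.

```python
def usable_tb(layout: str, drives: int, drive_tb: float) -> float:
    """Approximate usable capacity in TB, ignoring filesystem overhead."""
    if layout == "raid1":
        # Every drive holds a full copy, so you get one drive's worth of space.
        if drives < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        return drive_tb
    if layout == "raid10":
        # Striped two-way mirrors: half of the drives hold copies.
        if drives < 4 or drives % 2:
            raise ValueError("RAID 10 needs an even number of drives, at least 4")
        return drives / 2 * drive_tb
    if layout == "raidz1":
        # Single parity: one drive's worth of space goes to parity.
        if drives < 3:
            raise ValueError("this sketch assumes at least 3 drives for RAIDZ1")
        return (drives - 1) * drive_tb
    raise ValueError(f"unknown layout: {layout}")


# Hypothetical build: 4 drives of 3 TB each.
for layout in ("raid10", "raidz1"):
    print(layout, usable_tb(layout, 4, 3.0), "TB usable")
```

With four drives, RAIDZ1 gives more space than RAID 10, but it only tolerates a single drive failure, whereas RAID 10 can survive two failures if they land in different mirror pairs.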
@Guspaz:
I'm also running ZFS on a home file server, with a decent data set (44TB raw capacity across 15 drives). One thing I'd caution is to not cheap out on the CPU too much. My first attempt to build the server involved a Pentium Dual Core chip, because they were very cheap and I didn't think I needed much CPU power for a file server. Big mistake! ZFS, at least, is relatively resource intensive, what with doing checksums on all data, and the encouragement to enable LZ4 compression on all your data due to the performance improvements.
The Pentium Dual Core chip I had in there floundered; ZFS wasn't multithreaded at the time (it looked like it used one process per pool, and nearly all my use was in one big pool), and the per-core performance of the Core 2 architecture (which was super outdated even when I bought the thing) was terrible: lots of stuff I'd do with the file server would max out a CPU core. Ultimately, I replaced it with a Core i7 920, which is also pretty out of date, but more than fast enough for ZFS (and it was free, pulled from my old desktop). Anyhow, this doesn't seem to be a problem, because you're considering modern chips.
In terms of your CPU choice, do you need a Xeon? Consider that some consumer parts support ECC RAM in server motherboards. The i3-4330 is half the cost of the E3-1275, and won't be that much slower. It's also a generation newer and uses less power. And the i3-4330 supports ECC RAM and the most important virtualization extensions. The E3's extra cache would help with virtualization, though.
EDIT: In terms of the compression comment, LZ4 (which was added to ZFS earlier this year) is capable of compression/decompression speeds that are high enough that it's bottlenecked on disk reads/writes rather than the CPU, so it generally improves performance by reducing how much data needs to be read/written.
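The reasoning in that edit can be written down as a back-of-the-envelope model. This is only a sketch: the function name and every number in the example are made up for illustration, not measurements from this server.

```python
def effective_read_mb_s(disk_mb_s: float, compression_ratio: float,
                        decompress_mb_s: float) -> float:
    """Uncompressed MB/s delivered to the application when reading compressed data.

    disk_mb_s         -- raw sequential read speed of the disks
    compression_ratio -- uncompressed size / compressed size (1.0 = incompressible)
    decompress_mb_s   -- uncompressed MB/s the CPU can produce while decompressing
    """
    # The disks deliver compressed bytes; each one expands by the compression ratio.
    disk_limited = disk_mb_s * compression_ratio
    # The CPU cannot hand out data faster than it can decompress it.
    return min(disk_limited, decompress_mb_s)


# Hypothetical numbers: 150 MB/s disks, 1.5x compressible data.
print(effective_read_mb_s(150, 1.5, 2000))  # fast CPU: the disks are the limit
print(effective_read_mb_s(150, 1.5, 100))   # slow CPU: decompression is the limit
```

As long as the CPU term is the larger one, compression makes reads faster than the bare disks could manage. Once the CPU becomes the smaller term, as it would on a badly underpowered chip, the same setting can end up slower than no compression at all.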
That's a really big array - I only have 3 x 3TB drives. How much memory do you have on that box? One thing I have noticed is that memory usage is very important for ZFS.
Keep that in mind if you are planning on doing virtualization. I would max out the memory (32 GB) and keep an eye on performance to make sure everything runs well.
Currently, running the Nehalem-era hardware, it's fully maxed out with 24GB of RAM, although I would probably have been fine with 12GB. After the disaster that was the 2GB experience, I just said "Screw it, you'll never be RAM starved again."
Yeah, ZFS is RAM-hungry. Especially because I started out using deduplication on some of the smaller file systems (another huge mistake, deduplication is almost never worth it). RAM is cheap.
The disk setup in the system is currently:
Boot:
1x Intel 160GB G1 SSD
zpool:
7x Hitachi 5400RPM 4TB (RAIDZ2 VDEV 1)
8x WDC Green 2TB (RAIDZ2 VDEV 2)
2x Intel 80GB G2 SSD (L2ARC)
If this was a serious machine rather than a home file server, I'd probably have more modern SSDs in there, but the G2 Intel drives were super cheap on liquidation (I think I paid $30 a pop about a year ago). I used to have a RAIDZ mirror for boot using 250GB notebook drives back when I ran OpenSolaris on the box (it's Ubuntu now), but there is basically very little of value on the machine outside of the storage array, and the config data for ZFS storage pools is stored on the pool itself rather than the host system. As in, if the system drive fries, I could rebuild the system and re-import the storage pool. Heck, I already moved the zpool from OpenSolaris to Linux without much issue.
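For anyone wondering how the 44TB raw figure breaks down, here is a quick sketch of the arithmetic for the two RAIDZ2 vdevs listed above. The class and field names are just for illustration, and the usable number is an approximation that ignores ZFS metadata and padding overhead.

```python
from dataclasses import dataclass


@dataclass
class RaidzVdev:
    drives: int      # number of disks in the vdev
    drive_tb: float  # size of each disk in TB
    parity: int      # 2 for RAIDZ2

    @property
    def raw_tb(self) -> float:
        return self.drives * self.drive_tb

    @property
    def usable_tb(self) -> float:
        # Parity consumes the equivalent of `parity` whole disks per vdev.
        return (self.drives - self.parity) * self.drive_tb


# The two data vdevs from the list above (boot SSD and L2ARC SSDs excluded).
pool = [
    RaidzVdev(drives=7, drive_tb=4, parity=2),  # 7x Hitachi 4TB
    RaidzVdev(drives=8, drive_tb=2, parity=2),  # 8x WDC Green 2TB
]

total_drives = sum(v.drives for v in pool)
raw_tb = sum(v.raw_tb for v in pool)
usable_tb = sum(v.usable_tb for v in pool)
print(f"{total_drives} drives, {raw_tb} TB raw, roughly {usable_tb} TB usable")
```

That works out to 15 drives and 44 TB raw, with roughly 32 TB left after parity.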
To be honest I've never used it. I've heard of some of its advantages but I haven't really had the chance to actually use it. I might just try it out in a virtual machine just to get the hang of the command line tools and read the documentation.
@Guspaz:
I already moved the zpool from OpenSolaris to Linux without much issue.
Out of interest have you ever tried the FreeBSD ZFS implementation?
A consumer Haswell i3 with non-ECC desktop RAM will work perfectly well for your needs and will likely save you some money, so you can buy more hard drive space.
I recently set up an HTPC+NAS with an i3-4330 CPU, and I am adding WD Red 3TB drives to it. I'm also considering using FlexRAID, as it can run on top of Windows or Linux.
@Guspaz:
I already moved the zpool from OpenSolaris to Linux without much issue.
Out of interest have you ever tried the FreeBSD ZFS implementation?
I haven't. There was some confusion about whether FreeBSD supported the same on-disk ZFS version that I was using at the time I was migrating off OpenSolaris, so I migrated to zfsonlinux instead. Add to that the fact that I had a decade of experience using Linux and virtually no experience using BSD/UNIX apart from OpenSolaris (which I was never comfortable with), and the choice was pretty easy.
That said, I've heard good things about ZFS on FreeBSD.
One thing I didn't realise is that ZFS supports caching drives. So I presume I can have my 4 normal 7200 RPM SATA drives and then stick a 128GB SSD drive in front of it just to cache the file system? Pretty awesome. Looks like it is well worth spending the extra on RAM / CPU for the flexibility.
The L2ARC normally doesn't exist, and it can be enabled by adding a "cache" device to a pool. It doesn't need to be mirrored or have more than one disk, because if the read from the L2ARC fails, ZFS will simply fall back to reading from disk.
Then there's the ZFS Intent Log, or ZIL. This is sort of a write cache, although not exactly. I won't get into detail, but suffice it to say that the ZIL is normally located on the storage pool itself, and Very Bad Things™ can happen if it's lost. As such, if you want to move the ZIL to dedicated log devices, it's highly recommended to use at least a 2-way SSD mirror.
Personally, I find it's not worth messing with the ZIL, I just slapped two SSDs in as L2ARC cache devices and called it a day.
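For reference, this is roughly what the two options look like on the command line, sketched as Python helpers that only build the `zpool add` argument lists and never run them. The pool name `tank` and the device paths are placeholders, not the actual names on this machine.

```python
import shlex


def add_cache_cmd(pool: str, devices: list[str]) -> list[str]:
    """Build a `zpool add` command that attaches L2ARC cache devices."""
    if not devices:
        raise ValueError("need at least one cache device")
    # Cache devices are not mirrored; a failed read falls back to the pool disks.
    return ["zpool", "add", pool, "cache", *devices]


def add_mirrored_log_cmd(pool: str, devices: list[str]) -> list[str]:
    """Build a `zpool add` command that attaches a mirrored log (ZIL) device."""
    if len(devices) < 2:
        raise ValueError("a mirrored log needs at least two devices")
    return ["zpool", "add", pool, "log", "mirror", *devices]


# Placeholder pool and device names; print the commands rather than executing them.
print(shlex.join(add_cache_cmd("tank", ["/dev/sdx", "/dev/sdy"])))
print(shlex.join(add_mirrored_log_cmd("tank", ["/dev/sdv", "/dev/sdw"])))
```

The helpers refuse a single-device log on purpose, to match the advice above about mirroring the ZIL; ZFS itself will accept an unmirrored log device if you ask for one.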