Using two Linodes for one site

I would like to set up my site to run MySQL on one Linode and Apache or nginx on the other. In doing so, I have the following questions:

1) Which Linode would be expected to endure the higher load of the two, given a "normal" web app that averages 8-10 optimized queries per pageview? Is the strain greater on the web server or on MySQL? The web server Linode would run PHP, since my site is PHP-based, and the PHP isn't that "heavy" either.

2) Is this tactic needed at all? My site sees about 200,000 pageviews/day and averages about 25 queries/second. Most queries are optimized, i.e. there's no real strain on MySQL beyond normal operations, selecting from an index and such. The question is whether I should just get one "big enough" Linode instead of two separate ones.

3) Can I set up one Linode to host mysite.com and the other to host mysql.mysite.com, or will it be a problem to have the same main domain split across two Linodes via subdomains, i.e. mysql.mysite.com on one Linode and mysite.com on the other?

4) Would you care to suggest a good starting point for the two Linodes? As I understand it, I can "combine" the bandwidth from all my Linodes before there's any overage cost. Correct? So would I be OK starting with two Linode 360s, or is that too tiny?

I sincerely appreciate any answers / experience you may have on this. BTW, I know of the post about this in the library, but it doesn't really address the above issues.

Thanks much.

19 Replies

If your application makes optimized queries and has a good caching strategy, there's no reason a single Linode 512 shouldn't be able to pump out 200K page views per day, especially if all static files are served by nginx. Give MySQL approx. 1/4 of your RAM, and spawn a dozen PHP processes. It should work fine, unless your app is terribly written.

If, for some reason, a 512 is not big enough for you, you can just upgrade it until it's enough. You can upgrade all the way to 4096 and even beyond. Also, one server is easier to manage than two servers.

So IMO there's not much to be gained by splitting the web server and the DB server for your workload, unless you're looking to build some sort of high-availability (automatic fallback) cluster.

At that load you will be fine on a 512MB. I run something that takes more hits than that and uses seriously non-optimized queries (it's really old code… too busy to update), and it performs perfectly.

It's easier to expand vertically than horizontally (i.e. a bigger server vs. more servers). The only reasons for more servers are:

1) High Availability

2) Geo location

Remember that if you split your database and web server, you're introducing network latency into the mix, which will actually slow things down a bit. OK, nodes in the same datacentre will minimise this, but it still won't be as fast as serving everything locally.

If you're having load issues on your 512, you need to investigate your bottlenecks.

Thanks much, both of you, for the answers.

That's just awesome that a single 512 would probably cut it.

I'm thinking I'll be needing some more bandwidth though but that shouldn't be a problem.

@hybinet: a couple of quick follow-up questions for you.

  • is there any reason I can't just stick to having nginx only and run php-fpm?

  • if I do stick to having nginx serve, say, the images of the site and let Apache do the rest, your answer about spawning a dozen or so PHP threads still implies to me that you suggest running PHP as FastCGI and not as a module in Apache. Am I correct? What then would be the reason for Apache?

Sorry if I'm misunderstanding you here.

Anyway, thanks again, both of you, for taking the time to answer.

@adergaard:

@hybinet: a couple of quick follow-up questions for you.

  • is there any reason I can't just stick to having nginx only and run php-fpm?

  • if I do stick to having nginx serve, say, the images of the site and let Apache do the rest, your answer about spawning a dozen or so PHP threads still implies to me that you suggest running PHP as FastCGI and not as a module in Apache. Am I correct? What then would be the reason for Apache?
    By "spawning PHP processes" I was being somewhat neutral there 8)

IMO there's very little difference between a) running PHP as FastCGI (either php-fpm or old-style spawn-fcgi) and b) running Apache + PHP and proxying dynamic requests from nginx. Both are quite stable, and both perform approximately the same, give or take a few %, if they are properly set up. The nginx configuration is also very similar: fastcgi_pass vs. proxy_pass.
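A rough sketch of what I mean, where only the backend line really differs (the socket path and port here are just placeholders):

nginx:

location ~ \.php$
{
    # a) hand PHP requests to FastCGI (php-fpm or spawn-fcgi)
    fastcgi_pass unix:/var/run/php-fpm.sock;
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;

    # b) or proxy them to Apache + mod_php instead:
    # proxy_pass http://127.0.0.1:8080;
}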

The only reason you actually "need" Apache on a VPS is when you need a specific Apache module or if you want to have parsed-on-the-fly per-directory .htaccess files. Everything else can be done by nginx, and it all boils down to preference.

Just do what you're more comfortable with. Some people prefer the proven reliability of Apache, though I suppose FPM could change that. As for myself, I've been using good ol' spawn-fcgi for a couple of years and it's been very reliable though I miss some of FPM's bells and whistles.

If you take the Apache route, keep MaxClients/ServerLimit below 20. If you take the FastCGI route, you should also have no more than 20 children on a Linode 512. That's really all that matters.

You say you might need more bandwidth. Are you caching images and gzipping data? That will reduce your bandwidth considerably (and the caching of images/JavaScript/style sheets reduces the number of requests on your server).

@obs: my site is pretty image-heavy and there's really no caching mechanism known to me other than the "expires" header statement.

I already use a sprite (all the CSS images combined into one), but these other images are user-contributed stuff, so there's no caching I could apply that I'm aware of other than expires headers, and that's just not a path I like to tread.

gzipping data: no. But the data is minimal so I'm thinking gzipping it would be overkill and just strain the server side.

Anyway, please don't see this as me not being grateful for the advice, believe me I am. Thanks.

@adergaard:

gzipping data: no. But the data is minimal so I'm thinking gzipping it would be overkill and just strain the server side.

gzip does not cause any noticeable server strain, especially on modern hardware such as Linode's, and especially on easily compressible data such as HTML, CSS, JavaScript, etc. The fact that gzipped text is often less than 1/4 the size of the original may actually offset any CPU cost by reducing the time it takes to deliver the page.

Let's say you have 40KB of HTML/CSS/JS per page. (Modern web pages probably weigh much more; jQuery alone is 24KB. But let's say most of it gets cached using the expires header.) At 200K page views per day, that's 8GB of easily compressible data per day. Do the math :D

PHP:

ob_start("ob_gzhandler");

nginx:

location /whatever
{
    gzip on;
}

In the case of images, however, they're already compressed (unless they're BMP :oops: ), so gzip doesn't make much sense. Use gzip where it makes sense, i.e. text and HTML files. It's a trivial way to shave several gigs off your bandwidth with virtually no effort.
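If you want to restrict gzip to text, a minimal sketch (the MIME type list is just a sensible guess; text/html is compressed by default once gzip is on):

nginx:

gzip on;
# compress common text types besides the default text/html
gzip_types text/plain text/css application/x-javascript;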

BTW, @obs probably means using the expires header as you guessed.

nginx:

location /images
{
    expires 24h; # or 7d, 30d, whatever
}

@hybinet:

thanks much. I'll look into what I can do with the expiration clause in nginx.

Yes, I do mean the expires header. I set all static files to expires max in nginx, and it works a treat: pages load super quick. I also gzip content. When I first enabled gzipping and caching way back when, I went from 50GB/month down to… well, about 8GB. See the savings!
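For reference, the nginx bit for that is tiny; something like this (adjust the extension list to whatever your static files are):

nginx:

location ~* \.(jpg|jpeg|gif|png|ico|css|js)$
{
    expires max;
}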

@obs:

expires max
There's one catch. If you ever edit one of those files, you'd better rename it. Otherwise some users will keep using the old cached version and complain that things don't work. A useful trick is to append a version number to all of your CSS and JS filenames, e.g. "mycss-1.0.css" and increment them when you make changes. It's actually possible to have this done automagically with a clever combination of PHP and rewrite rules.

Just a tip for those experimenting with gzip in nginx: if you haven't, have a look at gzip_static:

http://wiki.nginx.org/NginxHttpGzipStaticModule

Just like it sounds, stash a copy of static files pre-gz'd, to save repeated compression overhead.
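A minimal sketch, assuming your static files live under /static and you pre-compress them yourself (e.g. with gzip, keeping the originals alongside):

nginx:

location /static
{
    # serve foo.css.gz if present instead of compressing foo.css per request
    gzip_static on;
}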

For "expires: max", this one's obvious in retrospect but I needed it pointed out… :)

If you're using svn or git or whatever (a proper VCS) for your files, you have the perfect source of data for file versioning right there! In your deploy script, just tweak a couple of steps:

# Pre-checkout: remove the previously generated versioned files
rm -f /var/www/domain/static/*---*_.css
rm -f /var/www/domain/static/*---*_.js

# Checkout from the VCS
svn update /wherever/domain/

# Post-checkout: grab the repository revision to use as the version
version=$(svn info /wherever/domain/ | awk '/^Revision:/ {print $2}')

# Append the version to each static filename... keep it for later, too!
# I use an ugly, unique "hook" just in case
# (e.g. *.js -> *---version_.js)
for f in /var/www/domain/static/*.js /var/www/domain/static/*.css; do
    ext="${f##*.}"
    mv "$f" "${f%.*}---${version}_.${ext}"
done

# And update your templates/html/whatever: replace a ==VERSION==
# hook (similar to above) with the real version, so the pages
# reference the new filenames
for htmlfile in $(grep -l "==VERSION==" /sitedir/templates/*.html); do
    sed "s/==VERSION==/${version}/g" "$htmlfile" > "$htmlfile.tmp"
    mv "$htmlfile.tmp" "$htmlfile"
done

# OR a symbolic link, but I've never had a use case for it; perhaps
# useful if you can't use rewrites or you want to avoid the regex:
# ln -s /staticdir/css/cssname.css /staticdir/css/cssname---${version}_.css

Then add a rule like "rewrite ^/(js|css)/(.+)---(.+)_\.(js|css)$ /$1/$2.$4;" in your web server of choice, and you never need worry about stale files again! :wink:

As above, if you can't use rewrites or have other reasons you could use a symbolic link or equivalent, of course. Just make sure your web server is ok with following them.

Gotchas:
- Don't make the "hook" in the filename so ugly that your server or older browsers could have an issue with it! (Sounds silly, but it can happen.)

- Make sure to delete the files prior to the checkout from the VCS… and make sure this is __strictly__ a production, deploy, one-way thing, with no changes being written back to the repo. This is the scrappy bit, but it worked for us on 10+ front-end servers for 2 years; it's good enough.

- Try to do minification / automated code shrinking, etc. as a step __after__ the checkout, on the static server itself, if you can… It helps avoid minor differences causing full updates for all of your users.

__[ Edit: Updated to explain the href/template situation, thanks db3l! :)

The pseudocode above is only a rough sketch; I'll replace it with the real script if I can find it later… ]__

@jlynch:

If you're using svn or git or whatever, a proper vcs for your files? You have the perfect source of data for file versioning right there! In your deploy script just tweak a couple of steps:
I may just be misunderstanding the issue, but aren't you missing a step somewhere where you actually change all the references in any site content to use the new version name? It's all well and good to automatically rewrite references inside of nginx but the reason for adjusting the names (or more commonly I think, adding query parameters) to css/js files is so the client browser always re-requests them.

Otherwise, the browser client still thinks it's requesting the same xxx.js file - or whatever hook name is used - regardless of what version nginx will map it to in the filesystem and may still overly aggressively cache it.

I agree you can use something like an svn version for this, but I'd probably just use that version to add/update query parameters onto references to css/js files in the source and not worry about any mapping on the web server side.

– David

@db3l:

@jlynch:

If you're using svn or git or whatever, a proper vcs for your files? You have the perfect source of data for file versioning right there! In your deploy script just tweak a couple of steps:
I may just be misunderstanding the issue, but aren't you missing a step somewhere where you actually change all the references in any site content to use the new version name? It's all well and good to automatically rewrite references inside of nginx but the reason for adjusting the names (or more commonly I think, adding query parameters) to css/js files is so the client browser always re-requests them.

Otherwise, the browser client still thinks it's requesting the same xxx.js file - or whatever hook name is used - regardless of what version nginx will map it to in the filesystem and may still overly aggressively cache it.

I agree you can use something like an svn version for this, but I'd probably just use that version to add/update query parameters onto references to css/js files in the source and not worry about any mapping on the web server side.

– David

Sorry, I kind of thought this was implied as it had already been discussed. :P

The way you do it is going to vary utterly based on how you're doing your site but yes, in your root template (Or each page where they're referenced), you need to update the link to use the new number…

We actually did this with a similar step run on the templates: they had a hook string in them, and a script found all of those, used them to get the filenames, got the versions for those, and replaced the strings. I thought that was getting a bit involved and messy to be putting in here; I'll update the explanation, I guess. :)

__[ Edit: For what it's worth, I know query parameters are a common way to do this stuff, and I'm sure they work just fine in most cases, but I found they could have unexpected issues, particularly with upstream caching, which does all sorts of damage when the whole point is to extend expiries.

More subjectively I also found they just plain didn't feel right compared to having mapped "filenames", even if they were virtual or fake or whatever we should call them, but I guess that's a personal preference more than anything. ]__

@jlynch:

Sorry, I kind of thought this was implied as it had already been discussed
Fair enough - I thought I checked the thread and only found references to renaming the files, not changing the content, though I agree it is implied. I just expected to see it as a step when the process was being described step by step.

> The way you do it is going to vary utterly based on how you're doing your site but yes, in your root template (Or each page where they're referenced), you need to update the link to use the new number…
Hmm, but in that case, why go through all the hassle of renaming the files? Using a query parameter (e.g., foo.css?v=###) where ### could be the svn repository version means there's nothing to change on the server filesystem, nor any rewrite rules necessary. And if your site is dynamically generated, the reference could be inserted at render time, so no checkout post-processing ever needed (just an "svn update" on the server).

Just as I was doing my final preview of this post, I saw your edit and the comment on query tags, so no answer is required, but I figured I'd leave a separate opinion in this post, since I'm on the other subjective side, liking the look of a version parameter better than a new filename (I guess I see the version as a temporal dimension of the same named file).

Though as an aside, I am interested in experiences where caches break such approaches. At worst, it may prevent some intermediate caching, due to overly conservative caches not caching anything with a query string. Certainly a lot more than CSS/JS would break through any cache that didn't consider query strings part of the identity of a cached element.

> I thought that was getting a bit involved and messy to be putting in here, I'll update the explanation I guess. :)
No biggie - just an extra step of "update content to match" would probably be more than sufficient.

– David

There's a super easy way to do this without relying on VCS integration such as post-commit hooks. (Everyone should be using some sort of VCS, but anyway.)

PHP:

$timestamp = filemtime("example.css");
$display_filename = "example-$timestamp.css";
echo '<link rel="stylesheet" href="' . $display_filename . '">';

This will append a 10-digit Unix timestamp to the end of your CSS filename, and this value will change if and only if the CSS file gets modified. Now go ahead and create a rewrite rule that strips that hyphen and 10 digits from the end of your CSS filename.
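A minimal nginx sketch of such a rule (the extensions and paths are just examples):

nginx:

# serve /css/example-1234567890.css from /css/example.css by
# stripping the hyphen and the ten digits
location ~ "^(?<base>.+)-\d{10}\.(?<ext>css|js)$"
{
    try_files $base.$ext =404;
}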

Note: Unix time will become 11 digits sometime around the year 2286.

@hybinet:

Note: Unix time will become 11 digits sometime around the year 2286.

:o I better get ready for that! :P

I suspect the signed/unsigned 32-bit rollover in 2038 is going to be more "exciting", not to mention something we're more likely to be alive for :-)

– David

I suspect in 28 years we'll all be running 64-bit systems, so that shouldn't be a problem! Of course, someone… somewhere… in a deep dark hole will still have a 32-bit system, like those ye olde systems that still run on COBOL and the like.
