Using two linodes for one site
1) Which Linode would be expected to endure the higher load of the two, given that I have a "normal" web app that runs 8-10 optimized queries per pageview on average? Is the strain greater on the web server or on MySQL? The web server Linode would run PHP, since my site is PHP based, and the PHP isn't that "heavy" either.
2) Is this tactic needed at all? My site sees about 200,000 pageviews / day and runs about 25 queries / second. Most queries are optimized, i.e. there's no real strain on MySQL beyond normal operations, selecting from indexes and such. But the question is whether I should just get a "big enough" Linode instead of having two separate ones.
3) Can I set up one Linode to host mysite.com and the other to host mysql.mysite.com, or will it be a problem to have the same main domain split across two different Linodes via sub-domains, i.e. mysql.mysite.com on one Linode and mysite.com on the other?
4) Would you care to suggest a good starting point for the two Linodes? As I understand it, I can "combine" the bandwidth from all my Linodes before there's any overage cost. Correct? So, would I be OK starting with two Linode 360s for this, or is that too tiny?
I sincerely appreciate any answers / experience you may have on this. Btw, I know of the post about this in the library, but it doesn't really address the above issues.
Thanks much.
19 Replies
If, for some reason, a 512 is not big enough for you, you can just upgrade it until it's enough. You can upgrade all the way to 4096 and even beyond. Also, one server is easier to manage than two servers.
So IMO there's not much to be gained by splitting the web server and the DB server for your workload, unless you're looking to build some sort of high-availability (automatic fallback) cluster.
It's easier to expand vertically instead of horizontally (i.e. bigger server vs more servers). The only reasons for more servers are
1) High Availability
2) Geo location
Remember that if you split your database and web server, you're introducing network latency into the mix, which will actually slow things down a bit. Nodes in the same datacentre will minimise this, but it still won't be as fast as serving everything locally.
If you're having load issues on your 512, you need to investigate your bottlenecks.
That's just awesome that a single 512 would probably cut it.
I'm thinking I'll be needing some more bandwidth though but that shouldn't be a problem.
@hybinet: a couple of quick followup questions for you.
is there any reason I can't just stick to having nginx only and run php-fpm?
if I do stick to having nginx serve say the images of the site and let apache do the rest, your answer - about spawning a dozen or so php threads - still implies to me that you suggest running php as fcgi and not as a module in apache, am I correct? What then would be the reason for apache?
Sorry if I'm misunderstanding you here.
Anyway, thanks again, both of you, for taking the time to answer.
@adergaard:
@hybinet: a couple of quick followup questions for you.
is there any reason I can't just stick to having nginx only and run php-fpm?
if I do stick to having nginx serve say the images of the site and let apache do the rest, your answer - about spawning a dozen or so php threads - still implies to me that you suggest running php as fcgi and not as a module in apache, am I correct? What then would be the reason for apache?
By "spawning PHP processes" I was being somewhat neutral there8)
IMO there's very little difference between a) running PHP as FastCGI (either php-fpm or old-style spawn-fcgi) and b) running Apache + PHP and proxying dynamic requests from nginx. Both are quite stable, and both perform approximately the same, give or take a few %, if they are properly set up. The nginx configuration is also very similar: fastcgi_pass vs. proxy_pass.
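For reference, the two setups look roughly like this in nginx (addresses and ports are just placeholders):
nginx:
# a) FastCGI: hand .php requests straight to php-fpm / spawn-fcgi
location ~ \.php$
{
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_pass 127.0.0.1:9000;
}

# b) Proxy: pass the same requests to Apache + mod_php on another port
# location ~ \.php$
# {
#     proxy_pass http://127.0.0.1:8080;
#     proxy_set_header Host $host;
# }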
The only reason you actually "need" Apache on a VPS is when you need a specific Apache module or if you want to have parsed-on-the-fly per-directory .htaccess files. Everything else can be done by nginx, and it all boils down to preference.
Just do what you're more comfortable with. Some people prefer the proven reliability of Apache, though I suppose FPM could change that. As for myself, I've been using good ol' spawn-fcgi for a couple of years and it's been very reliable though I miss some of FPM's bells and whistles.
If you take the Apache route, keep MaxClients/ServerLimit below 20. If you take the FastCGI route, you should also have no more than 20 children on a Linode 512. That's really all that matters.
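If you go the php-fpm route, that cap is the pm.max_children setting in the pool config; a rough sketch (path and numbers are just illustrative):
php-fpm:
; e.g. /etc/php5/fpm/pool.d/www.conf (path varies by distro)
pm = dynamic
pm.max_children = 20
pm.start_servers = 4
pm.min_spare_servers = 2
pm.max_spare_servers = 6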
I already use a sprite (combining all images into one) for the CSS images, but these other images are user-contributed stuff, so there's no caching I could apply that I'm aware of other than the expires header, and that's just not a path I like to tread.
gzipping data: no. But the data is minimal so I'm thinking gzipping it would be overkill and just strain the server side.
Anyway, please don't see this as me not being grateful for the advice, believe me I am. Thanks.
@adergaard:
gzipping data: no. But the data is minimal so I'm thinking gzipping it would be overkill and just strain the server side.
gzip does not cause any noticeable server strain on modern hardware such as Linode's, especially when used on easily compressible data such as HTML, CSS, JavaScript, etc. The fact that gzipped text is often less than 1/4 the size of the original may actually offset any cost, by reducing the time it takes for the page to be delivered.
Let's say you have 40KB of HTML/CSS/JS per page. (Modern web pages probably weigh much more, jQuery alone is 24KB, but let's say most of it gets cached using the expires header.) At 200K page views per day, that's 8GB of easily compressible data per day. Do the math:
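Roughly: 40 KB × 200,000 pageviews is about 8 GB of text per day; compressed to under a quarter of that it's around 2 GB, which saves on the order of 6 GB of transfer every day.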
PHP:
ob_start("ob_gzhandler");
nginx:
location /whatever
{
gzip on;
}
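One caveat: by default nginx only gzips text/html, so to cover CSS and JS you'll want to list their types as well, something like:
nginx:
gzip on;
# text/html is always included; add the rest explicitly
gzip_types text/css text/javascript application/javascript application/x-javascript;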
In the case of images, however, they're already compressed (unless they're BMPs), so there's little point in gzipping them.
BTW, @obs probably means using the expires header as you guessed.
nginx:
location /images
{
expires 24h; # or 7d, 30d, whatever
}
thanks much. I'll look into what I can do with the expiration clause in nginx.
@obs:
expires max
There's one catch. If you ever edit one of those files, you'd better rename it. Otherwise some users will keep using the old cached version and complain that things don't work. A useful trick is to append a version number to all of your CSS and JS filenames, e.g. "mycss-1.0.css" and increment them when you make changes. It's actually possible to have this done automagically with a clever combination of PHP and rewrite rules.
Just like it sounds, stash a copy of static files pre-gz'd, to save repeated compression overhead.
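On nginx this is the gzip_static module (assuming it's compiled in): it looks for a ready-made .gz file next to the original and serves that instead of compressing on every request. Roughly:
nginx:
location /static/
{
    # Serve style.css.gz (created beforehand, e.g. `gzip -c style.css > style.css.gz`)
    # to clients that accept gzip; otherwise fall back to style.css.
    gzip_static on;
}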
For "expires: max", this one's obvious in retrospect but I needed it pointed out…
If you're using svn or git or whatever, a proper vcs for your files? You have the perfect source of data for file versioning right there! In your deploy script just tweak a couple of steps:
# Pre-checkout, remove the files
rm /var/www/domain/static/*_.css
rm /var/www/domain/static/*_.js

# Checkout from vcs
svn update /wherever/domain/

# Post-checkout, grab the file's version
# (e.g. the last-changed revision from `svn info`)
version=$(svn info filename.js | awk '/^Last Changed Rev:/ {print $4}')

# Then append it to the filename... keep it for later, too!
# I use an ugly, unique "hook" just in case,
# e.g. *.js -> *---version_.js
mv filename.js "filename---${version}_.js"

# ( I really can't remember the commands for much of this off
# the top of my head though, sorry, will edit this later when I
# have a minute to dig up the config files )

# And update your templates/html/whatever:
# e.g. replace a hook, similar to the above: grep your templates
# dir to get the files that use it, then iterate them, dropping in
# the version.
for htmlfile in $(grep -l "==VERSION==" /sitedir/templates/*.html); do
    sed "s/filename==VERSION==/filename---${version}_/g" "$htmlfile" > "$htmlfile.tmp"
    mv "$htmlfile.tmp" "$htmlfile"
done

# OR symbolic link, but I've never had a use case for it; perhaps
# useful if you can't use rewrites or you want to avoid the regex
#
# ln -s /staticdir/css/cssname.css "/staticdir/css/cssname---${version}_.css"
Then add a "rewrite static.dn.com/(js|css)/(.*)---(.*)_.(js|css) static.dn.com/$1/$2.$4" rule in your web server of choice, and you never need to worry about stale files again!
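In nginx that rule could look something like this (just a sketch; it assumes it lives inside the static site's server block):
nginx:
# Map /js/foo---1234_.js (and the css equivalent) back to /js/foo.js on disk.
rewrite ^/(js|css)/(.+)---.+_\.(js|css)$ /$1/$2.$3 last;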
As above, if you can't use rewrites or have other reasons you could use a symbolic link or equivalent, of course. Just make sure your web server is ok with following them.
Gotchas:
- Don't make the "hook" in the filename so ugly your server or older browsers could have an issue with it! (Sounds silly, but it can happen.)
- Make sure to delete the files prior to the checkout from vcs… and that this is __strictly__ a production, deploy, one-way thing, with no changes being written back to the repo. This is the scrappy bit, but it worked for us on 10+ front-end servers for 2 years; it's good enough.
- Try to do minification / automated code shrinking, etc. as a step __after__ the checkout, on the static server itself, if you can… It helps avoid minor differences causing full updates to all of your users.
__[ Edit: Updated to explain the href/template situation, thanks db3l!
I've abandoned any and all attempts of making pseudocode remotely runnable, will replace it with real Python if I can find the script later… ]__
@jlynch:
If you're using svn or git or whatever, a proper vcs for your files? You have the perfect source of data for file versioning right there! In your deploy script just tweak a couple of steps:
I may just be misunderstanding the issue, but aren't you missing a step somewhere where you actually change all the references in any site content to use the new version name? It's all well and good to automatically rewrite references inside of nginx but the reason for adjusting the names (or more commonly I think, adding query parameters) to css/js files is so the client browser always re-requests them.
Otherwise, the browser client still thinks it's requesting the same xxx.js file - or whatever hook name is used - regardless of what version nginx will map it to in the filesystem and may still overly aggressively cache it.
I agree you can use something like an svn version for this, but I'd probably just use that version to add/update query parameters onto references to css/js files in the source and not worry about any mapping on the web server side.
– David
@db3l:
@jlynch: If you're using svn or git or whatever, a proper vcs for your files? You have the perfect source of data for file versioning right there! In your deploy script just tweak a couple of steps:
I may just be misunderstanding the issue, but aren't you missing a step somewhere where you actually change all the references in any site content to use the new version name? It's all well and good to automatically rewrite references inside of nginx, but the reason for adjusting the names (or more commonly I think, adding query parameters) to css/js files is so the client browser always re-requests them. Otherwise, the browser client still thinks it's requesting the same xxx.js file - or whatever hook name is used - regardless of what version nginx will map it to in the filesystem, and may still overly aggressively cache it.
I agree you can use something like an svn version for this, but I'd probably just use that version to add/update query parameters onto references to css/js files in the source and not worry about any mapping on the web server side.
– David
Sorry, I kind of thought this was implied as it had already been discussed.
The way you do it is going to vary utterly based on how you're doing your site but yes, in your root template (Or each page where they're referenced), you need to update the link to use the new number…
We actually did this with a similar step that was run on the templates: they had the ==VERSION== hook in their link/script references, and a script was run that found all of those, used them to get the filenames, then got versions for those and replaced the strings. I thought that was getting a bit involved and messy to be putting in here, so I'll update the explanation I guess.
__[ Edit: For what it's worth, I know query parameters are a common way to do this stuff, and I'm sure they work just fine in most cases, but I found they could have unexpected issues, particularly with upstream caching, which does all sorts of damage when the whole point is to extend expiries.
More subjectively I also found they just plain didn't feel right compared to having mapped "filenames", even if they were virtual or fake or whatever we should call them, but I guess that's a personal preference more than anything. ]__
@jlynch:
Sorry, I kind of thought this was implied as it had already been discussed
Fair enough - I thought I checked the thread and only found references to renaming the files, not changing the content, though I agree it is implied. Just expected to see it as a step when the process was being described step by step.
> The way you do it is going to vary utterly based on how you're doing your site but yes, in your root template (Or each page where they're referenced), you need to update the link to use the new number…
Hmm, but in that case, why go through all the hassle of renaming the files? Using a query parameter (e.g., foo.css?v=###) where ### could be the svn repository version means there's nothing to change on the server filesystem, nor any rewrite rules necessary. And if your site is dynamically generated, the reference could be inserted at render time, so no checkout post-processing ever needed (just an "svn update" on the server).
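For what it's worth, a minimal sketch of that render-time idea in PHP (the REVISION constant and the versioned() helper are just assumed names your deploy step and templates would use):
PHP:
// REVISION is assumed to be set once during deploy, e.g. from `svnversion`.
define('REVISION', '1234');

function versioned($path) {
    // Append the deploy revision as a query parameter so browsers
    // re-request the file whenever a new revision goes out.
    return htmlspecialchars($path) . '?v=' . REVISION;
}

echo '<link rel="stylesheet" href="' . versioned('/css/foo.css') . '">';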
Just as I was doing my final preview of this post I saw your edit and the comment on query tags, so no answer is required, but I figured I'd leave a separate opinion here, since I'm on the other subjective side: I prefer the look of a version parameter to a new filename (I guess I see the version as a temporal dimension of the same named file).
Though as an aside, I am interested in experiences where caches actually break such approaches. At worst, it may prevent some intermediate caching, due to overly conservative caches refusing to cache anything with a query string. And far more than css/js would break going through any cache that didn't consider query strings part of the identity of a cached element.
> I thought that was getting a bit involved and messy to be putting in here, I'll update the explanation I guess.
No biggie - just an extra step of "update content to match" would probably be more than sufficient.
– David
PHP:
$timestamp = filemtime("example.css");
$display_filename = "example-$timestamp.css";
echo '<link rel="stylesheet" href="' . $display_filename . '">';
This will append a 10-digit Unix timestamp to the end of your CSS filename, and this value will change if and only if the CSS file gets modified. Now go ahead and create a rewrite rule that strips that hyphen and 10 digits from the end of your CSS filename.
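Something along these lines in nginx should do it (a sketch for the timestamp scheme above; adjust the location to taste):
nginx:
location ~ "-\d{10}\.css$"
{
    # Strip the "-<timestamp>" so example-1286482134.css is served
    # from example.css on disk.
    rewrite "^(.+)-\d{10}(\.css)$" $1$2 break;
    expires max;
}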
Note: Unix time will become 11 digits sometime around the year 2300.
@hybinet:
Note: Unix time will become 11 digits sometime around the year 2300.
– David