Hello again, Linode

Well as others predicted, I'm back with Linode after a failed Amazon EC2 micro experiment.

Turns out that EC2 micro not only has a pretty anemic CPU allocation, it also has a terrible throttling policy that frequently cuts the instance down to almost no CPU. This makes it completely inappropriate for interactive services like web servers.

The good news is that Amazon agreed to refund me my reservation fee because I had only used the service for two weeks and found it to be so lacking.

So now I am back with Linode on a Linode 512 (I previously had a Linode 768); paying yearly puts the price at about the same as an Amazon EC2 micro instance over the same duration. I was forced to migrate my gallery contents to S3, mounted using s3fs in a simplified and hokey way that has its own drawbacks; but I'd much rather deal with that than with the abysmal performance of an EC2 micro instance.

As much as I wish that Linode would provide more disk space, it's pretty clear that every other aspect of their service is far superior to everything else (twice now I've tried to leave Linode and come back both times because other services were much worse).

On the plus side, I've become quite good at moving instances between providers; so should I ever need to do this again, I know what to do. But I don't think I'll be going anywhere for a good long while …

10 Replies

I tried to set up s3fs a while ago, and had some difficulty getting it to mount on startup. Would you be willing to share your method for getting that working?

All I do to mount the S3 filesystem is run:

/usr/bin/s3fs [bucketname] [mount point]
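
To have that happen at startup, a minimal approach (just a sketch; the bucket, mount point, and keys are placeholders) is to run the same command from /etc/rc.local, passing the credentials through the environment variables that s3fs reads:

#!/bin/sh
# /etc/rc.local sketch: mount the bucket at boot
AWSACCESSKEYID=[my access key] AWSSECRETACCESSKEY=[my secret key] \
    /usr/bin/s3fs [bucketname] [mount point] -o allow_other
exit 0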

But to repeat a question from the "So long again, Linode" thread - why not link straight to the files in the S3 bucket instead of making all the traffic go through your linode?

@oliver:

All I do to mount the S3 filesystem is run:

/usr/bin/s3fs [bucketname] [mount point]

I found that s3fs is kind of poorly written (I knew this going in - which is why I was really hesitant to use it but now that it's my best option I've switched over to it) and will sometimes just exit, killing your mount. I patched the s3fs code to remove one case where it would exit improperly: s3fs uses the libcurl library, and if libcurl reports an error code that s3fs doesn't understand, it just exits. Since libcurl is dynamically linked and presumably can add error codes with new versions, a newer libcurl can basically screw s3fs over, which is what happened to me - when copying a huge number of files to an s3fs-mounted filesystem, I eventually got s3fs to exit on an unknown curl error of '56'. My patch removes that exit and makes s3fs treat any unknown curl error as something to retry, not exit on.

I also wrote a little script to keep s3fs mounted just in case it screws the pooch, which, given how little faith I have in the s3fs code, is something I expect to happen from time to time. I put this in a script /etc/rc.s3, and run that script from /etc/rc.local:

#!/bin/sh

(while true; do
  COUNT=`ps auxc | grep s3fs | wc -l`
  if [ "$COUNT" = "0" ]; then
    umount /mnt/s3/[my bucket]
    AWSACCESSKEYID=[my access key] AWSSECRETACCESSKEY=[my secret key] /usr/bin/s3fs [my bucket] /mnt/s3/[my bucket] -o default_acl=public-read -o allow_other -o use_cache=/var/cache/s3fs
  fi
  sleep 30
done) &

The above loops forever: it checks whether any s3fs executable is running, and if not, it unmounts my s3fs mount point (I've found that even when s3fs dies it can leave the mount up, which has to be unmounted before starting s3fs again; if the mount is not up, this harmlessly fails) and then starts s3fs back up. Then it waits 30 seconds and repeats.

@oliver:

But to repeat a question from the "So long again, Linode" thread - why not link straight to the files in the S3 bucket instead of making all the traffic go through your linode?

With the gallery2 software, it's more complicated than that. You first have to convince the gallery2 code to report a link to an external source for images/movies instead of serving them up directly from a local file. This requires PHP script work because the gallery2 PHP scripts are not written to do this naturally.

Then there is another problem: the gallery2 software auto-generates thumbnails and medium sized versions of photos on demand. When you first upload a photo, it is written to disk but no thumbnail is made; the thumbnail is only created the first time it is requested. The same goes for rotating or otherwise modifying an image - the new thumbnail is generated on demand.

This on-demand thumbnail generation requires some more sophisticated hackery to prevent gallery2 from accessing files on the local store in a way that would cause s3fs to constantly be fetching them from S3 anyway.

To simplify things, and until I come up with a better solution, I decided not to hack the gallery2 software at all, but just to mount the image storage using s3fs and store the images on S3. The generated thumbnails also get stored via s3fs, but I let its local file caching cache all of them, so that aside from a HEAD request made to S3 every time one of them is accessed, they can be quickly served up from local storage. This allows the most request-intensive pages (thumbnail pages with 20 or 30 thumbnails) to be mostly serviced from the local disk. Since the image storage is also mounted from s3fs, whenever a full sized image is requested it has to be downloaded from S3 to my server and then served up via gallery2. It would certainly be better to have gallery2 just redirect the browser to the file on S3, but that would require gallery2 script hackery, and really, my gallery is not heavily used and full sized images are rarely downloaded anyway.

So I keep ~2 GB of generated thumbnails and medium sized images cached locally on my server to allow gallery2 to service most requests quickly (aside from the annoying HEAD requests to S3 that s3fs generates every time a file is read), and keep the remainder of the ~20 GB of data on S3, thus not exceeding the disk space available on my Linode 512. I have a cron job that every day goes through and deletes all of the s3fs cached large image files (leaving my cached thumbnails and medium sized images alone), so the only excess space used on my Linode is the large image files that were most recently accessed in the last couple of days. This should never exceed a couple of hundred megs, let alone the ~10 GB that I still have free on my Linode.
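
The cron job itself is nothing fancy. Roughly, it is a one-liner along these lines (just a sketch: it assumes the cache lives under the use_cache directory from the mount command above, and the '[albums]' path component is a placeholder for whatever distinguishes the cached full sized originals from the cached thumbnails and medium sized images in your layout):

#!/bin/sh
# daily: throw away cached copies of the full sized originals, keep the
# cached thumbnails and medium sized images
find /var/cache/s3fs -type f -path '*/[albums]/*' -exec rm -f {} \;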

My God you have patience ;) I would have gone to another solution about 5 times during that lot. S3 sounds like more trouble than it's worth?

@bji:

I found that s3fs is kind of poorly written (I knew this going in - which is why I was really hesitant to use it but now that it's my best option I've switched over to it)
I'm curious if you gave s3backer any testing? I haven't had need to set up S3 access for one of my nodes yet, but I sort of like the s3backer design, and it's a bit more active of a project. You would lose the ability to reference buckets externally, but as you say with respect to gallery2, you're not doing that right now anyway.

s3backer's treating S3 as a raw block device likely has different trade-offs (including potential operational cost) than s3fs, but then again maybe it wouldn't have quite as many warts?

– David

@db3l:

@bji:

I found that s3fs is kind of poorly written (I knew this going in - which is why I was really hesitant to use it but now that it's my best option I've switched over to it)
I'm curious if you gave s3backer any testing? I haven't had need to set up S3 access for one of my nodes yet, but I sort of like the s3backer design, and it's a bit more active of a project. You would lose the ability to reference buckets externally, but as you say with respect to gallery2, you're not doing that right now anyway.

s3backer's treating S3 as a raw block device likely has different trade-offs (including potential operational cost) than s3fs, but then again maybe it wouldn't have quite as many warts?

– David

s3backer is a much, much better designed and implemented filesystem from what I have seen. I had some interaction with the author, and from the code and from his communication, it's definitely something I would trust a lot more than s3fs.

However, s3backer stores logical file blocks on S3, not whole files; so like you said, you completely lose the ability to serve these files directly via S3. Given that I still intend to hack gallery2 to redirect the browser to the S3 URLs directly eventually, I want my files to be stored in a form that can be directly downloaded by a browser.
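
To make the difference concrete, here is roughly what using s3backer looks like (a sketch from memory of its documentation: the backing file name and the omitted size/credential options may differ, and the mount points are placeholders). The bucket ends up holding opaque numbered blocks, so there is nothing in it that a browser could fetch directly:

# s3backer exposes the whole bucket as one big backing file ...
s3backer [my bucket] /mnt/s3backer

# ... which you format once and then loop-mount like any block device
mke2fs /mnt/s3backer/file
mount -o loop /mnt/s3backer/file /mnt/data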

There is another problem with s3backer, and it's one that I communicated to the author but that we don't see eye-to-eye on. s3backer relies on Linux's standard block cache for caching; it doesn't persist cached blocks to disk on its own. The author thinks that this is a more elegant way to do caching because it relies on pre-existing Linux mechanisms. The problem is that every time your server reboots, you lose your cache because the cache has not been persisted to disk. Additionally, this limits the effective size of the cache to what can fit into virtual memory. I would not want this behavior; I would not want to have to re-download the several gigabytes of thumbnail and medium sized images each time the server was rebooted. s3fs persists its cache on disk so it is re-used after a reboot.

Also, I did test s3backer in its early form and it was very slow; but I think the author has made considerable performance improvements since then, so I can't say for sure whether it still is.

As I may have written in these forums before, at one time I was on the road to implementing my own version of s3fs with more robustness and performance. My idea was to implement the caching separately, as a FUSE filesystem that did nothing other than cache requests to one mount point on some segment of the disk; this could be used for any filesystem and would solve the caching problem generically. For example, you'd do something like:

cachemount /mnt/foo /mnt/bar /var/cache/baz

This would intercept any filesystem request for any file under /mnt/foo, first checking for a locally cached version under /var/cache/baz and, if one was not found, loading the file from the corresponding location under /mnt/bar, caching it under /var/cache/baz, and then satisfying the request with the cached result.

Then the S3 based filesystem itself would be pretty simple as it wouldn't need any caching to be implemented internally at all; it would just assume completely uncached access to S3. In the end to mount an S3 filesystem with local disk caching you'd do something like this:

cachemount /mnt/foo /mnt/bar /var/cache/baz

(as above: references to files under /mnt/foo are satisfied from the cache stored in /var/cache/baz, backed by the files in /mnt/bar)

s3mount bucket /mnt/bar

(which would make any request to files under /mnt/bar be satisfied by an S3 request for the corresponding file in the given bucket)
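
Putting the two together, the whole thing would have looked something like this (cachemount and s3mount being the hypothetical tools described above, and the photo path just an example):

# uncached view of the bucket
s3mount bucket /mnt/bar
# caching layer in front of it
cachemount /mnt/foo /mnt/bar /var/cache/baz

# first read: fetched from S3 via /mnt/bar and cached under /var/cache/baz
cat /mnt/foo/albums/photo.jpg > /dev/null
# subsequent reads: served from the local cache
cat /mnt/foo/albums/photo.jpg > /dev/null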

I only got as far as writing a robust and fast S3 interface in C, which I turned into a library (libs3) and actually licensed to several companies for enough $$ to have made the entire exercise very worth my while. But I lost interest and never finished the whole thing. Maybe someday …

@tentimes:

My God you have patience ;) I would have gone to another solution about 5 times during that lot. S3 sounds like more trouble than it's worth?

Actually it's gallery that's more trouble than it's worth. I can't count the number of problems I have had with this software. It is the one piece of software that I know will throw me into days and days of painful maintenance whenever I need to upgrade or otherwise modify my server. They've released a new version of gallery, called gallery3, which, surprise, surprise, requires tons of effort to upgrade to from gallery2, and, oh yeah, drops support for the database that I was using (postgresql). So I cannot even upgrade this software without major database and upgrade headaches.

If I had to do it all over again, I would not choose gallery as my photo hosting software.

@bji:

There is another problem with s3backer, and it's one that I communicated to the author but that we don't see eye-to-eye on. s3backer relies on Linux's standard block cache for caching; it doesn't persist cached blocks to disk on its own. The author thinks that this is a more elegant way to do caching because it relies on pre-existing Linux mechanisms. The problem is that every time your server reboots, you lose your cache because the cache has not been persisted to disk. Additionally, this limits the effective size of the cache to what can fit into virtual memory. I would not want this behavior; I would not want to have to re-download the several gigabytes of thumbnail and medium sized images each time the server was rebooted. s3fs persists its cache on disk so it is re-used after a reboot.

I just checked the s3backer home page and apparently the author has come around to my way of thinking; s3backer now supports a persisted block cache that survives reboots, so my above concern is no longer valid.

s3backer is looking pretty good to me :) I wonder how reliable it is as a solution? I've been tempted by S3 (I have an account I've never used) but I feel like I wouldn't want to bet my house on it.

@bji:

s3backer is a much, much better designed and implemented filesystem from what I have seen. I had some interaction with the author, and from the code and from his communication, it's definitely something I would trust a lot more than s3fs.

Calling it a "filesystem" glosses over the biggest technical advantage it has: it isn't a filesystem at all. This is the biggest practical disadvantage, too, for the reasons you mention. But, more significantly:

> As I may have written in these forums before, at one time I was on the road to implementing my own version of s3fs with more robustness and performance. My idea was to implement the caching separately, as a FUSE filesystem that did nothing other than cache requests to one mount point on some segment of the disk; this could be used for any filesystem and would solve the caching problem generically. For example, you'd do something like:

(…)

You might be interested in Documentation/filesystems/caching/fscache.txt… I don't know what the current state of the code is, but it's quite similar, and seems to have a rather persistent set of documentation if nothing else.
