Traffic accounting for hosted sites.

Hi all,

I'm trying to set up traffic accounting for multiple virtually hosted sites.

Goals:

Inbound/outbound traffic for virtuallyhosted.com, with:

  • Separated statistics on smtp, pop3, imap, www, ssh etc.

  • Users must not be able to spoof the accounting by doing something like:

"ssh user#virtuallyhosted.com@someothervirtualsite.com"

  • Automatic monitoring and alerting.

This is a good start, but not enough:

http://www.enderunix.org/isoqlog/

Any opinions?

5 Replies

For stats, I am a big fan of awstats - http://awstats.sourceforge.net/ - it has some nice features for dealing with different domains and I like its very clean, crisp interface.

About spoofing… seems like that would pose a difficult problem if you are just using Apache virtual hosts and having multiple domain names point to the same IP address. How would ssh know what domain name was used to resolve the IP? I have seen that sort of thing done with packages like CPanel and WHM - http://www.cpanel.net/ - but I think that is controlling access in other ways and turning you into more of a hosting reseller. shrug

This is something I've been thinking about for my Linode. I'm thinking about reselling hosting on the box, though the problem is how to account for bandwidth transfer.

I assume you are running apache (either 1.3.x or 2.x), in which case there are a couple of tools you can use in order to identify which apache virtualhosts are using what bandwidth.
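Independently of any module, per-virtualhost byte counts can also be tallied straight from the access log. The following is only a minimal sketch: it assumes a custom `LogFormat "%v %h %l %u %t \"%r\" %>s %b"` (virtual host name first), and the function name and sample lines are made up for illustration. Note that `%b` counts response bodies only, so headers and inbound traffic are not included.

```python
import re
from collections import defaultdict

# Assumed LogFormat: "%v %h %l %u %t \"%r\" %>s %b"
# (virtual host, client, ident, user, time, request, status, body bytes)
LINE_RE = re.compile(
    r'^(?P<vhost>\S+) \S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} (?P<bytes>\d+|-)'
)

def tally_bytes_per_vhost(lines):
    """Sum the %b field per virtual host; '-' (no body sent) counts as 0."""
    totals = defaultdict(int)
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        sent = match.group("bytes")
        totals[match.group("vhost")] += 0 if sent == "-" else int(sent)
    return dict(totals)

# Hypothetical sample data, not taken from a real server.
SAMPLE = [
    'virtuallyhosted.com 192.0.2.10 - - [01/Jan/2004:12:00:00 +0000] "GET / HTTP/1.1" 200 1500',
    'virtuallyhosted.com 192.0.2.11 - - [01/Jan/2004:12:00:05 +0000] "GET /a.png HTTP/1.1" 304 -',
    'someothervirtualsite.com 192.0.2.12 - - [01/Jan/2004:12:00:09 +0000] "GET / HTTP/1.1" 200 700',
    'virtuallyhosted.com 192.0.2.10 - - [01/Jan/2004:12:01:00 +0000] "GET /big.iso HTTP/1.1" 200 2500',
]

if __name__ == "__main__":
    for vhost, total in sorted(tally_bytes_per_vhost(SAMPLE).items()):
        print(f"{vhost}: {total} bytes")
```

This only covers HTTP, so it does nothing for the mail, ftp or ssh side of the problem.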

If you are running apache 1.3.x (I'm running 1.3.27 on Debian), you can use mod_throttle (http://www.snert.com/Software/mod_throttle/). It's intended to limit the bandwidth used over a period, though it can also display, per virtualhost, how much bandwidth has been used. I've even written a PHP script that parses its output for use on another page - such as the 'overview' page in the Linode members area. Having done some testing, mod_throttle is very accurate in its reporting - it reported the size of a file I downloaded to within 0.1 KB.

If you are running apache 2.x, there is mod_watch, written by the same guy. This is a much more powerful version and even lets you graph bandwidth at various times of the day. Note that I've not used this - though if mod_throttle is anything to go by then it'll be good. http://www.snert.com/Software/mod_watch/

As you've found, the problems arise when you come to monitoring email, ssh, ftp etc. I looked into this a while back, and came to the conclusion that there really isn't a way to do this. I found a stats program called modlogon that appeared to go some way to providing a solution - parsing exim/proftpd logs, and tallying them - though I can't seem to find the URL now :(

Obviously I'm not aware of all the circumstances surrounding your problem, though I'm tempted to treat SSH traffic as so negligible that it's irrelevant, and (assuming that anonymous FTP is disabled) to add a further 10% to whatever mod_throttle etc. reports to account for FTP and exim usage. Of course this is not an ideal arrangement, since a customer might well use email heavily and transfer very little over www.
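To make the arithmetic concrete, here is a rough sketch of that estimate. The function name and the figures are hypothetical, and the 10% is just the guess from the paragraph above, not a measured value.

```python
def estimate_total_bytes(www_bytes, overhead_percent=10):
    """Pad the measured web traffic by a flat percentage for FTP and mail.

    Integer arithmetic is used so the result is an exact byte count.
    """
    return www_bytes + (www_bytes * overhead_percent) // 100

if __name__ == "__main__":
    measured = 1_000_000  # bytes reported for one virtualhost (made-up figure)
    print(estimate_total_bytes(measured))
```

A mail-heavy customer with almost no web traffic would be badly underestimated by this, which is exactly the weakness mentioned above.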

If anybody else has any thoughts/comments/solutions to the problem I for one would be glad to hear them.

Application level traffic accounting looks like it's tiresome to implement and easy to subvert.

Have you looked at IPCAD (Internet Protocol Cisco Accounting Daemon)? It allows you to use iptables/netfilter rules (amongst other things) to capture the traffic, and exposes the results via RSH, NetFlow and console output in Cisco-like fashion. It's on SourceForge.

Alternatively, you could try SITA (Simple IP Traffic Accounting) http://web4.hm/, which looks similar (and easier) and offers a free licence for non-commercial use, before going to the trouble of getting your head round IPCAD.

There's a roll-your-own netfilter/iptables traffic accounting mini howto here: http://members.chello.at/goesta.smekal/code/ita/ which explains the principles (but which won't cut it in a production environment).
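The principle can be sketched roughly as follows: add one iptables rule per direction per client IP in an accounting chain, then read the byte counters back from `iptables -L <chain> -n -v -x`. The chain name `ACCT`, the addresses and the sample listing below are assumptions for illustration; the parser expects rules without match extensions, works on saved text rather than calling iptables itself, and only makes sense when each client has a dedicated IP.

```python
ANY = "0.0.0.0/0"

def parse_counters(listing):
    """Return {ip: {"in": bytes, "out": bytes}} from an iptables -nvx listing.

    A rule whose destination is a specific IP counts as inbound traffic for
    that IP; a rule whose source is a specific IP counts as outbound.
    """
    usage = {}
    for line in listing.splitlines():
        fields = line.split()
        # Rule lines start with the packet counter; headers do not.
        if len(fields) < 8 or not fields[0].isdigit():
            continue
        byte_count = int(fields[1])
        source, destination = fields[-2], fields[-1]
        if source != ANY:
            usage.setdefault(source, {"in": 0, "out": 0})["out"] += byte_count
        if destination != ANY:
            usage.setdefault(destination, {"in": 0, "out": 0})["in"] += byte_count
    return usage

# Hypothetical listing; real column spacing may differ.
SAMPLE = """\
Chain ACCT (2 references)
    pkts      bytes target     prot opt in     out     source               destination
     120      98000            all  --  *      *       0.0.0.0/0            192.0.2.10
      80      12000            all  --  *      *       192.0.2.10           0.0.0.0/0
      10       4000            all  --  *      *       0.0.0.0/0            192.0.2.20
       5        600            all  --  *      *       192.0.2.20           0.0.0.0/0
"""

if __name__ == "__main__":
    for ip, counts in sorted(parse_counters(SAMPLE).items()):
        print(ip, counts["in"], counts["out"])
```

Per-service figures would need separate rules per port, which adds match-extension columns this simple parser does not handle.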

Caveats: These are just ideas - not tested on a Linode or in serious production. Beware of IPCAD configuration options that attempt to put the ethernet interface into promiscuous mode - this will most likely upset caker and may break the interface under UML.

Edit: After I stopped to think about this, I realised that, of course, this only works if every client has their own IP address. Sorry. I'll shut up now :oops: .

@pclissold:

I realise that, of course, this only works if every client has their own IP address.

Yep, that's the problem. If every client had their own IP then it would be much easier to implement - though as you say, easy to bypass without some serious configuration hassles.

That said, Cpanel/Plesk appear to have solved the problem by some means - the cpanel admin interface will display bandwidth usage for an individual account, broken down by service. Anybody have any idea how this is done? I'm sure that if an open source version of what they are using was produced, it'd be snapped up pretty quickly by all sorts of small hosting companies and the like!

I've used Ensim. Although it's very limited and slow, it has superior traffic logging. It can log http/headers/imap/pop/ssh/ftp/anything for sites that are hosted on the same domain. I would just like to know how it's done.
