Linode losing memory

My Linode is running RedHat 9 (large) with 96 MB RAM and 256 MB SWAP.

It runs Apache and Sendmail with no traffic to speak of, plus Nagios monitoring 5 remote ports.

The system keeps running out of memory. RAM is slowly consumed over time, and NO SWAP space is ever used.

Example:

After a clean reboot I have over 55 megs of available RAM.

After 2 hours and 22 minutes of uptime my RAM looks like this:

91416k av, 52128k used, 39288k free

Swap: 263160k av, 0k used, 263160k free

Swap does not EVER get used and RAM keeps counting down.

I have the exact same setup here in my lab running the exact same software:

Mem: 2066300K av, 2045904K used, 20396K free

Swap: 4096524K av, 1321928K used, 2774596K free

Is my Linode acting strange or what?

7 Replies

@ss2chef:

Swap does not EVER get used and RAM keeps counting down.

[…snip…]

Is my Linode acting strange or what?

Someone correct me if I am wrong, but isn't this expected behavior? The OS (and system libraries) will cache data that a program has previously used, even after the program releases it. The idea is that you may read from that file again soon, and this way it is already in memory. The goal of the operating system is to keep memory full of items currently in use and items that might be asked for in the future. If you ask for something new, it tries to figure out which item is not being used and is least likely to be used in the future, and evicts that one. Of course it is not using your swap for this: there is no point in caching file data in swap, because either way you would be reading it from disk.

Other than these stats you don't like is the system performing badly?

@eman:

[…snip…]

Other than these stats you don't like is the system performing badly?

Hi and thank you for responding…

As the available RAM counts down to around 3 megs, the system slows to a crawl. This is with almost no network traffic or server use.

I have several dozen Linux hosts in production and have never seen such behavior.

Server load is almost nonexistent, and seeing RAM slowly count down is strange. Memory use is supposed to be dynamic.

Can you post the output of:

# iostat 1 90
# free
# ps -e -o pid,cmd,%mem,rss,trs,sz,vsz
# uptime

Do that for both when the system is running normally, and then again while the system is at a crawl.

Does it slow down at predictable times of the day? Like midnight? Top of the hour (e.g. 7:00, 8:00, etc.)?

How are you detecting the system is running really slowly? What commands and tools (and output) are you using to determine that?

iostat is part of the sysstat utilities, which are available from:

http://perso.wanadoo.fr/sebastien.godard/

You posted used/free/total memory numbers, but you're missing something very important – buffer and cached memory information.

Since you're not touching swap at all, it doesn't sound like normal memory starvation issues. Perhaps CPU, network I/O, or disk I/O issues when things are behaving poorly?

Linux's memory scheme is indeed to use as much memory as possible for buffering/caching, BUT when apps ask for memory, Linux will take buffered/cached memory away and give it to the application. It's a pretty good arrangement.
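To make that arrangement concrete, here is a minimal Python sketch that treats buffers and cache as reclaimable. The field names (MemTotal, MemFree, Buffers, Cached) are the ones /proc/meminfo uses; the sample values are made up for illustration and do not come from any machine in this thread, and counting all of the cache as reclaimable is a simplification.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of kB values."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not a "Key: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def reclaimable_view(info):
    """Return (used by applications, effectively free) in kB.

    Buffers and cache are counted as free, because the kernel can hand
    them back to applications on demand.
    """
    cache = info["Buffers"] + info["Cached"]
    app_used = info["MemTotal"] - info["MemFree"] - cache
    effectively_free = info["MemFree"] + cache
    return app_used, effectively_free


# Illustrative sample only (hypothetical numbers).
SAMPLE = """\
MemTotal:       100000 kB
MemFree:          4000 kB
Buffers:         30000 kB
Cached:          36000 kB
"""

used, free = reclaimable_view(parse_meminfo(SAMPLE))
print(f"used by applications: {used} kB, effectively free: {free} kB")
```

On a Linux box the same two functions could be fed the contents of /proc/meminfo instead of the sample string. A machine showing only a few MB "free" can still have most of its RAM available in this sense.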

Also, memory numbers can be confusing sometimes.

Some tools report VSZ, the total size of the virtual address space allocated, which is not the same as the amount of memory actually resident in RAM (RSS).

@tronic:

[…snip…]

My concern is not available RAM per se, but the fact that SWAP usage remains at 0% regardless of the load I put on the server.

I find it strange, as I have compared it to several like boxes as well as another similarly configured Linode host, and all have active SWAP usage regardless of server load.

I realize I can crank down the RAM profile Apache uses, but with server load at next to nothing Apache should not be so unresponsive… Thoughts?

I'll grab the iostat tools asap.

It's crawling now and here is some info.

free

| | total | used | free | shared | buffers | cached |
|---|---|---|---|---|---|---|
| Mem: | 91416 | 88688 | 2728 | 0 | 34788 | 33728 |
| -/+ buffers/cache: | | 20172 | 71244 | | | |
| Swap: | 263160 | 0 | 263160 | | | |

ps -e -o pid,cmd,%mem,rss,trs,sz,vsz

| PID | CMD | %MEM | RSS | TRS | SZ | VSZ |
|---|---|---|---|---|---|---|
| 1 | init [3] | 0.5 | 504 | 23 | 347 | 1388 |
| 2 | [keventd] | 0.0 | 0 | 0 | 0 | 0 |
| 3 | [ksoftirqd_CPU0] | 0.0 | 0 | 0 | 0 | 0 |
| 4 | [kswapd] | 0.0 | 0 | 0 | 0 | 0 |
| 5 | [bdflush] | 0.0 | 0 | 0 | 0 | 0 |
| 6 | [kupdated] | 0.0 | 0 | 0 | 0 | 0 |
| 7 | [jfsIO] | 0.0 | 0 | 0 | 0 | 0 |
| 8 | [jfsCommit] | 0.0 | 0 | 0 | 0 | 0 |
| 9 | [jfsSync] | 0.0 | 0 | 0 | 0 | 0 |
| 10 | [xfsbufd] | 0.0 | 0 | 0 | 0 | 0 |
| 11 | [xfslogd/0] | 0.0 | 0 | 0 | 0 | 0 |
| 12 | [xfsdatad/0] | 0.0 | 0 | 0 | 0 | 0 |
| 13 | [mdrecoveryd] | 0.0 | 0 | 0 | 0 | 0 |
| 14 | [kjournald] | 0.0 | 0 | 0 | 0 | 0 |
| 815 | /sbin/dhclient - | 1.0 | 992 | 314 | 498 | 1992 |
| 865 | syslogd -m 0 | 0.7 | 680 | 24 | 389 | 1556 |
| 869 | klogd -x | 0.5 | 460 | 18 | 347 | 1388 |
| 914 | /usr/sbin/sshd | 1.6 | 1528 | 265 | 880 | 3520 |
| 924 | xinetd -stayaliv | 0.9 | 860 | 129 | 512 | 2048 |
| 934 | /usr/sbin/httpd | 9.1 | 8392 | 289 | 4822 | 19288 |
| 943 | crond | 0.6 | 600 | 19 | 360 | 1440 |
| 961 | /usr/sbin/atd | 0.5 | 548 | 12 | 357 | 1428 |
| 995 | /usr/bin/nagios | 3.7 | 3416 | 262 | 1776 | 7104 |
| 1001 | /sbin/mingetty t | 0.4 | 416 | 6 | 342 | 1368 |
| 1002 | /usr/sbin/httpd | 9.7 | 8904 | 289 | 4871 | 19484 |
| 1003 | /usr/sbin/httpd | 9.6 | 8860 | 289 | 4861 | 19444 |
| 1004 | /usr/sbin/httpd | 9.6 | 8864 | 289 | 4861 | 19444 |
| 1005 | /usr/sbin/httpd | 9.7 | 8896 | 289 | 4856 | 19424 |
| 1006 | /usr/sbin/httpd | 9.6 | 8856 | 289 | 4859 | 19436 |
| 1007 | /usr/sbin/httpd | 9.6 | 8848 | 289 | 4861 | 19444 |
| 1008 | /usr/sbin/httpd | 9.6 | 8852 | 289 | 4857 | 19428 |
| 1009 | /usr/sbin/httpd | 9.6 | 8848 | 289 | 4857 | 19428 |
| 11105 | /usr/sbin/sshd | 2.2 | 2012 | 265 | 1692 | 6768 |
| 11110 | /usr/sbin/sshd | 2.4 | 2228 | 265 | 1701 | 6804 |
| 11111 | -bash | 1.5 | 1376 | 588 | 1075 | 4300 |
| 11142 | su - | 1.0 | 972 | 16 | 1027 | 4108 |
| 11143 | -bash | 1.5 | 1384 | 588 | 1077 | 4308 |
| 11386 | ps -e -o pid,cmd | 0.7 | 684 | 66 | 660 | 2640 |

uptime

13:16:35 up 1 day, 5 min, 1 user, load average: 0.00, 0.00, 0.00

Thanks for the info. :)

If I might suggest something… it's a lot easier to read columns of numbers if they're in a monospaced font. The easiest way to do that is to mark the block as code.

Anyway, you've stated a problem, so what I'm doing is methodically gathering information; then we can analyze what the numbers are telling us. I don't want to jump to any conclusions or culprits yet. The data will point to something. It's fun playing detective. :D

I need some more information. How do you tell that there is a performance problem? Are keystrokes echoing really slowly when you're ssh'd in? Are web pages coming up after a 10-15 second delay? Does the ssh session feel normal but web performance suck? Something else?

Looking at your numbers, they seem to add up properly for memory, so it would appear that your performance issue is somewhere other than memory.

Explanation: the output of your ps -e -o ... command has numbers in the RSS column. All of that adds up to about 96 MB. **BUT**! I used the pmap utility (it comes with procps, so it's probably already installed on your machine) to look at the details of how Apache processes have their memory allocated. It looks like most of Apache's memory usage is due to shared libraries.

Since you have process-based Apache daemons, there are several copies of httpd running (nine in your ps output), each reporting its own memory allocation. In reality, however, the shared libraries are loaded only once, not nine times. So it is **NOT** 9 MB per Apache daemon x 9 = 81 MB.

One Apache process accounts for shared libraries + private memory allocations, about 9 MB on your machine. The rest of the Apache processes add only private memory, which is probably about 1.2 MB each on your system. So the calculation is more like 9 MB + (1.2 MB x 8) = 18.6 MB for Apache, roughly, for your setup. Then you have some other daemons that eat less than 1 MB each, except for Nagios (which also uses some common shared libs).

Now your free output: total 91416, used 88688, free 2728, shared 0, buffers 34788, cached 33728 (all in K).

Total available (the kernel eats some memory for itself) is 91416 K. Used is 88688 K. Buffers + cache is 34788 + 33728 = 68516 K. So if you exclude buffers + cache, 91416 - 68516 = 22900 K (22.363 MB) is outside the cache, and your apps actually hold 88688 - 68516 = 20172 K (about 19.7 MB) of it, which is the number on the "-/+ buffers/cache" line. Apparently that's about all your apps asked for, Apache and the other daemons, which is in the same ballpark as my calculation.

So to answer your question: everything adds up memory-wise, and you are not starved for memory. Hence there was no need to go into swap, so Linux didn't, because in reality you had almost 70 MB free.

The numbers reported by ps, even for RSS, don't properly account for shared libs that are also mapped by other processes, so the totals are very misleading. That's a subtle gotcha to keep in mind, and it is what I meant earlier by memory reporting being tricky/misleading (at first glance) sometimes. That's why pmap was invented: to break down actual memory usage and make it easier to see which memory is really allocated to a process and which is merely a mapping of the same library (instead of a separate allocation).

So your performance issue is most likely not memory related. Possible culprits:

A sudden spike in CPU usage, especially if there's a "thundering herd" going on (a short-term spike).

Or maybe there's a sudden burst of disk I/O traffic that forces processes wanting to read or write to stall a bit.

Or maybe there's a process deadlocking while trying to get some kind of resource, or hitting a race condition. Not very common, but not unheard of either. I come across this about once or twice a year with various multi-threaded or multi-process apps.

Or maybe another Linode on the same host has temporarily robbed much of the disk I/O, though I understand the I/O queuing that Linode runs tends to prevent a single Linode from eating all available disk I/O. So probably not the culprit.

Or you may be getting a denial of service attack. Check system and application logs to make sure you haven't gotten a burst of unusual traffic.

There are quite a few possibilities, basically. So now I just need to know how you are detecting a performance problem, because that will help narrow it down to the offending culprit.

If you REALLY want to force your system into swap to make sure that Linux's swapping works, there are utilities that allocate a lot of memory and modify its pages (forcing the kernel to mark them "dirty" and making them eligible for swapping once real memory runs out). Unfortunately, I can't remember the name of any of these tools or where to get them, but it is definitely a lot easier to see the kernel forced to page stuff in and out of memory with that kind of tool.
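The arithmetic above can be re-run as a short Python sketch. The RSS values and the free figures are copied from the ps and free output posted earlier in the thread; the 1.2 MB of private memory per httpd process is the estimate from the post, not a measurement, and the function names are only illustrative.

```python
# RSS (in K) of the nine httpd rows in the ps output posted above.
HTTPD_RSS_K = [8392, 8904, 8860, 8864, 8896, 8856, 8848, 8852, 8848]

# "Mem:" line of the free output posted above (in K).
TOTAL_K, USED_K, FREE_K = 91416, 88688, 2728
BUFFERS_K, CACHED_K = 34788, 33728

# Assumption taken from the post: roughly 1.2 MB of private memory per httpd.
PRIVATE_MB_PER_HTTPD = 1.2


def naive_apache_mb(rss_k):
    """Sum RSS as if every process had its own copy of the shared libraries."""
    return sum(rss_k) / 1024


def shared_aware_apache_mb(rss_k, private_mb):
    """Count one full process, then only private memory for each of the others."""
    return max(rss_k) / 1024 + private_mb * (len(rss_k) - 1)


def buffers_cache_view(used_k, free_k, buffers_k, cached_k):
    """Return (used by applications, effectively free) in K."""
    cache_k = buffers_k + cached_k
    return used_k - cache_k, free_k + cache_k


if __name__ == "__main__":
    naive = naive_apache_mb(HTTPD_RSS_K)
    aware = shared_aware_apache_mb(HTTPD_RSS_K, PRIVATE_MB_PER_HTTPD)
    print(f"naive Apache total: {naive:.1f} MB")
    print(f"shared-aware Apache estimate: {aware:.1f} MB")
    app_k, avail_k = buffers_cache_view(USED_K, FREE_K, BUFFERS_K, CACHED_K)
    print(f"used by applications: {app_k} K, effectively free: {avail_k} K")
    print(f"outside buffers/cache: {TOTAL_K - BUFFERS_K - CACHED_K} K")
```

buffers_cache_view returns (20172, 71244) for these inputs, the same two numbers free printed on its "-/+ buffers/cache" line. The shared-aware estimate depends entirely on the assumed private size per process, so it is a ballpark at best; pmap on the real box is the way to get actual figures.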

My box looks like this:

````
Physical:
Free : 5.27 MB Used : 132.12 MB Total : 137.39 MB
Swap:
Free : 263.24 MB Used : 249.75 MB Total : 512.99 MB
````

That seems pretty normal to me. When I ran sysinfo on my home Linux desktop, it was pretty much the same. Linux does like to use as much physical RAM as it can. However, I have never seen swap go completely unused. That is pretty weird.
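For anyone who wants to check that swap can be used at all, the kind of tool mentioned earlier in the thread (allocate a lot of memory and dirty every page) can be sketched in a few lines of Python. This is a hypothetical helper, not one of the existing utilities; the function name and the sizes are assumptions, and on a small box it should be run with care since it can push the machine into heavy swapping.

```python
import mmap


def dirty_pages(total_bytes, page_size=mmap.PAGESIZE):
    """Allocate total_bytes and write one byte into every page.

    Writing to a page forces the kernel to back it with real memory and
    mark it dirty, so it can only be reclaimed by swapping it out.
    Returns the buffer (keep a reference to hold the memory) and the
    number of pages touched.
    """
    buf = bytearray(total_bytes)
    touched = 0
    for offset in range(0, total_bytes, page_size):
        buf[offset] = 0xA5  # any non-zero value will do
        touched += 1
    return buf, touched


# Small demonstration: 16 pages only, harmless on any machine.
demo_buf, demo_pages = dirty_pages(16 * mmap.PAGESIZE)
print(f"touched {demo_pages} pages of {mmap.PAGESIZE} bytes")
```

To actually pressure a machine, the size would have to exceed its effectively free memory (free + buffers + cache), for example `dirty_pages(80 * 1024 * 1024)` on a box like the one in this thread, while watching the Swap line of free in another session. Holding on to the returned buffer keeps the pages allocated until the script exits.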

The behavior you're describing sounds very similar to issues I've seen on a couple of servers in the past month. At one point I thought it had something to do with this:

http://www.gentoo.org/security/en/glsa/glsa-200411-18.xml

… since my logs were full of requests attempting to exploit this recent vulnerability, and I hadn't patched Apache at that point. I did get it all patched up, but I still have the same behavior affecting one of the servers right now. I haven't made many changes to that system, and it only started within the past month, so it's definitely not normal, but I haven't had any problems with it lagging the rest of the system.

Then again, I have read a lot about how reported free memory works quite differently from how most people think it does, and that it's not uncommon to see this.
