"reserved" memory growth in Linode kernels
It seems like there's been rather significant growth in the amount of memory being reserved by the paravirt kernels - certainly compared to the latest stable 2.6.18, but also, more notably, between fairly close recent paravirt releases.
For example, some comparisons of dmesg output (all on Linode 512s):
2.6.18.8-linode22
Memory: 511908k/532480k available (3989k kernel code, 12360k reserved, 1102k data, 224k init, 0k highmem)
2.6.38.3-linode32
Memory: 509416k/532480k available (5378k kernel code, 22616k reserved, 1570k data, 424k init, 0k highmem)
2.6.39-linode33 #5
Memory: 480264k/4202496k available (5700k kernel code, 43576k reserved, 1666k data, 412k init, 0k highmem)
Now I realize some growth over time is to be expected, and the kernel/data increases seem reasonable, but the jump in reserved memory (especially between 2.6.38 and 2.6.39) seems excessive. Does anyone know what changed in that jump that could suddenly need almost twice as much reserved memory? I do know that vm.min_free_kbytes was tweaked a few times for 2.6.39 (at least in part leading to build #5), but between my 2.6.18 and 2.6.39 nodes it only changes from 2918K to 4096K, so that's relatively minor.
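For anyone who wants to compare on their own node, that value is just the standard sysctl (nothing Linode-specific), so it can be read - and, if you want to experiment, temporarily changed - like this:
$ sysctl vm.min_free_kbytes
vm.min_free_kbytes = 4096
$ sudo sysctl -w vm.min_free_kbytes=2918   # reverts at reboot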
I'm not that familiar with kernel memory management, but from what I've been able to find, I don't think (though I'd love to be corrected) the kernel will use that reserved memory for caching or applications - it only holds it for internal structures. So it seems to me that moving from the latest stable kernel to the latest paravirt kernel effectively decreases my working memory on a Linode 512 by ~30MB, or about 5% (almost 4% of which comes just from the change from 2.6.38 to 2.6.39).
One thing I found interesting is the much larger total memory (4202496k) shown only in the 2.6.39 log; I guess that kernel has different visibility into the host environment. If the kernel is basing any calculations on that larger figure, perhaps that explains some of the increase?
Can anyone shed better light on what might be going on, and/or whether this increase in reserved memory does in fact have a real impact on usable system memory? Any thoughts on how to get some of it back (assuming it is in fact largely being wasted)?
-- David
11 Replies
To be honest, I don't think you really have a justifiable complaint. In fact, just recently we all got a free ~40% RAM upgrade, bringing the base linode from 360 to 512MB of RAM.
@Guspaz:
To be honest, I don't think you really have a justifiable complaint. In fact, just recently we all got a free ~40% RAM upgrade, bringing the base linode from 360 to 512MB of RAM.
Hmm, not sure how you read my post as a complaint?
I was asking a technical question about reserved memory growth. That's why I put it in the performance/tuning forum and not, for example, the bugs forum. Though I suppose, in honesty, I do hold some hope it'll turn out to be something that can be better tuned with the later kernels.
Independent of Linode size, it appears that switching from the latest stable kernel to the latest paravirt kernel - or perhaps more notably, just between recent paravirt kernels - takes away a larger chunk of memory than I would have expected (at least, given my understanding so far of how kernel reserved memory behaves). I'm trying to understand why, and whether there's a way to minimize it.
– David
this
I would try it out myself, but I'm not set up with pv-grub for running my own kernels.
In particular, the memory map analysis shown near the top of the thread seems to match very closely to my Linode 512 (except for the one usable RAM map line changing from 256MB to 512MB).
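For reference, the maps below came straight out of dmesg; something along these lines should pull them on any node (it may catch a few extra Xen: lines as well):
$ dmesg | grep -E 'BIOS-provided|Xen:'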
Here's the difference between 2.6.38 and 2.6.39 on my Linode 512s:
2.6.38.3-linode32
BIOS-provided physical RAM map:
Xen: 0000000000000000 - 00000000000a0000 (usable)
Xen: 00000000000a0000 - 0000000000100000 (reserved)
Xen: 0000000000100000 - 0000000020800000 (usable)
2.6.39-linode33 #5
BIOS-provided physical RAM map:
Xen: 0000000000000000 - 00000000000a0000 (usable)
Xen: 00000000000a0000 - 0000000000100000 (reserved)
Xen: 0000000000100000 - 0000000020000000 (usable)
Xen: 0000000100000000 - 0000000100800000 (usable)
So the total usable memory is identical, but 2.6.39 shows a big hole between the 512MB block and the final 8MB block, whereas they are consolidated under 2.6.38. Presumably that unreachable gap below the 4GB mark is what inflates the calculation for tables in reserved space, and also produces the larger total physical memory displayed in dmesg. I suppose it's possible that either kernel would calculate reserved memory the same way given that map; I don't know which side has more influence over the map - the host Xen, or the kernel itself in how it interfaces with the host.
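For what it's worth, the numbers line up: the 2.6.39 map tops out at 0x100800000, which is exactly the 4202496k total in its dmesg line, and the hole between the end of the 512MB block (0x20000000) and the start of the final block (0x100000000) works out to 3584MB - so it's really a ~3.5GB gap, with the last 8MB of the node's memory pushed up above 4GB.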
I had tried adding mem=512 to the kernel startup options (since you can do that with standard kernels through the Linode Manager), but it had no effect; dmesg still showed the larger total physical memory value, which would jibe with the reserved space being some tables computed from that maximum value rather than from the reachable memory.
As you say, without a pv-grub setup (which I also don't use), the other kernel options can't really be tested. It would be nice if additional kernel options could be specified through the Linode Manager; in this case the memmap approach looks like it might be a practical workaround.
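If someone who does run pv-grub wants to experiment, my (entirely untested) understanding is that the generic memmap= boot parameters could be used to clamp the map to the usable regions below 4GB, something like:
memmap=exactmap memmap=640K@0 memmap=511M@1M
That just re-declares the two usable regions from the 2.6.39 map that sit below 512MB and drops the stray 8MB block above 4GB - assuming the Xen PV boot path honors memmap= at all, which I don't know.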
Can anyone from Linode confirm how the Xen guest memory is currently set up, and whether the host actually presents anything different to the two kernels, or whether this is most likely a change in how the kernel itself uses the same information provided by the host?
– David
Looks like things improve for 2.6.39.1 (32 bit):
Linux ur-mom 2.6.39.1-linode34 #1 SMP Tue Jun 21 10:29:24 EDT 2011 i686 GNU/Linux
Memory: 509012k/532480k available (5701k kernel code, 23020k reserved, 1656k data, 412k init, 0k highmem)
These are ready to go now.
http://www.linode.com/kernels/
http://www.linode.com/kernels/rss.xml
-Chris
Interestingly, the "BIOS provided memory map" in dmesg now shows the consolidated usable memory region again (like pre-2.6.39). I guess I would have thought changes to that had to mean Xen was providing different information to the guest, but I guess not - it must have something to do with how the kernel is interpreting whatever is being provided.
Edit: Ah, it may have been this patch:
-- David
@caker:
Looks like things improve for 2.6.39.1 (32 bit):
Linux ur-mom 2.6.39.1-linode34 #1 SMP Tue Jun 21 10:29:24 EDT 2011 i686 GNU/Linux
Memory: 509012k/532480k available (5701k kernel code, 23020k reserved, 1656k data, 412k init, 0k highmem)
These are ready to go now.
http://www.linode.com/kernels/
http://www.linode.com/kernels/rss.xml
-Chris
Unfortunately the x86-64 version still seems to suffer from the same or a similar kind of problem in 2.6.39.1.
$ uname -a
Linux foo 2.6.39.1-x86_64-linode19 #1 SMP Tue Jun 21 10:04:20 EDT 2011 x86_64 GNU/Linux
$ dmesg | grep ^Mem
Memory: 433024k/4202496k available (6041k kernel code, 3670464k absent, 99008k reserved, 5314k data, 668k init)
$ free -m
total used free shared buffers cached
Mem: 424 89 335 0 2 35
-/+ buffers/cache: 51 373
Swap: 1023 0 1023
$
As a reference, here's the free -m output from 2.6.38:
total used free shared buffers cached
Mem: 490 441 48 0 95 258
-/+ buffers/cache: 88 401
Swap: 1023 1 1022
(I know it serves little purpose in itself with x86-64 on the 512MB plan, but I want it for full compatibility with a separate system.)
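Putting the two free outputs side by side: 424MB total under 2.6.39.1 x86-64 vs. 490MB under 2.6.38, so roughly 66MB less usable memory, which roughly matches the 433024k "available" figure in the 2.6.39.1 dmesg line.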
The commit I found earlier is specific to 32-bit. It appears to be "fixing" the 32-bit case against a prior 2.6.39 commit that introduced the behavior:
I'm not familiar enough with the kernel's VM setup code to know why the original change was needed, nor why it was only corrected for 32-bit. But I don't see the 64-bit behavior changing without further upstream commits, which at this point would mean completely rolling back the original commit - and presumably it was made for some reason.
-- David
I see 3.0 kernels were just added by Linode (thanks!).
Any news on the memory issue in the 64-bit version?
@neo:
I see 3.0 kernels were just added by Linode (thanks!).
Any news on the memory issue in the 64-bit version?
It actually looks much better:
$ uname -a && dmesg | grep ^Mem && free -m
Linux foo.bar 3.0.0-x86_64-linode20 #1 SMP Tue Aug 2 12:37:09 EDT 2011 x86_64 GNU/Linux
Memory: 496764k/532480k available (6398k kernel code, 448k absent, 35268k reserved, 7020k data, 668k init)
total used free shared buffers cached
Mem: 488 78 409 0 2 30
-/+ buffers/cache: 46 442
Swap: 1023 0 1023
$
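So compared with 2.6.39.1 on x86-64, reserved drops from 99008k to 35268k and the "absent" region from 3670464k to just 448k - roughly 62MB back - and free's total is essentially back in line (488MB here vs. 490MB under 2.6.38).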