20090630 linux memory utilization - plembo/onemoretech GitHub Wiki
title: Linux memory utilization
link: https://onemoretech.wordpress.com/2009/06/30/linux-memory-utilization/
author: lembobro
description:
post_id: 291
created: 2009/06/30 14:53:04
created_gmt: 2009/06/30 14:53:04
comment_status: open
post_name: linux-memory-utilization
status: publish
post_type: post
Linux memory utilization
Why is that Linux server using 96% of memory?
This question has come up a lot lately because we’re finally monitoring mundane stuff like memory utilization globally.
The answer? This is a “feature” of memory management in the Linux kernel.
A good article on this is Understanding Virtual Memory by Norm Murray and Neil Horman, published in the November 2004 issue of Red Hat Magazine.
There’s a basic philosophy at work in Linux memory management that’s different from what you’ll find under other O/S’s: treat all free memory as potentially usable for cache, to avoid having to make costly (in terms of efficiency) trips out to slower physical disk.
Running vmstat on a Linux system will bear this out.
```
[root@bigserver ~]# vmstat
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff    cache   si   so    bi    bo   in   cs us sy  id wa
 0  0    224  83248 265384 15765176    0    0     1     7    1    0  0  0 100  0
```
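To see for yourself that this cached memory is reclaimable rather than “used up”, here’s a quick sketch. It assumes root access, a kernel new enough to have /proc/sys/vm/drop_caches (2.6.16 or later), and a hypothetical large file path; substitute any big file you have on hand:

```
# Check the current size of the page cache.
grep '^Cached' /proc/meminfo

# Read a large file to warm the cache (the path here is just a placeholder).
cat /var/log/bigfile.log > /dev/null

# Cached should have grown by roughly the size of the file just read.
grep '^Cached' /proc/meminfo

# Flush dirty data, then ask the kernel to drop the clean page cache (root only).
sync
echo 1 > /proc/sys/vm/drop_caches

# Cached shrinks and "free" goes back up -- the memory was never really gone.
grep '^Cached' /proc/meminfo
```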
In Monitoring Virtual Memory with vmstat, from the October 2005 issue of Linux Journal, Brian Tanaka details how to determine whether you’ve got a real memory problem on a Linux system. One key piece of advice is to watch the output over time, so running vmstat with a delay is helpful. For example, “vmstat 5” will refresh the output every 5 seconds (rather than printing a single summary and exiting, which is what a plain “vmstat” does), and “vmstat 5 10” will print 10 updates at 5-second intervals.
In that article there’s an example of a happy system (free of paging activity) and a hurting one (where paging has become excessive). First the “happy” system:
```
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  0  0  29232 116972   4524 244900   0   0     0     0    0     0   0   0   0
 0  0  0  29232 116972   4524 244900   0   0     0     0 2560     6   0   1  99
 0  0  0  29232 116972   4524 244900   0   0     0     0 2574    10   0   2  98
```
The important metrics here are free, si and so. The “free” column is pretty obvious: free memory. The si and so columns show page-ins and page-outs, respectively. Both are “0” here, meaning there’s no paging going on.
Then, the “hurting” system:
```
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 . . .
 1  0  0  13344   1444   1308  19692   0 168   129    42 1505   713  20  11  69
 1  0  0  13856   1640   1308  18524  64 516   379   129 4341   646  24  34  42
 3  0  0  13856   1084   1308  18316  56  64    14     0  320  1022  84   9   8
```
Now that’s a busy system. Nonzero values for so mean there isn’t enough physical memory in the system, so it has to page out to disk. Depending on your hardware that may or may not be a bad thing, but in most cases it should definitely be the subject of further monitoring.
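If you want something that flags this for you while you watch, here’s a minimal sketch (not from Tanaka’s article; it assumes the newer vmstat column layout shown in the first example above, where si and so are the 7th and 8th columns):

```
# Take 10 samples at 5-second intervals and print any sample showing swap activity.
# The $1 test skips vmstat's header lines; keep in mind that the very first sample
# reports averages since boot, so sustained nonzero si/so later on is what matters.
vmstat 5 10 | awk '$1 ~ /^[0-9]+$/ && ($7 > 0 || $8 > 0) { print "paging:", $0 }'
```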
I may be wrong, but I think Solaris used to do the opposite to conserve RAM (which at the time was really expensive $$$) and rely on really fast hardware disk access to avoid slowing processes down. It probably isn’t a coincidence that Sun, a hardware company that could specify the disk controller requirements, took that course while Linux, which had (and has) little pull with hardware vendors, took the other.
Of course, knowing this, you have to wonder why monitoring software vendors don’t configure their products to prevent them from barking at unsuspecting sysadmins when memory utilization goes up to 96% or so, where it actually belongs.
Addendum:
Just wanted to add another example of how to calculate actual free memory from an article in the March 2010 issue of Linux Journal by Kyle Rankin, Linux Troubleshooting, Part I: High Load.
The basic technique is simple: run top and add the amount of memory shown as “free” on the “Mem” line to the number shown as “cached” at the end of the “Swap” line. That will give you the real total of free system memory.
```
[root@test1 ~]$ top -n1
top - 13:47:54 up 24 days, 19:36, 3 users, load average: 0.00, 0.00, 0.00
Tasks: 185 total, 1 running, 183 sleeping, 0 stopped, 1 zombie
Cpu(s): 0.2% us, 0.1% sy, 0.0% ni, 99.7% id, 0.0% wa, 0.0% hi, 0.0% si
Mem: 32960152k total, 21358056k used, 11602096k free, 445228k buffers
Swap: 8385920k total, 0k used, 8385920k free, 19611236k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6429 oragrid 17 0 406m 128m 20m S 2.0 0.4 25:01.76 emagent
20951 oracle 23 0 359m 47m 17m S 2.0 0.1 0:06.03 emagent
1 root 16 0 4772 564 468 S 0.0 0.0 0:01.33 init
2 root RT 0 0 0 0 S 0.0 0.0 0:02.01 migration/0
3 root 34 19 0 0 0 S 0.0 0.0 0:00.13 ksoftirqd/0
4 root RT 0 0 0 0 S 0.0 0.0 0:02.05 migration/1
```
In the example above we have 11602096k free, and 19611236k cached. That adds up to a total of 31213332k free, or 31 Gb, of 32960152k (32 Gb) installed, meaning there’s really only 1746820k (just under 2 Gb) being used by the system and programs, not 21358056k (21 Gb). The rest is mostly filesystem cache.
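The same arithmetic can be done straight from /proc/meminfo, which is where top gets these numbers in the first place. Here’s a minimal sketch (the MemTotal, MemFree and Cached fields are standard, though the full set of fields varies by kernel version):

```
# Report how much memory is genuinely free (free + cached) versus actually in use,
# using the same arithmetic as above. All /proc/meminfo values are in kB.
awk '/^MemTotal:/ { total = $2 }
     /^MemFree:/  { free  = $2 }
     /^Cached:/   { cache = $2 }
     END {
         printf "installed:                    %d kB\n", total
         printf "really free (free + cached):  %d kB\n", free + cache
         printf "used by system and programs:  %d kB\n", total - (free + cache)
     }' /proc/meminfo
```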
Copyright 2004-2019 Phil Lembo