Wednesday, February 20, 2013

Out of socket memory in Linux


TCP memory limits

$ cat /proc/sys/net/ipv4/tcp_mem
3093984 4125312 6187968

The values are counted in pages (4096 bytes each). They get automatically sized at boot time (the values above are from a machine with 32GB of RAM). They mean:

    When TCP uses less than 3093984 pages (11.8GB), the kernel will consider it below the "low threshold" and won't bother TCP about its memory consumption.
    When TCP uses more than 4125312 pages (15.7GB), the kernel enters "memory pressure" mode.
    The maximum number of pages the kernel is willing to give to TCP is 6187968 (23.6GB). When we go above this, we'll start seeing the "Out of socket memory" error and Bad Things will happen.
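
The page-to-gigabyte arithmetic above is easy to check with a few lines of Python. This is only a sketch: the function names (parse_tcp_mem, pages_to_gib, classify) are made up for illustration, the 4096-byte page size is taken from the text and may differ on other architectures, and classify is a simplification of what the kernel actually does between the thresholds.

```python
PAGE_SIZE = 4096  # bytes per page, as stated in the text; an assumption on other platforms


def parse_tcp_mem(text):
    """Split the three page counts found in /proc/sys/net/ipv4/tcp_mem."""
    low, pressure, high = (int(field) for field in text.split())
    return low, pressure, high


def pages_to_gib(pages, page_size=PAGE_SIZE):
    """Convert a page count to GiB (2**30 bytes)."""
    return pages * page_size / 2**30


def classify(used_pages, low, pressure, high):
    """Rough label for a TCP page count relative to the three thresholds."""
    if used_pages > high:
        return "over the limit"
    if used_pages > pressure:
        return "memory pressure"
    if used_pages < low:
        return "below low threshold"
    return "between low and pressure"


# Values copied from the example machine above.
low, pressure, high = parse_tcp_mem("3093984 4125312 6187968")
for name, pages in (("low", low), ("pressure", pressure), ("high", high)):
    print(f"{name:9s}{pages:8d} pages = {pages_to_gib(pages):.1f} GiB")
```

On a live machine you would pass the contents of /proc/sys/net/ipv4/tcp_mem to parse_tcp_mem instead of the hard-coded string.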


$ cat /proc/net/sockstat
sockets: used 14565
TCP: inuse 35938 orphan 21564 tw 70529 alloc 35942 mem 1894
UDP: inuse 11 mem 3
UDPLITE: inuse 0
RAW: inuse 0
FRAG: inuse 0 memory 0

The last value on the second line (mem 1894) is the number of pages allocated to TCP.
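
If you want to pull that number out programmatically, something along these lines should work. It is a sketch, not a robust parser: parse_sockstat is a hypothetical helper, it assumes every line has the "NAME: key value key value" shape shown above, and the sample text is simply the output copied from this post.

```python
def parse_sockstat(text):
    """Turn sockstat-style text into {section: {key: int}}."""
    stats = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank or unexpected lines
        section, _, rest = line.partition(":")
        fields = rest.split()
        # Fields alternate between a key and its integer value.
        stats[section.strip()] = {
            key: int(value) for key, value in zip(fields[0::2], fields[1::2])
        }
    return stats


# Sample output copied from above; on a live system, read /proc/net/sockstat instead.
SAMPLE = """sockets: used 14565
TCP: inuse 35938 orphan 21564 tw 70529 alloc 35942 mem 1894
UDP: inuse 11 mem 3
UDPLITE: inuse 0
RAW: inuse 0
FRAG: inuse 0 memory 0
"""

stats = parse_sockstat(SAMPLE)
tcp_pages = stats["TCP"]["mem"]
print("TCP pages:", tcp_pages)
print("TCP orphans:", stats["TCP"]["orphan"])
```

Compare the "TCP pages" figure against the three tcp_mem thresholds to see how close the machine is to memory pressure.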

In order to find the limit on the number of orphan sockets, simply do:

$ cat /proc/sys/net/ipv4/tcp_max_orphans
65536

Here we see the default value, which is 64k. In order to find the number of orphan sockets in the system, look again in sockstat:

$ cat /proc/net/sockstat
sockets: used 14565
TCP: inuse 35938 orphan 21564 tw 70529 alloc 35942 mem 1894

Yet if you look at the kernel code that prints the warning, you'll see a shift variable with a value between 0 and 2, and the check tests if (orphans << shift > sysctl_tcp_max_orphans). What this means is that in certain cases the kernel decides to penalize some sockets more, and it does so by multiplying the number of orphans by 2x or 4x to artificially increase the "score" of the "bad socket". The problem is that, because of the way this is implemented, you can see a worrisome "Out of socket memory" error when in fact you're still 4x below the limit and you just had a couple of "bad sockets" (which happens frequently when you have an Internet-facing service).

So unfortunately you need to raise the maximum number of orphan sockets even if you're 2x or 4x away from the threshold. What value is reasonable depends on your situation. Observe how the count of orphans in /proc/net/sockstat changes when your server is at peak traffic, multiply that value by 4, round it up a bit to get a nice value, and set it. You can set it by echoing the new value into /proc/sys/net/ipv4/tcp_max_orphans, and don't forget to update the value of net.ipv4.tcp_max_orphans in /etc/sysctl.conf so that it survives a reboot.
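
To see the effect of the shift on the numbers from this post, here is a small Python sketch. It only mirrors the comparison quoted above, not the kernel's full logic. The helper names are invented for illustration, and rounding up to the next power of two is just one guess at what a "nice value" might be.

```python
def looks_over_limit(orphans, shift, max_orphans):
    """Mirror the quoted check: orphans << shift > tcp_max_orphans."""
    return (orphans << shift) > max_orphans


def suggest_max_orphans(peak_orphans, factor=4):
    """Multiply the peak orphan count by 4, then round up to a power of two.

    The power-of-two rounding is an assumption; any comfortable round number works.
    """
    target = peak_orphans * factor
    nice = 1
    while nice < target:
        nice *= 2
    return nice


peak = 21564   # orphan count from the sockstat output above
limit = 65536  # default tcp_max_orphans shown above

for shift in (0, 1, 2):
    tripped = looks_over_limit(peak, shift, limit)
    print(f"shift={shift}: {peak << shift} vs {limit} -> over limit: {tripped}")

print("suggested tcp_max_orphans:", suggest_max_orphans(peak))
```

With these numbers the check only trips at shift 2, even though the real orphan count is about a third of the limit, which is exactly the false alarm described above.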