In this article I will give an example of changing gc_thresh on Linux. This parameter usually needs to be increased on heavily loaded access servers.
“gc_thresh1” is the minimum number of ARP entries kept in the cache; the garbage collector does not run while the cache holds fewer entries than this. “gc_thresh2” is the soft limit: once it is exceeded, entries start to be cleaned up after 5 seconds. “gc_thresh3” is the hard limit: when it is reached, entries are cleared immediately. If, for example, the server does NAT for a large number of network nodes and the “gc_thresh” values you have set are too low, you will see the error “arp_cache: neighbor table overflow!” in /var/log/kern.log.
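The three thresholds described above can be sketched as a small shell function. This is only an illustration of the idea, not the kernel's actual logic: the function name classify_neigh_usage and the exact boundary comparisons are my own assumptions.

```shell
# Hypothetical helper, not part of any standard tool.
# Usage: classify_neigh_usage COUNT THRESH1 THRESH2 THRESH3
classify_neigh_usage() {
    count=$1
    t1=$2
    t2=$3
    t3=$4
    if [ "$count" -lt "$t1" ]; then
        echo "below gc_thresh1"   # garbage collector leaves the table alone
    elif [ "$count" -lt "$t2" ]; then
        echo "below gc_thresh2"   # periodic cleanup only
    elif [ "$count" -lt "$t3" ]; then
        echo "above gc_thresh2"   # tolerated for about 5 seconds, then cleaned
    else
        echo "at gc_thresh3"      # hard limit, overflow errors are likely
    fi
}

# Example against the live system (commented out so the sketch has no side effects):
# d=/proc/sys/net/ipv4/neigh/default
# classify_neigh_usage "$(ip -4 neigh show | wc -l)" \
#     "$(cat $d/gc_thresh1)" "$(cat $d/gc_thresh2)" "$(cat $d/gc_thresh3)"
```

For example, classify_neigh_usage 600 128 512 1024 reports "above gc_thresh2".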
First, let’s see the current values:
cat /proc/sys/net/ipv4/neigh/default/gc_thresh1
cat /proc/sys/net/ipv4/neigh/default/gc_thresh2
cat /proc/sys/net/ipv4/neigh/default/gc_thresh3
Or so:
sysctl -a |grep "neigh.default.gc_thresh"
Let’s see the number of ARP entries and how it changes over time:
arp -an | wc -l
arp -an | grep incom | wc -l
arp -an | less
for i in {1..10}; do cat /proc/net/arp | wc -l && sleep 5; done
ip n | wc -l
ip n | grep STALE | wc -l
ip n | grep REACHABLE
ip n | less
ip -s neighbor show
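Instead of running a separate grep for each state, the entries can be counted per state in one pass. The sketch below assumes that the state (REACHABLE, STALE and so on) is the last field of each line printed by "ip neigh show" without the -s option; the function name summarize_neigh_states is my own.

```shell
# Hypothetical helper: count neighbor entries per state.
# Reads "ip neigh show" output on stdin; the state is taken from the last field.
summarize_neigh_states() {
    awk 'NF { count[$NF]++ } END { for (s in count) print s, count[s] }' | sort
}

# Example against the live system (commented out so the sketch has no side effects):
# ip -4 neigh show | summarize_neigh_states
```

Running it every few seconds gives a rough picture of how many entries move between states.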
For example, on a server with more than 6,000 clients, I set:
sysctl -w net.ipv4.neigh.default.gc_thresh3=24456
sysctl -w net.ipv4.neigh.default.gc_thresh2=12228
sysctl -w net.ipv4.neigh.default.gc_thresh1=8192
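To get a feeling for how much headroom such values leave, the current number of entries can be expressed as a percentage of gc_thresh3. This is a minimal sketch; the function name neigh_usage_percent is my own, and treating 6,000 clients as 6,000 entries is only a rough assumption.

```shell
# Hypothetical helper: integer percentage of gc_thresh3 that COUNT represents.
# Usage: neigh_usage_percent COUNT THRESH3
neigh_usage_percent() {
    echo $(( $1 * 100 / $2 ))
}

neigh_usage_percent 6000 24456   # prints 24
```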
To prevent the values from resetting after restarting the operating system, specify them in the /etc/sysctl.conf file:
net.ipv4.neigh.default.gc_thresh1=8192
net.ipv4.neigh.default.gc_thresh2=12228
net.ipv4.neigh.default.gc_thresh3=24456
The IPv6 values are set the same way:
net.ipv6.neigh.default.gc_thresh1=8192
net.ipv6.neigh.default.gc_thresh2=12228
net.ipv6.neigh.default.gc_thresh3=24456
If you have not already applied the values with sysctl -w, load them from the /etc/sysctl.conf file with the command below:
sysctl -p
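The six lines for IPv4 and IPv6 can also be generated from three numbers, which avoids typos when the values change. The sketch below is only one possible way to do it: the function name print_gc_thresh_conf, the ordering check and the drop-in file name in the commented example are my own assumptions.

```shell
# Hypothetical helper: print a sysctl fragment for IPv4 and IPv6.
# Usage: print_gc_thresh_conf THRESH1 THRESH2 THRESH3
print_gc_thresh_conf() {
    # Sanity check (my assumption): the thresholds should be strictly increasing.
    if [ "$1" -ge "$2" ] || [ "$2" -ge "$3" ]; then
        echo "expected gc_thresh1 < gc_thresh2 < gc_thresh3" >&2
        return 1
    fi
    for family in ipv4 ipv6; do
        n=1
        for value in "$1" "$2" "$3"; do
            echo "net.$family.neigh.default.gc_thresh$n=$value"
            n=$((n + 1))
        done
    done
}

# Example (the file name is an assumption; run as root):
# print_gc_thresh_conf 8192 12228 24456 > /etc/sysctl.d/90-neigh-gc.conf
# sysctl -p /etc/sysctl.d/90-neigh-gc.conf
```

Passing a file name to sysctl -p loads that file instead of /etc/sysctl.conf.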
Note that gc_thresh1 should not be set too high; take into account how many clients your server actually serves. While the number of ARP entries in the cache is below gc_thresh1, the garbage collector does not remove them, so with a large gc_thresh1 stale entries may never be deleted. If a client's MAC or IP address then changes, the client may become unreachable, because the cache still holds the old entry.
Obsolete ARP entries go from REACHABLE to STALE and are later cleared; entries for addresses that could not be resolved are marked INCOMPLETE and are also cleared.
Let’s look at the timers: how often stale ARP entries are checked (gc_stale_time), how often the garbage collector runs (gc_interval) and how long an entry stays REACHABLE before it is marked as stale (base_reachable_time_ms):
cat /proc/sys/net/ipv4/neigh/default/gc_stale_time
cat /proc/sys/net/ipv4/neigh/default/gc_interval
cat /proc/sys/net/ipv4/neigh/default/base_reachable_time_ms
If desired, you can increase them:
echo 120 > /proc/sys/net/ipv4/neigh/default/gc_stale_time
echo 60 > /proc/sys/net/ipv4/neigh/default/gc_interval
If you changed them, also add them to /etc/sysctl.conf (the values shown here are the defaults for Ubuntu Server 16.04):
net.ipv4.neigh.default.gc_stale_time=60
net.ipv4.neigh.default.gc_interval=30
net.ipv4.neigh.default.base_reachable_time_ms=30000
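As far as I know, the kernel does not use base_reachable_time_ms as a fixed value: the actual reachable time of an entry is randomized between half of the base and one and a half times the base. A minimal sketch of that arithmetic, with a function name of my own choosing:

```shell
# Hypothetical helper: print the lower and upper bound (in ms) of the
# randomized reachable time for a given base_reachable_time_ms.
# Usage: reachable_range_ms BASE_MS
reachable_range_ms() {
    echo "$(( $1 / 2 )) $(( $1 * 3 / 2 ))"
}

reachable_range_ms 30000   # prints "15000 45000"
```

So with the default of 30000 an entry stays REACHABLE for roughly 15 to 45 seconds after its last confirmation.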
See also my articles:
Tuning nf_conntrack
How to fix the error “nf_conntrack: table full, dropping package”
“number of arp entries above this value will never be deleted”
Can you please explain this sentence? For me it does not really make sense
Thank you for your article, it reminded me of how to operate at scale, something I don’t do all the time! This is really important particularly for clusters where VMs are running (each VM has a MAC addr), and I’m not sure why the defaults are so low. The cache timeout will clean out old entries so what’s the harm in having higher defaults?