Memory leaks in Ubuntu 20 kernel?

nate ubuntu at linuxpowered.net
Fri May 21 05:00:06 UTC 2021


Hello -

(Linux user since 1996)

TL;DR (?) - Memory usage has more than doubled for a simple workload on Ubuntu
20.04 vs 16.04 (and 12.04 and 10.04 before that). The source of the memory usage
is not reported as in use by any process in "top", and memory continues to leak
as time goes on.

Sorry for the long post, but I wanted to include as much detail as I could.

I was wondering if anyone else has noticed this. I have been replacing many
16.04 systems with 20.04, and in some cases the 20.04 systems are using a ton
more memory for no apparent reason.

The most basic type of system I have is a utility server, which runs services
such as:

(% numbers are memory usage as reported by top after restarting the services)
bind - 0.7%
splunk forwarder - 1.2%
syslog-ng (local logs only) - 0.2%
snmpd - 0.2%
NFS client - ~0.2%
autofs - 0.3%
Apache (basic config, 3 workers minimal traffic) - 0.1%
postfix relay (minimal traffic) - 0.1%
Chef configuration management agent - 0.4%
Only 26 kernel modules loaded (I tried to minimize the modules that are loaded;
my home Ubuntu 20 laptop, by contrast, has 140 modules loaded).

A sample 'free -m' immediately after flushing swap and clearing disk buffers
(as in 2 seconds after):

               total        used        free      shared  buff/cache   available
Mem:            2983        2047         101           1         834         718
Swap:            511          56         455

The system has 718MB of "available" memory, but it feels an immediate need to
get into swap right after swap is cleared.
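
For what it's worth, the "used" number there includes kernel-side allocations
that top never attributes to any process. A quick (if crude) way to see where
that memory is sitting is the kernel counters in /proc/meminfo:

grep -E '^(MemTotal|MemFree|Buffers|Cached|Shmem|Slab|SReclaimable|SUnreclaim|KernelStack|PageTables|VmallocUsed)' /proc/meminfo

If SUnreclaim in particular keeps growing over time, that points at a
kernel-side leak rather than a userspace process.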

All local filesystems are ext4 (so no memory-hogging ZFS or anything like that
running), and only 4.1GB of local disk space is in use.

I have a custom system memory monitoring script I wrote back in 2004 and have
been evolving to support newer Linux interfaces since then, so I have data going
back for the past year on these systems. On Ubuntu 16.04 the systems were
configured with 1.5GB of memory and 1 CPU, and were using on average 300MB of
memory, about 850MB of cache, ~130MB of buffers, and a pretty steady ~180MB of
free memory. This usage was extremely stable for at least 8 months prior to the
upgrade.

Using the exact same configuration (this is managed by Chef), on Ubuntu 20.04
the memory usage has grown dramatically. The system was installed Feb 9 2021.
I increased the system memory to 2GB as part of the upgrade, then again to 3GB
at the end of April. Currently a sample system is using 1.08GB of memory (up
more than 300%), 1.64GB of cache (about a 200% increase), almost no buffers,
and about 130MB free. The memory usage looks like a steady leak: on Feb 9
memory usage was about 400MB and now it is 1.08GB.

These systems had been running Ubuntu 16.04 for probably the past 4 years, and
before that Ubuntu 12.04, and before that 10.04, all with the same configuration
(+/- minor option changes required to support each distro version, etc).

Running "top" reports nothing using more than 1.8% of memory. I 
restarted all of the "major"
services(not expecting any results), and sure enough zero impact to 
memory usage. I flushed
all of the kernel cache buffers:

echo 1 > /proc/sys/vm/drop_caches
echo 2 > /proc/sys/vm/drop_caches
echo 3 > /proc/sys/vm/drop_caches

It freed up only about 100MB.
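
Since drop_caches only releases reclaimable caches, whatever is left after that
would be unreclaimable kernel memory. A rough sketch for spotting which slab
caches are the big consumers (needs root; the num_objs * objsize math ignores
per-slab overhead, so treat the sizes as approximate):

sudo awk 'NR > 2 { printf "%-28s %8.1f MB\n", $1, $3 * $4 / 1048576 }' /proc/slabinfo | sort -k2 -rn | head -15

The slabtop tool shows the same information interactively.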

All systems have a 512MB "last resort" kind of swap, and "vm.swappiness" has
been set to 0 for the past decade, yet the system swaps often. I have a cron
job that analyzes swap usage and free memory every hour and clears swap if
there is room for it; when it clears swap it also flushes all of the buffers.
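
For reference, a minimal sketch of that kind of hourly job (simplified, not the
exact script; run from root's crontab):

#!/bin/sh
# Only move swapped pages back into RAM if there is room for them.
swap_used_kb=$(awk '/^SwapTotal/ {t=$2} /^SwapFree/ {f=$2} END {print t-f}' /proc/meminfo)
avail_kb=$(awk '/^MemAvailable/ {print $2}' /proc/meminfo)
if [ "$swap_used_kb" -gt 0 ] && [ "$avail_kb" -gt "$swap_used_kb" ]; then
    sync
    echo 3 > /proc/sys/vm/drop_caches    # flush page cache, dentries and inodes
    swapoff -a && swapon -a              # forces swapped pages back into RAM
fi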

These are low-usage systems with a single CPU; they average under 5% CPU usage
24/7.

Before upgrading to 3GB, I tried turning off swap just to see what would
happen; the result was that the OOM killer started getting invoked. So I
re-enabled swap and increased memory to 3GB.
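
(The OOM kills show up in the kernel log, for anyone wanting to check for the
same thing:

dmesg -T | grep -i 'out of memory'
journalctl -k | grep -i oom
)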

Realistically these systems SHOULD run fine with 1GB of memory; I only set them
to 2GB because I have seen memory usage spike during apt-get upgrades in the
past. But now they are at 3GB, the memory usage profile has not changed, and
they are continuing to leak more.

OS is Ubuntu 20.04.2
Kernel is 5.4.0-65-generic (I know not the latest)

I did come across this post a few weeks ago, which had a comment calling the
memory changes in 5.4 into question, and noting that some newer kernels were
better for that person, though still not back to "normal" memory usage:
https://askubuntu.com/questions/1278460/why-does-vm-swappiness-not-working

These do all run inside VMware ESX; the hosts have plenty of memory and there
is no ballooning going on (we have never had a ballooning incident in the past
decade and have tons of extra memory). I gather probably 30,000 data points a
minute across our infrastructure, all graphed and alerted on in LogicMonitor.

I can't imagine others haven't encountered this situation, but I am having a
hard time finding references to it anywhere other than that askubuntu post
above.

I am just not sure where to look to find the source of the usage. I came 
across this
tonight:
https://www.kernel.org/doc/html/v5.4/dev-tools/kmemleak.html

The option it requires isn't enabled in the stock kernel, though. I could build
my own kernel to test, but I am uncertain whether I could make sense of the
results as I am not a kernel dev.
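
From that doc, the usage itself looks simple enough (assuming a kernel built
with CONFIG_DEBUG_KMEMLEAK and booted with kmemleak=on):

mount -t debugfs nodev /sys/kernel/debug/   # usually already mounted
echo scan > /sys/kernel/debug/kmemleak      # trigger a scan of kernel memory
cat /sys/kernel/debug/kmemleak              # list suspected leaks with stack traces

The hard part would be interpreting the stack traces it reports.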

I have seen significantly higher memory usage on MySQL servers in some
situations vs 16.04 as well, and in other places too, but in those cases the
memory usage actually registers as being used by MySQL. In this case it is not
registering as being used by any process.
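
A rough way to quantify that gap is to sum the resident set sizes of every
process and compare against what the kernel says is in use (RSS double-counts
shared pages, so if anything it overstates what userspace holds):

ps -eo rss= | awk '{sum += $1} END {printf "%.0f MB\n", sum / 1024}'
free -m

If that sum is nowhere near the "used" figure, the memory is being held by the
kernel rather than by processes.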

I have 14 different servers running this configuration and they are all
behaving the same. We currently have nearly 300 Ubuntu 20.04 systems. These
utility servers should be the easiest ones on which to narrow down the issue,
as they don't run any fancy software. We still have a couple hundred 16.04
systems left to upgrade as well.

Worst case, I guess I can just reboot them occasionally, which is not something
I've ever had to do for Linux before.

thanks

nate






