[Bug 1750013] Re: systemd-logind: memory leaks on session's connections (trusty-only)
Guilherme G. Piccoli
1750013 at bugs.launchpad.net
Fri Feb 16 18:25:52 UTC 2018
After some debug, it was observed that the only path in which a session
is freed (other than in user/seat free or manager free) is in garbage
collecting, specifically in the function manager_gc().
It means a closing session should be somehow added to gc in order gc has
a chance to validate if the session is ready to go, and then performs
the clean-up.
To include a session in gc, the function session_add_to_gc_queue() should
be called. This function is currently called from 3 paths:
* from session_stop();
* from manager startup routines;
* and from the empty cgroup notification handler.
Since we are not using systemd as init system in Trusty, the empty cgroup
notifier is not reliable. We rely on cgmanager to perform cgroup
interaction on Trusty, and its release-agent also does not notify when
cgroups are empty. Finally, commit
efdb023 ("core: unified cgroup hierarchy support")
made a refactor in cgroup management and it claims for the first time,
cgroup empty notifications are reliable, meaning until this commit we
certainly couldn't rely on it.
The startup path of manager is executed only in logind start; session_stop()
could execute from Terminate dbus events (like loginctl terminate-session),
and from user/seat stopping routines. Meaning there's no way a session is
added to gc queue except the cgroup empty notification (if it worked) in
this version of systemd.
The simple test-case mentioned in the first comment leaks entire sessions
and logind consumes an increasing amount of memory.
--
In order to prevent that, I wrote a small patch that proactively puts a
session in gc on session_remove_fifo(). This function is always called in
the event of Release a session - when the ssh ends, for example.
This way, we guarantee closing sessions have a chance to be cleared.
But also, it was noticed that kernel may take a small time window to
clear the tasks' cgroup tracking for the session - hence, the gc
might fail in freeing the session, so the patch adds a check on gc too
in order to re-add the session on gc in case session is on closing state.
This way stopping sessions are always freed and no leak is observed.
The patch:
diff --git a/src/login/logind-session.c b/src/login/logind-session.c
index 662273b..1a7389a 100644
--- a/src/login/logind-session.c
+++ b/src/login/logind-session.c
@@ -951,6 +952,7 @@ void session_remove_fifo(Session *s) {
free(s->fifo_path);
s->fifo_path = NULL;
}
+ session_add_to_gc_queue(s);
}
int session_check_gc(Session *s, bool drop_not_started) {
diff --git a/src/login/logind.c b/src/login/logind.c
index 333a779..0b9e165 100644
--- a/src/login/logind.c
+++ b/src/login/logind.c
@@ -1495,6 +1495,8 @@ void manager_gc(Manager *m, bool drop_not_started) {
if (session_check_gc(session, drop_not_started) == 0) {
session_stop(session);
session_free(session);
+ } else if (session_get_state(session) == SESSION_CLOSING) {
+ session_add_to_gc_queue(session);
}
}
This diverges from upstream since we have refactors in systemd code,
specially in session shutdown logic on commit 5f41d1f
("logind: rework session shutdown logic"). The simple approach of this
patch is less expensive then bring multiple patches to the current code
of Trusty's systemd.
As possible side effects, adding a session to gc has none other than a
possible (non-measured) minimal performance penalty.
Finally, it was observed that "cgmanager" package is necessary to perform
the full clean-up of a session - since we don't handle cgroups with systemd
in Trusty, cgmanager is the responsible one to clear the cgroup directory
for each cleaned-up session.
After testing the solution above (with cgmanager), memory consumption was
drastically reduced, since systemd-logind is not "leaking" entire
sessions anymore.
More small leaks were observed though, so investigation continues.
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1750013
Title:
systemd-logind: memory leaks on session's connections (trusty-only)
Status in systemd package in Ubuntu:
New
Bug description:
It was observed that systemd-logind tool is leaking memory at each session
connected. The issue happens in systemd from Trusty (14.04), which latest
version currently (Feb/2018) is 204-5ubuntu20.26 (and still reproduces
the bug).
The basic test-case is to run the following loop from a remote
machine:
while true; do ssh <hostname-target> "whoami"; done
and watch the increase in memory consumption from "systemd-logind" process
in the target machine. One can use the "ps uax" command to verify the
RSS of the process, or count the anon pages from /proc/<logind_pid>/smaps.
To clarify a bit how a session works, the following "stack trace" details
a bit which function calls happen when a SSH connection is opened, from
Trusty's systemd-logind point of view:
main() <logind.c>
manager_startup()
manager_run() [event-loop]
bus_loop_dispatch() <dbus-loop.c>
dbus_watch_handle() -> bus_manager_message_handler()
bus_manager_create_session()
manager_add_session() <logind.c>
session_new() <logind-session.c>
session_create_fifo()
session_start()
session_create_cgroup()
session_save()
session_bus_path()
[...]
After each session is closed, it was observed that session_free() isn't
called, keeping the sessions alive. This can be verified through the
command "loginctl list-session" - each session that once connected is
present there "forever".
The memory leaks can eventually lead to OOM situation of this process.
Debug progress will be tracked here, in this LP.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1750013/+subscriptions
More information about the foundations-bugs
mailing list