[ 3.5.y.z extended stable ] Patch "memcg: oom: fix totalpages calculation for" has been added to staging queue
Herton Ronaldo Krzesinski
herton.krzesinski at canonical.com
Fri Dec 7 16:06:01 UTC 2012
This is a note to let you know that I have just added a patch titled
memcg: oom: fix totalpages calculation for
to the linux-3.5.y-queue branch of the 3.5.y.z extended stable tree
which can be found at:
http://kernel.ubuntu.com/git?p=ubuntu/linux.git;a=shortlog;h=refs/heads/linux-3.5.y-queue
If you, or anyone else, feels it should not be added to this tree, please
reply to this email.
For more information about the 3.5.y.z tree, see
https://wiki.ubuntu.com/Kernel/Dev/ExtendedStable
Thanks.
-Herton
------
>From 5c2f6c2e3d1a338968d64a8f696073a6296497e8 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko at suse.cz>
Date: Fri, 16 Nov 2012 14:14:49 -0800
Subject: [PATCH] memcg: oom: fix totalpages calculation for
memory.swappiness==0
commit 9a5a8f19b43430752067ecaee62fc59e11e88fa6 upstream.
oom_badness() takes a totalpages argument which says how many pages are
available and it uses it as a base for the score calculation. The value
is calculated by mem_cgroup_get_limit which considers both limit and
total_swap_pages (resp. memsw portion of it).
This is usually correct but since fe35004fbf9e ("mm: avoid swapping out
with swappiness==0") we do not swap when swappiness is 0 which means
that we cannot really use up all the totalpages pages. This in turn
confuses oom score calculation if the memcg limit is much smaller than
the available swap because the used memory (capped by the limit) is
negligible comparing to totalpages so the resulting score is too small
if adj!=0 (typically task with CAP_SYS_ADMIN or non zero oom_score_adj).
A wrong process might be selected as result.
The problem can be worked around by checking mem_cgroup_swappiness==0
and not considering swap at all in such a case.
Signed-off-by: Michal Hocko <mhocko at suse.cz>
Acked-by: David Rientjes <rientjes at google.com>
Acked-by: Johannes Weiner <hannes at cmpxchg.org>
Acked-by: KOSAKI Motohiro <kosaki.motohiro at jp.fujitsu.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu at jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
[ herton: unfuzz patch ]
Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski at canonical.com>
---
Documentation/cgroups/memory.txt | 4 ++++
mm/memcontrol.c | 21 +++++++++++++++------
2 files changed, 19 insertions(+), 6 deletions(-)
diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt
index dd88540..63838c2 100644
--- a/Documentation/cgroups/memory.txt
+++ b/Documentation/cgroups/memory.txt
@@ -464,6 +464,10 @@ Note:
5.3 swappiness
Similar to /proc/sys/vm/swappiness, but affecting a hierarchy of groups only.
+Please note that unlike the global swappiness, memcg knob set to 0
+really prevents from any swapping even if there is a swap storage
+available. This might lead to memcg OOM killer if there are no file
+pages to reclaim.
Following cgroups' swappiness can't be changed.
- root cgroup (uses /proc/sys/vm/swappiness).
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index f72b5e5..3ddd346 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1457,17 +1457,26 @@ static int mem_cgroup_count_children(struct mem_cgroup *memcg)
u64 mem_cgroup_get_limit(struct mem_cgroup *memcg)
{
u64 limit;
- u64 memsw;
limit = res_counter_read_u64(&memcg->res, RES_LIMIT);
- limit += total_swap_pages << PAGE_SHIFT;
- memsw = res_counter_read_u64(&memcg->memsw, RES_LIMIT);
/*
- * If memsw is finite and limits the amount of swap space available
- * to this memcg, return that limit.
+ * Do not consider swap space if we cannot swap due to swappiness
*/
- return min(limit, memsw);
+ if (mem_cgroup_swappiness(memcg)) {
+ u64 memsw;
+
+ limit += total_swap_pages << PAGE_SHIFT;
+ memsw = res_counter_read_u64(&memcg->memsw, RES_LIMIT);
+
+ /*
+ * If memsw is finite and limits the amount of swap space
+ * available to this memcg, return that limit.
+ */
+ limit = min(limit, memsw);
+ }
+
+ return limit;
}
static unsigned long mem_cgroup_reclaim(struct mem_cgroup *memcg,
--
1.7.9.5
More information about the kernel-team
mailing list