Patch series "Use obj_cgroup APIs to charge the LRU pages", v6.
With the following patchsets applied, all the kernel memory is charged with
the new APIs of obj_cgroup:
commit
f2fe7b09a52b ("mm: memcg/slab: charge individual slab objects instead of pages")
commit
b4e0b68fbd9d ("mm: memcontrol: use obj_cgroup APIs to charge kmem pages")
But user memory allocations (LRU pages) can still pin memcgs for a long
time. This happens at a larger scale and causes recurring problems in the
real world: page cache doesn't get reclaimed for a long time, or is used by
the second, third, fourth, ... instance of the same job that was restarted
into a new cgroup every time. Unreclaimable dying cgroups pile up, waste
memory, and make page reclaim very inefficient.
We can fix this problem by converting LRU pages and most other raw memcg
pins to objcg references, so that the LRU pages no longer pin the memcgs.
This patchset makes the LRU pages drop their reference to the memory
cgroup by using the obj_cgroup APIs. With it applied, the number of dying
cgroups no longer increases when running the following test script.
#!/bin/bash
dd if=/dev/zero of=temp bs=4096 count=1
cat /proc/cgroups | grep memory
for i in {0..2000}
do
	mkdir /sys/fs/cgroup/memory/test$i
	echo $$ > /sys/fs/cgroup/memory/test$i/cgroup.procs
	cat temp >> log
	echo $$ > /sys/fs/cgroup/memory/cgroup.procs
	rmdir /sys/fs/cgroup/memory/test$i
done
cat /proc/cgroups | grep memory
rm -f temp log
This patch (of 11):
No-hierarchy mode has been deprecated since
commit
bef8620cd8e0 ("mm: memcg: deprecate the non-hierarchical mode")
so parent_mem_cgroup() can only return NULL for the root memcg. The root
memcg cannot be offlined, so it is safe to drop the checks on the return
value of parent_mem_cgroup(). Remove that dead code.
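The reasoning above can be illustrated with a small userspace sketch. It is
not kernel code: struct toy_memcg, toy_parent() and toy_nearest_online() are
hypothetical stand-ins that model only the parent link and an online flag,
under the assumption that the root is always online and is the only node
without a parent.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mem_cgroup; only what the argument needs. */
struct toy_memcg {
	struct toy_memcg *parent;	/* NULL only for the root */
	int online;			/* assumed to be always 1 for the root */
};

/* Same contract as the updated comment: the parent, or NULL for the root. */
static struct toy_memcg *toy_parent(struct toy_memcg *memcg)
{
	return memcg->parent;
}

/*
 * Walk up to the nearest online ancestor (or the node itself if online).
 * No "fall back to root" branch is needed: the loop only asks for the
 * parent of an offline node, an offline node is never the root, and so
 * the parent is never NULL.
 */
static struct toy_memcg *toy_nearest_online(struct toy_memcg *memcg)
{
	while (!memcg->online) {
		memcg = toy_parent(memcg);
		assert(memcg != NULL);	/* the invariant the removed checks guarded */
	}
	return memcg;
}
```

Starting from an offline leaf whose only online ancestor is the root, the
walk ends at the root without ever seeing a NULL parent, which is why the
removed "if (!parent) parent = root_mem_cgroup;" branches were unreachable.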
The comment in memcg_offline_kmem() above memcg_reparent_list_lrus() has
been out of date since
commit
5abc1e37afa0 ("mm: list_lru: allocate list_lru_one only when needed")
There is no ordering requirement between memcg_reparent_list_lrus() and
memcg_reparent_objcgs(), so remove the outdated comment.
Link: https://lkml.kernel.org/r/20220621125658.64935-1-songmuchun@bytedance.com
Link: https://lkml.kernel.org/r/20220621125658.64935-2-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
Cc: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* parent_mem_cgroup - find the accounting parent of a memcg
* @memcg: memcg whose parent to find
*
- * Returns the parent memcg, or NULL if this is the root or the memory
- * controller is in legacy no-hierarchy mode.
+ * Returns the parent memcg, or NULL if this is the root.
*/
static inline struct mem_cgroup *parent_mem_cgroup(struct mem_cgroup *memcg)
{
return;
parent = parent_mem_cgroup(memcg);
- if (!parent)
- parent = root_mem_cgroup;
-
memcg_reparent_objcgs(memcg, parent);
-
- /*
- * After we have finished memcg_reparent_objcgs(), all list_lrus
- * corresponding to this cgroup are guaranteed to remain empty.
- * The ordering is imposed by list_lru_node->lock taken by
- * memcg_reparent_list_lrus().
- */
memcg_reparent_list_lrus(memcg, parent);
}
#else
break;
}
memcg = parent_mem_cgroup(memcg);
- if (!memcg)
- memcg = root_mem_cgroup;
}
return memcg;
}
{
int i, nid;
long nr;
- struct mem_cgroup *parent;
+ struct mem_cgroup *parent = parent_mem_cgroup(memcg);
struct shrinker_info *child_info, *parent_info;
- parent = parent_mem_cgroup(memcg);
- if (!parent)
- parent = root_mem_cgroup;
-
/* Prevent from concurrent shrinker_info expand */
down_read(&shrinker_rwsem);
for_each_node(nid) {