www.infradead.org Git - users/willy/pagecache.git/commitdiff
pid: Do not set pid_max in new pid namespaces
author Michal Koutný <mkoutny@suse.com>
Wed, 5 Mar 2025 14:58:49 +0000 (15:58 +0100)
committer Christian Brauner <brauner@kernel.org>
Thu, 6 Mar 2025 09:18:36 +0000 (10:18 +0100)
It is already difficult for users to troubleshoot which of multiple pid
limits restricts their workload. The per-(hierarchical-)NS pid_max would
contribute to the confusion.
Also, the implementation copies the limit from the parent upon creation.
This pattern proved cumbersome with some attributes in legacy cgroup
controllers -- it is subject to a race between modification of the
parent's limit and creation of children, and once copied, the limit must
be changed separately in the descendant.

Let's do what other places do (ucounts or cgroup limits) -- create new
pid namespaces without any limit at all. The global limit (actually any
ancestor's limit) is still effectively in place. We avoid the
set/unshare race, and bumps of the global (ancestral) limit have the
desired effect on pid namespaces that do not set their own limit.
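
The difference between the two creation policies can be illustrated with a
small userspace model. This is only a sketch, not kernel code: struct
toy_pidns, the toy_* helpers, and TOY_PID_MAX_LIMIT (assumed here to be the
usual 64-bit value of PID_MAX_LIMIT) are hypothetical names introduced for
illustration. The "effective" limit is modelled simply as the tightest
pid_max along the ancestor chain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's PID_MAX_LIMIT (assumed 64-bit value). */
#define TOY_PID_MAX_LIMIT (4 * 1024 * 1024)

/* Toy model of a pid namespace: only what the illustration needs. */
struct toy_pidns {
	struct toy_pidns *parent;
	int pid_max;
};

/* Old behaviour: snapshot the parent's limit at creation time. */
static void toy_create_copy(struct toy_pidns *ns, struct toy_pidns *parent)
{
	assert(parent != NULL);
	ns->parent = parent;
	ns->pid_max = parent->pid_max;
}

/* New behaviour: the namespace starts without a limit of its own. */
static void toy_create_unlimited(struct toy_pidns *ns, struct toy_pidns *parent)
{
	assert(parent != NULL);
	ns->parent = parent;
	ns->pid_max = TOY_PID_MAX_LIMIT;
}

/* Tightest limit found while walking from ns up to the root. */
static int toy_effective_pid_max(const struct toy_pidns *ns)
{
	int limit = TOY_PID_MAX_LIMIT;

	for (; ns != NULL; ns = ns->parent)
		if (ns->pid_max < limit)
			limit = ns->pid_max;
	return limit;
}
```

In this model, if the root starts at 32768, two children are created (one per
policy), and the root is then raised to 100000, the copied child remains
capped at 32768 while the unlimited child follows the root to 100000.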

Link: https://lore.kernel.org/r/20240408145819.8787-1-mkoutny@suse.com/
Link: https://lore.kernel.org/r/20250221170249.890014-1-mkoutny@suse.com/
Fixes: 7863dcc72d0f4 ("pid: allow pid_max to be set per pid namespace")
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Link: https://lore.kernel.org/r/20250305145849.55491-1-mkoutny@suse.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
kernel/pid_namespace.c

index 8f6cfec87555a3af47566b19d853306371106690..7098ed44e717d3da8f518f3826cb8047bb744786 100644
@@ -107,7 +107,7 @@ static struct pid_namespace *create_pid_namespace(struct user_namespace *user_ns
                goto out_free_idr;
        ns->ns.ops = &pidns_operations;
 
-       ns->pid_max = parent_pid_ns->pid_max;
+       ns->pid_max = PID_MAX_LIMIT;
        err = register_pidns_sysctls(ns);
        if (err)
                goto out_free_inum;