From 44f65d900698278a8451988abe0d5ca37fd46882 Mon Sep 17 00:00:00 2001 From: Jeff Xu Date: Tue, 6 Aug 2024 21:49:27 +0000 Subject: [PATCH] binfmt_elf: mseal address zero MIME-Version: 1.0 Content-Type: text/plain; charset=utf8 Content-Transfer-Encoding: 8bit In load_elf_binary as part of the execve(), when the current task’s personality has MMAP_PAGE_ZERO set, the kernel allocates one page at address 0. According to the comment: /* Why this, you ask??? Well SVr4 maps page 0 as read-only, and some applications "depend" upon this behavior. Since we do not have the power to recompile these, we emulate the SVr4 behavior. Sigh. */ At one point, Linus suggested removing this [1]. Code search in debian didn't see much use of MMAP_PAGE_ZERO [2], it exists in util and test (rr). Sealing this is probably safe, the comment doesn't say the app ever wanting to change the mapping to rwx. Sealing also ensures that never happens. If there is a complaint, we can make this configurable. Link: https://lore.kernel.org/lkml/CAHk-=whVa=nm_GW=NVfPHqcxDbWt4JjjK1YWb0cLjO4ZSGyiDA@mail.gmail.com/ [1] Link: https://codesearch.debian.net/search?q=MMAP_PAGE_ZERO&literal=1&perpkg=1&page=1 [2] Signed-off-by: Jeff Xu Link: https://lore.kernel.org/r/20240806214931.2198172-2-jeffxu@google.com Signed-off-by: Kees Cook --- fs/binfmt_elf.c | 5 +++++ include/linux/mm.h | 10 ++++++++++ mm/mseal.c | 2 +- 3 files changed, 16 insertions(+), 1 deletion(-) diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index bf9bfd1a0007..c3aa700aeb3c 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -1314,6 +1314,11 @@ out_free_interp: emulate the SVr4 behavior. Sigh. */ error = vm_mmap(NULL, 0, PAGE_SIZE, PROT_READ | PROT_EXEC, MAP_FIXED | MAP_PRIVATE, 0); + + retval = do_mseal(0, PAGE_SIZE, 0); + if (retval) + pr_warn_ratelimited("pid=%d, couldn't seal address 0, ret=%d.\n", + task_pid_nr(current), retval); } regs = current_pt_regs(); diff --git a/include/linux/mm.h b/include/linux/mm.h index c4b238a20b76..a178c15812eb 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -4201,4 +4201,14 @@ void vma_pgtable_walk_end(struct vm_area_struct *vma); int reserve_mem_find_by_name(const char *name, phys_addr_t *start, phys_addr_t *size); +#ifdef CONFIG_64BIT +int do_mseal(unsigned long start, size_t len_in, unsigned long flags); +#else +static inline int do_mseal(unsigned long start, size_t len_in, unsigned long flags) +{ + /* noop on 32 bit */ + return 0; +} +#endif + #endif /* _LINUX_MM_H */ diff --git a/mm/mseal.c b/mm/mseal.c index bf783bba8ed0..7a40a84569c8 100644 --- a/mm/mseal.c +++ b/mm/mseal.c @@ -248,7 +248,7 @@ static int apply_mm_seal(unsigned long start, unsigned long end) * * unseal() is not supported. */ -static int do_mseal(unsigned long start, size_t len_in, unsigned long flags) +int do_mseal(unsigned long start, size_t len_in, unsigned long flags) { size_t len; int ret = 0; -- 2.50.1