Patch series "ksm: support tracking KSM-placed zero-pages", v6.
The core idea of this patch set is to enable users to perceive the number
of pages merged by KSM, regardless of whether the use_zero_pages switch
has been turned on, so that users can know how much of the free-memory
increase is really due to their madvise(MERGEABLE) actions. The problem
is that when use_zero_pages is enabled, all empty pages are merged with
the kernel zero pages instead of with each other (as they would be with
use_zero_pages disabled), and these zero pages are then no longer
monitored by KSM.
There are three motivations for this work:
1) MADV_UNMERGEABLE and other ways to trigger unsharing will *not*
unshare the shared zeropage as placed by KSM (which is against the
MADV_UNMERGEABLE documentation at least); see the link:
https://lore.kernel.org/lkml/4a3daba6-18f9-d252-697c-197f65578c44@redhat.com/
2) We cannot know how many pages are zero pages placed by KSM when
use_zero_pages is enabled, which hides the critical information about
how much actual memory is really saved by KSM. Knowing how many
KSM-placed zero pages there are helps users apply the madvise(MERGEABLE)
policy better, because they can see the actual profit brought by KSM.
3) The zero pages placed by KSM are different from the initial empty pages
(filled with zeros) that are never touched by applications. The former
are actively merged by KSM while the latter never consume actual memory.
use_zero_pages is useful not only because of cache colouring, as described
in the documentation, but also because it can accelerate the merging of
empty pages: when there are plenty of empty pages (full of zeros), the
time spent on page-by-page comparisons (unstable_tree_search_insert) is
saved. So we hope to implement support for tracking KSM zero pages
without affecting the use_zero_pages feature.
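For reference, this acceleration comes from the use_zero_pages fast path
in cmp_and_merge_page(); the fragment below is a simplified, illustrative
sketch of that logic (locking, VMA lookup and error handling trimmed), not
the exact upstream code:

	checksum = calc_checksum(page);
	if (ksm_use_zero_pages && checksum == zero_checksum) {
		/*
		 * The page is full of zeros: try to map the kernel shared
		 * zeropage in place of it, skipping the unstable tree (and
		 * its page-by-page memcmp) entirely.
		 */
		err = try_to_merge_one_page(vma, page,
					    ZERO_PAGE(rmap_item->address));
		if (!err)
			return;	/* merged with the zero page; done */
	}
	/* Not an empty page (or merging failed): fall back to the trees. */
	tree_rmap_item = unstable_tree_search_insert(rmap_item, page,
						     &tree_page);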
Zero pages may be the most common merged pages in real environments (not
only VMs but also other applications such as containers). Enabling
use_zero_pages in environments with plenty of empty pages (full of zeros)
is therefore very useful. Users and application developers can also
benefit from knowing the proportion of zero pages among all merged pages,
so as to optimize their applications.
With this patch series, we can both unshare KSM-placed zero pages
accurately and count KSM zero pages even with use_zero_pages enabled.
This patch (of 6):
A new function, try_to_get_old_rmap_item(), is abstracted from
get_next_rmap_item(). This function will be reused by the subsequent
patches on counting ksm_zero_pages.
The patch improves the readability and reusability of the KSM code.
Link: https://lkml.kernel.org/r/202302100915227721315@zte.com.cn
Link: https://lkml.kernel.org/r/202212300911327101708@zte.com.cn
Link: https://lkml.kernel.org/r/202212300912449061763@zte.com.cn
Signed-off-by: xu xin <xu.xin16@zte.com.cn>
Reviewed-by: Xiaokai Ran <ran.xiaokai@zte.com.cn>
Reviewed-by: Yang Yang <yang.yang29@zte.com.cn>
Cc: David Hildenbrand <david@redhat.com>
Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: Xuexin Jiang <jiang.xuexin@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
}
}
-static struct ksm_rmap_item *get_next_rmap_item(struct ksm_mm_slot *mm_slot,
- struct ksm_rmap_item **rmap_list,
- unsigned long addr)
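+/*
+ * Look up the rmap_item matching @addr in *rmap_list, removing and
+ * freeing any stale rmap_items (whose addresses precede @addr) that are
+ * passed over on the way.
+ */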
+static struct ksm_rmap_item *try_to_get_old_rmap_item(unsigned long addr,
+ struct ksm_rmap_item **rmap_list)
{
- struct ksm_rmap_item *rmap_item;
-
while (*rmap_list) {
- rmap_item = *rmap_list;
+ struct ksm_rmap_item *rmap_item = *rmap_list;
if ((rmap_item->address & PAGE_MASK) == addr)
return rmap_item;
if (rmap_item->address > addr)
break;
*rmap_list = rmap_item->rmap_list;
+		/* Reaching here indicates the rmap_item's VMA has been set UNMERGEABLE */
remove_rmap_item_from_tree(rmap_item);
free_rmap_item(rmap_item);
}
+ return NULL;
+}
+
+static struct ksm_rmap_item *get_next_rmap_item(struct ksm_mm_slot *mm_slot,
+ struct ksm_rmap_item **rmap_list,
+ unsigned long addr)
+{
+ struct ksm_rmap_item *rmap_item;
+
+	/* Look up whether we have an old rmap_item matching the addr */
+ rmap_item = try_to_get_old_rmap_item(addr, rmap_list);
+ if (rmap_item)
+ return rmap_item;
+
+ /* Need to allocate a new rmap_item */
rmap_item = alloc_rmap_item();
if (rmap_item) {
/* It has already been zeroed */