www.infradead.org Git - users/jedix/linux-maple.git/commit
bpf: cpumap: switch to napi_skb_cache_get_bulk()
author Alexander Lobakin <aleksander.lobakin@intel.com>
Tue, 25 Feb 2025 17:17:48 +0000 (18:17 +0100)
committer Paolo Abeni <pabeni@redhat.com>
Thu, 27 Feb 2025 13:03:39 +0000 (14:03 +0100)
commit ed16b8a4d1ca901fc13ced042b76dde54738249a
tree faee17265fb8866bfad53332bcec8b039781d087
parent 859d6acd94cc4ad65e9eb3fa2a9815a19e5b35cf

Now that cpumap uses GRO, which drops unused skb heads to the NAPI
cache, use napi_skb_cache_get_bulk() to try to reuse cached entries
and lower MM layer pressure. Always disable the BH before checking and
running the cpumap-pinned XDP prog and don't re-enable it in between
that and allocating an skb bulk, as we can access the NAPI caches only
from the BH context.
The better GRO aggregates packets, the fewer new skbs will be allocated.
If an aggregated skb contains 16 frags, this means 15 skbs were returned
to the cache, so the next 15 skbs will be built without allocating anything.

The same trafficgen UDP GRO test now shows:

                GRO off   GRO on
threaded GRO    2.3       4         Mpps
thr bulk GRO    2.4       4.7       Mpps

diff            +4        +17       %

Compared to the baseline cpumap:

baseline        2.7       N/A       Mpps
thr bulk GRO    2.4       4.7       Mpps
diff            -11       +74       %

Tested-by: Daniel Xu <dxu@dxuuu.xyz>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
kernel/bpf/cpumap.c