Alexei Starovoitov [Wed, 1 Dec 2021 18:10:39 +0000 (10:10 -0800)]
selftests/bpf: Revert CO-RE removal in test_ksyms_weak.
The commit 087cba799ced ("selftests/bpf: Add weak/typeless ksym test for light skeleton")
added test_ksyms_weak to light skeleton testing, but remove CO-RE access.
Revert that part of commit, since light skeleton can use CO-RE in the kernel.
Alexei Starovoitov [Wed, 1 Dec 2021 18:10:38 +0000 (10:10 -0800)]
selftests/bpf: Additional test for CO-RE in the kernel.
Add a test where randmap() function is appended to three different bpf
programs. That action checks struct bpf_core_relo replication logic
and offset adjustment in gen loader part of libbpf.
Fourth bpf program has 360 CO-RE relocations from vmlinux, bpf_testmod,
and non-existing type. It tests candidate cache logic.
Alexei Starovoitov [Wed, 1 Dec 2021 18:10:32 +0000 (10:10 -0800)]
libbpf: Use CO-RE in the kernel in light skeleton.
Without lskel the CO-RE relocations are processed by libbpf before any other
work is done. Instead, when lskel is needed, remember relocation as RELO_CORE
kind. Then when loader prog is generated for a given bpf program pass CO-RE
relos of that program to gen loader via bpf_gen__record_relo_core(). The gen
loader will remember them as-is and pass it later as-is into the kernel.
The normal libbpf flow is to process CO-RE early before call relos happen. In
case of gen_loader the core relos have to be added to other relos to be copied
together when bpf static function is appended in different places to other main
bpf progs. During the copy the append_subprog_relos() will adjust insn_idx for
normal relos and for RELO_CORE kind too. When that is done each struct
reloc_desc has good relos for specific main prog.
Alexei Starovoitov [Wed, 1 Dec 2021 18:10:31 +0000 (10:10 -0800)]
bpf: Add bpf_core_add_cands() and wire it into bpf_core_apply_relo_insn().
Given BPF program's BTF root type name perform the following steps:
. search in vmlinux candidate cache.
. if (present in cache and candidate list >= 1) return candidate list.
. do a linear search through kernel BTFs for possible candidates.
. regardless of number of candidates found populate vmlinux cache.
. if (candidate list >= 1) return candidate list.
. search in module candidate cache.
. if (present in cache) return candidate list (even if list is empty).
. do a linear search through BTFs of all kernel modules
collecting candidates from all of them.
. regardless of number of candidates found populate module cache.
. return candidate list.
Then wire the result into bpf_core_apply_relo_insn().
When BPF program is trying to CO-RE relocate a type
that doesn't exist in either vmlinux BTF or in modules BTFs
these steps will perform 2 cache lookups when cache is hit.
Note the cache doesn't prevent the abuse by the program that might
have lots of relocations that cannot be resolved. Hence cond_resched().
CO-RE in the kernel requires CAP_BPF, since BTF loading requires it.
Alexei Starovoitov [Wed, 1 Dec 2021 18:10:29 +0000 (10:10 -0800)]
bpf: Adjust BTF log size limit.
Make BTF log size limit to be the same as the verifier log size limit.
Otherwise tools that progressively increase log size and use the same log
for BTF loading and program loading will be hitting hard to debug EINVAL.
Alexei Starovoitov [Wed, 1 Dec 2021 18:10:28 +0000 (10:10 -0800)]
bpf: Pass a set of bpf_core_relo-s to prog_load command.
struct bpf_core_relo is generated by llvm and processed by libbpf.
It's a de-facto uapi.
With CO-RE in the kernel the struct bpf_core_relo becomes uapi de-jure.
Add an ability to pass a set of 'struct bpf_core_relo' to prog_load command
and let the kernel perform CO-RE relocations.
Note the struct bpf_line_info and struct bpf_func_info have the same
layout when passed from LLVM to libbpf and from libbpf to the kernel
except "insn_off" fields means "byte offset" when LLVM generates it.
Then libbpf converts it to "insn index" to pass to the kernel.
The struct bpf_core_relo's "insn_off" field is always "byte offset".
Alexei Starovoitov [Wed, 1 Dec 2021 18:10:27 +0000 (10:10 -0800)]
bpf: Define enum bpf_core_relo_kind as uapi.
enum bpf_core_relo_kind is generated by llvm and processed by libbpf.
It's a de-facto uapi.
With CO-RE in the kernel the bpf_core_relo_kind values become uapi de-jure.
Also rename them with BPF_CORE_ prefix to distinguish from conflicting names in
bpf_core_read.h. The enums bpf_field_info_kind, bpf_type_id_kind,
bpf_type_info_kind, bpf_enum_value_kind are passing different values from bpf
program into llvm.
Alexei Starovoitov [Wed, 1 Dec 2021 18:10:26 +0000 (10:10 -0800)]
bpf: Prepare relo_core.c for kernel duty.
Make relo_core.c to be compiled for the kernel and for user space libbpf.
Note the patch is reducing BPF_CORE_SPEC_MAX_LEN from 64 to 32.
This is the maximum number of nested structs and arrays.
For example:
struct sample {
int a;
struct {
int b[10];
};
};
struct sample *s = ...;
int *y = &s->b[5];
This field access is encoded as "0:1:0:5" and spec len is 4.
Alexei Starovoitov [Wed, 1 Dec 2021 18:10:25 +0000 (10:10 -0800)]
bpf: Rename btf_member accessors.
Rename btf_member_bit_offset() and btf_member_bitfield_size() to
avoid conflicts with similarly named helpers in libbpf's btf.h.
Rename the kernel helpers, since libbpf helpers are part of uapi.
Alexei Starovoitov [Wed, 1 Dec 2021 18:10:24 +0000 (10:10 -0800)]
libbpf: Replace btf__type_by_id() with btf_type_by_id().
To prepare relo_core.c to be compiled in the kernel and the user space
replace btf__type_by_id with btf_type_by_id.
In libbpf btf__type_by_id and btf_type_by_id have different behavior.
bpf_core_apply_relo_insn() needs behavior of uapi btf__type_by_id
vs internal btf_type_by_id, but type_id range check is already done
in bpf_core_apply_relo(), so it's safe to replace it everywhere.
The kernel btf_type_by_id() does the check anyway. It doesn't hurt.
Alexander Lobakin [Wed, 1 Dec 2021 16:49:31 +0000 (17:49 +0100)]
samples: bpf: Fix conflicting types in fds_example
Fix the following samples/bpf build error appeared after the
introduction of bpf_map_create() in libbpf:
CC samples/bpf/fds_example.o
samples/bpf/fds_example.c:49:12: error: static declaration of 'bpf_map_create' follows non-static declaration
static int bpf_map_create(void)
^
samples/bpf/libbpf/include/bpf/bpf.h:55:16: note: previous declaration is here
LIBBPF_API int bpf_map_create(enum bpf_map_type map_type,
^
samples/bpf/fds_example.c:82:23: error: too few arguments to function call, expected 6, have 0
fd = bpf_map_create();
~~~~~~~~~~~~~~ ^
samples/bpf/libbpf/include/bpf/bpf.h:55:16: note: 'bpf_map_create' declared here
LIBBPF_API int bpf_map_create(enum bpf_map_type map_type,
^
2 errors generated.
fds_example by accident has a static function with the same name.
It's not worth it to separate a single call into its own function,
so just embed it.
Fixes: 992c4225419a ("libbpf: Unify low-level map creation APIs w/ new bpf_map_create()") Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20211201164931.47357-1-alexandr.lobakin@intel.com
Hou Tao [Wed, 1 Dec 2021 07:34:57 +0000 (15:34 +0800)]
bpf: Clean-up bpf_verifier_vlog() for BPF_LOG_KERNEL log level
An extra newline will output for bpf_log() with BPF_LOG_KERNEL level
as shown below:
[ 52.095704] BPF:The function test_3 has 12 arguments. Too many.
[ 52.095704]
[ 52.096896] Error in parsing func ptr test_3 in struct bpf_dummy_ops
Now all bpf_log() are ended by newline, but not all btf_verifier_log()
are ended by newline, so checking whether or not the log message
has the trailing newline and adding a newline if not.
Also there is no need to calculate the left userspace buffer size
for kernel log output and to truncate the output by '\0' which
has already been done by vscnprintf(), so only do these for
userspace log output.
Andrii Nakryiko [Tue, 30 Nov 2021 23:48:15 +0000 (15:48 -0800)]
Merge branch 'Apply suggestions for typeless/weak ksym series'
Kumar Kartikeya says:
====================
Three commits addressing comments for the typeless/weak ksym set. No functional
change intended. Hopefully this is simpler to read for kfunc as well.
====================
libbpf: Avoid reload of imm for weak, unresolved, repeating ksym
Alexei pointed out that we can use BPF_REG_0 which already contains imm
from move_blob2blob computation. Note that we now compare the second
insn's imm, but this should not matter, since both will be zeroed out
for the error case for the insn populated earlier.
Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20211122235733.634914-4-memxor@gmail.com
libbpf: Avoid double stores for success/failure case of ksym relocations
Instead, jump directly to success case stores in case ret >= 0, else do
the default 0 value store and jump over the success case. This is better
in terms of readability. Readjust the code for kfunc relocation as well
to follow a similar pattern, also leads to easier to follow code now.
Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20211122235733.634914-3-memxor@gmail.com
bpf: Change bpf_kallsyms_lookup_name size type to ARG_CONST_SIZE_OR_ZERO
Andrii mentioned in [0] that switching to ARG_CONST_SIZE_OR_ZERO lets
user avoid having to prove that string size at runtime is not zero and
helps with not having to supress clang optimizations.
Alexei Starovoitov [Tue, 30 Nov 2021 18:56:28 +0000 (10:56 -0800)]
Merge branch 'Add bpf_loop helper'
Joanne Koong says:
====================
This patchset add a new helper, bpf_loop.
One of the complexities of using for loops in bpf programs is that the verifier
needs to ensure that in every possibility of the loop logic, the loop will always
terminate. As such, there is a limit on how many iterations the loop can do.
The bpf_loop helper moves the loop logic into the kernel and can thereby
guarantee that the loop will always terminate. The bpf_loop helper simplifies
a lot of the complexity the verifier needs to check, as well as removes the
constraint on the number of loops able to be run.
From the test results, we see that using bpf_loop in place
of the traditional for loop led to a decrease in verification time
and number of bpf instructions by ~99%. The benchmark results show
that as the number of iterations increases, the overhead per iteration
decreases.
The high-level overview of the patches -
Patch 1 - kernel-side + API changes for adding bpf_loop
Patch 2 - tests
Patch 3 - use bpf_loop in strobemeta + pyperf600 and measure verifier performance
Patch 4 - benchmark for throughput + latency of bpf_loop call
v3 -> v4:
~ Address nits: use usleep for triggering bpf programs, fix copyright style
v2 -> v3:
~ Rerun benchmarks on physical machine, update results
~ Propagate original error codes in the verifier
v1 -> v2:
~ Change helper name to bpf_loop (instead of bpf_for_each)
~ Set max nr_loops (~8 million loops) for bpf_loop call
~ Split tests + strobemeta/pyperf600 changes into two patches
~ Add new ops_report_final helper for outputting throughput and latency
====================
>From this data, we can see that the latency per loop decreases as the
number of loops increases. On this particular machine, each loop had an
overhead of about ~4 ns, and we were able to run ~250 million loops
per second.
Joanne Koong [Tue, 30 Nov 2021 03:06:19 +0000 (19:06 -0800)]
bpf: Add bpf_loop helper
This patch adds the kernel-side and API changes for a new helper
function, bpf_loop:
long bpf_loop(u32 nr_loops, void *callback_fn, void *callback_ctx,
u64 flags);
where long (*callback_fn)(u32 index, void *ctx);
bpf_loop invokes the "callback_fn" **nr_loops** times or until the
callback_fn returns 1. The callback_fn can only return 0 or 1, and
this is enforced by the verifier. The callback_fn index is zero-indexed.
A few things to please note:
~ The "u64 flags" parameter is currently unused but is included in
case a future use case for it arises.
~ In the kernel-side implementation of bpf_loop (kernel/bpf/bpf_iter.c),
bpf_callback_t is used as the callback function cast.
~ A program can have nested bpf_loop calls but the program must
still adhere to the verifier constraint of its stack depth (the stack depth
cannot exceed MAX_BPF_STACK))
~ Recursive callback_fns do not pass the verifier, due to the call stack
for these being too deep.
~ The next patch will include the tests and benchmark
Christoph Hellwig [Fri, 19 Nov 2021 16:32:15 +0000 (17:32 +0100)]
bpf, docs: Split general purpose eBPF documentation out of filter.rst
filter.rst starts out documenting the classic BPF and then spills into
introducing and documentating eBPF. Move the eBPF documentation into
rwo new files under Documentation/bpf/ for the instruction set and
the verifier and link to the BPF documentation from filter.rst.
Christoph Hellwig [Fri, 19 Nov 2021 16:32:13 +0000 (17:32 +0100)]
bpf, docs: Prune all references to "internal BPF"
The eBPF name has completely taken over from eBPF in general usage for
the actual eBPF representation, or BPF for any general in-kernel use.
Prune all remaining references to "internal BPF".
Alan Maguire [Mon, 29 Nov 2021 10:00:40 +0000 (10:00 +0000)]
libbpf: Silence uninitialized warning/error in btf_dump_dump_type_data
When compiling libbpf with gcc 4.8.5, we see:
CC staticobjs/btf_dump.o
btf_dump.c: In function ‘btf_dump_dump_type_data.isra.24’:
btf_dump.c:2296:5: error: ‘err’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
if (err < 0)
^
cc1: all warnings being treated as errors
make: *** [staticobjs/btf_dump.o] Error 1
While gcc 4.8.5 is too old to build the upstream kernel, it's possible it
could be used to build standalone libbpf which suffers from the same problem.
Silence the error by initializing 'err' to 0. The warning/error seems to be
a false positive since err is set early in the function. Regardless we
shouldn't prevent libbpf from building for this.
Tiezhu Yang [Thu, 25 Nov 2021 01:36:07 +0000 (09:36 +0800)]
bpf, mips: Fix build errors about __NR_bpf undeclared
Add the __NR_bpf definitions to fix the following build errors for mips:
$ cd tools/bpf/bpftool
$ make
[...]
bpf.c:54:4: error: #error __NR_bpf not defined. libbpf does not support your arch.
# error __NR_bpf not defined. libbpf does not support your arch.
^~~~~
bpf.c: In function ‘sys_bpf’:
bpf.c:66:17: error: ‘__NR_bpf’ undeclared (first use in this function); did you mean ‘__NR_brk’?
return syscall(__NR_bpf, cmd, attr, size);
^~~~~~~~
__NR_brk
[...]
In file included from gen_loader.c:15:0:
skel_internal.h: In function ‘skel_sys_bpf’:
skel_internal.h:53:17: error: ‘__NR_bpf’ undeclared (first use in this function); did you mean ‘__NR_brk’?
return syscall(__NR_bpf, cmd, attr, size);
^~~~~~~~
__NR_brk
Andrii Nakryiko [Wed, 24 Nov 2021 00:23:21 +0000 (16:23 -0800)]
selftests/bpf: Prevent misaligned memory access in get_stack_raw_tp test
Perfbuf doesn't guarantee 8-byte alignment of the data like BPF ringbuf
does, so struct get_stack_trace_t can arrive not properly aligned for
subsequent u64 accesses. Easiest fix is to just copy data locally.
Andrii Nakryiko [Wed, 24 Nov 2021 00:23:19 +0000 (16:23 -0800)]
selftests/bpf: Fix UBSan complaint about signed __int128 overflow
Test is using __int128 variable as unsigned and highest order bit can be
set to 1 after bit shift. Use unsigned __int128 explicitly and prevent
UBSan from complaining.
Andrii Nakryiko [Wed, 24 Nov 2021 00:23:18 +0000 (16:23 -0800)]
libbpf: Fix using invalidated memory in bpf_linker
add_dst_sec() can invalidate bpf_linker's section index making
dst_symtab pointer pointing into unallocated memory. Reinitialize
dst_symtab pointer on each iteration to make sure it's always valid.
Andrii Nakryiko [Wed, 24 Nov 2021 00:23:17 +0000 (16:23 -0800)]
libbpf: Fix glob_syms memory leak in bpf_linker
glob_syms array wasn't freed on bpf_link__free(). Fix that.
Fixes: a46349227cd8 ("libbpf: Add linker extern resolution support for functions and global variables") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211124002325.1737739-6-andrii@kernel.org
Andrii Nakryiko [Wed, 24 Nov 2021 00:23:16 +0000 (16:23 -0800)]
libbpf: Don't call libc APIs with NULL pointers
Sanitizer complains about qsort(), bsearch(), and memcpy() being called
with NULL pointer. This can only happen when the associated number of
elements is zero, so no harm should be done. But still prevent this from
happening to keep sanitizer runs clean from extra noise.
Andrii Nakryiko [Wed, 24 Nov 2021 00:23:14 +0000 (16:23 -0800)]
libbpf: Fix potential misaligned memory access in btf_ext__new()
Perform a memory copy before we do the sanity checks of btf_ext_hdr.
This prevents misaligned memory access if raw btf_ext data is not 4-byte
aligned ([0]).
Andrii Nakryiko [Wed, 24 Nov 2021 19:32:33 +0000 (11:32 -0800)]
selftests/bpf: Migrate selftests to bpf_map_create()
Conversion is straightforward for most cases. In few cases tests are
using mutable map_flags and attribute structs, but bpf_map_create_opts
can be used in the similar fashion, so there were no problems. Just lots
of repetitive conversions.
Andrii Nakryiko [Wed, 24 Nov 2021 19:32:32 +0000 (11:32 -0800)]
libbpf: Prevent deprecation warnings in xsk.c
xsk.c is using own APIs that are marked for deprecation internally.
Given xsk.c and xsk.h will be gone in libbpf 1.0, there is no reason to
do public vs internal function split just to avoid deprecation warnings.
So just add a pragma to silence deprecation warnings (until the code is
removed completely).
Andrii Nakryiko [Wed, 24 Nov 2021 19:32:30 +0000 (11:32 -0800)]
libbpf: Unify low-level map creation APIs w/ new bpf_map_create()
Mark the entire zoo of low-level map creation APIs for deprecation in
libbpf 0.7 ([0]) and introduce a new bpf_map_create() API that is
OPTS-based (and thus future-proof) and matches the BPF_MAP_CREATE
command name.
While at it, ensure that gen_loader sends map_extra field. Also remove
now unneeded btf_key_type_id/btf_value_type_id logic that libbpf is
doing anyways.
Andrii Nakryiko [Tue, 23 Nov 2021 20:01:05 +0000 (12:01 -0800)]
selftests/bpf: Mix legacy (maps) and modern (vars) BPF in one test
Add selftest that combines two BPF programs within single BPF object
file such that one of the programs is using global variables, but can be
skipped at runtime on old kernels that don't support global data.
Another BPF program is written with the goal to be runnable on very old
kernels and only relies on explicitly accessed BPF maps.
Such test, run against old kernels (e.g., libbpf CI will run it against 4.9
kernel that doesn't support global data), allows to test the approach
and ensure that libbpf doesn't make unnecessary assumption about
necessary kernel features.
Andrii Nakryiko [Tue, 23 Nov 2021 20:01:04 +0000 (12:01 -0800)]
libbpf: Load global data maps lazily on legacy kernels
Load global data maps lazily, if kernel is too old to support global
data. Make sure that programs are still correct by detecting if any of
the to-be-loaded programs have relocation against any of such maps.
This allows to solve the issue ([0]) with bpf_printk() and Clang
generating unnecessary and unreferenced .rodata.strX.Y sections, but it
also goes further along the CO-RE lines, allowing to have a BPF object
in which some code can work on very old kernels and relies only on BPF
maps explicitly, while other BPF programs might enjoy global variable
support. If such programs are correctly set to not load at runtime on
old kernels, bpf_object will load and function correctly now.
Fixes: aed659170a31 ("libbpf: Support multiple .rodata.* and .data.* BPF maps") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20211123200105.387855-1-andrii@kernel.org
Jiri Olsa [Wed, 17 Nov 2021 19:41:14 +0000 (11:41 -0800)]
selftests/bpf: Add btf_dedup case with duplicated structs within CU
Add an artificial minimal example simulating compilers producing two
different types within a single CU that correspond to identical struct
definitions.
Andrii Nakryiko [Wed, 17 Nov 2021 19:41:13 +0000 (11:41 -0800)]
libbpf: Accommodate DWARF/compiler bug with duplicated structs
According to [0], compilers sometimes might produce duplicate DWARF
definitions for exactly the same struct/union within the same
compilation unit (CU). We've had similar issues with identical arrays
and handled them with a similar workaround in 6b6e6b1d09aa ("libbpf:
Accomodate DWARF/compiler bug with duplicated identical arrays"). Do the
same for struct/union by ensuring that two structs/unions are exactly
the same, down to the integer values of field referenced type IDs.
Solving this more generically (allowing referenced types to be
equivalent, but using different type IDs, all within a single CU)
requires a huge complexity increase to handle many-to-many mappings
between canonidal and candidate type graphs. Before we invest in that,
let's see if this approach handles all the instances of this issue in
practice. Thankfully it's pretty rare, it seems.
Andrii Nakryiko [Thu, 18 Nov 2021 17:40:54 +0000 (09:40 -0800)]
libbpf: Add runtime APIs to query libbpf version
Libbpf provided LIBBPF_MAJOR_VERSION and LIBBPF_MINOR_VERSION macros to
check libbpf version at compilation time. This doesn't cover all the
needs, though, because version of libbpf that application is compiled
against doesn't necessarily match the version of libbpf at runtime,
especially if libbpf is used as a shared library.
Add libbpf_major_version() and libbpf_minor_version() returning major
and minor versions, respectively, as integers. Also add a convenience
libbpf_version_string() for various tooling using libbpf to print out
libbpf version in a human-readable form. Currently it will return
"v0.6", but in the future it can contains some extra information, so the
format itself is not part of a stable API and shouldn't be relied upon.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20211118174054.2699477-1-andrii@kernel.org
Ilya Leoshkevich [Thu, 18 Nov 2021 11:52:25 +0000 (12:52 +0100)]
selfetests/bpf: Adapt vmtest.sh to s390 libbpf CI changes
[1] added s390 support to libbpf CI and added an ${ARCH} prefix to a
number of paths and identifiers in libbpf GitHub repo, which vmtest.sh
relies upon. Update these and make use of the new s390 support.
Tirthendu Sarkar [Wed, 17 Nov 2021 12:36:13 +0000 (18:06 +0530)]
selftests/bpf: Fix xdpxceiver failures for no hugepages
xsk_configure_umem() needs hugepages to work in unaligned mode. So when
hugepages are not configured, 'unaligned' tests should be skipped which
is determined by the helper function hugepages_present(). This function
erroneously returns true with MAP_NORESERVE flag even when no hugepages
are configured. The removal of this flag fixes the issue.
The test TEST_TYPE_UNALIGNED_INV_DESC also needs to be skipped when
there are no hugepages. However, this was not skipped as there was no
check for presence of hugepages and hence was failing. The check to skip
the test has now been added.
Dave Tucker [Fri, 12 Nov 2021 21:17:24 +0000 (21:17 +0000)]
bpf, docs: Fix ordering of bpf documentation
This commit fixes the display of the BPF documentation in the sidebar
when rendered as HTML.
Before this patch, the sidebar would render as follows for some
sections:
| BPF Documentation
|- BPF Type Format (BTF)
|- BPF Type Format (BTF)
This was due to creating a heading in index.rst followed by
a sphinx toctree, where the file referenced carries the same
title as the section heading.
To fix this I applied a pattern that has been established in other
subfolders of Documentation:
1. Re-wrote index.rst to have a single toctree
2. Split the sections out in to their own files
Additionally maps.rst and programs.rst make use of a glob pattern to
include map_* or prog_* rst files in their toctree, meaning future map
or program type documentation will be automatically included.
Dave Tucker [Fri, 12 Nov 2021 21:17:22 +0000 (21:17 +0000)]
bpf, docs: Change underline in btf to match style guide
This changes the type of underline used to follow the guidelines in
Documentation/doc-guide/sphinx.rst which also ensures that the headings
are rendered at the correct level in the HTML sidebar
Add benchmark to measure overhead of uprobes and uretprobes. Also have
a baseline (no uprobe attached) benchmark.
On my dev machine, baseline benchmark can trigger 130M user_target()
invocations. When uprobe is attached, this falls to just 700K. With
uretprobe, we get down to 520K:
$ sudo ./bench trig-uprobe-base -a
Summary: hits 131.289 ± 2.872M/s
Tiezhu Yang [Fri, 5 Nov 2021 01:30:00 +0000 (09:30 +0800)]
bpf: Change value of MAX_TAIL_CALL_CNT from 32 to 33
In the current code, the actual max tail call count is 33 which is greater
than MAX_TAIL_CALL_CNT (defined as 32). The actual limit is not consistent
with the meaning of MAX_TAIL_CALL_CNT and thus confusing at first glance.
We can see the historical evolution from commit 04fd61ab36ec ("bpf: allow
bpf programs to tail-call other bpf programs") and commit f9dabe016b63
("bpf: Undo off-by-one in interpreter tail call count limit"). In order
to avoid changing existing behavior, the actual limit is 33 now, this is
reasonable.
After commit 874be05f525e ("bpf, tests: Add tail call test suite"), we can
see there exists failed testcase.
On all archs when CONFIG_BPF_JIT_ALWAYS_ON is not set:
# echo 0 > /proc/sys/net/core/bpf_jit_enable
# modprobe test_bpf
# dmesg | grep -w FAIL
Tail call error path, max count reached jited:0 ret 34 != 33 FAIL
On some archs:
# echo 1 > /proc/sys/net/core/bpf_jit_enable
# modprobe test_bpf
# dmesg | grep -w FAIL
Tail call error path, max count reached jited:1 ret 34 != 33 FAIL
Although the above failed testcase has been fixed in commit 18935a72eb25
("bpf/tests: Fix error in tail call limit tests"), it would still be good
to change the value of MAX_TAIL_CALL_CNT from 32 to 33 to make the code
more readable.
The 32-bit x86 JIT was using a limit of 32, just fix the wrong comments and
limit to 33 tail calls as the constant MAX_TAIL_CALL_CNT updated. For the
mips64 JIT, use "ori" instead of "addiu" as suggested by Johan Almbladh.
For the riscv JIT, use RV_REG_TCC directly to save one register move as
suggested by Björn Töpel. For the other implementations, no function changes,
it does not change the current limit 33, the new value of MAX_TAIL_CALL_CNT
can reflect the actual max tail call count, the related tail call testcases
in test_bpf module and selftests can work well for the interpreter and the
JIT.
Quentin Monnet [Mon, 15 Nov 2021 22:58:44 +0000 (22:58 +0000)]
selftests/bpf: Configure dir paths via env in test_bpftool_synctypes.py
Script test_bpftool_synctypes.py parses a number of files in the bpftool
directory (or even elsewhere in the repo) to make sure that the list of
types or options in those different files are consistent. Instead of
having fixed paths, let's make the directories configurable through
environment variable. This should make easier in the future to run the
script in a different setup, for example on an out-of-tree bpftool
mirror with a different layout.
Quentin Monnet [Mon, 15 Nov 2021 22:58:43 +0000 (22:58 +0000)]
bpftool: Update doc (use susbtitutions) and test_bpftool_synctypes.py
test_bpftool_synctypes.py helps detecting inconsistencies in bpftool
between the different list of types and options scattered in the
sources, the documentation, and the bash completion. For options that
apply to all bpftool commands, the script had a hardcoded list of
values, and would use them to check whether the man pages are
up-to-date. When writing the script, it felt acceptable to have this
list in order to avoid to open and parse bpftool's main.h every time,
and because the list of global options in bpftool doesn't change so
often.
However, this is prone to omissions, and we recently added a new
-l|--legacy option which was described in common_options.rst, but not
listed in the options summary of each manual page. The script did not
complain, because it keeps comparing the hardcoded list to the (now)
outdated list in the header file.
To address the issue, this commit brings the following changes:
- Options that are common to all bpftool commands (--json, --pretty, and
--debug) are moved to a dedicated file, and used in the definition of
a RST substitution. This substitution is used in the sources of all
the man pages.
- This list of common options is updated, with the addition of the new
-l|--legacy option.
- The script test_bpftool_synctypes.py is updated to compare:
- Options specific to a command, found in C files, for the
interactive help messages, with the same specific options from the
relevant man page for that command.
- Common options, checked just once: the list in main.h is
compared with the new list in substitutions.rst.
Quentin Monnet [Mon, 15 Nov 2021 22:58:42 +0000 (22:58 +0000)]
bpftool: Add SPDX tags to RST documentation files
Most files in the kernel repository have a SPDX tags. The files that
don't have such a tag (or another license boilerplate) tend to fall
under the GPL-2.0 license. In the past, bpftool's Makefile (for example)
has been marked as GPL-2.0 for that reason, when in fact all bpftool is
dual-licensed.
To prevent a similar confusion from happening with the RST documentation
files for bpftool, let's explicitly mark all files as dual-licensed.
... I needed to sync libbpf repo and manually copy libbpf sources to
pahole. To simplify process, I used BTF_KIND_RESTRICT to simulate the
BTF_KIND_TYPE_TAG with vmlinux build as "restrict" modifier is barely
used in kernel.
But this approach missed one case in dedup with structures where
BTF_KIND_RESTRICT is handled and BTF_KIND_TYPE_TAG is not handled in
btf_dedup_is_equiv(), and this will result in a pahole dedup failure.
This patch fixed this issue and a selftest is added in the subsequent
patch to test this scenario.
The other missed handling is in btf__resolve_size(). Currently the compiler
always emit like PTR->TYPE_TAG->... so in practice we don't hit the missing
BTF_KIND_TYPE_TAG handling issue with compiler generated code. But lets
add case BTF_KIND_TYPE_TAG in the switch statement to be future proof.
We've added 72 non-merge commits during the last 13 day(s) which contain
a total of 171 files changed, 2728 insertions(+), 1143 deletions(-).
The main changes are:
1) Add btf_type_tag attributes to bring kernel annotations like __user/__rcu to
BTF such that BPF verifier will be able to detect misuse, from Yonghong Song.
2) Big batch of libbpf improvements including various fixes, future proofing APIs,
and adding a unified, OPTS-based bpf_prog_load() low-level API, from Andrii Nakryiko.
3) Add ingress_ifindex to BPF_SK_LOOKUP program type for selectively applying the
programmable socket lookup logic to packets from a given netdev, from Mark Pashmfouroush.
4) Remove the 128M upper JIT limit for BPF programs on arm64 and add selftest to
ensure exception handling still works, from Russell King and Alan Maguire.
5) Add a new bpf_find_vma() helper for tracing to map an address to the backing
file such as shared library, from Song Liu.
6) Batch of various misc fixes to bpftool, fixing a memory leak in BPF program dump,
updating documentation and bash-completion among others, from Quentin Monnet.
7) Deprecate libbpf bpf_program__get_prog_info_linear() API and migrate its users as
the API is heavily tailored around perf and is non-generic, from Dave Marchevsky.
8) Enable libbpf's strict mode by default in bpftool and add a --legacy option as an
opt-out for more relaxed BPF program requirements, from Stanislav Fomichev.
9) Fix bpftool to use libbpf_get_error() to check for errors, from Hengqi Chen.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (72 commits)
bpftool: Use libbpf_get_error() to check error
bpftool: Fix mixed indentation in documentation
bpftool: Update the lists of names for maps and prog-attach types
bpftool: Fix indent in option lists in the documentation
bpftool: Remove inclusion of utilities.mak from Makefiles
bpftool: Fix memory leak in prog_dump()
selftests/bpf: Fix a tautological-constant-out-of-range-compare compiler warning
selftests/bpf: Fix an unused-but-set-variable compiler warning
bpf: Introduce btf_tracing_ids
bpf: Extend BTF_ID_LIST_GLOBAL with parameter for number of IDs
bpftool: Enable libbpf's strict mode by default
docs/bpf: Update documentation for BTF_KIND_TYPE_TAG support
selftests/bpf: Clarify llvm dependency with btf_tag selftest
selftests/bpf: Add a C test for btf_type_tag
selftests/bpf: Rename progs/tag.c to progs/btf_decl_tag.c
selftests/bpf: Test BTF_KIND_DECL_TAG for deduplication
selftests/bpf: Add BTF_KIND_TYPE_TAG unit tests
selftests/bpf: Test libbpf API function btf__add_type_tag()
bpftool: Support BTF_KIND_TYPE_TAG
libbpf: Support BTF_KIND_TYPE_TAG
...
====================
Please revert. Besides the driver in net, it modifies the I2C core
code. This has not been acked by the I2C maintainer (in this case me).
So, please don't pull this in via the net tree. The question raised here
(extending SMBus calls to 255 byte) is complicated because we need ABI
backwards compatibility.
The various validate method implementations we have in phylink users
have been quite repetitive but also prone to bugs. These patches
introduce a generic implementation which relies solely on the
supported_interfaces bitmap introduced during last cycle, and in the
first patch, a bit array of MAC capabilities.
MAC drivers are free to continue to do their own thing if they have
special requirements - such as mvneta and mvpp2 which do not support
1000base-X without AN enabled. Most implementations currently in the
kernel can be converted to call phylink_generic_validate() directly
from the phylink MAC operations structure once they fill in the
supported_interfaces and mac_capabilities members of phylink_config.
This series introduces the generic implementation, and converts mvneta
and mvpp2 to use it.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King (Oracle) [Mon, 15 Nov 2021 10:00:37 +0000 (10:00 +0000)]
net: mvpp2: use phylink_generic_validate()
Convert mvpp2 to use phylink_generic_validate() for the bulk of its
validate() implementation. This network adapter has a restriction
that for 802.3z links, autonegotiation must be enabled.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King (Oracle) [Mon, 15 Nov 2021 10:00:32 +0000 (10:00 +0000)]
net: mvneta: use phylink_generic_validate()
Convert mvneta to use phylink_generic_validate() for the bulk of its
validate() implementation. This network adapter has a restriction
that for 802.3z links, autonegotiation must be enabled.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King (Oracle) [Mon, 15 Nov 2021 10:00:27 +0000 (10:00 +0000)]
net: phylink: add generic validate implementation
Add a generic validate() implementation using the supported_interfaces
and a bitmask of MAC pause/speed/duplex capabilities. This allows us
to entirely eliminate many driver private validate() implementations.
We expose the underlying phylink_get_linkmodes() function so that
drivers which have special needs can still benefit from conversion.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Christophe Leroy [Mon, 15 Nov 2021 11:08:59 +0000 (12:08 +0100)]
net/wan/fsl_ucc_hdlc: fix sparse warnings
CHECK drivers/net/wan/fsl_ucc_hdlc.c
drivers/net/wan/fsl_ucc_hdlc.c:309:57: warning: incorrect type in argument 2 (different address spaces)
drivers/net/wan/fsl_ucc_hdlc.c:309:57: expected void [noderef] __iomem *
drivers/net/wan/fsl_ucc_hdlc.c:309:57: got restricted __be16 *
drivers/net/wan/fsl_ucc_hdlc.c:311:46: warning: incorrect type in argument 2 (different address spaces)
drivers/net/wan/fsl_ucc_hdlc.c:311:46: expected void [noderef] __iomem *
drivers/net/wan/fsl_ucc_hdlc.c:311:46: got restricted __be32 *
drivers/net/wan/fsl_ucc_hdlc.c:320:57: warning: incorrect type in argument 2 (different address spaces)
drivers/net/wan/fsl_ucc_hdlc.c:320:57: expected void [noderef] __iomem *
drivers/net/wan/fsl_ucc_hdlc.c:320:57: got restricted __be16 *
drivers/net/wan/fsl_ucc_hdlc.c:322:46: warning: incorrect type in argument 2 (different address spaces)
drivers/net/wan/fsl_ucc_hdlc.c:322:46: expected void [noderef] __iomem *
drivers/net/wan/fsl_ucc_hdlc.c:322:46: got restricted __be32 *
drivers/net/wan/fsl_ucc_hdlc.c:372:29: warning: incorrect type in assignment (different base types)
drivers/net/wan/fsl_ucc_hdlc.c:372:29: expected unsigned short [usertype]
drivers/net/wan/fsl_ucc_hdlc.c:372:29: got restricted __be16 [usertype]
drivers/net/wan/fsl_ucc_hdlc.c:379:36: warning: restricted __be16 degrades to integer
drivers/net/wan/fsl_ucc_hdlc.c:402:12: warning: incorrect type in assignment (different address spaces)
drivers/net/wan/fsl_ucc_hdlc.c:402:12: expected struct qe_bd [noderef] __iomem *bd
drivers/net/wan/fsl_ucc_hdlc.c:402:12: got struct qe_bd *curtx_bd
drivers/net/wan/fsl_ucc_hdlc.c:425:20: warning: incorrect type in assignment (different address spaces)
drivers/net/wan/fsl_ucc_hdlc.c:425:20: expected struct qe_bd [noderef] __iomem *[assigned] bd
drivers/net/wan/fsl_ucc_hdlc.c:425:20: got struct qe_bd *tx_bd_base
drivers/net/wan/fsl_ucc_hdlc.c:427:16: error: incompatible types in comparison expression (different address spaces):
drivers/net/wan/fsl_ucc_hdlc.c:427:16: struct qe_bd [noderef] __iomem *
drivers/net/wan/fsl_ucc_hdlc.c:427:16: struct qe_bd *
drivers/net/wan/fsl_ucc_hdlc.c:462:33: warning: incorrect type in argument 1 (different address spaces)
drivers/net/wan/fsl_ucc_hdlc.c:506:41: warning: incorrect type in argument 1 (different address spaces)
drivers/net/wan/fsl_ucc_hdlc.c:528:33: warning: incorrect type in argument 1 (different address spaces)
drivers/net/wan/fsl_ucc_hdlc.c:552:38: warning: incorrect type in argument 1 (different address spaces)
drivers/net/wan/fsl_ucc_hdlc.c:596:67: warning: incorrect type in argument 2 (different address spaces)
drivers/net/wan/fsl_ucc_hdlc.c:611:41: warning: incorrect type in argument 1 (different address spaces)
drivers/net/wan/fsl_ucc_hdlc.c:851:38: warning: incorrect type in initializer (different address spaces)
drivers/net/wan/fsl_ucc_hdlc.c:854:40: warning: incorrect type in argument 1 (different address spaces)
drivers/net/wan/fsl_ucc_hdlc.c:855:40: warning: incorrect type in argument 1 (different address spaces)
drivers/net/wan/fsl_ucc_hdlc.c:858:39: warning: incorrect type in argument 1 (different address spaces)
drivers/net/wan/fsl_ucc_hdlc.c:861:37: warning: incorrect type in argument 2 (different address spaces)
drivers/net/wan/fsl_ucc_hdlc.c:866:38: warning: incorrect type in initializer (different address spaces)
drivers/net/wan/fsl_ucc_hdlc.c:868:21: warning: incorrect type in argument 1 (different address spaces)
drivers/net/wan/fsl_ucc_hdlc.c:870:40: warning: incorrect type in argument 2 (different address spaces)
drivers/net/wan/fsl_ucc_hdlc.c:871:40: warning: incorrect type in argument 2 (different address spaces)
drivers/net/wan/fsl_ucc_hdlc.c:873:39: warning: incorrect type in argument 2 (different address spaces)
drivers/net/wan/fsl_ucc_hdlc.c:993:57: warning: incorrect type in argument 2 (different address spaces)
drivers/net/wan/fsl_ucc_hdlc.c:995:46: warning: incorrect type in argument 2 (different address spaces)
drivers/net/wan/fsl_ucc_hdlc.c:1004:57: warning: incorrect type in argument 2 (different address spaces)
drivers/net/wan/fsl_ucc_hdlc.c:1006:46: warning: incorrect type in argument 2 (different address spaces)
drivers/net/wan/fsl_ucc_hdlc.c:412:35: warning: dereference of noderef expression
drivers/net/wan/fsl_ucc_hdlc.c:412:35: warning: dereference of noderef expression
drivers/net/wan/fsl_ucc_hdlc.c:724:29: warning: dereference of noderef expression
drivers/net/wan/fsl_ucc_hdlc.c:815:21: warning: dereference of noderef expression
drivers/net/wan/fsl_ucc_hdlc.c:1021:29: warning: dereference of noderef expression
Most of the warnings are due to DMA memory being incorrectly handled as IO memory.
Fix it by doing direct read/write and doing proper dma_rmb() / dma_wmb().
Other problems are type mismatches or lack of use of IO accessors.
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Reported-by: kernel test robot <lkp@intel.com> Link: https://lkml.org/lkml/2021/11/12/647 Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
Guo Zhengkui [Mon, 15 Nov 2021 05:00:10 +0000 (13:00 +0800)]
hinic: use ARRAY_SIZE instead of ARRAY_LEN
ARRAY_SIZE defined in <linux/kernel.h> is safer than self-defined
macros to get size of an array such as ARRAY_LEN used here. Because
ARRAY_SIZE uses __must_be_array(arr) to ensure arr is really an array.
Reported-by: Alejandro Colomar <colomar.6.4.3@gmail.com> Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jacky Chou [Mon, 15 Nov 2021 03:49:41 +0000 (11:49 +0800)]
net: usb: ax88179_178a: add TSO feature
On low-effciency embedded platforms, transmission performance is poor
due to on Bulk-out with single packet.
Adding TSO feature improves the transmission performance and reduces
the number of interrupt caused by Bulk-out complete.
Reference to module, net: usb: aqc111.
Signed-off-by: Jacky Chou <jackychou@asix.com.tw> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 15 Nov 2021 14:11:25 +0000 (14:11 +0000)]
Merge branch 'mctp-i2c-driver'
Matt Johnston says:
====================
MCTP I2C driver
This patch series adds a netdev driver providing MCTP transport over
I2C.
It applies against net-next using recent MCTP changes there, though also
has I2C core changes for review. I'll leave it to maintainers where it
should be applied - please let me know if it needs to be submitted
differently.
The I2C patches were previously sent as RFC though the only feedback
there was an ack to 255 bytes for aspeed.
The dt-bindings patch went through review on the list.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Matt Johnston [Mon, 15 Nov 2021 02:49:26 +0000 (10:49 +0800)]
mctp i2c: MCTP I2C binding driver
Provides MCTP network transport over an I2C bus, as specified in
DMTF DSP0237. All messages between nodes are sent as SMBus Block Writes.
Each I2C bus to be used for MCTP is flagged in devicetree by a
'mctp-controller' property on the bus node. Each flagged bus gets a
mctpi2cX net device created based on the bus number. A
'mctp-i2c-controller' I2C client needs to be added under the adapter. In
an I2C mux situation the mctp-i2c-controller node must be attached only
to the root I2C bus. The I2C client will handle incoming I2C slave block
write data for subordinate busses as well as its own bus.
In configurations without devicetree a driver instance can be attached
to a bus using the I2C slave new_device mechanism.
The MCTP core will hold/release the MCTP I2C device while responses
are pending (a 6 second timeout or once a socket is closed, response
received etc). While held the MCTP I2C driver will lock the I2C bus so
that the correct I2C mux remains selected while responses are received.
(Ideally we would just lock the mux to keep the current bus selected for
the response rather than a full I2C bus lock, but that isn't exposed in
the I2C mux API)
This driver requires I2C adapters that allow 255 byte transfers
(SMBus 3.0) as the specification requires a minimum MTU of 68 bytes.
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Matt Johnston [Mon, 15 Nov 2021 02:49:25 +0000 (10:49 +0800)]
dt-bindings: net: New binding mctp-i2c-controller
Used to define a local endpoint to communicate with MCTP peripherals
attached to an I2C bus. This I2C endpoint can communicate with remote
MCTP devices on the I2C bus.
In the example I2C topology below (matching the second yaml example) we
have MCTP devices on busses i2c1 and i2c6. MCTP-supporting busses are
indicated by the 'mctp-controller' DT property on an I2C bus node.
A mctp-i2c-controller I2C client DT node is placed at the top of the
mux topology, since only the root I2C adapter will support I2C slave
functionality.
.-------.
|eeprom |
.------------. .------. /'-------'
| adapter | | mux --@0,i2c5------'
| i2c1 ----.*| --@1,i2c6--.--.
|............| \'------' \ \ .........
| mctp-i2c- | \ \ \ .mctpB .
| controller | \ \ '.0x30 .
| | \ ......... \ '.......'
| 0x50 | \ .mctpA . \ .........
'------------' '.0x1d . '.mctpC .
'.......' '.0x31 .
'.......'
(mctpX boxes above are remote MCTP devices not included in the DT at
present, they can be hotplugged/probed at runtime. A DT binding for
specific fixed MCTP devices could be added later if required)
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
255 byte support has been tested on a npcm750 board
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Reviewed-by: Tali Perry <tali.perry1@gmail.com> Reviewed-by: Patrick Venture <venture@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Matt Johnston [Mon, 15 Nov 2021 02:49:23 +0000 (10:49 +0800)]
i2c: aspeed: Allow 255 byte block transfers
255 byte transfers have been tested on an AST2500 board
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Reviewed-by: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Matt Johnston [Mon, 15 Nov 2021 02:49:21 +0000 (10:49 +0800)]
i2c: core: Allow 255 byte transfers for SMBus 3.x
SMBus 3.0 increased the maximum block transfer size from 32 bytes to
255 bytes. We increase the size of struct i2c_smbus_data's block[]
member.
i2c_smbus_xfer() and i2c_smbus_xfer_emulated() now support 255 byte
block operations, other block functions remain limited to 32 bytes for
compatibility with existing callers.
We allow adapters to indicate support for the larger size with
I2C_FUNC_SMBUS_V3_BLOCK. Most emulated drivers should be able to use 255
byte blocks by replacing I2C_SMBUS_BLOCK_MAX with I2C_SMBUS_V3_BLOCK_MAX
though some will have hardware limitations that need testing.
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Christophe JAILLET [Sun, 14 Nov 2021 19:02:35 +0000 (20:02 +0100)]
net: bridge: Slightly optimize 'find_portno()'
The 'inuse' bitmap is local to this function. So we can use the
non-atomic '__set_bit()' to save a few cycles.
While at it, also remove some useless {}.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Harshit Mogalapalli [Fri, 12 Nov 2021 21:36:47 +0000 (13:36 -0800)]
net: sched: sch_netem: Refactor code in 4-state loss generator
Fixed comments to match description with variable names and
refactored code to match the convention as per [1].
To match the convention mapping is done as follows:
State 3 - LOST_IN_BURST_PERIOD
State 4 - LOST_IN_GAP_PERIOD
[1] S. Salsano, F. Ludovici, A. Ordine, "Definition of a general
and intuitive loss model for packet networks and its implementation
in the Netem module in the Linux kernel"
Fixes: a6e2fe17eba4 ("sch_netem: replace magic numbers with enumerate") Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Uwe Kleine-König [Fri, 12 Nov 2021 14:53:52 +0000 (15:53 +0100)]
net: dsa: vsc73xxx: Make vsc73xx_remove() return void
vsc73xx_remove() returns zero unconditionally and no caller checks the
returned value. So convert the function to return no value.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The previous stmmac_xdp_set_prog() implementation uses stmmac_release()
and stmmac_open() which tear down the PHY device and causes undesirable
autonegotiation which causes a delay whenever AFXDP ZC is setup.
This patch introduces two new functions that just sufficiently tear
down DMA descriptors, buffer, NAPI process, and IRQs and reestablish
them accordingly in both stmmac_xdp_release() and stammac_xdp_open().
As the results of this enhancement, we get rid of transient state
introduced by the link auto-negotiation:
Reported-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Tested-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Hengqi Chen [Mon, 15 Nov 2021 01:24:36 +0000 (09:24 +0800)]
bpftool: Use libbpf_get_error() to check error
Currently, LIBBPF_STRICT_ALL mode is enabled by default for
bpftool which means on error cases, some libbpf APIs would
return NULL pointers. This makes IS_ERR check failed to detect
such cases and result in segfault error. Use libbpf_get_error()
instead like we do in libbpf itself.