Daniel Wagner [Tue, 30 Jan 2024 17:28:40 +0000 (18:28 +0100)]
nvme: allow to overwrite base sysfs path
In order to be able to test the topology scan code, make the
lookup paths flexible. Check if the LIBNVME_SYSFS_PATH environment
variable is set and if so use this as base path.
We could also introduce a setter for this, but this is really
a debugging interface and thus I don't really want it to be
visible in the public API.
Jinliang Wang [Fri, 19 Jan 2024 07:47:04 +0000 (23:47 -0800)]
mi: set correct rc and errno when crc mismatch
Before the fix, when we meet crc mismatch response, the errno is 0
and rc is 1. This combination will be mistaken as Admin Generic Command
Status code value 0x1 (Invalid Command Opcode):
$ nvme id-ctrl mctp:1,20
crc mismatch
NVMe status: Invalid Command Opcode: A reserved coded value or an
unsupported value in the command opcode field(0x1
After the fix, the rc is -1, and errno is set to Bad message.
$ nvme id-ctrl mctp:1,20
crc mismatch
identify controller: Bad message
Signed-off-by: Jinliang Wang <jinliangw@google.com>
This is because the strlen check on subsysnqn or hostnqn crashes at
src/nvme/linux.c due to either of them being null. Avoid this segfault
by checking these strings before running a strlen on them.
gcc complains about potentially uninitialized variables used
in the cleanup function even though such scenario is unlikely
to happen. Still, an explicit initializer makes the warnings
go away.
E.g.
../src/nvme/fabrics.c: In function ‘nvmf_hostnqn_generate’:
../src/nvme/fabrics.c:1297:30: note: ‘stream’ was declared here
1297 | _cleanup_file_ FILE *stream;
| ^~~~~~
In function ‘cleanup_fd’,
inlined from ‘uuid_from_device_tree’ at ../src/nvme/fabrics.c:1189:19,
inlined from ‘nvmf_hostnqn_generate’ at ../src/nvme/fabrics.c:1350:9:
../src/nvme/cleanup.h:37:17: warning: ‘f’ may be used uninitialized [-Wmaybe-uninitialized]
37 | close(*fd);
| ^~~~~~~~~~
Daniel Wagner [Wed, 20 Dec 2023 09:48:31 +0000 (10:48 +0100)]
build: handle patch level versioning
Use the patch level identifier from the version string when it is
provided instead blindly extending the version string with '.0'. This
prevents patch level releases.
Daniel Wagner [Wed, 20 Dec 2023 07:26:16 +0000 (08:26 +0100)]
tree: do no free ns on error in nvme_ns_init
The ns pointer is owned by the caller not by nvme_ns_init, thus we can't
just free it on error.
Fixes: 7959f52960fd ("tree: read all attributes from sysfs when available") Reported-by: Tomasz Kłoczko <kloczek@fedoraproject.org> Signed-off-by: Daniel Wagner <dwagner@suse.de>
Daniel Wagner [Tue, 19 Dec 2023 09:40:21 +0000 (10:40 +0100)]
build: construct path to config.h manually
meson reports:
internal/meson.build:25: DEPRECATION: Project uses feature that was
always broken, and is now deprecated since '1.3.0': str.format: Value
other than strings, integers, bools, options, dictionaries and lists
thereof.
Thus just hardcode the config file path by using current_build_dir. Note
this changes the path from a relative one to an absolute one.
Daniel Wagner [Thu, 7 Dec 2023 12:52:59 +0000 (13:52 +0100)]
tree: do not open blk device on default
The fd is not needed anymore if the kernel exposes all necessary sysfs
entries to fully scan the nvme subsystem. Thus do not alwyas open the
blk device and do it only when necessary.
Daniel Wagner [Fri, 1 Dec 2023 09:23:29 +0000 (10:23 +0100)]
tree: read all attributes from sysfs when available
The kernel already exposes parts or all attributes we are looking up
with the ns id command. By reading these from the sysfs we can remove
the last command we are issuing during the topology scan.
Chris Patterson [Fri, 1 Dec 2023 06:21:15 +0000 (22:21 -0800)]
types: fix regression for vendor-specific field in nvme_id_ns
Recent versions of nvme-cli have started reading vs from offset
392 instead of 384. Previous PRs coupled the use of nvme_id_ns
for use in namespace management (create_ns). However, the NVMe
spec has a different structure for namespace management, with
only a subset of the fields allowed/shared and some additional
fields.
To fix this, remove lbstm and restore the proper length for vs
from nvme_id_ns.
I expect that create_ns() should fully switch over to the
nvme_ns_mgmt_host_sw_specified struct which seems aligned with
the spec (though has some newer fields than what is available in
the latest NVMe Command Set Specification Revision 1.0c). This
will have to be addressed separately in nvme-cli.
Caleb Sander [Tue, 28 Nov 2023 21:32:21 +0000 (14:32 -0700)]
tree: use cleanup functions
Use cleanup attributes from cleanup.h to avoid boilerplate cleanup code.
Introduce struct dirents and the corresponding _cleanup_dirents_
to clean up arrays of directory entries.
Caleb Sander [Tue, 28 Nov 2023 20:35:01 +0000 (13:35 -0700)]
cleanup: add cleanup functions
Introduce _cleanup_free_ from nvme-cli to call free() on cleanup,
_cleanup_file_ to call fclose(), _cleanup_dir_ to call closedir(),
and _cleanup_fd_ to call close().
_cleanup_free_ generalizes __cleanup__(cleanup_charp),
so remove it along with cleanup.c.
Caleb Sander [Thu, 30 Nov 2023 18:58:22 +0000 (11:58 -0700)]
tree: fix incorrect return value
nvme_scan_subsystem() accidentally sets errno = -EINVAL
if the matching subsystem is filtered out.
This errno value will be overwritten by the out_free path with -ret,
where ret is the length of the path string.
Set ret = -EINVAL instead, so errno will be set to EINVAL.
This matches the behavior in the case where a new subsystem is allocated
and fails the filter function.
Hannes Reinecke [Thu, 16 Nov 2023 06:41:37 +0000 (07:41 +0100)]
libnvme: support NVMe TLS identities version 1
With NVMe TP8018 a new version '1' for generating NVMe TLS identities
was specified; identities generated for this version require a PSK hash
to be attached to the version '0' identifier.
This patch implements a new function 'nvme_insert_tls_keys_versioned()'
to support this functionality and makes the original function
'nvme_insert_tls_keys()' a wrapper for the new function.
Daniel Wagner [Thu, 16 Nov 2023 14:28:25 +0000 (15:28 +0100)]
build: do not rebuild muon/samu every time
There is not need to build the build tools everytime. Thus stage the
samurai and muon build into a new top level directory .build-tools and
only build them on demand.
Caleb Sander [Tue, 8 Aug 2023 01:21:12 +0000 (19:21 -0600)]
fabrics: avoid redundant args in nvme_discovery_log()
nvme_discovery_log() takes nvme_get_log_args as input
and then overwrites many of its fields.
This leads to wasted code in the caller setting up unused fields.
Instead pass nvme_get_discovery_args, which more accurately
expresses the possible discovery log arguments.
As an added benefit, it also subsumes the other arguments.
And nvmf_get_discovery_wargs() can pass its arguments straight through.
Caleb Sander [Tue, 8 Aug 2023 00:59:10 +0000 (18:59 -0600)]
fabrics: have nvmf_get_discovery_log() call nvmf_get_discovery_wargs()
There is a lot of duplicated code between nvmf_get_discovery_log()
and nvmf_get_discovery_wargs().
Since nvmf_get_discovery_wargs() is more general,
use it to implement nvmf_get_discovery_log().
Caleb Sander [Tue, 1 Aug 2023 17:24:15 +0000 (11:24 -0600)]
fabrics: fetch smaller Discovery Log Page header
Most of the 1 KB Discovery Log Page header is reserved.
Only the first 18 bytes are currently defined.
So avoid transfering more data from the controller than necessary.
Caleb Sander [Tue, 1 Aug 2023 01:57:54 +0000 (19:57 -0600)]
fabrics: avoid redundant Get Log Page on retry
In case of a GENCTR mismatch, nvme_discovery_log() currently fetches
the Discovery Log Page header again before fetching the entries.
This is unnecessary since it was just fetched to determine GENCTR.
So only fetch the header once and re-use it for the next loop iteration.
Remove a couple of unnecessary variables in the process.
Caleb Sander [Tue, 1 Aug 2023 01:29:03 +0000 (19:29 -0600)]
fabrics: clear RAE for discovery log page commands
In several circumstances, nvme_discovery_log() leaves the RAE bit set
for all Get Log Page commands it issues.
This happens if the header indicates the log page has no records,
or if a command other than the header re-fetch errors out.
If the RAE bit is never cleared, the controller will not send more
Discovery Asynchronous Event Notifications.
Setting the RAE bit is not necessary to avoid missed events since
the use of GENCTR ensures the log page is fetched atomically.
It is better to risk receiving additional AENs
than for the discovery controller to never send more AENs
because it is waiting for a Get Log Page command that never comes.
So clear the RAE bit when fetching each piece of the log page.
Caleb Sander [Thu, 2 Nov 2023 13:29:56 +0000 (14:29 +0100)]
test: Cast values to u32 if shift overflows int
Bit shifts that overflow the resulting type are undefined behavior in C.
C arithmetic promotes to ints all smaller integer types.
There are several places where a 32-bit unsigned value
is constructed by shifting a u8 or u16 to the most significant bits.
Since this overflows a signed 32-bit integer,
explicitly cast to u32 to avoid the UB.
Technically, an int is allowed to only be 16 bits,
so any shift that could set bit 15 or higher is UB.
But platforms where int is s16 are not very common,
so it's likely not worth the effort to fix the code.
Caleb Sander [Thu, 2 Nov 2023 13:29:46 +0000 (14:29 +0100)]
mi: Cast values to u32 if shift overflows int
Bit shifts that overflow the resulting type are undefined behavior in C.
C arithmetic promotes to ints all smaller integer types.
There are several places where a 32-bit unsigned value
is constructed by shifting a u8 or u16 to the most significant bits.
Since this overflows a signed 32-bit integer,
explicitly cast to u32 to avoid the UB.
Technically, an int is allowed to only be 16 bits,
so any shift that could set bit 15 or higher is UB.
But platforms where int is s16 are not very common,
so it's likely not worth the effort to fix the code.
Caleb Sander [Sat, 14 Oct 2023 01:28:33 +0000 (19:28 -0600)]
types: Cast values to u32 if shift overflows int
Bit shifts that overflow the resulting type are undefined behavior in C.
C arithmetic promotes to ints all smaller integer types.
There are several places where a 32-bit unsigned value
is constructed by shifting a u8 or u16 to the most significant bits.
Since this overflows a signed 32-bit integer,
explicitly cast to u32 to avoid the UB.
Technically, an int is allowed to only be 16 bits,
so any shift that could set bit 15 or higher is UB.
But platforms where int is s16 are not very common,
so it's likely not worth the effort to fix the code.
Caleb Sander [Thu, 2 Nov 2023 13:26:36 +0000 (14:26 +0100)]
test: Avoid unaligned pointer dereferences
Avoid casting byte-aligned pointers to pointers with higher alignment.
Loading or storing values with higher alignment is undefined behavior,
since some processors don't allow unaligned memory accesses
and compilers may assume pointers of different types don't alias.
Perform an explicit memcpy(), which an optimizing compiler
can easily replace with a single load/store on supported architectures.
Caleb Sander [Sat, 14 Oct 2023 04:35:43 +0000 (22:35 -0600)]
nbft: Avoid unaligned pointer dereferences
Avoid casting byte-aligned pointers to pointers with higher alignment.
Loading or storing values with higher alignment is undefined behavior,
since some processors don't allow unaligned memory accesses
and compilers may assume pointers of different types don't alias.
Perform an explicit memcpy() in two places, which an optimizing compiler
can easily replace with a single load/store on supported architectures.
While we're touching this code, also use IN6_IS_ADDR_V4MAPPED()
instead of hand-rolling it.
Caleb Sander [Sat, 14 Oct 2023 01:28:09 +0000 (19:28 -0600)]
test: don't pass NULL to memcmp() or memset()
According to the C standard, it is undefined behavior
to pass NULL pointers to standard library functions.
This includes the mem*() family of functions,
even they are passed a length of 0.
So avoid calling these functions when the length is 0.
Caleb Sander [Sat, 14 Oct 2023 01:13:56 +0000 (19:13 -0600)]
test: make LD_PRELOAD tests work with ASAN
Several tests mock libc functions by setting LD_PRELOAD
to a shared library containing mock implementations of the functions.
Currently this prevents the tests from running with ASAN.
ASAN complains that libasan doesn't come first in LD_PRELOAD and aborts.
But in practice this doesn't seem to be an issue.
Sample ASAN issues added to libnvme, the test cases, and the mocks
are all correctly reported.
So suppress this warning by setting the environment variable
ASAN_OPTIONS=verify_asan_link_order=0.
In case the user does want to inject a specific libasan,
change the tests' use of LD_PRELOAD to append the mock shared library
rather than overwriting LD_PRELOAD entirely.
Caleb Sander [Fri, 13 Oct 2023 23:41:30 +0000 (17:41 -0600)]
readme: clarify that LD_PRELOAD is not needed for ASAN
The README currently suggests setting LD_PRELOAD to the libasan path.
However, this seems to be unnecessary (at least on my gcc 9 setup).
The test executables request to link with libasan.so (before libc.so),
so it is automatically loaded.
The recommended suggestion also ends up running ninja with ASAN
instrumentation, which causes it to fail with leaks detected.
Joy Gu [Wed, 25 Oct 2023 20:47:42 +0000 (13:47 -0700)]
types: add cross-namespace copy formats, status codes, ONCS bits
Add support for NVMe TP4130 ("Cross-Namespace Copy"):
- Add Copy Descriptor Formats 2h and 3h
- Add new status codes for Copy: Incompatible Namespace or Format, Fast
Copy Not Possible, Overlapping I/O Range, and Insufficient Resources
- Add two new ONCS bits NVMCSA and NVMAFC
- Add Copy Descriptor Formats Enable (CDFE) to Host Behavior Support
Data Structure
[dwagner: whitespace cleanups
moved new functions to 1.7 linker section] Signed-off-by: Daniel Wagner <dwagner@suse.de>
Caleb Sander [Tue, 20 Jun 2023 03:17:16 +0000 (21:17 -0600)]
fabrics: use SECTYPE to determine whether to use TLS
The NVMe specfications are clear that the discovery log page entry
SECTYPE field indicates whether TLS is supported.
Currently the TREQ field is used, which results in enabling TLS
even when SECTYPE = "No Security" and TREQ = "Not required".
Only enable TLS if SECTYPE indicates a TLS version is enabled.
From the NVMe/TCP transport specification, version 1.0c:
Security Type (SECTYPE): Specifies the type of security used by the
NVMe/TCP port. If SECTYPE is a value of 0h (No Security), then the host
shall set up a normal TCP connection.
From TP 8018:
The SECTYPE value of 0h (i.e., No Security) specifies that TLS is not
supported.
From TP 8025 (the "TLS Permitted" host case):
If the SECTYPE field in the TSAS field in the discovery log entry for
the remote entity is cleared to zero and the TSC field is not set to 01b
(i.e., Required), then initiate TCP connections without TLS.
Brandon Paupore [Mon, 9 Oct 2023 21:19:17 +0000 (16:19 -0500)]
ioctl: MSB variable-size storage/reference tags
The spec defines these values on a bitwise basis within the shared
80-bit region, stored from MSB to LSB for each tag. Mechanically this
can be achieved by setting the values as big-endian using the same logic
currently in place.
Sam James [Sat, 30 Sep 2023 05:38:53 +0000 (06:38 +0100)]
test: handle POSIX ioctl prototype
glibc has the following prototype for ioctl: int ioctl(int fd, unsigned long request, ...)
POSIX (inc. musl) has the following for ioctl: int ioctl(int fd, int request, ...)
Check which prototype is used in <sys/ioctl.h> to avoid a conflict and conditionally
define the right one for the system.
Bug: https://bugs.gentoo.org/914921 Signed-off-by: Sam James <sam@gentoo.org>
Jeremy Kerr [Sat, 23 Sep 2023 16:44:50 +0000 (09:44 -0700)]
mi-mctp: Fix free() in error path of mi_open_mctp
If the malloc of our struct nvme_mi_transport_mctp fails, we'll attempt
to free ->rsp_buf of this (now zero) pointer.
Instead, structure the error path to progressively undo the
initialisation operations. This means we'll need to save the errno at
the site of each possible failure.
In doing this, add a comment to the call to nvme_mi_close(), just to
clarify behaviour with regards to the cleanups through that path.
Reported-by: Barnabás Pőcze <pobrn@protonmail.com> Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>