]> www.infradead.org Git - users/hch/xfs.git/log
users/hch/xfs.git
20 months agoplatform/x86/amd/pmf: Fix memory leak in amd_pmf_get_pb_data()
Cong Liu [Wed, 24 Jan 2024 01:29:38 +0000 (09:29 +0800)]
platform/x86/amd/pmf: Fix memory leak in amd_pmf_get_pb_data()

amd_pmf_get_pb_data() will allocate memory for the policy buffer,
but does not free it if copy_from_user() fails. This leads to a memory
leak.

Fixes: 10817f28e533 ("platform/x86/amd/pmf: Add capability to sideload of policy binary")
Reviewed-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Signed-off-by: Cong Liu <liucong2@kylinos.cn>
Link: https://lore.kernel.org/r/20240124012939.6550-1-liucong2@kylinos.cn
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
20 months agoplatform/x86/amd/pmf: Get ambient light information from AMD SFH driver
Shyam Sundar S K [Tue, 23 Jan 2024 14:14:58 +0000 (19:44 +0530)]
platform/x86/amd/pmf: Get ambient light information from AMD SFH driver

AMD SFH driver has APIs defined to export the ambient light information;
use this within the PMF driver to send inputs to the PMF TA, so that PMF
driver can enact to the actions coming from the TA.

Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20240123141458.3715211-2-Shyam-sundar.S-k@amd.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
20 months agoplatform/x86/amd/pmf: Get Human presence information from AMD SFH driver
Shyam Sundar S K [Tue, 23 Jan 2024 14:14:57 +0000 (19:44 +0530)]
platform/x86/amd/pmf: Get Human presence information from AMD SFH driver

AMD SFH driver has APIs defined to export the human presence information;
use this within the PMF driver to send inputs to the PMF TA, so that PMF
driver can enact to the actions coming from the TA.

Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20240123141458.3715211-1-Shyam-sundar.S-k@amd.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
21 months agoplatform/mellanox: mlxbf-pmc: Fix offset calculation for crspace events
Shravan Kumar Ramani [Wed, 17 Jan 2024 10:01:34 +0000 (05:01 -0500)]
platform/mellanox: mlxbf-pmc: Fix offset calculation for crspace events

The event selector fields for 2 counters are contained in one
32-bit register and the current logic does not account for this.

Fixes: 423c3361855c ("platform/mellanox: mlxbf-pmc: Add support for BlueField-3")
Signed-off-by: Shravan Kumar Ramani <shravankr@nvidia.com>
Reviewed-by: David Thompson <davthompson@nvidia.com>
Reviewed-by: Vadim Pasternak <vadimp@nvidia.com>
Link: https://lore.kernel.org/r/8834cfa496c97c7c2fcebcfca5a2aa007e20ae96.1705485095.git.shravankr@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
21 months agoplatform/mellanox: mlxbf-tmfifo: Drop Tx network packet when Tx TmFIFO is full
Liming Sun [Thu, 11 Jan 2024 17:31:06 +0000 (12:31 -0500)]
platform/mellanox: mlxbf-tmfifo: Drop Tx network packet when Tx TmFIFO is full

Starting from Linux 5.16 kernel, Tx timeout mechanism was added
in the virtio_net driver which prints the "Tx timeout" warning
message when a packet stays in Tx queue for too long. Below is an
example of the reported message:

"[494105.316739] virtio_net virtio1 tmfifo_net0: TX timeout on
queue: 0, sq: output.0, vq: 0×1, name: output.0, usecs since
last trans: 3079892256".

This issue could happen when external host driver which drains the
FIFO is restared, stopped or upgraded. To avoid such confusing
"Tx timeout" messages, this commit adds logic to drop the outstanding
Tx packet if it's not able to transmit in two seconds due to Tx FIFO
full, which can be considered as congestion or out-of-resource drop.

This commit also handles the special case that the packet is half-
transmitted into the Tx FIFO. In such case, the packet is discarded
with remaining length stored in vring->rem_padding. So paddings with
zeros can be sent out when Tx space is available to maintain the
integrity of the packet format. The padded packet will be dropped on
the receiving side.

Signed-off-by: Liming Sun <limings@nvidia.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20240111173106.96958-1-limings@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
21 months agoMAINTAINERS: remove defunct acpi4asus project info from asus notebooks section
Hans de Goede [Tue, 16 Jan 2024 10:39:21 +0000 (11:39 +0100)]
MAINTAINERS: remove defunct acpi4asus project info from asus notebooks section

The acpi4asus project appears to be defunct, according to:
https://sourceforge.net/p/acpi4asus/mailman/acpi4asus-user/
the last posts to the list were done in May 2020 and even then
they were mostly spam.

And the http://acpi4asus.sf.net website still talks about 2.6.x kernels.

Drop the defunct mailing-list and update the W: entry to point to
the new up2date https://asus-linux.org/ site.

Cc: Corentin Chary <corentin.chary@gmail.com>
Cc: Luke D. Jones <luke@ljones.dev>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
21 months agoMAINTAINERS: add Luke Jones as maintainer for asus notebooks
Luke D. Jones [Mon, 15 Jan 2024 21:18:29 +0000 (10:18 +1300)]
MAINTAINERS: add Luke Jones as maintainer for asus notebooks

Add myself as maintainer for "ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS
DRIVERS" as suggested by Hans de Goede based on my history of
contributions.

Signed-off-by: Luke D. Jones <luke@ljones.dev>
Link: https://lore.kernel.org/r/20240115211829.48251-1-luke@ljones.dev
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
21 months agoMAINTAINERS: Remove Perry Yuan as DELL WMI HARDWARE PRIVACY SUPPORT maintainer
Heiner Kallweit [Sun, 14 Jan 2024 16:53:16 +0000 (17:53 +0100)]
MAINTAINERS: Remove Perry Yuan as DELL WMI HARDWARE PRIVACY SUPPORT maintainer

Recent mails to his Dell address bounced with "user unknown".
So remove him as maintainer.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/c9757d0a-2046-464b-93e1-a2d9ab0ce36b@gmail.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
21 months agoplatform/x86: silicom-platform: Add missing "Description:" for power_cycle sysfs...
Hans de Goede [Mon, 8 Jan 2024 14:06:55 +0000 (15:06 +0100)]
platform/x86: silicom-platform: Add missing "Description:" for power_cycle sysfs attr

The Documentation/ABI/testing/sysfs-platform-silicom entry
for the power_cycle sysfs attr is missing the "Description:" keyword,
add this.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20240108140655.547261-1-hdegoede@redhat.com
21 months agoplatform/x86: intel-wmi-sbl-fw-update: Fix function name in error message
Armin Wolf [Sat, 6 Jan 2024 22:41:26 +0000 (23:41 +0100)]
platform/x86: intel-wmi-sbl-fw-update: Fix function name in error message

Since when the driver was converted to use the bus-based WMI
interface, the old GUID-based WMI functions are not used anymore.
Update the error message to avoid confusing users.

Compile-tested only.

Fixes: 75c487fcb69c ("platform/x86: intel-wmi-sbl-fw-update: Use bus-based WMI interface")
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20240106224126.13803-1-W_Armin@gmx.de
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
21 months agoplatform/x86: p2sb: Use pci_resource_n() in p2sb_read_bar0()
Shin'ichiro Kawasaki [Mon, 8 Jan 2024 06:20:59 +0000 (15:20 +0900)]
platform/x86: p2sb: Use pci_resource_n() in p2sb_read_bar0()

Accesses to resource[] member of struct pci_dev shall be wrapped with
pci_resource_n() for future compatibility. Call the helper function in
p2sb_read_bar0().

Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20240108062059.3583028-3-shinichiro.kawasaki@wdc.com
Tested-by Klara Modin <klarasmodin@gmail.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
21 months agoplatform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe
Shin'ichiro Kawasaki [Mon, 8 Jan 2024 06:20:58 +0000 (15:20 +0900)]
platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe

p2sb_bar() unhides P2SB device to get resources from the device. It
guards the operation by locking pci_rescan_remove_lock so that parallel
rescans do not find the P2SB device. However, this lock causes deadlock
when PCI bus rescan is triggered by /sys/bus/pci/rescan. The rescan
locks pci_rescan_remove_lock and probes PCI devices. When PCI devices
call p2sb_bar() during probe, it locks pci_rescan_remove_lock again.
Hence the deadlock.

To avoid the deadlock, do not lock pci_rescan_remove_lock in p2sb_bar().
Instead, do the lock at fs_initcall. Introduce p2sb_cache_resources()
for fs_initcall which gets and caches the P2SB resources. At p2sb_bar(),
refer the cache and return to the caller.

Before operating the device at P2SB DEVFN for resource cache, check
that its device class is PCI_CLASS_MEMORY_OTHER 0x0580 that PCH
specifications define. This avoids unexpected operation to other devices
at the same DEVFN.

Link: https://lore.kernel.org/linux-pci/6xb24fjmptxxn5js2fjrrddjae6twex5bjaftwqsuawuqqqydx@7cl3uik5ef6j/
Fixes: 9745fb07474f ("platform/x86/intel: Add Primary to Sideband (P2SB) bridge support")
Cc: stable@vger.kernel.org
Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20240108062059.3583028-2-shinichiro.kawasaki@wdc.com
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by Klara Modin <klarasmodin@gmail.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
21 months agoplatform/x86: intel-uncore-freq: Fix types in sysfs callbacks
Nathan Chancellor [Thu, 4 Jan 2024 22:59:03 +0000 (15:59 -0700)]
platform/x86: intel-uncore-freq: Fix types in sysfs callbacks

When booting a kernel with CONFIG_CFI_CLANG, there is a CFI failure when
accessing any of the values under
/sys/devices/system/cpu/intel_uncore_frequency/package_00_die_00:

  $ cat /sys/devices/system/cpu/intel_uncore_frequency/package_00_die_00/max_freq_khz
  fish: Job 1, 'cat /sys/devices/system/cpu/int…' terminated by signal SIGSEGV (Address boundary error)

  $ sudo dmesg &| grep 'CFI failure'
  [  170.953925] CFI failure at kobj_attr_show+0x19/0x30 (target: show_max_freq_khz+0x0/0xc0 [intel_uncore_frequency_common]; expected type: 0xd34078c5

The sysfs callback functions such as show_domain_id() are written as if
they are going to be called by dev_attr_show() but as the above message
shows, they are instead called by kobj_attr_show(). kCFI checks that the
destination of an indirect jump has the exact same type as the prototype
of the function pointer it is called through and fails when they do not.

These callbacks are called through kobj_attr_show() because
uncore_root_kobj was initialized with kobject_create_and_add(), which
means uncore_root_kobj has a ->sysfs_ops of kobj_sysfs_ops from
kobject_create(), which uses kobj_attr_show() as its ->show() value.

The only reason there has not been a more noticeable problem until this
point is that 'struct kobj_attribute' and 'struct device_attribute' have
the same layout, so getting the callback from container_of() works the
same with either value.

Change all the callbacks and their uses to be compatible with
kobj_attr_show() and kobj_attr_store(), which resolves the kCFI failure
and allows the sysfs files to work properly.

Closes: https://github.com/ClangBuiltLinux/linux/issues/1974
Fixes: ae7b2ce57851 ("platform/x86/intel/uncore-freq: Use sysfs API to create attributes")
Cc: stable@vger.kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lore.kernel.org/r/20240104-intel-uncore-freq-kcfi-fix-v1-1-bf1e8939af40@kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
21 months agoplatform/x86: wmi: Fix wmi_dev_probe()
Dan Carpenter [Fri, 5 Jan 2024 13:47:54 +0000 (16:47 +0300)]
platform/x86: wmi: Fix wmi_dev_probe()

This has a reversed if statement so it accidentally disables the wmi
method before returning.

Fixes: 704af3a40747 ("platform/x86: wmi: Remove chardev interface")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Armin Wolf <W_Armin@gmx.de>
Link: https://lore.kernel.org/r/9c81251b-bc87-4ca3-bb86-843dc85e5145@moroto.mountain
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
21 months agoplatform/x86: wmi: Fix notify callback locking
Armin Wolf [Wed, 3 Jan 2024 19:27:07 +0000 (20:27 +0100)]
platform/x86: wmi: Fix notify callback locking

When an legacy WMI event handler is removed, an WMI event could
have called the handler just before it was removed, meaning the
handler could still be running after wmi_remove_notify_handler()
returns.
Something similar could also happens when using the WMI bus, as
the WMI core might still call the notify() callback from an WMI
driver even if its remove() callback was just called.

Fix this by introducing a rw semaphore which ensures that the
event state of a WMI device does not change while the WMI core
is handling an event for it.

Tested on a Dell Inspiron 3505 and a Acer Aspire E1-731.

Fixes: 1686f5444546 ("platform/x86: wmi: Incorporate acpi_install_notify_handler")
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20240103192707.115512-5-W_Armin@gmx.de
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
21 months agoplatform/x86: wmi: Decouple legacy WMI notify handlers from wmi_block_list
Armin Wolf [Wed, 3 Jan 2024 19:27:06 +0000 (20:27 +0100)]
platform/x86: wmi: Decouple legacy WMI notify handlers from wmi_block_list

Until now, legacy WMI notify handler functions where using the
wmi_block_list, which did no refcounting on the returned WMI device.
This meant that the WMI device could disappear at any moment,
potentially leading to various errors.
Fix this by using bus_find_device() which returns an actual
reference to the found WMI device.

Tested on a Dell Inspiron 3505 and a Acer Aspire E1-731.

Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20240103192707.115512-4-W_Armin@gmx.de
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
21 months agoplatform/x86: wmi: Return immediately if an suitable WMI event is found
Armin Wolf [Wed, 3 Jan 2024 19:27:05 +0000 (20:27 +0100)]
platform/x86: wmi: Return immediately if an suitable WMI event is found

Commit 58f6425eb92f ("WMI: Cater for multiple events with same GUID")
allowed legacy WMI notify handlers to be installed for multiple WMI
devices with the same GUID.
However this is useless since the legacy GUID-based interface is
blacklisted from seeing WMI devices with duplicated GUIDs.

Return immediately if a suitable WMI event is found in
wmi_install/remove_notify_handler() since searching for other suitable
events is pointless.

Tested on a Dell Inspiron 3505 and a Acer Aspire E1-731.

Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20240103192707.115512-3-W_Armin@gmx.de
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
21 months agoplatform/x86: wmi: Fix error handling in legacy WMI notify handler functions
Armin Wolf [Wed, 3 Jan 2024 19:27:04 +0000 (20:27 +0100)]
platform/x86: wmi: Fix error handling in legacy WMI notify handler functions

When wmi_install_notify_handler()/wmi_remove_notify_handler() are
unable to enable/disable the WMI device, they unconditionally return
an error to the caller.
When registering legacy WMI notify handlers, this means that the
callback remains registered despite wmi_install_notify_handler()
having returned an error.
When removing legacy WMI notify handlers, this means that the
callback is removed despite wmi_remove_notify_handler() having
returned an error.

Fix this by only warning when the WMI device could not be enabled.
This behaviour matches the bus-based WMI interface.

Tested on a Dell Inspiron 3505 and a Acer Aspire E1-731.

Fixes: 58f6425eb92f ("WMI: Cater for multiple events with same GUID")
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20240103192707.115512-2-W_Armin@gmx.de
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
21 months agoLinux 6.8-rc1 v6.8-rc1
Linus Torvalds [Sun, 21 Jan 2024 22:11:32 +0000 (14:11 -0800)]
Linux 6.8-rc1

21 months agoMerge tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefs
Linus Torvalds [Sun, 21 Jan 2024 22:01:12 +0000 (14:01 -0800)]
Merge tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefs

Pull more bcachefs updates from Kent Overstreet:
 "Some fixes, Some refactoring, some minor features:

   - Assorted prep work for disk space accounting rewrite

   - BTREE_TRIGGER_ATOMIC: after combining our trigger callbacks, this
     makes our trigger context more explicit

   - A few fixes to avoid excessive transaction restarts on
     multithreaded workloads: fstests (in addition to ktest tests) are
     now checking slowpath counters, and that's shaking out a few bugs

   - Assorted tracepoint improvements

   - Starting to break up bcachefs_format.h and move on disk types so
     they're with the code they belong to; this will make room to start
     documenting the on disk format better.

   - A few minor fixes"

* tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefs: (46 commits)
  bcachefs: Improve inode_to_text()
  bcachefs: logged_ops_format.h
  bcachefs: reflink_format.h
  bcachefs; extents_format.h
  bcachefs: ec_format.h
  bcachefs: subvolume_format.h
  bcachefs: snapshot_format.h
  bcachefs: alloc_background_format.h
  bcachefs: xattr_format.h
  bcachefs: dirent_format.h
  bcachefs: inode_format.h
  bcachefs; quota_format.h
  bcachefs: sb-counters_format.h
  bcachefs: counters.c -> sb-counters.c
  bcachefs: comment bch_subvolume
  bcachefs: bch_snapshot::btime
  bcachefs: add missing __GFP_NOWARN
  bcachefs: opts->compression can now also be applied in the background
  bcachefs: Prep work for variable size btree node buffers
  bcachefs: grab s_umount only if snapshotting
  ...

21 months agoMerge tag 'timers-core-2024-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 21 Jan 2024 19:14:40 +0000 (11:14 -0800)]
Merge tag 'timers-core-2024-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer updates from Thomas Gleixner:
 "Updates for time and clocksources:

   - A fix for the idle and iowait time accounting vs CPU hotplug.

     The time is reset on CPU hotplug which makes the accumulated
     systemwide time jump backwards.

   - Assorted fixes and improvements for clocksource/event drivers"

* tag 'timers-core-2024-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tick-sched: Fix idle and iowait sleeptime accounting vs CPU hotplug
  clocksource/drivers/ep93xx: Fix error handling during probe
  clocksource/drivers/cadence-ttc: Fix some kernel-doc warnings
  clocksource/drivers/timer-ti-dm: Fix make W=n kerneldoc warnings
  clocksource/timer-riscv: Add riscv_clock_shutdown callback
  dt-bindings: timer: Add StarFive JH8100 clint
  dt-bindings: timer: thead,c900-aclint-mtimer: separate mtime and mtimecmp regs

21 months agoMerge tag 'powerpc-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Sun, 21 Jan 2024 19:04:29 +0000 (11:04 -0800)]
Merge tag 'powerpc-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Aneesh Kumar:

 - Increase default stack size to 32KB for Book3S

Thanks to Michael Ellerman.

* tag 'powerpc-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/64s: Increase default stack size to 32KB

21 months agobcachefs: Improve inode_to_text()
Kent Overstreet [Sun, 21 Jan 2024 17:19:01 +0000 (12:19 -0500)]
bcachefs: Improve inode_to_text()

Add line breaks - inode_to_text() is now much easier to read.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: logged_ops_format.h
Kent Overstreet [Sun, 21 Jan 2024 07:57:45 +0000 (02:57 -0500)]
bcachefs: logged_ops_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: reflink_format.h
Kent Overstreet [Sun, 21 Jan 2024 07:54:47 +0000 (02:54 -0500)]
bcachefs: reflink_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs; extents_format.h
Kent Overstreet [Sun, 21 Jan 2024 07:51:56 +0000 (02:51 -0500)]
bcachefs; extents_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: ec_format.h
Kent Overstreet [Sun, 21 Jan 2024 07:47:14 +0000 (02:47 -0500)]
bcachefs: ec_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: subvolume_format.h
Kent Overstreet [Sun, 21 Jan 2024 07:42:53 +0000 (02:42 -0500)]
bcachefs: subvolume_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: snapshot_format.h
Kent Overstreet [Sun, 21 Jan 2024 07:41:06 +0000 (02:41 -0500)]
bcachefs: snapshot_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: alloc_background_format.h
Kent Overstreet [Sun, 21 Jan 2024 05:01:52 +0000 (00:01 -0500)]
bcachefs: alloc_background_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: xattr_format.h
Kent Overstreet [Sun, 21 Jan 2024 04:59:15 +0000 (23:59 -0500)]
bcachefs: xattr_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: dirent_format.h
Kent Overstreet [Sun, 21 Jan 2024 04:57:10 +0000 (23:57 -0500)]
bcachefs: dirent_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: inode_format.h
Kent Overstreet [Sun, 21 Jan 2024 04:55:39 +0000 (23:55 -0500)]
bcachefs: inode_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs; quota_format.h
Kent Overstreet [Sun, 21 Jan 2024 04:53:52 +0000 (23:53 -0500)]
bcachefs; quota_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: sb-counters_format.h
Kent Overstreet [Sun, 21 Jan 2024 04:50:56 +0000 (23:50 -0500)]
bcachefs: sb-counters_format.h

bcachefs_format.h has gotten too big; let's do some organizing.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: counters.c -> sb-counters.c
Kent Overstreet [Sun, 21 Jan 2024 04:46:35 +0000 (23:46 -0500)]
bcachefs: counters.c -> sb-counters.c

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: comment bch_subvolume
Kent Overstreet [Sun, 21 Jan 2024 04:44:17 +0000 (23:44 -0500)]
bcachefs: comment bch_subvolume

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: bch_snapshot::btime
Kent Overstreet [Sun, 21 Jan 2024 04:35:41 +0000 (23:35 -0500)]
bcachefs: bch_snapshot::btime

Add a field to bch_snapshot for creation time; this will be important
when we start exposing the snapshot tree to userspace.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: add missing __GFP_NOWARN
Kent Overstreet [Wed, 17 Jan 2024 22:16:07 +0000 (17:16 -0500)]
bcachefs: add missing __GFP_NOWARN

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: opts->compression can now also be applied in the background
Kent Overstreet [Tue, 16 Jan 2024 21:20:21 +0000 (16:20 -0500)]
bcachefs: opts->compression can now also be applied in the background

The "apply this compression method in the background" paths now use the
compression option if background_compression is not set; this means that
setting or changing the compression option will cause existing data to
be compressed accordingly in the background.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: Prep work for variable size btree node buffers
Kent Overstreet [Tue, 16 Jan 2024 18:29:59 +0000 (13:29 -0500)]
bcachefs: Prep work for variable size btree node buffers

bcachefs btree nodes are big - typically 256k - and btree roots are
pinned in memory. As we're now up to 18 btrees, we now have significant
memory overhead in mostly empty btree roots.

And in the future we're going to start enforcing that certain btree node
boundaries exist, to solve lock contention issues - analagous to XFS's
AGIs.

Thus, we need to start allocating smaller btree node buffers when we
can. This patch changes code that refers to the filesystem constant
c->opts.btree_node_size to refer to the btree node buffer size -
btree_buf_bytes() - where appropriate.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: grab s_umount only if snapshotting
Su Yue [Mon, 15 Jan 2024 02:21:25 +0000 (10:21 +0800)]
bcachefs: grab s_umount only if snapshotting

When I was testing mongodb over bcachefs with compression,
there is a lockdep warning when snapshotting mongodb data volume.

$ cat test.sh
prog=bcachefs

$prog subvolume create /mnt/data
$prog subvolume create /mnt/data/snapshots

while true;do
    $prog subvolume snapshot /mnt/data /mnt/data/snapshots/$(date +%s)
    sleep 1s
done

$ cat /etc/mongodb.conf
systemLog:
  destination: file
  logAppend: true
  path: /mnt/data/mongod.log

storage:
  dbPath: /mnt/data/

lockdep reports:
[ 3437.452330] ======================================================
[ 3437.452750] WARNING: possible circular locking dependency detected
[ 3437.453168] 6.7.0-rc7-custom+ #85 Tainted: G            E
[ 3437.453562] ------------------------------------------------------
[ 3437.453981] bcachefs/35533 is trying to acquire lock:
[ 3437.454325] ffffa0a02b2b1418 (sb_writers#10){.+.+}-{0:0}, at: filename_create+0x62/0x190
[ 3437.454875]
               but task is already holding lock:
[ 3437.455268] ffffa0a02b2b10e0 (&type->s_umount_key#48){.+.+}-{3:3}, at: bch2_fs_file_ioctl+0x232/0xc90 [bcachefs]
[ 3437.456009]
               which lock already depends on the new lock.

[ 3437.456553]
               the existing dependency chain (in reverse order) is:
[ 3437.457054]
               -> #3 (&type->s_umount_key#48){.+.+}-{3:3}:
[ 3437.457507]        down_read+0x3e/0x170
[ 3437.457772]        bch2_fs_file_ioctl+0x232/0xc90 [bcachefs]
[ 3437.458206]        __x64_sys_ioctl+0x93/0xd0
[ 3437.458498]        do_syscall_64+0x42/0xf0
[ 3437.458779]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.459155]
               -> #2 (&c->snapshot_create_lock){++++}-{3:3}:
[ 3437.459615]        down_read+0x3e/0x170
[ 3437.459878]        bch2_truncate+0x82/0x110 [bcachefs]
[ 3437.460276]        bchfs_truncate+0x254/0x3c0 [bcachefs]
[ 3437.460686]        notify_change+0x1f1/0x4a0
[ 3437.461283]        do_truncate+0x7f/0xd0
[ 3437.461555]        path_openat+0xa57/0xce0
[ 3437.461836]        do_filp_open+0xb4/0x160
[ 3437.462116]        do_sys_openat2+0x91/0xc0
[ 3437.462402]        __x64_sys_openat+0x53/0xa0
[ 3437.462701]        do_syscall_64+0x42/0xf0
[ 3437.462982]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.463359]
               -> #1 (&sb->s_type->i_mutex_key#15){+.+.}-{3:3}:
[ 3437.463843]        down_write+0x3b/0xc0
[ 3437.464223]        bch2_write_iter+0x5b/0xcc0 [bcachefs]
[ 3437.464493]        vfs_write+0x21b/0x4c0
[ 3437.464653]        ksys_write+0x69/0xf0
[ 3437.464839]        do_syscall_64+0x42/0xf0
[ 3437.465009]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.465231]
               -> #0 (sb_writers#10){.+.+}-{0:0}:
[ 3437.465471]        __lock_acquire+0x1455/0x21b0
[ 3437.465656]        lock_acquire+0xc6/0x2b0
[ 3437.465822]        mnt_want_write+0x46/0x1a0
[ 3437.465996]        filename_create+0x62/0x190
[ 3437.466175]        user_path_create+0x2d/0x50
[ 3437.466352]        bch2_fs_file_ioctl+0x2ec/0xc90 [bcachefs]
[ 3437.466617]        __x64_sys_ioctl+0x93/0xd0
[ 3437.466791]        do_syscall_64+0x42/0xf0
[ 3437.466957]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.467180]
               other info that might help us debug this:

[ 3437.469670] 2 locks held by bcachefs/35533:
               other info that might help us debug this:

[ 3437.467507] Chain exists of:
                 sb_writers#10 --> &c->snapshot_create_lock --> &type->s_umount_key#48

[ 3437.467979]  Possible unsafe locking scenario:

[ 3437.468223]        CPU0                    CPU1
[ 3437.468405]        ----                    ----
[ 3437.468585]   rlock(&type->s_umount_key#48);
[ 3437.468758]                                lock(&c->snapshot_create_lock);
[ 3437.469030]                                lock(&type->s_umount_key#48);
[ 3437.469291]   rlock(sb_writers#10);
[ 3437.469434]
                *** DEADLOCK ***

[ 3437.469670] 2 locks held by bcachefs/35533:
[ 3437.469838]  #0: ffffa0a02ce00a88 (&c->snapshot_create_lock){++++}-{3:3}, at: bch2_fs_file_ioctl+0x1e3/0xc90 [bcachefs]
[ 3437.470294]  #1: ffffa0a02b2b10e0 (&type->s_umount_key#48){.+.+}-{3:3}, at: bch2_fs_file_ioctl+0x232/0xc90 [bcachefs]
[ 3437.470744]
               stack backtrace:
[ 3437.470922] CPU: 7 PID: 35533 Comm: bcachefs Kdump: loaded Tainted: G            E      6.7.0-rc7-custom+ #85
[ 3437.471313] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
[ 3437.471694] Call Trace:
[ 3437.471795]  <TASK>
[ 3437.471884]  dump_stack_lvl+0x57/0x90
[ 3437.472035]  check_noncircular+0x132/0x150
[ 3437.472202]  __lock_acquire+0x1455/0x21b0
[ 3437.472369]  lock_acquire+0xc6/0x2b0
[ 3437.472518]  ? filename_create+0x62/0x190
[ 3437.472683]  ? lock_is_held_type+0x97/0x110
[ 3437.472856]  mnt_want_write+0x46/0x1a0
[ 3437.473025]  ? filename_create+0x62/0x190
[ 3437.473204]  filename_create+0x62/0x190
[ 3437.473380]  user_path_create+0x2d/0x50
[ 3437.473555]  bch2_fs_file_ioctl+0x2ec/0xc90 [bcachefs]
[ 3437.473819]  ? lock_acquire+0xc6/0x2b0
[ 3437.474002]  ? __fget_files+0x2a/0x190
[ 3437.474195]  ? __fget_files+0xbc/0x190
[ 3437.474380]  ? lock_release+0xc5/0x270
[ 3437.474567]  ? __x64_sys_ioctl+0x93/0xd0
[ 3437.474764]  ? __pfx_bch2_fs_file_ioctl+0x10/0x10 [bcachefs]
[ 3437.475090]  __x64_sys_ioctl+0x93/0xd0
[ 3437.475277]  do_syscall_64+0x42/0xf0
[ 3437.475454]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.475691] RIP: 0033:0x7f2743c313af
======================================================

In __bch2_ioctl_subvolume_create(), we grab s_umount unconditionally
and unlock it at the end of the function. There is a comment
"why do we need this lock?" about the lock coming from
commit 42d237320e98 ("bcachefs: Snapshot creation, deletion")
The reason is that __bch2_ioctl_subvolume_create() calls
sync_inodes_sb() which enforce locked s_umount to writeback all dirty
nodes before doing snapshot works.

Fix it by read locking s_umount for snapshotting only and unlocking
s_umount after sync_inodes_sb().

Signed-off-by: Su Yue <glass.su@suse.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: kvfree bch_fs::snapshots in bch2_fs_snapshots_exit
Su Yue [Tue, 16 Jan 2024 11:05:37 +0000 (19:05 +0800)]
bcachefs: kvfree bch_fs::snapshots in bch2_fs_snapshots_exit

bch_fs::snapshots is allocated by kvzalloc in __snapshot_t_mut.
It should be freed by kvfree not kfree.
Or umount will triger:

[  406.829178 ] BUG: unable to handle page fault for address: ffffe7b487148008
[  406.830676 ] #PF: supervisor read access in kernel mode
[  406.831643 ] #PF: error_code(0x0000) - not-present page
[  406.832487 ] PGD 0 P4D 0
[  406.832898 ] Oops: 0000 [#1] PREEMPT SMP PTI
[  406.833512 ] CPU: 2 PID: 1754 Comm: umount Kdump: loaded Tainted: G           OE      6.7.0-rc7-custom+ #90
[  406.834746 ] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
[  406.835796 ] RIP: 0010:kfree+0x62/0x140
[  406.836197 ] Code: 80 48 01 d8 0f 82 e9 00 00 00 48 c7 c2 00 00 00 80 48 2b 15 78 9f 1f 01 48 01 d0 48 c1 e8 0c 48 c1 e0 06 48 03 05 56 9f 1f 01 <48> 8b 50 08 48 89 c7 f6 c2 01 0f 85 b0 00 00 00 66 90 48 8b 07 f6
[  406.837810 ] RSP: 0018:ffffb9d641607e48 EFLAGS: 00010286
[  406.838213 ] RAX: ffffe7b487148000 RBX: ffffb9d645200000 RCX: ffffb9d641607dc4
[  406.838738 ] RDX: 000065bb00000000 RSI: ffffffffc0d88b84 RDI: ffffb9d645200000
[  406.839217 ] RBP: ffff9a4625d00068 R08: 0000000000000001 R09: 0000000000000001
[  406.839650 ] R10: 0000000000000001 R11: 000000000000001f R12: ffff9a4625d4da80
[  406.840055 ] R13: ffff9a4625d00000 R14: ffffffffc0e2eb20 R15: 0000000000000000
[  406.840451 ] FS:  00007f0a264ffb80(0000) GS:ffff9a4e2d500000(0000) knlGS:0000000000000000
[  406.840851 ] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  406.841125 ] CR2: ffffe7b487148008 CR3: 000000018c4d2000 CR4: 00000000000006f0
[  406.841464 ] Call Trace:
[  406.841583 ]  <TASK>
[  406.841682 ]  ? __die+0x1f/0x70
[  406.841828 ]  ? page_fault_oops+0x159/0x470
[  406.842014 ]  ? fixup_exception+0x22/0x310
[  406.842198 ]  ? exc_page_fault+0x1ed/0x200
[  406.842382 ]  ? asm_exc_page_fault+0x22/0x30
[  406.842574 ]  ? bch2_fs_release+0x54/0x280 [bcachefs]
[  406.842842 ]  ? kfree+0x62/0x140
[  406.842988 ]  ? kfree+0x104/0x140
[  406.843138 ]  bch2_fs_release+0x54/0x280 [bcachefs]
[  406.843390 ]  kobject_put+0xb7/0x170
[  406.843552 ]  deactivate_locked_super+0x2f/0xa0
[  406.843756 ]  cleanup_mnt+0xba/0x150
[  406.843917 ]  task_work_run+0x59/0xa0
[  406.844083 ]  exit_to_user_mode_prepare+0x197/0x1a0
[  406.844302 ]  syscall_exit_to_user_mode+0x16/0x40
[  406.844510 ]  do_syscall_64+0x4e/0xf0
[  406.844675 ]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
[  406.844907 ] RIP: 0033:0x7f0a2664e4fb

Signed-off-by: Su Yue <glass.su@suse.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: bios must be 512 byte algined
Kent Overstreet [Tue, 16 Jan 2024 16:38:04 +0000 (11:38 -0500)]
bcachefs: bios must be 512 byte algined

Fixes: 023f9ac9f70f bcachefs: Delete dio read alignment check
Reported-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: remove redundant variable tmp
Colin Ian King [Tue, 16 Jan 2024 11:07:23 +0000 (11:07 +0000)]
bcachefs: remove redundant variable tmp

The variable tmp is being assigned a value but it isn't being
read afterwards. The assignment is redundant and so tmp can be
removed.

Cleans up clang scan build warning:
warning: Although the value stored to 'ret' is used in the enclosing
expression, the value is never actually read from 'ret'
[deadcode.DeadStores]

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: Improve trace_trans_restart_relock
Kent Overstreet [Tue, 16 Jan 2024 01:40:06 +0000 (20:40 -0500)]
bcachefs: Improve trace_trans_restart_relock

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: Fix excess transaction restarts in __bchfs_fallocate()
Kent Overstreet [Tue, 16 Jan 2024 01:37:23 +0000 (20:37 -0500)]
bcachefs: Fix excess transaction restarts in __bchfs_fallocate()

drop_locks_do() should not be used in a fastpath without first trying
the do in nonblocking mode - the unlock and relock will cause excessive
transaction restarts and potentially livelocking with other threads that
are contending for the same locks.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: extents_to_bp_state
Kent Overstreet [Mon, 15 Jan 2024 23:19:52 +0000 (18:19 -0500)]
bcachefs: extents_to_bp_state

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: bkey_and_val_eq()
Kent Overstreet [Mon, 15 Jan 2024 23:08:32 +0000 (18:08 -0500)]
bcachefs: bkey_and_val_eq()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: Better journal tracepoints
Kent Overstreet [Mon, 15 Jan 2024 22:59:51 +0000 (17:59 -0500)]
bcachefs: Better journal tracepoints

Factor out bch2_journal_bufs_to_text(), and use it in the
journal_entry_full() tracepoint; when we can't get a journal reservation
we need to know the outstanding journal entry sizes to know if the
problem is due to excessive flushing.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: Print size of superblock with space allocated
Kent Overstreet [Mon, 15 Jan 2024 22:57:44 +0000 (17:57 -0500)]
bcachefs: Print size of superblock with space allocated

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: Avoid flushing the journal in the discard path
Kent Overstreet [Mon, 15 Jan 2024 22:56:22 +0000 (17:56 -0500)]
bcachefs: Avoid flushing the journal in the discard path

When issuing discards, we may need to flush the journal if there's too
many buckets that can't be discarded until a journal flush.

But the heuristic was bad; we should be comparing the number of buckets
that need to flushes against the number of free buckets, not the number
of buckets we saw.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: Improve move_extent tracepoint
Kent Overstreet [Mon, 15 Jan 2024 20:33:39 +0000 (15:33 -0500)]
bcachefs: Improve move_extent tracepoint

Also print out the data_opts, so that we can see what specifically is
being done to an extent.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: Add missing bch2_moving_ctxt_flush_all()
Kent Overstreet [Mon, 15 Jan 2024 20:06:43 +0000 (15:06 -0500)]
bcachefs: Add missing bch2_moving_ctxt_flush_all()

This fixes a bug with rebalance IOs getting stuck with reads completed,
but writes never being issued.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: Re-add move_extent_write tracepoint
Kent Overstreet [Mon, 15 Jan 2024 20:04:40 +0000 (15:04 -0500)]
bcachefs: Re-add move_extent_write tracepoint

It appears this was accidentally deleted at some point - also, do a bit
of cleanup.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: bch2_kthread_io_clock_wait() no longer sleeps until full amount
Kent Overstreet [Mon, 15 Jan 2024 19:15:26 +0000 (14:15 -0500)]
bcachefs: bch2_kthread_io_clock_wait() no longer sleeps until full amount

Drop t he loop in bch2_kthread_io_clock_wait(): this allows the code
that uses it to be woken up for other reasons, and fixes a bug where
rebalance wouldn't wake up when a scan was requested.

This raises the possibility of spurious wakeups, but callers should
always be able to handle that reasonably well.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: Add .val_to_text() for KEY_TYPE_cookie
Kent Overstreet [Mon, 15 Jan 2024 19:15:03 +0000 (14:15 -0500)]
bcachefs: Add .val_to_text() for KEY_TYPE_cookie

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: Don't pass memcmp() as a pointer
Kent Overstreet [Mon, 15 Jan 2024 19:12:43 +0000 (14:12 -0500)]
bcachefs: Don't pass memcmp() as a pointer

Some (buggy!) compilers have issues with this.

Fixes: https://github.com/koverstreet/bcachefs/issues/625
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agoMerge tag 'header_cleanup-2024-01-20' of https://evilpiepirate.org/git/bcachefs
Linus Torvalds [Sun, 21 Jan 2024 18:21:43 +0000 (10:21 -0800)]
Merge tag 'header_cleanup-2024-01-20' of https://evilpiepirate.org/git/bcachefs

Pull header fix from Kent Overstreet:
 "Just one small fixup for the RT build"

* tag 'header_cleanup-2024-01-20' of https://evilpiepirate.org/git/bcachefs:
  spinlock: Fix failing build for PREEMPT_RT

21 months agobcachefs: Reduce would_deadlock restarts
Kent Overstreet [Thu, 11 Jan 2024 04:47:04 +0000 (23:47 -0500)]
bcachefs: Reduce would_deadlock restarts

We don't have to take locks in any particular ordering - we'll make
forward progress just fine - but if we try to stick to an ordering, it
can help to avoid excessive would_deadlock transaction restarts.

This tweaks the reflink path to take extents btree locks in the right
order.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: bch2_trans_account_disk_usage_change()
Kent Overstreet [Sat, 11 Nov 2023 20:08:36 +0000 (15:08 -0500)]
bcachefs: bch2_trans_account_disk_usage_change()

The disk space accounting rewrite is splitting out accounting for each
replicas set - those are moving to btree keys, instead of percpu
counters.

This breaks bch2_trans_fs_usage_apply() up, splitting out the part we
will still need.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: bch_fs_usage_base
Kent Overstreet [Fri, 17 Nov 2023 05:03:45 +0000 (00:03 -0500)]
bcachefs: bch_fs_usage_base

Split out base filesystem usage into its own type; prep work for
breaking up bch2_trans_fs_usage_apply().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: bch2_prt_compression_type()
Kent Overstreet [Sun, 7 Jan 2024 02:01:47 +0000 (21:01 -0500)]
bcachefs: bch2_prt_compression_type()

bounds checking helper, since compression types are extensible

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: helpers for printing data types
Kent Overstreet [Sun, 7 Jan 2024 01:57:43 +0000 (20:57 -0500)]
bcachefs: helpers for printing data types

We need bounds checking since new versions may introduce new data types.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: BTREE_TRIGGER_ATOMIC
Kent Overstreet [Sun, 7 Jan 2024 22:14:46 +0000 (17:14 -0500)]
bcachefs: BTREE_TRIGGER_ATOMIC

Add a new flag to be explicit about when we're running atomic triggers.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: drop to_text code for obsolete bps in alloc keys
Kent Overstreet [Sun, 7 Jan 2024 00:47:09 +0000 (19:47 -0500)]
bcachefs: drop to_text code for obsolete bps in alloc keys

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: eytzinger_for_each() declares loop iter
Kent Overstreet [Sun, 7 Jan 2024 00:29:14 +0000 (19:29 -0500)]
bcachefs: eytzinger_for_each() declares loop iter

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: Don't log errors if BCH_WRITE_ALLOC_NOWAIT
Kent Overstreet [Thu, 11 Jan 2024 04:08:30 +0000 (23:08 -0500)]
bcachefs: Don't log errors if BCH_WRITE_ALLOC_NOWAIT

Previously, we added logging in the write path to ensure that any
unexpected errors getting reported to userspace have a log message; but
BCH_WRITE_ALLOC_NOWAIT is a special case, it's used for promotes where
errors are expected and not reported out to userspace - so we need to
silence those.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: fix memleak in bch2_split_devs
Su Yue [Mon, 8 Jan 2024 15:11:08 +0000 (23:11 +0800)]
bcachefs: fix memleak in bch2_split_devs

The pointer dev_name can be modified by strseq(),
then causes the memleak:

unreferenced object 0xffff9d08a2916c80 (size 32):
  comm "mount.bcachefs", pid 9090, jiffies 4295856224 (age 17.564s)
  hex dump (first 32 bytes):
    2f 64 65 76 2f 6d 61 70 70 65 72 2f 74 65 73 74  /dev/mapper/test
    2d 30 00 00 00 00 00 00 00 00 00 00 00 00 00 00  -0..............
  backtrace:
    [<00000000c5d3be7d>] __kmem_cache_alloc_node+0x1f3/0x2c0
    [<0000000052215d26>] __kmalloc_node_track_caller+0x51/0x150
    [<0000000069fea956>] kstrdup+0x32/0x60
    [<000000000877fcf1>] bch2_split_devs+0x3f/0x150 [bcachefs]
    [<000000007ee93204>] bch2_mount+0xcb/0x640 [bcachefs]
    [<000000002dd1e04b>] legacy_get_tree+0x30/0x60
    [<000000006afc31d3>] vfs_get_tree+0x28/0xf0
    [<000000007b0c538e>] path_mount+0x475/0xb60
    [<0000000092de5882>] __x64_sys_mount+0x105/0x140
    [<0000000054fc05d8>] do_syscall_64+0x42/0xf0
    [<00000000df584910>] entry_SYSCALL_64_after_hwframe+0x6e/0x76

Fix it by copy pointer dev_name at beginning and free the copied
pointer at end.

Signed-off-by: Su Yue <glass.su@suse.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agoMerge tag 'v6.8-rc-part2-smb-client' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Sun, 21 Jan 2024 00:48:07 +0000 (16:48 -0800)]
Merge tag 'v6.8-rc-part2-smb-client' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client updates from Steve French:
 "Various smb client fixes, including multichannel and for SMB3.1.1
  POSIX extensions:

   - debugging improvement (display start time for stats)

   - two reparse point handling fixes

   - various multichannel improvements and fixes

   - SMB3.1.1 POSIX extensions open/create parsing fix

   - retry (reconnect) improvement including new retrans mount parm, and
     handling of two additional return codes that need to be retried on

   - two minor cleanup patches and another to remove duplicate query
     info code

   - two documentation cleanup, and one reviewer email correction"

* tag 'v6.8-rc-part2-smb-client' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: update iface_last_update on each query-and-update
  cifs: handle servers that still advertise multichannel after disabling
  cifs: new mount option called retrans
  cifs: reschedule periodic query for server interfaces
  smb: client: don't clobber ->i_rdev from cached reparse points
  smb: client: get rid of smb311_posix_query_path_info()
  smb: client: parse owner/group when creating reparse points
  smb: client: fix parsing of SMB3.1.1 POSIX create context
  cifs: update known bugs mentioned in kernel docs for cifs
  cifs: new nt status codes from MS-SMB2
  cifs: pick channel for tcon and tdis
  cifs: open_cached_dir should not rely on primary channel
  smb3: minor documentation updates
  Update MAINTAINERS email address
  cifs: minor comment cleanup
  smb3: show beginning time for per share stats
  cifs: remove redundant variable tcon_exist

21 months agoMerge tag 'dmaengine-fix-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 20 Jan 2024 23:03:25 +0000 (15:03 -0800)]
Merge tag 'dmaengine-fix-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine

Pull dmaengine updates from Vinod Koul:
 "New support:
   - Loongson LS2X APB DMA controller
   - sf-pdma: mpfs-pdma support
   - Qualcomm X1E80100 GPI dma controller support

  Updates:
   - Xilinx XDMA updates to support interleaved DMA transfers
   - TI PSIL threads for AM62P and J722S and cfg register regions
     description
   - axi-dmac Improving the cyclic DMA transfers
   - Tegra Support dma-channel-mask property
   - Remaining platform remove callback returning void conversions

 Driver fixes for:
   - Xilinx xdma driver operator precedence and initialization fix
   - Excess kernel-doc warning fix in imx-sdma xilinx xdma drivers
   - format-overflow warning fix for rz-dmac, sh usb dmac drivers
   - 'output may be truncated' fix for shdma, fsl-qdma and dw-edma
     drivers"

* tag 'dmaengine-fix-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (58 commits)
  dmaengine: dw-edma: increase size of 'name' in debugfs code
  dmaengine: fsl-qdma: increase size of 'irq_name'
  dmaengine: shdma: increase size of 'dev_id'
  dmaengine: xilinx: xdma: Fix kernel-doc warnings
  dmaengine: usb-dmac: Avoid format-overflow warning
  dmaengine: sh: rz-dmac: Avoid format-overflow warning
  dmaengine: imx-sdma: fix Excess kernel-doc warnings
  dmaengine: xilinx: xdma: Fix initialization location of desc in xdma_channel_isr()
  dmaengine: xilinx: xdma: Fix operator precedence in xdma_prep_interleaved_dma()
  dmaengine: xilinx: xdma: statify xdma_prep_interleaved_dma
  dmaengine: xilinx: xdma: Workaround truncation compilation error
  dmaengine: pl330: issue_pending waits until WFP state
  dmaengine: xilinx: xdma: Implement interleaved DMA transfers
  dmaengine: xilinx: xdma: Prepare the introduction of interleaved DMA transfers
  dmaengine: xilinx: xdma: Add transfer error reporting
  dmaengine: xilinx: xdma: Add error checking in xdma_channel_isr()
  dmaengine: xilinx: xdma: Rework xdma_terminate_all()
  dmaengine: xilinx: xdma: Ease dma_pool alignment requirements
  dmaengine: xilinx: xdma: Add necessary macro definitions
  dmaengine: xilinx: xdma: Get rid of unused code
  ...

21 months agoMerge tag 'coccinelle-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawa...
Linus Torvalds [Sat, 20 Jan 2024 22:20:34 +0000 (14:20 -0800)]
Merge tag 'coccinelle-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux

Pull coccinelle updates from Julia Lawall:
 "Updates to the device_attr_show semantic patch to reflect the new
  guidelines of the Linux kernel documentation.

  The problem was identified by Li Zhijian <lizhijian@fujitsu.com>, who
  proposed an initial fix"

* tag 'coccinelle-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux:
  coccinelle: device_attr_show: simplify patch case
  coccinelle: device_attr_show: Adapt to the latest Documentation/filesystems/sysfs.rst

21 months agomedia: solo6x10: replace max(a, min(b, c)) by clamp(b, a, c)
Aurelien Jarno [Sat, 13 Jan 2024 18:33:31 +0000 (19:33 +0100)]
media: solo6x10: replace max(a, min(b, c)) by clamp(b, a, c)

This patch replaces max(a, min(b, c)) by clamp(b, a, c) in the solo6x10
driver.  This improves the readability and more importantly, for the
solo6x10-p2m.c file, this reduces on my system (x86-64, gcc 13):

 - the preprocessed size from 121 MiB to 4.5 MiB;

 - the build CPU time from 46.8 s to 1.6 s;

 - the build memory from 2786 MiB to 98MiB.

In fine, this allows this relatively simple C file to be built on a
32-bit system.

Reported-by: Jiri Slaby <jirislaby@gmail.com>
Closes: https://lore.kernel.org/lkml/18c6df0d-45ed-450c-9eda-95160a2bbb8e@gmail.com/
Cc: <stable@vger.kernel.org> # v6.7+
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: David Laight <David.Laight@ACULAB.COM>
Reviewed-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
21 months agococcinelle: device_attr_show: simplify patch case
Julia Lawall [Sat, 20 Jan 2024 20:56:11 +0000 (21:56 +0100)]
coccinelle: device_attr_show: simplify patch case

Replacing the final expression argument by ... allows the format
string to have multiple arguments.

It also has the advantage of allowing the change to be recognized as
a change in a single statement, thus avoiding adding unneeded braces.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
21 months agoexecve: open the executable file before doing anything else
Linus Torvalds [Tue, 9 Jan 2024 00:43:04 +0000 (16:43 -0800)]
execve: open the executable file before doing anything else

No point in allocating a new mm, counting arguments and environment
variables etc if we're just going to return ENOENT.

This patch does expose the fact that 'do_filp_open()' that execve() uses
is still unnecessarily expensive in the failure case, because it
allocates the 'struct file *' early, even if the path lookup (which is
heavily optimized) fails.

So that remains an unnecessary cost in the "no such executable" case,
but it's a separate issue.  Regardless, I do not want to do _both_ a
filename_lookup() and a later do_filp_open() like the origin patch by
Josh Triplett did in [1].

Reported-by: Josh Triplett <josh@joshtriplett.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lore.kernel.org/lkml/5c7333ea4bec2fad1b47a8fa2db7c31e4ffc4f14.1663334978.git.josh@joshtriplett.org/
Link: https://lore.kernel.org/lkml/202209161637.9EDAF6B18@keescook/
Link: https://lore.kernel.org/lkml/CAHk-=wgznerM-xs+x+krDfE7eVBiy_HOam35rbsFMMOwvYuEKQ@mail.gmail.com/
Link: https://lore.kernel.org/lkml/CAHk-=whf9qLO8ipps4QhmS0BkM8mtWJhvnuDSdtw5gFjhzvKNA@mail.gmail.com/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
21 months agoMerge tag 'riscv-for-linus-6.8-mw4' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 20 Jan 2024 19:06:04 +0000 (11:06 -0800)]
Merge tag 'riscv-for-linus-6.8-mw4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull more RISC-V updates from Palmer Dabbelt:

 - Support for tuning for systems with fast misaligned accesses.

 - Support for SBI-based suspend.

 - Support for the new SBI debug console extension.

 - The T-Head CMOs now use PA-based flushes.

 - Support for enabling the V extension in kernel code.

 - Optimized IP checksum routines.

 - Various ftrace improvements.

 - Support for archrandom, which depends on the Zkr extension.

 - The build is no longer broken under NET=n, KUNIT=y for ports that
   don't define their own ipv6 checksum.

* tag 'riscv-for-linus-6.8-mw4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (56 commits)
  lib: checksum: Fix build with CONFIG_NET=n
  riscv: lib: Check if output in asm goto supported
  riscv: Fix build error on rv32 + XIP
  riscv: optimize ELF relocation function in riscv
  RISC-V: Implement archrandom when Zkr is available
  riscv: Optimize hweight API with Zbb extension
  riscv: add dependency among Image(.gz), loader(.bin), and vmlinuz.efi
  samples: ftrace: Add RISC-V support for SAMPLE_FTRACE_DIRECT[_MULTI]
  riscv: ftrace: Add DYNAMIC_FTRACE_WITH_DIRECT_CALLS support
  riscv: ftrace: Make function graph use ftrace directly
  riscv: select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY
  lib/Kconfig.debug: Update AS_HAS_NON_CONST_LEB128 comment and name
  riscv: Restrict DWARF5 when building with LLVM to known working versions
  riscv: Hoist linker relaxation disabling logic into Kconfig
  kunit: Add tests for csum_ipv6_magic and ip_fast_csum
  riscv: Add checksum library
  riscv: Add checksum header
  riscv: Add static key for misaligned accesses
  asm-generic: Improve csum_fold
  RISC-V: selftests: cbo: Ensure asm operands match constraints
  ...

21 months agoMerge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Sat, 20 Jan 2024 17:42:32 +0000 (09:42 -0800)]
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "Final round of fixes that came in too late to send in the first
  request.

  It's nine bug fixes and one version update (because of a bug fix) and
  one set of PCI ID additions. There's one bug fix in the core which is
  really a one liner (except that an additional sdev pointer was added
  for convenience) and the rest are in drivers"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: target: core: Add TMF to tmr_list handling
  scsi: core: Kick the requeue list after inserting when flushing
  scsi: fnic: unlock on error path in fnic_queuecommand()
  scsi: fcoe: Fix unsigned comparison with zero in store_ctlr_mode()
  scsi: mpi3mr: Fix mpi3mr_fw.c kernel-doc warnings
  scsi: smartpqi: Bump driver version to 2.1.26-030
  scsi: smartpqi: Fix logical volume rescan race condition
  scsi: smartpqi: Add new controller PCI IDs
  scsi: ufs: qcom: Remove unnecessary goto statement from ufs_qcom_config_esi()
  scsi: ufs: core: Remove the ufshcd_hba_exit() call from ufshcd_async_scan()
  scsi: ufs: core: Simplify power management during async scan

21 months agoMerge tag 'sh-for-v6.8-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubit...
Linus Torvalds [Sat, 20 Jan 2024 17:24:06 +0000 (09:24 -0800)]
Merge tag 'sh-for-v6.8-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux

Pull sh updates from John Paul Adrian Glaubitz:
 "Since the large patch series to convert arch/sh to device tree support
  has not been finalized yet due to various maintainers still asking for
  changes to the series, this ended up being rather small consisting of
  just two fixes.

  The first patch by Geert Uytterhoeven addresses a build failure in the
  EcoVec platform code. And the second patch by Masahiro Yamada removes
  an unnecessary $(foreach ...) found in a Makefile of the vsyscall
  code.

   - Rename missed backlight field from fbdev to dev

   - Remove unnecessary $(foreach ...)"

* tag 'sh-for-v6.8-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux:
  sh: vsyscall: Remove unnecessary $(foreach ...)
  sh: ecovec24: Rename missed backlight field from fbdev to dev

21 months agoMerge tag 'fbdev-for-6.8-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 20 Jan 2024 17:14:04 +0000 (09:14 -0800)]
Merge tag 'fbdev-for-6.8-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev

Pull fbdev fix from Helge Deller:
 "There were various reports from people without any graphics output on
  the screen and it turns out one commit triggers the problem.

   - Revert 'firmware/sysfb: Clear screen_info state after consuming it'"

* tag 'fbdev-for-6.8-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
  Revert "firmware/sysfb: Clear screen_info state after consuming it"

21 months agoMerge tag 'perf-tools-for-v6.8-1-2024-01-09' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Fri, 19 Jan 2024 22:25:23 +0000 (14:25 -0800)]
Merge tag 'perf-tools-for-v6.8-1-2024-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools updates from Arnaldo Carvalho de Melo:
 "Add Namhyung Kim as tools/perf/ co-maintainer, we're taking turns
  processing patches, switching roles from perf-tools to perf-tools-next
  at each Linux release.

  Data profiling:

   - Associate samples that identify loads and stores with data
     structures. This uses events available on Intel, AMD and others and
     DWARF info:

       # To get memory access samples in kernel for 1 second (on Intel)
       $ perf mem record -a -K --ldlat=4 -- sleep 1

       # Similar for the AMD (but it requires 6.3+ kernel for BPF filters)
       $ perf mem record -a --filter 'mem_op == load || mem_op == store, ip > 0x8000000000000000' -- sleep 1

     Then, amongst several modes of post processing, one can do things like:

       $ perf report -s type,typeoff --hierarchy --group --stdio
       ...
       #
       # Samples: 10K of events 'cpu/mem-loads,ldlat=4/P, cpu/mem-stores/P, dummy:u'
       # Event count (approx.): 602758064
       #
       #                    Overhead  Data Type / Data Type Offset
       # ...........................  ............................
       #
           26.09%   3.28%   0.00%     long unsigned int
              26.09%   3.28%   0.00%     long unsigned int +0 (no field)
           18.48%   0.73%   0.00%     struct page
              10.83%   0.02%   0.00%     struct page +8 (lru.next)
               3.90%   0.28%   0.00%     struct page +0 (flags)
               3.45%   0.06%   0.00%     struct page +24 (mapping)
               0.25%   0.28%   0.00%     struct page +48 (_mapcount.counter)
               0.02%   0.06%   0.00%     struct page +32 (index)
               0.02%   0.00%   0.00%     struct page +52 (_refcount.counter)
               0.02%   0.01%   0.00%     struct page +56 (memcg_data)
               0.00%   0.01%   0.00%     struct page +16 (lru.prev)
           15.37%  17.54%   0.00%     (stack operation)
              15.37%  17.54%   0.00%     (stack operation) +0 (no field)
           11.71%  50.27%   0.00%     (unknown)
              11.71%  50.27%   0.00%     (unknown) +0 (no field)

       $ perf annotate --data-type
       ...
       Annotate type: 'struct cfs_rq' in [kernel.kallsyms] (13 samples):
       ============================================================================
           samples     offset       size  field
                13          0        640  struct cfs_rq         {
                 2          0         16      struct load_weight       load {
                 2          0          8          unsigned long        weight;
                 0          8          4          u32  inv_weight;
                                              };
                 0         16          8      unsigned long    runnable_weight;
                 0         24          4      unsigned int     nr_running;
                 1         28          4      unsigned int     h_nr_running;
       ...

       $ perf annotate --data-type=page --group
       Annotate type: 'struct page' in [kernel.kallsyms] (480 samples):
        event[0] = cpu/mem-loads,ldlat=4/P
        event[1] = cpu/mem-stores/P
        event[2] = dummy:u
       ===================================================================================
                samples  offset  size  field
       447  33        0       0    64  struct page     {
       108   8        0       0     8  long unsigned int  flags;
       319  13        0       8    40  union       {
       319  13        0       8    40          struct          {
       236   2        0       8    16              union       {
       236   2        0       8    16                  struct list_head       lru {
       236   1        0       8     8                      struct list_head*  next;
         0   1        0      16     8                      struct list_head*  prev;
                                                       };
       236   2        0       8    16                  struct          {
       236   1        0       8     8                      void*      __filler;
         0   1        0      16     4                      unsigned int       mlock_count;
                                                       };
       236   2        0       8    16                  struct list_head       buddy_list {
       236   1        0       8     8                      struct list_head*  next;
         0   1        0      16     8                      struct list_head*  prev;
                                                       };
       236   2        0       8    16                  struct list_head       pcp_list {
       236   1        0       8     8                      struct list_head*  next;
         0   1        0      16     8                      struct list_head*  prev;
                                                       };
                                                   };
        82   4        0      24     8              struct address_space*      mapping;
         1   7        0      32     8              union       {
         1   7        0      32     8                  long unsigned int      index;
         1   7        0      32     8                  long unsigned int      share;
                                                   };
         0   0        0      40     8              long unsigned int  private;
                                                                 };

     This uses the existing annotate code, calling objdump to do the
     disassembly, with improvements to avoid having this take too long,
     but longer term a switch to a disassembler library, possibly
     reusing code in the kernel will be pursued.

     This is the initial implementation, please use it and report
     impressions and bugs. Make sure the kernel-debuginfo packages match
     the running kernel. The 'perf report' phase for non short perf.data
     files may take a while.

     There is a great article about it on LWN:

       https://lwn.net/Articles/955709/ - "Data-type profiling for perf"

     One last test I did while writing this text, on a AMD Ryzen 5950X,
     using a distro kernel, while doing a simple 'find /' on an
     otherwise idle system resulted in:

     # uname -r
     6.6.9-100.fc38.x86_64
     # perf -vv | grep BPF_
                      bpf: [ on  ]  # HAVE_LIBBPF_SUPPORT
            bpf_skeletons: [ on  ]  # HAVE_BPF_SKEL
     # rpm -qa | grep kernel-debuginfo
     kernel-debuginfo-common-x86_64-6.6.9-100.fc38.x86_64
     kernel-debuginfo-6.6.9-100.fc38.x86_64
     #
     # perf mem record -a --filter 'mem_op == load || mem_op == store, ip > 0x8000000000000000'
     ^C[ perf record: Woken up 1 times to write data ]
     [ perf record: Captured and wrote 2.199 MB perf.data (2913 samples) ]
     #
     # ls -la perf.data
     -rw-------. 1 root root 2346486 Jan  9 18:36 perf.data
     # perf evlist
     ibs_op//
     dummy:u
     # perf evlist -v
     ibs_op//: type: 11, size: 136, config: 0, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT, read_format: ID, disabled: 1, inherit: 1, freq: 1, sample_id_all: 1
     dummy:u: type: 1 (PERF_TYPE_SOFTWARE), size: 136, config: 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|ADDR|CPU|IDENTIFIER|DATA_SRC|WEIGHT, read_format: ID, inherit: 1, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, task: 1, mmap_data: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
     #
     # perf report -s type,typeoff --hierarchy --group --stdio
     # Total Lost Samples: 0
     #
     # Samples: 2K of events 'ibs_op//, dummy:u'
     # Event count (approx.): 1904553038
     #
     #            Overhead  Data Type / Data Type Offset
     # ...................  ............................
     #
         73.70%   0.00%     (unknown)
            73.70%   0.00%     (unknown) +0 (no field)
          3.01%   0.00%     long unsigned int
             3.00%   0.00%     long unsigned int +0 (no field)
             0.01%   0.00%     long unsigned int +2 (no field)
          2.73%   0.00%     struct task_struct
             1.71%   0.00%     struct task_struct +52 (on_cpu)
             0.38%   0.00%     struct task_struct +2104 (rcu_read_unlock_special.b.blocked)
             0.23%   0.00%     struct task_struct +2100 (rcu_read_lock_nesting)
             0.14%   0.00%     struct task_struct +2384 ()
             0.06%   0.00%     struct task_struct +3096 (signal)
             0.05%   0.00%     struct task_struct +3616 (cgroups)
             0.05%   0.00%     struct task_struct +2344 (active_mm)
             0.02%   0.00%     struct task_struct +46 (flags)
             0.02%   0.00%     struct task_struct +2096 (migration_disabled)
             0.01%   0.00%     struct task_struct +24 (__state)
             0.01%   0.00%     struct task_struct +3956 (mm_cid_active)
             0.01%   0.00%     struct task_struct +1048 (cpus_ptr)
             0.01%   0.00%     struct task_struct +184 (se.group_node.next)
             0.01%   0.00%     struct task_struct +20 (thread_info.cpu)
             0.00%   0.00%     struct task_struct +104 (on_rq)
             0.00%   0.00%     struct task_struct +2456 (pid)
          1.36%   0.00%     struct module
             0.59%   0.00%     struct module +952 (kallsyms)
             0.42%   0.00%     struct module +0 (state)
             0.23%   0.00%     struct module +8 (list.next)
             0.12%   0.00%     struct module +216 (syms)
          0.95%   0.00%     struct inode
             0.41%   0.00%     struct inode +40 (i_sb)
             0.22%   0.00%     struct inode +0 (i_mode)
             0.06%   0.00%     struct inode +76 (i_rdev)
             0.06%   0.00%     struct inode +56 (i_security)
     <SNIP>

  perf top/report:

   - Don't ignore job control, allowing control+Z + bg to work.

   - Add s390 raw data interpretation for PAI (Processor Activity
     Instrumentation) counters.

  perf archive:

   - Add new option '--all' to pack perf.data with DSOs.

   - Add new option '--unpack' to expand tarballs.

  Initialization speedups:

   - Lazily initialize zstd streams to save memory when not using it.

   - Lazily allocate/size mmap event copy.

   - Lazy load kernel symbols in 'perf record'.

   - Be lazier in allocating lost samples buffer in 'perf record'.

   - Don't synthesize BPF events when disabled via the command line
     (perf record --no-bpf-event).

  Assorted improvements:

   - Show note on AMD systems that the :p, :pp, :ppp and :P are all the
     same, as IBS (Instruction Based Sampling) is used and it is
     inherentely precise, not having levels of precision like in Intel
     systems.

   - When 'cycles' isn't available, fall back to the "task-clock" event
     when not system wide, not to 'cpu-clock'.

   - Add --debug-file option to redirect debug output, e.g.:

       $ perf --debug-file /tmp/perf.log record -v true

   - Shrink 'struct map' to under one cacheline by avoiding function
     pointers for selecting if addresses are identity or DSO relative,
     and using just a byte for some boolean struct members.

   - Resolve the arch specific strerrno just once to use in
     perf_env__arch_strerrno().

   - Reduce memory for recording PERF_RECORD_LOST_SAMPLES event.

  Assorted fixes:

   - Fix the default 'perf top' usage on Intel hybrid systems, now it
     starts with a browser showing the number of samples for Efficiency
     (cpu_atom/cycles/P) and Performance (cpu_core/cycles/P). This
     behaviour is similar on ARM64, with its respective set of
     big.LITTLE processors.

   - Fix segfault on build_mem_topology() error path.

   - Fix 'perf mem' error on hybrid related to availability of mem event
     in a PMU.

   - Fix missing reference count gets (map, maps) in the db-export code.

   - Avoid recursively taking env->bpf_progs.lock in the 'perf_env'
     code.

   - Use the newly introduced maps__for_each_map() to add missing
     locking around iteration of 'struct map' entries.

   - Parse NOTE segments until the build id is found, don't stop on the
     first one, ELF files may have several such NOTE segments.

   - Remove 'egrep' usage, its deprecated, use 'grep -E' instead.

   - Warn first about missing libelf, not libbpf, that depends on
     libelf.

   - Use alternative to 'find ... -printf' as this isn't supported in
     busybox.

   - Address python 3.6 DeprecationWarning for string scapes.

   - Fix memory leak in uniq() in libsubcmd.

   - Fix man page formatting for 'perf lock'

   - Fix some spelling mistakes.

  perf tests:

   - Fail shell tests that needs some symbol in perf itself if it is
     stripped. These tests check if a symbol is resolved, if some hot
     function is indeed detected by profiling, etc.

   - The 'perf test sigtrap' test is currently failing on PREEMPT_RT,
     skip it if sleeping spinlocks are detected (using BTF) and point to
     the mailing list discussion about it. This test is also being
     skipped on several architectures (powerpc, s390x, arm and aarch64)
     due to other pending issues with intruction breakpoints.

   - Adjust test case perf record offcpu profiling tests for s390.

   - Fix 'Setup struct perf_event_attr' fails on s390 on z/VM guest,
     addressing issues caused by the fallback from cycles to task-clock
     done in this release.

   - Fix mask for VG register in the user-regs test.

   - Use shellcheck on 'perf test' shell scripts automatically to make
     sure changes don't introduce things it flags as problematic.

   - Add option to change objdump binary and allow it to be set via
     'perf config'.

   - Add basic 'perf script', 'perf list --json" and 'perf diff' tests.

   - Basic branch counter support.

   - Make DSO tests a suite rather than individual.

   - Remove atomics from test_loop to avoid test failures.

   - Fix call chain match on powerpc for the record+probe_libc_inet_pton
     test.

   - Improve Intel hybrid tests.

  Vendor event files (JSON):

  powerpc:

   - Update datasource event name to fix duplicate events on IBM's
     Power10.

   - Add PVN for HX-C2000 CPU with Power8 Architecture.

  Intel:

   - Alderlake/rocketlake metric fixes.

   - Update emeraldrapids events to v1.02.

   - Update icelakex events to v1.23.

   - Update sapphirerapids events to v1.17.

   - Add skx, clx, icx and spr upi bandwidth metric.

  AMD:

   - Add Zen 4 memory controller events.

  RISC-V:

   - Add StarFive Dubhe-80 and Dubhe-90 JSON files.
       https://www.starfivetech.com/en/site/cpu-u

   - Add T-HEAD C9xx JSON file.
       https://github.com/riscv-software-src/opensbi/blob/master/docs/platform/thead-c9xx.md

  ARM64:

   - Remove UTF-8 characters from cmn.json, that were causing build
     failure in some distros.

   - Add core PMU events and metrics for Ampere One X.

   - Rename Ampere One's BPU_FLUSH_MEM_FAULT to GPC_FLUSH_MEM_FAULT

  libperf:

   - Rename several perf_cpu_map constructor names to clarify what they
     really do.

   - Ditto for some other methods, coping with some issues in their
     semantics, like perf_cpu_map__empty() ->
     perf_cpu_map__has_any_cpu_or_is_empty().

   - Document perf_cpu_map__nr()'s behavior

  perf stat:

   - Exit if parse groups fails.

   - Combine the -A/--no-aggr and --no-merge options.

   - Fix help message for --metric-no-threshold option.

  Hardware tracing:

  ARM64 CoreSight:

   - Bump minimum OpenCSD version to ensure a bugfix is present.

   - Add 'T' itrace option for timestamp trace

   - Set start vm addr of exectable file to 0 and don't ignore first
     sample on the arm-cs-trace-disasm.py 'perf script'"

* tag 'perf-tools-for-v6.8-1-2024-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (179 commits)
  MAINTAINERS: Add Namhyung as tools/perf/ co-maintainer
  perf test: test case 'Setup struct perf_event_attr' fails on s390 on z/vm
  perf db-export: Fix missing reference count get in call_path_from_sample()
  perf tests: Add perf script test
  libsubcmd: Fix memory leak in uniq()
  perf TUI: Don't ignore job control
  perf vendor events intel: Update sapphirerapids events to v1.17
  perf vendor events intel: Update icelakex events to v1.23
  perf vendor events intel: Update emeraldrapids events to v1.02
  perf vendor events intel: Alderlake/rocketlake metric fixes
  perf x86 test: Add hybrid test for conflicting legacy/sysfs event
  perf x86 test: Update hybrid expectations
  perf vendor events amd: Add Zen 4 memory controller events
  perf stat: Fix hard coded LL miss units
  perf record: Reduce memory for recording PERF_RECORD_LOST_SAMPLES event
  perf env: Avoid recursively taking env->bpf_progs.lock
  perf annotate: Add --insn-stat option for debugging
  perf annotate: Add --type-stat option for debugging
  perf annotate: Support event group display
  perf annotate: Add --data-type option
  ...

21 months agoMerge tag 'strlcpy-removal-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 19 Jan 2024 21:49:16 +0000 (13:49 -0800)]
Merge tag 'strlcpy-removal-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull strlcpy removal from Kees Cook:
 "As promised, this is 'part 2' of the hardening tree, late in -rc1 now
  that all the other trees with strlcpy() removals have landed. One new
  user appeared (in bcachefs) but was a trivial refactor. The kernel is
  now free of the strlcpy() API!

   - Remove of the final (very recent) user of strlcpy() (in bcachefs)

   - Remove the strlcpy() API. Long live strscpy()"

* tag 'strlcpy-removal-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  string: Remove strlcpy()
  bcachefs: Replace strlcpy() with strscpy()

21 months agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Fri, 19 Jan 2024 21:36:15 +0000 (13:36 -0800)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "I think the main one is fixing the dynamic SCS patching when full LTO
  is enabled (clang was silently getting this horribly wrong), but it's
  all good stuff.

  Rob just pointed out that the fix to the workaround for erratum
  #2966298 might not be necessary, but in the worst case it's harmless
  and since the official description leaves a little to be desired here,
  I've left it in.

  Summary:

   - Fix shadow call stack patching with LTO=full

   - Fix voluntary preemption of the FPSIMD registers from assembly code

   - Fix workaround for A520 CPU erratum #2966298 and extend to A510

   - Fix SME issues that resulted in corruption of the register state

   - Minor fixes (missing includes, formatting)"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: Fix silcon-errata.rst formatting
  arm64/sme: Always exit sme_alloc() early with existing storage
  arm64/fpsimd: Remove spurious check for SVE support
  arm64/ptrace: Don't flush ZA/ZT storage when writing ZA via ptrace
  arm64: entry: simplify kernel_exit logic
  arm64: entry: fix ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD
  arm64: errata: Add Cortex-A510 speculative unprivileged load workaround
  arm64: Rename ARM64_WORKAROUND_2966298
  arm64: fpsimd: Bring cond_yield asm macro in line with new rules
  arm64: scs: Work around full LTO issue with dynamic SCS
  arm64: irq: include <linux/cpumask.h>

21 months agoMerge tag 'loongarch-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai...
Linus Torvalds [Fri, 19 Jan 2024 21:30:49 +0000 (13:30 -0800)]
Merge tag 'loongarch-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson

Pull LoongArch updates from Huacai Chen:

 - Raise minimum clang version to 18.0.0

 - Enable initial Rust support for LoongArch

 - Add built-in dtb support for LoongArch

 - Use generic interface to support crashkernel=X,[high,low]

 - Some bug fixes and other small changes

 - Update the default config file.

* tag 'loongarch-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (22 commits)
  MAINTAINERS: Add BPF JIT for LOONGARCH entry
  LoongArch: Update Loongson-3 default config file
  LoongArch: BPF: Prevent out-of-bounds memory access
  LoongArch: BPF: Support 64-bit pointers to kfuncs
  LoongArch: Fix definition of ftrace_regs_set_instruction_pointer()
  LoongArch: Use generic interface to support crashkernel=X,[high,low]
  LoongArch: Fix and simplify fcsr initialization on execve()
  LoongArch: Let cores_io_master cover the largest NR_CPUS
  LoongArch: Change SHMLBA from SZ_64K to PAGE_SIZE
  LoongArch: Add a missing call to efi_esrt_init()
  LoongArch: Parsing CPU-related information from DTS
  LoongArch: dts: DeviceTree for Loongson-2K2000
  LoongArch: dts: DeviceTree for Loongson-2K1000
  LoongArch: dts: DeviceTree for Loongson-2K0500
  LoongArch: Allow device trees be built into the kernel
  dt-bindings: interrupt-controller: loongson,liointc: Fix dtbs_check warning for interrupt-names
  dt-bindings: interrupt-controller: loongson,liointc: Fix dtbs_check warning for reg-names
  dt-bindings: loongarch: Add Loongson SoC boards compatibles
  dt-bindings: loongarch: Add CPU bindings for LoongArch
  LoongArch: Enable initial Rust support
  ...

21 months agoRevert "firmware/sysfb: Clear screen_info state after consuming it"
Helge Deller [Fri, 19 Jan 2024 20:47:15 +0000 (21:47 +0100)]
Revert "firmware/sysfb: Clear screen_info state after consuming it"

This reverts commit df67699c9cb0ceb70f6cc60630ca938c06773eda.

Jens Axboe reported a regression that his machine is failing to show a
console, or in fact anything, on current -git. There's no output and no
console after:

Loading Linux 6.7.0+ ...
Loading initial ramdisk ...

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Jens Axboe <axboe@kernel.dk>
21 months agoMerge tag 'devicetree-for-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 19 Jan 2024 21:00:45 +0000 (13:00 -0800)]
Merge tag 'devicetree-for-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull devicetree header detangling from Rob Herring:
 "Remove the circular including of of_device.h and of_platform.h along
  with all of their implicit includes.

  This is the culmination of several kernel cycles worth of fixing
  implicit DT includes throughout the tree"

* tag 'devicetree-for-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  of: Stop circularly including of_device.h and of_platform.h
  clk: qcom: gcc-x1e80100: Replace of_device.h with explicit includes
  thermal: loongson2: Replace of_device.h with explicit includes
  net: can: Use device_get_match_data()
  sparc: Use device_get_match_data()

21 months agococcinelle: device_attr_show: Adapt to the latest Documentation/filesystems/sysfs.rst
Li Zhijian [Fri, 19 Jan 2024 06:20:57 +0000 (14:20 +0800)]
coccinelle: device_attr_show: Adapt to the latest Documentation/filesystems/sysfs.rst

Adapt description, warning message and MODE=patch according to the latest
Documentation/filesystems/sysfs.rst:
> show() should only use sysfs_emit() or sysfs_emit_at() when formatting
> the value to be returned to user space.

After this patch:
When MODE=report,
 $ make coccicheck COCCI=scripts/coccinelle/api/device_attr_show.cocci M=drivers/hid/hid-picolcd_core.c MODE=report
 <...snip...>
 drivers/hid/hid-picolcd_core.c:304:8-16: WARNING: please use sysfs_emit or sysfs_emit_at
 drivers/hid/hid-picolcd_core.c:259:9-17: WARNING: please use sysfs_emit or sysfs_emit_at

When MODE=patch,
 $ make coccicheck COCCI=scripts/coccinelle/api/device_attr_show.cocci M=drivers/hid/hid-picolcd_core.c MODE=patch
 <...snip...>
 diff -u -p a/drivers/hid/hid-picolcd_core.c b/drivers/hid/hid-picolcd_core.c
 --- a/drivers/hid/hid-picolcd_core.c
 +++ b/drivers/hid/hid-picolcd_core.c
 @@ -255,10 +255,12 @@ static ssize_t picolcd_operation_mode_sh
  {
         struct picolcd_data *data = dev_get_drvdata(dev);

 -       if (data->status & PICOLCD_BOOTLOADER)
 -               return snprintf(buf, PAGE_SIZE, "[bootloader] lcd\n");
 -       else
 -               return snprintf(buf, PAGE_SIZE, "bootloader [lcd]\n");
 +       if (data->status & PICOLCD_BOOTLOADER) {
 +               return sysfs_emit(buf, "[bootloader] lcd\n");
 +       }
 +       else {
 +               return sysfs_emit(buf, "bootloader [lcd]\n");
 +       }
  }

  static ssize_t picolcd_operation_mode_store(struct device *dev,
 @@ -301,7 +303,7 @@ static ssize_t picolcd_operation_mode_de
  {
         struct picolcd_data *data = dev_get_drvdata(dev);

 -       return snprintf(buf, PAGE_SIZE, "hello world\n");
 +       return sysfs_emit(buf, "hello world\n");
  }

  static ssize_t picolcd_operation_mode_delay_store(struct device *dev,

CC: Julia Lawall <Julia.Lawall@inria.fr>
CC: Nicolas Palix <nicolas.palix@imag.fr>
CC: cocci@inria.fr
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
21 months agoMerge tag 'spi-fix-v6.8-merge-window' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 19 Jan 2024 20:50:09 +0000 (12:50 -0800)]
Merge tag 'spi-fix-v6.8-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi fix from Mark Brown:
 "One simple fix for the device unbind path in the Coldfire driver.

  A conversion to use a combined get/enable helper missed removing a
  disable"

* tag 'spi-fix-v6.8-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: coldfire-qspi: Remove an erroneous clk_disable_unprepare() from the remove function

21 months agoMerge tag 'sound-fix-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Fri, 19 Jan 2024 20:30:29 +0000 (12:30 -0800)]
Merge tag 'sound-fix-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "A collection of small fixes:

   - Lots of ASoC SOF fixes and related reworks

   - ASoC TAS codec fixes including DT updates

   - A few HD-audio quirks and regression fixes

   - Minor fixes for aloop, oxygen and scarlett2 mixer"

* tag 'sound-fix-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (23 commits)
  ALSA: hda/realtek: Enable headset mic on Lenovo M70 Gen5
  ALSA: hda/realtek: Enable mute/micmute LEDs and limit mic boost on HP ZBook
  ALSA: hda/relatek: Enable Mute LED on HP Laptop 15s-fq2xxx
  ASoC: SOF: ipc4-loader: remove the CPC check warnings
  ASoC: SOF: ipc4-pcm: remove log message for LLP
  ALSA: hda: generic: Remove obsolete call to ledtrig_audio_get
  ALSA: scarlett2: Fix yet more -Wformat-truncation warnings
  ALSA: hda: Properly setup HDMI stream
  ASoC: audio-graph-card2: fix index check on graph_parse_node_multi_nm()
  ASoC: SOF: icp3-dtrace: Revert "Fix wrong kfree() usage"
  ALSA: oxygen: Fix right channel of capture volume mixer
  ALSA: aloop: Introduce a function to get if access is interleaved mode
  ASoC: mediatek: sof-common: Add NULL check for normal_link string
  ASoC: mediatek: mt8195: Remove afe-dai component and rework codec link
  ASoC: mediatek: mt8192: Check existence of dai_name before dereferencing
  ASoC: Intel: bxt_rt298: Fix kernel ops due to COMP_DUMMY change
  ASoC: Intel: bxt_da7219_max98357a: Fix kernel ops due to COMP_DUMMY change
  ASoC: codecs: rtq9128: Fix TDM enable and DAI format control flow
  ASoC: codecs: rtq9128: Fix PM_RUNTIME usage
  ASoC: tas2781: Add tas2563 into driver
  ...

21 months agostring: Remove strlcpy()
Kees Cook [Thu, 18 Jan 2024 20:31:55 +0000 (12:31 -0800)]
string: Remove strlcpy()

With all the users of strlcpy() removed[1] from the kernel, remove the
API, self-tests, and other references. Leave mentions in Documentation
(about its deprecation), and in checkpatch.pl (to help migrate host-only
tools/ usage). Long live strscpy().

Link: https://github.com/KSPP/linux/issues/89
Cc: Azeem Shaikh <azeemshaikh38@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Joe Perches <joe@perches.com>
Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: linux-hardening@vger.kernel.org
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
21 months agoMerge tag 'drm-next-2024-01-19' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 19 Jan 2024 19:50:00 +0000 (11:50 -0800)]
Merge tag 'drm-next-2024-01-19' of git://anongit.freedesktop.org/drm/drm

Pull more drm fixes from Dave Airlie:
 "This is mostly amdgpu and xe fixes, with an amdkfd and nouveau fix
  thrown in.

  The amdgpu ones are just the usual couple of weeks of fixes. The xe
  ones are bunch of cleanups for the new xe driver, the fix you put in
  on the merge commit and the kconfig fix that was hiding the problem
  from me.

  amdgpu:
   - DSC fixes
   - DC resource pool fixes
   - OTG fix
   - DML2 fixes
   - Aux fix
   - GFX10 RLC firmware handling fix
   - Revert a broken workaround for SMU 13.0.2
   - DC writeback fix
   - Enable gfxoff when ROCm apps are active on gfx11 with the proper FW
     version

  amdkfd:
   - Fix dma-buf exports using GEM handles

  nouveau:
   - fix a unneeded WARN_ON triggering

  xe:
   - Fix for definition of wakeref_t
   - Fix for an error code aliasing
   - Fix for VM_UNBIND_ALL in the case there are no bound VMAs
   - Fixes for a number of __iomem address space mismatches reported by
     sparse
   - Fixes for the assignment of exec_queue priority
   - A Fix for skip_guc_pc not taking effect
   - Workaround for a build problem on GCC 11
   - A couple of fixes for error paths
   - Fix a Flat CCS compression metadata copy issue
   - Fix a misplace array bounds checking
   - Don't have display support depend on EXPERT (as discussed on IRC)"

* tag 'drm-next-2024-01-19' of git://anongit.freedesktop.org/drm/drm: (71 commits)
  nouveau/vmm: don't set addr on the fail path to avoid warning
  drm/amdgpu: Enable GFXOFF for Compute on GFX11
  drm/amd/display: Drop 'acrtc' and add 'new_crtc_state' NULL check for writeback requests.
  drm/amdgpu: revert "Adjust removal control flow for smu v13_0_2"
  drm/amdkfd: init drm_client with funcs hook
  drm/amd/display: Fix a switch statement in populate_dml_output_cfg_from_stream_state()
  drm/amdgpu: Fix the null pointer when load rlc firmware
  drm/amd/display: Align the returned error code with legacy DP
  drm/amd/display: Fix DML2 watermark calculation
  drm/amd/display: Clear OPTC mem select on disable
  drm/amd/display: Port DENTIST hang and TDR fixes to OTG disable W/A
  drm/amd/display: Add logging resource checks
  drm/amd/display: Init link enc resources in dc_state only if res_pool presents
  drm/amd/display: Fix late derefrence 'dsc' check in 'link_set_dsc_pps_packet()'
  drm/amd/display: Avoid enum conversion warning
  drm/amd/pm: Fix smuv13.0.6 current clock reporting
  drm/amd/pm: Add error log for smu v13.0.6 reset
  drm/amdkfd: Fix 'node' NULL check in 'svm_range_get_range_boundaries()'
  drm/amdgpu: drop exp hw support check for GC 9.4.3
  drm/amdgpu: move debug options init prior to amdgpu device init
  ...

21 months agoMerge tag 'for-v6.8-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux...
Linus Torvalds [Fri, 19 Jan 2024 19:34:19 +0000 (11:34 -0800)]
Merge tag 'for-v6.8-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply

Pull power supply and reset updates from Sebastian Reichel:
 "New features:
   - bq24190: Add support for BQ24296 charger

  Cleanups:
   - all reset drivers: Stop using module_platform_driver_probe()
   - gpio-restart: use devm_register_sys_off_handler
   - pwr-mlxbf: support graceful reboot
   - cw2015: correct time_to_empty units
   - qcom-battmgr: Fix driver initialization sequence
   - bq27xxx: Start/Stop delayed work in suspend/resume
   - minor cleanups and fixes"

* tag 'for-v6.8-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: (33 commits)
  power: supply: bq24190_charger: Fix "initializer element is not constant" error
  power: supply: bq24190_charger: Add support for BQ24296
  dt-bindings: power: supply: bq24190: Add BQ24296 compatible
  dt-bindings: power: reset: xilinx: Rename node names in examples
  power: supply: qcom_battmgr: Register the power supplies after PDR is up
  dt-bindings: power: reset: qcom-pon: fix inconsistent example
  power: supply: Fix null pointer dereference in smb2_probe
  power: reset: at91: Drop '__init' from at91_wakeup_status()
  power: supply: Use multiple MODULE_AUTHOR statements
  power: supply: Fix indentation and some other warnings
  power: reset: gpio-restart: Use devm_register_sys_off_handler()
  power: supply: bq256xx: fix some problem in bq256xx_hw_init
  power: supply: cw2015: correct time_to_empty units in sysfs
  power: reset: at91-sama5d2_shdwc: Convert to platform remove callback returning void
  power: reset: at91-reset: Convert to platform remove callback returning void
  power: reset: tps65086-restart: Convert to platform remove callback returning void
  power: reset: syscon-poweroff: Convert to platform remove callback returning void
  power: reset: rmobile-reset: Convert to platform remove callback returning void
  power: reset: restart-poweroff: Convert to platform remove callback returning void
  power: reset: regulator-poweroff: Convert to platform remove callback returning void
  ...

21 months agoMerge tag 'apparmor-pr-2024-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 19 Jan 2024 18:53:55 +0000 (10:53 -0800)]
Merge tag 'apparmor-pr-2024-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor

Pull AppArmor updates from John Johansen:
 "This adds a single feature, switch the hash used to check policy from
  sha1 to sha256

  There are fixes for two memory leaks, and refcount bug and a potential
  crash when a profile name is empty. Along with a couple minor code
  cleanups.

  Summary:

  Features
   - switch policy hash from sha1 to sha256

  Bug Fixes
   - Fix refcount leak in task_kill
   - Fix leak of pdb objects and trans_table
   - avoid crash when parse profie name is empty

  Cleanups
   - add static to stack_msg and nulldfa
   - more kernel-doc cleanups"

* tag 'apparmor-pr-2024-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
  apparmor: Fix memory leak in unpack_profile()
  apparmor: avoid crash when parsed profile name is empty
  apparmor: fix possible memory leak in unpack_trans_table
  apparmor: free the allocated pdb objects
  apparmor: Fix ref count leak in task_kill
  apparmor: cleanup network hook comments
  apparmor: add missing params to aa_may_ptrace kernel-doc comments
  apparmor: declare nulldfa as static
  apparmor: declare stack_msg as static
  apparmor: switch SECURITY_APPARMOR_HASH from sha1 to sha256

21 months agoMerge tag 'ceph-for-6.8-rc1' of https://github.com/ceph/ceph-client
Linus Torvalds [Fri, 19 Jan 2024 17:58:55 +0000 (09:58 -0800)]
Merge tag 'ceph-for-6.8-rc1' of https://github.com/ceph/ceph-client

Pull ceph updates from Ilya Dryomov:
 "Assorted CephFS fixes and cleanups with nothing standing out"

* tag 'ceph-for-6.8-rc1' of https://github.com/ceph/ceph-client:
  ceph: get rid of passing callbacks in __dentry_leases_walk()
  ceph: d_obtain_{alias,root}(ERR_PTR(...)) will do the right thing
  ceph: fix invalid pointer access if get_quota_realm return ERR_PTR
  ceph: remove duplicated code in ceph_netfs_issue_read()
  ceph: send oldest_client_tid when renewing caps
  ceph: rename create_session_open_msg() to create_session_full_msg()
  ceph: select FS_ENCRYPTION_ALGS if FS_ENCRYPTION
  ceph: fix deadlock or deadcode of misusing dget()
  ceph: try to allocate a smaller extent map for sparse read
  libceph: remove MAX_EXTENTS check for sparse reads
  ceph: reinitialize mds feature bit even when session in open
  ceph: skip reconnecting if MDS is not ready

21 months agoMerge tag 'xfs-6.8-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Linus Torvalds [Fri, 19 Jan 2024 17:57:08 +0000 (09:57 -0800)]
Merge tag 'xfs-6.8-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fix from Chandan Babu:

 - Fix per-inode space accounting bug

* tag 'xfs-6.8-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: fix backwards logic in xfs_bmap_alloc_account

21 months agoMerge tag '6.8-rc-smb-server-fixes-part2' of git://git.samba.org/ksmbd
Linus Torvalds [Fri, 19 Jan 2024 17:31:59 +0000 (09:31 -0800)]
Merge tag '6.8-rc-smb-server-fixes-part2' of git://git.samba.org/ksmbd

Pull more smb server updates from Steve French:

 - Fix for incorrect oplock break on directories when leases disabled

 - UAF fix for race between create and destroy of tcp connection

 - Important session setup SPNEGO fix

 - Update ksmbd feature status summary

* tag '6.8-rc-smb-server-fixes-part2' of git://git.samba.org/ksmbd:
  ksmbd: only v2 leases handle the directory
  ksmbd: fix UAF issue in ksmbd_tcp_new_connection()
  ksmbd: validate mech token in session setup
  ksmbd: update feature status in documentation

21 months agoMerge tag 'vfs-6.8.netfs' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs
Linus Torvalds [Fri, 19 Jan 2024 17:10:23 +0000 (09:10 -0800)]
Merge tag 'vfs-6.8.netfs' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs

Pull netfs updates from Christian Brauner:
 "This extends the netfs helper library that network filesystems can use
  to replace their own implementations. Both afs and 9p are ported. cifs
  is ready as well but the patches are way bigger and will be routed
  separately once this is merged. That will remove lots of code as well.

  The overal goal is to get high-level I/O and knowledge of the page
  cache and ouf of the filesystem drivers. This includes knowledge about
  the existence of pages and folios

  The pull request converts afs and 9p. This removes about 800 lines of
  code from afs and 300 from 9p. For 9p it is now possible to do writes
  in larger than a page chunks. Additionally, multipage folio support
  can be turned on for 9p. Separate patches exist for cifs removing
  another 2000+ lines. I've included detailed information in the
  individual pulls I took.

  Summary:

   - Add NFS-style (and Ceph-style) locking around DIO vs buffered I/O
     calls to prevent these from happening at the same time.

   - Support for direct and unbuffered I/O.

   - Support for write-through caching in the page cache.

   - O_*SYNC and RWF_*SYNC writes use write-through rather than writing
     to the page cache and then flushing afterwards.

   - Support for write-streaming.

   - Support for write grouping.

   - Skip reads for which the server could only return zeros or EOF.

   - The fscache module is now part of the netfs library and the
     corresponding maintainer entry is updated.

   - Some helpers from the fscache subsystem are renamed to mark them as
     belonging to the netfs library.

   - Follow-up fixes for the netfs library.

   - Follow-up fixes for the 9p conversion"

* tag 'vfs-6.8.netfs' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: (50 commits)
  netfs: Fix wrong #ifdef hiding wait
  cachefiles: Fix signed/unsigned mixup
  netfs: Fix the loop that unmarks folios after writing to the cache
  netfs: Fix interaction between write-streaming and cachefiles culling
  netfs: Count DIO writes
  netfs: Mark netfs_unbuffered_write_iter_locked() static
  netfs: Fix proc/fs/fscache symlink to point to "netfs" not "../netfs"
  netfs: Rearrange netfs_io_subrequest to put request pointer first
  9p: Use length of data written to the server in preference to error
  9p: Do a couple of cleanups
  9p: Fix initialisation of netfs_inode for 9p
  cachefiles: Fix __cachefiles_prepare_write()
  9p: Use netfslib read/write_iter
  afs: Use the netfs write helpers
  netfs: Export the netfs_sreq tracepoint
  netfs: Optimise away reads above the point at which there can be no data
  netfs: Implement a write-through caching option
  netfs: Provide a launder_folio implementation
  netfs: Provide a writepages implementation
  netfs, cachefiles: Pass upper bound length to allow expansion
  ...

21 months agocifs: update iface_last_update on each query-and-update
Shyam Prasad N [Wed, 3 Jan 2024 12:51:49 +0000 (12:51 +0000)]
cifs: update iface_last_update on each query-and-update

iface_last_update was an unused field when it was introduced.
Later, when we had periodic update of server interface list,
this field was used regularly to decide when to update next.

However, with the new logic of updating the interfaces, it
becomes crucial that this field be updated whenever
parse_server_interfaces runs successfully.

This change updates this field when either the server does
not support query of interfaces; so that we do not query
the interfaces repeatedly. It also updates the field when
the function reaches the end.

Fixes: aa45dadd34e4 ("cifs: change iface_list from array to sorted linked list")
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
21 months agocifs: handle servers that still advertise multichannel after disabling
Shyam Prasad N [Tue, 2 Jan 2024 13:14:46 +0000 (13:14 +0000)]
cifs: handle servers that still advertise multichannel after disabling

Some servers like Azure SMB servers always advertise multichannel
capability in server capabilities list. Such servers return error
STATUS_NOT_IMPLEMENTED for ioctl calls to query server interfaces,
and expect clients to consider that as a sign that they do not support
multichannel.

We already handled this at mount time. Soon after the tree connect,
we query server interfaces. And when server returned STATUS_NOT_IMPLEMENTED,
we kept interface list as empty. When cifs_try_adding_channels gets
called, it would not find any interfaces, so will not add channels.

For the case where an active multichannel mount exists, and multichannel
is disabled by such a server, this change will now allow the client
to disable secondary channels on the mount. It will check the return
status of query server interfaces call soon after a tree reconnect.
If the return status is EOPNOTSUPP, then instead of the check to add
more channels, we'll disable the secondary channels instead.

For better code reuse, this change also moves the common code for
disabling multichannel to a helper function.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
21 months agocifs: new mount option called retrans
Shyam Prasad N [Wed, 17 Jan 2024 06:09:16 +0000 (06:09 +0000)]
cifs: new mount option called retrans

We have several places in the code where we treat the
error -EAGAIN very differently. Some code retry for
arbitrary number of times.

Introducing this new mount option named "retrans", so
that all these handlers of -EAGAIN can retry a fixed
number of times. This applies only to soft mounts.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
21 months agocifs: reschedule periodic query for server interfaces
Shyam Prasad N [Wed, 3 Jan 2024 08:36:22 +0000 (08:36 +0000)]
cifs: reschedule periodic query for server interfaces

Today, we schedule periodic query for server interfaces
once every 10 minutes once a tree connection has been
established. Recent change to handle disabling of
multichannel disabled this delayed work.

This change reenables it following a reconnect, and
the server advertises multichannel.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>