www.infradead.org Git - users/griffoul/linux.git/log
12 months agobcachefs: kill redundant is_vmalloc_addr()
Kent Overstreet [Sun, 1 Sep 2024 19:09:11 +0000 (15:09 -0400)]
bcachefs: kill redundant is_vmalloc_addr()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
12 months agobcachefs: convert __bch2_encrypt_bio() to darray
Kent Overstreet [Sun, 1 Sep 2024 19:33:17 +0000 (15:33 -0400)]
bcachefs: convert __bch2_encrypt_bio() to darray

like the previous patch, kill use of bare arrays; the encryption code
likes to work in big batches, so this is a small performance
improvement.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
12 months agobcachefs: do_encrypt() now handles allocation failures
Kent Overstreet [Sun, 1 Sep 2024 19:24:11 +0000 (15:24 -0400)]
bcachefs: do_encrypt() now handles allocation failures

convert to darray, and add a fallback when allocation fails

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
12 months agobcachefs: Add pinned to btree cache not freed counters
Kent Overstreet [Sun, 1 Sep 2024 17:36:42 +0000 (13:36 -0400)]
bcachefs: Add pinned to btree cache not freed counters

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Annotate bch_replicas_entry_{v0,v1} with __counted_by()
Thorsten Blum [Mon, 26 Aug 2024 10:11:36 +0000 (12:11 +0200)]
bcachefs: Annotate bch_replicas_entry_{v0,v1} with __counted_by()

Add the __counted_by compiler attribute to the flexible array members
devs to improve access bounds-checking via CONFIG_UBSAN_BOUNDS and
CONFIG_FORTIFY_SOURCE.

Increment nr_devs before adding a new device to the devs array and
adjust the array indexes accordingly. Add a helper macro for adding a
new device.

In bch2_journal_read(), explicitly set nr_devs to 0.

Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
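
The ordering rule described above (bump the count, then store the element) can be sketched in userspace C. This is only an illustration: `demo_counted_by`, `struct demo_entry`, and `demo_entry_add_dev()` are hypothetical names, not the helper macro or structures from the patch, and the attribute expands to nothing on compilers that lack `counted_by`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's __counted_by(); it expands to
 * nothing when the compiler does not know the attribute. */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define demo_counted_by(member) __attribute__((counted_by(member)))
# endif
#endif
#ifndef demo_counted_by
# define demo_counted_by(member)
#endif

/* Hypothetical structure with a counted flexible array member. */
struct demo_entry {
	uint8_t nr_devs;
	uint8_t devs[] demo_counted_by(nr_devs);
};

/* calloc() zeroes nr_devs, so a fresh entry starts out empty. */
struct demo_entry *demo_entry_alloc(size_t capacity)
{
	return calloc(1, sizeof(struct demo_entry) + capacity);
}

/* Increment the count first, so the store below already lies inside
 * the bounds that the annotation advertises to the checker.
 * The caller must not add more devices than the allocated capacity. */
void demo_entry_add_dev(struct demo_entry *e, uint8_t dev)
{
	assert(e != NULL);
	e->nr_devs++;
	e->devs[e->nr_devs - 1] = dev;
}
```

Writing `devs[nr_devs]` before incrementing would index one past the currently advertised length, which is the access pattern a bounds checker is expected to flag once the annotation is in place.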
13 months agobcachefs: support idmap mounts
Hongbo Li [Sat, 24 Aug 2024 01:27:24 +0000 (09:27 +0800)]
bcachefs: support idmap mounts

We enable idmapped mounts for bcachefs. Here, we just pass down
the user_namespace argument from the VFS methods to the relevant
helpers.

The idmap test in bcachefs is as follows:

```
1. losetup /dev/loop1 bcachefs.img
2. ./bcachefs format /dev/loop1
3. mount -t bcachefs /dev/loop1 /mnt/bcachefs/
4. ./mount-idmapped --map-mount b:0:1000:1 /mnt/bcachefs /mnt/idmapped1/

ll /mnt/bcachefs
total 2
drwx------. 2 root root    0 Jun 14 14:10 lost+found
-rw-r--r--. 1 root root 1945 Jun 14 14:12 profile

ll /mnt/idmapped1/

total 2
drwx------. 2 1000 1000    0 Jun 14 14:10 lost+found
-rw-r--r--. 1 1000 1000 1945 Jun 14 14:12 profile
```

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Annotate struct bch_xattr with __counted_by()
Thorsten Blum [Sat, 24 Aug 2024 13:57:41 +0000 (15:57 +0200)]
bcachefs: Annotate struct bch_xattr with __counted_by()

Add the __counted_by compiler attribute to the flexible array member
x_name to improve access bounds-checking via CONFIG_UBSAN_BOUNDS and
CONFIG_FORTIFY_SOURCE.

Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Switch gc bucket array to a genradix
Kent Overstreet [Sat, 24 Aug 2024 15:38:21 +0000 (11:38 -0400)]
bcachefs: Switch gc bucket array to a genradix

A user with a 30 tb device is overflowing the INT_MAX limit on vmalloc
allocations...

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: darray: convert to alloc_hooks()
Kent Overstreet [Thu, 22 Aug 2024 07:50:22 +0000 (03:50 -0400)]
bcachefs: darray: convert to alloc_hooks()

better memory allocation profiling support

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Convert to use jiffies macros
Chen Yufan [Thu, 22 Aug 2024 02:57:31 +0000 (10:57 +0800)]
bcachefs: Convert to use jiffies macros

Use jiffies macros instead of using jiffies directly to handle wraparound.

Signed-off-by: Chen Yufan <chenyufan@vivo.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
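
As a rough userspace illustration of why the macros matter, the sketch below contrasts a plain comparison with the signed-difference idiom that the kernel's `time_after()` is built on. The `demo_*` names are hypothetical, and the cast relies on the wrapping unsigned-to-signed conversion that GCC and Clang provide.

```c
#include <limits.h>
#include <stdbool.h>

/* Same idea as the kernel's time_after(a, b): true when a is later
 * than b, as long as the two values are less than half the counter
 * range apart. The subtraction wraps, and the sign of the result
 * carries the ordering. */
bool demo_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Naive comparison, shown for contrast: it gives the wrong answer
 * once the counter has wrapped past ULONG_MAX. */
bool demo_naive_after(unsigned long a, unsigned long b)
{
	return a > b;
}

/* A timestamp taken shortly before the counter wraps. */
unsigned long demo_near_wrap(void)
{
	return ULONG_MAX - 5;
}
```

With `before = demo_near_wrap()` and `after = before + 10` (which wraps to a small number), `demo_time_after(after, before)` is true while `demo_naive_after(after, before)` is false.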
13 months agobcachefs: Refactor bch2_bset_fix_lookup_table
Alan Huang [Thu, 15 Aug 2024 15:40:53 +0000 (23:40 +0800)]
bcachefs: Refactor bch2_bset_fix_lookup_table

bch2_bset_fix_lookup_table is too complicated to be easily understood,
and the comment "l now > where" there is also incorrect when where ==
t->end_offset. This patch therefore refactors the function; the idea is
that when where >= rw_aux_tree(b, t)[t->size - 1].offset, we don't need
to adjust the rw aux tree.

Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Assert that we don't lock nodes when !trans->locked
Kent Overstreet [Sun, 30 Jun 2024 13:25:56 +0000 (09:25 -0400)]
bcachefs: Assert that we don't lock nodes when !trans->locked

We rely on the trans->locked to know if a trans has nodes locked for
assertions about deadlocks; there can't be more than one trans in the
same process that is locked.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Do not check folio_has_private()
Matthew Wilcox (Oracle) [Tue, 20 Aug 2024 04:10:11 +0000 (05:10 +0100)]
bcachefs: Do not check folio_has_private()

folio_has_private() is an attractive nuisance; filesystem authors
generally don't realise that it actually checks two flags (one of which
is never set by bcachefs).  There's no need to check the private flag at
all; for folios owned by bcachefs, we know that folio->private is NULL
when the private flag is clear and non-NULL when the private flag is set.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: bch2_time_stats_reset()
Kent Overstreet [Mon, 19 Aug 2024 19:33:38 +0000 (15:33 -0400)]
bcachefs: bch2_time_stats_reset()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Drop memalloc_nofs_save() in bch2_btree_node_mem_alloc()
Kent Overstreet [Mon, 19 Aug 2024 19:11:20 +0000 (15:11 -0400)]
bcachefs: Drop memalloc_nofs_save() in bch2_btree_node_mem_alloc()

It's really not needed: the only locks used here are the btree cache
lock, which we drop for GFP_WAIT allocations, and btree node locks - but
we also drop those for GFP_WAIT allocations.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Simplify bch2_xattr_emit() implementation
Youling Tang [Thu, 15 Aug 2024 08:57:44 +0000 (16:57 +0800)]
bcachefs: Simplify bch2_xattr_emit() implementation

Use helper functions to make code more readable.

Similar to commit a5488f29835c ("fs: simplify ->listxattr() implementation")

Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: drop unused posix acl handlers
Youling Tang [Thu, 15 Aug 2024 08:57:43 +0000 (16:57 +0800)]
bcachefs: drop unused posix acl handlers

Remove struct nop_posix_acl_{access,default} from bcachefs, which does
not depend on the xattr handler in its inode->i_op->listxattr() method
in any way. There's nothing more to do than to simply remove the
handler. It's been effectively unused ever since we introduced the new
posix acl api. See [1] for details.

Link [1]: https://patchwork.kernel.org/project/linux-fsdevel/cover/20230125-fs-acl-remove-generic-xattr-handlers-v3-0-f760cc58967d@kernel.org/

Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Remove unused parameter
Alan Huang [Wed, 14 Aug 2024 14:20:07 +0000 (22:20 +0800)]
bcachefs: Remove unused parameter

iter here is unused, remove it.

Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Remove the prev array stuff
Alan Huang [Mon, 12 Aug 2024 09:04:04 +0000 (17:04 +0800)]
bcachefs: Remove the prev array stuff

After reducing the search range when building the aux tree, the prev array
stuff is no longer useful, so remove it.

Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Minimize the search range used to calculate the mantissa
Alan Huang [Mon, 12 Aug 2024 08:06:09 +0000 (16:06 +0800)]
bcachefs: Minimize the search range used to calculate the mantissa

When the search key's mantissa is larger than the node i's, we know that
the search key is larger than the first key of the cacheline corresponding
to node i, so that when we are calculating the mantissa of right side
nodes of node i, the left side of the search range can be the first key
of node i. Once the search range is minimized, the mantissa we are
calculating can have more useful bits, thus reducing slow path
comparisons. Besides, we can now remove all the prev array stuff.

Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Convert open-coded extra computation to helper
Alan Huang [Sat, 10 Aug 2024 16:11:46 +0000 (00:11 +0800)]
bcachefs: Convert open-coded extra computation to helper

This patch replaces the open-coded extra computation with eytzinger1_extra.

Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Remove dead code in __build_ro_aux_tree
Alan Huang [Sat, 10 Aug 2024 15:51:40 +0000 (23:51 +0800)]
bcachefs: Remove dead code in __build_ro_aux_tree

This logic is no longer useful since commit
3ce8b463e3e0 ("bcachefs: kill bset_tree->max_key"), so remove it.

Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Remove unused parameter of bkey_mantissa_bits_dropped
Alan Huang [Sat, 10 Aug 2024 16:52:25 +0000 (00:52 +0800)]
bcachefs: Remove unused parameter of bkey_mantissa_bits_dropped

The idx parameter of bkey_mantissa_bits_dropped is unused, remove it.

Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Remove unused parameter of bkey_mantissa
Alan Huang [Sat, 10 Aug 2024 16:52:24 +0000 (00:52 +0800)]
bcachefs: Remove unused parameter of bkey_mantissa

The idx parameter of bkey_mantissa has been unused since commit
b904a7991802 ("bcachefs: Go back to 16 bit mantissa bkey floats"),
so remove it.

Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: bch2_sb_nr_devices()
Kent Overstreet [Thu, 8 Aug 2024 15:40:47 +0000 (11:40 -0400)]
bcachefs: bch2_sb_nr_devices()

factoring out a helper

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: trivial open_bucket_add_buckets() cleanup
Kent Overstreet [Wed, 7 Aug 2024 19:44:57 +0000 (15:44 -0400)]
bcachefs: trivial open_bucket_add_buckets() cleanup

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Fix a spelling error in docs
Xiaxi Shen [Wed, 7 Aug 2024 07:10:05 +0000 (00:10 -0700)]
bcachefs: Fix a spelling error in docs

Signed-off-by: Xiaxi Shen <shenxiaxi26@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: promote_whole_extents is now a normal option
Kent Overstreet [Thu, 1 Aug 2024 03:56:04 +0000 (23:56 -0400)]
bcachefs: promote_whole_extents is now a normal option

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Move rebalance_status out of sysfs/internal
Kent Overstreet [Thu, 1 Aug 2024 03:39:49 +0000 (23:39 -0400)]
bcachefs: Move rebalance_status out of sysfs/internal

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: remove the unused parameter in macro bkey_crc_next
Julian Sun [Sun, 21 Jul 2024 12:55:20 +0000 (08:55 -0400)]
bcachefs: remove the unused parameter in macro bkey_crc_next

In the macro definition of bkey_crc_next, five parameters
were accepted, but only four of them were used. Let's remove
the unused one.

The patch has only passed compilation tests, but it should be fine.

Signed-off-by: Julian Sun <sunjunchao2870@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: fix macro definition allocate_dropping_locks
Julian Sun [Sun, 21 Jul 2024 12:45:47 +0000 (08:45 -0400)]
bcachefs: fix macro definition allocate_dropping_locks

The macro allocate_dropping_locks accepts a parameter _trans,
but it was not used, rather the variable trans was directly used,
which may be a local variable inside a function that calls the macros.

Signed-off-by: Julian Sun <sunjunchao2870@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: fix macro definition allocate_dropping_locks_errcode
Julian Sun [Sun, 21 Jul 2024 12:44:24 +0000 (08:44 -0400)]
bcachefs: fix macro definition allocate_dropping_locks_errcode

The macro allocate_dropping_locks_errcode accepts a parameter _trans,
but it was not used, rather the variable trans was directly used,
which may be a local variable inside a function that calls the macros.

Signed-off-by: Julian Sun <sunjunchao2870@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
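
The class of bug fixed by the two allocate_dropping_locks patches can be reproduced with a small, self-contained sketch. `struct demo_trans` and the `demo_restart_*` macros are hypothetical stand-ins, not the bcachefs definitions.

```c
/* Hypothetical transaction object with a single counter. */
struct demo_trans {
	int restarts;
};

/* Buggy: the macro ignores its _trans parameter and instead captures
 * whatever variable named 'trans' is in scope at the call site. */
#define demo_restart_buggy(_trans)	((trans)->restarts++)

/* Fixed: the macro only touches the object it was handed. */
#define demo_restart_fixed(_trans)	((_trans)->restarts++)

/* The caller has a local named 'trans' but passes 'other'. With the
 * buggy macro the update silently lands on 'trans' instead. */
void demo_bump(struct demo_trans *trans, struct demo_trans *other,
	       int use_fixed)
{
	if (use_fixed)
		demo_restart_fixed(other);
	else
		demo_restart_buggy(other);
}
```

The buggy form compiles cleanly whenever a `trans` happens to be in scope, which is why this kind of mistake can go unnoticed until a caller passes a differently named variable.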
13 months agobcachefs: remove the unused macro definition
Julian Sun [Sun, 21 Jul 2024 12:43:24 +0000 (08:43 -0400)]
bcachefs: remove the unused macro definition

macro bch2_kthread_wait_event_ioclock_timeout is no longer used,
let's remove it.

The patch has passed compilation test.

Signed-off-by: Julian Sun <sunjunchao2870@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: quota_reserve_range() -> for_each_btree_key_in_subvolume_upto
Kent Overstreet [Wed, 17 Jul 2024 17:24:28 +0000 (13:24 -0400)]
bcachefs: quota_reserve_range() -> for_each_btree_key_in_subvolume_upto

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: bch2_folio_set() -> for_each_btree_key_in_subvolume_upto
Kent Overstreet [Wed, 17 Jul 2024 17:34:35 +0000 (13:34 -0400)]
bcachefs: bch2_folio_set() -> for_each_btree_key_in_subvolume_upto

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: range_has_data() -> for_each_btree_key_in_subvolume_upto
Kent Overstreet [Wed, 17 Jul 2024 17:30:23 +0000 (13:30 -0400)]
bcachefs: range_has_data() -> for_each_btree_key_in_subvolume_upto

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: bch2_seek_hole() -> for_each_btree_key_in_subvolume_upto
Kent Overstreet [Wed, 17 Jul 2024 17:28:23 +0000 (13:28 -0400)]
bcachefs: bch2_seek_hole() -> for_each_btree_key_in_subvolume_upto

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: bch2_seek_data() -> for_each_btree_key_in_subvolume_upto
Kent Overstreet [Wed, 17 Jul 2024 17:26:54 +0000 (13:26 -0400)]
bcachefs: bch2_seek_data() -> for_each_btree_key_in_subvolume_upto

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: bch2_xattr_list() -> for_each_btree_key_in_subvolume_upto
Kent Overstreet [Wed, 17 Jul 2024 17:24:28 +0000 (13:24 -0400)]
bcachefs: bch2_xattr_list() -> for_each_btree_key_in_subvolume_upto

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: bch2_readdir() -> for_each_btree_key_in_subvolume_upto
Kent Overstreet [Wed, 17 Jul 2024 17:24:28 +0000 (13:24 -0400)]
bcachefs: bch2_readdir() -> for_each_btree_key_in_subvolume_upto

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: for_each_btree_key_in_subvolume_upto()
Kent Overstreet [Wed, 17 Jul 2024 16:59:51 +0000 (12:59 -0400)]
bcachefs: for_each_btree_key_in_subvolume_upto()

New helper for looping over keys in a given subvolume

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: bch2_fiemap(): call trans_begin() on every loop iter
Kent Overstreet [Wed, 17 Jul 2024 15:50:54 +0000 (11:50 -0400)]
bcachefs: bch2_fiemap(): call trans_begin() on every loop iter

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: bchfs_read(): call trans_begin() on every loop iter
Kent Overstreet [Wed, 17 Jul 2024 15:47:01 +0000 (11:47 -0400)]
bcachefs: bchfs_read(): call trans_begin() on every loop iter

Same as the recent change for __bch2_read(); also, kill now unnecessary
btree_trans_too_many_iters() calls.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: kill bch2_btree_iter_peek_and_restart()
Kent Overstreet [Wed, 17 Jul 2024 15:42:11 +0000 (11:42 -0400)]
bcachefs: kill bch2_btree_iter_peek_and_restart()

dead code

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Btree path tracepoints
Kent Overstreet [Wed, 10 Aug 2022 23:57:46 +0000 (19:57 -0400)]
bcachefs: Btree path tracepoints

Fastpath tracepoints, rarely needed, only enabled with
CONFIG_BCACHEFS_PATH_TRACEPOINTS.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Add check for btree_path ref overflow
Kent Overstreet [Tue, 16 Jul 2024 21:23:10 +0000 (17:23 -0400)]
bcachefs: Add check for btree_path ref overflow

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Mark bch_inode_info as SLAB_ACCOUNT
Youling Tang [Wed, 3 Jul 2024 07:09:55 +0000 (15:09 +0800)]
bcachefs: Mark bch_inode_info as SLAB_ACCOUNT

After commit 230e9fc28604 ("slab: add SLAB_ACCOUNT flag"), we need to mark
the inode cache as SLAB_ACCOUNT, similar to commit 5d097056c9a0 ("kmemcg:
account for certain kmem allocations to memcg")

Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: allocate inode by using alloc_inode_sb()
Youling Tang [Tue, 16 Jul 2024 02:58:16 +0000 (10:58 +0800)]
bcachefs: allocate inode by using alloc_inode_sb()

The inode allocation is supposed to use alloc_inode_sb(), so convert
kmem_cache_alloc() to alloc_inode_sb().

It will also fix [1] to avoid the NULL pointer dereference BUG in
list_lru_add() when CONFIG_MEMCG is enabled.

Links:
[1]: https://lore.kernel.org/all/20589721-46c0-4344-b2ef-6ab48bbe2ea5@linux.dev/
[2]: https://lore.kernel.org/all/7db60e36-9c96-4938-a28d-a9745e287386@linux.dev/

Fixes: 86d81ec5f5f0 ("bcachefs: Mark bch_inode_info as SLAB_ACCOUNT")
Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Opt_durability can now be set via bch2_opt_set_sb()
Kent Overstreet [Mon, 15 Jul 2024 23:54:51 +0000 (19:54 -0400)]
bcachefs: Opt_durability can now be set via bch2_opt_set_sb()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: bch2_opt_set_sb() can now set (some) device options
Kent Overstreet [Mon, 15 Jul 2024 23:26:46 +0000 (19:26 -0400)]
bcachefs: bch2_opt_set_sb() can now set (some) device options

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: data_allowed is now an opts.h option
Kent Overstreet [Mon, 15 Jul 2024 20:53:49 +0000 (16:53 -0400)]
bcachefs: data_allowed is now an opts.h option

need this so cmd_option in userspace can handle it

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Annotate struct bucket_array with __counted_by()
Thorsten Blum [Wed, 21 Aug 2024 16:29:22 +0000 (18:29 +0200)]
bcachefs: Annotate struct bucket_array with __counted_by()

Add the __counted_by compiler attribute to the flexible array member
bucket to improve access bounds-checking via CONFIG_UBSAN_BOUNDS and
CONFIG_FORTIFY_SOURCE.

Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Fix format specifier in bch2_btree_key_cache_to_text()
Nathan Chancellor [Wed, 21 Aug 2024 20:43:03 +0000 (13:43 -0700)]
bcachefs: Fix format specifier in bch2_btree_key_cache_to_text()

When building for a 32-bit architecture, for which 'size_t' is
'unsigned int', there is a compiler warning due to use of '%lu':

  In file included from fs/bcachefs/vstructs.h:5,
                   from fs/bcachefs/bcachefs_format.h:80,
                   from fs/bcachefs/bcachefs.h:207,
                   from fs/bcachefs/btree_key_cache.c:3:
  fs/bcachefs/btree_key_cache.c: In function 'bch2_btree_key_cache_to_text':
  fs/bcachefs/btree_key_cache.c:795:25: error: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'size_t' {aka 'unsigned int'} [-Werror=format=]
    795 |         prt_printf(out, "pending:\t%lu\r\n",            per_cpu_sum(bc->nr_pending));
        |                         ^~~~~~~~~~~~~~~~~~~
  fs/bcachefs/util.h:78:63: note: in definition of macro 'prt_printf'
     78 | #define prt_printf(_out, ...)           bch2_prt_printf(_out, __VA_ARGS__)
        |                                                               ^~~~~~~~~~~
  fs/bcachefs/btree_key_cache.c:795:38: note: format string is defined here
    795 |         prt_printf(out, "pending:\t%lu\r\n",            per_cpu_sum(bc->nr_pending));
        |                                    ~~^
        |                                      |
        |                                      long unsigned int
        |                                    %u
  cc1: all warnings being treated as errors

Use the proper specifier, '%zu', to resolve the warning.

Fixes: e447e49977b8 ("bcachefs: key cache can now allocate from pending")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
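
A minimal userspace sketch of the portable form, using standard `snprintf()` rather than the bcachefs `prt_printf()` helper; `demo_format_pending()` is a hypothetical name.

```c
#include <stddef.h>
#include <stdio.h>

/* '%zu' matches size_t on both 32-bit and 64-bit targets, whereas
 * '%lu' is only correct where size_t happens to be unsigned long. */
int demo_format_pending(char *buf, size_t len, size_t nr_pending)
{
	return snprintf(buf, len, "pending:\t%zu\n", nr_pending);
}
```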
13 months agobcachefs: key cache can now allocate from pending
Kent Overstreet [Thu, 13 Jun 2024 19:35:47 +0000 (15:35 -0400)]
bcachefs: key cache can now allocate from pending

btree_trans objects can hold the btree_trans_barrier srcu read lock for
an extended amount of time (they shouldn't, but it's difficult to
guarantee).

the srcu barrier blocks memory reclaim, so to avoid too many stranded
key cache items, this uses the new pending_rcu_items to allocate from
pending items - like we did before, but now without a global lock on the
key cache.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Rip out freelists from btree key cache
Kent Overstreet [Sun, 9 Jun 2024 02:32:40 +0000 (22:32 -0400)]
bcachefs: Rip out freelists from btree key cache

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: rcu_pending now works in userspace
Kent Overstreet [Fri, 23 Aug 2024 22:21:31 +0000 (18:21 -0400)]
bcachefs: rcu_pending now works in userspace

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: rcu_pending
Kent Overstreet [Tue, 11 Jun 2024 00:47:03 +0000 (20:47 -0400)]
bcachefs: rcu_pending

Generic data structure for explicitly tracking pending RCU items,
allowing items to be dequeued (i.e. allocate from items pending
freeing). Works with conventional RCU and SRCU, and possibly other RCU
flavors in the future, meaning this can serve as a more generic
replacement for SLAB_TYPESAFE_BY_RCU.

Pending items are tracked in radix trees; if memory allocation fails, we
fall back to linked lists.

A rcu_pending is initialized with a callback, which is invoked when
pending items' grace periods have expired. Two types of callback
processing are handled specially:

- RCU_PENDING_KVFREE_FN

  New backend for kvfree_rcu(). Slightly faster, and eliminates the
  synchronize_rcu() slowpath in kvfree_rcu_mightsleep() - instead, an
  rcu_head is allocated if we don't have one and can't use the radix
  tree

  TODO:
  - add a shrinker (as in the existing kvfree_rcu implementation) so that
    memory reclaim can free expired objects if callback processing isn't
    keeping up, and to expedite a grace period if we're under memory
    pressure and too much memory is stranded by RCU

  - add a counter for amount of memory pending

- RCU_PENDING_CALL_RCU_FN

  Accelerated backend for call_rcu() - pending callbacks are tracked in
  a radix tree to eliminate linked list overhead.

These serve as replacement backends for kvfree_rcu() and call_rcu(); they
may be of interest to other users (e.g. SLAB_TYPESAFE_BY_RCU users).

Note:

Internally, we're using a single rearming call_rcu() callback for
notifications from the core RCU subsystem when objects are ready to be
processed.

Ideally we would be getting a callback every time a grace period
completes for which we have objects, but that would require multiple
rcu_heads in flight, and since the number of gp sequence numbers with
uncompleted callbacks is not bounded, we can't do that yet.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agolib/generic-radix-tree.c: add preallocation
Kent Overstreet [Sun, 11 Aug 2024 03:14:44 +0000 (23:14 -0400)]
lib/generic-radix-tree.c: add preallocation

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agolib/generic-radix-tree.c: genradix_ptr_inlined()
Kent Overstreet [Mon, 17 Jun 2024 23:00:33 +0000 (19:00 -0400)]
lib/generic-radix-tree.c: genradix_ptr_inlined()

Provide an inlined fast path

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Fix deadlock in __wait_on_freeing_inode()
Kent Overstreet [Fri, 16 Aug 2024 16:31:53 +0000 (12:31 -0400)]
bcachefs: Fix deadlock in __wait_on_freeing_inode()

We can't call __wait_on_freeing_inode() with btree locks held; we're
waiting on another thread that's in evict(), and before it clears that
bit it needs to write that inode to flush timestamps - deadlock.

Fixing this involves a fair amount of re-jiggering to plumb a new
transaction restart.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: switch to rhashtable for vfs inodes hash
Kent Overstreet [Sun, 9 Jun 2024 01:41:01 +0000 (21:41 -0400)]
bcachefs: switch to rhashtable for vfs inodes hash

the standard vfs inode hash table suffers from painful lock contention -
this is long overdue

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agoinode: make __iget() a static inline
Kent Overstreet [Thu, 8 Aug 2024 15:18:21 +0000 (11:18 -0400)]
inode: make __iget() a static inline

bcachefs is switching to an rhashtable for vfs inodes instead of the
standard inode.c hashtable, so we need this exported, or - a static
inline makes more sense for a single atomic_inc().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Replace div_u64 with div64_u64 where second param is u64
Reed Riley [Thu, 5 Sep 2024 16:59:29 +0000 (16:59 +0000)]
bcachefs: Replace div_u64 with div64_u64 where second param is u64

Bcachefs often uses this function to divide by nanosecond times - which
can easily cause problems when cast to u32.  For example, `cat
/sys/fs/bcachefs/*/internal/rebalance_status` would return invalid data
in the `duration waited` field because dividing by the number of
nanoseconds in a minute requires the divisor parameter to be u64.

Signed-off-by: Reed Riley <reed@riley.engineer>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Fix sysfs rebalance duration waited formatting
Feiko Nanninga [Sun, 1 Sep 2024 17:08:05 +0000 (19:08 +0200)]
bcachefs: Fix sysfs rebalance duration waited formatting

cat /sys/fs/bcachefs/*/internal/rebalance_status
waiting
  io wait duration:  13.5 GiB
  io wait remaining: 627 MiB
  duration waited:   1392 m

duration waited was increasing at a rate of about 14 times the expected
rate.

div_u64 takes a u32 divisor, but u->nsecs (from time_units[]) can be
bigger than u32.

Signed-off-by: Feiko Nanninga <feiko.nanninga@fnanninga.de>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
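
The factor of roughly 14 follows from the truncation itself: the number of nanoseconds in a minute, 60,000,000,000, does not fit in 32 bits and is reduced modulo 2^32 to 4,165,425,152. The sketch below models the two helpers in userspace; the `demo_*` functions are hypothetical stand-ins that only mirror the parameter widths described above.

```c
#include <assert.h>
#include <stdint.h>

/* Models a helper whose divisor parameter is 32 bits wide: a wider
 * argument is silently truncated at the call. */
uint64_t demo_div_u64(uint64_t dividend, uint32_t divisor)
{
	assert(divisor != 0);
	return dividend / divisor;
}

/* Models a helper that takes a full 64-bit divisor. */
uint64_t demo_div64_u64(uint64_t dividend, uint64_t divisor)
{
	assert(divisor != 0);
	return dividend / divisor;
}

/* Nanoseconds per minute: 60,000,000,000, which exceeds UINT32_MAX. */
uint64_t demo_nsec_per_minute(void)
{
	return 60ULL * 1000000000ULL;
}
```

For 90 minutes of elapsed nanoseconds, `demo_div64_u64()` returns 90 while `demo_div_u64()` returns 1296, about 14.4 times too large.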
13 months agobcachefs: Fix negative timespecs
Alyssa Ross [Sat, 7 Sep 2024 16:00:26 +0000 (18:00 +0200)]
bcachefs: Fix negative timespecs

This fixes two problems in the handling of negative times:

 • rem is signed, but the rem * c->sb.nsec_per_time_unit operation
   produced a bogus unsigned result, because s32 * u32 = u32.

 • The timespec was not normalized (it could contain more than a
   billion nanoseconds).

For example, { .tv_sec = -14245441, .tv_nsec = 750000000 }, after
being round tripped through timespec_to_bch2_time and then
bch2_time_to_timespec would come back as
{ .tv_sec = -14245440, .tv_nsec = 4044967296 } (more than 4 billion
nanoseconds).

Cc: stable@vger.kernel.org
Fixes: 595c1e9bab7f ("bcachefs: Fix time handling")
Closes: https://github.com/koverstreet/bcachefs/issues/743
Co-developed-by: Erin Shepherd <erin.shepherd@e43.eu>
Signed-off-by: Erin Shepherd <erin.shepherd@e43.eu>
Co-developed-by: Ryan Lahfa <ryan@lahfa.xyz>
Signed-off-by: Ryan Lahfa <ryan@lahfa.xyz>
Signed-off-by: Alyssa Ross <hi@alyssa.is>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
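
Both problems can be reproduced in a few lines of userspace C. This is a sketch that is consistent with the numbers quoted above, assuming 32-bit int and an nsec_per_time_unit of 1; the `demo_*` names are hypothetical and this is not the code from the patch.

```c
#include <stdint.h>

#define DEMO_NSEC_PER_SEC 1000000000LL

struct demo_timespec {
	int64_t tv_sec;
	int64_t tv_nsec;
};

/* Buggy form: with 32-bit int, the int32_t operand is converted to
 * uint32_t before the multiply, so a negative remainder turns into a
 * large positive value. */
int64_t demo_rem_to_nsec_buggy(int32_t rem, uint32_t nsec_per_unit)
{
	return rem * nsec_per_unit;
}

/* Widening to a signed 64-bit type first preserves the sign. */
int64_t demo_rem_to_nsec_fixed(int32_t rem, uint32_t nsec_per_unit)
{
	return (int64_t)rem * nsec_per_unit;
}

/* Normalize so that 0 <= tv_nsec < one billion. */
struct demo_timespec demo_normalize(int64_t sec, int64_t nsec)
{
	sec += nsec / DEMO_NSEC_PER_SEC;
	nsec %= DEMO_NSEC_PER_SEC;
	if (nsec < 0) {
		sec--;
		nsec += DEMO_NSEC_PER_SEC;
	}
	return (struct demo_timespec){ sec, nsec };
}
```

A remainder of -250000000 becomes 4044967296 in the buggy form, which is the tv_nsec value in the example above; the widened form keeps -250000000, and normalizing (-14245440, -250000000) gives back { -14245441, 750000000 }.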
13 months agobcachefs: Don't delete open files in online fsck
Kent Overstreet [Sun, 8 Sep 2024 05:06:57 +0000 (01:06 -0400)]
bcachefs: Don't delete open files in online fsck

If a file is unlinked but still open, we don't want online fsck to
delete it - or fun inconsistencies will happen.

https://github.com/koverstreet/bcachefs/issues/727

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: fix btree_key_cache sysfs knob
Kent Overstreet [Fri, 6 Sep 2024 01:18:57 +0000 (21:18 -0400)]
bcachefs: fix btree_key_cache sysfs knob

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: More BCH_SB_MEMBER_INVALID support
Kent Overstreet [Wed, 4 Sep 2024 21:50:20 +0000 (17:50 -0400)]
bcachefs: More BCH_SB_MEMBER_INVALID support

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Simplify bch2_bkey_drop_ptrs()
Kent Overstreet [Wed, 4 Sep 2024 21:49:20 +0000 (17:49 -0400)]
bcachefs: Simplify bch2_bkey_drop_ptrs()

bch2_bkey_drop_ptrs() had some complicated machinery for avoiding
O(n^2) when dropping multiple pointers - but when n is only going to be
~4, it's not worth it.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Add a cond_resched() to __journal_keys_sort()
Kent Overstreet [Thu, 5 Sep 2024 19:43:03 +0000 (15:43 -0400)]
bcachefs: Add a cond_resched() to __journal_keys_sort()

Without this, we'd potentially sort multiple times without a
cond_resched(), leading to hung task warnings on larger systems.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Fix ca->io_ref usage
Kent Overstreet [Wed, 4 Sep 2024 19:48:59 +0000 (15:48 -0400)]
bcachefs: Fix ca->io_ref usage

ca->io_ref does not protect against the filesystem going away,
c->write_ref does. Much like

0b50b7313ef2 bcachefs: Fix refcounting in discard path

the other async paths need fixing.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: BCH_SB_MEMBER_INVALID
Kent Overstreet [Sun, 1 Sep 2024 22:09:18 +0000 (18:09 -0400)]
bcachefs: BCH_SB_MEMBER_INVALID

Create a sentinel value for "invalid device".

This is needed for removing devices that have stripes on them (force
removing, without evacuating); we need a sentinel value for the stripe
pointers to the device being removed.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: fix rebalance accounting
Kent Overstreet [Sun, 1 Sep 2024 19:53:03 +0000 (15:53 -0400)]
bcachefs: fix rebalance accounting

Fixes: 49aa7830396b ("bcachefs: Fix rebalance_work accounting")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Mark more errors as autofix
Kent Overstreet [Thu, 22 Aug 2024 15:47:32 +0000 (11:47 -0400)]
bcachefs: Mark more errors as autofix

errors that are known to always be safe to fix should be autofix: this
should be most errors even at this point, but that will need some
thorough review.

note that errors are still logged in the superblock, so we'll still know
that they happened.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Revert lockless buffered IO path
Kent Overstreet [Sat, 31 Aug 2024 21:44:51 +0000 (17:44 -0400)]
bcachefs: Revert lockless buffered IO path

We had a report of data corruption on nixos when building installer
images.

https://github.com/NixOS/nixpkgs/pull/321055#issuecomment-2184131334

It seems that writes are being dropped, but only when issued by QEMU,
and possibly only in snapshot mode. It's undetermined whether it's write
calls or dirty folios that are being dropped.

Further testing, via minimizing the original patch to just the change
that skips the inode lock on non appends/truncates, reveals that it
really is just not taking the inode lock that causes the corruption: it
has nothing to do with the other logic changes for preserving write
atomicity in corner cases.

It's also kernel config dependent: it doesn't reproduce with the minimal
kernel config that ktest uses, but it does reproduce with nixos's distro
config. Bisecting the kernel config initially pointed the finger at page
migration or compaction, but it appears that was erroneous; we haven't
yet determined what kernel config option actually triggers it.

Sadly it appears this will have to be reverted since we're getting too
close to release and my plate is full, but we'd _really_ like to fully
debug it.

My suspicion is that this patch is exposing a preexisting bug - the
inode lock actually covers very little in IO paths, and we have a
different lock (the pagecache add lock) that guards against races with
truncate here.

Fixes: 7e64c86cdc6c ("bcachefs: Buffered write path now can avoid the inode lock")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Fix bch2_extents_match() false positive
Kent Overstreet [Mon, 26 Aug 2024 23:11:00 +0000 (19:11 -0400)]
bcachefs: Fix bch2_extents_match() false positive

This was caught as a very rare nonce inconsistency, on systems with
encryption and replication (and tiering, or some form of rebalance
operation running):

[Wed Jul 17 13:30:03 2024] about to insert invalid key in data update path
[Wed Jul 17 13:30:03 2024] old: u64s 10 type extent 671283510:6392:U32_MAX len 16 ver 106595503: durability: 2 crc: c_size 8 size 16 offset 0 nonce 0 csum chacha20_poly1305_80 compress zstd ptr: 3:355968:104 gen 7 ptr: 4:513244:48 gen 6 rebalance: target hdd compression zstd
[Wed Jul 17 13:30:03 2024] k:   u64s 10 type extent 671283510:6400:U32_MAX len 16 ver 106595508: durability: 2 crc: c_size 8 size 16 offset 0 nonce 0 csum chacha20_poly1305_80 compress zstd ptr: 3:355968:112 gen 7 ptr: 4:513244:56 gen 6 rebalance: target hdd compression zstd
[Wed Jul 17 13:30:03 2024] new: u64s 14 type extent 671283510:6392:U32_MAX len 8 ver 106595508: durability: 2 crc: c_size 8 size 16 offset 0 nonce 0 csum chacha20_poly1305_80 compress zstd ptr: 3:355968:112 gen 7 cached ptr: 4:513244:56 gen 6 cached rebalance: target hdd compression zstd crc: c_size 8 size 16 offset 8 nonce 0 csum chacha20_poly1305_80 compress zstd ptr: 1:10860085:32 gen 0 ptr: 0:17285918:408 gen 0
[Wed Jul 17 13:30:03 2024] bcachefs (cca5bc65-fe77-409d-a9fa-465a6e7f4eae): fatal error - emergency read only

bch2_extents_match() was reporting true for extents that did not
actually point to the same data.

bch2_extents_match() iterates over pairs of pointers, looking for
pointers that point to the same location on disk (with matching
generation numbers). However one or both extents may have been trimmed
(or merged) and they might not have the same disk offset: it corrects
for this by subtracting the key offset and the checksum entry offset.

However, this failed when an extent was immediately partially
overwritten, and the new overwrite was allocated the next adjacent disk
space.

Normally, with compression off, this would never cause a bug, since the
new extent would have to be immediately after the old extent for the
pointer offsets to match, and the rebalance index update path is not
looking for an extent outside the range of the extent it moved.

However with compression enabled, extents take up less space on disk
than they do in the btree index space - and spuriously matching after
partial overwrite is possible.

To fix this, add a secondary check, that strictly checks that the
regions pointed to on disk overlap.
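
As a rough illustration of what a strict overlap check looks like,
assuming each pointer has been resolved to a half-open sector range on
a device; struct disk_region and regions_overlap() are hypothetical
names, not the helpers added by this patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical on-disk region: device, generation, sectors [start, end). */
struct disk_region {
	uint32_t dev;
	uint8_t  gen;
	uint64_t start;
	uint64_t end;
};

/*
 * True only if both regions are on the same device with the same
 * generation and their sector ranges share at least one sector.
 * Regions that merely touch end-to-start do not overlap.
 */
static bool regions_overlap(const struct disk_region *a,
			    const struct disk_region *b)
{
	return a->dev == b->dev &&
	       a->gen == b->gen &&
	       a->start < b->end &&
	       b->start < a->end;
}
```

With illustrative values, a region covering sectors [104, 112) and one
covering [112, 120) on the same device are adjacent but share no data,
so the check rejects them.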

https://github.com/koverstreet/bcachefs/issues/717

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Fix failure to return error in data_update_index_update()
Kent Overstreet [Mon, 26 Aug 2024 19:11:38 +0000 (15:11 -0400)]
bcachefs: Fix failure to return error in data_update_index_update()

This fixes an assertion pop in io_write.c - if we don't return an error
we're supposed to have completed all the btree updates.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Fix rebalance_work accounting
Kent Overstreet [Fri, 23 Aug 2024 19:35:22 +0000 (15:35 -0400)]
bcachefs: Fix rebalance_work accounting

rebalance_work was keying off of the presence of rebalance_opts in the
extent - but that was incorrect, we keep those around after rebalance
for indirect extents since the inode's options are not directly
available.

Fixes: 20ac515a9cc7 ("bcachefs: bch_acct_rebalance_work")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Fix failure to flush moves before sleeping in copygc
Kent Overstreet [Fri, 23 Aug 2024 21:38:41 +0000 (17:38 -0400)]
bcachefs: Fix failure to flush moves before sleeping in copygc

This fixes an apparent deadlock - rebalance would get stuck trying to
take nocow locks because they weren't being released by copygc.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: don't use rht_bucket() in btree_key_cache_scan()
Kent Overstreet [Mon, 19 Aug 2024 20:41:00 +0000 (16:41 -0400)]
bcachefs: don't use rht_bucket() in btree_key_cache_scan()

rht_bucket() does strange complicated things when a rehash is in
progress.

Instead, just skip scanning when a rehash is in progress: scanning is
going to be more expensive (many more empty slots to cover), and some
sort of infinite loop is being observed

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: add missing inode_walker_exit()
Kent Overstreet [Thu, 22 Aug 2024 07:57:39 +0000 (03:57 -0400)]
bcachefs: add missing inode_walker_exit()

fix a small leak

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: clear path->should_be_locked in bch2_btree_key_cache_drop()
Kent Overstreet [Thu, 22 Aug 2024 06:13:02 +0000 (02:13 -0400)]
bcachefs: clear path->should_be_locked in bch2_btree_key_cache_drop()

bch2_btree_key_cache_drop() evicts the key cache entry - it's used when
we're doing an update that bypasses the key cache, because for cache
coherency reasons a key can't be in the key cache unless it also exists
in the btree - i.e. creates have to bypass the cache.

After evicting, the path no longer points to a key cache key, and
relock() will always fail if should_be_locked is true.

Prep for improving path->should_be_locked assertions

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Fix double assignment in check_dirent_to_subvol()
Yuesong Li [Thu, 22 Aug 2024 06:21:58 +0000 (14:21 +0800)]
bcachefs: Fix double assignment in check_dirent_to_subvol()

ret was assigned twice in check_dirent_to_subvol(). Reported by cocci.

Signed-off-by: Yuesong Li <liyuesong@vivo.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Fix refcounting in discard path
Kent Overstreet [Thu, 22 Aug 2024 03:21:52 +0000 (23:21 -0400)]
bcachefs: Fix refcounting in discard path

bch_dev->io_ref does not protect against the filesystem going away;
bch_fs->writes does.

Thus the filesystem write ref needs to be the last ref we release.
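
The ordering can be sketched in userspace with plain counters. This is
a toy model only: struct fs, struct dev and the three helpers below are
hypothetical, and the assumption that dropping the device ref touches
the parent filesystem is made purely for the sake of the example:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniatures of the two objects; not the kernel structures. */
struct fs  { int write_refs; bool freed; };
struct dev { struct fs *fs; int io_refs; };

static void fs_write_put(struct fs *c)
{
	if (--c->write_refs == 0)
		c->freed = true;	/* stands in for teardown proceeding */
}

static void dev_io_put(struct dev *ca)
{
	/* In this model the put touches the filesystem, so it must be alive. */
	assert(!ca->fs->freed);
	ca->io_refs--;
}

/* Completion path: device ref first, filesystem write ref last. */
static void discard_done(struct fs *c, struct dev *ca)
{
	dev_io_put(ca);
	fs_write_put(c);
}
```

Swapping the two calls in discard_done() would trip the assertion in
this model as soon as the last write ref is dropped.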

Reported-by: syzbot+9e0404b505e604f67e41@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Fix compat issue with old alloc_v4 keys
Kent Overstreet [Thu, 22 Aug 2024 02:57:56 +0000 (22:57 -0400)]
bcachefs: Fix compat issue with old alloc_v4 keys

we allow new fields to be added to existing key types, and new versions
should treat them as being zeroed; this was not handled in
alloc_v4_validate.
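
A minimal sketch of the general convention, assuming a value whose
newer version appends a field to the end; struct val and unpack_val()
are hypothetical and do not reflect the alloc_v4 layout:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key value; a newer version appended `new_field`. */
struct val {
	uint32_t old_field;
	uint32_t new_field;
};

/*
 * Unpack an on-disk value that may be shorter than the current struct.
 * Start from all zeroes and copy only the bytes that exist on disk, so
 * any field missing from an old key reads back as zero.
 */
static struct val unpack_val(const void *disk, size_t disk_bytes)
{
	struct val v;

	memset(&v, 0, sizeof(v));
	memcpy(&v, disk, disk_bytes < sizeof(v) ? disk_bytes : sizeof(v));
	return v;
}
```

Validation code then has to accept the shorter size rather than reject
it, which is the kind of case the message above describes.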

Reported-by: syzbot+3b2968fa4953885dd66a@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Fix warning in bch2_fs_journal_stop()
Kent Overstreet [Thu, 22 Aug 2024 02:27:45 +0000 (22:27 -0400)]
bcachefs: Fix warning in bch2_fs_journal_stop()

j->last_empty_seq needs to match j->seq when the journal is empty

Reported-by: syzbot+4093905737cf289b6b38@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agofs/super.c: improve get_tree() error message
Kent Overstreet [Thu, 22 Aug 2024 02:06:44 +0000 (22:06 -0400)]
fs/super.c: improve get_tree() error message

seeing an odd bug where we fail to correctly return an error from
.get_tree():

https://syzkaller.appspot.com/bug?extid=c0360e8367d6d8d04a66

we need to be able to distinguish between accidentally returning a
positive error (as implied by the log) and no error.

Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Fix missing validation in bch2_sb_journal_v2_validate()
Kent Overstreet [Thu, 22 Aug 2024 01:10:45 +0000 (21:10 -0400)]
bcachefs: Fix missing validation in bch2_sb_journal_v2_validate()

Reported-by: syzbot+47ecc948aadfb2ab3efc@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Fix replay_now_at() assert
Kent Overstreet [Thu, 22 Aug 2024 00:49:07 +0000 (20:49 -0400)]
bcachefs: Fix replay_now_at() assert

Journal replay, in the slowpath where we insert keys in journal order,
was inserting keys in the wrong order; keys from early repair come last.

Reported-by: syzbot+2c4fcb257ce2b6a29d0e@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Fix locking in bch2_ioc_setlabel()
Kent Overstreet [Tue, 20 Aug 2024 23:31:20 +0000 (19:31 -0400)]
bcachefs: Fix locking in bch2_ioc_setlabel()

Fixes: 7a254053a590 ("bcachefs: support FS_IOC_SETFSLABEL")
Reported-by: syzbot+7e9efdfec27fbde0141d@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: fix failure to relock in btree_node_fill()
Kent Overstreet [Tue, 20 Aug 2024 19:04:15 +0000 (15:04 -0400)]
bcachefs: fix failure to relock in btree_node_fill()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: fix failure to relock in bch2_btree_node_mem_alloc()
Kent Overstreet [Mon, 19 Aug 2024 19:22:55 +0000 (15:22 -0400)]
bcachefs: fix failure to relock in bch2_btree_node_mem_alloc()

We weren't always so strict about trans->locked state - but now we are,
and new assertions are shaking some bugs out.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: unlock_long() before resort in journal replay
Kent Overstreet [Tue, 20 Aug 2024 16:10:33 +0000 (12:10 -0400)]
bcachefs: unlock_long() before resort in journal replay

Fix another SRCU splat - this one pretty harmless.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: fix missing bch2_err_str()
Kent Overstreet [Tue, 20 Aug 2024 15:25:39 +0000 (11:25 -0400)]
bcachefs: fix missing bch2_err_str()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: fix time_stats_to_text()
Kent Overstreet [Mon, 19 Aug 2024 20:13:16 +0000 (16:13 -0400)]
bcachefs: fix time_stats_to_text()

Fixes: 7423330e30ab ("bcachefs: prt_printf() now respects \r\n\t")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Fix bch2_bucket_gens_init()
Kent Overstreet [Mon, 19 Aug 2024 00:38:49 +0000 (20:38 -0400)]
bcachefs: Fix bch2_bucket_gens_init()

Comparing the wrong bpos - this was missed because normally
bucket_gens_init() runs on brand new filesystems, but this bug caused it
to overwrite bucket_gens keys with 0s when upgrading ancient
filesystems.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Fix bch2_trigger_alloc assert
Kent Overstreet [Mon, 19 Aug 2024 00:18:34 +0000 (20:18 -0400)]
bcachefs: Fix bch2_trigger_alloc assert

On testing on an old mangled filesystem, we missed a case.

Fixes: bd864bc2d907 ("bcachefs: Fix bch2_trigger_alloc when upgrading from old versions")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Fix failure to relock in btree_node_get()
Kent Overstreet [Sun, 18 Aug 2024 19:08:12 +0000 (15:08 -0400)]
bcachefs: Fix failure to relock in btree_node_get()

discovered by new trans->locked asserts

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: setting bcachefs_effective.* xattrs is a noop
Kent Overstreet [Sun, 18 Aug 2024 17:24:26 +0000 (13:24 -0400)]
bcachefs: setting bcachefs_effective.* xattrs is a noop

bcachefs_effective.* xattrs show the options inherited from parent
directories (as well as explicitly set); this namespace is not for
setting bcachefs options.

Change the .set() handler to a noop so that if e.g. rsync is copying
xattrs it'll do the right thing, and only copy xattrs in the bcachefs.*
namespace. We don't want to return an error, because that will cause
rsync to bail out or get spammy.
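
The behaviour can be sketched as follows. set_xattr() and its `applied`
out-parameter are hypothetical and only model the decision; this is not
the kernel xattr handler interface:

```c
#include <stdbool.h>
#include <string.h>

#define EFFECTIVE_PREFIX "bcachefs_effective."

/*
 * Hypothetical setter: always returns 0 (success). *applied reports
 * whether the option would actually have been stored.
 */
static int set_xattr(const char *name, const char *value, bool *applied)
{
	(void)value;	/* unused in this sketch */

	if (strncmp(name, EFFECTIVE_PREFIX, strlen(EFFECTIVE_PREFIX)) == 0) {
		*applied = false;	/* read-only view: accept and ignore */
		return 0;
	}

	*applied = true;	/* a real handler would store the option here */
	return 0;
}
```

Returning success while storing nothing lets a tool that copies every
xattr continue without errors.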

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
13 months agobcachefs: Fix "trying to move an extent, but nr_replicas=0"
Kent Overstreet [Sun, 18 Aug 2024 17:13:39 +0000 (13:13 -0400)]
bcachefs: Fix "trying to move an extent, but nr_replicas=0"

data_update_init() does a bunch of complicated stuff to decide how many
replicas to add, since we only want to increase an extent's durability
on an explicit rereplicate, but extent pointers may be on devices with
different durability settings.

There was a corner case when evacuating a device that had been set to
durability=0 after data had been written to it, and extents on that
device had already been rereplicated - then evacuate only needs to drop
pointers on that device, not move them.

So the assert for !m->op.nr_replicas was spurious; this was a perfectly
legitimate case that needed to be handled.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>