www.infradead.org Git - users/willy/xarray.git/log
2 years ago bcachefs: Make sure to use BTREE_ITER_PREFETCH in fsck
Kent Overstreet [Fri, 14 May 2021 20:56:26 +0000 (16:56 -0400)]
bcachefs: Make sure to use BTREE_ITER_PREFETCH in fsck

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Fix bch2_btree_iter_peek_with_updates()
Kent Overstreet [Fri, 30 Apr 2021 01:44:05 +0000 (21:44 -0400)]
bcachefs: Fix bch2_btree_iter_peek_with_updates()

By not re-fetching the next update we were going into an infinite loop.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Fix reflink trigger
Kent Overstreet [Tue, 4 May 2021 00:31:27 +0000 (20:31 -0400)]
bcachefs: Fix reflink trigger

The trigger for reflink pointers wasn't always incrementing/decrementing
the refcounts correctly - this patch fixes that logic.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Fix some refcounting bugs
Kent Overstreet [Sat, 8 May 2021 00:43:43 +0000 (20:43 -0400)]
bcachefs: Fix some refcounting bugs

We really need debug mode assertions that ca->ref and ca->io_ref are
used correctly.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Fix oob write in __bch2_btree_node_write
Dan Robertson [Sat, 8 May 2021 02:29:02 +0000 (22:29 -0400)]
bcachefs: Fix oob write in __bch2_btree_node_write

Fix a possible out of bounds write in __bch2_btree_node_write when
the data buffer padding is cleared up to the block size. The out of
bounds write is possible if the data buffers size is not a multiple
of the block size.

Signed-off-by: Dan Robertson <dan@dlrobertson.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
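
A minimal sketch of the kind of bounds check the out-of-bounds fix above describes, assuming the padding is cleared with a plain memset. The helper names round_up_block() and clear_padding() are hypothetical and are not taken from the bcachefs source; the actual patch may fix the problem differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Round n up to the next multiple of block_size (block_size > 0). */
static size_t round_up_block(size_t n, size_t block_size)
{
	return (n + block_size - 1) / block_size * block_size;
}

/*
 * Zero the padding that follows `used` bytes of data, up to the next
 * block boundary, but never past the end of the buffer.
 * Returns the number of bytes cleared.
 * Hypothetical helper for illustration only.
 */
static size_t clear_padding(unsigned char *buf, size_t buf_size,
			    size_t used, size_t block_size)
{
	assert(block_size > 0 && used <= buf_size);

	size_t end = round_up_block(used, block_size);

	/* Clamp: the buffer may not be a whole number of blocks. */
	if (end > buf_size)
		end = buf_size;

	memset(buf + used, 0, end - used);
	return end - used;
}
```

Without the clamp, a 10-byte buffer holding 9 bytes of data with a 4-byte block size would have bytes 9 through 11 cleared, two of which lie outside the buffer.
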
2 years ago bcachefs: Fix usage of last_seq + encryption
Kent Overstreet [Sat, 8 May 2021 03:32:26 +0000 (23:32 -0400)]
bcachefs: Fix usage of last_seq + encryption

jset->last_seq is in the region that's encrypted - on journal write
completion, we were using it and getting garbage. This patch shadows it
to fix.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Clean up bch2_btree_and_journal_walk()
Kent Overstreet [Thu, 29 Apr 2021 19:37:47 +0000 (15:37 -0400)]
bcachefs: Clean up bch2_btree_and_journal_walk()

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Mark newly allocated btree nodes as accessed
Kent Overstreet [Thu, 29 Apr 2021 20:55:26 +0000 (16:55 -0400)]
bcachefs: Mark newly allocated btree nodes as accessed

This was a major oversight - this means under memory pressure we can end
up reading in a btree node, then having it evicted before we get to use
it.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Fix time handling
Kent Overstreet [Thu, 29 Apr 2021 02:51:42 +0000 (22:51 -0400)]
bcachefs: Fix time handling

There were some overflows in the time conversion functions - fix this by
converting tv_sec and tv_nsec separately. Also, set sb->time_min and
sb->time_max.

Fixes xfstest generic/258.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
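
To illustrate why converting tv_sec and tv_nsec separately avoids the overflow mentioned in the time handling fix above, here is a small sketch. The struct ts type, the units_per_sec parameter, and both function names are assumptions made for this example and do not come from the bcachefs source; the sketch also only handles non-negative times.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical timestamp type with the same split as struct timespec. */
struct ts {
	int64_t tv_sec;
	int64_t tv_nsec;
};

/*
 * Convert to filesystem time units, where one second holds
 * `units_per_sec` units (an assumed parameter that divides NSEC_PER_SEC).
 *
 * The naive form (tv_sec * NSEC_PER_SEC + tv_nsec) / ns_per_unit
 * overflows int64_t once tv_sec exceeds roughly 9.2e9, even when the
 * final result would fit. Scaling the two fields separately avoids
 * that large intermediate value.
 */
static int64_t ts_to_units(struct ts t, int64_t units_per_sec)
{
	assert(units_per_sec > 0 && NSEC_PER_SEC % units_per_sec == 0);

	int64_t ns_per_unit = NSEC_PER_SEC / units_per_sec;

	return t.tv_sec * units_per_sec + t.tv_nsec / ns_per_unit;
}

/* Inverse conversion, again handling seconds and nanoseconds separately. */
static struct ts units_to_ts(int64_t units, int64_t units_per_sec)
{
	assert(units >= 0 && units_per_sec > 0 &&
	       NSEC_PER_SEC % units_per_sec == 0);

	struct ts t = {
		.tv_sec  = units / units_per_sec,
		.tv_nsec = (units % units_per_sec) * (NSEC_PER_SEC / units_per_sec),
	};
	return t;
}
```
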
2 years ago bcachefs: Add a tracepoint for when we block on journal reclaim
Kent Overstreet [Thu, 29 Apr 2021 04:21:54 +0000 (00:21 -0400)]
bcachefs: Add a tracepoint for when we block on journal reclaim

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Make sure to initialize j->last_flushed
Kent Overstreet [Thu, 29 Apr 2021 02:12:07 +0000 (22:12 -0400)]
bcachefs: Make sure to initialize j->last_flushed

If the journal reclaim thread makes it to the timeout without ever
initializing j->last_flushed, we could end up sleeping for a very long
time.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Ensure that fpunch updates inode timestamps
Kent Overstreet [Wed, 28 Apr 2021 23:36:12 +0000 (19:36 -0400)]
bcachefs: Ensure that fpunch updates inode timestamps

Fixes xfstests generic/059

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Change copygc wait amount to be min of per device waits
Kent Overstreet [Tue, 27 Apr 2021 18:03:13 +0000 (14:03 -0400)]
bcachefs: Change copygc wait amount to be min of per device waits

We're seeing a filesystem get stuck when all devices but one have no
more reclaimable buckets - because the copygc wait amount is currently
filesystem wide.

This patch should fix that, possibly at the expense of running too
much when only one or a few devices is full and the rebalance thread
needs to move data around.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
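
The policy change in the copygc commit above can be sketched as taking the minimum over per-device wait amounts. The function name copygc_wait_amount() and the plain array of waits are hypothetical stand-ins for whatever the real code tracks per device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the smallest per-device wait amount, so that copygc wakes up as
 * soon as any single device needs it, even if the other devices still
 * have plenty of reclaimable space.
 * Hypothetical helper for illustration only.
 */
static uint64_t copygc_wait_amount(const uint64_t *dev_wait, size_t nr_devs)
{
	assert(dev_wait && nr_devs > 0);

	uint64_t wait = UINT64_MAX;

	for (size_t i = 0; i < nr_devs; i++)
		if (dev_wait[i] < wait)
			wait = dev_wait[i];

	return wait;
}
```

With waits of 1000, 0 and 5000 the result is 0, so the one device that is out of reclaimable buckets is enough to trigger a run.
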
2 years ago bcachefs: Change bch2_btree_key_cache_count() to exclude dirty keys
Kent Overstreet [Tue, 27 Apr 2021 18:02:00 +0000 (14:02 -0400)]
bcachefs: Change bch2_btree_key_cache_count() to exclude dirty keys

We're seeing livelocks that appear to be due to
bch2_btree_key_cache_scan repeatedly scanning and blocking other tasks
from using the key cache lock - we probably shouldn't be reporting
objects that can't actually be freed yet.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Call bch2_inconsistent_error() on missing stripe/indirect extent
Kent Overstreet [Fri, 30 Apr 2021 02:32:44 +0000 (22:32 -0400)]
bcachefs: Call bch2_inconsistent_error() on missing stripe/indirect extent

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: New tracepoint for bch2_trans_get_iter()
Kent Overstreet [Thu, 29 Apr 2021 20:56:17 +0000 (16:56 -0400)]
bcachefs: New tracepoint for bch2_trans_get_iter()

Trying to debug an issue where after traverse_all() we shouldn't have to
traverse any iterators... yet we are

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Fix __bch2_trans_get_iter()
Kent Overstreet [Tue, 27 Apr 2021 15:12:17 +0000 (11:12 -0400)]
bcachefs: Fix __bch2_trans_get_iter()

We need to also set iter->uptodate to indicate it needs to be traversed.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Evict btree nodes we're deleting
Kent Overstreet [Sun, 25 Apr 2021 20:24:03 +0000 (16:24 -0400)]
bcachefs: Evict btree nodes we're deleting

There was a bug that led to duplicate btree node pointers being inserted
at the wrong level. The new topology repair code can fix that, except
that the btree cache code gets confused when we read in a btree node
from the pointer that was at the wrong level. This patch evicts nodes
that we're deleting too, which nicely solves the problem.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: New check_nlinks algorithm for snapshots
Kent Overstreet [Thu, 22 Apr 2021 01:08:49 +0000 (21:08 -0400)]
bcachefs: New check_nlinks algorithm for snapshots

With snapshots, using a radix tree for the table of link counts won't
work anymore because we also need to distinguish between inodes with
different snapshot IDs. Instead, this patch builds up a sorted array of
inodes that have hardlinks that we can binary search on - taking
advantage of the fact that with inode backpointers, the check_nlinks()
pass _only_ needs to concern itself with inodes that have hardlinks now.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
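
The data structure described in the check_nlinks commit above, a sorted array keyed by inode number and snapshot ID that is searched with binary search, might look roughly like this. The struct layout and all function names are assumptions for illustration; the real fsck code is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical table entry: one inode (in one snapshot) that has hardlinks. */
struct nlink_entry {
	uint64_t inum;
	uint32_t snapshot;
	uint32_t count;		/* links found so far while walking dirents */
};

/* Order by inode number first, then by snapshot ID. */
static int nlink_cmp(const void *l, const void *r)
{
	const struct nlink_entry *a = l, *b = r;

	if (a->inum != b->inum)
		return a->inum < b->inum ? -1 : 1;
	if (a->snapshot != b->snapshot)
		return a->snapshot < b->snapshot ? -1 : 1;
	return 0;
}

static void nlink_table_sort(struct nlink_entry *t, size_t nr)
{
	assert(t || !nr);
	qsort(t, nr, sizeof(*t), nlink_cmp);
}

/* Binary search; returns NULL if the (inum, snapshot) pair is not tracked. */
static struct nlink_entry *nlink_lookup(struct nlink_entry *t, size_t nr,
					uint64_t inum, uint32_t snapshot)
{
	struct nlink_entry key = { .inum = inum, .snapshot = snapshot };

	return bsearch(&key, t, nr, sizeof(*t), nlink_cmp);
}

/* Record one directory entry pointing at the inode; -1 if it is not tracked. */
static int nlink_inc(struct nlink_entry *t, size_t nr,
		     uint64_t inum, uint32_t snapshot)
{
	struct nlink_entry *e = nlink_lookup(t, nr, inum, snapshot);

	if (!e)
		return -1;
	e->count++;
	return 0;
}
```

A radix tree indexed by inode number alone cannot tell (5, snapshot 1) from (5, snapshot 2); the composite comparison above is what makes the two distinguishable.
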
2 years ago bcachefs: Fix a null ptr deref
Kent Overstreet [Sun, 25 Apr 2021 02:33:25 +0000 (22:33 -0400)]
bcachefs: Fix a null ptr deref

Fix a few memory safety issues, found by asan in userspace.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: New and improved topology repair code
Kent Overstreet [Sat, 24 Apr 2021 20:32:35 +0000 (16:32 -0400)]
bcachefs: New and improved topology repair code

This splits out btree topology repair into a separate pass, and makes
some improvements:
 - When we have to pick which of two overlapping nodes to drop keys
   from, we use the btree node header sequence number to preserve the
   newer node

 - the gc code has been changed so that it doesn't bail out if we're
   continuing/ignoring on fsck error - this way the dump tool can skip
   running the repair pass but still walk all reachable metadata

 - add a new superblock flag indicating when a filesystem is known to
   have btree topology issues, and the topology repair pass should be
   run

 - changing the start/end of a node might mean keys in that node have to
   be deleted: this patch handles that better by splitting it out into a
   separate function and running it explicitly in the topology repair
   code; previously those keys were only being dropped when the btree
   node was read in.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Fix key cache assertion
Kent Overstreet [Sat, 24 Apr 2021 22:02:59 +0000 (18:02 -0400)]
bcachefs: Fix key cache assertion

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: New helper __bch2_btree_insert_keys_interior()
Kent Overstreet [Fri, 23 Apr 2021 23:25:27 +0000 (19:25 -0400)]
bcachefs: New helper __bch2_btree_insert_keys_interior()

Consolidate common parts of bch2_btree_insert_keys_interior() and
btree_split_insert_keys() - prep work for adding some new topology
assertions.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Rewrite btree nodes with errors
Kent Overstreet [Sat, 24 Apr 2021 06:47:41 +0000 (02:47 -0400)]
bcachefs: Rewrite btree nodes with errors

This patch adds self healing functionality for btree nodes - if we
notice a problem when reading a btree node, we just rewrite it.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Fix bch2_verify_keylist_sorted
Kent Overstreet [Sat, 24 Apr 2021 04:59:29 +0000 (00:59 -0400)]
bcachefs: Fix bch2_verify_keylist_sorted

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Fix an out of bounds read
Kent Overstreet [Sat, 24 Apr 2021 04:42:02 +0000 (00:42 -0400)]
bcachefs: Fix an out of bounds read

bch2_varint_decode() can read up to 7 bytes past the end of the buffer,
which means we need to allocate slightly larger key cache buffers.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
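
The following sketch shows, with a deliberately simplified toy encoding, how a decoder that always touches eight payload bytes can read up to 7 bytes past a short value, and how padding the allocation makes that safe. This is not the bch2_varint_decode() format; the encoding, the VARINT_OVERREAD constant, and both function names are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Worst-case bytes read past the end of the encoded data (toy model). */
#define VARINT_OVERREAD 7

/* Allocate a zeroed buffer with room for the decoder's over-read. */
static unsigned char *key_buf_alloc(size_t size)
{
	return calloc(1, size + VARINT_OVERREAD);
}

/*
 * Toy encoding: p[0] holds the payload length n (1..8), followed by n
 * little-endian payload bytes. Stores the value in *v and returns the
 * number of bytes consumed (1 + n).
 */
static size_t toy_varint_decode(const unsigned char *p, uint64_t *v)
{
	size_t n = p[0];
	uint64_t raw = 0;

	assert(n >= 1 && n <= 8);

	/*
	 * Models a fixed-width 8-byte load: p[1]..p[8] are read no matter
	 * how short the payload is, so up to 8 - n bytes lie past it.
	 */
	for (int i = 7; i >= 0; i--)
		raw = (raw << 8) | p[1 + i];

	/* Discard the bytes that were not part of the payload. */
	if (n < 8)
		raw &= (UINT64_C(1) << (8 * n)) - 1;

	*v = raw;
	return 1 + n;
}
```
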
2 years ago bcachefs: Use mmap() instead of vmalloc_exec() in userspace
Kent Overstreet [Sat, 24 Apr 2021 04:38:16 +0000 (00:38 -0400)]
bcachefs: Use mmap() instead of vmalloc_exec() in userspace

Calling mmap() directly is much better than malloc() then mprotect(), we
end up with much less address space fragmentation.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
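
A minimal userspace sketch of mapping memory with the desired protection directly through mmap(), rather than allocating with malloc() and changing protection afterwards. The wrapper names map_region() and unmap_region() are hypothetical; only mmap() and munmap() themselves are real APIs here.

```c
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS on strict glibc builds */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map `size` bytes of private anonymous memory with the given protection
 * flags (for example PROT_READ | PROT_WRITE | PROT_EXEC).
 * Returns NULL on failure instead of MAP_FAILED.
 */
static void *map_region(size_t size, int prot)
{
	assert(size > 0);

	void *p = mmap(NULL, size, prot, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/* Release a region obtained from map_region(). */
static void unmap_region(void *p, size_t size)
{
	if (p)
		munmap(p, size);
}
```

Because the kernel hands back whole pages with the requested protection, there is no need to mprotect() a range carved out of the heap. Note that some hardened systems refuse mappings that are writable and executable at the same time, so a caller asking for PROT_EXEC should be prepared for NULL.
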
2 years ago bcachefs: Don't BUG_ON() btree topology error
Kent Overstreet [Fri, 23 Apr 2021 20:05:49 +0000 (16:05 -0400)]
bcachefs: Don't BUG_ON() btree topology error

This replaces an assertion in the btree merge path with a
bch2_inconsistent_error() - fsck will fix it.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Fix repair leading to replicas not marked
Kent Overstreet [Fri, 23 Apr 2021 20:18:43 +0000 (16:18 -0400)]
bcachefs: Fix repair leading to replicas not marked

bch2_check_fix_ptrs() was being called after checking if the replicas
set was marked - but repair could change which replicas set needed to be
marked. Oops.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Lookup/create lost+found lazily
Kent Overstreet [Tue, 20 Apr 2021 02:19:18 +0000 (22:19 -0400)]
bcachefs: Lookup/create lost+found lazily

This is prep work for subvolumes - each subvolume will have its own
lost+found.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Don't BUG() in update_replicas
Kent Overstreet [Wed, 21 Apr 2021 22:08:39 +0000 (18:08 -0400)]
bcachefs: Don't BUG() in update_replicas

Apparently, we have a bug where in mark and sweep while accounting for a
key, a replicas entry isn't found. Change the code to print out the key
we couldn't mark and halt instead of a BUG_ON().

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Fix a deadlock on journal reclaim
Kent Overstreet [Tue, 20 Apr 2021 21:09:25 +0000 (17:09 -0400)]
bcachefs: Fix a deadlock on journal reclaim

Flushing the btree key cache needs to use allocation reserves - journal
reclaim depends on flushing the btree key cache for making forward
progress, and the allocator and copygc depend on journal reclaim making
forward progress.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Update bch2_btree_verify()
Kent Overstreet [Wed, 21 Apr 2021 00:21:12 +0000 (20:21 -0400)]
bcachefs: Update bch2_btree_verify()

bch2_btree_verify() verifies that the btree node on disk matches what we
have in memory. This patch changes it to verify every replica, and also
fixes it for interior btree nodes - there's a mem_ptr field which is
used as a scratch space and needs to be zeroed out for comparing with
what's on disk.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
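
The comparison problem described in the bch2_btree_verify() commit above can be sketched as follows: a field that is only meaningful in memory has to be zeroed on both sides before a byte-wise comparison, and the check is repeated for every replica. Apart from the mem_ptr field name, which the commit message mentions, the struct layout and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key layout; the fields are sized so there is no implicit padding. */
struct node_key {
	uint64_t seq;
	uint64_t mem_ptr;		/* in-memory scratch space, meaningless on disk */
	uint32_t sectors_written;
	uint32_t flags;
};

/* Compare two keys while ignoring the scratch field. */
static bool node_key_matches(const struct node_key *mem,
			     const struct node_key *disk)
{
	struct node_key a = *mem, b = *disk;

	a.mem_ptr = 0;
	b.mem_ptr = 0;
	return memcmp(&a, &b, sizeof(a)) == 0;
}

/* Returns the index of the first mismatching replica, or -1 if all match. */
static int verify_replicas(const struct node_key *mem,
			   const struct node_key *replicas, size_t nr)
{
	assert(mem && (replicas || !nr));

	for (size_t i = 0; i < nr; i++)
		if (!node_key_matches(mem, &replicas[i]))
			return (int)i;
	return -1;
}
```
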
2 years ago bcachefs: Fix two btree iterator leaks
Kent Overstreet [Wed, 21 Apr 2021 00:21:39 +0000 (20:21 -0400)]
bcachefs: Fix two btree iterator leaks

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Punt btree writes to workqueue to submit
Kent Overstreet [Tue, 6 Apr 2021 19:28:34 +0000 (15:28 -0400)]
bcachefs: Punt btree writes to workqueue to submit

We don't want to be submitting IO with btree locks held, and btree
writes usually aren't latency sensitive.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Fix a use after free
Kent Overstreet [Mon, 19 Apr 2021 21:17:34 +0000 (17:17 -0400)]
bcachefs: Fix a use after free

Turns out, we weren't waiting on in flight btree writes when freeing
existing btree nodes. This led to stray btree writes overwriting newly
allocated buckets, but only started showing itself with some of the
recent allocator work and another patch to move submitting of btree
writes to workqueues.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Fix for btree_gc repairing interior btree ptrs
Kent Overstreet [Mon, 19 Apr 2021 21:07:20 +0000 (17:07 -0400)]
bcachefs: Fix for btree_gc repairing interior btree ptrs

Using the normal transaction commit path to insert and journal updates
to interior nodes hadn't been done before this repair code was written,
so it's not surprising that there was a bug.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Preallocate trans mem in bch2_migrate_index_update()
Kent Overstreet [Mon, 19 Apr 2021 04:33:05 +0000 (00:33 -0400)]
bcachefs: Preallocate trans mem in bch2_migrate_index_update()

This will help avoid transaction restarts.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Allocator refactoring
Kent Overstreet [Sun, 18 Apr 2021 00:37:04 +0000 (20:37 -0400)]
bcachefs: Allocator refactoring

This uses the kthread_wait_freezable() macro to simplify a lot of the
allocator thread code, along with cleaning up bch2_invalidate_bucket2().

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Always check for invalid bkeys in trans commit path
Kent Overstreet [Sun, 18 Apr 2021 21:44:35 +0000 (17:44 -0400)]
bcachefs: Always check for invalid bkeys in trans commit path

We check for this prior to metadata being written, but we're seeing some
strange bugs lately, and this will help catch those closer to where they
occur.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Check that keys are in the correct btrees
Kent Overstreet [Sun, 18 Apr 2021 03:18:17 +0000 (23:18 -0400)]
bcachefs: Check that keys are in the correct btrees

We've started seeing bug reports of pointers to btree nodes being
detected in leaf nodes. This should catch that before it's happened, and
it's something we should've been checking anyways.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Handle errors in bch2_trans_mark_update()
Kent Overstreet [Sun, 18 Apr 2021 21:26:34 +0000 (17:26 -0400)]
bcachefs: Handle errors in bch2_trans_mark_update()

It's not actually the case that iterators are always checked here -
__bch2_trans_commit() checks for that after running triggers.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Allocator thread doesn't need gc_lock anymore
Kent Overstreet [Sat, 17 Apr 2021 01:53:23 +0000 (21:53 -0400)]
bcachefs: Allocator thread doesn't need gc_lock anymore

Even with runtime gc (which currently isn't supported), runtime gc no
longer clears/recalculates the main set of bucket marks - it allocates
and calculates another set, updating the primary at the end.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: gc shouldn't care about owned_by_allocator
Kent Overstreet [Sat, 17 Apr 2021 01:34:00 +0000 (21:34 -0400)]
bcachefs: gc shouldn't care about owned_by_allocator

The owned_by_allocator field is a purely in memory thing, even if/when
we bring back GC at runtime there's no need for it to be recalculating
this field. This is prep work for pulling it out of struct bucket, and
eventually getting rid of the bucket array.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Refactor bchfs_fallocate() to not nest btree_trans on stack
Kent Overstreet [Sat, 17 Apr 2021 00:35:20 +0000 (20:35 -0400)]
bcachefs: Refactor bchfs_fallocate() to not nest btree_trans on stack

Upcoming patch is going to disallow multiple btree_trans on the stack.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Fix an unused var warning in userspace
Kent Overstreet [Fri, 16 Apr 2021 21:34:53 +0000 (17:34 -0400)]
bcachefs: Fix an unused var warning in userspace

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Fix some small memory leaks
Kent Overstreet [Fri, 16 Apr 2021 21:26:25 +0000 (17:26 -0400)]
bcachefs: Fix some small memory leaks

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Simplify fsck remove_dirent()
Kent Overstreet [Fri, 16 Apr 2021 18:48:51 +0000 (14:48 -0400)]
bcachefs: Simplify fsck remove_dirent()

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Fix transaction restarts due to upgrading of cloned iterators
Kent Overstreet [Fri, 16 Apr 2021 18:29:26 +0000 (14:29 -0400)]
bcachefs: Fix transaction restarts due to upgrading of cloned iterators

This fixes a regression from
  52d86202fd bcachefs: Improve bch2_btree_iter_traverse_all()

We want to avoid mucking with other iterators in the btree transaction
in operations that are only supposed to be touching individual iterators
- that patch was a cleanup to move lock ordering handling to
bch2_btree_iter_traverse_all(). But it broke upgrading of cloned
iterators.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Fix journal reclaim loop
Kent Overstreet [Fri, 16 Apr 2021 16:38:14 +0000 (12:38 -0400)]
bcachefs: Fix journal reclaim loop

When dirty key cache keys were separated from other journal pins, we
broke the loop conditional in __bch2_journal_reclaim() - it's supposed
to keep looping as long as there's work to do.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Fix an RCU splat
Kent Overstreet [Thu, 15 Apr 2021 22:31:58 +0000 (18:31 -0400)]
bcachefs: Fix an RCU splat

Writepoints are never deallocated so the rcu_read_lock() isn't really
needed, but we are doing lockless list traversal.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Simplify bch2_set_nr_journal_buckets()
Kent Overstreet [Thu, 15 Apr 2021 00:23:58 +0000 (20:23 -0400)]
bcachefs: Simplify bch2_set_nr_journal_buckets()

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Fix bch2_trans_mark_dev_sb()
Kent Overstreet [Thu, 15 Apr 2021 00:25:33 +0000 (20:25 -0400)]
bcachefs: Fix bch2_trans_mark_dev_sb()

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Improve trans_restart_mem_realloced tracepoint
Kent Overstreet [Thu, 15 Apr 2021 16:50:09 +0000 (12:50 -0400)]
bcachefs: Improve trans_restart_mem_realloced tracepoint

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Don't downgrade iterators in bch2_trans_get_iter()
Kent Overstreet [Thu, 15 Apr 2021 16:36:40 +0000 (12:36 -0400)]
bcachefs: Don't downgrade iterators in bch2_trans_get_iter()

This fixes a livelock with btree node splits.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Improve bch2_btree_iter_traverse_all()
Kent Overstreet [Wed, 14 Apr 2021 17:26:15 +0000 (13:26 -0400)]
bcachefs: Improve bch2_btree_iter_traverse_all()

By changing it to upgrade iterators to intent locks to avoid lock
restarts we can simplify __bch2_btree_node_lock() quite a bit - this
fixes a probable bug where it could potentially drop a lock on an
unrelated error but still succeed instead of causing a transaction
restart.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Fix journal_reclaim_wait_done()
Kent Overstreet [Thu, 15 Apr 2021 02:15:55 +0000 (22:15 -0400)]
bcachefs: Fix journal_reclaim_wait_done()

Can't run arbitrary code inside a wait_event() conditional, due to
task state being weird...

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Fix bch2_gc_done() error messages
Kent Overstreet [Thu, 15 Apr 2021 00:22:10 +0000 (20:22 -0400)]
bcachefs: Fix bch2_gc_done() error messages

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Don't call bch2_btree_iter_traverse() unnecessarily
Kent Overstreet [Wed, 14 Apr 2021 21:45:31 +0000 (17:45 -0400)]
bcachefs: Don't call bch2_btree_iter_traverse() unnecessarily

If we let bch2_trans_commit() do it, it'll traverse iterators in sorted
order which means we'll get fewer lock restarts.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Better iterator picking
Kent Overstreet [Wed, 14 Apr 2021 17:29:34 +0000 (13:29 -0400)]
bcachefs: Better iterator picking

Avoid cloning iterators if we don't have to.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Drop old style btree node coalescing
Kent Overstreet [Wed, 14 Apr 2021 16:17:41 +0000 (12:17 -0400)]
bcachefs: Drop old style btree node coalescing

We have foreground btree node merging now, and any future btree node
merging improvements are going to be based off of that code.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Add a perf test for multiple updates per commit
Kent Overstreet [Wed, 14 Apr 2021 16:10:17 +0000 (12:10 -0400)]
bcachefs: Add a perf test for multiple updates per commit

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Ensure bucket gen gc completes
Kent Overstreet [Tue, 13 Apr 2021 19:10:39 +0000 (15:10 -0400)]
bcachefs: Ensure bucket gen gc completes

We don't want it to block, if it can't allocate it should just continue
instead of possibly deadlocking.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Add the status of bucket gen gc to sysfs
Kent Overstreet [Tue, 13 Apr 2021 19:00:40 +0000 (15:00 -0400)]
bcachefs: Add the status of bucket gen gc to sysfs

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Fix heap overrun in bch2_fs_usage_read() XXX squash
Kent Overstreet [Tue, 13 Apr 2021 14:30:58 +0000 (10:30 -0400)]
bcachefs: Fix heap overrun in bch2_fs_usage_read() XXX squash

oops

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: BCH_BEATURE_atomic_nlink is obsolete
Kent Overstreet [Tue, 13 Apr 2021 14:26:59 +0000 (10:26 -0400)]
bcachefs: BCH_BEATURE_atomic_nlink is obsolete

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Improved check_directory_structure()
Kent Overstreet [Wed, 7 Apr 2021 07:11:07 +0000 (03:11 -0400)]
bcachefs: Improved check_directory_structure()

Now that we have inode backpointers, we can simplify checking directory
structure: instead of doing a DFS from the filesystem root and then
checking if we found everything, we can iterate over every inode and see
if we can go up until we get to the root.

This patch also has a number of fixes and simplifications for the inode
backpointer checks. Also, it turns out we don't actually need the
BCH_INODE_BACKPTR_UNTRUSTED flag.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
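
The approach described in the check_directory_structure() commit above, walking up from every inode via backpointers instead of doing a DFS from the root, can be sketched like this. The flat parent[] array, the ROOT_INO and NO_PARENT constants, and the function names are assumptions for illustration; the real code works on btree keys and also repairs what it finds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ROOT_INO	0
#define NO_PARENT	((size_t)-1)

/*
 * parent[i] models inode i's backpointer to the directory that contains
 * it. Follow backpointers upward until the root is reached, a backpointer
 * is missing or invalid, or a loop is detected.
 */
static bool reaches_root(const size_t *parent, size_t nr_inodes, size_t ino)
{
	assert(parent && ino < nr_inodes);

	/*
	 * A valid path visits each inode at most once, so taking more than
	 * nr_inodes steps means we are going around a directory loop.
	 */
	for (size_t steps = 0; steps <= nr_inodes; steps++) {
		if (ino == ROOT_INO)
			return true;
		if (parent[ino] == NO_PARENT || parent[ino] >= nr_inodes)
			return false;
		ino = parent[ino];
	}
	return false;
}

/* Iterate over every inode and count those that cannot reach the root. */
static size_t count_unreachable(const size_t *parent, size_t nr_inodes)
{
	size_t nr = 0;

	for (size_t i = 0; i < nr_inodes; i++)
		if (!reaches_root(parent, nr_inodes, i))
			nr++;
	return nr;
}
```

Unlike a DFS from the root, this needs no record of which inodes were visited: each inode is checked independently, and loops or orphans show up as walks that never arrive at the root.
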
2 years ago bcachefs: Fix fsck to not use bch2_link_trans()
Kent Overstreet [Fri, 9 Apr 2021 07:25:37 +0000 (03:25 -0400)]
bcachefs: Fix fsck to not use bch2_link_trans()

bch2_link_trans() uses the btree key cache for inode updates, and fsck
isn't supposed to - also, it's not really what we want for reattaching
unreachable inodes anyways.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Fix bch2_trans_relock()
Kent Overstreet [Mon, 12 Apr 2021 18:00:07 +0000 (14:00 -0400)]
bcachefs: Fix bch2_trans_relock()

The patch that changed bch2_trans_relock() to not look at iter->uptodate
also tried to add an optimization by only having it relock
btree_iter_key() iterators (iterators that are live or have been marked
as keep). But, this wasn't thought through - this pops internal iterator
assertions because on transaction restart, when we're traversing
iterators we traverse all iterators marked as linked, and having
bch2_trans_relock() skip some of those means that it can skip the
iterator that bch2_btree_iter_traverse_one() is currently traversing.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Redo check_nlink fsck pass
Kent Overstreet [Thu, 8 Apr 2021 19:25:29 +0000 (15:25 -0400)]
bcachefs: Redo check_nlink fsck pass

Now that we have inode backpointers, the check_nlink pass is only
concerned with files that have hardlinks, and can be simplified.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Inode backpointers are now required
Kent Overstreet [Wed, 7 Apr 2021 00:15:26 +0000 (20:15 -0400)]
bcachefs: Inode backpointers are now required

This lets us simplify fsck quite a bit, which we need for making fsck
snapshot aware.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Simplify hash table checks
Kent Overstreet [Wed, 7 Apr 2021 05:55:57 +0000 (01:55 -0400)]
bcachefs: Simplify hash table checks

Very early on there was a period where we were accidentally generating
dirents with trailing garbage; we've since dropped support for
filesystems that old and the fsck code can be dropped.

Also, this patch switches to a simpler algorithm for checking hash
tables. It's less efficient on hash collision - but with 64 bit keys,
those are very rare.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Check inodes at start of fsck
Kent Overstreet [Wed, 7 Apr 2021 01:41:48 +0000 (21:41 -0400)]
bcachefs: Check inodes at start of fsck

This splits out checking inode nlinks from the rest of the inode checks
and moves most of the inode checks to the start of fsck, so that other
fsck passes can depend on it.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Fix BTREE_ITER_NOT_EXTENTS
Kent Overstreet [Fri, 9 Apr 2021 20:52:30 +0000 (16:52 -0400)]
bcachefs: Fix BTREE_ITER_NOT_EXTENTS

bch2_btree_iter_peek() wasn't properly checking for
BTREE_ITER_IS_EXTENTS when updating iter->pos.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years ago bcachefs: Fix bch2_gc_btree_gens()
Kent Overstreet [Fri, 9 Apr 2021 19:10:24 +0000 (15:10 -0400)]
bcachefs: Fix bch2_gc_btree_gens()

Since we're using a NOT_EXTENTS iterator, we shouldn't be setting the
iter pos to the start of the extent.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years agobcachefs: Make sure to kick journal reclaim when we're waiting on it
Kent Overstreet [Thu, 8 Apr 2021 20:15:03 +0000 (16:15 -0400)]
bcachefs: Make sure to kick journal reclaim when we're waiting on it

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years agobcachefs: Don't wait for ALLOC_SCAN_BATCH buckets in allocator
Kent Overstreet [Thu, 8 Apr 2021 01:04:04 +0000 (21:04 -0400)]
bcachefs: Don't wait for ALLOC_SCAN_BATCH buckets in allocator

It used to be necessary for the allocator thread to batch up
invalidating buckets when possible - but since we added the btree key
cache that hasn't been a concern, and now it's causing the allocator
thread to livelock when the filesystem is nearly full.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years agobcachefs: Drop bch2_fsck_inode_nlink()
Kent Overstreet [Wed, 7 Apr 2021 01:19:25 +0000 (21:19 -0400)]
bcachefs: Drop bch2_fsck_inode_nlink()

We've had BCH_FEATURE_atomic_nlink for quite some time, so we can drop
this now.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years agobcachefs: Move some dirent checks to bch2_dirent_invalid()
Kent Overstreet [Wed, 7 Apr 2021 00:11:28 +0000 (20:11 -0400)]
bcachefs: Move some dirent checks to bch2_dirent_invalid()

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years agobcachefs: Improve bset compaction
Kent Overstreet [Tue, 6 Apr 2021 19:33:19 +0000 (15:33 -0400)]
bcachefs: Improve bset compaction

The previous patch that fixed btree nodes being written too aggressively
now meant that we weren't sorting btree node bsets optimally - this
patch fixes that.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years agobcachefs: Don't flush btree writes more aggressively because of btree key cache
Kent Overstreet [Thu, 1 Apr 2021 01:44:55 +0000 (21:44 -0400)]
bcachefs: Don't flush btree writes more aggressively because of btree key cache

We need to flush the btree key cache when it's too dirty, because
otherwise the shrinker won't be able to reclaim memory - this is done by
journal reclaim. But journal reclaim also kicks btree node writes: this
meant that btree node writes were getting kicked much too often just
because we needed to flush btree key cache keys.

This patch splits journal pins into two different lists, and teaches
journal reclaim to not flush btree node writes when it only needs to
flush key cache keys.
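
As a rough illustration of the split described above, here is a minimal
sketch, not the bcachefs code: the types, the list layout, and the names
pin_add(), flush_list() and reclaim() are all hypothetical. Pins live on one
list per type, and reclaim only walks the btree node list when the journal
itself needs space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pin types; the real structures are not shown in this log. */
enum pin_type { PIN_BTREE_NODE, PIN_KEY_CACHE, PIN_NR };

struct pin {
    struct pin *next;
    bool        flushed;
};

struct pin_lists {
    struct pin *head[PIN_NR];   /* one singly linked list per pin type */
};

static void pin_add(struct pin_lists *l, enum pin_type t, struct pin *p)
{
    p->flushed = false;
    p->next = l->head[t];
    l->head[t] = p;
}

/* Flush every pin on one list and empty it; returns how many were flushed. */
static size_t flush_list(struct pin_lists *l, enum pin_type t)
{
    size_t nr = 0;

    for (struct pin *p = l->head[t]; p; p = p->next) {
        p->flushed = true;
        nr++;
    }
    l->head[t] = NULL;
    return nr;
}

/*
 * Assumed policy for this sketch: key cache pins are flushed on every
 * reclaim call, btree node writes are only kicked when the journal
 * itself needs space.
 */
static size_t reclaim(struct pin_lists *l, bool journal_needs_space)
{
    size_t nr = flush_list(l, PIN_KEY_CACHE);

    if (journal_needs_space)
        nr += flush_list(l, PIN_BTREE_NODE);
    return nr;
}
```

With a single shared list, flushing key cache keys would also have kicked the
btree node pins that happened to sit between them.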

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years agobcachefs: Eliminate more PAGE_SIZE uses
Kent Overstreet [Tue, 6 Apr 2021 18:00:56 +0000 (14:00 -0400)]
bcachefs: Eliminate more PAGE_SIZE uses

In userspace, we don't really have a well defined PAGE_SIZE and shouldn't
be relying on it. This is some more incremental work to remove
references to it.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years agobcachefs: Increase BSET_CACHELINE to 256 bytes
Kent Overstreet [Tue, 6 Apr 2021 17:43:31 +0000 (13:43 -0400)]
bcachefs: Increase BSET_CACHELINE to 256 bytes

Linear searches have gotten cheaper relative to binary searches on
modern hardware, due to better branch prediction behaviour.
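
To illustrate the tradeoff (this is a generic sketch, not the bcachefs bset
lookup code): a sorted array can be searched by binary-searching over
fixed-size blocks and then scanning linearly inside one block. BLOCK_KEYS and
block_search() are hypothetical names; a larger block means fewer
hard-to-predict binary search steps and a longer, predictable linear scan.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: number of 8-byte keys in one 256-byte block. */
#define BLOCK_KEYS 32

/*
 * Return the index of the first key >= search, or n if there is none.
 * keys[] must be sorted in ascending order.
 */
static size_t block_search(const uint64_t *keys, size_t n, uint64_t search)
{
    size_t nr_blocks = (n + BLOCK_KEYS - 1) / BLOCK_KEYS;
    size_t lo = 0, hi = nr_blocks;

    /* Binary search: find the first block whose last key is >= search. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        size_t last = (mid + 1) * BLOCK_KEYS - 1;

        if (last >= n)
            last = n - 1;       /* final block may be partially filled */

        if (keys[last] < search)
            lo = mid + 1;
        else
            hi = mid;
    }

    /* Linear scan inside that block, clamped to the end of the array. */
    size_t i = lo * BLOCK_KEYS;
    size_t end = i + BLOCK_KEYS;

    if (end > n)
        end = n;
    if (i > n)
        i = n;

    while (i < end && keys[i] < search)
        i++;

    return i;
}
```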

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years agobcachefs: Fix a startup race
Kent Overstreet [Mon, 5 Apr 2021 05:23:55 +0000 (01:23 -0400)]
bcachefs: Fix a startup race

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years agobcachefs: Fix an uninitialized variable
Kent Overstreet [Mon, 5 Apr 2021 02:38:07 +0000 (22:38 -0400)]
bcachefs: Fix an uninitialized variable

Fortunately it was just used in an error message.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years agobcachefs: kill bset_tree->max_key
Kent Overstreet [Sun, 4 Apr 2021 01:54:14 +0000 (21:54 -0400)]
bcachefs: kill bset_tree->max_key

Since we now ensure a btree node's max key fits in its packed format,
this isn't needed for the reasons it used to be - and, it was being used
inconsistently.

Also reorder struct btree a bit for performance, and kill some dead
code.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years agobcachefs: Eliminate memory barrier from fast path of journal_preres_put()
Kent Overstreet [Sun, 4 Apr 2021 01:31:02 +0000 (21:31 -0400)]
bcachefs: Eliminate memory barrier from fast path of journal_preres_put()

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years agobcachefs: Drop some memset() calls
Kent Overstreet [Sun, 4 Apr 2021 01:09:13 +0000 (21:09 -0400)]
bcachefs: Drop some memset() calls

gcc is emitting rep stos here, which is silly (and slow) for an 8 byte
memset.
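
A minimal sketch of the kind of change described, using a hypothetical 8-byte
struct rather than the actual bcachefs type: zeroing by plain assignment
expresses the same thing as the memset() call, and a compiler can typically
turn it into a single 8-byte store.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-byte structure, standing in for the real one. */
struct small_res {
    uint32_t offset;
    uint32_t u64s;
};

/* Before: zero the structure via memset(). */
static void res_clear_memset(struct small_res *r)
{
    memset(r, 0, sizeof(*r));
}

/* After: zero the structure by assigning a zero-initialized literal. */
static void res_clear_assign(struct small_res *r)
{
    *r = (struct small_res) { 0 };
}
```

Both functions leave every field zero; only the generated code differs.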

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years agobcachefs: Kill bch2_fs_usage_scratch_get()
Kent Overstreet [Sun, 4 Apr 2021 00:29:05 +0000 (20:29 -0400)]
bcachefs: Kill bch2_fs_usage_scratch_get()

This is an important cleanup, eliminating an unnecessary copy in the
transaction commit path.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years agobcachefs: Fix livelock calling bch2_mark_bkey_replicas()
Kent Overstreet [Sat, 3 Apr 2021 23:41:09 +0000 (19:41 -0400)]
bcachefs: Fix livelock calling bch2_mark_bkey_replicas()

The bug was that we were trying to find a replicas entry that wasn't
sorted - but, we can also simplify the code by not using
bch2_mark_bkey_replicas and instead ensuring the list of replicas
entries exists directly.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years agobcachefs: Be more careful about JOURNAL_RES_GET_RESERVED
Kent Overstreet [Sat, 3 Apr 2021 20:24:13 +0000 (16:24 -0400)]
bcachefs: Be more careful about JOURNAL_RES_GET_RESERVED

JOURNAL_RES_GET_RESERVED should only be used for updates that need to be
done to free up space in the journal. In particular, when we're flushing
keys from the key cache, if we're flushing them out of order we
shouldn't be using it, since we're using up our remaining space in the
journal without dropping a pin that will let us make forward progress.

With this patch, BTREE_INSERT_JOURNAL_RECLAIM without
BTREE_INSERT_JOURNAL_RESERVED may return -EAGAIN - we can't wait on
journal reclaim if we're already in journal reclaim.

This means we need to propagate these errors up to journal reclaim,
indicating that flushing a journal pin should be retried in the future.
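
The error propagation can be sketched roughly as follows. This is a toy model
under stated assumptions, not the bcachefs implementation: struct pin,
pin_flush() and reclaim_pass() are hypothetical, and journal space is modelled
as a simple counter. A flush that cannot get space returns -EAGAIN instead of
blocking, and the reclaim pass leaves that pin in place for a later pass.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pin: flushing it needs `need` units of journal space. */
struct pin {
    unsigned need;
    bool     flushed;
};

/*
 * Hypothetical flush callback. We are already inside reclaim, so we must
 * not block waiting for space: report -EAGAIN instead.
 */
static int pin_flush(struct pin *p, unsigned *space)
{
    if (p->need > *space)
        return -EAGAIN;

    *space -= p->need;
    p->flushed = true;
    return 0;
}

/*
 * One reclaim pass: pins that return -EAGAIN stay unflushed and are
 * retried on a later pass. Returns the number of pins flushed, or a
 * negative error other than -EAGAIN.
 */
static int reclaim_pass(struct pin *pins, size_t nr, unsigned *space)
{
    int flushed = 0;

    for (size_t i = 0; i < nr; i++) {
        if (pins[i].flushed)
            continue;

        int ret = pin_flush(&pins[i], space);

        if (ret == -EAGAIN)
            continue;           /* retry in the future */
        if (ret)
            return ret;
        flushed++;
    }
    return flushed;
}
```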

This is prep work for a patch to change the way journal reclaim works:
it will separate flushing key cache keys (because the btree key cache is
too dirty) from journal reclaim proper (because we need space in the
journal).

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years agobcachefs: Fix journal deadlock
Kent Overstreet [Sat, 3 Apr 2021 23:27:05 +0000 (19:27 -0400)]
bcachefs: Fix journal deadlock

After we get a journal reservation, we need to use it - if we error out
of a transaction commit, we'll be eating into space in the journal, and
if our transaction needs to make forward progress in order to reclaim
space in the journal, we'll deadlock.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years agobcachefs: Fix this_cpu_ptr() usage
Kent Overstreet [Sat, 3 Apr 2021 22:37:09 +0000 (18:37 -0400)]
bcachefs: Fix this_cpu_ptr() usage

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years agobcachefs: Increase commonality between BTREE_ITER_NODES and BTREE_ITER_KEYS
Kent Overstreet [Sat, 3 Apr 2021 01:29:05 +0000 (21:29 -0400)]
bcachefs: Increase commonality between BTREE_ITER_NODES and BTREE_ITER_KEYS

Eventually BTREE_ITER_NODES should be going away. This patch is to fix a
transaction iterator overflow in the btree node merge path because
BTREE_ITER_NODES iterators couldn't be reused.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years agobcachefs: Fix BTREE_FOREGROUND_MERGE_HYSTERESIS
Kent Overstreet [Wed, 31 Mar 2021 20:10:21 +0000 (16:10 -0400)]
bcachefs: Fix BTREE_FOREGROUND_MERGE_HYSTERESIS

We were multiplying instead of dividing - oops.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years agobcachefs: Drop trans->nounlock
Kent Overstreet [Wed, 31 Mar 2021 20:43:50 +0000 (16:43 -0400)]
bcachefs: Drop trans->nounlock

Since we're no longer doing btree node merging post commit, we can now
delete a bunch of code.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years agobcachefs: Move btree node merging to before transaction commit
Kent Overstreet [Mon, 29 Mar 2021 05:13:31 +0000 (01:13 -0400)]
bcachefs: Move btree node merging to before transaction commit

Currently, BTREE_INSERT_NOUNLOCK makes it hard to ensure btree node
merging happens reliably - since btree node merging happens after
transaction commit, we can't drop btree locks and block when starting
the btree update.

This patch moves it to before transaction commit - and failure to do a
merge that we wanted to do just restarts the transaction.
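
The resulting control flow might look roughly like this sketch. Everything in
it is hypothetical (struct trans, TRANS_RESTART, and the function names are
made up for illustration, and the merge itself is reduced to a counter of
failed attempts): the merge is tried before the commit, and a failed merge
simply surfaces as a restart that the caller retries.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical restart code for this sketch. */
#define TRANS_RESTART (-EINTR)

struct trans {
    unsigned merge_failures_left;   /* model knob: merge attempts that fail */
    unsigned restarts;
    bool     committed;
};

/* Try the merge we wanted; a failure asks for a transaction restart. */
static int merge_before_commit(struct trans *t)
{
    if (t->merge_failures_left) {
        t->merge_failures_left--;
        return TRANS_RESTART;
    }
    return 0;
}

/* Merging happens first; the commit only proceeds once it succeeded. */
static int trans_commit(struct trans *t)
{
    int ret = merge_before_commit(t);

    if (ret)
        return ret;

    t->committed = true;
    return 0;
}

/* Caller side: retry the whole transaction until it stops restarting. */
static int do_update(struct trans *t)
{
    int ret;

    while ((ret = trans_commit(t)) == TRANS_RESTART)
        t->restarts++;

    return ret;
}
```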

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years agobcachefs: bch2_foreground_maybe_merge() now correctly reports lock restarts
Kent Overstreet [Wed, 31 Mar 2021 20:16:39 +0000 (16:16 -0400)]
bcachefs: bch2_foreground_maybe_merge() now correctly reports lock restarts

This means that btree node splits don't have to automatically trigger a
transaction restart.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years agobcachefs: Kill bch2_btree_node_get_sibling()
Kent Overstreet [Mon, 29 Mar 2021 05:13:31 +0000 (01:13 -0400)]
bcachefs: Kill bch2_btree_node_get_sibling()

This patch reworks the btree node merge path to use a second btree
iterator to get the sibling node - which means
bch2_btree_node_get_sibling() can be deleted. Also, it uses
bch2_btree_iter_traverse_all() if necessary - which means it should be
more reliable. We don't currently even try to make it work when
trans->nounlock is set, i.e. after a BTREE_INSERT_NOUNLOCK transaction
commit - hopefully this will be a worthwhile tradeoff.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2 years agobcachefs: Change where merging of interior btree nodes is triggered from
Kent Overstreet [Wed, 31 Mar 2021 19:39:16 +0000 (15:39 -0400)]
bcachefs: Change where merging of interior btree nodes is triggered from

Previously, we were doing btree node merging from
bch2_btree_insert_node() - but this is called from the split path, when
we're in the middle of creating new nodes and deleting old nodes and the
iterators are in a weird state.

Also, this means we're starting a new btree_update while in the middle
of an existing one, and that's asking for deadlocks.

Much simpler and saner to trigger btree node merging _after_ the whole
btree node split path is finished.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>