Lars Ellenberg [Wed, 23 Mar 2011 13:31:09 +0000 (14:31 +0100)]
drbd: distribute former syncer_conf settings to disk, connection, and resource level
This commit breaks the API again.
Move per-volume former syncer options into disk_conf.
Move per-connection former syncer options into net_conf.
Renamed the remainign sync_conf to res_opts
Syncer settings have been changeable at runtime, so we need to prepare
for these settings to be runtime-changeable in their new home as well.
Introduce new configuration operations, and share the netlink attribute
between "attach" (create new disk) and "disk-opts" (change options).
Same for "connect" and "net-opts".
Some fields cannot be changed at runtime, however.
Introduce a new flag GENLA_F_INVARIANT to be able to trigger on that in
the generated validation and assignment functions.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Lars Ellenberg [Tue, 15 Mar 2011 15:04:09 +0000 (16:04 +0100)]
drbd: add forgotten spin_unlock
somehow a "goto abort" was introduced with commit
drbd: Extracted is_valid_transition() out of sanitize_state()
which left drbd_req_state still holding the spin lock.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Lars Ellenberg [Mon, 14 Mar 2011 12:58:03 +0000 (13:58 +0100)]
drbd: bail out if a config requrest is over-determined, and not matching
We have resources resp. connections, volumes, and minor numbers.
A config request may specifies all three of them.
If it turns out that the minor belongs to a different connection, or a
different volume number in the same connection, that configuration
request is invalid.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Lars Ellenberg [Mon, 14 Mar 2011 12:22:35 +0000 (13:22 +0100)]
drbd: new-connection and new-minor succeed, if the object already exists
Follow O_CREAT semantics when creating connection or minor device/volume
objects. If we need O_CREAT|O_EXCL semantics some time down the road,
we can add NLM_F_EXCL to the netlink message flags.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Lars Ellenberg [Thu, 10 Mar 2011 22:28:13 +0000 (23:28 +0100)]
drbd: simplify conn_all_vols_unconf, make it bool
Get rid of a temporary variable and, funny bitand assignment.
Just short circuit, returning false, once we encounter the first
still configured volume.
FIXME verify call sites for need of rcu_read_lock or stronger.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Lars Ellenberg [Mon, 7 Mar 2011 09:00:58 +0000 (10:00 +0100)]
drbd: get rid of drbd_bcast_ee, it is of no use anymore
This function was used to broadcast the (leading part of the)
bio payload in case we see a data integrity error. It could be received
from userland with the drbdsetup events subcommand,
to have a peek into the payload that caused the checksum mismatch,
and guess from there what may have caused the mismatch,
mainly to guess wether it was modification of in-flight data,
or data corruption by broken hardware or software bugs.
Meanwhile we support bios that are larger than the maximum payload a
netlink datagram can carry.
And we have means to reliably detect modification of in-flight data by
calculating, and comparing, the checksum before and after sendmsg.
There is no need to carry this around anymore.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Lars Ellenberg [Wed, 23 Feb 2011 15:10:01 +0000 (16:10 +0100)]
drbd: only wakeup if something changed in update_peer_seq
This commit got it wrong:
drbd: Make the peer_seq updating code more obvious
Make it more clear that update_peer_seq() is supposed to wake up the
seq_wait queue whenever the sequence number changes.
We don't need to wake up everytime we receive a sequence number
that is _different_ from our currently stored "newest" sequence number,
but only if we receive a sequence number _newer_ than what we already
have, when we actually change mdev->peer_seq.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Philipp Reisner [Fri, 18 Feb 2011 13:23:11 +0000 (14:23 +0100)]
drbd: Reworked the unconfiguring and thread stopping code
* Moved CONFIG_PENDING and DEVICE_DYING from mdev to tconn.
* Renamed drbd_reconfig_start() and drbd_reconfig_done() to
conn_reconfig_start() and conn_reconfig_done().
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Andreas Gruenbacher [Mon, 14 Mar 2011 16:27:45 +0000 (17:27 +0100)]
drbd: Get rid of P_MAX_CMD
Instead of artificially enlarging the command decoding arrays to
P_MAX_CMD entries, check if an index is within the valid range using the
ARRAY_SIZE() macro.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Andreas Gruenbacher [Fri, 28 Jan 2011 12:28:51 +0000 (13:28 +0100)]
drbd: Remove redundant check
Opening a device only succeeds on a primary node, or when explicitly
setting the allow_oos module parameter to allow opening the device
read-only on a secondary node. There is no other way that a request can
get into drbd_make_request(), so this code cannot trigger.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Andreas Gruenbacher [Tue, 22 Feb 2011 01:15:32 +0000 (02:15 +0100)]
drbd: Improve how conflicting writes are handled
The previous algorithm for dealing with overlapping concurrent writes
was generating unnecessary warnings for scenarios which could be
legitimate, and did not always handle partially overlapping requests
correctly. Improve it algorithm as follows:
* While local or remote write requests are in progress, conflicting new
local write requests will be delayed (commit 82172f7).
* When a conflict between a local and remote write request is detected,
the node with the discard flag decides how to resolve the conflict: It
will ask its peer to discard conflicting requests which are fully
contained in the local request and retry requests which overlap only
partially. This involves a protocol change.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Andreas Gruenbacher [Tue, 1 Mar 2011 14:40:43 +0000 (15:40 +0100)]
drbd: Use ping-timeout when waiting for missing ack packets
When the node with the discard flag resolves write conflicts in
dual-primary mode, it may determine that its peer has sent ack packets
on the metadata socket which did not arrive, yet. Wait for the next ack
with ping-timeout instead of a hard-coded 30 seconds.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Andreas Gruenbacher [Sat, 26 Feb 2011 22:19:15 +0000 (23:19 +0100)]
drbd: Concurrent write detection fix
Commit 9b1e63e changed the concurrent write detection algorithm to only insert
peer requests into write_requests tree after determining that there is no
conflict. With this change, new conflicting local requests could be added
while the algorithm runs, but this case was not handled correctly. Instead of
making the algorithm deal with this case, switch back to adding peer requests
to the write_requests tree immediately: this improves fairness.
When a peer request is discarded, remove that request from the write_requests
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Lars Ellenberg [Mon, 21 Feb 2011 12:21:00 +0000 (13:21 +0100)]
drbd: allow to select specific bitmap pages for writeout
We are about to allow several changes to the active set in one activity
log transaction. We have to write out the corresponding bitmap pages as
well, if changed.
Introduce drbd_bm_mark_for_writeout(), then re-use the existing bitmap
writeout path to submit all marked pages in one go.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>