www.infradead.org Git - users/dwmw2/openconnect.git/commit

Increase SO_SNDBUF on UDP socket

In commit 3444f811ae ("Set SO_SNDBUF on DTLS socket and handle -EAGAIN
on it") in 2013 I cut the socket wmem to just 2 packets in size, to fix
bufferbloat issues when handling VoIP streams simultaneously with bulk
uploads.

In #250 we discovered that this is a problem for high-bandwidth setups
because we aren't actually keeping the send buffers full. Even though
there's only ~100µs of latency between getting -EAGAIN on the UDP
socket, and finally sending the packet successfully...

20:03:38.803525 sendmsg(5, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\27\376\375\0\1\0\0\0\2m\226\5Q\0\1\0\0\0\2m\226\177\202#\344>-\315H6N\233"..., iov_len=1374}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = -1 EAGAIN (Resource temporarily unavailable)
20:03:38.803556 select(4, [3], NULL, NULL, {tv_sec=0, tv_usec=0}) = 0 (Timeout)
20:03:38.803585 select(7, [3 5 6], [5], [5], {tv_sec=6, tv_usec=0}) = 1 (in [6], left {tv_sec=5, tv_usec=999998})
20:03:38.803616 sendmsg(5, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\27\376\375\0\1\0\0\0\2m\226\5Q\0\1\0\0\0\2m\226\177\202#\344>-\315H6N\233"..., iov_len=1374}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 1374

... that's 100µs when the physical network device actually has nothing to
send. And it's happening over and over again as the UDP socket buffers
fill and then we poll for them to be sufficiently empty again.
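
As a rough illustration of that send-then-wait cycle (a sketch only, not
the code from this commit): the hypothetical helper send_or_wait() below
tries a non-blocking send and, on -EAGAIN, waits for the socket to become
writable before retrying. It uses poll() for brevity, whereas the trace
above shows select(). The helper name and the timeout parameter are
assumptions made for the example.

```c
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper, for illustration only.
 * Try to send one datagram without blocking. If the socket buffer is
 * full (EAGAIN), wait up to timeout_ms for it to become writable and
 * try again. Returns bytes sent, or -1 with errno set. */
static ssize_t send_or_wait(int fd, const void *buf, size_t len, int timeout_ms)
{
    for (;;) {
        ssize_t ret = send(fd, buf, len, MSG_DONTWAIT);
        if (ret >= 0)
            return ret;
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;              /* a real error, not a full buffer */

        /* Buffer full: sleep until the kernel says we can write again.
         * This wait is where the ~100µs per stall is spent. */
        struct pollfd pfd = { .fd = fd, .events = POLLOUT };
        int n = poll(&pfd, 1, timeout_ms);
        if (n == 0) {
            errno = EAGAIN;         /* timed out, still not writable */
            return -1;
        }
        if (n < 0 && errno != EINTR)
            return -1;              /* poll() itself failed */
    }
}
```

With a send buffer of only two packets, nearly every burst ends up in the
poll() branch, which is the stall visible in the trace.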

This is exacerbated by the fact that ip_info.mtu is actually zero at
this point for PPP protocols — but only slightly because 2*1500 isn't
actually much more than the minimum value we get when we ask for zero
anyway.

It also doesn't help that the kernel deliberately doesn't wake a waiter
until they can make "significant" progress and the buffers are down to
*half* the sndbuf value (see sock_def_write_space() in net/core/sock.c).

Testing with 'iperf3 -u -b 900M' on my home network, with a 1Gb/s link
between client and (Fortinet) server, I find I have to use a value of
around 20000 (which the kernel doubles to 40000) in order to avoid seeing
drops on the *tun* interface because we aren't moving packets fast enough.
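
The doubling can be observed directly. A minimal sketch, assuming Linux
behaviour as described in socket(7): the hypothetical helper
set_and_read_sndbuf() requests a send buffer size and returns what the
kernel reports back. The helper name is made up for this example.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper, for illustration only.
 * Ask for `requested` bytes of send buffer and return the value the
 * kernel reports afterwards, or -1 on error. On Linux the reported
 * value is normally twice the request, subject to the
 * net.core.wmem_max limit and the kernel's minimum. */
static int set_and_read_sndbuf(int fd, int requested)
{
    int actual = 0;
    socklen_t len = sizeof(actual);

    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &requested, sizeof(requested)) < 0)
        return -1;
    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &actual, &len) < 0)
        return -1;
    return actual;
}
```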

We need to find a decent balance between high-bandwidth and bufferbloat,
so let's try the following: First "assume" 1500 for the MTU if it isn't
actually set, and then multiply by the configured queue len which
defaults to 10. That ought to be a reasonable compromise for bandwidth
vs. bufferbloat for the general case, *and* allows users to tweak it.
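
The sizing rule described above can be sketched as follows. This is an
illustration under stated assumptions, not the patch itself: the function
name dtls_sndbuf_size() is hypothetical, and falling back to 10 for a
non-positive queue length is an assumption added for the example.

```c
/* Hypothetical helper, for illustration only.
 * Compute the SO_SNDBUF value to request: MTU times queue length,
 * assuming an MTU of 1500 when it has not been set yet (e.g. zero for
 * PPP protocols at this point). */
static int dtls_sndbuf_size(int mtu, int queue_len)
{
    if (mtu <= 0)
        mtu = 1500;         /* "assume" 1500 if the MTU isn't set */
    if (queue_len <= 0)
        queue_len = 10;     /* assumption: fall back to the default of 10 */

    return mtu * queue_len;
}
```

With the defaults this requests 15000 bytes, in the same range as the
~20000 found by testing, and raising the queue length scales it up for
high-bandwidth links.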

Signed-off-by: David Woodhouse <dwmw2@infradead.org>

author	David Woodhouse <dwmw2@infradead.org>
	Tue, 8 Jun 2021 19:34:45 +0000 (20:34 +0100)
committer	David Woodhouse <dwmw2@infradead.org>
	Tue, 8 Jun 2021 19:52:49 +0000 (20:52 +0100)
commit	d4ba1e1decbbdbe0f13ef27f327836c130af7ad4
tree	f0babc722b8d1b6b98c1df8f02f0f5ce989066f2
parent	69a076958fefda880db60ea50d5d3c8f52d3cb39