Daniel Lenski [Sat, 20 Nov 2021 21:12:34 +0000 (13:12 -0800)]
Refuse to handle forms without ->auth_id (but do it in the right place, and noisily)
In 0b47ea1882346fdedfcd8a315f51aeb39e13459e ("Refuse to handle forms without
->auth_id"), the process_auth_form_cb for the OpenConnect CLI was modified
to silently reject forms with auth_id unset.
Issues with this:
1. If a form doesn't have its auth_id set, it'll fail *silently*, which
makes it confusingly difficult to identify the root cause. (See #351.)
2. As that commit message says, GUIs/front-ends need the auth_id to be set,
but it didn't do anything to enforce it other than for the CLI.
The solution is to reject forms with auth_id unset in process_auth_form()
itself, rather than expecting the front-ends’ callback functions to check
this, and to do so with an error message explaining that this is a bug in
OpenConnect.
Daniel Lenski [Fri, 19 Nov 2021 19:26:59 +0000 (11:26 -0800)]
Bugfix F5 'plain' login form
Fixes #351.
It turns out that the "plain auth form" for F5 has been broken since 0b47ea1882346fdedfcd8a315f51aeb39e13459e. (This form is used as a stand-in
when the server fails to provide a real HTML login form, and only gives a
JavaScript mess that dynamically creates a form.)
DW probably didn't notice this when making that commit, because a07bbbcd711c6fc4ef20fd9719e0e5da40f141f9 had previously fixed the other
places where a form with no `auth_id` could be created.
This ensures that the "plain auth form" has its `auth_id` set, and adds test
coverage of this form to `f5-auth-and-config`.
Daniel Lenski [Wed, 13 Oct 2021 01:25:40 +0000 (18:25 -0700)]
Juniper/NC ESP rekey fix
This should fix https://gitlab.com/openconnect/openconnect/-/issues/322.
This was an immensely tricky issue to identify and solve. Many thanks to
@john508 for his patient logging, testing, and bisecting.
This came down to a subtle combination of 2 problems, described further in
https://gitlab.com/openconnect/openconnect/-/issues/322#note_702122197.
First, a bug:
The code for Juniper/Pulse 'dontsend' added in
https://gitlab.com/openconnect/openconnect/-/commit/b4f50f8bd5da7e6ac926ddd5095501edbc204cd0
caused IPv6 packets to get sent over the Juniper oNCP/TLS channel in a
completely mangled form, thereby confusing the server and making it not
respond to subsequent ESP rekey packets sent over the oNCP/TLS channel.
The fix is to use this code path *only* for the Pulse protocol, not for the
Juniper protocol. Juniper cannot do ESP-over-IPv6 at all, and cannot send
tunneled IPv6 packets at all (neither via ESP-over-IPv4, nor via oNCP/TLS).
Second, we introduced a subtle regression against a Juniper server behavior
which we weren't previously aware of:
Juniper servers apparently don't always resend the IP address in a
configuration packet, if that configuration packet is for
rekey/reconnection only.
To address this, we need to add a special case for Juniper to the
install_vpn_opts() function, which was added in
https://gitlab.com/openconnect/openconnect/-/commit/3d845bc9bf62b3e816e0d2fd72970ef33964a191.
Daniel Lenski [Sat, 13 Nov 2021 18:35:36 +0000 (10:35 -0800)]
Fix/update comments in fake-*-server.py scripts
Also adds a stub to make sure that 'fake-juniper-server.py' rejects attempts
to start speaking the Junos/Pulse protocol, rather than the Juniper/oNCP
protocol.
Daniel Lenski [Fri, 12 Nov 2021 01:16:01 +0000 (17:16 -0800)]
Re-add TAP-Windows driver to installer, and update docs to reference its inclusion
This mostly reverts commit 8c1d2de09829d0930d1a7642b0729cd6cbf5ab67 ("Remove
TAP-Windows driver from installer, and update docs to reference Wintun's
default inclusion").
It appears that Wintun is not fully stable yet for OpenConnect VPN connections.
See #338 for two reports of instability (with #324 as a possible third), including
one where switching to TAP-Windows clearly resolved the issue.
Per https://gitlab.com/openconnect/openconnect/-/issues/338#note_731175712,
we should revert to installing and using TAP-Windows by default until we can figure out why.
OpenConnect will continue to use Wintun if TAP-Windows is unavailable, but will
issue a loud warning to the log.
Daniel Lenski [Thu, 21 Oct 2021 00:48:12 +0000 (17:48 -0700)]
Add support for Fortinet's HTML-type multi-factor authentication
Fortinet is known to support *two* different types of MFA/2FA challenges
(see details in
https://gitlab.com/openconnect/openconnect/-/issues/332#note_709413011):
1. HTTP 200 response to initial login with plain-text body
(`ret=...,reqid=...,polid=...,...tokeninfo=,chal_msg=...`) containing
prompt and fields to be parroted back.
2. HTTP 401 response to initial login with HTML body containing a "normal"
HTML form to be filled out and sent back.
Daniel Lenski [Mon, 18 Oct 2021 03:03:23 +0000 (20:03 -0700)]
Enable Fortinet DPD even if server doesn't say that reconnect-after-drop is allowed
Even if we can't automatically reconnect, disabling DPD doesn't seem like a
good idea. This will just recapitulate the poor behavior of official
Fortinet clients, which don't provide useful signals about dropped
connection.
Instead, let's just tell users explicitly that reconnect probably won't work
in the logging messages.
Daniel Lenski [Fri, 24 Sep 2021 22:50:53 +0000 (22:50 +0000)]
Annotate vpnc-script-win.js with a header documenting its exact source revision
By annotating the vpnc-script-win.js included in the Windows installers
with information about the exact source revision, we will ease reporting
and reduce confusion about problems resulting from interactions between
OpenConnect and vpnc-script.
Example of the header added to vpnc-script-win.js:
// This script matches the version found at https://gitlab.com/openconnect/vpnc-scripts/-/blob/b749c2cadc2f32e2efffa69302861f9a7d4a4e5f/vpnc-script-win.js
// Updated on 2021-09-24 by Daniel Lenski <dlenski@gmail.com> ("Ensure that vpnc-script-win.js works even if INTERNAL_IP4_{NETADDR,NETMASK} are unset")
//
The logging statement has the call of the form
logging.(format_string % (format_args...))
For such calls, it is recommended to leave string interpolation to the
logging method itself and be written as
logging.(format_string, format_args...)
so that the program may avoid incurring the cost of the interpolation
in those cases in which no message will be logged. For more details, see
PEP 282.
Do not use a mutable like `list` or `dictionary` as a default value to an
argument. Python’s default arguments are evaluated once when the function
is defined. Using a mutable default argument and mutating it will mutate
that object for all future calls to the function as well.
Consider using literal syntax to create the data structure
Using the literal syntax can give minor performance bumps compared to
using function calls to create `dict`, `list` and `tuple`.
This is because here, the name `dict` must be looked up in the global
scope in case it has been rebound. Same goes for the other two types
`list()` and `tuple()`.
Using the `len` function to check if a sequence is empty is not idiomatic
and can be less performant than checking the truthiness of the object.
`len` doesn't know the context in which it is called, so if computing the
length means traversing the entire sequence, it must; it doesn't know
that the result is just being compared to 0. Computing the boolean value
can stop after it sees the first element, regardless of how long the
sequence actually is.
Daniel Lenski [Thu, 16 Sep 2021 19:51:43 +0000 (12:51 -0700)]
Provide the vpnc-script with our PID (as $VPNPID)
This will enable a vpnc-script to more easily identify which VPN connection
is calling it, in the case of multiple concurrent or "stacked" VPN
connections.
Because OpenConnect (and vpnc) invoke the vpnc-script via an intermediate
shell process, the vpnc-script would otherwise need to determine its
GRANDparent PID, which is an error-prone, and not easily portable, process.
See https://gitlab.com/openconnect/vpnc-scripts/-/issues/28 and
https://gitlab.com/openconnect/vpnc-scripts/-/merge_requests/36 for issues
with the current approaches.
Daniel Lenski [Fri, 2 Apr 2021 07:16:59 +0000 (00:16 -0700)]
Try to delete-and-reclaim IP addresses from down interfaces
As mentioned previously, Windows will not allow us to set an IP address on
the tunnel interface (using 'netsh' in vpnc-script-win.js) if that address
is assigned to another interface, *even if* that interface is not currently
up.
In order to figure out if our interface's desired IP address(es) are
assigned to any other interface(s), we have to iterate over the results of
`GetAdaptersAddresses`. This retrieves a list of all IP addresses (both
Legacy and IPv6) by interface, along with up/down status thereof.
If we find another interface that is assigned our desired IP address, but is
not up, we should attempt to unassign/delete/reclaim the address from this
interface. In order to do that, we need to use a completely separate API
(`GetUnicastIpAddressTable`) to get a list of all assigned IP addresses in a
form that can be used to delete them (with `DeleteUnicastIpAddressEntry`).
The simplest way to structure the code to do this is to use two nested
loops:
- The outer loop iterates over the adapters and addresses returned by
`GetAdaptersAddresses`
- The inner loop — which runs only in the case of a conflicting address —
iterates over the addresses as returned by `GetUnicastIpAddressTables`.
- Perhaps this could be optimized, but because this runs only once per VPN
connection, and a typical client system will have <<N=100 adapters and IP
addresses… it doesn't seem worth it.
NB: I had previously tried avoiding the use of `GetUnicastIpAddressTable`
altogether by creating a fake/synthetic `MIB_UNICASTIPADDRESS_ROW` with
most of the fields filled in using information returned by
`GetAdaptersAddresses`, and passing this directly to
`DeleteUnicastIpAddressEntry`. It simply did not work, perhaps due to
the absence of one or two seemingly-unimportant fields whose values
couldn't be determined. (See 4acc773c69d8b06da2983fd5fe70cca357822503)
Daniel Lenski [Fri, 2 Apr 2021 05:57:02 +0000 (22:57 -0700)]
Add check_address_conflicts() to tun-win32.c
If other adapters have conflicting IPv4/IPv6 addresses, vpnc-script-win.js
will fail to configure the tunnel addresses correctly.
For now, check_address_conflicts() only checks and fails; it doesn't try to
delete/reclaim the conflicting addresses.
[ Yay, Windows, for persisting a value we don't need or want persisted, and
not allowing us to override this with 'netsh interface ipvX set address'.
This appears to be a limitation of the Windows 'netsh' configuration
utility. ]
Daniel Lenski [Thu, 1 Apr 2021 00:21:41 +0000 (17:21 -0700)]
Don't set Legacy IP address on Windows tunnel interface within OpenConnect itself
The pre-existing Windows routing connection script (vpnc-script-win.js) was
not able to reliably set the Legacy IP address of the tunnel interface, so
we've been setting it in OpenConnect itself since 60d1f092e35f05217f1c96823c4f1b86c7915bbd.
As of https://gitlab.com/openconnect/vpnc-scripts/-/merge_requests/26, we've
got vpnc-script-win.js setting the Legacy IP address correctly, and IPv6 as
well. We should leave this as the script's responsibility for
cross-platform consistency.
Daniel Lenski [Wed, 31 Mar 2021 01:08:00 +0000 (18:08 -0700)]
Use hostname as Wintun ifname (if ifname not specified)
Without this change, OpenConnect will *only* attempt to use the Wintun
driver if `-i InterfaceName`, and will require TAP-Windows driver otherwise.
That seems like a surprising and hard-to-discover behavior.
Instead, we should use the VPN server's hostname as a sane default interface
name with Wintun, and only attempt to use TAP-Windows as a fallback in the
case where Wintun can't be initialized.
Daniel Lenski [Wed, 11 Aug 2021 17:12:40 +0000 (17:12 +0000)]
Only remove ERR_GET_FUNC for OpenSSL v3.0 and newer
This function is removed in OpenSSL 3.0 beta 2, per
https://github.com/openssl/openssl/blob/openssl-3.0.0-beta2/CHANGES.md:
> The ERR_GET_FUNC() function was removed. With the loss of
> meaningful function codes, this function can only cause
> problems for calling applications.
It appears that this function may not have had any useful purpose for a long
time (see
https://gitlab.com/openconnect/openconnect/-/merge_requests/262#note_648720006),
but in the absence of clear documentation or testing, we should limit its
remove to OpenSSL 3.0+ to be on the safe side.
Dimitri Papadopoulos [Sun, 1 Aug 2021 21:04:42 +0000 (23:04 +0200)]
Build with OpenSSL 3.0 beta 2 Release Candidate
From the OpenSSL 3.0 Migration guide:
The function code part of an OpenSSL error code is no longer relevant
This code is now always set to zero. Related functions are deprecated.
In our case, removing calls to ERR_GET_FUNC() will not change anything:
The reason code PKCS12_R_MAC_VERIFY_FAILURE is raised in two OpenSSL functions:
* PKCS12_newpass() in p12_npas.c,
* PKCS12_parse() in p12_kiss.c.
In out code, we check the reason code is PKCS12_R_MAC_VERIFY_FAILURE after
calling PKCS12_parse(), so the incriminated function is necessarily
PKCS12_parse(). Verifying the function code is PKCS12_F_PKCS12_PARSE is
redundant.
EVP_F_EVP_DECRYPTFINAL_EX / EVP_R_BAD_DECRYPT
The reason code EVP_R_BAD_DECRYPT is raised in a single OpenSSL function:
* EVP_DecryptFinal_ex() in evp_enc.c
Therefore verifying the function code is EVP_F_EVP_DECRYPTFINAL_EX is
useless, EVP_F_EVP_DECRYPTFINAL_EX is the only possible value.
Mark auth-swtpm test as XFAIL on Fedora/OpenSSL and Fedora/OpenSSL/clang
Apparently, verifying that either 'tsstartup' or 'tpm2_startup' is available
is *not* sufficient to make auth-swtpm tests work again. See error log at
https://gitlab.com/openconnect/openconnect/-/issues/287#note_641338923
[Originally by DP. DL added Fedora/OpenSSL/clang as well]
Signed-off-by: Dimitri Papadopoulos <3350651-DimitriPapadopoulos@users.noreply.gitlab.com> Signed-off-by: Daniel Lenski <dlenski@gmail.com>
Daniel Lenski [Mon, 2 Aug 2021 20:41:29 +0000 (13:41 -0700)]
Verify that TPMv2 startup tools are present in order to enable auth-swtpm tests
Autoconf source now verifies that either 'tpm2_startup' or 'tsstartup' is
found before enabling this test.
See discussion at https://gitlab.com/openconnect/openconnect/-/issues/287#note_640185660.
I also added tpm2-tools (package providing 'tpm2_startup') to the Fedora
build image, in https://gitlab.com/openconnect/build-images/-/commit/35ee4ffb88ba319014c321dc8999e48fce81f130.
Daniel Lenski [Mon, 2 Aug 2021 16:58:04 +0000 (09:58 -0700)]
Mark sync/no-HDLC PPP tests as XFAIL for all CI images
See https://gitlab.com/openconnect/openconnect/-/issues/287#note_641198529 for discussions.
Across all CI images, non-HDLC PPP tests are consistently failing (this is
described as "synchronous" framing in the '90s-era terminology of pppd, and
is supported by 'pppd sync').
FAIL: ppp-over-tls
==================
Testing PPP ...
[...]
Starting PPP peer (sync/no-HDLC, IPv4+IPv6, DNS, extraneous VJ and CCP)... started in 0 seconds
2021/07/31 20:54:18 socat[10622] E waitpid(): child 10625 exited with status 1
Connecting to it with openconnect --protocol=nullppp... failed (after 0 seconds)
[...]
===== START pppd log =====
Couldn't set tty to PPP discipline: Invalid argument
The 'pppd sync' support has always appeared to be a fairly marginal part of
pppd capabilities, brittle and not well-tested, and I've run into other
problems with it before (see eaabbb09 for example).
This is frustrating because non-HDLC/pre-framed PPP is the version that is
(and should be!) used in all modern implementations of PPP, including F5 and
Fortinet's implementations.
This patch splits the sync/no-HDLC PPP test into a separate script
(ppp-over-tls-sync), and marks it as XFAIL for all CI runs, so that we can
continue to test it by default when running locally, and to fail on the
other PPP tests (which use async mode aka “HDLC-like” framing).
Daniel Lenski [Sat, 31 Jul 2021 14:42:12 +0000 (07:42 -0700)]
Use sysctl to un-disable IPv6 for all CI runs where PPP tests are enabled
See https://gitlab.com/openconnect/openconnect/-/issues/287#note_640115686,
and https://gitlab.com/openconnect/vpnc-scripts/-/issues/12#note_547951023
for where this issue was originally discovered (specifically on the Ubuntu
18.04 CI runs).
David Woodhouse [Wed, 28 Jul 2021 15:52:26 +0000 (16:52 +0100)]
Make all cert rules order-only
For some reason, perhaps a make update or perhaps just higher precision
timestamps causing some files to actually appear as older than others,
the CI has taken to rebuilding all the certs. Don't do that.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
The comments on TPM2TSS_GENKEY and CREATE_TPM2_KEY say the former can
only create keys, while the latter can import them too, but we used them
the other way around. This causes the auth-hwtpm test to fail on
machines just with tpm2-tss-engine installed.
While I haven't found official documentation for the TAP_IOCTL_GET_VERSION
control code, clearly the DeviceIoControl() parameters were incorrect,
see other online examples:
https://github.com/juhovh/tapcfg/blob/3d5ef74/src/lib/tapcfg_windows.c#L140-L146
https://github.com/OpenVPN/openvpn/blob/34b4254/src/openvpn/tun.c#L6030-L6032