Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

drivers/libusb{0,1}.c: reset driver parameters while enumerating #2611

Merged
merged 3 commits into from
Sep 11, 2024

Conversation

tofurky
Copy link
Contributor

@tofurky tofurky commented Sep 1, 2024

With an Arduino Leonardo compatible board attached alongside a CyberPower CP1500PFCLCD, usbhid-ups would incorrectly conclude that there was no kernel driver attached to the UPS and then fail subsequent steps accordingly.

The cause is the global usb_communication_subdriver_t struct; when a subdriver (e.g. arduino-hid) sets different values during the enumeration loop, they're not set back to defaults afterwards, causing issues with other subdrivers (e.g. cps-hid).

So, if a subdriver doesn't match, set a selection of fields back to their default values using the newly added nut_usb_subdriver_defaults() before the next attempt.

General points

  • Described the changes in the PR submission or a separate issue, e.g.
    known published or discovered protocols, applicable hardware (expected
    compatible and actually tested/developed against), limitations, etc.

  • There may be multiple commits in the PR, aligned and commented with
    a functional change. Notably, coding style changes better belong in a
    separate PR, but certainly in a dedicated commit to simplify reviews
    of "real" changes in the other commits. Similarly for typo fixes in
    comments or text documents.

  • Please star NUT on GitHub, this helps with sponsorships! ;)

Frequent "underwater rocks" for driver addition/update PRs

  • Revised existing driver families and added a sub-driver if applicable
    (nutdrv_qx, usbhid-ups...) or added a brand new driver in the other
    case.

  • Did not extend obsoleted drivers with new hardware support features
    (notably blazer and other single-device family drivers for Qx protocols,
    except the new nutdrv_qx which should cover them all).

  • For updated existing device drivers, bumped the DRIVER_VERSION macro
    or its equivalent.

  • For USB devices (HID or not), revised that the driver uses unique
    VID/PID combinations, or raised discussions when this is not the case
    (several vendors do use same interface chips for unrelated protocols).

  • For new USB devices, built and committed the changes for the
    scripts/upower/95-upower-hid.hwdb file

  • Proposed NUT data mapping is aligned with existing docs/nut-names.txt
    file. If the device exposes useful data points not listed in the file, the
    experimental.* namespace can be used as documented there, and discussion
    should be raised on the NUT Developers mailing list to standardize the new
    concept.

  • Updated data/driver.list.in if applicable (new tested device info)

Frequent "underwater rocks" for general C code PRs

  • Did not "blindly assume" default integer type sizes and value ranges,
    structure layout and alignment in memory, endianness (layout of bytes and
    bits in memory for multi-byte numeric types), or use of generic int where
    language or libraries dictate the use of size_t (or ssize_t sometimes).
  • Progress and errors are handled with upsdebugx(), upslogx(),
    fatalx() and related methods, not with direct printf() or exit().
    Similarly, NUT helpers are used for error-checked memory allocation and
    string operations (except where customized error handling is needed,
    such as unlocking device ports, etc.)

  • Coding style (including whitespace for indentations) follows precedent
    in the code of the file, and examples/guide in docs/developers.txt file.

  • For newly added files, the Makefile.am recipes were updated and the
    make distcheck target passes.

General documentation updates

  • Updated docs/acknowledgements.txt (for vendor-backed device support)

  • Added or updated manual page information in docs/man/*.txt files
    and corresponding recipe lists in docs/man/Makefile.am for new pages

  • Passed make spellcheck, updated spell-checking dictionary in the
    docs/nut.dict file if needed (did not remove any words -- the make
    rule printout in case of changes suggests how to maintain it).

Additional work may be needed after posting this PR

  • Propose a PR for NUT DDL with detailed device data dumps from tests
    against real hardware (the more models, the better).

  • Address NUT CI farm build failures for the PR: testing on numerous
    platforms and toolkits can expose issues not seen on just one system.

  • Revise suggestions from LGTM.COM analysis about "new issues" with
    the changed codebase.

@tofurky
Copy link
Contributor Author

tofurky commented Sep 1, 2024

Hey, so per the commit date I've been sitting on this one for a while. I saw that a new parameter was added in e1fc248 and so modified the new function to set that back to the default as well. I also saw 66c4e10 was added in the interim; I think this should still play nicely with it.

Some background, here is what happens without the patch with an Arduino Leonardo compatible board attached along with the CyberPower UPS (ID 2341:8036 Arduino SA Leonardo (CDC ACM, HID)):

Sep 01 19:42:23 aquos nut-driver@cyberpower[374383]: Network UPS Tools driver 2.8.2.989-989-g4bf05e115 (development iteration after 2.8.2) - Generic HID driver 0.56
Sep 01 19:42:23 aquos nut-driver@cyberpower[374383]: USB communication driver (libusb 1.0) 0.49
Sep 01 19:42:23 aquos nut-driver@cyberpower[374383]: Can't claim USB device [0764:0501]@0/2/0: Invalid parameter
Sep 01 19:42:23 aquos nut-driver@cyberpower[374383]: upsnotify: failed to notify about state 4: no notification tech defined, will not spam more about it
Sep 01 19:42:23 aquos nut-driver@cyberpower[374327]: Driver failed to start (exit status=1)

And with the patch, things enumerate as expected:

Sep 01 19:44:26 aquos nut-driver@cyberpower[428389]: Network UPS Tools upsdrvctl - UPS driver controller 2.8.2.989.1-990-g0d53e8d4e (development iteration after 2.8.2)
Sep 01 19:44:26 aquos nut-driver@cyberpower[428390]: Network UPS Tools driver 2.8.2.989.1-990-g0d53e8d4e (development iteration after 2.8.2) - Generic HID driver 0.56
Sep 01 19:44:26 aquos nut-driver@cyberpower[428390]: USB communication driver (libusb 1.0) 0.49
Sep 01 19:44:26 aquos nut-driver@cyberpower[428390]: Using subdriver: CyberPower HID 0.81
Sep 01 19:44:27 aquos usbhid-ups[428393]: Startup successful
Sep 01 19:44:27 aquos usbhid-ups[428393]: upsnotify: failed to notify about state 2: no notification tech defined, will not spam more about it
Sep 01 19:44:27 aquos nut-driver@cyberpower[428389]: WARNING: /sbin/upsdrvctl was called directly on a system with systemd support.
Sep 01 19:44:27 aquos nut-driver@cyberpower[428389]:     Please consider using 'upsdrvsvcctl' instead, to avoid conflicts with
Sep 01 19:44:27 aquos nut-driver@cyberpower[428389]:     nut-driver service instances prepared by 'nut-driver-enumerator'!
Sep 01 19:44:27 aquos systemd[1]: Started [email protected] - Network UPS Tools - device driver for NUT device 'cyberpower'.
Sep 01 19:44:27 aquos usbhid-ups[428393]: sock_connect: enabling asynchronous mode (auto)

@tofurky
Copy link
Contributor Author

tofurky commented Sep 2, 2024

Eh, looking more closely at 66c4e10 I think I might need to now change this slightly to avoid stepping on user-defined values.

I could copy the values after they're parsed e.g. after

		if ((s = getval("usb_hid_ep_out"))) {
			if (!str_to_ushort(s, &us, 16) || (us > USB_CTRL_ENDPOINT_MAX)) {
				fatalx(EXIT_FAILURE, "%s: could not parse usb_hid_ep_out", __func__);
			}
			usb_subdriver.hid_ep_out = (usb_ctrl_endpoint)us;
		}

@tofurky
Copy link
Contributor Author

tofurky commented Sep 2, 2024

What about wrapping each assignment back to defaults with getval()?

static void nut_usb_subdriver_defaults(usb_communication_subdriver_t *subdriver)
{
	if (!getval("usb_config_index"))
		subdriver->usb_config_index = LIBUSB_DEFAULT_CONF_INDEX;
	if (!getval("usb_hid_rep_index"))
		subdriver->hid_rep_index = LIBUSB_DEFAULT_INTERFACE;
	if (!getval("usb_hid_desc_index"))
		subdriver->hid_desc_index = LIBUSB_DEFAULT_DESC_INDEX;
	if (!getval("usb_hid_ep_in"))
		subdriver->hid_ep_in = LIBUSB_DEFAULT_HID_EP_IN;
	if (!getval("usb_hid_ep_out"))
		subdriver->hid_ep_out = LIBUSB_DEFAULT_HID_EP_OUT;
}

@jimklimov
Copy link
Member

jimklimov commented Sep 2, 2024

Sounds right, thanks!

Note there's also some USB work brewing in PR #2604, including some (static) method renaming into a common style with nut_libusb_*() for static internal methods and nut_usb_*() for a sort of public API - please check if that could impact your work.

Also, would be consistent to have a similar fix in libusb0.c (if applicable), in case some platforms would require to use it as long as we ship it.

And a NEWS.adoc entry would be great :)

@jimklimov jimklimov added bug USB service/daemon start/stop General subject for starting and stopping NUT daemons (drivers, server, monitor); also BG/FG/Debug Incorrect or missing readings On some devices driver-reported values are systemically off (e.g. x10, x0.1, const+Value, etc.) labels Sep 2, 2024
@jimklimov jimklimov added this to the 2.8.3 milestone Sep 2, 2024
@tormodvolden
Copy link
Contributor

tormodvolden commented Sep 3, 2024

(This should probably be discussed in a Discussion item, sorry for hi-jacking)
While working on #2604, and further investigating possible solutions for what is now filed as #2613 and #2614, I was thinking it would be good to get rid of most global variables, and stuff more into device or device handle structures. I actually started on some work[1] on an extended USB device handle USBHandle_t to replace the current use of "udev", which encapsulates the libusb_device_handle that udev originally points to. Here langid and what not could be carried around. I also thought the "subdriver" could fit into this scheme, but as soon as I started to port one driver over to the new "udev" I found it entangled with another and soon this appeared as much more work than I optimistically had set out for. Basically I don't know the nut architecture well enough to properly attack this, for instance the scope and lifetime of various elements, and how much goes on in one and a single process.

EDIT:
[1] This doesn't even build, but illustrates the idea: https://github.com/tormodvolden/nut/commits/extended_handle/

@tormodvolden
Copy link
Contributor

Something to think of (after merging this) is to rename nut's LIBUSB_* macros. "LIBUSB_" is kind of libusb's own namespace, so by including "libusb.h" there will be various LIBUSB_* macros defined there. The nut macros could be renamed to e.g. NUT_LIBUSB_* to avoid any future trouble.

@jimklimov
Copy link
Member

One of these recent USB discussions (can't find quickly via phone) also asked how much data is shared and in what scope - I guess global vars were picked by those before us as a sufficiently little evil for the model that every driver process is independent, even when copies of the same driver are used for different attached devices, or even same device (e.g. SNMP UPS watched via different MIBs).

@jimklimov
Copy link
Member

@tofurky : I was wondering if we can getval() in this code easily. Maybe it is more prudent to store initial values at the beginning of open method, and restore at the end - just into method-local variables?

@tofurky
Copy link
Contributor Author

tofurky commented Sep 5, 2024

@tofurky : I was wondering if we can getval() in this code easily. Maybe it is more prudent to store initial values at the beginning of open method, and restore at the end - just into method-local variables?

I just tested/pushed the change to call getval() - it seems to work OK at least to distinguish if the override variables are set.

@tofurky tofurky force-pushed the subdriver_probe_reset branch 2 times, most recently from 057ee77 to 4ecac8e Compare September 5, 2024 00:48
@tofurky
Copy link
Contributor Author

tofurky commented Sep 5, 2024

I added a note to NEWS.adoc; I think it mostly captures the gist of the what and why for this change.

I will look into building against libusb0. Presently I'm just plopping the debian/ dir into my git repository to build packages so will need to adjust them a bit.

@tofurky
Copy link
Contributor Author

tofurky commented Sep 6, 2024

I've rebuiilt against libusb0 (with no changes to libusb0.c) and I'm not seeing the issue:

Sep 05 20:48:16 aquos systemd[1]: Starting [email protected] - Network UPS Tools - device driver for NUT device 'cyberpower'...
Sep 05 20:48:16 aquos nut-driver@cyberpower[547183]: Network UPS Tools upsdrvctl - UPS driver controller 2.8.2.1010.1-1011-g4ecac8ecc (development iteration after 2.8.2)
Sep 05 20:48:16 aquos nut-driver@cyberpower[547184]: Network UPS Tools driver 2.8.2.1010.1-1011-g4ecac8ecc (development iteration after 2.8.2) - Generic HID driver 0.56
Sep 05 20:48:16 aquos nut-driver@cyberpower[547184]: USB communication driver (libusb 0.1) 0.48
Sep 05 20:48:17 aquos nut-driver@cyberpower[547184]: Using subdriver: CyberPower HID 0.81
Sep 05 20:48:18 aquos nut-driver@cyberpower[547183]: WARNING: /sbin/upsdrvctl was called directly on a system with systemd support.
Sep 05 20:48:18 aquos nut-driver@cyberpower[547183]:     Please consider using 'upsdrvsvcctl' instead, to avoid conflicts with
Sep 05 20:48:18 aquos nut-driver@cyberpower[547183]:     nut-driver service instances prepared by 'nut-driver-enumerator'!

Running with -DDD shows that it's only probing a handful of devices (Linux Foundation N root hub) before finding the UPS. So there may be differences in the enumeration order between the 2 libraries. That's not to say the issue doesn't also exist there, just that I couldn't reproduce it. I am still looking into how to apply the same parameter reset in libusb0.c.

With an Arduino Leonardo compatible board attached alongside a
CyberPower CP1500PFCLCD, usbhid-ups would incorrectly conclude that
there was no kernel driver attached to the UPS and then fail subsequent
steps accordingly.

The cause is the global usb_communication_subdriver_t struct; when a
subdriver (e.g. arduino-hid) sets different values during the
enumeration loop, they're not set back to defaults afterwards, causing
issues with other subdrivers (e.g. cps-hid).

So, if a subdriver doesn't match, set a selection of fields back to
their default values using the newly added nut_usb_subdriver_defaults()
before the next attempt.

Signed-off-by: Matt Merhar <[email protected]>
@tofurky tofurky changed the title drivers/libusb1.c: reset driver parameters while enumerating drivers/libusb{0,1}.c: reset driver parameters while enumerating Sep 6, 2024
@tofurky
Copy link
Contributor Author

tofurky commented Sep 6, 2024

I rebased on master and force-pushed an identical change to libusb0.c.

This was compile and runtime tested, but not confirmed to fix the issue since I can't repro. I intentionally set a wrong usb_hid_rep_index value in ups.conf and confirmed that things still broke as expected, though. A wrong usb_config_index (1) caused a segfault, didn't look too much into it though:

   0.268093	[D2] Trying to match device
   0.268094	[D2] match_function_subdriver (non-SHUT mode): matching a device...
   0.268095	[D3] match_function_regex: matching a device...
   0.268110	[D2] Device matches
   0.268118	[D3] nut_usb_set_altinterface: skipped usb_set_altinterface(udev, 0)
   0.271084	[D2] Retrieved HID descriptor (expected 9, got 9)
   0.271089	[D3] HID descriptor, method 1: (9 bytes) => 09 21 10 01 21 01 22 ae 01
   0.271090	[D3] HID descriptor length (method 1) 430
Segmentation fault

With an Arduino Leonardo compatible board attached alongside a CyberPower CP1500PFCLCD, usbhid-ups would incorrectly conclude that there was no kernel driver attached to the UPS and then fail subsequent steps accordingly.

The cause is the global usb_communication_subdriver_t struct; when a subdriver (e.g. arduino-hid) sets different values during the enumeration loop, they're not set back to defaults afterwards, causing issues with other subdrivers (e.g. cps-hid).

So, if a subdriver doesn't match, set a selection of fields back to their default values using the newly added nut_usb_subdriver_defaults() before the next attempt.

// tofurky

Signed-off-by: Jim Klimov <[email protected]>
@jimklimov jimklimov merged commit 5203957 into networkupstools:master Sep 11, 2024
28 of 30 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Incorrect or missing readings On some devices driver-reported values are systemically off (e.g. x10, x0.1, const+Value, etc.) service/daemon start/stop General subject for starting and stopping NUT daemons (drivers, server, monitor); also BG/FG/Debug USB
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants