Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect data from the **riello_usb** driver #1895

Closed
shgubar opened this issue Apr 3, 2023 · 21 comments
Closed

Incorrect data from the **riello_usb** driver #1895

shgubar opened this issue Apr 3, 2023 · 21 comments
Labels
bug impacts-release-2.8.0 Issues reported against NUT release 2.8.0 (maybe vanilla or with minor packaging tweaks) impacts-release-2.8.1 Issues reported against NUT release 2.8.1 (maybe vanilla or with minor packaging tweaks) Incorrect or missing readings On some devices driver-reported values are systemically off (e.g. x10, x0.1, const+Value, etc.) Riello Riello UPS devices (serial, usb), usually with a variant of Qx protocol USB

Comments

@shgubar
Copy link

shgubar commented Apr 3, 2023

HI,
I have two identical UPS - Riello SEP 1000 ER, they differ only in firmware.

  1. Riello SEP 1000 ER - SWM073-01-00
  2. Riello SEP 1000 ER - SWM073-01-01

The problem is precisely with the one in which the firmware is SWM073-01-01.
When the NUT Server starts, it displays the correct data, but after some time (it is different), the following data is displayed:

output.L1.current: 52432
output.L1.power: 4291611856
output.L1.realpower: 4291611856
output.L2.current: 52432
output.L2.power: 4291611856
output.L2.realpower: 4291611856
output.L3.current: 52432
output.L3.power: 4291611856
output.L3.realpower: 4291611856

All other data are displayed correctly on both UPS.
By the way, there are no problems with a UPS with SWM073-01-00 firmware (only with the exception of the serial number - does not show).

Please help fix this?
Maybe need to provide some more data to analyze this problem?

Thanks.

@jimklimov jimklimov added bug USB Riello Riello UPS devices (serial, usb), usually with a variant of Qx protocol labels Apr 3, 2023
@jimklimov
Copy link
Member

jimklimov commented Apr 3, 2023

I wonder if it is a similar issue to that discussed in https://alioth-lists.debian.net/pipermail/nut-upsdev/2017-November/007358.html or #530 in some way?

  • 4291611856 = 0xFFCCCCD0
  • 52432 = 0xCCD0

So at least the two numbers seem "related" (same lower word). Might be a firmware bug or otherwise unexpected behavior (e.g. new protocol?)

CC @mzampieri70 @CentroricercheRPS

PS: @shgubar Following issues #1744 and #1754, am I correct to assume you are running recent NUT (custom-built from master-branch codebase this year), so possible deficiencies in older distro-packaged code should not be a lead to investigate here?

@shgubar
Copy link
Author

shgubar commented Apr 4, 2023

I wonder if it is a similar issue to that discussed in https://alioth-lists.debian.net/pipermail/nut-upsdev/2017-November/007358.html or #530 in some way?

I'm not sure if this is exactly the same problem. Because I have already done what is described in the given example, but the result is the same.

PS: @shgubar Following issues #1744 and #1754, am I correct to assume you are running recent NUT (custom-built from master-branch codebase this year), so possible deficiencies in older distro-packaged code should not be a lead to investigate here?

Yes, almost everything is true. I used the last master-branch when discussing the problem in the ticket #1858 but still there is a problem.

@jimklimov jimklimov added the Incorrect or missing readings On some devices driver-reported values are systemically off (e.g. x10, x0.1, const+Value, etc.) label Sep 16, 2023
@jimklimov
Copy link
Member

By the way, given that you mentioned two devices and only one of them with the problem - can you please post upsc dumps of both, or better yet - driver start-up logs with debug verbosity 3 or more? Maybe it would help someone figure out what differs or how it becomes wrong.

For example, stop the regularly running driver services and instead run riello_usb -a rps1000 -DDDDDD -d 1 2>&1 | tee -a rps1000.log to collect the outputs into a file (and see them in shell).

Also from the above I understand that both of your devices see correct info after startup, but one becomes wrong after some time? Maybe some bitmask maths vs. int sizes misfired (fixed some bug about those on master in the summer BTW, but I think there the wrong effect was instant). In any case, the number of the -d X parameter means to run X data update loops and dump then. So it might be possible to see with logs in-vivo where it flips from reasonable numbers to bogus ones.

Since you build from source, feel free to add more upsdebugx() messages in interesting points to pinpoint it better, and run the probing driver from the build workspace. Notes in https://github.com/networkupstools/nut/wiki/Building-NUT-for-in%E2%80%90place-upgrades-or-non%E2%80%90disruptive-tests can help set up the build so you would use the same NUT configs as in the regularly running system.

And one more thing, about what I noted earlier, that these numbers seem related in binary representation - on the "physical meaning" side they describe current (Amps) and power (Watts) so should not be intertwined like this...

@shgubar
Copy link
Author

shgubar commented Oct 2, 2023

These are very good ideas for further analysis of the problem device itself.
But unfortunately, I won't be able to run it and test it in the near future to share the logs.
As soon as I do it, I'll let you know right away.

Thank you.

@shgubar
Copy link
Author

shgubar commented Nov 13, 2023

Hi, @jimklimov
An analysis of the operation of the second UPS (with firmware SWM073-01-01) was carried out. The analysis was created in several stages:

  1. NUT version 2.8.0.1 was used (with driver version 0.08):
    1.1 used to launch two drivers at the same time (normal operation of the NUT)
    1.2 applied the operation of only one driver - the other is disabled in all configurations - done both for the first and for the second separately
  2. NUT version 2.8.0.1 was used (with driver version 0.11) - the same application options as in 1.1 and 1.2
  3. NUT version 2.8.1.1 was used (with driver version 0.11) - the same application options - 1.1 and 1.2

So, version 1.1 worked for a long time with only the error I pointed out when creating this issue.

In version 1.2 when working only with the UPS driver (rps1000):

Nov 9 18:17:31 raspserv riello_usb[12234]: Communication with UPS is lost: status read failed!
Nov 9 18:17:31 raspserv nut-server[1022]: Data for UPS [rps1000] is stale - check driver
Nov 9 18:17:31 raspserv upsd[1022]: Data for UPS [rps1000] is stale - check driver
Nov 9 18:17:33 raspserv riello_usb[12234]: Communication with UPS is lost: status read failed!
Nov 9 18:17:34 raspserv nut-monitor[1024]: Poll UPS [rps1000@localhost] failed - Data stale
Nov 9 18:17:34 raspserv nut-monitor[1024]: Communications with UPS rps1000@localhost lost

These errors were constant. Although when both drivers were working, these errors were from rps1000, but periodically.
Further on point 2 (using the same approaches) - it became more and more clear that erroneous data are obtained not only for:

output.L1-3.current
output.L1-3.power
output.L1-3.realpower

and it also applied to the output voltage (reached a value of 15580.8), and the battery charge (a value of 11223), etc.

Regarding point 3 - the same behavior as in option 1, but something has been added in the logs:

Nov 11 12:43:11 raspserv riello_usb[12234]: Communication with UPS is lost: status read failed!
Nov 11 12:43:11 raspserv nut-server[1022]: Data for UPS [rps1000] is stale - check driver
Nov 11 12:43:11 raspserv upsd[1022]: Data for UPS [rps1000] is stale - check driver
Nov 11 12:43:13 raspserv riello_usb[12234]: Communication with UPS is lost: status read failed!
Nov 11 12:43:14 raspserv nut-monitor[1024]: Poll UPS [rps1000@localhost] failed - Data stale
Nov 11 12:43:14 raspserv nut-monitor[1024]: Communications with UPS rps1000@localhost lost
Nov 11 12:43:14 raspserv nut-monitor[1024]: UPS [rps1000@localhost] was last known to be on BYPASS and currently is not communicating, assuming dead
Nov 11 12:43:14 raspserv nut-monitor[1024]: Network UPS Tools upsmon 2.8.1.1
Nov 11 12:43:15 raspserv riello_usb[12234]: Communication with UPS is lost: status read failed!
Nov 11 12:43:17 raspserv riello_usb[12234]: Communication with UPS is lost: status read failed!
Nov 11 12:43:19 raspserv nut-monitor[1024]: Poll UPS [rps1000@localhost] failed - Data stale
Nov 11 12:43:19 raspserv nut-monitor[1024]: UPS [rps1000@localhost] was last known to be on BYPASS and currently is not communicating, assuming dead
Nov 11 12:43:19 raspserv riello_usb[12234]: Communication with UPS is lost: status read failed!
Nov 11 12:43:21 raspserv riello_usb[12234]: Communication with UPS is lost: status read failed!
Nov 11 12:43:23 raspserv riello_usb[12234]: Communication with UPS is lost: status read failed!
Nov 11 12:43:24 raspserv nut-monitor[1024]: Poll UPS [rps1000@localhost] failed - Data stale
Nov 11 12:43:24 raspserv nut-monitor[1024]: UPS [rps1000@localhost] was last known to be on BYPASS and currently is not communicating, assuming dead
Nov 11 12:43:24 raspserv riello_usb[12234]: Communication with UPS is re-established!
Nov 11 12:43:24 raspserv upsd[1022]: UPS [rps1000] data is no longer stale
Nov 11 12:43:24 raspserv nut-server[1022]: UPS [rps1000] data is no longer stale
Nov 11 12:43:29 raspserv nut-monitor[1024]: Communications with UPS rps1000@localhost established
Nov 11 12:43:29 raspserv nut-monitor[1024]: Network UPS Tools upsmon 2.8.1.1

You can also periodically observe the following in the log:

Nov 13 02:43:56 raspserv nut-monitor[1024]: UPS rps1000@localhost on battery
Nov 13 02:43:56 raspserv nut-monitor[1024]: Network UPS Tools upsmon 2.8.1.1
Nov 13 02:43:56 raspserv nut-monitor[1024]: UPS rps1000@localhost: administratively OFF or asleep
Nov 13 02:43:56 raspserv nut-monitor[1024]: UPS [rps1000@localhost] is reported as (administratively) OFF
Nov 13 02:43:56 raspserv nut-monitor[1024]: Network UPS Tools upsmon 2.8.1.1
Nov 13 02:43:56 raspserv nut-monitor[1024]: Network UPS Tools upsmon 2.8.1.1
Nov 13 02:44:01 raspserv nut-monitor[1024]: UPS rps1000@localhost: no longer administratively OFF or asleep
Nov 13 02:44:01 raspserv nut-monitor[1024]: UPS rps1000@localhost on line power
Nov 13 02:44:01 raspserv nut-monitor[1024]: Network UPS Tools upsmon 2.8.1.1
Nov 13 02:44:01 raspserv nut-monitor[1024]: UPS rps1000@localhost: on bypass (powered, not protecting)
Nov 13 02:44:01 raspserv nut-monitor[1024]: Network UPS Tools upsmon 2.8.1.1
Nov 13 02:44:01 raspserv nut-monitor[1024]: Network UPS Tools upsmon 2.8.1.1

The only plus in the latest version is that NUT does not get hung up on the constant "Poll UPS [rps1000@localhost] failed - Data stale", but establishes a correct connection after several iterations.

Next, I provide the logs of the driver for both riello1000 and rps1000 with parameter -d 50:
root@raspserv:~# /lib/nut/riello_usb -a riello1000 -DDDDDD -d 50 2 > & 1 | tee -a riello1000.log
riello1000.log

root@raspserv:~# /lib/nut/riello_usb -a rps1000 -DDDDDD -d 50 2 > & 1 | tee -a rps1000.log
rps1000.log

After such a number of iterations, I did not notice erroneous data, perhaps it should be increased.

I am sure that the problem is the incorrect reading of data from the rps1000. So tell me how else I can show you the problem. I am ready to provide any data.

By the way: for the entire time of testing, I changed 2 different USB cables, USB ports (USB 2.0 and USB 3.0) for rps1000 - to avoid communication problems.

@shgubar
Copy link
Author

shgubar commented Nov 13, 2023

Then the logs (syslog) as soon as nut.target was launched:

Nov 13 12:20:04 raspserv nut-driver@rps1000[14244]: Network UPS Tools - Riello USB driver 0.11 (2.8.1.1)
Nov 13 12:20:04 raspserv nut-driver@rps1000[14244]: Warning: This is an experimental driver.
Nov 13 12:20:04 raspserv nut-driver@rps1000[14244]: Some features may not function correctly.
Nov 13 12:20:10 raspserv nut-driver@rps1000[14244]: riello_usb: riello.c:457: riello_parse_gn: Assertion `pom_long < UINT16_MAX' failed.
Nov 13 12:20:10 raspserv nut-driver@rps1000[14244]: Driver exited abnormally
Nov 13 12:20:10 raspserv nut-driver@rps1000[14244]: Network UPS Tools - UPS driver controller 2.8.1.1
Nov 13 12:20:10 raspserv systemd[1]: [email protected]: Control process exited, code=exited, status=1/FAILURE
Nov 13 12:20:10 raspserv systemd[1]: [email protected]: Failed with result 'exit-code'.
Nov 13 12:20:11 raspserv nut-server[14365]: fopen /run/nut/upsd.pid: No such file or directory
Nov 13 12:20:11 raspserv nut-server[14365]: Could not find PID file '/run/nut/upsd.pid' to see if previous upsd instance is already running!
Nov 13 12:20:11 raspserv nut-server[14365]: /etc/nut/upsd.conf is world readable
Nov 13 12:20:11 raspserv nut-server[14365]: listening on 127.0.0.1 port 3493
Nov 13 12:20:11 raspserv nut-server[14365]: listening on 10.5.0.7 port 3493
Nov 13 12:20:11 raspserv upsd[14365]: /etc/nut/upsd.conf is world readable
Nov 13 12:20:11 raspserv nut-server[14365]: Can't connect to UPS [rps1000] (riello_usb-rps1000): No such file or directory
Nov 13 12:20:11 raspserv nut-server[14365]: Connected to UPS [riello1000]: riello_usb-riello1000
Nov 13 12:20:11 raspserv nut-server[14365]: Found 2 UPS defined in ups.conf
Nov 13 12:20:11 raspserv nut-server[14365]: /etc/nut/upsd.users is world readable
Nov 13 12:20:11 raspserv nut-server[14365]: Running as foreground process, but saving a PID file anyway
Nov 13 12:20:11 raspserv nut-server[14365]: upsnotify: failed to notify about state 2: no notification tech defined, will not spam more about it
Nov 13 12:20:11 raspserv upsd[14365]: listening on 127.0.0.1 port 3493
Nov 13 12:20:11 raspserv upsd[14365]: listening on 10.5.0.7 port 3493
Nov 13 12:20:11 raspserv upsd[14365]: Can't connect to UPS [rps1000] (riello_usb-rps1000): No such file or directory
Nov 13 12:20:11 raspserv upsd[14365]: Connected to UPS [riello1000]: riello_usb-riello1000
Nov 13 12:20:11 raspserv riello_usb[14302]: sock_connect: enabling asynchronous mode (auto)
Nov 13 12:20:11 raspserv upsd[14365]: Found 2 UPS defined in ups.conf
Nov 13 12:20:11 raspserv upsd[14365]: /etc/nut/upsd.users is world readable
Nov 13 12:20:11 raspserv upsd[14365]: Running as foreground process, but saving a PID file anyway
Nov 13 12:20:11 raspserv upsd[14365]: upsnotify: failed to notify about state 2: no notification tech defined, will not spam more about it
Nov 13 12:20:11 raspserv nut-monitor[14367]: fopen /run/nut/upsmon.pid: No such file or directory
Nov 13 12:20:11 raspserv nut-monitor[14367]: Could not find PID file to see if previous upsmon instance is already running!
Nov 13 12:20:11 raspserv nut-monitor[14367]: UPS: riello1000@localhost (primary) (power value 1)
Nov 13 12:20:11 raspserv nut-monitor[14367]: UPS: rps1000@localhost (primary) (power value 1)
Nov 13 12:20:11 raspserv nut-monitor[14367]: Using power down flag file /etc/killpower
Nov 13 12:20:11 raspserv nut-monitor[14367]: upscli_init called but SSL wasn't compiled in
Nov 13 12:20:11 raspserv nut-monitor[14367]: upsnotify: failed to notify about state 2: no notification tech defined, will not spam more about it
Nov 13 12:20:11 raspserv nut-server[14365]: User [email protected] logged into UPS [riello1000]
Nov 13 12:20:11 raspserv upsd[14365]: User [email protected] logged into UPS [riello1000]
Nov 13 12:20:11 raspserv nut-monitor[14367]: UPS riello1000@localhost: on bypass (powered, not protecting)
Nov 13 12:20:11 raspserv nut-monitor[14367]: Network UPS Tools upsmon 2.8.1.1
Nov 13 12:20:11 raspserv nut-server[14365]: User [email protected] logged into UPS [rps1000]
Nov 13 12:20:11 raspserv upsd[14365]: User [email protected] logged into UPS [rps1000]
Nov 13 12:20:11 raspserv nut-monitor[14367]: Poll UPS [rps1000@localhost] failed - Driver not connected
Nov 13 12:20:11 raspserv nut-monitor[14367]: Communications with UPS rps1000@localhost lost
Nov 13 12:20:11 raspserv nut-monitor[14367]: Network UPS Tools upsmon 2.8.1.1
Nov 13 12:20:16 raspserv nut-monitor[14367]: Poll UPS [rps1000@localhost] failed - Driver not connected
Nov 13 12:20:16 raspserv nut-monitor[14367]: UPS rps1000@localhost is unavailable
Nov 13 12:20:16 raspserv nut-monitor[14367]: wall: cannot get tty name: Inappropriate ioctl for device
Nov 13 12:20:16 raspserv nut-monitor[14367]: Network UPS Tools upsmon 2.8.1.1
Nov 13 12:20:21 raspserv nut-monitor[14367]: Poll UPS [rps1000@localhost] failed - Driver not connected
Nov 13 12:20:26 raspserv systemd[1]: [email protected]: Service RestartSec=15s expired, scheduling restart.
Nov 13 12:20:26 raspserv systemd[1]: [email protected]: Scheduled restart job, restart counter is at 1.
Nov 13 12:20:26 raspserv nut-monitor[14367]: Poll UPS [rps1000@localhost] failed - Driver not connected
Nov 13 12:20:26 raspserv nut-driver@rps1000[14410]: Duplicate driver instance detected (PID file /run/nut/riello_usb-rps1000.pid exists)! Terminating other driver!
Nov 13 12:20:26 raspserv nut-driver@rps1000[14410]: kill: No such process
Nov 13 12:20:26 raspserv nut-driver@rps1000[14410]: Network UPS Tools - Riello USB driver 0.11 (2.8.1.1)
Nov 13 12:20:26 raspserv nut-driver@rps1000[14410]: Warning: This is an experimental driver.
Nov 13 12:20:26 raspserv nut-driver@rps1000[14410]: Some features may not function correctly.
Nov 13 12:20:29 raspserv nut-driver@rps1000[14410]: Network UPS Tools - UPS driver controller 2.8.1.1
Nov 13 12:20:29 raspserv riello_usb[14448]: Startup successful
Nov 13 12:20:29 raspserv riello_usb[14448]: upsnotify: failed to notify about state 2: no notification tech defined, will not spam more about it
Nov 13 12:20:29 raspserv nut-server[14365]: Connected to UPS [rps1000]: riello_usb-rps1000
Nov 13 12:20:29 raspserv upsd[14365]: Connected to UPS [rps1000]: riello_usb-rps1000
Nov 13 12:20:30 raspserv riello_usb[14448]: sock_connect: enabling asynchronous mode (auto)
Nov 13 12:20:31 raspserv nut-monitor[14367]: Communications with UPS rps1000@localhost established
Nov 13 12:20:31 raspserv nut-monitor[14367]: UPS rps1000@localhost: on bypass (powered, not protecting)
Nov 13 12:20:31 raspserv nut-monitor[14367]: Network UPS Tools upsmon 2.8.1.1
Nov 13 12:20:31 raspserv nut-monitor[14367]: Network UPS Tools upsmon 2.8.1.1

@shgubar
Copy link
Author

shgubar commented Nov 13, 2023

The next start of nut.target shows that some data is missing on the rps1000 - for example: battery.capacity, battery.voltage.nominal:

For riello1000:

battery.capacity: 100
battery.charge: 100
battery.runtime: 70560
battery.voltage: 40.8
battery.voltage.nominal: 36
device.mfr: RPS S.p.a.
device.model: UOD2           
device.serial:                 
device.type: ups
driver.debug: 6
driver.flag.allow_killpower: 0
driver.name: riello_usb
driver.parameter.bus: 001
driver.parameter.device: 003
driver.parameter.pollinterval: 2
driver.parameter.port: auto
driver.parameter.productid: 5500
driver.parameter.synchronous: auto
driver.parameter.vendorid: 04b4
driver.state: quiet
driver.version: 2.8.1.1
driver.version.internal: 0.11
driver.version.usb: libusb-0.1 (or compat)
input.bypass.frequency: 50.00
input.bypass.voltage: 220
input.frequency: 50.00
input.voltage: 220
output.frequency: 50.00
output.frequency.nominal: 50.0
output.L1.current: 0
output.L1.power: 0
output.L1.realpower: 0
output.L2.current: 0
output.L2.power: 0
output.L2.realpower: 0
output.L3.current: 0
output.L3.power: 0
output.L3.realpower: 0
output.power.percent: 6
output.voltage: 220
output.voltage.nominal: 220
ups.delay.reboot: 5
ups.delay.shutdown: 5
ups.firmware: SWM073-01-00
ups.load: 6
ups.mfr: RPS S.p.a.
ups.model: UOD2           
ups.power.nominal: 1000
ups.productid: 5500
ups.realpower.nominal: 900
ups.serial:                 
ups.status: OL BYPASS CHRG
ups.temperature: 37
ups.vendorid: 04b4

For rps1000:

battery.charge: 100
battery.runtime: 70560
battery.voltage: 40.8
device.mfr: RPS S.p.a.
device.model: UOD2           
device.serial: AA35VOD20016622 
device.type: ups
driver.debug: 6
driver.flag.allow_killpower: 0
driver.name: riello_usb
driver.parameter.bus: 001
driver.parameter.device: 004
driver.parameter.pollinterval: 15
driver.parameter.port: auto
driver.parameter.productid: 5500
driver.parameter.synchronous: auto
driver.parameter.vendorid: 04b4
driver.state: quiet
driver.version: 2.8.1.1
driver.version.internal: 0.11
driver.version.usb: libusb-0.1 (or compat)
input.bypass.frequency: 50.00
input.bypass.voltage: 222
input.frequency: 50.00
input.voltage: 222
output.frequency: 50.00
output.L1.current: 0
output.L1.power: 0
output.L1.realpower: 0
output.L2.current: 0
output.L2.power: 0
output.L2.realpower: 0
output.L3.current: 0
output.L3.power: 0
output.L3.realpower: 0
output.power.percent: 7
output.voltage: 221
ups.delay.reboot: 5
ups.delay.shutdown: 5
ups.firmware: SWM073-01-01
ups.load: 7
ups.mfr: RPS S.p.a.
ups.model: UOD2           
ups.productid: 5500
ups.serial: AA35VOD20016622 
ups.status: OL BYPASS CHRG
ups.temperature: 39
ups.vendorid: 04b4

And log file with parameter debug_min = 6 for both UPS`s (period 5 minutes):
wdebug6_riello1000_rps1000.log

@jimklimov
Copy link
Member

Thank you for these details. I've tried to contact the original driver authors, but it has been a decade, so not sure what would come of it...

Note for future readers: log files seem to have been "escaped" by the syslog server for non ASCII-readable characters, so in text #011 is a TAB (octal for 9), #012 is LF (octal for 10) and #015 is a CR (octal for 13).

@jimklimov jimklimov added impacts-release-2.8.0 Issues reported against NUT release 2.8.0 (maybe vanilla or with minor packaging tweaks) impacts-release-2.8.1 Issues reported against NUT release 2.8.1 (maybe vanilla or with minor packaging tweaks) labels Nov 13, 2023
@jimklimov
Copy link
Member

@shgubar: Just in case: does it misbehave similarly when only one UPS is physically plugged in, to rule out some cross-talk (if e.g. libusb also gets confused)?

@shgubar
Copy link
Author

shgubar commented Nov 13, 2023

If disabled riello1000 and start only rps1000, the behavior is much worse (I wrote about it). Connection failed and Data stale are more common.

UPD: I didn't test the only rps1000 physical connection - I just disabled riello1000 in the settings.

@jimklimov
Copy link
Member

jimklimov commented Nov 13, 2023

Yes, my suspicion is that maybe somehow reports from two similar USB devices are round-robined by the USB stack, outside of NUT? So the idea to double-check is with only one device physically plugged, ruling out possibility of such confusions completely.

Also, given that firmware revisions are different, maybe one is just buggy? :D

@shgubar
Copy link
Author

shgubar commented Nov 13, 2023

Yeah, maybe. I'll check it out.

@shgubar
Copy link
Author

shgubar commented Nov 13, 2023

Physically disabled riello1000 for now. The service started, almost 5 minutes have passed, this is what is in the logs:

Nov 13 14:32:31 raspserv systemd[1]: Starting Network UPS Tools - device driver for rps1000...
Nov 13 14:32:31 raspserv nut-driver@rps1000[10320]: Network UPS Tools - Riello USB driver 0.11 (2.8.1.1)
Nov 13 14:32:31 raspserv nut-driver@rps1000[10320]: Warning: This is an experimental driver.
Nov 13 14:32:31 raspserv nut-driver@rps1000[10320]: Some features may not function correctly.
Nov 13 14:32:33 raspserv nut-driver@rps1000[10320]: Network UPS Tools - UPS driver controller 2.8.1.1
Nov 13 14:32:33 raspserv systemd[1]: Started Network UPS Tools - device driver for rps1000.
Nov 13 14:32:33 raspserv riello_usb[10348]: Startup successful
Nov 13 14:32:33 raspserv riello_usb[10348]: upsnotify: failed to notify about state 2: no notification tech defined, will not spam more about it
Nov 13 14:32:33 raspserv systemd[1]: Reached target Network UPS Tools - target for power device drivers on this system.
Nov 13 14:32:33 raspserv systemd[1]: Starting Network UPS Tools - power devices information server...
Nov 13 14:32:33 raspserv systemd[1]: Started Network UPS Tools - power devices information server.
Nov 13 14:32:33 raspserv nut-server[10350]: fopen /run/nut/upsd.pid: No such file or directory
Nov 13 14:32:33 raspserv nut-server[10350]: Could not find PID file '/run/nut/upsd.pid' to see if previous upsd instance is already running!
Nov 13 14:32:33 raspserv nut-server[10350]: listening on 127.0.0.1 port 3493
Nov 13 14:32:33 raspserv nut-server[10350]: listening on 10.5.0.7 port 3493
Nov 13 14:32:33 raspserv nut-server[10350]: Connected to UPS [rps1000]: riello_usb-rps1000
Nov 13 14:32:33 raspserv nut-server[10350]: Found 1 UPS defined in ups.conf
Nov 13 14:32:33 raspserv systemd[1]: Starting Network UPS Tools - power device monitor and shutdown controller...
Nov 13 14:32:33 raspserv upsd[10350]: listening on 127.0.0.1 port 3493
Nov 13 14:32:33 raspserv nut-server[10350]: Running as foreground process, but saving a PID file anyway
Nov 13 14:32:33 raspserv nut-server[10350]: upsnotify: failed to notify about state 2: no notification tech defined, will not spam more about it
Nov 13 14:32:33 raspserv upsd[10350]: listening on 10.5.0.7 port 3493
Nov 13 14:32:33 raspserv upsd[10350]: Connected to UPS [rps1000]: riello_usb-rps1000
Nov 13 14:32:33 raspserv upsd[10350]: Found 1 UPS defined in ups.conf
Nov 13 14:32:33 raspserv upsd[10350]: Running as foreground process, but saving a PID file anyway
Nov 13 14:32:33 raspserv upsd[10350]: upsnotify: failed to notify about state 2: no notification tech defined, will not spam more about it
Nov 13 14:32:33 raspserv systemd[1]: Started Network UPS Tools - power device monitor and shutdown controller.
Nov 13 14:32:33 raspserv systemd[1]: Reached target Network UPS Tools - target for power device drivers, data server and monitoring client (if enabled) on this system.
Nov 13 14:32:33 raspserv nut-monitor[10352]: fopen /run/nut/upsmon.pid: No such file or directory
Nov 13 14:32:33 raspserv nut-monitor[10352]: Could not find PID file to see if previous upsmon instance is already running!
Nov 13 14:32:33 raspserv nut-monitor[10352]: UPS: rps1000@localhost (primary) (power value 1)
Nov 13 14:32:33 raspserv nut-monitor[10352]: Using power down flag file /etc/killpower
Nov 13 14:32:33 raspserv nut-monitor[10352]: upscli_init called but SSL wasn't compiled in
Nov 13 14:32:33 raspserv nut-monitor[10352]: upsnotify: failed to notify about state 2: no notification tech defined, will not spam more about it
Nov 13 14:32:33 raspserv nut-server[10350]: User [email protected] logged into UPS [rps1000]
Nov 13 14:32:33 raspserv upsd[10350]: User [email protected] logged into UPS [rps1000]
Nov 13 14:32:34 raspserv riello_usb[10348]: sock_connect: enabling asynchronous mode (auto)
Nov 13 14:32:38 raspserv nut-monitor[10352]: UPS rps1000@localhost: on bypass (powered, not protecting)
Nov 13 14:32:38 raspserv nut-monitor[10352]: Network UPS Tools upsmon 2.8.1.1
Nov 13 14:34:01 raspserv nut-server[10350]: User [email protected] logged into UPS [rps1000]
Nov 13 14:34:01 raspserv upsd[10350]: User [email protected] logged into UPS [rps1000]
Nov 13 14:36:15 raspserv nut-server[10350]: User [email protected] logged into UPS [rps1000]
Nov 13 14:36:15 raspserv upsd[10350]: User [email protected] logged into UPS [rps1000]
Nov 13 14:37:21 raspserv nut-server[10350]: Data for UPS [rps1000] is stale - check driver
Nov 13 14:37:21 raspserv upsd[10350]: Data for UPS [rps1000] is stale - check driver
Nov 13 14:37:21 raspserv riello_usb[10348]: Communication with UPS is lost: status read failed!
Nov 13 14:37:22 raspserv riello_usb[10348]: Communication with UPS is re-established!
Nov 13 14:37:22 raspserv upsd[10350]: UPS [rps1000] data is no longer stale
Nov 13 14:37:22 raspserv nut-server[10350]: UPS [rps1000] data is no longer stale

And upsc rps1000:

battery.capacity: 100
battery.charge: 100
battery.runtime: 70560
battery.voltage: 40.8
battery.voltage.nominal: 36
device.mfr: RPS S.p.a.
device.model: UOD2           
device.serial: AA35VOD20016622 
device.type: ups
driver.debug: 0
driver.flag.allow_killpower: 0
driver.name: riello_usb
driver.parameter.pollinterval: 2
driver.parameter.port: auto
driver.parameter.synchronous: auto
driver.state: updateinfo
driver.version: 2.8.1.1
driver.version.internal: 0.11
driver.version.usb: libusb-0.1 (or compat)
input.bypass.frequency: 50.00
input.bypass.voltage: 222
input.frequency: 50.00
input.voltage: 222
output.frequency: 50.00
output.frequency.nominal: 50.0
output.L1.current: 0
output.L1.power: 0
output.L1.realpower: 0
output.L2.current: 0
output.L2.power: 0
output.L2.realpower: 0
output.L3.current: 0
output.L3.power: 0
output.L3.realpower: 0
output.power.percent: 7
output.voltage: 221
output.voltage.nominal: 220
ups.delay.reboot: 5
ups.delay.shutdown: 5
ups.firmware: SWM073-01-01
ups.load: 7
ups.mfr: RPS S.p.a.
ups.model: UOD2           
ups.power.nominal: 1000
ups.productid: 5500
ups.realpower.nominal: 900
ups.serial: AA35VOD20016622 
ups.status: OL BYPASS CHRG
ups.temperature: 40
ups.vendorid: 04b4

For now, I will leave it in this state until tomorrow.

@shgubar
Copy link
Author

shgubar commented Nov 14, 2023

So, your assumption was probably correct.

Since yesterday, only one UPS was physically connected to the USB (this is exactly the one that constantly showed incorrect data):

root # lsusb
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 003: ID 04b4:5500 Cypress Semiconductor Corp. HID->COM RS232 Adapter
Bus 001 Device 002: ID 2109:3431 VIA Labs, Inc. Hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

ups.conf:

maxretry = 3
[rps1000]
        driver = riello_usb
        port = auto
        desc = "Riello SEP 1000 ER NP"

Here are the results I see per day:

  1. There were never any mistakes.
  2. There were no incorrect data either in the logs or on the graphs (as I previously wrote about current, power, output voltage, etc.)
  3. The only thing that I noticed at the moment is that when you run nut.target, it should be:
battery.capacity: 100
battery.charge: 100
battery.runtime: 70560
battery.voltage: 40.8
battery.voltage.nominal: 36
device.mfr: RPS S.p.a.
device.model: UOD2           
device.serial: AA35VOD20016622 
device.type: ups
driver.debug: 0
driver.flag.allow_killpower: 0
driver.name: riello_usb
driver.parameter.pollinterval: 2
driver.parameter.port: auto
driver.parameter.synchronous: auto
driver.state: quiet
driver.version: 2.8.1.1
driver.version.internal: 0.11
driver.version.usb: libusb-0.1 (or compat)
input.bypass.frequency: 50.00
input.bypass.voltage: 222
input.frequency: 50.00
input.voltage: 222
output.frequency: 50.00
output.frequency.nominal: 50.0
output.L1.current: 0
output.L1.power: 0
output.L1.realpower: 0
output.L2.current: 0
output.L2.power: 0
output.L2.realpower: 0
output.L3.current: 0
output.L3.power: 0
output.L3.realpower: 0
output.power.percent: 7
output.voltage: 221
output.voltage.nominal: 220
ups.delay.reboot: 5
ups.delay.shutdown: 5
ups.firmware: SWM073-01-01
ups.load: 7
ups.mfr: RPS S.p.a.
ups.model: UOD2           
ups.power.nominal: 1000
ups.productid: 5500
ups.realpower.nominal: 900
ups.serial: AA35VOD20016622 
ups.status: OL BYPASS CHRG
ups.temperature: 39
ups.vendorid: 04b4

But sometimes:

battery.charge: 100
battery.runtime: 70560
battery.voltage: 40.8
device.mfr: RPS S.p.a.
device.model: UOD2           
device.serial: AA35VOD20016622 
device.type: ups
driver.debug: 0
driver.flag.allow_killpower: 0
driver.name: riello_usb
driver.parameter.pollinterval: 2
driver.parameter.port: auto
driver.parameter.synchronous: auto
driver.state: updateinfo
driver.version: 2.8.1.1
driver.version.internal: 0.11
driver.version.usb: libusb-0.1 (or compat)
input.bypass.frequency: 50.00
input.bypass.voltage: 222
input.frequency: 50.00
input.voltage: 222
output.frequency: 50.00
output.L1.current: 0
output.L1.power: 0
output.L1.realpower: 0
output.L2.current: 0
output.L2.power: 0
output.L2.realpower: 0
output.L3.current: 0
output.L3.power: 0
output.L3.realpower: 0
output.power.percent: 7
output.voltage: 221
ups.delay.reboot: 5
ups.delay.shutdown: 5
ups.firmware: SWM073-01-01
ups.load: 7
ups.mfr: RPS S.p.a.
ups.model: UOD2           
ups.productid: 5500
ups.serial: AA35VOD20016622 
ups.status: OL BYPASS CHRG
ups.temperature: 40
ups.vendorid: 04b4

The following parameters are missing:

battery.capacity
battery.voltage.nominal
output.frequency.nominal
output.voltage.nominal
ups.power.nominal
ups.realpower.nominal

@mzampieri70
Copy link

Hello @shgubar the problem of the bad communication could be a weak or not correctly connected earth of the UPS.
It is very important that the UPS is grounded to avoid noises on the serial/USB communication.
Please let me know if the problem will persist.

Regards

@shgubar
Copy link
Author

shgubar commented Nov 14, 2023

Hello @mzampieri70
I have good grounding for the whole house.
Moreover, there is a voltage stabilizer at the input.

I have already used other USB cables.
Even the same cable that goes to the first UPS that works perfectly.

@mzampieri70
Copy link

@shgubar the bad communication with NUT is due to some wrong bytes that alter the Checksum indeed the complete packet is discarded.
The root cause could be something wrong on the earth or noise on the communication cable.
The two UPS has the same hardware release, firmware is different, but it is not impacting the communication stage.

@shgubar
Copy link
Author

shgubar commented Nov 14, 2023

Understand. I will try replacing the power cable.

@mzampieri70
Copy link

Ok let me know

@shgubar
Copy link
Author

shgubar commented Jan 9, 2024

Hi @jimklimov, @mzampieri70

What was done:

  1. The power cable (supplied with the rps1000 UPS) had parameters of 3x0.75 mm2. The cable was replaced by - 3x1.5 mm2
  2. Was placed in a rack together with other power cables. It was concluded separately from the others.
  3. During all this time, since the last conversation here, there have been blackouts. No false data was noticed during all this time.

P.S.: the riello1000 UPS remained connected by a power cable with parameters of 3x0.75 mm2.

I am very grateful for all your advice!

@shgubar shgubar closed this as completed Jan 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug impacts-release-2.8.0 Issues reported against NUT release 2.8.0 (maybe vanilla or with minor packaging tweaks) impacts-release-2.8.1 Issues reported against NUT release 2.8.1 (maybe vanilla or with minor packaging tweaks) Incorrect or missing readings On some devices driver-reported values are systemically off (e.g. x10, x0.1, const+Value, etc.) Riello Riello UPS devices (serial, usb), usually with a variant of Qx protocol USB
Projects
None yet
Development

No branches or pull requests

4 participants
@jimklimov @shgubar @mzampieri70 and others