boinc_client does not notice (or does not care) that "computer is in use" #1187

Open
romw opened this issue Feb 4, 2015 · 74 comments

@romw
Member

romw commented Feb 4, 2015

Reported by Bluefin Tuna on 3 Apr 42799116 09:47 UTC
On Linux (Fedora 17 KDE spin), boinc_client (7.0.28 x86_64-pc-linux-gnu) is running as user "boinc", a user not used for anything else.

Settings as per "Computing Preferences (apply to all BOINC projects in which you participate)" are for Processor Usage:

Suspend work while computer is on battery power?    yes
Matters only for portable computers                 yes
Suspend work while computer is in use?              yes
Suspend GPU work while computer is in use?          yes
'In use' means mouse/keyboard activity in last      5 minutes
Suspend work if no mouse/keyboard activity in last  --- minutes
Suspend work if CPU usage is above                  50%
Do work only between the hours of                   ---
Leave tasks in memory while suspended?              no
Switch between tasks every                          120 minutes
On multiprocessors, use at most                     --- processors
On multiprocessors, use at most                     100% of the processors
Use at most                                         75% of CPU time 

The machine has 2 non-threaded CPUs. As per lscpu:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
CPU(s):                2
On-line CPU(s) list:   0,1
Thread(s) per core:    1
Core(s) per socket:    2
Socket(s):             1

It also has a CUDA-capable graphics card, which is being used for BOINC projects.

Scenario:

  • boinc runs while user is absent
  • once the user activates the machine, boinc notices this and tunes down
    (the log shows nothing though, I think it should ... is there a "verbose"
    option?)
    ps then shows only the "boinc_client" running, not the tasks themselves
  • however, after a quarter of an hour or so, the tasks start up again and computer interactivity slows noticeably. More surprisingly, there are actually 3 tasks running (2 x einstein@home + 1 x seti) even though the machine has just 2 CPUs (maybe it's done differently on CUDA machines); here too, logging the decision logic would be informative
  • boinc knows the machine is in use though, the log says "Suspending network activity - computer is in use"
  • to get the machine in hand, one has to "kill boinc"

Migrated-From: http://boinc.berkeley.edu/trac/ticket/1215

@romw
Member Author

romw commented Feb 5, 2015

Commented by ChristianB on 12 Nov 42799609 05:40 UTC
Are you using a Fedora Package Manager installation or did you install BOINC after downloading from the official site? Packaged versions are not really supported by the official project because we don't know what the package maintainer changed and if this has some side-effects we don't know about.

To your question of a more verbose output you should look at the user wiki (http://boinc.berkeley.edu/wiki/Client_configuration) where you can find log_flags that are helpful in this case. The flags suspend_debug and task_debug may be helpful in your case.

The next question is about computation being re-enabled after 45 minutes. If the user is still using the computer (mouse/keyboard activity), then this clearly is a bug. If the user is just watching a movie and not doing anything, the system thinks that the computer is unused. The log messages should show which case applies. What shouldn't happen is that the computer is slowed down by the science apps. The regular BOINC client sets the process priority to lowest when starting any science application, so this shouldn't interfere with normal operations. Please refer to what I wrote about packaged versions of BOINC.
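
As a rough illustration of the priority point: on Linux, "lowest priority" usually corresponds to a nice value of 19. The following is a minimal sketch under that assumption, not BOINC's actual code. The helper names `lower_own_priority` and `current_nice` are hypothetical, and the sketch lowers the priority of the calling process rather than of a newly started science application.

```cpp
#include <sys/resource.h>  // setpriority, getpriority, PRIO_PROCESS
#include <cerrno>          // errno

// Hypothetical helper (not from the BOINC source): give the calling
// process the lowest conventional CPU scheduling priority.
// Returns true on success.
bool lower_own_priority() {
    const int kLowestNice = 19;  // assumption: usual Linux maximum nice value
    return setpriority(PRIO_PROCESS, 0, kLowestNice) == 0;
}

// Read back the current nice value of the calling process.
// getpriority() can legitimately return -1, so errno must be cleared
// before the call and checked afterwards to tell errors apart.
bool current_nice(int& out) {
    errno = 0;
    const int value = getpriority(PRIO_PROCESS, 0);
    if (value == -1 && errno != 0) return false;
    out = value;
    return true;
}
```

A nice value only influences CPU scheduling, so a setting like this would not by itself prevent contention for the GPU.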

As GPU computing is allowed and a capable GPU is found you will always see three science processes started. One for each CPU and one for the GPU.

@romw
Member Author

romw commented Feb 5, 2015

Commented by Ageless on 13 Aug 42799721 01:44 UTC
Also know that there was this fix in 7.0.29:

  • Fix for linux idle detection bug with USB mice.

For logs on what may or may not be fixed already, see the BOINC 7 Change Log thread on the BOINC forums.

@romw
Member Author

romw commented Feb 5, 2015

Commented by Bluefin Tuna on 29 Oct 42800862 09:18 UTC
Hi,

The boinc executable has been compiled from the 7.0.28 source directly.

I will add some log flags as described in the link, thank you.

In this case, I am using the computer for editing work while boinc is running. This makes the slowdown very noticeable (actually, things become unacceptably slow). I don't think I saw that behaviour before I enabled the GPU, so there may be some DMA or contention problem with the graphics driver/hardware (Btw, I am using the 'nvidia' driver instead of the Fedora 'nouveau' - unsure whether I should). Will test what happens with GPU disabled.

Will also try 7.0.31, this is going to take a bit...

@romw
Member Author

romw commented Feb 5, 2015

Commented by Bluefin Tuna on 15 Aug 42802452 16:43 UTC
Update:

  1. Looks like (USB) mouse movement is indeed not recognized. It's typing that triggers the "computer in use" state.

  2. The high-inertia effect is either due to the GPU or to einstein@home. Running "--no_gpus" (and thus only seti@home) triggers instant shutdown when "computer in use" is detected. If einstein@home is using the GPU, then shutdown is "rather slow" (will have to time) and for some reason response times of applications suffer a lot.

@romw
Member Author

romw commented Feb 5, 2015

Commented by Bluefin Tuna on 21 Sep 42808484 17:52 UTC
Update:

'''Without GPU'''

  • Responsiveness of PC does not suffer, irrespective of whether boinc is running or not.
  • Computer is recognized "to be in use":
    • The first time ''only'' by typing into any terminal window (not necessarily the terminal window
      belonging to the user who is running boinc). At that point boinc shutdown is rather instantaneous.
      Typing in a text editor, typing in firefox or mouse movement/clicking around is ''not''
      recognized.
    • The second time (i.e. once boinc reactivated again because of computer idling), ''only'' by
      typing into a terminal window that was already running when boinc reactivated. A window opened
      after that is apparently not considered.

'''With GPU'''

  • As above, but responsiveness of PC does suffer; einstein@home must be fighting with X for GPU
    usage.

@Germano0
Contributor

Germano0 commented Jun 13, 2016

Hi, I am the co-maintainer of Fedora's BOINC client package.
User idle time detection based on /dev/input/* does not work, since many Linux distributions such as Debian, Fedora and Ubuntu do not update the "access time" of /dev/input/* (see stat /dev/input/foo).
Moreover:

  • you will have problems in detecting user idle time if the user is connected from SSH to a machine;
  • it triggers many SELinux denial messages.

There is a smarter way to do this. client/hostinfo_unix.cpp already has a section for Apple Mac OS; you could add another section for Linux distributions that use systemd and implement user idle time detection using logind through the GDBus API of GLib/GIO:

# gdbus introspect --system --dest org.freedesktop.login1 --object-path /org/freedesktop/login1 | grep -i idle
      readonly b IdleHint = false;
      readonly t IdleSinceHint = 0;
      readonly t IdleSinceHintMonotonic = 0;
      readonly s IdleAction = 'ignore';
      readonly t IdleActionUSec = 1800000000;

What do you think?

This message has also been sent to the boinc-devel mailing list.

Thanks to @fweimer and @lmctv for their suggestions

@Germano0
Contributor

For your information, I removed the non-working idle detection implementation from Fedora's BOINC package. This avoids SELinux alerts and high CPU usage by the setroubleshoot process.

@ChristianBeer
Member

I did a bit of experimenting with the sd-bus API to determine the idle state. I have a small test program, but it always reports the system as not idle. I also found that the IdleHint is only changed when the Desktop Manager (like Gnome and KDE) reports the system as idle. So this would not work for systems that don't have a Desktop Manager. I still think it's worth investigating further so that it at least works for systems with a DM. I'll keep trying.

@Germano0
Contributor

Germano0 commented Feb 2, 2017

I have seen the changelog
https://boinc.berkeley.edu/dev/forum_thread.php?id=10204&postid=67990
and I was wondering about the state of this bug report. Unfortunately, I have not yet had the opportunity to run any tests due to
https://boinc.berkeley.edu/dev/forum_thread.php?id=11446

@Germano0
Contributor

Germano0 commented Feb 12, 2017

I made a diff between hostinfo_unix.cpp of BOINC 7.6.22 and 7.6.33:

$ diff boinc_7.6.22_hostinfo_unix.cpp boinc_7.6.33_hostinfo_unix.cpp 
43a44,53
> #if HAVE_XSS
> #include <X11/extensions/scrnsaver.h> //X-based idle detection
> // prevents naming collision between X.h define of Always and boinc's
> // lib/prefs.h definition in an enum.
> #undef Always
> #include <dirent.h> //for opening /tmp/.X11-unix/
>   // (There is a DirScanner class in BOINC, but it doesn't do what we want)
> #include "log_flags.h" // idle_detection_debug flag for verbose output
> #endif
> 
1314,1315c1324,1325
<     strncpy( p_model, "Alpha ", sizeof( p_model));
<     strncat( p_model, cpu_type_name, (sizeof( p_model)- strlen( p_model)- 1));
---
>     strlcpy(p_model, "Alpha ", sizeof(p_model));
>     strlcat(p_model, cpu_type_name, sizeof(p_model));
1949a1960,2146
> 
> #if HAVE_XSS
> 
> // Initializer for const vector<string> in xss_idle
> //
> const vector<string> X_display_values_initialize() {
> 
>   /* According to "man Xserver", each local Xserver will have a socket file
>    * at /tmp/.X11-unix/Xn, where "n" is the display number (0, 1, 2, etc).
>    * We will parse this directory for currently open Xservers and attempt
>    * to ultimately query them for their idle time. If we can't open this
>    * directory, or the display_values vector is otherwise empty, then a
>    * static list of guesses for open display servers is utilized instead
>    * (DISPLAY values ":{0..6}") that will attempt connections to the first
>    * seven open Xservers.
>    *
>    * If we were unable to open _any_ Xserver, then we will log this and
>    * xss_idle returns true, effectively leaving idle detection up to other
>    * methods.
>    */
>   static const string dir = "/tmp/.X11-unix/";
>   vector<string> display_values;
>   vector<string>::iterator it;
> 
>   DIR *dp;
>   struct dirent *dirp;
>   if((dp = opendir(dir.c_str())) == NULL) {
>     if ( log_flags.idle_detection_debug ) {
>       msg_printf(NULL, MSG_INFO, 
>         "[idle_detection] Error (%d) opening %s.", errno, dir.c_str());
>     }
>   }
> 
>   while ((dirp = readdir(dp)) != NULL) {
>     display_values.push_back(string(dirp->d_name));
>   }
>   closedir(dp);
> 
>   // Get rid of non-matching elements and format the matching ones.
>   for ( it = display_values.begin() ; it != display_values.end() ; ) {
>     if ( it->c_str()[0] != 'X' ) {
>       it = display_values.erase(it);
>     }
>     else {
>       replace(it->begin(), it->end(), 'X', ':');
>       it++;
>     }
> 
>   }
> 
>   // if the display_values vector is empty, assume something went wrong
>   // (couldn't open directory, no apparent Xn files). Test a static list of
>   // DISPLAY values instead that is likely to catch most common use cases.
>   // (I don't know of many environments where there will simultaneously be
>   // more than seven active, local Xservers. I'm sure they exist... somewhere.
>   // But seven was the magic number for me).
>   if ( display_values.size() == 0 ) {
>     if ( log_flags.idle_detection_debug ) {
>       msg_printf(NULL, MSG_INFO,
>         "[idle_detection] No DISPLAY values found in /tmp/.X11-unix/.");
>       msg_printf(NULL, MSG_INFO,
>         "[idle_detection] Using static DISPLAY list, :{0..6}.");
>     }
>     display_values.push_back(":0");
>     display_values.push_back(":1");
>     display_values.push_back(":2");
>     display_values.push_back(":3");
>     display_values.push_back(":4");
>     display_values.push_back(":5");
>     display_values.push_back(":6");
>     return display_values;
>   }
>   else {
>     return display_values;
>   }
> }
> 
> /* Ask the X server for user idle time (using XScreenSaver API)
>  * Return true if the idle time exceeds idle_threshold for all accessible
>  * Xservers. However, if even one Xserver reports busy/non-idle, then
>  * return false. This function assumes that the boinc user has been
>  * granted access to the Xservers a la "xhost +SI:localuser:boinc". If
>  * access isn't available for an Xserver, then that Xserver is skipped.
>  * One may drop a file in /etc/X11/Xsession.d/ that runs the xhost command
>  * for all Xservers on a machine when the Xservers start up.
>  */
> bool xss_idle(long idle_threshold) {
> 
>   const vector<string> display_values = X_display_values_initialize();
>   vector<string>::const_iterator it;
>   // If we can connect to at least one DISPLAY, this is set to false.
>   bool no_available_x_display = true;
> 
>   static XScreenSaverInfo* xssInfo = XScreenSaverAllocInfo();
>   // This shouldn't fail. XScreenSaverAllocInfo just returns a small
>   // struct (see "man 3 xss"). If we can't allocate this, then we've
>   // got bigger problems to worry about.
>   if ( xssInfo == NULL ) {
>     if ( log_flags.idle_detection_debug ) {
>       msg_printf(NULL, MSG_INFO,
>         "[idle_detection] XScreenSaverAllocInfo failed. Out of memory? Skipping XScreenSaver idle detection.");
>     }
>     return true;
>   }
> 
>   for (it = display_values.begin() ; it != display_values.end() ; it++) {
> 
>     Display* disp = NULL;
>     long idle_time = 0;
>     
>     disp = XOpenDisplay(it->c_str());
>     // XOpenDisplay may return NULL if there is no running X
>     // or DISPLAY points to wrong/invalid display
>     if (disp == NULL) {
>       if ( log_flags.idle_detection_debug ) {
>       msg_printf(NULL, MSG_INFO, 
>       "[idle_detection] DISPLAY '%s' not found or insufficient access.",
>       it->c_str());
>       }
>       continue;
>     }
> 
>     // Determine if the DISPLAY we have accessed has the XScreenSaver
>     // extension or not.
>     int event_base_return, error_base_return;
>     if (!XScreenSaverQueryExtension(
>       disp, &event_base_return, &error_base_return
>     )){
>       if ( log_flags.idle_detection_debug ) {
>       msg_printf(NULL, MSG_INFO,
>         "[idle_detection] XScreenSaver extension not available for DISPLAY '%s'.",
>         it->c_str());
>       }
>       XCloseDisplay(disp);
>       continue;
>     }
> 
>     // All checks passed. Get the idle information.
>     no_available_x_display = false;
>     XScreenSaverQueryInfo(disp, DefaultRootWindow(disp), xssInfo);
>     idle_time = xssInfo->idle;
> 
>     // Close the connection to the XServer
>     XCloseDisplay(disp);
> 
>     // convert from milliseconds to seconds
>     idle_time = idle_time / 1000;
> 
>     if ( log_flags.idle_detection_debug ) {
>       msg_printf(NULL, MSG_INFO, 
>         "[idle_detection] XSS idle detection succeeded on DISPLAY '%s'.", it->c_str());
>       msg_printf(NULL, MSG_INFO, 
>         "[idle_detection] idle threshold: %ld", idle_threshold);
>       msg_printf(NULL, MSG_INFO,
>         "[idle_detection] idle_time: %ld", idle_time);
>     }
> 
>     if ( idle_threshold < idle_time ) {
>       if ( log_flags.idle_detection_debug ) {
>         msg_printf(NULL, MSG_INFO,
>           "[idle_detection] DISPLAY '%s' is idle.", it->c_str());
>       }
>     } else {
>       if ( log_flags.idle_detection_debug ) {
>         msg_printf(NULL, MSG_INFO,
>           "[idle_detection] DISPLAY '%s' is active.", it->c_str());
>       }
>       return false;
>     }
> 
>   }
> 
>   /* We should only ever get here if all Xservers (that were queryable) were
>    * idle. If none of the Xservers were queryable, we should still end up here,
>    * and simply report true. In that case, the xss_idle function effectively
>    * provides no information on the idle state of the system, as no Xservers
>    * were accessible to interrogate.
>    */
>   if ( log_flags.idle_detection_debug && no_available_x_display ) {
>     msg_printf(NULL, MSG_INFO,
>       "[idle_detection] Could not connect to any DISPLAYs. XSS idle determination impossible.");
>   }
>   return true;
> 
> }
> #endif // HAVE_XSS
> 
1974a2172,2177
> #if HAVE_XSS
>     if (!xss_idle((long)(idle_time_to_run * 60))) {
>         return false;
>     }
> #endif // HAVE_XSS
> 
1993c2196
< #endif
---
> #endif // LINUX_LIKE_SYSTEM

So it looks like BOINC now relies on the X server to detect user idle time. I would like to ask whether you are interested in dropping the old code that runs stat on /dev/input, which is now used as a fallback when X detection fails, or whether you would rather keep it.
I have only just started testing BOINC 7.6.33 on Fedora 25.

@JuhaSointusalo
Contributor

So it looks like BOINC now relies on the X server to detect user idle time.

Please note that the BOINC client needs permission to access the X server, or the XSS-based idle detection won't work. See the discussion in #1453. Debian uses this file in /etc/X11/Xsession.d/ .

interested in dropping the old code that runs stat on /dev/input

Maybe check with @romw what distro he used when he wrote the old code and if it ever worked.
And check a variety of distros of different ages and make sure it doesn't work anywhere.
And maybe check kernel build options that there isn't any switch that makes the old code work.
And then throw the old code away.

@LocutusOfBorg
Contributor

Mmm what about headless servers? AFAIR this code was meant to work with them

@ChristianBeer
Member

We should definitely keep the fallback for systems that don't have X. What @Germano0 wants is a way to disable the access to /dev for certain Linux distributions that use SELinux and where this is triggering an alarm. Instead of removing the code we could make it possible to disable it at compile time and show a message at startup that idle detection does not work without an X server with this package.

@JuhaSointusalo
Contributor

@LocutusOfBorg @ChristianBeer

It's the all_input_idle() function we are talking about. It tries to detect activity by checking the access time of event and mouse devices in /dev/input . @Germano0 said it doesn't work on Debian, Ubuntu and Fedora, and I just tested that it doesn't work on Mint 17 (= Ubuntu 14.04, kernel 3.13.0-105-generic) or Raspbian Wheezy (kernel 3.18.16+). The problem is that the access time is updated only when the device node is created, that is, when you plug in the device.
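
To make the access-time idea concrete, here is a minimal sketch of such a check. It is not the actual all_input_idle() code: the function names `seconds_since_access` and `devices_idle` and the choice to skip unreadable nodes are assumptions made for illustration.

```cpp
#include <sys/stat.h>   // stat, struct stat
#include <ctime>        // std::time_t
#include <string>
#include <vector>

// Store the number of seconds since `path` was last accessed in `age`.
// Returns false if the path cannot be stat'ed.
bool seconds_since_access(const std::string& path, std::time_t now, long& age) {
    struct stat info;
    if (stat(path.c_str(), &info) != 0) return false;
    age = static_cast<long>(now - info.st_atime);
    return true;
}

// True if none of the given device nodes was accessed within the last
// `threshold_secs` seconds. Nodes that cannot be stat'ed are skipped,
// so an empty or fully unreadable list counts as idle.
bool devices_idle(const std::vector<std::string>& paths,
                  long threshold_secs, std::time_t now) {
    for (const auto& path : paths) {
        long age = 0;
        if (!seconds_since_access(path, now, age)) continue;  // skip unreadable nodes
        if (age < threshold_secs) return false;               // recent access: in use
    }
    return true;
}
```

On systems where the access time only changes when the device node is created, a check like this would keep reporting the devices as idle regardless of actual mouse or keyboard use.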

If it turns out that there is not a single distro where the code works, then why bother keeping it?

There is a similar test for pseudoterminal devices though you need to be using terminal for the test to work.

@Germano0
Contributor

Instead of removing the code we could make it possible to disable it at compile time and show a message at startup that idle detection does not work without an X server with this package.

It would be great to have two different flags to disable:

  • /dev/input based idle detection;
  • XScreenSaver API based idle detection.

@Germano0
Contributor

On Fedora we will patch BOINC 7.6.33 to remove Xorg-based idle detection too, because Fedora uses Wayland by default and Fedora servers do not run a graphical server, as @LocutusOfBorg said.

@ChristianBeer ChristianBeer modified the milestones: Client/Manager 8.0, Undetermined Apr 10, 2017
@BryanQuigley
Contributor

In addition to the script above, I just learned that system76's special scheduler for responsiveness does take some actions. It might be ideal to add more functionality to it, to manage BOINC better.

@LinAGKar

LinAGKar commented May 2, 2023

Maybe it would be better if the idle detection was handled by boinc-manager rather than boinc-client, since that runs in the user's session. Ideally then, boinc-manager should get tray support, so the user can have it running in the background.

@AenBleidd
Member

Maybe it would be better if the idle detection was handled by boinc-manager rather than boinc-client, since that runs in the user's session.

What if the user does not run BOINC Manager on that particular machine?

@makeasnek
Member

It seems the solution would be to have the BOINC client accept multiple idle detection types, some of which the client can access directly and some of which exist in userspace (i.e. BOINC Manager) and can be communicated to the client via RPC. The client should then pick which of these signals to trust based on some internal logic.
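
One possible shape for that "internal logic" is sketched below. Everything in it is hypothetical: the `IdleState` and `IdleSignal` types, the source names in the comments, and the policy itself (any source that reports activity wins, and sources with no answer are ignored, which mirrors how xss_idle() treats unreachable X servers).

```cpp
#include <string>
#include <vector>

// What a single idle-detection source can report.
enum class IdleState { Unknown, Idle, Active };

// One source of idle information (hypothetical type).
struct IdleSignal {
    std::string name;  // illustrative label, e.g. "xss", "tty", "manager_rpc"
    IdleState state;   // Unknown = the source could not be queried
};

// Policy sketch: the machine counts as idle unless at least one source
// positively reports activity. Sources reporting Unknown are ignored,
// so with no usable sources at all the result is "idle".
bool combined_idle(const std::vector<IdleSignal>& sources) {
    for (const auto& source : sources) {
        if (source.state == IdleState::Active) return false;
    }
    return true;
}
```

A stricter policy could instead refuse to compute when every source is Unknown; which default is right is a design decision this sketch does not settle.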

@CharlieFenton
Contributor

What if the user does not run BOINC Manager on that particular machine?

While I'm not as familiar with Linux as others, I want to add that it might be possible for the client to launch a small helper application in user space independent of BOINC Manager.

@makeasnek
Member

@CharlieFenton I was under the impression, perhaps incorrectly, that BOINC already detects correct idle state in these machines by monitoring keyboard/mouse input on some /dev device.

@CharlieFenton
Contributor

@makeasnek You very well may be correct; I don't know. As I mentioned, I am not as familiar with Linux and don't have a Linux box. I was just responding to this comment by @LinAGKar and this response from @AenBleidd.

@davidpanderson
Contributor

It would be easy to add an RPC that tells the client there was user input.
The Manager (or some other program running as the desktop user) could call this.
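
A minimal sketch of the client-side bookkeeping such an RPC might feed is shown below. No such RPC is defined in this thread, so the class name `UserInputTracker`, its methods, and the rule that "no report yet" counts as idle are all assumptions.

```cpp
#include <ctime>  // std::time_t, std::difftime

// Hypothetical client-side state: remembers when a program running as
// the desktop user last reported keyboard or mouse input.
class UserInputTracker {
public:
    // Would be called by the handler of the (hypothetical) RPC.
    void note_user_input(std::time_t now) {
        last_input_ = now;
        have_report_ = true;
    }

    // True if no input was reported within the last `idle_minutes`.
    // If nothing has ever been reported, this source has no evidence
    // of use and answers "idle".
    bool is_idle(std::time_t now, double idle_minutes) const {
        if (!have_report_) return true;
        return std::difftime(now, last_input_) >= idle_minutes * 60.0;
    }

private:
    std::time_t last_input_ = 0;
    bool have_report_ = false;
};
```

With the 5-minute "in use" preference from the original report, a report at time t would keep the tracker non-idle until t + 300 seconds.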

@RichardHaselgrove
Contributor

And never forget the case where the user is 'using' the device, with full concentration - but not actively making any inputs. Watching films full-screen, for instance.

@0x0OZ

0x0OZ commented Sep 8, 2023

Where is the broken idle check in the repo located?

@AenBleidd
Member

AenBleidd commented Sep 8, 2023

Where is the broken idle check in the repo located?

It's not broken, rather it needs to be redefined in a proper way.
The code is here: https://github.com/BOINC/boinc/blob/master/client/hostinfo_unix.cpp
Please keep in mind that for different OSs we have different idle detection mechanisms.

@0x0OZ

0x0OZ commented Sep 8, 2023

Where is the broken idle check in the repo located?

It's not broken, rather it needs to be redefined in a proper way. The code is here: https://github.com/BOINC/boinc/blob/master/client/hostinfo_unix.cpp Please keep in mind that for different OSs we have different idle detection mechanisms.

I see it's defined for Xorg, but it's not working and not defined for wayland.

@AenBleidd
Member

Where is the broken idle check in the repo located?

It's not broken, rather it needs to be redefined in a proper way. The code is here: https://github.com/BOINC/boinc/blob/master/client/hostinfo_unix.cpp Please keep in mind that for different OSs we have different idle detection mechanisms.

I see it's defined for Xorg, but it's not working and not defined for wayland.

Because it was created when there was no Wayland.
That is why I said that it's not broken.
It just needs to be properly implemented for Wayland (OK, there are also some issues with Xorg, mostly for headless devices, but still).

@davidpanderson
Contributor

My recollection is that X-based idle detection doesn't work
because the BOINC client runs as a different user,
and that we need to use an approach like 'BOINC system tray' on Windows.
I never got around to implementing this, but could do so.

@0x0OZ

0x0OZ commented Sep 9, 2023

My recollection is that X-based idle detection doesn't work because the BOINC client runs as a different user, and that we need to use an approach like 'BOINC system tray' on Windows. I never got around to implementing this, but could do so.

That seems right, I couldn't run xprintidle as a different user...
Having a system tray won't be a global solution, as there are headless systems that use BOINC, as mentioned by @AenBleidd .

I think a global solution for all cases would be to detect whether there has been any mouse/keyboard input or any sound output in the last X seconds, though I don't know how to do that.
Also, Xorg's built-in idle detection does not count SSH connections as activity, which means that if you connect over SSH to a distro running Xorg, the system will be reported as idle.

@RichardHaselgrove
Contributor

There's more discussion of this in #4125 - a specific problem with enumerating X-servers under systemd.

@0x0OZ

0x0OZ commented Sep 9, 2023

I made a small patch that should fix the issue globally for all linux distributions for different cases (headless / xorg /wayland).
But because cpp is not really my language I will not be able to make the code clean or add it in the way it should be implemented in the repo...

I only added the method to be reviewed, I don't want to break the code by removing methods like xss_idle()

@Drag-NDrop

Drag-NDrop commented Nov 4, 2023

For inspiration; I've created a prototype to help with headless devices.
The main thought is that, no matter how one connects to a headless instance, one would be using a terminal. So IdleRunner was made for Bash. It wouldn't be too hard to add some conditional logic that takes other shells into account.

@0x0OZ I'd love your input to this.

https://github.com/Drag-NDrop/IdleRunner
