Changing ps() output format #71

YoannPa · 2019-11-18T16:28:06Z

Hi,
I was wondering how one could change the output format when using the ps() function to generate the table of processes? When using the ps command in a classic terminal, you have an option to format the output, displaying the metrics you are interested in.
In my own specific case, I would be interested in having the %CPU and ELAPSED TIME metrics for all processes in the ps() table.

I already made a function able to retrieve these metrics ( https://github.com/mathosi/cluster_check ), but since a dedicated package already exists, I was wondering if it would be possible to have this feature here.
Thank you in advance for your help.
Best,

gaborcsardi · 2019-11-18T16:41:30Z

> I was wondering how one could change the output format when using the ps() function to generate the table of processes?

But it already does that, no? It returns a data frame / tibble with all the processes.

> when using ps command in a classic terminal you have an option to format the output

You can write a function that formats the output differently.

> displaying the metrics you are interested in

In R it is pretty easy to select some columns of a data frame, which is why it is up to the end user to do it.
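As a rough illustration of that column selection, here is a minimal sketch. It uses a small stand-in data frame so that it runs without the package installed; the column names other than `created` are assumptions about what ps::ps() returns, and with the real package the first assignment would simply be `procs <- ps::ps()`.

```r
# Stand-in for the table returned by ps::ps(). The column names pid, name and
# username are assumptions; only `created` is confirmed in this thread.
procs <- data.frame(
  pid = c(1L, 42L, 314L),
  name = c("init", "rsession", "bash"),
  username = c("root", "yoann", "yoann"),
  created = as.POSIXct(
    c("2019-11-18 08:00:00", "2019-11-18 15:30:00", "2019-11-18 16:00:00"),
    tz = "UTC"
  ),
  stringsAsFactors = FALSE
)

# Keep only the columns of interest, in the order wanted.
wanted <- c("name", "pid", "created")
selected <- procs[, wanted, drop = FALSE]

# Rows can be filtered the same way, e.g. only processes named "rsession".
rsessions <- selected[selected$name == "rsession", , drop = FALSE]
print(rsessions)
```

The same indexing works on a tibble, so nothing beyond base R is needed for this kind of customization.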

That said, I am open to adding a different print method that is more similar to regular ps x output. For that, you'll need to open an issue with a sketch of how the output should look.

> In my own specific case what I would be interested in would be to be able to have in ps() table the %CPU and also the ELAPSED TIME metrics for all processes.

ps::ps() returns created, which is when the process was created, so we can easily add elapsed time as simply Sys.time() - created.
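A minimal sketch of that elapsed-time column, assuming `created` is a POSIXct column as returned by ps::ps(). The helper name `add_elapsed()` is hypothetical, and the stand-in data frame only exists so the example runs without the package.

```r
# Hypothetical helper: adds an `elapsed` column computed from `created`.
add_elapsed <- function(procs, now = Sys.time()) {
  # difftime() fixes the unit to seconds; a bare `now - created` would let R
  # choose the unit automatically (secs, mins, hours, ...).
  procs$elapsed <- difftime(now, procs$created, units = "secs")
  procs
}

# Stand-in for ps::ps() output; with the real package use `procs <- ps::ps()`.
procs <- data.frame(
  pid = c(1L, 42L),
  created = as.POSIXct(
    c("2019-11-18 16:00:00", "2019-11-18 15:30:00"),
    tz = "UTC"
  )
)

# A fixed `now` keeps the example reproducible; omit it to use the current time.
now <- as.POSIXct("2019-11-18 16:30:00", tz = "UTC")
with_elapsed <- add_elapsed(procs, now = now)
print(with_elapsed)
```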

ps does not calculate %CPU currently, because it is somewhat cumbersome. If you want to calculate this, open an(other) issue for it.
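For context, one possible definition can be sketched from data the table already holds. This is not something the ps package provides and not necessarily how it would implement the feature. It computes a lifetime average (CPU time used divided by the time the process has been running), which is how the Linux procps manual describes its %cpu column. An instantaneous figure would instead need two samples of CPU time taken some interval apart, which is part of what makes this cumbersome. The `user` and `system` columns (CPU seconds) are assumed names, and `cpu_percent()` is a hypothetical helper.

```r
# Hypothetical helper: lifetime-average CPU usage in percent.
# cpu_secs: total CPU time used (user + system), in seconds.
# created:  POSIXct start time of the process.
cpu_percent <- function(cpu_secs, created, now = Sys.time()) {
  elapsed <- as.numeric(difftime(now, created, units = "secs"))
  # Return NA when no time has elapsed (process just started, clock skew).
  ifelse(elapsed > 0, 100 * cpu_secs / elapsed, NA_real_)
}

# Stand-in for ps::ps() output; `user` and `system` are assumed column names.
now <- as.POSIXct("2019-11-18 16:30:00", tz = "UTC")
procs <- data.frame(
  pid = c(1L, 42L),
  user = c(6, 90),
  system = c(3, 0),
  created = now - c(900, 100)
)

procs$pcpu <- cpu_percent(procs$user + procs$system, procs$created, now = now)
print(procs[, c("pid", "pcpu")])
```

On a multi-core machine a multi-threaded process can exceed 100% under this definition.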

> I already made a function able to retrieve these metrics ( mathosi/cluster_check ), but since a dedicated package already exist I was wondering if it would be possible to have this feature here.

Yeah, unfortunately, I don't think the approach of calling the external ps program is portable even across various Unix systems, and it will surely not work on Windows, where ps is not available.

YoannPa · 2019-11-18T18:14:40Z

> But it already does that, no? It returns a data frame / tibble with all the processes.

In the Linux terminal version of ps you can specify exactly the metrics you want, in the order you want. I think this feature would deserve an additional function of its own, letting a user pass a string of output codes, similar to those of the classic ps command, to say which metrics they want and in which order.
It is not so simple to do; I haven't yet implemented this option myself in the repository I linked in my previous message.
OR, another way to think about it would be to generate a table with all metrics by default, and then let the user select and order the columns of the data frame the way they want. This second way would better fit what you expect the user to do, and I would be totally fine with that.
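The two ideas can be combined on the R side. Below is a minimal sketch of a function that takes a comma-separated string, in the spirit of `ps -o`, and returns the matching columns in that order. `select_metrics()` is a hypothetical helper, not part of the ps package, and it matches data frame column names rather than the real ps output codes.

```r
# Hypothetical helper: pick columns from a process table using a
# comma-separated specification such as "name,pid".
select_metrics <- function(procs, spec) {
  cols <- trimws(strsplit(spec, ",", fixed = TRUE)[[1]])
  unknown <- setdiff(cols, names(procs))
  if (length(unknown) > 0) {
    stop("Unknown metric(s): ", paste(unknown, collapse = ", "))
  }
  # Columns come back in the order given in `spec`.
  procs[, cols, drop = FALSE]
}

# Stand-in for ps::ps() output; the column names are assumptions.
procs <- data.frame(
  pid = c(1L, 42L),
  name = c("init", "rsession"),
  username = c("root", "yoann"),
  stringsAsFactors = FALSE
)

picked <- select_metrics(procs, "name, pid")
print(picked)
```

Unknown names raise an error instead of being silently dropped, which is closer to how the command-line tool treats an invalid format keyword.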

> ps does not calculate %CPU currently, because it is somewhat cumbersome. If you want to calculate this, open an(other) issue for it.

The way I designed my function ps.to.df(), it just internally calls the classic ps command through system().
This way you don't have to calculate the %CPU usage; you just grab what the classic ps has already calculated. Maybe I am wrong to see things this way, but since a calculation method already exists in ps, and ps is available by default on most Linux distributions, I thought it made sense to use it directly.

> Yeah, unfortunately, I don't think the approach of calling the external ps program is portable even across various Unix systems, and it will surely not work on Windows, where ps is not available.

Yes, this approach would be limited to UNIX systems, but I think most of them have ps. On Windows the equivalent would be tasklist, I guess.

I saw that one of the functions in your package is called ps_cmdline().
It would be nice if this function could also take a ps command line directly as a character string.
Something like this: ps_cmdline(cmd="-C rsession -o %cpu,%mem,pid")
Or a new function could be added to do this. At least Linux users could then pass a command line directly and get the output as a data.frame.

gaborcsardi · 2019-11-18T18:22:13Z

> OR, another way to think it would be to generate a table with all metrics by default,

Right, I think that's the R way to do it. We can still allow customization, though.

> The way I designed my function ps.to.df() , it just does internally a system() of the classic ps command.

That's fine for your script, but it is not for a tool like the ps R package, which is expected to work on macOS and Windows as well, at the very least. It should also work if the ps program is not available, e.g. on some small Docker containers.

Your script already fails on macOS, because ps has different options there:

❯ source("https://raw.githubusercontent.com/mathosi/cluster_check/master/ps_to_df.R")
❯ ps.to.df()
ps: illegal option -- -
usage: ps [-AaCcEefhjlMmrSTvwXx] [-O fmt | -o fmt] [-G gid[,gid...]]
          [-g grp[,grp...]] [-u [uid,uid...]]
          [-p pid[,pid...]] [-t tty[,tty...]] [-U user[,user...]]
       ps [-L]
[1] perCPU  perMEM  PID     PPID    USER    COMMAND STARTED ELAPSED STAT
<0 rows> (or 0-length row.names)
Warning message:
In system(command = cmd, intern = TRUE) :
  running command 'ps -A --no-headers -o %cpu:5,%mem:5,pid:7,ppid:7,user:36,comm:15,lstart:30,etime:30,stat:5 --sort=-%cpu' had status 1
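A script that shells out can at least check what is available before running anything. The sketch below is only an illustration of that check: `process_lister()` is a hypothetical helper, and the choice of `tasklist` on Windows follows the suggestion made earlier in this thread. It does not address the differing options between GNU and BSD ps that cause the failure above.

```r
# Hypothetical helper: name of an external process-listing program that is
# actually on the PATH, or NA if none is found (e.g. a minimal container).
process_lister <- function() {
  candidate <- if (.Platform$OS.type == "windows") "tasklist" else "ps"
  # Sys.which() returns "" when the program cannot be found.
  path <- unname(Sys.which(candidate))
  if (nzchar(path)) candidate else NA_character_
}

tool <- process_lister()
if (is.na(tool)) {
  message("No external process lister found; an in-process API is needed.")
} else {
  message("External process lister available: ", tool)
}
```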

> Something like that: ps_cmdline(cmd="-C rsession -o %cpu,%mem,pid")

I think you missed what ps_cmdline() did. It returns the command line of a process, e.g.:

❯ ps::ps_cmdline(ps::ps_handle())
[1] "/Library/Frameworks/R.framework/Resources/bin/exec/R"

The ps R package does not use the ps program at all.

YoannPa · 2019-11-18T18:51:36Z

> That's fine for your script, but it is not for a tool like the ps R package, which is expected to work on macOS and Windows as well, at the very least. It should also work if the ps program is not available, e.g. on some small Docker containers.

> Your script already fails on macOS, because ps has different options there

Yes, you are right. Honestly, I wasn't even aware that macOS used ps. Then I guess such an approach wouldn't fit your package's expectations. It would just provide a way to do it on Linux only.

> I think you missed what ps_cmdline() did.

Yes, sorry about that. I was thinking that maybe a function with a similar name could be created to do what I described.

Thank you for your time and your answer, that's really nice of you.
Let me know if you plan to give more flexibility to the content of the ps() table.
Best,
Yoann.

gaborcsardi · 2019-11-18T19:19:29Z

> Let me know if you plan to give more flexibility to the content of the ps() table.

Yes, as I said, please open other issues. Thanks.

gaborcsardi · 2019-11-19T10:29:54Z

#72
#73

YoannPa closed this as completed Nov 18, 2019

This was referenced Nov 19, 2019

ps::ps() output that looks like command line ps #72

Closed

Processor percent metrics #73

Open
