Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sort-of rebase of 83: traj action aborts on error #271

Open
wants to merge 25 commits into
base: melodic-devel
Choose a base branch
from

Conversation

gavanderhoorn
Copy link
Member

Sort-of, as it's actually a bit more than what was included in #83.

The proposed changes make the joint_trajectory_action aware of the state of the (OEM) motion server program (as far as it publishes that in the form of RobotStatus messages). In two cases is this information taken into account:

  1. while accepting new goals
  2. while executing already accepted goals

In both cases, the following is monitored / checked:

  1. is an e-stop active?
  2. is there some other error active?
  3. does the motion server report motion_possible is true?

If any of these are not as expected, the goal is rejected, or, if a goal is currently being processed (ie: motion is execution), the goal is aborted and trajectory execution cancelled (as that should now work, with #260 merged (which still requires a cooperating driver of course)).

In all cases, the action result would be INVALID_GOAL, as unfortunately the FollowJointTrajectory action message does not define any really more suitable error codes.

Also in all cases: the reason for aborting or rejecting a goal is described in a human readable message.

When rejecting goals:

[ERROR] [...]: Rejecting goal: controller reported motion not possible (no further information)
[ERROR] [...]: Rejecting goal: controller reported (active) error (OEM code: 11003)
[ERROR] [...]: Rejecting goal: controller reported e-stop

When aborting goals:

[ERROR] [...]: Aborting goal: controller reported motion not possible (no further information)
[ERROR] [...]: Aborting goal: controller reported (active) error (OEM code: 11003)
[ERROR] [...]: Aborting goal: controller reported e-stop

(this is output from fanuc_driver: 11003 is Deadman switch released, which will indeed prevent any motion in manual mode)

While this is certainly not perfect (it doesn't address #265 nor #118), together with #263 and #260 and an (OEM) relay which behaves according to the spec, the UX of industrial_robot_client is improved quite a bit.

Lastly: as previous versions of the JTA did not check anything, misbehaving drivers would still be able to execute motions, even if they didn't properly populate all fields of the RobotStatus messages. In order to still allow usage of such systems, two new parameters were added which can be used to revert the JTA back to its previous behaviour (ie: don't check anything, always forward goals to drivers).

In the current implementation, the default is to disable these new checks. Something to discuss is whether we should make the default to enable the new checks, and users with misbehaving drivers should instead be required to explicitly enable the bw-compatible behaviour.

@gavanderhoorn
Copy link
Member Author

gavanderhoorn commented Jun 26, 2021

Note: 55034f5 is basically what #83 proposed, but it also actually signals to the relay it should stop sending points.

Subsequent commits rewrite this into what the current implementation does.

@gavanderhoorn
Copy link
Member Author

gavanderhoorn commented Jun 26, 2021

I've requested @Levi-Armstrong and/or @JeremyZoss review this, as they've been involved with the development of industrial_core in the past.

@simonschmeisser: if you're still using this, and have some time, I'd appreciate your opinion as well.

Note: I do not intent to keep developing all of this. I just wanted to make sure I can get #83 in before releasing a new version of industrial_core into Melodic and Noetic. But since #83 actually only did half of what it was supposed to, I opted to flesh it out a bit more. And with #260 in place, it was actually not that difficult.

Note 2: it's still possible for trajectory relays to keep moving after an e-stop or error state is cleared: there is no way for the ROS side here (so the IRC node) to enforce anything. The streamer will now in all cases send a "traj abort" message, but if the trajectory and state relays (running on the OEM controller) do not cooperate, there isn't much we can do.

fanuc_driver fi will continue to execute the last sent point after resuming the relay program. Then it will stop. fanuc_driver_exp is slightly better, in that it purges its internal motion buffer and actually SKIPs the current segment. Other drivers may behave similarly (abb_driver fi).

@gavanderhoorn
Copy link
Member Author

This probably also addresses #131 partly, as this should not be possible any more (the JTA should reject the goal due to the controller not being in AUTO mode, or at least not reporting that motion_possible is true):

This happens for me ie when the controller is set to "Manual mode" instead of "automatic". The mitsubishi controller will allow for upload but returns an error once I try to start execution of the trajectory.

Of course it will depend on the (OEM) state program to properly report controller state (ie: motion_possible should accurately reflect whether motion is possible or not).

@gavanderhoorn gavanderhoorn mentioned this pull request Jun 28, 2021
The action server will now take (OEM) motion server state into account when accepting or executing goals.

If the motion server reports (via the state server) it cannot currently execute motion, either because of a controller error or e-stop, or because a required program is not running on the (OEM) controller, new goals are rejected and currently executing goals are cancelled.

Cancelled trajectories should not resume executing once the (OEM) controller error state is cleared. This will however depend on a cooperating motion server (ie: it must manage its queues properly and watch for and act on errors on the OEM controller).

Cancellations and rejections are accompanied by appropriate human error messages, giving the reason for why a goal was cancelled or rejected. Unfortunately this is mostly only human readable as the FollowJointTrajectory action doesn't currently define appropriately expressive result codes.
To keep the behaviour of the action server bw-compatible with the previous behaviour, add two ROS parameters which allow to:

 - ignore motion server state completely
 - treat UNKNOWN motion server state as ON or OFF (where appropriate)

By default motion server state is neither ignored nor are UNKNOWNs acceptable values (and so UNKNOWNs will result in rejected goals and aborted trajectory executions).
Instead of manually comparing against TriState enum values.
It doesn't need to be *this* verbose.
But keep previous behaviour.
So we can reuse stopping the relay.
@gavanderhoorn
Copy link
Member Author

Friendly ping.

I don't believe this needs extensive testing.

The main question is whether we want to change the default behaviour (from not checking to checking).

@@ -30,6 +30,9 @@
<node pkg="industrial_robot_client" type="motion_download_interface" name="motion_download_interface"/>

<!-- joint_trajectory_action: provides actionlib interface for high-level robot control -->
<node pkg="industrial_robot_client" type="joint_trajectory_action" name="joint_trajectory_action"/>
<node pkg="industrial_robot_client" type="joint_trajectory_action" name="joint_trajectory_action">
<param name="ignore_motion_server_error" type="bool" value="true" />
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<param name="ignore_motion_server_error" type="bool" value="true" />
<param name="ignore_motion_server_error" type="bool" value="false" />

I really appreciate this being tunable but I think we should be strict by default as it can introduce surprising delayed movements if you later enable movements (maybe ... maybe not ... ?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is indeed the main point of discussion here.

Doing this by default has the potential to break ppl's setups.

We could merge this PR as-is and not enable it on Melodic, then branch for Noetic and enable it there by default.

I hate branching for these little things, but I don't really see another option.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we leave it disabled, how would you enable it manually? copy paste the launch file?

I totally agree on avoiding branch-mess where possible

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see now that I forgot to export the params using args. I'll add that. That would allow overriding the defaults.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems I've forgotten to fix the args missing. I'll need to address that before merging this.

industrial_robot_client/src/joint_trajectory_action.cpp Outdated Show resolved Hide resolved
industrial_robot_client/src/joint_trajectory_action.cpp Outdated Show resolved Hide resolved
industrial_robot_client/src/joint_trajectory_action.cpp Outdated Show resolved Hide resolved
industrial_robot_client/src/joint_trajectory_action.cpp Outdated Show resolved Hide resolved
gavanderhoorn and others added 2 commits July 1, 2021 11:36
Co-authored-by: Simon Schmeisser <[email protected]>
@gavanderhoorn
Copy link
Member Author

@simonschmeisser: thanks for the review.

Would you see an opportunity to test this on any real hw?

I'd be interested to know whether this mitigates what you reported in #131 for instance.

@simonschmeisser
Copy link
Contributor

I'll try to fit in testing that on our Mitsubishi in the evening

@gavanderhoorn
Copy link
Member Author

I actually don't have access to an downloading driver / robot, so I'd be interested to know whether the proposed changes actually work there.

If they do, we'll need to update the .launch file for download drivers as well.

@simonschmeisser
Copy link
Contributor

Thanks for pointing that out, I actually only have access to a downloading robot/driver

@gavanderhoorn
Copy link
Member Author

Thanks for pointing that out, I actually only have access to a downloading robot/driver

I expect the changes to accepting goals to work for both. I believe that should help with #131.

Aborting active goals will work, but whether the OEM server program also cooperates I can't say -- and will depend on its implementation of course.

isOn(..), isOff(..) and isUnknown(..) are a bit too generic to not isolate them a bit.
Instead of as part of the ctor.
@simonschmeisser
Copy link
Contributor

We would need to set the error_text string field of the FollowJointTrajectory msg for it to be printed there

I'm not sure why we have the empty controller name in all of our controller files ... what do you usually put there? Something like mitsubishi driver?

@gavanderhoorn
Copy link
Member Author

We would need to set the error_text string field of the FollowJointTrajectory msg for it to be printed there

According to the api doc (and the sources), anything you pass to setRejected(..) gets assigned to the error_text. At least that's how I interpreted that.

I'm not sure why we have the empty controller name in all of our controller files ... what do you usually put there? Something like mitsubishi driver?

It's most likely empty because the template in the tutorial lots of people refer to has it set to "".

If you set it to something non-empty, MoveIt will start looking in a different place for your action topics. That's not necessarily a problem though, you'd just have to update your driver to offer the action server there (or remap).

@simonschmeisser
Copy link
Contributor

According to the api doc (and the sources), anything you pass to setRejected(..) gets assigned to the error_text. At least that's how I interpreted that.

But we have two text variables for the error here. The generic one you are setting and the redundant one in http://docs.ros.org/en/api/control_msgs/html/action/FollowJointTrajectory.html

@gavanderhoorn
Copy link
Member Author

gavanderhoorn commented Jul 9, 2021

O, crap, you're right.

I'd somehow confused those.

Perhaps MoveIt should check both and display them.

simonschmeisser
simonschmeisser previously approved these changes Jul 9, 2021
@gavanderhoorn
Copy link
Member Author

So with the technical side out of the way, the question becomes: do we want to merge this into melodic-devel and change the default behaviour?

@simonschmeisser @Levi-Armstrong @anyone-with-an-opinion-really?

@gavanderhoorn
Copy link
Member Author

With 0eeb23f and 0fa8bba, MoveIt now reports something like:

[ WARN] [1626612647.088687582]: Controller '' failed with error INVALID_GOAL: Rejecting goal: controller reported motion not possible (no further information)
[ WARN] [1626612647.088809223]: Controller handle '' reports status FAILED
[ INFO] [1626612647.088865293]: Completed trajectory execution with status FAILED ...
[ INFO] [1626612647.088960767]: Execution completed: FAILED

The double controller there can be a bit confusing.

I was thinking of maybe adding "OEM" or "robot" there, so we would get:

[ WARN] [1626612647.088687582]: Controller '' failed with error INVALID_GOAL: Rejecting goal: OEM controller reported motion not possible (no further information)

But this would of course not be accurate if the server side would not run on an OEM controller.

@simonschmeisser
Copy link
Contributor

I'm not sure how many people understand what you mean by OEM so I'd rather go for robot controller

Copy link
Contributor

@simonschmeisser simonschmeisser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There don't seem to be that many opinionated people around any more ...

I vote for merging this as is to melodic-devel as I see the slight change in behavior as a bug fix basically

@gavanderhoorn
Copy link
Member Author

Ok, to move this forward, I'm going to create a "post noetic" branch.

On that branch, everything discussed here will default to false (ie: ignore_motion_server_error and consider_status_unknowns_ok).

On both Melodic and Noetic, they'll default to true (to maintain the current behaviour).

People still using industrial_robot_client and likely to keep use it could build from source. The binary releases will just play-nice.

@simonschmeisser
Copy link
Contributor

Friendly ping, could be part of ros-o

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants