Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hwp site changes #541

Merged
merged 8 commits into from
Nov 22, 2023
Merged

Hwp site changes #541

merged 8 commits into from
Nov 22, 2023

Conversation

bbixler500
Copy link
Contributor

Changes to the hwp_gripper, hwp_pid, and hwp_pmx agents to improve functionality at the site

Description

  • hwp_gripper: fixed how arguments are passed to the agent and renamed a function for consistency
  • hwp_pid: changed the socket communication to prevent agent crashes
  • hwp_pmx: changed the socket communication to prevent agent crashes

Motivation and Context

As it is currently, the hwp_gripper agent does not work. The changes made to that agent were made to get it running. The hwp_pid and hwp_pmx agents crash if the network communication is temporarily lost. The changes made to those agents are to prevent that crashing.

How Has This Been Tested?

All three of these agents have been running on the dev environment overnight with no issue. The changes to the hwp_gripper agent are minor and only effect how the agent is initialized. The changes to the hwp_pid and hwp_pmx agents only change the socket communication between agent and device, and the agents have been continually sending data to grafana successfully.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.

Bryce Bixler and others added 4 commits October 9, 2023 18:43
updates:
- [github.com/pre-commit/pre-commit-hooks: v4.4.0 → v4.5.0](pre-commit/pre-commit-hooks@v4.4.0...v4.5.0)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Copy link
Collaborator

@jlashner jlashner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks ok to me for the most part, however I don't agree with the new _establish_connection logic in the PID and PMX controllers. I think this needs to fail after a certain number of attempts, instead of entering an infinite loop.

# unit tests might fail on first connection attempt
attempts = 3
for attempt in range(attempts):
while True:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love this being an infinite loop, since control is never passed back to the agent. Can we keep this to a finite number of attempts, and handle re-connection at the agent level?

if attempt == 1:
print("Resetting connection")
self.conn.close()
self.conn = self._establish_connection(self.ip, int(self.port))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as mentioned above, if there is no connection, establish_connection here will wait forever. Instead, this should throw an error after a finite amount of time.

@davidvng
Copy link
Contributor

Hi @bbixler500 , this fix looks great! Over at SATP3 we've been having trouble with our PID and gripper agents, and I've tested this branch and they work. I'd greatly appreciate getting this merged soon.

@bbixler500
Copy link
Contributor Author

At the site there was a strange behavior where the PID and PMX agents would consistently crash every night, requiring me to restart the agents every morning. These crashes occurred because of connection interruptions, which could last up to 3hrs (I was never able to determine the cause of the connection interruptions). Previously the agents were coded so that _establish_connection would throw an error after a set number of unsuccessful connections. Consequently, the number of unsuccessful attempts was always reached and the agent would always crash. The switch to the infinite loop solved the agents crashing issue.

@davidvng
Copy link
Contributor

At the site there was a strange behavior where the PID and PMX agents would consistently crash every night, requiring me to restart the agents every morning. These crashes occurred because of connection interruptions, which could last up to 3hrs (I was never able to determine the cause of the connection interruptions). Previously the agents were coded so that _establish_connection would throw an error after a set number of unsuccessful connections. Consequently, the number of unsuccessful attempts was always reached and the agent would always crash. The switch to the infinite loop solved the agents crashing issue.

I think you can let the acq process be the "infinite loop" here: https://github.com/simonsobs/socs/blob/hwp_site_changes/socs/agents/hwp_pid/agent.py#L286
Then, whenever these calls fail, you can call init_connection again. I think this is what @jlashner meant by reconnecting at the agent level.

Bryce Bixler and others added 2 commits November 17, 2023 22:26
…f iterations. Added reconnection functionality into the agent acq process
@davidvng
Copy link
Contributor

@bbixler500 this change looks good to me. If you could fix the pytest failure and pre-commit failure, we can get this approved and merged.

@bbixler500
Copy link
Contributor Author

I'm not sure what this pytest error means or how to solve it. It appears to be from the test_hwp_pmx_agent_integration.py file

@davidvng davidvng requested a review from jlashner November 21, 2023 22:51
@davidvng
Copy link
Contributor

@jlashner @BrianJKoopman can we get this reviewed and merged this week? SATP3 would like to spin the HWP this weekend. thanks!

Copy link
Collaborator

@jlashner jlashner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Thanks for the changes @bbixler500

@BrianJKoopman BrianJKoopman merged commit fc79849 into main Nov 22, 2023
7 checks passed
@BrianJKoopman BrianJKoopman deleted the hwp_site_changes branch November 22, 2023 23:04
@davidvng davidvng linked an issue Nov 28, 2023 that may be closed by this pull request
hnakata-JP pushed a commit that referenced this pull request Apr 12, 2024
* minor bugfixes to the hwp-gripper agent

* Modified hwp-pid and hwp-pmx agents to handle connection interruptions

* [pre-commit.ci] pre-commit autoupdate (#540)

updates:
- [github.com/pre-commit/pre-commit-hooks: v4.4.0 → v4.5.0](pre-commit/pre-commit-hooks@v4.4.0...v4.5.0)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Changed connection protocal from an infinite loop into a set number of iterations. Added reconnection functionality into the agent acq process

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fixes to pre-commit failures

* Added read=False to set_current_limit meathod

---------

Co-authored-by: Bryce Bixler <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

HWP Gripper agent unexpected keyword argument 'mcu_ip'
4 participants