
Strict ACU elevation check in HWP supervisor restricts normal operation #788

Open · BrianJKoopman opened this issue Nov 3, 2024 · 4 comments
Labels: agent: hwp supervisor, bug
@BrianJKoopman (Member) commented Nov 3, 2024

During normal operations the ACU encoder readout will sometimes sit just past the configured min/max elevation by a tiny amount, which trips the supervisor's strict elevation check during a typical scan. For example:

2024-11-03T14:34:06+0000 start called for pid_to_freq
2024-11-03T14:34:06+0000 pid_to_freq:9 Status is now "starting".
2024-11-03T14:34:06+0000 pid_to_freq:9 Status is now "running".
2024-11-03T14:34:06+0000 Setting state: ControlState.PIDToFreq(target_freq=2.0, direction='1', freq_tol=0.05, freq_tol_duration=10.0)
2024-11-03T14:34:06+0000 Error updating state:
Traceback (most recent call last):
  File "/opt/venv/lib/python3.10/site-packages/socs/agents/hwp_supervisor/agent.py", line 982, in update
    check_acu_ok_for_spinup()
  File "/opt/venv/lib/python3.10/site-packages/socs/agents/hwp_supervisor/agent.py", line 975, in check_acu_ok_for_spinup
    raise RuntimeError(f"ACU elevation is {acu.el_current_position} deg, "
RuntimeError: ACU elevation is 47.9998 deg, outside of allowed range (48.0, 90.0)

This has been encountered on satp2 and satp3. satp3 chose to loosen the limits in its configs by a tenth of a degree. I think we should just set some acceptable tolerance on this check in the agent; a rough sketch of what that could look like is below.
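
For illustration, here is one shape the tolerance could take, assuming the check keeps its current structure. ACU_EL_TOLERANCE and the function signature are invented for this sketch, not the agent's actual code.

# Hypothetical sketch: absorb encoder jitter with a small tolerance on the
# elevation bounds check. acu, min_el, and max_el stand in for values the
# supervisor already tracks.
ACU_EL_TOLERANCE = 0.1  # deg; enough to cover 47.9998 vs. 48.0

def check_acu_ok_for_spinup(acu, min_el, max_el):
    el = acu.el_current_position
    if not (min_el - ACU_EL_TOLERANCE <= el <= max_el + ACU_EL_TOLERANCE):
        raise RuntimeError(f"ACU elevation is {el} deg, "
                           f"outside of allowed range ({min_el}, {max_el})")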

@ykyohei (Contributor) commented Nov 29, 2024

Related to this issue, satp3 had a similar problem on Nov 29th, 2024, while running this schedule: https://site.simonsobs.org/satp3/nextline/db/runs/2202

run.wait_until('2024-11-29T18:44:23.980000+00:00')
run.acu.move_to(az=180.0, el=40.0)
run.acu.move_to(az=180.0, el=48.0)
######## HWP spinning up ##########
run.hwp.set_freq(freq=2) # start HWP rotation

In this case, I think the safety check was triggered by the time lag in ACU monitoring: the HWP spin-up was called right after the telescope moved from 40 to 48 deg, and hwp_supervisor checked an ACU position that had been recorded slightly earlier, while the elevation was still below 48. One solution is to just add a time buffer to the schedule (a rough sketch follows below).
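
For example, the buffer could be written into the schedule like this. The 30 s value is a guess, not a measured requirement; it just needs to exceed the supervisor's ACU polling interval.

import time

run.acu.move_to(az=180.0, el=48.0)
# Hypothetical buffer: give hwp_supervisor's ACU monitor time to refresh its
# cached position before the elevation check runs. 30 s is an assumed value;
# it should be longer than the supervisor's polling period.
time.sleep(30)
run.hwp.set_freq(freq=2)  # start HWP rotation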

@BrianJKoopman (Member, Author)

> In this case, I think the safety check was triggered by the time lag in ACU monitoring: the HWP spin-up was called right after the telescope moved from 40 to 48 deg, and hwp_supervisor checked an ACU position that had been recorded slightly earlier, while the elevation was still below 48. One solution is to just add a time buffer to the schedule.

I would have expected motion to have completed by the time run.hwp.set_freq(freq=2) was run, since the run.acu.move_to() lines are effectively running:

acu.go_to.start()  # begin the move
acu.go_to.wait()   # block until the motion completes

From that run error log:

RuntimeError: ACU elevation is 47.7686 deg, outside of allowed range (47.9, 70.1)

This occurred at 1732905887.911, or 2024-11-29T18:44:47.911 UTC. Here's the relevant position information in Grafana. At that timestamp the position recorded there is 48.0.
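
(That timestamp conversion can be double-checked with standard Python:)

from datetime import datetime, timezone

# Unix timestamp from the error log, rendered in UTC.
print(datetime.fromtimestamp(1732905887.911, tz=timezone.utc))
# -> 2024-11-29 18:44:47.911000+00:00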

@mhasself, am I correct that the ACU agent's go_to() task blocks until motion is complete? Are there any exceptions to that? Or is it possible we grabbed the position early here?

@mhasself (Member) commented Dec 2, 2024

The ACU agent blocks. I believe the problem was that the HWPSupervisor polls the ACU agent for positions every so often, and that cache can be a little stale.
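
For what it's worth, a rough sketch of a freshness guard on that cache. The last_updated field and the 20 s threshold are assumptions for illustration, not the supervisor's actual attributes.

import time

MAX_CACHE_AGE = 20.0  # seconds; assumed threshold, not an existing setting

def check_acu_data_fresh(acu):
    # acu.last_updated is an invented field standing in for whatever
    # timestamp the supervisor's ACU cache actually records.
    age = time.time() - acu.last_updated
    if age > MAX_CACHE_AGE:
        raise RuntimeError(
            f"Cached ACU position is {age:.1f} s old; refusing spin-up "
            "until a fresh reading arrives.")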

@BrianJKoopman (Member, Author)

> The ACU agent blocks. I believe the problem was that the HWPSupervisor polls the ACU agent for positions every so often, and that cache can be a little stale.

Yeah, that makes sense, thanks. A new issue to track this has been created: #799.
