Trigger PA Messages from RTS #764
Conversation
config/runtime.exs
Outdated
active_headend_path: System.get_env("ACTIVE_HEADEND_S3_PATH"),
screenplay_url: System.get_env("SCREENPLAY_URL"),
screenplay_api_key: System.get_env("SCREENPLAY_API_KEY"),
active_pa_messages_path: System.get_env("ACTIVE_PA_MESSAGES_PATH")
What's the rationale for this being a runtime environment variable? If we just need a testing hook, then just using the application environment should be sufficient.
As a side note, it would be nice at some point to convert the Fake http module to use Mox instead, which would make testing more straightforward, but that will probably take a bit of effort.
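A rough sketch of what the Mox approach might look like, assuming a hypothetical HttpClientBehaviour that both the real client and the mock would implement (module names, the :realtime_signs app key, and the response shape are illustrative, not the project's actual code):

# Hypothetical behaviour shared by the real HTTP client and the mock
defmodule HttpClientBehaviour do
  @callback get(url :: String.t(), headers :: list()) :: {:ok, map()} | {:error, term()}
end

# test/test_helper.exs
Mox.defmock(HttpClientMock, for: HttpClientBehaviour)
Application.put_env(:realtime_signs, :http_client, HttpClientMock)

# inside a test, stub a canned response instead of routing through the Fake module
Mox.expect(HttpClientMock, :get, fn _url, _headers ->
  {:ok, %{status_code: 200, body: "[]"}}
end)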
Hmm, yeah, that's a good point. My rationale was just that the path could change, since the Screenplay API is relatively new and still being developed, but making the devops change would be just as much effort as making a code change in RTS, probably more. I'll make a change to remove this one.
lib/engine/pa_messages.ex
Outdated
end

def create_table(state) do
  :ets.new(state.pa_message_timers_table, [:named_table, read_concurrency: true])
Do we gain anything from using ETS here? It looks like the only access is inside this module, so there shouldn't be any contention to worry about.
Oh, that is true. Originally I had planned for the sign processes to make requests to access the table, so I started with this and left it, but we can simplify things and go without the table.
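For illustration, a minimal sketch of dropping the ETS table and keeping the timer refs in the engine's own state instead (the state field and message shapes here are assumptions):

# Hypothetical: timer refs held in the GenServer state map, since only
# this process ever reads or writes them.
def init(_opts) do
  {:ok, %{pa_message_timers: %{}}}
end

def handle_info({:schedule_pa_message, pa_message, interval_ms}, state) do
  ref = Process.send_after(self(), {:play_pa_message, pa_message}, interval_ms)
  {:noreply, put_in(state, [:pa_message_timers, pa_message.id], ref)}
end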
lib/engine/pa_messages.ex
Outdated
ms_elapsed = existing_pa_message.interval_in_ms - remaining_ms
temp_interval = (active_pa_message.interval_in_ms - ms_elapsed) |> max(0)
cancel_pa_timer(timer_ref, pa_id)
schedule_pa_message(active_pa_message, temp_interval, table)
The timer-based approach will certainly work, but now that I'm looking at the code, I wonder if it's a little more complex than we need. I know the original motivation was to avoid excessive polling, but now that all the logic is centralized in this engine (which I think is good), we could also just keep a map of id-to-last-playtime in state, and check all the deltas against the current time once per second (especially since we're already doing an :update loop). My hunch is that that would end up being cleaner than wrangling the timers. Thoughts?
Hmm, yeah, I think that approach works nicely and will make the "edited interval" piece of the logic a good deal simpler. And it removes the need to cancel timers and such, as you hinted at. I think we'd also get the handling of inactive messages for free, since if they're not included in the response, we won't be checking their last playtime anyway.
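A rough sketch of that once-per-second check, with the state fields and helper functions (send_pa_message/1, schedule_update/1) assumed for illustration rather than taken from the actual PR:

# Hypothetical :update handler: compare each active message's last play
# time against its interval and play anything that's due.
def handle_info(:update, state) do
  now = System.monotonic_time(:millisecond)

  last_sent =
    Enum.reduce(state.active_pa_messages, state.pa_messages_last_sent, fn pa_message, acc ->
      last_play = Map.get(acc, pa_message.id)

      if is_nil(last_play) or now - last_play >= pa_message.interval_in_ms do
        send_pa_message(pa_message)
        Map.put(acc, pa_message.id, now)
      else
        acc
      end
    end)

  schedule_update(self())
  {:noreply, %{state | pa_messages_last_sent: last_sent}}
end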
@@ -47,7 +47,8 @@ defmodule Signs.Realtime do
       announced_alert: false,
       prev_prediction_keys: nil,
       prev_predictions: [],
-      uses_shuttles: true
+      uses_shuttles: true,
+      pa_message_plays: %{}
I'm not sure I understand the benefit of tracking this here. Is this so that we won't spam messages if the engine crashes? If that's the concern, and we're willing to accept a slight behavior change, we could schedule new messages with their full duration instead of immediately, which would result in fewer messages in that failure mode, rather than excess ones.
Yeah, that was the intention here, just to prevent consecutive plays due to engine crashes/restarts. Ah, so you mean that a PA message wouldn't play immediately upon scheduling and would instead wait one interval before its first play?
I had actually sort of brought that up in the product channel last week, but it was only discussed briefly. Kevin said a PA message should play as soon as it can right after it's created, but I'm still not sure if that's also what the ARINC software does currently. It seems like it's still up for discussion, though. I'll raise the question again with the product folks and see if we would be okay with that behavior change.
Discussed a bit further in the thread, and I think I lean towards avoiding scheduling a new message with the full duration. I don't think it's been fully confirmed or is explicitly a standard practice, but it seems that OIOs tend to schedule a message and look immediately for a "proof of play", and I'd rather not disrupt their workflow if we can avoid it. I'm open to other ideas for preventing spammed message plays, though, if you have any thoughts.
config/runtime.exs
Outdated
@@ -36,7 +36,9 @@ if config_env() != :test do
       message_log_s3_folder: System.get_env("MESSAGE_LOG_S3_FOLDER"),
       message_log_report_s3_folder: System.get_env("MESSAGE_LOG_REPORT_S3_FOLDER"),
       monitoring_api_key: System.get_env("MONITORING_API_KEY"),
-      active_headend_path: System.get_env("ACTIVE_HEADEND_S3_PATH")
+      active_headend_path: System.get_env("ACTIVE_HEADEND_S3_PATH"),
+      active_pa_messages_url: System.get_env("ACTIVE_PA_MESSAGES_URL"),
On second thought @panentheos, I'm not sure if I interpreted your comment about the runtime env here correctly. Did you mean that it's not necessary to split the Screenplay base URL and the route for active PA messages, or that we don't need the Screenplay URL as an env variable at all? I assume we still want it as an env variable for local testing and such, right?
My thinking was: the Screenplay base URL should be a system env variable, like we do for other systems. The path for the endpoint should not be: ideally, it wouldn't be abstracted at all, since the actual API path on Screenplay is always the same. However, because of how the Fake module works, we need some sort of hook for testing different responses. The least intrusive thing that gets us that would be to put the path in the application environment, with the real value configured statically in config.exs, and then have the tests modify the application env like they're doing.
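A minimal sketch of that arrangement, assuming the config lives under the :realtime_signs app (the key names and the path value here are illustrative):

# config/config.exs - real path configured statically
config :realtime_signs, active_pa_messages_path: "/api/pa-messages/active"

# config/runtime.exs - only the base URL comes from the system environment
config :realtime_signs, screenplay_url: System.get_env("SCREENPLAY_URL")

# in a test - repoint the path at whatever the Fake module serves
Application.put_env(:realtime_signs, :active_pa_messages_path, "/fake/active_pa_messages")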
See a few comments, then this looks good. Before we merge/deploy this, we should make sure that the environment variables are set up, and that Screenplay is ready to serve responses in prod.
lib/signs/utilities/audio.ex
Outdated
  end
end
|> then(fn
  :ok -> Map.put(pa_message_plays, pa_message.id, DateTime.utc_now())
PaEss.Updater.play_message doesn't currently return :ok for the new SCU code path, so it should be changed to do that. Alternatively, we could just remove the branching here, since I don't think we expect a recoverable failure case anyway.
lib/engine/pa_messages.ex
Outdated
end)
end

defp handle_inactive_pa_messages(active_pa_messages, pa_messages_last_sent) do
Do we want to do this? If we proactively wipe this state, then deactivating and reactivating an alert would cause it to play again immediately, which could be undesirable. If we leave the sign state intact, it would block it instead.
Yeah, that makes sense. What I had in mind was to keep things in line with the immediate feedback that OIOs tend to expect when they set up a PA message, but maybe that only applies to initially creating the message, or to reactivating it after a longer period of time has passed. Probably better not to allow the chance of a bunch of consecutive plays.
I could see the logging here being a useful audit, but we could also just do that from the Screenplay side.
lib/signs/bus.ex
Outdated
@@ -121,6 +124,23 @@ defmodule Signs.Bus do
     {:ok, state}
   end

+  @impl true
+  def handle_info({:play_pa_message, pa_message}, sign) do
+    Logger.info("pa_message: action=send id=#{pa_message.id} destination=#{sign.id}")
The logging looks like it could be cleaned up a little. This one should probably be moved into send_pa_message and only done if we actually play the message. There's another log with the same action value in the engine, which we could remove.
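For illustration, a sketch of that shape, with should_play?/2 and play_audio/2 standing in for the real checks and the real updater call (none of these helper names are from the PR itself):

defmodule Signs.BusSketch do
  require Logger

  # Hypothetical: the log line lives inside send_pa_message and is only
  # emitted when the message is actually played.
  def handle_info({:play_pa_message, pa_message}, sign) do
    send_pa_message(pa_message, sign)
    {:noreply, sign}
  end

  defp send_pa_message(pa_message, sign) do
    if should_play?(pa_message, sign) do
      Logger.info("pa_message: action=send id=#{pa_message.id} destination=#{sign.id}")
      play_audio(pa_message, sign)
    end
  end

  defp should_play?(_pa_message, _sign), do: true
  defp play_audio(_pa_message, _sign), do: :ok
end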
Summary of changes
Asana Ticket: RTS: Trigger PA Message Readouts
This PR adds a new engine that reads in active PA messages from Screenplay and schedules them to play on sign zones. It also adds a map to sign state to track when the most recent play of a given PA message occurred. This adds some resilience against engine restarts, which could otherwise trigger PA messages too close together.
- When a message goes inactive, its current timer will be cancelled and its record deleted from the ETS table.
- When an interval is changed, we will update the scheduling so that the next play of the message lines up with its new interval.
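As a sketch of how that sign-side map could be consulted (field and function names are assumptions, not the exact implementation):

# Hypothetical guard: skip a play if this sign already played the PA
# message within its interval, e.g. right after an engine restart.
defp recently_played?(pa_message_plays, pa_message, now) do
  case Map.get(pa_message_plays, pa_message.id) do
    nil -> false
    last_play -> DateTime.diff(now, last_play, :millisecond) < pa_message.interval_in_ms
  end
end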