Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Trigger PA Messages from RTS #764

Merged
merged 18 commits into from
Jul 1, 2024
Merged

Trigger PA Messages from RTS #764

merged 18 commits into from
Jul 1, 2024

Conversation

PaulJKim
Copy link
Collaborator

Summary of changes

Asana Ticket: RTS: Trigger PA Message Readouts

This PR adds a new engine that reads in active PA messages from Screenplay and schedules them to play on sign zones. It also adds a map to sign states to track when the most recent play of a given pa message was. This is to add some resilience against engine restarts which would result in triggers of PA messages that are too close together.

-When a message goes inactive, it's current timer will be cancelled and it's record delete from the ets table.
-When an interval is changed, we will update the scheduling so that the next time the message plays will be in line with it's new interval.

Copy link

Coverage of commit 47c2b62

Summary coverage rate:
  lines......: 73.4% (2085 of 2840 lines)
  functions..: 76.0% (585 of 770 functions)
  branches...: no data found

Files changed coverage rate:
                                                             |Lines       |Functions  |Branches    
  Filename                                                   |Rate     Num|Rate    Num|Rate     Num
  =================================================================================================
  lib/engine/pa_messages.ex                                  |85.2%     54| 100%    12|    -      0
  lib/fake/httpoison.ex                                      |74.5%     47|57.1%     7|    -      0
  lib/pa_messages/pa_message.ex                              | 100%      3| 100%     5|    -      0
  lib/realtime_signs.ex                                      | 100%     15| 100%     3|    -      0
  lib/signs/bus.ex                                           |84.1%    246|95.2%    42|    -      0
  lib/signs/realtime.ex                                      |92.5%     67|91.7%    12|    -      0
  lib/signs/utilities/audio.ex                               |87.7%    130| 100%    16|    -      0

Download coverage report

Copy link

Coverage of commit 9fe3e57

Summary coverage rate:
  lines......: 73.3% (2083 of 2840 lines)
  functions..: 76.0% (585 of 770 functions)
  branches...: no data found

Files changed coverage rate:
                                                             |Lines       |Functions  |Branches    
  Filename                                                   |Rate     Num|Rate    Num|Rate     Num
  =================================================================================================
  lib/engine/pa_messages.ex                                  |85.2%     54| 100%    12|    -      0
  lib/fake/httpoison.ex                                      |74.5%     47|57.1%     7|    -      0
  lib/pa_messages/pa_message.ex                              | 100%      3| 100%     5|    -      0
  lib/realtime_signs.ex                                      | 100%     15| 100%     3|    -      0
  lib/signs/bus.ex                                           |83.3%    246|95.2%    42|    -      0
  lib/signs/realtime.ex                                      |92.5%     67|91.7%    12|    -      0
  lib/signs/utilities/audio.ex                               |87.7%    130| 100%    16|    -      0

Download coverage report

active_headend_path: System.get_env("ACTIVE_HEADEND_S3_PATH"),
screenplay_url: System.get_env("SCREENPLAY_URL"),
screenplay_api_key: System.get_env("SCREENPLAY_API_KEY"),
active_pa_messages_path: System.get_env("ACTIVE_PA_MESSAGES_PATH")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the rationale for this being a runtime environment variable? If we just need a testing hook, then just using the application environment should be sufficient.

As a side note, it would be nice at some point to convert the Fake http module to use Mox instead, which would make testing more straightforward, but that will probably take a bit of effort.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm yeah that's a good point. I think my rationale was just that the path could change since the screenplay api is relatively new and still being developed, but I guess it would be just as much effort to make the devops change rather than just make a code change in RTS or probably even less. I'll make a change to remove this one

end

def create_table(state) do
:ets.new(state.pa_message_timers_table, [:named_table, read_concurrency: true])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we gain anything from using ETS here? It looks like the only access is inside this module, so there shouldn't be any contention to worry about.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that is true. Originally I had planned that the sign processes would be making requests to access the table so started with this and left it, but we can simplify things and go without the table

ms_elapsed = existing_pa_message.interval_in_ms - remaining_ms
temp_interval = (active_pa_message.interval_in_ms - ms_elapsed) |> max(0)
cancel_pa_timer(timer_ref, pa_id)
schedule_pa_message(active_pa_message, temp_interval, table)
Copy link
Collaborator

@panentheos panentheos Jun 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The timer-based approach will certainly work, but now that I'm looking at the code, I wonder if it's a little more complex than we need. I know the original motivation was to avoid excessive polling, but now that all the logic is centralized in this engine (which I think is good), we could also just keep a map of id-to-last-playtime in state, and check all the deltas against the current time once per second (especially since we're already doing an :update loop). My hunch is that that would end up being cleaner than wrangling the timers. Thoughts?

Copy link
Collaborator Author

@PaulJKim PaulJKim Jun 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm yeah I think that approach works nicely and will make the "edited interval" piece of the logic a good deal simpler. And it removes the need to cancel timers and such as you hinted at. I think we also would get the "handling" of inactive messages for free as well since if they're not included in the response, we won't be checking for the last playtime anyway.

@@ -47,7 +47,8 @@ defmodule Signs.Realtime do
announced_alert: false,
prev_prediction_keys: nil,
prev_predictions: [],
uses_shuttles: true
uses_shuttles: true,
pa_message_plays: %{}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand the benefit of tracking this here. Is this so in case the engine crashes, we won't spam messages? If that's the concern, and we're willing to accept a slight behavior change, we could schedule new messages with their full duration instead of immediately, which would result in fewer messages in that failure mode, rather than excess ones.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that was the intention here, just to prevent consecutive plays due to an engine crashes/restarts. Ah so you mean that a PA Message wouldn't play immediately upon scheduling and instead wait one interval before it's first play?

I had actually sort of brought that up in the product channel last week but it was only discussed briefly. Kevin said a PA message should play as soon as it can right after it's created, but I'm still not sure if that's also what the ARINC software does currently. Seems like its still certainly up for discussion though. I'll raise the question again to the product folks and see if we would be okay with that behavior change.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed a bit further in the thread and I think I lean towards avoiding scheduling a new message with the full duration. I don't think it's been fully confirmed or is explicitly a standard practice, but it seems that OIOs tend to schedule a message and look immediately for a "proof of play" and I'd rather not disrupt their workflow if we can avoid it. I'm open to other ideas for preventing spammed message plays though if you have any thoughts?

@PaulJKim PaulJKim requested a review from a team as a code owner June 14, 2024 22:33
@@ -36,7 +36,9 @@ if config_env() != :test do
message_log_s3_folder: System.get_env("MESSAGE_LOG_S3_FOLDER"),
message_log_report_s3_folder: System.get_env("MESSAGE_LOG_REPORT_S3_FOLDER"),
monitoring_api_key: System.get_env("MONITORING_API_KEY"),
active_headend_path: System.get_env("ACTIVE_HEADEND_S3_PATH")
active_headend_path: System.get_env("ACTIVE_HEADEND_S3_PATH"),
active_pa_messages_url: System.get_env("ACTIVE_PA_MESSAGES_URL"),
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought @panentheos , I'm not sure if I interpreted your comment about the runtime env here correctly. Did you mean that it's not necessary to split the screenplay base url and the route for active PA messages or that we don't need the screenplay url as an env variable at all? I assume we still want it as an env variable for local testing and such, right?

Copy link
Collaborator

@panentheos panentheos Jun 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking was: The screenplay base URL should be a system env variable, like we do for other systems. The path for the endpoint should not be: Ideally, it wouldn't be abstracted at all, since the actual API path on Screenplay is always the same. However, because of how the Fake module works, we need some sort of hook for testing different responses. The least intrusive thing that gets us that would be to put the path in the application environment, with the real value configured statically in config.exs, and then have the tests modify the application env like they're doing.

@sloanelybutsurely sloanelybutsurely force-pushed the pk/pa-messages branch 2 times, most recently from 7d20c57 to b58619c Compare June 18, 2024 21:01
Copy link

Coverage of commit 7c505b6

Summary coverage rate:
  lines......: 73.1% (2064 of 2824 lines)
  functions..: 75.8% (581 of 766 functions)
  branches...: no data found

Files changed coverage rate:
                                                             |Lines       |Functions  |Branches    
  Filename                                                   |Rate     Num|Rate    Num|Rate     Num
  =================================================================================================
  lib/engine/pa_messages.ex                                  |63.2%     38| 100%     8|    -      0
  lib/fake/httpoison.ex                                      |74.5%     47|57.1%     7|    -      0
  lib/pa_messages/pa_message.ex                              | 100%      3| 100%     5|    -      0
  lib/realtime_signs.ex                                      | 100%     15| 100%     3|    -      0
  lib/signs/bus.ex                                           |83.3%    246|95.2%    42|    -      0
  lib/signs/realtime.ex                                      |89.4%     66|91.7%    12|    -      0
  lib/signs/utilities/audio.ex                               |87.7%    130| 100%    16|    -      0

Download coverage report

Copy link
Collaborator

@panentheos panentheos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See a few comments, then looks good. Before we merge/deploy this, we should make sure that the environment variables are set up, and that screenplay is ready to serve responses in prod.

lib/signs/bus.ex Outdated Show resolved Hide resolved
end
end
|> then(fn
:ok -> Map.put(pa_message_plays, pa_message.id, DateTime.utc_now())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PaEss.Updater.play_message doesn't currently return :ok for the new SCU code path, so it should be changed to do that. Alternatively, we could just remove the branching here, since I don't think we expect a recoverable failure case anyway.

end)
end

defp handle_inactive_pa_messages(active_pa_messages, pa_messages_last_sent) do
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to do this? If we proactively wipe this state, then deactivating and reactivating an alert would cause it to play again immediately, which could be undesirable. If we leave the sign state intact, it would block it instead.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that makes sense. What I had in mind was to keep things in line with the immediate feedback that OIOs tend to expect when they set up a PA message, but maybe that's just for initially creating the message or reactivating it after a longer length of time has passed. Probably better to not allow the chance of a bunch of consecutive plays.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could see the logging here being a useful audit, but we could also just do that from the Screenplay side

lib/signs/bus.ex Outdated
@@ -121,6 +124,23 @@ defmodule Signs.Bus do
{:ok, state}
end

@impl true
def handle_info({:play_pa_message, pa_message}, sign) do
Logger.info("pa_message: action=send id=#{pa_message.id} destination=#{sign.id}")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logging looks like it could be cleaned up a little. This one should probably be moved into send_pa_message and only done if we actually play the message. There's another log with the same action value in the engine, which we could remove.

Copy link

Coverage of commit e5bc8be

Summary coverage rate:
  lines......: 73.4% (2057 of 2802 lines)
  functions..: 75.7% (578 of 764 functions)
  branches...: no data found

Files changed coverage rate:
                                                             |Lines       |Functions  |Branches    
  Filename                                                   |Rate     Num|Rate    Num|Rate     Num
  =================================================================================================
  lib/engine/pa_messages.ex                                  |72.4%     29| 100%     7|    -      0
  lib/fake/httpoison.ex                                      |74.5%     47|57.1%     7|    -      0
  lib/pa_messages/pa_message.ex                              | 100%      3| 100%     5|    -      0
  lib/realtime_signs.ex                                      | 100%     15| 100%     3|    -      0
  lib/signs/bus.ex                                           |84.7%    242|95.2%    42|    -      0
  lib/signs/realtime.ex                                      |92.1%     63|91.7%    12|    -      0
  lib/signs/utilities/audio.ex                               |88.5%    131| 100%    17|    -      0

Download coverage report

@PaulJKim PaulJKim merged commit 92ab430 into main Jul 1, 2024
3 checks passed
@PaulJKim PaulJKim deleted the pk/pa-messages branch July 1, 2024 14:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants