quicksearch not showing results #154

Closed
2 tasks
josh-chamberlain opened this issue Dec 21, 2023 · 14 comments
Labels: bug (Something isn't working)

Comments

@josh-chamberlain
Contributor

josh-chamberlain commented Dec 21, 2023

Context

Sometimes, the app reaches a state where every search shows "no results" and we have to rebuild to fix it.

[Screenshot: Screen Shot 2023-12-21 at 10 51 15 AM]

Steps to reproduce

Currently unknown. Typically, the bug appears after waiting around a week.

Fixing the symptoms

we may not be able to fix this, but we should rebuild the app when it stops responding. search must work.

  • make a cron job / GitHub Action which checks the status of the API (it returns 500 when this bug occurs) and rebuilds the app when it's not working
    • if a rebuild or two doesn't fix it, we shouldn't keep trying to rebuild.
  • the cron job should ping discord #dev-alerts whenever it triggers a rebuild (there's a secret saved for the org)
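
The checklist above could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `check`, `rebuild`, and `alert` are hypothetical callables standing in for the API status check, the app rebuild, and the Discord #dev-alerts ping, and the cap of two rebuilds is an assumption taken from "a rebuild or two".

```python
from typing import Callable

MAX_REBUILDS = 2  # assumption: "a rebuild or two" from the issue text


def run_health_check(
    check: Callable[[], int],
    rebuild: Callable[[], None],
    alert: Callable[[str], None],
    max_rebuilds: int = MAX_REBUILDS,
) -> int:
    """Rebuild while the API reports a server error; return rebuilds triggered."""
    rebuilds = 0
    # check() is a hypothetical callable returning the HTTP status code.
    while check() >= 500:
        if rebuilds >= max_rebuilds:
            # Stop retrying so a persistent failure does not loop forever.
            alert(f"search still failing after {rebuilds} rebuilds; giving up")
            break
        rebuilds += 1
        # Alert on every rebuild, as the issue asks.
        alert(f"search returned a server error; triggering rebuild {rebuilds}")
        rebuild()
    return rebuilds
```

Passing the three steps in as callables keeps the retry cap testable without touching the real API or Discord.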
@josh-chamberlain
Contributor Author

This has recurred. Adjusting the issue to regularly check status and trigger a rebuild if it fails.

@maxachis
Contributor

maxachis commented Jun 5, 2024

I created a repository at https://github.com/Police-Data-Accessibility-Project/health-monitoring which I'll use to develop health monitoring scripts. It'll be fairly simple to start with, but its functionality is distinct enough that I think even now it's useful to keep it separate from other repositories. Over time, we can expand it with more in-depth health checks.

@josh-chamberlain
Contributor Author

@maxachis thanks, that's a good idea. I added a README and LICENSE.

@maxachis
Contributor

maxachis commented Jun 6, 2024

@josh-chamberlain I'll need the WEBHOOK_URL that is used to post dev alerts to discord in order to finish setting up a basic alerting system.

Current plan is to be conservative and have it call the search endpoint every hour. After that, I'll look into using the Digital Ocean API to trigger rebuilds on failure, but first step is to make sure it works properly and doesn't count false positives, which unfortunately means we may have to wait a little while until it fails again.
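
A rough sketch of the pieces this plan needs, using only the Python standard library. Everything here is an assumption for illustration: the function names are hypothetical, the search URL and credentials are not given in this thread, treating only 5xx responses as failures is a guess at how to avoid false positives, and the rebuild request assumes the app runs on DigitalOcean App Platform (the deployments endpoint should be checked against the DigitalOcean API docs before use).

```python
import json
import urllib.error
import urllib.request

# Hypothetical constants; real URLs, IDs, and tokens are not in this thread.
DO_API_BASE = "https://api.digitalocean.com/v2"
USER_AGENT = "health-monitoring-sketch"  # assumption: explicit agent string


def search_is_healthy(url, opener=urllib.request.urlopen, timeout=30):
    """Return False on a 5xx response or when the request cannot complete."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status < 500
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx; only 5xx counts as the bug here.
        return err.code < 500
    except OSError:
        # Covers URLError and timeouts (connection refused, DNS failure, etc.).
        return False


def discord_alert_request(webhook_url, message):
    """Build (but do not send) a Discord webhook POST carrying `message`."""
    body = json.dumps({"content": message}).encode("utf-8")
    headers = {"Content-Type": "application/json", "User-Agent": USER_AGENT}
    return urllib.request.Request(
        webhook_url, data=body, headers=headers, method="POST"
    )


def rebuild_request(app_id, token):
    """Build (but do not send) an App Platform forced-deployment request."""
    body = json.dumps({"force_build": True}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        f"{DO_API_BASE}/apps/{app_id}/deployments",
        data=body,
        headers=headers,
        method="POST",
    )
```

Each request object would be sent with `urllib.request.urlopen(request)`; an hourly cron entry could call `search_is_healthy` and send the alert only when it returns `False`.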

@josh-chamberlain
Contributor Author

@maxachis sounds like a good plan. DMing it to you, and adding it as an org-level secret...one day we could rename it, probably. i'll work on wrangling our secrets a bit better so we know what they're for.

josh-chamberlain moved this from Todo to In Progress in Open issues & Roadmap on Jun 11, 2024
@maxachis
Contributor

@josh-chamberlain I've finished up the first draft of the health-monitoring repository and set it up in the "Automation-Manager" Droplet (formerly "Database-Automation-Manager", but renamed since this part doesn't touch the database).

As designed, the manager will log errors to Discord, and log all events to a rotating log in the root directory of the repository, which rotates every day at midnight. For now, the log exists only so we can confirm that the monitor behaves as intended, but it could be expanded (and the rotation interval changed) in the future.
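
For reference, a midnight-rotating log like the one described can be set up with Python's standard `logging.handlers.TimedRotatingFileHandler`. This is a sketch under assumptions, not the repository's actual code: the logger name, file name, format, and `backup_count` are hypothetical.

```python
import logging
from logging.handlers import TimedRotatingFileHandler


def build_logger(log_path="health_monitor.log", when="midnight", backup_count=7):
    """Return a logger that writes to a time-rotated file.

    `when="midnight"` rolls the file over daily at midnight; `when="W0"`
    would roll it over weekly on Mondays. All defaults are illustrative.
    """
    logger = logging.getLogger("health_monitoring")  # hypothetical name
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching duplicate handlers on re-import
        handler = TimedRotatingFileHandler(
            log_path, when=when, backupCount=backup_count
        )
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

Changing the rotation interval is then a one-argument change to `when`, and `backup_count` controls how many old files are kept.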

Now it's going to be a waiting game until the search fails. If everything works properly, it'll post to discord when it occurs.

Aside

We may want to add documentation for the Automation-Manager droplet, discussing what it does and what repositories it hosts.

@josh-chamberlain
Contributor Author

@maxachis ah, thank you! That's great. A rotating log is great. Could we do it every week or 3 days, instead, just so they can be inspected even after a weekend or whatever?

Yes, documentation for the automation-manager droplet is critical because it's difficult for me (and other less-technical or less-paying-attention people) to tell how things are deployed. I just asked on a separate issue if we were ready to make a github action for this.

  1. in the README of this repo, we should say where it's deployed / how often it is triggered
    • if i make and merge changes, will they auto-deploy or do i need to do something?
    • how can i check to verify that this is still running / see the last time it ran?
  2. that might be it. long-term we should have some kind of statuspage but I think we're good for now

out of curiosity: what makes the droplet better than periodically triggering a github action? I like the Action because it's pretty transparent/happens next to the code, but I'm sure you have more control this way.

@maxachis
Contributor

maxachis commented Jun 13, 2024

> out of curiosity: what makes the droplet better than periodically triggering a github action? I like the Action because it's pretty transparent/happens next to the code, but I'm sure you have more control this way.

@josh-chamberlain Control is a major component of it, but so is ease of development. Using GitHub Actions for more complex operations, such as prod-to-dev migration or health monitoring, has been challenging in my experience due to several interrelated factors:

  1. GA involves building up and tearing down environments. As repositories increase in size or scope, this adds operation time for what is effectively a redundant step, compared to DO, where the environment persists between runs. That can become a cost issue, especially if we're running lengthy operations multiple times a day. For smaller actions in particular, setting up the environment and installing dependencies can take considerably longer than executing the operation itself.
  2. It is generally more difficult to debug in GA. There's often a delay in triggering a GA, and even if it's half a minute, that adds up quickly if I'm checking on something iteratively, and it takes even longer if I have to wait for a setup to complete. I also can't step-by-step debug a GA as I can with other components. In DO, it's much easier to debug, run, and verify, because I can do it all from an Ubuntu command line.
  3. I have better insight into the environment in DO compared to GA. While GA does use an Ubuntu environment, I can't access that environment via a shell and poke around to see what exists, what doesn't, and how commands are interpreted.

That being said, there are options that allow us to blend the two approaches:

  • One option involves self-hosted runners on GitHub, which would in theory allow us to bridge the gap between GA and DO. This would require more learning and setup, however, and I can't assess its utility well until I get deeper into implementing it.
  • Another is to use GA to trigger operations in DO, and report back on the messages received. Although this means that we're now using both systems rather than one consistently, it would allow us to keep GAs lightweight and maintain most of the development in DO. But it would also involve additional logic.

@maxachis
Contributor

@josh-chamberlain Additionally, I created an issue for adding documentation about the Automation Manager.

@josh-chamberlain
Contributor Author

thanks @maxachis, this is helpful. I'll save it in our ADRs so we know when/why to abandon a github action prototype for a deployed thing.

@josh-chamberlain
Contributor Author

(here's the ADR I made retroactively, visible if you have notion perms)

@maxachis
Contributor

maxachis commented Jun 13, 2024

> (here's the ADR I made retroactively, visible if you have notion perms)

I will say that GitHub Actions is the better option for things that interface directly with the repository they live in and where we want to inspect changes during or immediately after pull requests. Tests, linting, and security checks all make sense to keep as GitHub Actions, because:

  1. Libraries for these tasks (such as pytest or bandit) already tend to support GitHub Actions out of the box, with minimal configuration.
  2. It enables easy review at the point of pull requests, allowing us to quickly diagnose problems before they're merged. By contrast, something like prod-to-dev migration or health monitoring doesn't provide feedback immediately relevant to the associated repositories.

Additionally, we may benefit from synchronizing all our different workflows through something like Jenkins, which would help formalize the more complex CI/CD processes and integrate them under a singular user interface.

@maxachis
Contributor

maxachis commented Jun 16, 2024

> @maxachis ah, thank you! That's great. A rotating log is great. Could we do it every week or 3 days, instead, just so they can be inspected even after a weekend or whatever?
>
>   1. in the README of this repo, we should say where it's deployed / how often it is triggered
>     • if i make and merge changes, will they auto-deploy or do i need to do something?
>     • how can i check to verify that this is still running / see the last time it ran?
>   2. that might be it. long-term we should have some kind of statuspage but I think we're good for now

@josh-chamberlain Repository updated to have log rotate every week, and README updated to include the requested information! Have a look at the Readme and let me know if it looks good to you.

@josh-chamberlain
Contributor Author

@maxachis nice! thank you. I'm good to close this as can't repro and consider it closed while we let health-monitoring do its thing.

josh-chamberlain closed this as not planned on Jun 17, 2024
github-project-automation bot moved this from In Progress to Done in Open issues & Roadmap on Jun 17, 2024