[Upgrade Ubuntu] us_prod #965

Open
1 of 50 tasks
Tracked by #157
dacook opened this issue Nov 21, 2024 · 2 comments

dacook commented Nov 21, 2024

Slack thread: #instance-managers

1. Setting up the new server

  • Check old server config for any additional services to be aware of. Document any necessary steps for migration. Eg:
    • ls /etc/nginx/sites-enabled
    • systemctl --state=running
  • Hosting: provision new server with Ubuntu 20
  • DNS: add temporary domain (eg prod2.openfoodnetwork.org)
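The checks on the old server can be captured to files so they can be compared against the new server later. A minimal sketch: the helper name `list_enabled_sites` and the output file names are hypothetical and not part of ofn-install.

```shell
#!/bin/sh
# Hypothetical helper: list the entries of an nginx sites-enabled directory,
# one per line and sorted, so the output of two servers can be diffed.
list_enabled_sites() {
    dir="${1:-/etc/nginx/sites-enabled}"
    # Fail without output if the directory does not exist.
    [ -d "$dir" ] || return 1
    ls -1 "$dir" | sort
}

# Possible usage on the old server (output paths are assumptions):
#   list_enabled_sites > ~/old_sites.txt
#   systemctl --state=running --no-legend > ~/old_services.txt
```

Running the same two commands on the new server after provisioning and diffing the files would show any service that still needs migrating.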

config

  • Add temporary name to inventory/hosts
  • Review host_vars/x/config.yml, clean up if needed
    • Make a copy for the temp hostname, add temp domain to bottom of certbot_domains
  • Review ofn-secrets:x_prod/secrets.yml, clean up if needed
    • Change to shared bugsnag projects
    • Don't bother making a copy of this one

setup

Enable passthrough on current server to allow new server to generate a certificate:

  • ansible-playbook playbooks/letsencrypt_proxy.yml -l x_prod -e "proxy_target=<new_ip>"

Then set up the new server. Ensure you have the correct secrets (current secrets are usually fine).
ansible-playbook -l x_prod2 -e "@../ofn-secrets/x_prod/secrets.yml" playbooks/

  • setup.yml
  • provision.yml
  • deploy.yml
  • db_integrations (Permit DB access for n8n, Metabase)
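The playbooks above need to run in order, and it only makes sense to continue if the previous one succeeded. A rough sketch of a wrapper: `run_playbooks` and the `RUN` switch are hypothetical names, and `db_integrations` is left out because its exact file name is not spelled out above.

```shell
#!/bin/sh
# Hypothetical wrapper: print (or run) the given playbooks in order,
# stopping at the first failure.
# Set RUN=1 to execute; by default the commands are only printed.
run_playbooks() {
    limit="$1"; secrets="$2"; shift 2
    for playbook in "$@"; do
        if [ "${RUN:-0}" = "1" ]; then
            ansible-playbook -l "$limit" -e "@$secrets" "playbooks/$playbook" || return 1
        else
            echo "ansible-playbook -l $limit -e @$secrets playbooks/$playbook"
        fi
    done
}

run_playbooks x_prod2 ../ofn-secrets/x_prod/secrets.yml setup.yml provision.yml deploy.yml
```

Printing by default makes it easy to review the exact commands before committing to a run.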

initial migration

  • Ensure sidekiq is disabled, to avoid creating subscription orders when data is loaded:
    sudo systemctl stop sidekiq && sudo systemctl disable sidekiq
  • Set up direct SSH access for ofn-admin and openfoodnetwork as per guide
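Before loading data it may be worth confirming that sidekiq really is stopped and disabled. A minimal sketch, assuming the hypothetical helper name `unit_is_off`; it relies only on `systemctl is-active` and `systemctl is-enabled`.

```shell
#!/bin/sh
# Hypothetical check: succeed only if the unit is reported as
# "inactive" and "disabled". Any other state (or a missing systemctl)
# counts as "not off", which is the conservative answer.
unit_is_off() {
    unit="$1"
    # Both subcommands print the state and exit non-zero when the unit is
    # not active/enabled, so capture the text and ignore the exit status.
    active=$(systemctl is-active "$unit" 2>/dev/null || true)
    enabled=$(systemctl is-enabled "$unit" 2>/dev/null || true)
    [ "$active" = "inactive" ] && [ "$enabled" = "disabled" ]
}

# Possible usage on the new server:
#   unit_is_off sidekiq && echo "safe to load data" || echo "sidekiq is still on"
```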

ansible-playbook -l x_prod -e rsync_to=x_prod2 playbooks/

  • db_transfer.yml
  • transfer_assets.yml

Make sure to clear cache so that instance settings are applied:
cd ~/apps/openfoodnetwork/current; bin/rails runner -e production "Rails.cache.clear"

2. Testing

  • test reboot
  • send test mail (/admin/mail_methods/edit).
  • terms of service file: /admin/terms_of_service_files
  • shop catalogue displays correctly, with images; add to cart, begin checkout, log in
  • note: check cookies if login won't work
  • Check integrations
    • Payments (check Stripe connect status /admin/stripe_connect_settings/edit)
    • New Relic
    • Bugsnag

3. Migration

preparation

  • new server: bin/rake db:reset -e production (important: make sure you're on the new server!)
  • deploy.yml -l x_prod2 -e "git_version=vX.Y.Z" matching version with current prod
  • old server: make a tiny data change to verify later (eg add . in meta description /admin/general_settings/edit)

switchover: old server

  • 🚧 maintenance_mode.yml
  • sudo systemctl stop sidekiq redis-jobs puma
  • Transfer /var/lib/redis-jobs/dump.rdb to new server (see guide)
  • db_transfer.yml ~3min
  • sudo systemctl stop postgresql (ensure other integrations no longer touch it)
  • transfer_assets.yml just in case

switchover: new server

  • sudo systemctl restart puma; sudo systemctl start sidekiq redis-jobs
  • Rails.cache.clear (or migrate redis-cache/dump.rdb also)
  • ⏭️ temporary_proxy.yml -e 'proxy_target=<ip>' redirect traffic to new prod
    • Note: this doesn't include webservices, and doesn't handle images. So it's a very short-term fix if at all.
    • Use a hosts file entry to test a direct connection
  • Check there are no alarm bells, eg:
    • ~/apps/openfoodnetwork/current/logs/production.log and sidekiq.log
    • tiny data change is present. undo it.
    • shopfront and checkout look good
    • upload a product image
    • get confirmation from local team
  • Update DNS to point to new server
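For the direct-connection test mentioned above, a hosts file entry works, and `curl --resolve` is an alternative that avoids editing `/etc/hosts`. A minimal sketch: the helper names are hypothetical, and `203.0.113.10` and `example.org` are placeholders for the new server's IP and the production domain.

```shell
#!/bin/sh
# Hypothetical helpers; the IP and domain used below are placeholders.

# Print a line suitable for /etc/hosts that pins a domain to the new server.
hosts_entry() {
    printf '%s %s\n' "$1" "$2"
}

# Print a curl command that fetches response headers from the new server
# directly, without touching /etc/hosts.
direct_check_cmd() {
    printf 'curl -sI --resolve %s:443:%s https://%s/\n' "$2" "$1" "$2"
}

hosts_entry 203.0.113.10 example.org
direct_check_cmd 203.0.113.10 example.org
```

The hosts file approach is better for clicking through the shopfront in a browser; the `curl` form is handy for a quick check that the new server answers for the production hostname.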

4. Cleanup (after 48hrs)

Rollback plan

  • If an error occurs before the temporary proxy is active and can't be resolved quickly, restore service on the current server.
  • If an error occurs after the proxy is active, users may have interacted with the new server (eg made payments).
    • If serious, consider putting the new server into maintenance mode (and stopping sidekiq) to avoid further changes.
    • Otherwise, seek to resolve the issue in place.
@dacook dacook changed the title us_prod [Upgrade Ubuntu] us_prod Nov 21, 2024
@dacook
Member Author

dacook commented Nov 21, 2024

There are some subdomains pointing to the server (see Cloudflare DNS), but they appear to be simple redirects (probably set up in nginx)

  • meet.openfoodnetwork.net
  • donate.openfoodnetwork.net

@dacook dacook self-assigned this Nov 21, 2024
@dacook dacook moved this from All the things 💤 to In Progress ⚙ in OFN Delivery board Nov 21, 2024
@lauriewayne

> There are some subdomains pointing to the server (see Cloudflare DNS), but they appear to be simple redirects (probably set up in nginx)
>
>   • meet.openfoodnetwork.net
>   • donate.openfoodnetwork.net

Yep! Cloudflare gives us three redirects and we only use two. We use them a good amount.
