N8N Setup and Troubleshooting
Mode: Self-hosted on Cloudron (memory limit: 5 GB; shared storage space available: 50 GB+)
URL: https://n8n.openfoodnetwork.org.uk/ (common URL for all instances)
Crash Alert: Uptime Kuma (only works while Cloudron itself is running), Datadog (being explored)
Slack Channel: #n8n
Version: n8n gets auto-updates
App Title and Version (16th May 2023): n8n 0.228.1
Package Version (16th May 2023): v2.42.0
Backup: Automatic backup has been turned off due to past crashes. It will be turned back on once an alternative backup location is decided (an ongoing priority task). n8n still creates a backup every time the app is updated.
On this page we cover the steps to take when n8n crashes, is unavailable, or doesn’t respond. For common issues in a workflow or in specific modules, see the Common Issues, Tips, and Tricks page in the API handbook.
Sometimes, while running a big workflow (e.g. getting a few thousand products via the GET Bulk Products API call), n8n may become unresponsive. It will be stuck at one node for a long time or show an error. Sometimes, when you try to stop the workflow, it will say “The workflow execution is probably still running but it may have crashed and n8n cannot safely tell”.
In these cases n8n has basically run out of memory. Because all the n8n instances for OFN share storage and memory, n8n may become unresponsive if multiple memory-intensive workflows are running at the same time.
What to do (general):
- Optimise your workflow by following the tips in #n8n good practices (to be added soon) to reduce its memory consumption.
- The memory allocated to n8n can be increased, but this has a cost. Post in the n8n group if the step above doesn’t solve the problem; someone may be able to help.
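One common way to reduce memory use in a large job is to process records in small batches instead of holding thousands of products at once. The sketch below shows the idea in plain Python. It is an illustration only: `fetch_page` and `fake_fetch_page` are hypothetical stand-ins for a paginated API call (not the actual OFN endpoint), and the page size of 100 is an arbitrary assumption. Inside n8n, the Split In Batches node can play a similar role.

```python
from typing import Callable, Iterator, List


def iter_pages(fetch_page: Callable[[int, int], List[dict]],
               page_size: int = 100) -> Iterator[List[dict]]:
    """Yield one page of records at a time so only one page is held in memory."""
    page = 1
    while True:
        records = fetch_page(page, page_size)
        if not records:  # an empty page means there is nothing left to fetch
            break
        yield records
        page += 1


def fake_fetch_page(page: int, page_size: int) -> List[dict]:
    """Hypothetical stand-in for a paginated products API (250 fake products)."""
    total = 250
    start = (page - 1) * page_size
    end = min(start + page_size, total)
    return [{"id": i} for i in range(start, end)]


def count_products(fetch_page: Callable[[int, int], List[dict]],
                   page_size: int = 100) -> int:
    """Process every page in turn; here the 'work' is simply counting records."""
    processed = 0
    for records in iter_pages(fetch_page, page_size):
        processed += len(records)  # replace with the real per-batch work
    return processed
```

Because each page is discarded before the next one is fetched, peak memory depends on the page size rather than on the total number of products.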
n8n can become unavailable or crash for a variety of reasons (Cloudron running out of disk space, memory issues, Cloudron being down, or simply a bug). In these cases n8n will be unavailable for all instances, and active/scheduled workflows will not run.
Note: OFN is hosting n8n on Cloudron. Therefore, if Cloudron is down, n8n will also be unreachable.
Alert/Notification:
There is an alert set up (as shown below) to post in the #n8n channel if n8n crashes.
Note: This only works if n8n crashes while Cloudron is still working. Other, more reliable notification/alert methods are being explored at the moment (e.g. Datadog).
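The distinction in the note above (n8n down vs. Cloudron down) can be sketched as a small check that would need to run from a machine outside Cloudron. This is not the existing alert, only an illustration. It assumes n8n's `/healthz` endpoint is reachable on this deployment, and `CLOUDRON_URL` is a hypothetical placeholder that would have to be replaced with the real Cloudron address.

```python
import urllib.request
from typing import Callable

# Assumption: the n8n health endpoint is exposed under the shared n8n URL.
N8N_HEALTH_URL = "https://n8n.openfoodnetwork.org.uk/healthz"
# Hypothetical placeholder; replace with the real Cloudron dashboard URL.
CLOUDRON_URL = "https://cloudron.example.org/"


def is_up(url: str, timeout: float = 10.0) -> bool:
    """Return True if the URL answers with a non-error HTTP status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:  # covers URLError, HTTPError, and timeouts
        return False


def diagnose(probe: Callable[[str], bool] = is_up) -> str:
    """Tell apart 'n8n crashed' from 'Cloudron (and therefore n8n) is down'."""
    if probe(N8N_HEALTH_URL):
        return "n8n is responding"
    if probe(CLOUDRON_URL):
        return "n8n is down but Cloudron is up"
    return "Cloudron appears to be down, so n8n is unreachable too"
```

The `probe` argument exists so the decision logic can be tried with a fake probe, without making real network calls.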
What to do (general):
- Post in Slack (#n8n channel).
- Resolving an n8n crash might involve dev ops, so it can take some time before n8n is back online.
- The impact of an n8n crash will differ between instances.
  - E.g. if n8n crashes on a weekday morning (Australian time), it will affect some of the active/customer-centric workflows and the draft workflows being built at that time in Australia, making it important for Australia to get n8n back up and running. At the same time, it will have significantly less impact on European instances, as it will be night time for them, so there will possibly be fewer scheduled and draft workflows.
  - Therefore, it is important to communicate how urgently you need n8n back up. Since we don’t have a fixed timeframe for resolving an n8n crash, it will depend on the availability of dev ops and the urgency.
- Message in the #n8n channel if you have critical workflows which need to run.
- Make a list of active workflows, or any other workflows you remember, which are supposed to be running while n8n is down. Communicate to clients or team members if you have to.
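The time-zone reasoning above can be sketched as a quick calculation: given the crash time in UTC, work out the local time for each instance and whether it falls within working hours. The offsets and the 09:00 to 17:00 weekday window below are illustrative assumptions (fixed offsets that ignore daylight saving), not OFN policy.

```python
from datetime import datetime, timedelta, timezone

# Illustrative fixed UTC offsets (assumptions; they ignore daylight saving).
INSTANCE_OFFSETS = {
    "Australia (east coast)": timezone(timedelta(hours=10)),
    "UK": timezone(timedelta(hours=0)),
}


def local_impact(crash_utc: datetime) -> dict:
    """Map each instance to (local time, 'working hours' or 'out of hours')."""
    result = {}
    for name, tz in INSTANCE_OFFSETS.items():
        local = crash_utc.astimezone(tz)
        # Assumed working window: Monday to Friday, 09:00 to 17:00 local time.
        working = local.weekday() < 5 and 9 <= local.hour < 17
        label = "working hours" if working else "out of hours"
        result[name] = (local.strftime("%a %H:%M"), label)
    return result


# Example: a crash late on a Monday evening in UTC.
crash = datetime(2023, 5, 15, 23, 30, tzinfo=timezone.utc)
impact = local_impact(crash)
```

For this example the crash lands on Tuesday morning for the Australian instance and late Monday night for the UK instance, which matches the scenario described in the list above.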
What to do (troubleshooting; requires access to Cloudron and should be done by dev ops or someone with an understanding of Cloudron):
- Check if [Cloudron] is still working.
- Check and follow the steps from the [Cloudron troubleshooting page]. Some of the common troubleshooting methods are:
  - Restarting the app from Cloudron:
    - Go to the Cloudron Dashboard and open the settings for n8n (click on the settings/gear icon on the n8n tile, as shown in the image below).
    - Go to the Repair section and then click Restart App (shown in the image below).
  - Enabling Recovery mode (similar process).
  - Increasing the memory limit for the app (n8n settings -> Resources).
- Restoring n8n from the backup: if nothing else works, we can restore n8n from the last backup. However, we will lose all data created after that backup (changes to workflows, new workflows, execution logs, etc.).