"Error: CUDA error: an illegal memory access was encountered" causes the server to stop working properly until a manual restart #14
Comments
Yes, I think I already fixed that; will push soon.
The fix was to disable async calls by adding this:
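The snippet itself isn't quoted here; presumably it was something along these lines, since the `CUDA_LAUNCH_BLOCKING` environment variable is the standard way to disable asynchronous CUDA launches in PyTorch (a sketch, not the actual commit):

```python
import os

# Force every CUDA kernel launch to run synchronously, so errors surface
# at the call site instead of at a later, unrelated call. Must be set
# before torch initializes CUDA.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch
```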
But it slows down the process a lot, so I removed it. The error is basically an out-of-GPU-memory error.
Speed is the most important thing, so it doesn't matter if an error occurs now and then, as long as the server restarts by itself afterwards.
Fixed by reinitializing the models instead of restarting the script. Let me know if it works.
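The commit isn't quoted, but the pattern is presumably something like the following sketch, where `load_models` and `run_pipeline` are hypothetical stand-ins for the loading and inference code in app-stream.py:

```python
import gc
import torch

models = load_models()  # hypothetical loader from app-stream.py

def process(image):
    global models
    try:
        return run_pipeline(models, image)  # hypothetical inference call
    except RuntimeError as e:
        if "CUDA" not in str(e):
            raise
        # Drop the broken model objects and reload them, instead of
        # restarting the whole script.
        del models
        gc.collect()
        torch.cuda.empty_cache()
        models = load_models()
        return run_pipeline(models, image)
```

One caveat: an illegal memory access usually corrupts the whole CUDA context, so reinitializing the models inside the same process may not always recover it.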
Thanks!!! I'll give it a try, but this error doesn't occur that often, so it takes a while to check if everything is ok.
It didn't take long for the error to appear. And until the server is restarted, the images will not be processed anymore.
Woah, how did you manage to run out of 12 GB of VRAM? I'm using a 6 GB GPU, and I have encountered this error only once. The parallel processing param is currently inefficient (at least for me) if a value of more than 2 is set. But you should never encounter this error if you have set it to 1.
The Parallel Processing parameter was set to 3...
There's a problem with the cache: on
Will fix this by getting the alt text from the img tag instead of the image name, queried by the selector set in the site configuration file (todo), if specified. This will also sort the images properly, as mangadex stores alt text in a pattern like C1-xxxx, C2-xxxx, etc. Even senkuro sets it as Страница 9, Страница 10 (or Page 9, Page 10), etc...
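A sketch of that approach, assuming BeautifulSoup for parsing; the selector parameter and the natural-sort key are assumptions about the planned implementation, not the project's actual code:

```python
import re
from bs4 import BeautifulSoup

def collect_images(html, img_selector="img"):
    soup = BeautifulSoup(html, "html.parser")
    # Take (alt, src) pairs from the tags matched by the configured selector.
    pages = [(tag.get("alt", ""), tag.get("src"))
             for tag in soup.select(img_selector)]

    def natural_key(pair):
        alt, _src = pair
        # Split "C2-0010" or "Страница 10" into text/number chunks so that
        # page 9 sorts before page 10 (a plain string sort would not).
        return [int(part) if part.isdigit() else part.lower()
                for part in re.split(r"(\d+)", alt)]

    return sorted(pages, key=natural_key)
```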
Utilising the entire VRAM is not a bad thing at all; it means the work is being done efficiently. But I don't think the parallel feature is working the way I intend it to, so I'll look into it.
I have not tried the latest changes, but it would often run out of GPU memory for me when trying to process multiple images at once, so I had it doing only one at a time. Even then, a large image might fail, so I added code to back off the requested image size when it caught an out-of-memory error.
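The back-off code itself isn't quoted; the idea is presumably something like this sketch, where `model.upscale` and the `scale` parameter are illustrative names:

```python
import torch

def upscale_with_backoff(model, image, scale=2.0, min_scale=1.0):
    """Retry at progressively smaller output sizes on GPU out-of-memory."""
    while scale >= min_scale:
        try:
            return model.upscale(image, scale)  # hypothetical model API
        except RuntimeError as e:
            if "out of memory" not in str(e):
                raise
            torch.cuda.empty_cache()  # release cached blocks before retrying
            scale *= 0.75  # request a smaller image and try again
    raise RuntimeError("image too large to process even at the minimum scale")
```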
For me it just slows down the process a lot, but it never runs out of memory. Even when I turn off denoise and the colorizer and use only the upscaler (making an already large image even larger), it still doesn't run out of memory.
Sometimes I have the same issue as @vatavian with failing to process some large images.
The issue has started occurring again :(
Can be done. Make a new Python script in the same folder as app-stream.py: restarter.py
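A minimal restarter.py consistent with this description (a sketch, since the original script isn't reproduced) could look like:

```python
# restarter.py - keep app-stream.py running, relaunching it whenever it exits.
import subprocess
import sys
import time

while True:
    # Run the server with the same Python interpreter; blocks until it exits.
    code = subprocess.call([sys.executable, "app-stream.py"])
    print(f"app-stream.py exited with code {code}; restarting in 3 seconds...")
    time.sleep(3)
```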
Then, in app-stream.py, replace this function:
With this function:
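Neither function is reproduced here; the intent is presumably to make the server exit on the fatal error, so the restarter relaunches it with a fresh CUDA context. A hypothetical sketch, with `process` standing in for the original function:

```python
import os

def safe_process(image):
    try:
        return process(image)  # original processing function (name assumed)
    except RuntimeError as e:
        if "illegal memory access" in str(e):
            # Kill the process immediately; restarter.py will start a new
            # one with a clean CUDA context.
            os._exit(1)
        raise
```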
Run the restarter instead of app-stream.py. I haven't tested it, but this or something like it should work.
Thanks a lot!
Sometimes the error mentioned in the title occurs, after which the server stops processing images and keeps giving this error.
If the server is manually restarted, it starts working correctly, continuing to process images.
Is it possible to make it restart automatically when such an error occurs?