RAM Usage issue with m3u8 videos #8
Comments
Hi! The M3U8 code is largely from the original codebase; regrettably I am not an expert in M3U8 and stream processing in general, and in Python in particular. All TS streams are plucked from the M3U8, downloaded and then re-muxed (demuxed/muxed). I don't know if memory usage already explodes during the downloads or only during the re-muxing process. I see Avnsx properly uses a streamed web request, which saves memory, but all streams are downloaded almost at once using a thread pool. For the re-muxing I currently lack the knowledge. I can do some more research and try to debug it, but without a concrete sample where this can be observed it is even more difficult. I also had to postpone some important private stuff due to #3, so I beg your patience.
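For reference, a minimal sketch (not the actual project code) of the pattern described above: each segment is fetched with a streamed request, but the chunks still end up in a memory buffer, and a thread pool downloads many segments almost at once. The names (`download_segment`, `segment_urls`) and chunk sizes are assumptions.

```python
import io
from concurrent.futures import ThreadPoolExecutor

import requests


def download_segment(url: str) -> bytes:
    """Stream a single .ts segment, but buffer it entirely in RAM."""
    buffer = io.BytesIO()
    with requests.get(url, stream=True, timeout=30) as response:
        response.raise_for_status()
        for chunk in response.iter_content(chunk_size=64 * 1024):
            buffer.write(chunk)  # chunks are streamed, but still collected in memory
    return buffer.getvalue()


def download_all(segment_urls: list[str]) -> list[bytes]:
    # A thread pool fetches many segments concurrently, so peak memory
    # is roughly the sum of all segment sizes.
    with ThreadPoolExecutor(max_workers=8) as pool:
        return list(pool.map(download_segment, segment_urls))
```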
Now I see it, you are absolutely right - although the downloads are streamed/chunked, everything goes into a memory buffer, and all .ts file contents are then collected/merged - also in memory! It will take quite some effort to re-write the code to use temporary files on disk instead without breaking anything in the process. This may take 14 days or so according to my schedule; if I get a free spot maybe earlier, but no promises.
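A hedged sketch of what such a rewrite could look like - segments streamed straight to temporary files and merged on disk, so peak RAM stays around one chunk rather than the whole video. All names and the exact flow are assumptions, not the eventual implementation.

```python
import shutil
import tempfile
from pathlib import Path

import requests


def download_segment_to_disk(url: str, directory: Path, index: int) -> Path:
    """Stream a single .ts segment directly into a file on disk."""
    target = directory / f"segment_{index:05d}.ts"
    with requests.get(url, stream=True, timeout=30) as response:
        response.raise_for_status()
        with open(target, "wb") as file:
            for chunk in response.iter_content(chunk_size=64 * 1024):
                file.write(chunk)
    return target


def merge_segments(segment_files: list[Path], output: Path) -> None:
    """Concatenate segment files on disk without loading them into RAM."""
    with open(output, "wb") as merged:
        for segment in segment_files:
            with open(segment, "rb") as part:
                shutil.copyfileobj(part, merged, length=64 * 1024)


def download_video(segment_urls: list[str], output: Path) -> None:
    with tempfile.TemporaryDirectory() as tmp:
        tmp_dir = Path(tmp)
        files = [
            download_segment_to_disk(url, tmp_dir, i)
            for i, url in enumerate(segment_urls)
        ]
        merge_segments(files, output)
```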
That would explain why RAM usage is about double the size of the file. That's roughly how I guessed it was working from a quick look at the code. No worries though, I'm not in a rush - it works, slowly, but it works. I was just sharing my discovery 😄 I run it fully headless and noticed the issue when my monitoring started alerting me about all my services going down lol
Thanks a lot for sharing - I neither know any 4K creators nor would I have noticed on my gaming PC 😂 (shame on me). Glad this is just a private NAS and nothing critical 😂
Writing a little scraper for another site I learned more about M3U8, MPEG-TS and ffmpeg, and I plan on moving to this new way of downloading and processing. This might, however, break de-duplication for existing videos and re-download them. I also hope this will package properly. Stay tuned.
Hi, you might try this version, but note the warning - try with a different folder or back up your existing creator(s). Though I have some ideas, I do not yet have a solution for the de-duplication thing, as the files essentially become different files when ffmpeg merges them properly. I'm not sure whether this will work in your Docker container. It didn't work on WSL on a mounted project directory, but I didn't try on native Linux. There might be an issue regarding pyffmpeg and quoting as mentioned on their GitHub, but the error message is different. https://github.com/prof79/fansly-downloader-ng/releases/tag/ondemand
Very quick - and I wanted to write that I might have dispelled my Linux doubts in a few days :D Actually there are two ways to do such concat files - relative or absolute - and absolute would require an unsafe flag. Relative paths are resolved relative to where the concat file is located, so that should be no problem in this case, as I also specify the list file name in a fully-qualified manner. I rather suspect, but could not yet test it due to hashing headaches, that … I'm under the impression that …
I'm curious about this - do you know of any doc I could read about that? I never knew absolute paths were unsafe.
Could be a relatively quick and easy fix, yes.
Yeah, and it gives more control like exception handling - I can also be fast and should probably have started out as a psychic 😂 Try this: https://github.com/prof79/fansly-downloader-ng/releases/tag/ondemand - 176c42f 😁
Regarding your question:
https://trac.ffmpeg.org/wiki/Concatenate
https://ffmpeg.org/ffmpeg-formats.html#concat-1 -> see 3.5.2
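To illustrate what those docs describe, here is a rough Python/subprocess sketch of the concat demuxer with a relative-path list file; the commented-out `-safe 0` line is where absolute (or otherwise "unsafe") paths would need the flag. Paths and function names are made up, and this is not necessarily how the release above invokes ffmpeg.

```python
import subprocess
from pathlib import Path


def concat_segments(segment_files: list[Path], output: Path) -> None:
    # Assume all segments live in the same directory; the list file goes there
    # too, because entries are resolved relative to the list file's location.
    work_dir = segment_files[0].parent
    list_file = work_dir / "concat_list.txt"

    lines = [f"file '{segment.name}'" for segment in segment_files]
    list_file.write_text("\n".join(lines) + "\n", encoding="utf-8")

    subprocess.run(
        [
            "ffmpeg",
            "-f", "concat",
            # "-safe", "0",  # only needed for absolute or otherwise "unsafe" paths
            "-i", str(list_file),
            "-c", "copy",    # re-mux without re-encoding
            str(output),
        ],
        check=True,
    )
```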
I've also implemented a new selective MP4 hashing algorithm that ignores stupid lavf version info or re-muxing "artifacts" like deviating bitrates and stuff in the header, although the track data is identical to the old manual method. Having an opt-in, more succinct file naming scheme, probably using a CRC, is also on my personal wishlist. But I still don't know whether using pHash for images is currently a beneficial or detrimental thing ...
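As an illustration of the selective-hashing idea (an assumption about the approach, not the project's actual algorithm): one could hash only the top-level 'mdat' payload of an MP4 and skip the header/metadata boxes that re-muxing tends to touch, so encoder tags and bitrate hints no longer change the hash.

```python
import hashlib
import struct
from pathlib import Path


def selective_mp4_hash(path: Path) -> str:
    """Hash only the media payload ('mdat' boxes), skipping metadata boxes."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            header = f.read(8)
            if len(header) < 8:
                break
            size, box_type = struct.unpack(">I4s", header)
            header_size = 8
            if size == 1:
                # 64-bit "largesize" follows the type field
                size = struct.unpack(">Q", f.read(8))[0]
                header_size = 16
            elif size == 0:
                # Box extends to the end of the file
                size = header_size + _remaining(f)
            payload = size - header_size
            if box_type == b"mdat":
                # Hash the payload in chunks to keep memory usage flat
                left = payload
                while left > 0:
                    chunk = f.read(min(left, 1024 * 1024))
                    if not chunk:
                        break
                    digest.update(chunk)
                    left -= len(chunk)
            else:
                f.seek(payload, 1)  # skip header/metadata boxes entirely
    return digest.hexdigest()


def _remaining(f) -> int:
    """Bytes left from the current position to the end of the file."""
    pos = f.tell()
    f.seek(0, 2)
    end = f.tell()
    f.seek(pos)
    return end - pos
```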
It seems to work fine. I lifted the Docker limits and I'm trying a full scrape of the creator that raised the issue. Happy NAS!
Oh, I see! Never thought of that.
Awesome! 😁🙏 We can leave this open if you want to do some more testing - you could close it yourself, or I could close it with the next main branch release tomorrow or later; I also need to write up some explanation/release notes, but not today ...
Blue is the RAM. Yes, the most I've seen used so far is around 900 MB, but most of the time it's around 180-200 MB. From what I see, all the big 4K vids are downloaded with no issue. I guess we can close.
Well, tbh I've not monitored/checked what RAM usage the ffmpeg binary contributes during a merge, but sounds OK I guess. Cache is cache 😁 - all the stuff from disk that is used by the NAS OS/services/Docker and identified as potentially required often is proactively loaded/stored in RAM - aka cached - since RAM is just so much faster than even the fastest SSD can be. What is more, this also ensures good use of your RAM instead of it sitting largely empty all the time 😁 Also, stuff written back to disk may get buffered (cached) in RAM to speed things up and cut the disks some slack. But cache can be freed/shrunk by the OS as needed.
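As a small illustration of that last point (assuming the third-party psutil package on Linux), the page cache shows up as reclaimable memory rather than memory that is truly used up:

```python
import psutil

# On Linux, memory held by the page cache still counts as "available",
# because the kernel can reclaim it whenever a process needs the RAM.
mem = psutil.virtual_memory()
print(f"total:     {mem.total / 2**30:.1f} GiB")
print(f"used:      {mem.used / 2**30:.1f} GiB")
print(f"cached:    {mem.cached / 2**30:.1f} GiB (reclaimable page cache)")
print(f"available: {mem.available / 2**30:.1f} GiB (free + reclaimable)")
```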
Hi!
I dockerized your fork and run it on a NAS with 8 GB of RAM.
A creator I follow posts 15-20 minute 4K videos (1.5 to 3 GB file size) that make the scraper's RAM usage explode and saturate all 8 GB of RAM on the NAS. At most, the scraper was using close to 6 GB.
As you can see in the screenshot above, the NAS starts aggressively killing everything to get RAM back (all containers and NAS services).
I was able to work around the issue by limiting RAM usage with Docker limits, but the scraper runs super slowly because of it.
I'm under the impression that m3u8 videos are fully downloaded into RAM before being written to a file - is that right?
Would this be something that could be mitigated ?