
Performance issue (eating RAM) #383

Open
mbnoimi opened this issue May 27, 2024 · 4 comments
Labels: enhancement (New feature or request), help wanted (Extra attention is needed)

Comments

@mbnoimi

mbnoimi commented May 27, 2024

Hi,

I'm downloading a website to a depth of 3 within the same domain. My laptop has 16 GB of RAM.
Within less than 3 hours, the extension ate up to 90% of my RAM, which forced me to force-restart my laptop.
This issue occurs with big websites only (my website is about 1.5 GB, mostly pure HTML).

Is there any workaround for enhancing the performance?

  • Linux Mint 21.3 Xfce
  • Firefox 126.0 (64-bit)
  • Save captured data to: Scrapbook folder
  • Save captured data as: Folder
@danny0838
Owner

There's probably not much you can do besides upgrading the hardware. Saving to the backend server may be more performant in some cases, though.

danny0838 added the enhancement and help wanted labels on May 27, 2024
@mbnoimi
Author

mbnoimi commented May 27, 2024

> There's probably not much you can do besides upgrading the hardware. Saving to the backend server may be more performant in some cases, though.

I use WebHTTrack and it works pretty well, but for some reason my cookies don't work properly with it. That's why I use WebScrapBook: it handles cookies behind the scenes.

@mbnoimi
Author

mbnoimi commented May 27, 2024

> There's probably not much you can do besides upgrading the hardware.

BTW, why does WebScrapBook store all the scraped data in memory and then save it in the last step? Why doesn't it save pages one by one, just like wget and HTTrack do?

@danny0838
Owner

> BTW, why does WebScrapBook store all the scraped data in memory and then save it in the last step? Why doesn't it save pages one by one, just like wget and HTTrack do?

This is not true. Intermediate data is mostly saved to browser storage, which ultimately lives on disk in some form.

The browser extension API is so limited that it cannot load files that have already been downloaded to the local filesystem. When capturing multiple web pages, the saved pages need to be loaded again so that links to the other downloaded pages can be rewritten, which is not possible until all pages have been downloaded. As a result, we have to keep all downloaded pages in browser storage, rewrite them, and then save them to the local filesystem.
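
For illustration, a minimal TypeScript sketch of that workflow might look like the following. This is not WebScrapBook's actual code; the database name, store name, `urlToFilename` mapping, and `capture/` target folder are assumptions made up for this example.

```ts
// Sketch of the capture workflow described above (assumptions, not WebScrapBook's code).

// Firefox's promise-based WebExtension downloads API, declared so the sketch is self-contained.
declare const browser: {
  downloads: { download(opts: { url: string; filename: string }): Promise<number> };
};

const DB_NAME = "capture-cache"; // hypothetical names
const STORE = "pages";

function openDb(): Promise<IDBDatabase> {
  return new Promise((resolve, reject) => {
    const req = indexedDB.open(DB_NAME, 1);
    req.onupgradeneeded = () => req.result.createObjectStore(STORE);
    req.onsuccess = () => resolve(req.result);
    req.onerror = () => reject(req.error);
  });
}

// Step 1: during the crawl, each fetched page goes into browser storage (IndexedDB),
// because the extension cannot read back files it has already written to the local filesystem.
function cachePage(db: IDBDatabase, url: string, html: string): Promise<void> {
  return new Promise((resolve, reject) => {
    const tx = db.transaction(STORE, "readwrite");
    tx.objectStore(STORE).put(html, url);
    tx.oncomplete = () => resolve();
    tx.onerror = () => reject(tx.error);
  });
}

// Step 2: only after every page has been downloaded is the URL -> local filename mapping
// complete, so only then can links between captured pages be rewritten.
// (Assumes hrefs appear in the map exactly as written in the page.)
function rewriteLinks(html: string, urlToFilename: Map<string, string>): string {
  const doc = new DOMParser().parseFromString(html, "text/html");
  for (const a of Array.from(doc.querySelectorAll<HTMLAnchorElement>("a[href]"))) {
    const local = urlToFilename.get(a.getAttribute("href")!);
    if (local) a.setAttribute("href", local);
  }
  return "<!DOCTYPE html>\n" + doc.documentElement.outerHTML;
}

// Step 3: the rewritten page is finally written to the local filesystem via the downloads API.
// (Real code would revoke the object URL once the download completes.)
async function saveToDisk(filename: string, html: string): Promise<void> {
  const blob = new Blob([html], { type: "text/html" });
  const url = URL.createObjectURL(blob);
  await browser.downloads.download({ url, filename: `capture/${filename}` });
}
```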
