Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dist models just increase the size of node_modules unnecessarily (and aren't useful) #905

Open
ghnp5 opened this issue Nov 16, 2024 · 10 comments · May be fixed by #909
Open

Dist models just increase the size of node_modules unnecessarily (and aren't useful) #905

ghnp5 opened this issue Nov 16, 2024 · 10 comments · May be fixed by #909

Comments

@ghnp5
Copy link

ghnp5 commented Nov 16, 2024

Now that the models are bundled, with PR #811, the size of "node_modules/nsfwjs" is now 3 x 3 Models bigger.

The modules appear in the following subdirectories:

models
esm
cjs

The 3 models, in these 3 folders.

And, none of them are really usable... because they're in the .min.js format.

Therefore, I still need to ship my own files, with the actual raw models, in some directory that is not the node_modules.

Is it possible to either remove the bundled models completely, by ignoring them in the build/publish,
or at least ship only ONE version of them, being the raw files?

Thank you very much!

@GantMan
Copy link
Member

GantMan commented Nov 17, 2024

These should have been stripped out!

https://github.com/infinitered/nsfwjs/blob/master/.npmignore#L32

If you send a PR to strip out the model folder ( and essentially the demos ) we can merge it and do a new release.

cc: @mazenchami

@ghnp5
Copy link
Author

ghnp5 commented Nov 17, 2024

I'd love to, but if it's already in the .npmignore, I'm not sure what else I could do 😁

Or maybe it's because of the slash at the beginning, and should just be models/..

Btw - there's an outstanding issue that is preventing me from using any new versions - #904
So for now I'm stuck with version 4.2.0, which works for me 😊

Many thanks!

@haZya
Copy link
Contributor

haZya commented Nov 26, 2024

#909 should fix the issue with the redundant models folder.

On a separate note, you don't have to worry about the bundle size if you are on ESM environment (which you should be on any frontend application), it will bundle all three models but will dynamically load only the one that you load when you call nsfw.load() function. So the rest of the models aren't loaded unless you call them. You can see this in action on https://nsfwjs.com by opening the network tab in browser dev tools and selecting models in the dropdown.

On a CJS environment like a backend, it'll technically load the whole thing, but you'd only call the load function once at the start of the server and never again as long as the server is live. So it's not really a problem there.

@ghnp5
Copy link
Author

ghnp5 commented Nov 26, 2024

Thank you very much.

I'm deploying a Node.JS API through an internal Docker image, and the problem is that doing "npm install" puts all those copies of the models there, increasing the image size :)

@ambrt
Copy link

ambrt commented Dec 2, 2024

There is a case where all models bundled are the problem - namely using them on Cloudflare Workers.
After Gzip'ing its about 25MB and paid account on Cloudflare is only 10MB max.

I've tried just to remove models from node_modules but then imports fail.

Is there a way to use just one model - and exclude rest from bundling?

EDIT:
I removed all references to models except for one, manually from node_modules.
Example is here: #868 (comment)
but it would be great to just bundler do its work.

@haZya
Copy link
Contributor

haZya commented Dec 2, 2024

Hello @ambrt 👋

You raise a very valid question. Does #909 solve your issue? 🤔

But I do not think running tensorflow on a CF worker is natively supported as of right now. As you have mentioned in #868 it might be possible to get it to work but I don't think it's an ideal environment for it given the available resources specifically RAM and compute time constraints.

Bundling the models into the package means in cases like yours where the size of the build directory matters, you'll have to manually remove the model files from the build directory. The bundlers will dynamically load the model you are using so in an ESM environment it will not load other models but it cannot remove the model files because it obviously has no idea what model you'll be loading at runtime so all 3 models will be included in the build.

I'm not sure what bundler you are using and I'm also not very familiar with CF workers, but I think it should be possible to clean up the build directory and remove the model files using a script on postbuild without having to do it manually before deploy. You might have to prefix the chunks to identify which ones to remove.

However, I do agree this may not be ideal for you. The best solution would be to separate the models into another npm package which would then have to be installed along with the nsfwjs package. Then we can expose an adapter from the nsfwjs package to which the model files will be passed to load. This way you could avoid bundling the models into the main package but have them available separately to use or you could host the models separately and add the URL.

However, this probably will make it difficult for the vast majority of users of this package who'd want to simply install the nsfwjs package and call nsfwjs.load() to load a model and move on. If there is enough interest to separate the models from the main package, someone could do a PR for this. Although, at that point, there will be 2 different packages that'll have to be managed instead of one which might not be in the interest of the maintaners. 🤷‍♂️

@ambrt
Copy link

ambrt commented Dec 2, 2024

@haZya Thanks for pointing it out. I'll check #909 tomorrow.

Indeed there could be a version of nsfwjs that would just fetch models through url and not have them bundled at all.

A reason is that Cloudflare workers have pretty fast Cache/Storage for static assets. So models could be fetched on the fly, keeping bundle size to a bare minimum.

I agree that Cloudflare is not the best environment right now for own models. This could change if they'll introduce GPU servers to the wide public.

As for bundler, workers use esbuild under the hood - it's possible to bundle on one's own. Didn't use that option yet but docs say it's possible.

@ambrt
Copy link

ambrt commented Dec 3, 2024

Hi @haZya

Running npm i github:haZya/nsfwjs#prebuild-cleanup ends up with node_moduels/nsfwjs folder having just LICENSE, package.json and README files.

It seems that it triggers "prebuild" script on installation instead of at bundling time.

@haZya
Copy link
Contributor

haZya commented Dec 4, 2024

Hey @ambrt

It would not have the dist folder because it is .gitignored and will not be there when you do npm i github:haZya/nsfwjs#prebuild-cleanup.

I have added a prepare script to package.json so now it'll try to prepare and build it before installing and linking. Now try again with the same command: npm i github:haZya/nsfwjs#prebuild-cleanup it should get you the dist directory as well. You might have to install yarn since nsfwjs uses yarn for the scripts.

@ambrt
Copy link

ambrt commented Dec 5, 2024

prepare is creating dist folder, so it can run but models end up in bundle again.

Using custom esbuild seems to work, for example:
esbuild --bundle ./src/index.js --format=esm '--external:*inception_v3/group1-shard1of6.min.js' --outfile=./dist/_worker.js
reduces bundle by 5MB.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants