Support for GPT2 and GPTJ? #35
I'm using ggjt file formats perfectly fine at the moment (the one that works great with mmap on recent llama.cpp versions, which I believe is linked into this).
@mikeggh To be clear, you are using GPTJ int4 GGML models fine with this repo at the moment? That is great to hear if I understood correctly!
This does not work for me. If I use the model conversion tool at the GGML repo seen here, or if I use the cformers repo seen here, I get an error about needing to update the format of the model. Trying to update the format of the model leads to issues. I think it may be worthwhile to try an older version of llama.cpp. I may try that.
@mallorbc, we are planning to support GPTJ as well, but for now, the code will work only with llama (I guess).
GPT-J support incoming in the next few days (alongside an assistant-style GPT-J model release)
@AndriyMulyar That is great to hear! Looking forward to GPTJ support! |
In the GGML repo there are guides for converting those models into GGML format, including int4 support. I have successfully done so myself and ran those models using the GPTJ binary in the examples.
In this repo here, there is support for GPTJ models with an API-like interface, but the downside is that each time you make an API call, the model has to be reloaded, adding two or so seconds to every call.
What I like about this repo is that it uses pybindings to load the model into memory and then allows API calls without reloading the model.
My question/request is for GPTJ support. I know it can be done with little to no code changes. If the code is abstracted enough, using GPTJ models after conversion may already work.
I think in the worst case, the example under GPTJ needs to be modified a bit. I will take a look at this if it does not currently work, and will contribute back if I get a working solution.
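For what it's worth, the performance difference being discussed can be sketched with a toy stand-in; the `ToyModel` class and its load delay below are hypothetical, just to illustrate why keeping the model resident in memory beats reloading it on every call:

```python
import time

class ToyModel:
    """Stand-in for a GGML model; constructing it simulates the slow weight load."""
    def __init__(self, path: str):
        time.sleep(0.05)  # pretend we are reading quantized weights from disk
        self.path = path

    def generate(self, prompt: str) -> str:
        return f"response to {prompt!r}"

def reload_per_call(prompts):
    # API-style: a fresh model is loaded for every request
    return [ToyModel("model.bin").generate(p) for p in prompts]

def load_once(prompts):
    # Binding-style: pay the load cost once, then serve all requests
    model = ToyModel("model.bin")
    return [model.generate(p) for p in prompts]

prompts = ["a", "b", "c"]

start = time.perf_counter()
reload_per_call(prompts)
reload_time = time.perf_counter() - start

start = time.perf_counter()
load_once(prompts)
once_time = time.perf_counter() - start

# Loading once amortizes the startup cost across all calls
print(reload_time > once_time)
```

Both functions return identical outputs; only the time spent reloading differs, which grows linearly with the number of calls in the reload-per-call style.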