Revamp JSON importer to make it easy to use #517

Open
Tracked by #465
hcho3 opened this issue Aug 14, 2023 · 3 comments

@hcho3
Collaborator

hcho3 commented Aug 14, 2023

No description provided.

hcho3 added the 4.0 label on Aug 14, 2023
hcho3 self-assigned this on Aug 14, 2023
@stephenpardy
Contributor

I am very interested in this issue and would like to understand what progress there is here.

I see that PR #448 added JSON importing to the C and Python APIs, but it seems this has since been removed.
Any model can still be exported with dump_as_json, but are there any utilities to load these files back? My question may be a duplicate of #11, but there seems to have been a lot of development since that issue was closed.

@hcho3
Collaborator Author

hcho3 commented Aug 21, 2024

@stephenpardy

what progress there is here.

I haven't gotten around to writing the JSON importer yet, because I wasn't sure what kind of interface would be best for it. The last iteration (import_from_json from Treelite 3.9) was clunky to use and had many gotchas.
Also, writing the JSON importer is not as simple as consuming the output of the dump_as_json function, since that output omits some information needed to preserve the integrity of the model through a round-trip serialization.
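To illustrate the round-trip problem in general terms, here is a minimal sketch using only the Python standard library. It does not use Treelite, and the thread does not say which fields dump_as_json leaves out, so the class ToyModel and its fields (num_feature, threshold_type) are made up for illustration and are not Treelite's actual schema. The sketch only shows that a dump written for human inspection can drop model-level metadata that a loader would need.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class ToyModel:
    # Hypothetical fields for illustration; NOT Treelite's actual schema.
    num_feature: int
    threshold_type: str
    trees: list = field(default_factory=list)


def dump_for_humans(model: ToyModel) -> str:
    """Lossy dump: writes only the tree structure and drops the metadata."""
    return json.dumps({"trees": model.trees}, indent=2)


def dump_full(model: ToyModel) -> str:
    """Lossless dump: writes every field, so the model can be rebuilt."""
    return json.dumps(asdict(model), indent=2)


def load_full(text: str) -> ToyModel:
    """Rebuild a ToyModel; raises TypeError if a required field is missing."""
    return ToyModel(**json.loads(text))


def can_round_trip(text: str) -> bool:
    """Return True if the JSON text carries every field ToyModel requires."""
    try:
        load_full(text)
        return True
    except TypeError:
        return False


model = ToyModel(
    num_feature=2,
    threshold_type="float32",
    trees=[{
        "split_feature": 0,
        "threshold": 0.5,
        "left": {"leaf_value": 1.0},
        "right": {"leaf_value": -1.0},
    }],
)

print(can_round_trip(dump_full(model)))
print(can_round_trip(dump_for_humans(model)))
```

In this toy setup, only the full dump can be loaded back into an equal object; the human-oriented dump fails because the loader has no way to recover the dropped fields.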

Can you describe what your use case would be? I'd like to learn how you plan to use the JSON importer so that I can pick the best design.

@stephenpardy
Contributor

@hcho3 I am looking for a way to load tree models from a variety of sources (e.g. XGBoost, LightGBM) and then save those models in a stable format that can be loaded and served at a later time.

I see the serialize and deserialize methods, which seem to meet my needs, and the docs even promise some backward compatibility. I think that is enough for now, but a human-readable format such as JSON would be much preferable to the binary one if possible (similar to how XGBoost now defaults to JSON over its old binary format).
