
Use endpoint instead of model in the inference API #2528

Merged
10 commits merged into main from endpoints-not-models on May 2, 2024

Conversation

davidkyle
Member

@davidkyle davidkyle commented Apr 25, 2024

Update the inference APIs to refer to inference endpoints rather than models. This involves renaming classes and updating comments, but it also includes the following potentially breaking changes:

  • `inference.delete_model` renamed to `inference.delete`
  • `inference.get_model` renamed to `inference.get`
  • `inference.put_model` renamed to `inference.put`

The format of the get response has also changed: the `models` field is renamed to `endpoints`.

Previously:

```json
{
    "models": [...]
}
```

With this change:

```json
{
    "endpoints": [...]
}
```

In `GetResponse.ts` I've used an `@aliases` tag to retain the `models` option.

Corresponding Elasticsearch change: elastic/elasticsearch#107704
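For downstream code that has to read both the old and the new response shape, a small compatibility shim is one option. This is a hypothetical sketch (the function name and the plain-dict response are assumptions for illustration, not part of any client API):

```python
def endpoints_from_get_response(resp: dict) -> list:
    """Return the inference endpoint list from a GET inference response,
    accepting the new "endpoints" key as well as the old "models" key."""
    if "endpoints" in resp:
        return resp["endpoints"]
    # Fall back to the pre-change field name.
    return resp.get("models", [])
```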

Member

@pquentin pquentin left a comment


Thanks! Since this is for 8.15, I suggest waiting for elastic/elasticsearch#107704 so that we 1/ make sure the JSON spec is exactly the same and 2/ allow clients-flight-recorder to keep up with the change and validate it properly.

I left a few comments there too.

@davidkyle davidkyle marked this pull request as ready for review April 29, 2024 13:35
@davidkyle davidkyle requested a review from a team as a code owner April 29, 2024 13:35
@davidkyle
Member Author

> Thanks! Since this is for 8.15, I suggest waiting for elastic/elasticsearch#107704

That PR is green now. I hesitated before pressing the merge button because, after merging, we must have a compatible client ready for the next serverless release, and when it comes to the breaking change I am hoping there is some way to soften the blow.

Taking Python as an example, existing code will use `client.inference.put_model(...)`, and after this change the Python function is `client.inference.put(...)`. Is there any way of having `inference.put_model` resolve to `inference.put` so that we don't break existing code samples? Perhaps by having a spec for both `inference.put` and `inference.put_model`, where the latter is deprecated and not documented?
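One shape the suggestion above could take in a client is a thin deprecated wrapper that resolves the old name to the new one. This is a hypothetical minimal sketch (the class, method bodies, and return value are assumptions; the real elasticsearch-py namespace objects are generated from the spec):

```python
import warnings


class InferenceNamespace:
    """Sketch of an inference client namespace that keeps the old
    method name as a deprecated alias of the new one."""

    def put(self, inference_id: str, body=None) -> dict:
        # New-style method after this change; the return value is a stub.
        return {"inference_id": inference_id}

    def put_model(self, inference_id: str, body=None) -> dict:
        # Old-style method, retained so existing code samples keep working.
        warnings.warn(
            "inference.put_model is deprecated, use inference.put instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.put(inference_id, body)
```

Existing calls to `put_model` would then keep working while surfacing a `DeprecationWarning`, giving users a release or two to migrate.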

@pquentin
Member

pquentin commented Apr 30, 2024

> Taking Python as an example, existing code will use `client.inference.put_model(...)`, and after this change the Python function is `client.inference.put(...)`. Is there any way of having `inference.put_model` resolve to `inference.put` so that we don't break existing code samples? Perhaps by having a spec for both `inference.put` and `inference.put_model`, where the latter is deprecated and not documented?

I'm not sure we can deprecate an endpoint today if the endpoint itself is gone from Elasticsearch. (The only existing example is `_knn_search`.) Also, the clients are one thing, but on Serverless right now the usage of Kibana is quite high. And it's not as if deprecating a function has any practical effect: things will break eventually. I suppose that's what tech preview is for.

> That PR is green now. I hesitated before pressing the merge button as after merging we must have a compatible client ready for the next serverless release

We don't have great automation here; most serverless clients don't get frequent updates yet. We can coordinate to get the serverless Python client out when you want, if that helps.

Member

@pquentin pquentin left a comment


Thanks! Requested a few cosmetic fixes and asked about the models alias.

`specification/_json_spec/inference.put.json` (outdated, resolved)

```diff
 export class Response {
   body: {
-    models: Array<ModelConfigContainer>
+    /** @aliases models */
+    endpoints: Array<ModelConfigContainer>
```
Member


Since we decided to break compatibility, should we keep `models` as an alias? For what it's worth, the current Python client does the wrong thing and would send `models` as-is to the server. That could be fixed in time for 8.15, though.

Member Author


I agree and have removed the alias.

The change breaks compatibility; we can do that because the API is neither GA nor heavily used, so it should be a clean break.

`specification/inference/inference/InferenceRequest.ts` (outdated, resolved)
Co-authored-by: Miguel Grinberg <[email protected]>
Member

@pquentin pquentin left a comment


Thanks! LGTM.

Contributor

github-actions bot commented May 2, 2024

Below are the validation results for the APIs you have changed.

| API | Status | Request | Response |
| --- | --- | --- | --- |
| `inference.delete` | | Missing test | Missing test |
| `inference.get` | 🟠 | Missing recording | Missing recording |
| `inference.inference` | | Missing test | Missing test |
| `inference.put` | | Missing test | Missing test |

You can validate these APIs yourself by using the `make validate` target.

@davidkyle davidkyle merged commit e01c305 into main May 2, 2024
6 checks passed
@davidkyle davidkyle deleted the endpoints-not-models branch May 2, 2024 12:26
@davidkyle
Member Author

Thanks for the reviews

3 participants