-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve / fix naming in NL server #4263
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Beautiful change! Thank you for breaking it up so that it is easy to review!
default_indexes: | ||
- medium_ft | ||
enabled_indexes: | ||
- medium_ft |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
intend to remove others?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I intended to only keep medium_ft for these instances. For world bank, do you think we should keep sdg and undata?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since worldbank is part of un, maybe keep it there, could come handy for any demos, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
_LOCAL_ENV_VALUES_PATH: str = f'{Path(__file__).parent.parent}/deploy/helm_charts/envs/autopush.yaml' | ||
|
||
|
||
@dataclass | ||
class EmbeddingsSpec: | ||
default_index: str | ||
class Env: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add class comments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
EMBEDDINGS_SPEC_KEY: str = 'EMBEDDINGS_SPEC' | ||
REGISTRY_KEY: str = 'NL_REGISTRY' | ||
CATALOG_KEY: str = 'NL_CATALOG' | ||
ENV_KEY: str = 'NL_ENV' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A one-line comment on what each is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
nl_server/registry.py
Outdated
@@ -33,57 +34,59 @@ | |||
|
|||
|
|||
# | |||
# A map from specific index to embeddings stores and models. | |||
# A class to hold embeddings stores and models. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a more precise definition, and distinguish it from Env and Catalog?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
# Defines one embeddings index config. | ||
@dataclass | ||
class EmbeddingsConfig: | ||
# This represents the full catalog of models and indexes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... a subset of them may not be enabled by the Env.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
nl_server/routes.py
Outdated
@@ -47,13 +48,12 @@ def healthz(): | |||
if default_embeddings_loaded: | |||
return 'OK', 200 | |||
|
|||
default_index_type = current_app.config[ | |||
config.EMBEDDINGS_SPEC_KEY].default_index | |||
default_index_type = current_app.config[config.ENV_KEY].default_indexes[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[0] may not be safe? LHS has a check before access.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point, updated the check.
thanks for the review! |
logging.warning('Health Check Failed: Default index name empty!') | ||
nl_env = current_app.config[config.ENV_KEY] | ||
if not nl_env.default_indexes: | ||
logging.error('Health Check Failed: Default index name empty!') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are warnings below, so change them all or not?
(original rational for warning was I thought this can happen while booting up, i think i saw this crash in custom dc - #4240)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh. Let me update here in next PR. I hope we don't need to make a real prediction call in /healthz then all logic here can be simplified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, makes sense!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Always a fan of using good names - thanks!
@@ -142,9 +143,9 @@ def generate_embeddings_yaml(model_info: utils.ModelConfig, | |||
"version": 1, | |||
"indexes": { | |||
"custom_ft": { | |||
"embeddings": embeddings_csv_handler.abspath(), | |||
"embeddings_path": embeddings_csv_handler.abspath(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'll be cool if at some point, changes to the yaml format don't require changes to the custom embeddings tool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think update the generated yaml file should do that, so running code that generate v10 yaml file with _v10.yaml. and the server would look for _v10.yaml and not read _v9.yaml (which can not parse).
This PR aims to greatly disambiguate code logic by using distinct naming for various object. Currently there are 10+ names like embedding_xxx for class, object, dict and files with a lot of clashes.
With this change:
catalog
to replaceembeddings_config
object and represent the all available (may not be used) model/index configregistry
to replaceembedddings_map
object which holds all the model/client objectsenv
to replaceembeddings_spec
which is the additional config per environment