-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(api): add output funcionalities (get_matrix, aggregate_values) #42
base: main
Are you sure you want to change the base?
Conversation
salemsd
commented
Dec 19, 2024
- Added more functionalities for outputs
- Added unit and integration tests
@@ -12,6 +12,8 @@ | |||
|
|||
from typing import List | |||
|
|||
from antares.craft.model.output import McType, ObjectType |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally speaking I think we prefer not having to import anything related to the project in this file as it's like an util file
from pydantic import BaseModel | ||
|
||
|
||
class QueryFile(Enum): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're missing the value ID = "id"
I would also prefer to rename STSTORAGE ST_STORAGE
type_ids: str = "" | ||
columns_names: str = "" | ||
|
||
def to_query(self, object_type: ObjectType) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename to_api_query to show it's for api
|
||
query_file: QueryFile | ||
frequency: Frequency | ||
mc_years: str = "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably best to type it as a list for users. Same for column_names
class Output(BaseModel): | ||
name: str | ||
archived: bool | ||
|
||
def __init__(self, output_service, **kwargs: Any): # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should remove the usage of pydantic inside this class and just do a normal class
self._binding_constraints_service = service_factory.create_binding_constraints_service() | ||
self._output_service = service_factory.create_output_service() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't think you need to create an output service here anymore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The alternative would be to create the output service in read_outputs local and api manually without the factory. Same thing could be said about binding_constraint_service which is not really needed in study either.
Or we could move read_outputs back into output_service but that would create redundancy with an output also having its own service.
I don't see what else I could do ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what you should do is add a method set_output_service inside the study service that you'll call inside the create_study_service method. This way you can remove the service from the study as you'll have it inside the study_service. That's something we're doing already for the area_service if you want to check
try: | ||
response = self._wrapper.get(url) | ||
outputs_json_list = response.json() | ||
return [Output(name=output["name"], archived=output["archived"]) for output in outputs_json_list] | ||
return pd.read_csv(StringIO(response.text)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In AntaresWeb we do pd.read_csv(io.BytesIO(response.content), sep=",")
, don't know if it's faster. We can keep the code as is but don't you have issue as you didn't specify the separator ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the doc, separator is "," by default, so no problem. As for the data handling, apparently they offer similar performance -> "StringIO, however, is a native in-memory unicode container and will exhibit similar speed to BytesIO", from python doc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. It's weird as in pandas doc online I see , as default but in the source code it seems like it's None. Weird
@@ -472,3 +474,36 @@ def test_read_outputs(self): | |||
mocker.get(run_url, json={"description": error_message}, status_code=404) | |||
with pytest.raises(OutputsRetrievalError, match=error_message): | |||
self.study.read_outputs() | |||
|
|||
# ==== get_matrix() ==== |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should probably put this in a separated test as you're not really checking the same thing as the first part of the test and you can re-introduce a fake output at the start of your test
|
||
matrix = output1.get_matrix("mc-all/grid/links") | ||
assert isinstance(matrix, pd.DataFrame) | ||
assert matrix.shape[1] == len(matrix_output["columns"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't you just put matrix_output inside a dataframe and test the equality ? It would be clearer
Same remark for the aggregate values part
matrix = output.get_matrix("mc-all/grid/links") | ||
|
||
assert isinstance(matrix, pd.DataFrame) | ||
assert not matrix.empty |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you check the value a bit more ?
Same for the aggregate values test