Test onedata objectstore with new caching #4
I propose the following final PR description: This PR complements this one: galaxyproject#18174 We have tested the deduplicated Onedata object store code and confirmed that it works. We also took the opportunity to include some improvements to our libs. They make the Onedata clients resistant to failures of data providers and improve their performance. NOTE: the [x] This is a refactoring of components with existing test coverage.
It would probably be cleanest to open the PR against jmchilton's branch, so that once he pulls it in, everything can land at once together with his deduplication?
lib/galaxy/config/__init__.py
Outdated
"onedatafilerestclient.http_client": {
    "level": "INFO",
    "qualname": "onedatafilerestclient.http_client",
},
I'm guessing they won't need this, so I would remove it for now.
In the PR description I proposed a note for them that explains it.
@@ -223,9 +223,9 @@ auth:
# an access token suitable for data access (allowing calls to the Oneprovider REST API).
access_token: ...
connection:
- # the domain of the Onezone service (e.g. "demo.onedata.org"), or its IP address for
+ # the domain of the Onezone service (e.g. datahub.egi.eu), or its IP address for
thx
onedatafilerestclient==21.2.5rc1 ; python_version >= "3.8" and python_version < "3.13"
# onedatafilerestclient==21.2.5.1 ; python_version >= "3.8" and python_version < "3.13"
../onedatafilerestclient ; python_version >= "3.8" and python_version < "3.13"
../onedatarestfs ; python_version >= "3.8" and python_version < "3.13"
This needs to be sorted out at the end.
@@ -7,8 +7,9 @@
from datetime import datetime

try:
    from onedatafilerestclient import (
        OnedataFileRESTClient,
    from onedatafilerestclient import OnedataFileRESTClient
It looks like it did not work out with datahub; discussion on Slack.
ca05787 to 8802987
This in particular needs a lot of new tests. We will also need to actively purge datasets in the model store import code, since users might have purged datasets while the job ran. Again, more tests needed.
These are only written if the command actually ran, so we'd fail here for instance if the outputs are deleted before the job had a chance to run.
Add a sleep so we can delay output purging until command line is templated. If we don't sleep we generate a command line where the output path is just `''`. That needs another fix!
This is a tricky one. There could be systematic issues here that an admin would want to fix, but it could also just be the case that a user forced a wrong datatype. Ideally we'd probably tag this as a job execution issue? It's not very different from a job error IMO.
For anonymous users, if a history_id is provided, do not override with the current session history and instead rely on the history accessibility
…or anonymous Co-authored-by: mvdbeek <[email protected]>
…anonymous Co-authored-by: Nicola Soranzo <[email protected]>
…ion_anonymous_jobs [24.0] Fix anonymous user job retrieval logic
[24.1] Fix check for anonymous
[24.1] Merge 24.0 into 24.1
8802987 to 0b02038
…d_files [24.0] Do not copy purged outputs to object store
…unction in `WorkflowEmbed`, `SharingPage`, `WorkflowInformation` and `WorkflowActionsExtend`
…ublish-link-copy [24.1] Add copy link to published workflow in `WorkflowCard`
0b02038 to 2ec50fb
Co-authored-by: David López <[email protected]>
Fixes galaxyproject#18633:
```
ValueError: dictionary update sequence element #4 has length 1; 2 is required
  File "galaxy/tools/__init__.py", line 1969, in handle_single_execution
    rval = self.execute(
  File "galaxy/tools/__init__.py", line 2066, in execute
    return self.tool_action.execute(
  File "galaxy/tools/actions/model_operations.py", line 89, in execute
    self._produce_outputs(
  File "galaxy/tools/actions/model_operations.py", line 120, in _produce_outputs
    tool.produce_outputs(
  File "galaxy/tools/__init__.py", line 3816, in produce_outputs
    new_labels_dict = dict(source_new_label)
Exception caught while attempting to execute tool with id '__RELABEL_FROM_FILE__':
```
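For context, this `ValueError` is what `dict()` raises when one element of the input sequence does not have exactly two items. The sketch below reproduces it and shows one defensive way to parse label pairs. It assumes a tab-separated, two-column text input; that format and the names `naive_parse` and `parse_label_pairs` are illustrative assumptions, not the actual Galaxy code or the fix applied in that commit.

```python
def naive_parse(text):
    # Mirrors the failing pattern: dict() over a sequence of split lines.
    # A line with a single field yields a 1-element list and raises ValueError.
    return dict(line.split("\t") for line in text.splitlines())


def parse_label_pairs(text):
    """Parse 'old<TAB>new' lines into a dict, reporting malformed lines clearly.

    Hypothetical helper: blank lines are skipped, anything else must have
    exactly two tab-separated fields.
    """
    pairs = {}
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines, e.g. a trailing newline
        fields = line.split("\t")
        if len(fields) != 2:
            raise ValueError(
                f"line {line_number}: expected 2 tab-separated fields, got {len(fields)}"
            )
        old_label, new_label = fields
        pairs[old_label.strip()] = new_label.strip()
    return pairs


if __name__ == "__main__":
    print(parse_label_pairs("sample1\tcontrol\nsample2\ttreated\n"))
```

The explicit length check trades the terse `dict(...)` call for an error message that names the offending line, which is usually easier for a user to act on.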
This PR complements this one: galaxyproject#18174
We have tested the deduplicated Onedata object store code and confirmed that it works.
We also took the opportunity to include some improvements to our libs. They make the Onedata clients resistant to failures of data providers and improve their performance.
NOTE: the OnedataFileRestClient logs at the debug level, with at least one log entry per request. Let us know whether that is acceptable, or whether we should adjust Galaxy's logger config so that these logs are ignored.
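As a rough illustration of the second option, the sketch below raises the level of the `onedatafilerestclient.http_client` logger (the name used in the config snippet reviewed earlier in this conversation) to INFO, using only the standard library's `logging.config.dictConfig`. It is a standalone sketch, not Galaxy's own logging configuration; the handler and root settings are assumptions made for the example.

```python
import logging
import logging.config

# Assumed minimal config: everything logs at DEBUG to the console,
# except the Onedata HTTP client logger, which is capped at INFO.
LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "console": {"class": "logging.StreamHandler", "level": "DEBUG"},
    },
    "root": {"level": "DEBUG", "handlers": ["console"]},
    "loggers": {
        "onedatafilerestclient.http_client": {"level": "INFO"},
    },
}

logging.config.dictConfig(LOGGING_CONFIG)
client_log = logging.getLogger("onedatafilerestclient.http_client")

# Per-request debug messages are dropped; INFO and above still get through.
client_log.debug("request sent")       # suppressed
client_log.info("provider selected")   # emitted
```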
NOTE: while working on this, we have stumbled upon major performance problems which can also cause ambiguous data access errors, as described in galaxyproject#18369. This is why we are proposing to change the implementation of `_get_total_matches_count` to use scandir instead of glob in this PR.
[x] This is a refactoring of components with existing test coverage.
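To make the scandir-versus-glob idea concrete, here is a minimal sketch of counting directory entries that match a pattern in both ways. The helper names, their signatures, and the use of `fnmatch` are assumptions for illustration; this is not the actual `_get_total_matches_count` implementation.

```python
import fnmatch
import glob
import os
import tempfile


def count_matches_glob(directory, pattern):
    # glob builds the full list of matching paths before they can be counted.
    return len(glob.glob(os.path.join(directory, pattern)))


def count_matches_scandir(directory, pattern):
    # scandir iterates over directory entries lazily; names are matched
    # one by one without materializing a list of full paths.
    count = 0
    with os.scandir(directory) as entries:
        for entry in entries:
            if fnmatch.fnmatch(entry.name, pattern):
                count += 1
    return count


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("dataset_1.dat", "dataset_2.dat", "notes.txt"):
            open(os.path.join(tmp, name), "w").close()
        print(count_matches_glob(tmp, "*.dat"), count_matches_scandir(tmp, "*.dat"))
```

One behavioral difference to keep in mind: `glob` does not match names starting with a dot against a leading `*`, while `fnmatch` does, so the two helpers can disagree on directories that contain hidden files.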