eLabFTW file source for Galaxy #18665
This issue can be assigned to me. Pinging @bernt-matthias, since he was interested in discussing and testing the integration.
kysrpex added a commit to kysrpex/galaxyproject-galaxy that referenced this issue on Dec 12, 2024
eLabFTW [1] revolves around the concepts of experiment [2] and resource [3]. Experiments and resources can have files attached to them. To get a quick overview, try out the live demo [4].
The scope of this implementation is exporting data from and importing data to eLabFTW as file attachments of already existing experiments and resources. Each user can configure their preferred eLabFTW instance entering its URL and an API Key.
File sources reference files via a URI, while eLabFTW uses auto-incrementing positive integers. For more details read galaxyproject#18665 [5]. This leads to the need to declare a mapping between said identifiers and Galaxy URIs.
Those take the form `elabftw://demo.elabftw.net/entity_type/entity_id/attachment_id`, where:
- `entity_type` is either 'experiments' or 'resources'
- `entity_id` is the id (an integer in string form) of an experiment or resource
- `attachment_id` is the id (an integer in string form) of an attachment
This implementation uses both `aiohttp` and the `requests` libraries as underlying mechanisms to communicate with eLabFTW via its REST API [6].
A significant limitation of the implementation is that, due to the fact that the API does not have an endpoint that can list attachments for several experiments and/or resources with a single request, when listing the root directory or an entity type _recursively_, a list of entities has to be fetched first, then to fetch the information on their attachments, a separate request has to be sent _for each one_ of them. The `aiohttp` library makes it bearable to recursively browse instances with up to ~500 experiments or resources with attachments by sending them concurrently, but ultimately solving the problem would require changes to the API from the eLabFTW side.
References:
- [1] https://www.elabftw.net/
- [2] https://doc.elabftw.net/user-guide.html#experiments
- [3] https://doc.elabftw.net/user-guide.html#resources
- [4] https://demo.elabftw.net
- [5] galaxyproject#18665
- [6] https://doc.elabftw.net/api/v2
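To illustrate the URI form described in the commit message, here is a minimal Python sketch that parses and rebuilds such URIs. The names `ElabftwLocation` and `parse_uri` are hypothetical and not taken from the actual implementation. Accepting URIs that stop at the host, the entity type, or the entity id (directory-like locations) is also an assumption, and the ids 12 and 34 are made-up example values.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlsplit

ENTITY_TYPES = ("experiments", "resources")


@dataclass(frozen=True)
class ElabftwLocation:
    """Hypothetical container for the components of an elabftw:// URI."""

    host: str
    entity_type: Optional[str] = None
    entity_id: Optional[int] = None
    attachment_id: Optional[int] = None

    def to_uri(self) -> str:
        # Emit components in order and stop at the first missing one.
        segments = []
        for part in (self.entity_type, self.entity_id, self.attachment_id):
            if part is None:
                break
            segments.append(str(part))
        return "/".join([f"elabftw://{self.host}", *segments])


def parse_uri(uri: str) -> ElabftwLocation:
    """Split an elabftw:// URI into host, entity type, entity id and attachment id."""
    parts = urlsplit(uri)
    if parts.scheme != "elabftw" or not parts.netloc:
        raise ValueError(f"not an eLabFTW URI: {uri!r}")
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) > 3:
        raise ValueError(f"too many path segments: {uri!r}")
    if segments and segments[0] not in ENTITY_TYPES:
        raise ValueError(f"unknown entity type: {segments[0]!r}")
    ids = []
    for segment in segments[1:]:
        # Ids are positive integers written in string form.
        if not (segment.isascii() and segment.isdigit()) or int(segment) < 1:
            raise ValueError(f"invalid id: {segment!r}")
        ids.append(int(segment))
    return ElabftwLocation(
        parts.netloc,
        segments[0] if segments else None,
        ids[0] if len(ids) > 0 else None,
        ids[1] if len(ids) > 1 else None,
    )


location = parse_uri("elabftw://demo.elabftw.net/experiments/12/34")
print(location)
print(location.to_uri())
```

Keeping the ids as integers inside the container and converting them to strings only when building the URI mirrors the "integer in string form" wording of the commit message.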
eLabFTW file source for Galaxy
I am developing an integration of Galaxy with eLabFTW and found a couple of design mismatches between eLabFTW and Galaxy that are forcing me to make non-straightforward design decisions. If I am not careful, my decisions may clash with how Galaxy is intended to work, so I thought it made sense to open an issue to seek consensus and/or other solutions.
Exporting and importing data to Galaxy
To take data out of Galaxy, there is the option to export a history, either as a direct download link or to a file source. Research data management repositories are included in the latter group.
To import data to Galaxy, there is the upload option. Data from file sources can be accessed using the "Choose remote files" button.
Remote files are represented and resolved in Galaxy using a path-like URI. File sources typically define their own URI scheme, for example `invenio://zenodo_sandbox/92442/TestProduct.zip`. Directory-like objects may be created in the file source using the endpoint `/api/remote_files`, which accepts JSON of the form `{"target": "invenio://zenodo_sandbox/92442", "name": "Testing Publishing"}`. File-like objects may be created using `/api/histories/{history_id}/write_store`, which accepts JSON that includes the `target_uri` key: `{"target_uri": "invenio://zenodo_sandbox/92442/TestProduct.zip", ...}`.
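The two request bodies described above can be sketched in Python as follows. The helper names (`remote_files_body`, `write_store_path`, `write_store_body`) are hypothetical and only assemble the JSON shown in the text. The keys elided with `...` are not filled in here, and `HISTORY_ID` is a placeholder rather than a real history id.

```python
import json
from typing import Any, Dict


def remote_files_body(target: str, name: str) -> Dict[str, str]:
    """Body for creating a directory-like object called `name` under `target`."""
    return {"target": target, "name": name}


def write_store_path(history_id: str) -> str:
    """Path of the endpoint that writes a history to a file source."""
    return f"/api/histories/{history_id}/write_store"


def write_store_body(target_uri: str, **other_keys: Any) -> Dict[str, Any]:
    # Only `target_uri` is named in the text; any other keys are passed
    # through unchanged because the text elides them.
    return {"target_uri": target_uri, **other_keys}


print(json.dumps(remote_files_body("invenio://zenodo_sandbox/92442", "Testing Publishing")))
print(write_store_path("HISTORY_ID"))
print(json.dumps(write_store_body("invenio://zenodo_sandbox/92442/TestProduct.zip")))
```

Note that in both bodies the caller has to supply the full URI up front, which is the property that matters for the eLabFTW discussion.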
eLabFTW
eLabFTW revolves around the concepts of experiment and resource. Experiments and resources can contain file attachments. The scope of the integration would be exporting data from and importing data to eLabFTW as file attachments.
eLabFTW can be accessed through a REST API, which is documented here. The sections experiments, items (internal name for resources) and uploads are of special relevance. Each entity (be it experiments or items) has an entity id (an integer), and the files attached to an entity, also known as "uploads", have an upload id (also an integer). Entity ids for experiments and items are independent (i.e. an experiment and an item can have the same id). Upload ids are common to experiments and items: an experiment and an item cannot have an attachment with the same id.
eLabFTW's backend assigns new identifiers by incrementing the previous identifier of the same type, be it experiment identifiers, item identifiers, or upload identifiers. Experiment, item and upload names are not unique, e.g. two experiments can have the same name.
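The identifier behavior described in this section can be made concrete with a small in-memory toy model. This is not eLabFTW code: the class `ToyElabftw` and its methods are hypothetical, and starting every counter at 1 is an assumption made for the illustration.

```python
from itertools import count


class ToyElabftw:
    """Toy model of the id allocation described above; not the real backend."""

    def __init__(self) -> None:
        # One independent counter per identifier type (assumed to start at 1).
        self._next_id = {"experiments": count(1), "items": count(1), "uploads": count(1)}
        self.entities = {}  # (entity_type, entity_id) -> name
        self.uploads = {}   # upload_id -> (entity_type, entity_id, filename)

    def create_entity(self, entity_type: str, name: str) -> int:
        if entity_type not in ("experiments", "items"):
            raise ValueError(f"unknown entity type: {entity_type!r}")
        entity_id = next(self._next_id[entity_type])
        self.entities[(entity_type, entity_id)] = name  # names may repeat
        return entity_id

    def attach(self, entity_type: str, entity_id: int, filename: str) -> int:
        if (entity_type, entity_id) not in self.entities:
            raise KeyError((entity_type, entity_id))
        # Upload ids come from a single counter shared by experiments and items.
        upload_id = next(self._next_id["uploads"])
        self.uploads[upload_id] = (entity_type, entity_id, filename)
        return upload_id


server = ToyElabftw()
experiment_id = server.create_entity("experiments", "Assay")
item_id = server.create_entity("items", "Assay")
print(experiment_id, item_id)  # the same id can exist for both entity types
print(server.attach("experiments", experiment_id, "data.csv"))
print(server.attach("items", item_id, "data.csv"))  # a different upload id
```

In this model, the id an upload receives depends on every upload made anywhere on the server beforehand, so a client cannot know it before the upload happens.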
Integrating Galaxy with eLabFTW
Integrating eLabFTW with Galaxy through a file source involves finding a path-like URI representation for eLabFTW's experiments, items and uploads. A solution that quickly comes to mind is paths of the form `/entity_type/entity_id/upload_id`, where:
- `entity_type` is either 'experiments' or 'resources'
- `entity_id` is the id (an integer) of an experiment or resource (keep in mind those are independent)
- `upload_id` is the id (an integer) of an attachment
Again, keep in mind that experiment, item and upload names are not unique. A solution based on names would not resolve them unambiguously. From the usability point of view, however, a solution based on ids may be a problem, because although names and URIs seem to be decoupled when browsing file sources (see screenshot below), they are coupled when files are exported (see histories.export.ts, which gets `fileName` from user input).
The major issue, though, is that `/api/histories/{history_id}/write_store` receives a `target_uri` as input, which means URIs must be known beforehand. But entity ids and upload ids cannot be predicted, because eLabFTW's backend generates them as users create experiments and resources and upload attachments. To make things worse, upload ids are global. This means Galaxy cannot try to guess the next id based on the largest id on the server.
Action points
I thus see two areas where action is needed:
`x` guarantees that it can be retrieved later using `x`, but I do not think that's a good approach).