New Tool: OMERO convert dataset to plate #74

Merged
merged 38 commits into from
Dec 16, 2024

Conversation

rmassei
Contributor

@rmassei rmassei commented Nov 19, 2024

New tool

OMERO dataset to plate - HCS tool to convert an existing OMERO dataset to a plate format. It is based on this Python script, which is a specific use-case version of the OMERO basic utility scripts.

How it works: The script accepts as input a TSV file with two columns, 'Filename' and 'Well'. Each image whose filename matches a row of the table is copied into the plate at the corresponding well position. Optionally, the original dataset can be deleted.
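
As an illustration of the input described above, here is a minimal sketch of reading such a TSV and turning a well label into zero-based row and column indices. The function names `read_well_table` and `well_to_indices` are hypothetical and not taken from the tool; the sketch assumes single-letter row labels such as `A1` or `B12`.

```python
import csv
import io


def read_well_table(handle):
    """Map each filename to its well label from a 'Filename'/'Well' TSV."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["Filename"]: row["Well"] for row in reader}


def well_to_indices(well):
    """Convert a label such as 'B12' to zero-based (row, column) indices.

    Assumption: one row letter followed by a 1-based column number.
    """
    row_letter, column = well[0].upper(), int(well[1:])
    return ord(row_letter) - ord("A"), column - 1


# Example with an in-memory table (file names are made up for illustration).
table = io.StringIO("Filename\tWell\nimg_A1.tiff\tA1\nimg_B12.tiff\tB12\n")
mapping = read_well_table(table)
positions = {name: well_to_indices(well) for name, well in mapping.items()}
```

Keeping the parsing separate from any OMERO call makes it possible to validate the table before connecting to the server.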

@rmassei
Contributor Author

rmassei commented Nov 20, 2024

@lldelisle what do you think about it?

@lldelisle
Contributor

lldelisle commented Nov 20, 2024

My impression is that the regex may not work.
For example, looking at what we have in our own OMERO instance, image names can be:
20231208_gene-C12_for_FACS_120h_R2_A1.tiff, 20231208_gene-C12_for_FACS_120h_R2_A2.tiff ...
Instead of using a general regex, I would identify what is common in image names from left and from right and then use a regex on what is unique to each image.
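
A minimal sketch of that idea: strip whatever all image names share on the left and on the right, then apply a small regex to what remains. The helper name `unique_parts` and the last two file names (`B1`, `B12`) are assumptions added for illustration; only the first two names come from the example above.

```python
import os
import re


def unique_parts(names):
    """Return the part of each name that is left after removing the
    prefix and suffix shared by all names."""
    prefix = os.path.commonprefix(names)
    # The common suffix is the common prefix of the reversed names.
    suffix = os.path.commonprefix([n[::-1] for n in names])[::-1]
    return [n[len(prefix):len(n) - len(suffix)] for n in names]


# Regex applied only to the unique part: one row letter, 1-2 digit column.
WELL_RE = re.compile(r"^([A-Z])(\d{1,2})$")

names = [
    "20231208_gene-C12_for_FACS_120h_R2_A1.tiff",
    "20231208_gene-C12_for_FACS_120h_R2_A2.tiff",
    "20231208_gene-C12_for_FACS_120h_R2_B1.tiff",   # hypothetical
    "20231208_gene-C12_for_FACS_120h_R2_B12.tiff",  # hypothetical
]
parts = unique_parts(names)
wells = {name: part for name, part in zip(names, parts) if WELL_RE.match(part)}
```

One caveat: if every image in the dataset shares a row letter or a trailing digit (for example only `A1` and `A2`), that character becomes part of the common prefix or suffix and is stripped as well, so the remaining part would still need to be checked before use.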

@rmassei
Contributor Author

rmassei commented Nov 22, 2024

What about making the user input the string coming before the well position in the filename?

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Convert an OMERO dataset to a plate.")
    parser.add_argument("--credential-file", dest="credential_file", type=str, required=True,
                        help="Credential file (JSON file with username and password for OMERO)")
    parser.add_argument('--host', required=True, help='OMERO host')
    parser.add_argument('--port', required=True, type=int, help='OMERO port')
    parser.add_argument('--pre_well', required=True, type=str, help='String before the well position')  # <-- new argument
    parser.add_argument('--dataset_id', type=int, required=True, help="Dataset ID to convert plate")
    parser.add_argument('--log_file', default='metadata_import_log.txt', help='Path to the log file')
    args = parser.parse_args()

and then


def convert_dataset_to_plate(host, user, pws, port, dataset_id, pre_well,
                             log_file):
    """
    Connect to OMERO server, convert a dataset to a plate using the specified regex for extracting well positions,
    optionally link the plate to a screen.
    """
    conn = BlitzGateway(user, pws, host=host, port=port, secure=True)
    if not conn.connect():
        raise ConnectionError("Failed to connect to OMERO server")

    def log_message(message, status="INFO"):
        with open(log_file, 'a') as f:
            f.write(f"{status}: {message}\n")

    try:
        regex = rf"(?:{re.escape(pre_well)})([A-Z])(\d{{1,2}})$"  # <-- pattern built from pre_well (needs `import re`)
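
For the pattern to depend on the user input, the value of `pre_well` has to be interpolated into the regex rather than written as the literal text `pre_well`. A minimal standalone sketch follows; the helper name `extract_well` is hypothetical, and removing the file extension before matching is an assumption made so that the `$` anchor can work on names ending in `.tiff`.

```python
import os
import re


def extract_well(filename, pre_well):
    """Return (row_letter, column_number) found right after `pre_well`,
    or None when the filename does not match."""
    # Assumption: the well position is the last thing before the extension.
    stem, _ext = os.path.splitext(filename)
    # re.escape keeps characters such as '.' or '+' in pre_well literal.
    pattern = re.compile(rf"{re.escape(pre_well)}([A-Z])(\d{{1,2}})$")
    match = pattern.search(stem)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


well = extract_well("20231208_gene-C12_for_FACS_120h_R2_A1.tiff", "R2_")
```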

@bernt-matthias
Collaborator

I think named groups in the regexes might be a solution, but that may be difficult in terms of usability.
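
A minimal sketch of what named groups could look like, assuming the user supplies a full pattern containing groups called `row` and `col`. The group names, the helper `parse_well`, and the example pattern are all hypothetical.

```python
import re


def parse_well(filename, user_pattern):
    """Apply a user-supplied regex with named groups 'row' and 'col'.

    Returns (row_letter, column_number), or None if there is no match.
    A pattern that lacks either group raises KeyError.
    """
    match = re.search(user_pattern, filename)
    if match is None:
        return None
    groups = match.groupdict()
    return groups["row"], int(groups["col"])


# Example pattern a user might write for names ending in "_A2.tiff".
pattern = r"_(?P<row>[A-Z])(?P<col>\d{1,2})\.tiff$"
result = parse_well("20231208_gene-C12_for_FACS_120h_R2_A2.tiff", pattern)
```

The usability concern is visible here: the user has to know both regex syntax and the expected group names.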

Wondering why we need to rely on the filenames at all. Shouldn't there be proper metadata containing this info somewhere?

@rmassei
Contributor Author

rmassei commented Nov 22, 2024

Some software supports including the well position in the metadata during acquisition (it is also part of the ome.xml format) but, unfortunately, not all experiments include this information in the metadata.
In my experience, it often happens that this information is included only in the filename.

@bernt-matthias
Collaborator

Then I would suggest checking whether this can already be done with the available tools that offer regular expressions.

@lldelisle
Contributor

  • I think the issue with pre_well is that it is not very easy to use.
  • I don't understand what you are proposing, @bernt-matthias. Do you mean: 1. Extract filenames from the dataset, 2. Use a regex tool to get the Well, 3. Use this tool, where the input would be a tabular file with 2 columns: filename and well?
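
The three steps above can be sketched outside Galaxy as follows: starting from a list of filenames (step 1), a regex extracts the well (step 2), and the result is written as a two-column table that the tool could take as input (step 3). The function name `filenames_to_tsv` and the pattern are assumptions for illustration; in Galaxy, step 2 would be done by an existing regex tool rather than by this code.

```python
import csv
import io
import re

# Assumption: the well label sits between the last underscore and ".tiff".
WELL_PATTERN = re.compile(r"_([A-Z]\d{1,2})\.tiff$")


def filenames_to_tsv(filenames, pattern=WELL_PATTERN):
    """Build a 'Filename'/'Well' TSV; names without a match are skipped."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["Filename", "Well"])
    for name in filenames:
        match = pattern.search(name)
        if match:
            writer.writerow([name, match.group(1)])
    return buffer.getvalue()


tsv_text = filenames_to_tsv([
    "20231208_gene-C12_for_FACS_120h_R2_A1.tiff",
    "notes.txt",  # hypothetical non-image entry, ignored
])
```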

@rmassei
Contributor Author

rmassei commented Nov 25, 2024

> 1. Extract filenames from the dataset, 2. Use regex tool to get Well 3. Use this tool where the input would be a tabular with 2 columns: filename and well?

Really like this idea!

@lldelisle
Contributor

I think my OMERO tool, https://toolshed.g2.bx.psu.edu/view/lldelisle/omero_get_children_ids/82f2efb46200, has an option to get the image name.

…aset deletion option. Additionally, changed the style of the python code
@rmassei
Contributor Author

rmassei commented Nov 27, 2024

OK, the tool now accepts a TSV file with 'Filename' and 'Well' as columns. Each image whose filename matches a row of the table is copied into the plate at the corresponding well position. Optionally, the original dataset can be deleted.

@lldelisle
Contributor

I like it.

Collaborator

@bernt-matthias bernt-matthias left a comment

I know too little about OMERO. I'm wondering how the well info is different from other metadata.

@rmassei
Contributor Author

rmassei commented Nov 28, 2024

> I know too little about OMERO. I'm wondering how the well info is different from other metadata.

In OMERO there are different levels of organization for managing and storing data: Project, Dataset, Screen, Plate, and Well.

Just to give an overview of the hierarchical structure:

- **Project**: Top level to organize the experiment
    - **Dataset**: Below Project; a collection of images with similar experimental conditions
- **Screen**: Top level to organize large-scale high-content screening (HCS) experiments (basically a Project for HCS)
    - **Plate**: A single experimental plate in a Screen; contains multiple wells (e.g. a 96-well plate)
        - **Well**: An individual experiment, typically a single condition. One well can contain several images.

Having the plate-well format is nice for visualization purposes, and it is also useful if you want to fetch all data from a specific well, since wells have their own IDs.
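
To make the Screen, Plate, Well containment concrete, here is a small sketch using plain Python dataclasses. This is not the OMERO API: the class names only mirror the hierarchy described above, and the image names are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Well:
    row: int      # zero-based row index (0 -> 'A')
    column: int   # zero-based column index (0 -> '1')
    images: List[str] = field(default_factory=list)

    @property
    def label(self):
        """Human-readable label such as 'A1' or 'B12'."""
        return f"{chr(ord('A') + self.row)}{self.column + 1}"


@dataclass
class Plate:
    name: str
    wells: List[Well] = field(default_factory=list)


@dataclass
class Screen:
    name: str
    plates: List[Plate] = field(default_factory=list)


plate = Plate("plate_1", [
    Well(0, 0, ["img_A1_field1.tiff", "img_A1_field2.tiff"]),
    Well(1, 11, ["img_B12.tiff"]),
])
screen = Screen("screen_1", [plate])

# Fetch every image that belongs to one specific well.
images_in_b12 = [img for w in plate.wells if w.label == "B12" for img in w.images]
```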

@bernt-matthias
Collaborator

Thanks @rmassei

> In OMERO there are different levels of organization for managing and storing data: Project, Dataset, Screen, Plate, and Well.

Why can't we then integrate this into

<option value="project">Project</option>
where we already handle all levels, except for plate?

@rmassei
Contributor Author

rmassei commented Nov 28, 2024

You are right, it is a good idea to add them to omero_metadata_import.xml!
I am on it.

@rmassei
Contributor Author

rmassei commented Dec 5, 2024

All good? :)

Collaborator

@bernt-matthias bernt-matthias left a comment

Looks good from my side. Merge?

@bernt-matthias bernt-matthias merged commit 636cbb6 into Helmholtz-UFZ:main Dec 16, 2024
14 checks passed