diff --git a/dev/search/search_index.json b/dev/search/search_index.json index 120fb2a1..d4a4b8b8 100644 --- a/dev/search/search_index.json +++ b/dev/search/search_index.json @@ -1 +1 @@ -{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Introduction","text":""},{"location":"#_1","title":"Introduction","text":"

Oncology FM Evaluation Framework by kaiko.ai

With the first release, eva supports performance evaluation for vision Foundation Models (\"FMs\") and supervised machine learning models on WSI patch-level image classification tasks. Support for radiology (CT scan) segmentation tasks will be added soon.

With eva we provide the open-source community with an easy-to-use framework that follows industry best practices to deliver a robust, reproducible and fair evaluation benchmark across FMs of different sizes and architectures.

Support for additional modalities and tasks will be added in future releases.

"},{"location":"#use-cases","title":"Use cases","text":""},{"location":"#1-evaluate-your-own-fms-on-public-benchmark-datasets","title":"1. Evaluate your own FMs on public benchmark datasets","text":"

With a specified FM as input, you can run eva on several publicly available datasets & tasks. One evaluation run will download and preprocess the relevant data, compute embeddings, fit and evaluate a downstream head and report the mean and standard deviation of the relevant performance metrics.
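
The sketch below illustrates this flow using the Python classes described in the Reference API further down this page. It is a minimal, hypothetical example rather than a ready-to-run recipe: the trainer object, file paths and head dimensions are placeholders, and it assumes the embeddings have already been computed (the predict step of the pipeline).

from torch import nn\n\nimport eva\nfrom eva.core.data import datasets\n\n# Embeddings datasets built from a manifest of pre-computed .pt files\n# (paths and file names below are placeholders for your own data).\ntrain_set = datasets.EmbeddingsClassificationDataset(\n    root=\"./data/embeddings\", manifest_file=\"manifest.csv\", split=\"train\"\n)\nval_set = datasets.EmbeddingsClassificationDataset(\n    root=\"./data/embeddings\", manifest_file=\"manifest.csv\", split=\"val\"\n)\ndatamodule = eva.data.datamodules.DataModule(\n    datasets=eva.data.datamodules.schemas.DatasetsSchema(train=train_set, val=val_set),\n)\n\n# A single linear layer as downstream head, as in the default evaluation setup.\nmodel = eva.models.modules.HeadModule(\n    head=nn.Linear(384, 4),  # e.g. ViT-S16 embedding size -> 4 BACH classes\n    criterion=nn.CrossEntropyLoss(),\n)\n\n# `trainer` is a placeholder for a pre-configured eva trainer instance.\neva.Interface().fit(trainer=trainer, model=model, data=datamodule)\n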

Supported datasets & tasks include:

WSI patch-level pathology datasets

Radiology datasets

To evaluate FMs, eva supports different model formats, including models trained with PyTorch, models available on Hugging Face, and ONNX models. For other formats, custom wrappers can be implemented; see the sketch below.
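
As an illustration of such a wrapper, the snippet below wraps an arbitrary PyTorch model so that it returns flat embedding vectors. It is a hypothetical sketch (the class and attribute names are not part of eva); the only requirement it aims to satisfy is the nn.Module interface that eva expects from a feature-extracting backbone.

import torch\nfrom torch import nn\n\n\nclass BackboneWrapper(nn.Module):\n    \"\"\"Hypothetical wrapper exposing an arbitrary PyTorch model as a feature extractor.\"\"\"\n\n    def __init__(self, model: nn.Module) -> None:\n        super().__init__()\n        self.model = model\n\n    @torch.no_grad()\n    def forward(self, images: torch.Tensor) -> torch.Tensor:\n        features = self.model(images)          # e.g. [batch, dim] or [batch, dim, 1, 1]\n        return features.flatten(start_dim=1)   # flatten to [batch, embedding_dim]\n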

"},{"location":"#2-evaluate-ml-models-on-your-own-dataset-task","title":"2. Evaluate ML models on your own dataset & task","text":"

If you have your own labeled dataset, all that is needed is to implement a dataset class tailored to your source data. Start from one of the out-of-the-box dataset classes, adapt it to your data, and run eva to see how different FMs perform on your task; a sketch is shown below.
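
A minimal sketch of such a dataset class is shown below. It subclasses the eva.core.data.Dataset base class documented in the Reference API; the CSV manifest layout, column names and torchvision-style image loading are assumptions you would adapt to your own source data.

import os\nfrom typing import Tuple\n\nimport pandas as pd\nimport torch\nfrom torchvision.io import read_image\n\nfrom eva.core import data\n\n\nclass MyDataset(data.Dataset):\n    \"\"\"Hypothetical dataset reading image paths and labels from a CSV manifest.\"\"\"\n\n    def __init__(self, root: str, split: str) -> None:\n        super().__init__()\n        self._root = root\n        self._split = split\n        self._samples: pd.DataFrame\n\n    def configure(self) -> None:\n        # Assumed CSV layout with `path`, `label` and `split` columns.\n        manifest = pd.read_csv(os.path.join(self._root, \"manifest.csv\"))\n        self._samples = manifest[manifest[\"split\"] == self._split].reset_index(drop=True)\n\n    def validate(self) -> None:\n        if len(self._samples) == 0:\n            raise ValueError(f\"No samples found for split '{self._split}'.\")\n\n    def __len__(self) -> int:\n        return len(self._samples)\n\n    def __getitem__(self, index: int) -> Tuple[torch.Tensor, int]:\n        row = self._samples.iloc[index]\n        image = read_image(os.path.join(self._root, row[\"path\"])).float() / 255.0\n        return image, int(row[\"label\"])\n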

"},{"location":"#evaluation-results","title":"Evaluation results","text":"

We evaluated the following FMs on the 4 supported WSI patch-level image classification tasks. In the table below we report Balanced Accuracy for the binary and multiclass tasks, showing the average performance and standard deviation over 5 runs.

FM-backbone | pretraining | BACH | CRC | MHIST | PCam/val | PCam/test
DINO ViT-S16 | N/A | 0.410 (\u00b10.009) | 0.617 (\u00b10.008) | 0.501 (\u00b10.004) | 0.753 (\u00b10.002) | 0.728 (\u00b10.003)
DINO ViT-S16 | ImageNet | 0.695 (\u00b10.004) | 0.935 (\u00b10.003) | 0.831 (\u00b10.002) | 0.864 (\u00b10.007) | 0.849 (\u00b10.007)
DINO ViT-B8 | ImageNet | 0.710 (\u00b10.007) | 0.939 (\u00b10.001) | 0.814 (\u00b10.003) | 0.870 (\u00b10.003) | 0.856 (\u00b10.004)
DINOv2 ViT-L14 | ImageNet | 0.707 (\u00b10.008) | 0.916 (\u00b10.002) | 0.832 (\u00b10.003) | 0.873 (\u00b10.001) | 0.888 (\u00b10.001)
Lunit - ViT-S16 | TCGA | 0.801 (\u00b10.005) | 0.934 (\u00b10.001) | 0.768 (\u00b10.004) | 0.889 (\u00b10.002) | 0.895 (\u00b10.006)
Owkin - iBOT ViT-B16 | TCGA | 0.725 (\u00b10.004) | 0.935 (\u00b10.001) | 0.777 (\u00b10.005) | 0.912 (\u00b10.002) | 0.915 (\u00b10.003)
UNI - DINOv2 ViT-L16 | Mass-100k | 0.814 (\u00b10.008) | 0.950 (\u00b10.001) | 0.837 (\u00b10.001) | 0.936 (\u00b10.001) | 0.938 (\u00b10.001)
kaiko.ai - DINO ViT-S16 | TCGA | 0.797 (\u00b10.003) | 0.943 (\u00b10.001) | 0.828 (\u00b10.003) | 0.903 (\u00b10.001) | 0.893 (\u00b10.005)
kaiko.ai - DINO ViT-S8 | TCGA | 0.834 (\u00b10.012) | 0.946 (\u00b10.002) | 0.832 (\u00b10.006) | 0.897 (\u00b10.001) | 0.887 (\u00b10.002)
kaiko.ai - DINO ViT-B16 | TCGA | 0.810 (\u00b10.008) | 0.960 (\u00b10.001) | 0.826 (\u00b10.003) | 0.900 (\u00b10.002) | 0.898 (\u00b10.003)
kaiko.ai - DINO ViT-B8 | TCGA | 0.865 (\u00b10.019) | 0.956 (\u00b10.001) | 0.809 (\u00b10.021) | 0.913 (\u00b10.001) | 0.921 (\u00b10.002)
kaiko.ai - DINOv2 ViT-L14 | TCGA | 0.870 (\u00b10.005) | 0.930 (\u00b10.001) | 0.809 (\u00b10.001) | 0.908 (\u00b10.001) | 0.898 (\u00b10.002)

The runs use the default setup described in the section below.

eva trains the decoder on the \"train\" split and uses the \"validation\" split for monitoring, early stopping and checkpoint selection. Evaluation results are reported on the \"validation\" split and, if available, on the \"test\" split.

For more details on the FM-backbones and instructions to replicate the results, check out Replicate evaluations.

"},{"location":"#evaluation-setup","title":"Evaluation setup","text":"

Note that the current version of eva implements a fixed, task- and model-independent default setup, following the standard evaluation protocol proposed by [1] and described in the table below. We selected this approach to prioritize reliable, robust and fair FM evaluation while staying in line with the common literature. In future versions, we additionally plan to support cross-validation and hyper-parameter tuning to find the optimal setup for the best possible performance on the implemented downstream tasks.

Given a provided FM, eva computes embeddings for all input images (WSI patches), which are then used to train a downstream head consisting of a single linear layer in a supervised setup for each of the benchmark datasets. We use early stopping with a patience of 5% of the maximal number of epochs.

Backbone | frozen
Hidden layers | none
Dropout | 0.0
Activation function | none
Number of steps | 12,500
Base batch size | 4,096
Batch size | dataset specific*
Base learning rate | 0.01
Learning rate | [Base learning rate] * [Batch size] / [Base batch size]
Max epochs | [Number of steps] * [Batch size] / [Number of samples]
Early stopping | 5% * [Max epochs]
Optimizer | SGD
Momentum | 0.9
Weight decay | 0.0
Nesterov momentum | true
LR schedule | Cosine without warmup

* For smaller datasets (e.g. BACH with 400 samples) we reduce the batch size to 256 and scale the learning rate accordingly.
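
As a worked example of these scaling rules, the snippet below computes the learning rate, maximum number of epochs and early-stopping patience for PatchCamelyon (262,144 training samples, see the dataset pages below) under the default setup:

# Default evaluation setup values from the table above.\nbase_learning_rate = 0.01\nbase_batch_size = 4_096\nnumber_of_steps = 12_500\n\n# Example: PatchCamelyon with 262,144 training samples and the default batch size.\nbatch_size = 4_096\nnumber_of_samples = 262_144\n\nlearning_rate = base_learning_rate * batch_size / base_batch_size\nmax_epochs = number_of_steps * batch_size / number_of_samples\nearly_stopping_patience = 0.05 * max_epochs\n\nprint(learning_rate)                    # 0.01\nprint(round(max_epochs))                # ~195 epochs\nprint(round(early_stopping_patience))   # patience of ~10 epochs\n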

"},{"location":"#license","title":"License","text":"

eva is distributed under the terms of the Apache-2.0 license.

"},{"location":"#next-steps","title":"Next steps","text":"

Check out the User Guide to get started with eva.

"},{"location":"CODE_OF_CONDUCT/","title":"Contributor Covenant Code of Conduct","text":""},{"location":"CODE_OF_CONDUCT/#our-pledge","title":"Our Pledge","text":"

In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation.

"},{"location":"CODE_OF_CONDUCT/#our-standards","title":"Our Standards","text":"

Examples of behavior that contributes to creating a positive environment include:

Examples of unacceptable behavior by participants include:

"},{"location":"CODE_OF_CONDUCT/#our-responsibilities","title":"Our Responsibilities","text":"

Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior.

Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful.

"},{"location":"CODE_OF_CONDUCT/#scope","title":"Scope","text":"

This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers.

"},{"location":"CODE_OF_CONDUCT/#enforcement","title":"Enforcement","text":"

Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project team at eva@kaiko.ai. All complaints will be reviewed and investigated and will result in a response that is deemed necessary and appropriate to the circumstances. The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately.

Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project's leadership.

"},{"location":"CODE_OF_CONDUCT/#attribution","title":"Attribution","text":"

This Code of Conduct is adapted from the Contributor Covenant, version 1.4, available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html

For answers to common questions about this code of conduct, see https://www.contributor-covenant.org/faq

"},{"location":"CONTRIBUTING/","title":"Contributing to eva","text":"

eva is open source and community contributions are welcome!

"},{"location":"CONTRIBUTING/#contribution-process","title":"Contribution Process","text":""},{"location":"CONTRIBUTING/#github-issues","title":"GitHub Issues","text":"

The eva contribution process generally starts with filing a GitHub issue.

eva defines four categories of issues: feature requests, bug reports, documentation fixes, and installation issues. In general, we recommend waiting for feedback from an eva maintainer or community member before proceeding to implement a feature or patch.

"},{"location":"CONTRIBUTING/#pull-requests","title":"Pull Requests","text":"

After you have agreed upon an implementation strategy for your feature or patch with an eva maintainer, the next step is to introduce your changes as a pull request against the eva repository.

Steps to make a pull request:

Once your pull request has been merged, your changes will be automatically included in the next eva release!

"},{"location":"DEVELOPER_GUIDE/","title":"Developer Guide","text":""},{"location":"DEVELOPER_GUIDE/#setting-up-a-dev-environment","title":"Setting up a DEV environment","text":"

We use PDM as a package and dependency manager. You can set up a local Python environment for development as follows: 1. Install the package and dependency manager PDM following the instructions here. 2. Install the system dependencies - For macOS: brew install cmake - For Linux (Debian): sudo apt-get install build-essential cmake 3. Run pdm install -G dev to install the Python dependencies. This will create a virtual environment in eva/.venv.

"},{"location":"DEVELOPER_GUIDE/#adding-new-dependencies","title":"Adding new dependencies","text":"

Add a new dependency to the core submodule: pdm add <package_name>

Add a new dependency to the vision submodule: pdm add -G vision -G all <package_name>

For more information about managing dependencies please look here.

"},{"location":"DEVELOPER_GUIDE/#continuous-integration-ci","title":"Continuous Integration (CI)","text":"

For testing automation, we use nox.

Installation: - with brew: brew install nox - with pip: pip install --user --upgrade nox (this way, you might need to run nox commands with python -m nox or specify an alias)

Commands: - nox to run all the automation tests. - nox -s fmt to run the code formatting tests. - nox -s lint to run the code linting tests. - nox -s check to run the type-annotation tests. - nox -s test to run the unit tests. - nox -s test -- tests/eva/metrics/test_average_loss.py to run specific tests.

"},{"location":"STYLE_GUIDE/","title":"eva Style Guide","text":"

This document contains our style guides used in eva.

Our priority is consistency, so that developers can quickly ingest and understand the entire codebase without being distracted by style idiosyncrasies.

"},{"location":"STYLE_GUIDE/#general-coding-principles","title":"General coding principles","text":"

Q: How to keep code readable and maintainable? - Don't Repeat Yourself (DRY) - Use the lowest possible visibility for a variable or method (i.e. make private if possible) -- see Information Hiding / Encapsulation

Q: How big should a function be? - Single Level of Abstraction Principle (SLAP) - High Cohesion and Low Coupling

TL;DR: functions should usually be quite small, and _do one thing_\n
"},{"location":"STYLE_GUIDE/#python-style-guide","title":"Python Style Guide","text":"

In general we follow PEP8 and the Google Python Style Guide, and we expect type hints/annotations.

"},{"location":"STYLE_GUIDE/#docstrings","title":"Docstrings","text":"

Our docstring style is derived from Google Python style.

def example_function(variable: int, optional: str | None = None) -> str:\n    \"\"\"An example docstring that explains what this function does.\n\n    Docs sections can be referenced via :ref:`custom text here <anchor-link>`.\n\n    Classes can be referenced via :class:`eva.data.datamodules.DataModule`.\n\n    Functions can be referenced via :func:`eva.data.datamodules.call.call_method_if_exists`.\n\n    Example:\n\n        >>> from torch import nn\n        >>> import eva\n        >>> eva.models.modules.HeadModule(\n        >>>     head=nn.Linear(10, 2),\n        >>>     criterion=nn.CrossEntropyLoss(),\n        >>> )\n\n    Args:\n        variable: A required argument.\n        optional: An optional argument.\n\n    Returns:\n        A description of the output string.\n    \"\"\"\n    pass\n
"},{"location":"STYLE_GUIDE/#module-docstrings","title":"Module docstrings","text":"

PEP-8 and PEP-257 indicate docstrings should have very specific syntax:

\"\"\"One line docstring that shouldn't wrap onto next line.\"\"\"\n
\"\"\"First line of multiline docstring that shouldn't wrap.\n\nSubsequent line or paragraphs.\n\"\"\"\n
"},{"location":"STYLE_GUIDE/#constants-docstrings","title":"Constants docstrings","text":"

Public constants should usually have docstrings; they are optional for private constants. Docstrings on constants go underneath the constant:

SOME_CONSTANT = 3\n\"\"\"Either a single-line docstring or multiline as per above.\"\"\"\n
"},{"location":"STYLE_GUIDE/#function-docstrings","title":"Function docstrings","text":"

All public functions should have docstrings following the pattern shown below.

Each section can be omitted if there are no inputs, outputs, or no notable exceptions raised, respectively.

def fake_datamodule(\n    n_samples: int, random: bool = True\n) -> eva.data.datamodules.DataModule:\n    \"\"\"Generates a fake DataModule.\n\n    It builds a :class:`eva.data.datamodules.DataModule` by generating\n    a fake dataset with generated data while fixing the seed. It can\n    be useful for debugging purposes.\n\n    Args:\n        n_samples: The number of samples of the generated datasets.\n        random: Whether to generated randomly.\n\n    Returns:\n        A :class:`eva.data.datamodules.DataModule` with generated random data.\n\n    Raises:\n        ValueError: If `n_samples` is `0`.\n    \"\"\"\n    pass\n
"},{"location":"STYLE_GUIDE/#class-docstrings","title":"Class docstrings","text":"

All public classes should have class docstrings following the pattern shown below.

class DataModule(pl.LightningDataModule):\n    \"\"\"DataModule encapsulates all the steps needed to process data.\n\n    It will initialize and create the mapping between dataloaders and\n    datasets. During the `prepare_data`, `setup` and `teardown`, the\n    datamodule will call the respectively methods from all the datasets,\n    given that they are defined.\n    \"\"\"\n\n    def __init__(\n        self,\n        datasets: schemas.DatasetsSchema | None = None,\n        dataloaders: schemas.DataloadersSchema | None = None,\n    ) -> None:\n        \"\"\"Initializes the datamodule.\n\n        Args:\n            datasets: The desired datasets. Defaults to `None`.\n            dataloaders: The desired dataloaders. Defaults to `None`.\n        \"\"\"\n        pass\n
"},{"location":"datasets/","title":"Datasets","text":"

eva provides native support for several public datasets. Where possible, the corresponding dataset classes facilitate automatic download to disk; where that is not possible, this documentation provides download instructions.

"},{"location":"datasets/#vision-datasets-overview","title":"Vision Datasets Overview","text":""},{"location":"datasets/#whole-slide-wsi-and-microscopy-image-datasets","title":"Whole Slide (WSI) and microscopy image datasets","text":"Dataset #Patches Patch Size Magnification (\u03bcm/px) Task Cancer Type BACH 400 2048x1536 20x (0.5) Classification (4 classes) Breast CRC 107,180 224x224 20x (0.5) Classification (9 classes) Colorectal PatchCamelyon 327,680 96x96 10x (1.0) * Classification (2 classes) Breast MHIST 3,152 224x224 5x (2.0) * Classification (2 classes) Colorectal Polyp

* Downsampled from 40x (0.25 \u03bcm/px) to increase the field of view.

"},{"location":"datasets/#radiology-datasets","title":"Radiology datasets","text":"Dataset #Images Image Size Task Download provided TotalSegmentator 1228 ~300 x ~300 x ~350 * Multilabel Classification (117 classes) Yes

* 3D images of varying sizes

"},{"location":"datasets/bach/","title":"BACH","text":"

The BACH dataset consists of microscopy and WSI images, of which we use only the microscopy images. These are 408 labeled images from 4 classes (\"Normal\", \"Benign\", \"Invasive\", \"InSitu\"). This dataset was used for the \"BACH Grand Challenge on Breast Cancer Histology images\".

"},{"location":"datasets/bach/#raw-data","title":"Raw data","text":""},{"location":"datasets/bach/#key-stats","title":"Key stats","text":"Modality Vision (microscopy images) Task Multiclass classification (4 classes) Cancer type Breast Data size total: 10.4GB / data in use: 7.37 GB (18.9 MB per image) Image dimension 1536 x 2048 x 3 Magnification (\u03bcm/px) 20x (0.42) Files format .tif images Number of images 408 (102 from each class) Splits in use one labeled split"},{"location":"datasets/bach/#organization","title":"Organization","text":"

The data ICIAR2018_BACH_Challenge.zip from zenodo is organized as follows:

ICAR2018_BACH_Challenge\n\u251c\u2500\u2500 Photos                    # All labeled patches used by eva\n\u2502   \u251c\u2500\u2500 Normal\n\u2502   \u2502   \u251c\u2500\u2500 n032.tif\n\u2502   \u2502   \u2514\u2500\u2500 ...\n\u2502   \u251c\u2500\u2500 Benign\n\u2502   \u2502   \u2514\u2500\u2500 ...\n\u2502   \u251c\u2500\u2500 Invasive\n\u2502   \u2502   \u2514\u2500\u2500 ...\n\u2502   \u251c\u2500\u2500 InSitu\n\u2502   \u2502   \u2514\u2500\u2500 ...\n\u251c\u2500\u2500 WSI                       # WSIs, not in use\n\u2502   \u251c\u2500\u2500 ...\n\u2514\u2500\u2500 ...\n
"},{"location":"datasets/bach/#download-and-preprocessing","title":"Download and preprocessing","text":"

The BACH dataset class supports downloading the data during runtime by setting the init argument download=True.

Note that in the provided BACH-config files the download argument is set to false. To enable automatic download you will need to open the config and set download: true.

The splits are created from the indices specified in the BACH dataset class. These indices were picked to prevent data leakage due to images belonging to the same patient. Because the small dataset size in combination with the patient ID constraint does not allow a three-way split with a sufficient amount of data in each split, we only create train and validation splits and leave it to the user to submit predictions on the official test split to the BACH Challenge Leaderboard.

Splits Train Validation #Samples 268 (67%) 132 (33%)"},{"location":"datasets/bach/#relevant-links","title":"Relevant links","text":""},{"location":"datasets/bach/#license","title":"License","text":"

Attribution-NonCommercial-ShareAlike 4.0 International

"},{"location":"datasets/crc/","title":"CRC","text":"

The CRC-HE dataset consists of labeled patches (9 classes) from colorectal cancer (CRC) and normal tissue. We use the NCT-CRC-HE-100K dataset for training and the CRC-VAL-HE-7K dataset for validation (see the Splits section below).

The NCT-CRC-HE-100K-NONORM consists of 100,000 images without applied color normalization. The CRC-VAL-HE-7K consists of 7,180 image patches from 50 patients without overlap with NCT-CRC-HE-100K-NONORM.

The tissue classes are: Adipose (ADI), background (BACK), debris (DEB), lymphocytes (LYM), mucus (MUC), smooth muscle (MUS), normal colon mucosa (NORM), cancer-associated stroma (STR) and colorectal adenocarcinoma epithelium (TUM).

"},{"location":"datasets/crc/#raw-data","title":"Raw data","text":""},{"location":"datasets/crc/#key-stats","title":"Key stats","text":"Modality Vision (WSI patches) Task Multiclass classification (9 classes) Cancer type Colorectal Data size total: 11.7GB (train), 800MB (val) Image dimension 224 x 224 x 3 Magnification (\u03bcm/px) 20x (0.5) Files format .tif images Number of images 107,180 (100k train, 7.2k val) Splits in use NCT-CRC-HE-100K (train), CRC-VAL-HE-7K (val)"},{"location":"datasets/crc/#splits","title":"Splits","text":"

We use the splits according to the data sources:

Splits Train Validation #Samples 100,000 (93.3%) 7,180 (6.7%)

A test split is not provided. Because the patient information for the training data is not available, dividing the training data into a train/val split (and using the given val split as a test split) is not possible without risking data leakage. eva therefore reports evaluation results for CRC HE on the validation split.

"},{"location":"datasets/crc/#organization","title":"Organization","text":"

The data NCT-CRC-HE-100K.zip, NCT-CRC-HE-100K-NONORM.zip and CRC-VAL-HE-7K.zip from zenodo are organized as follows:

NCT-CRC-HE-100K                # All images used for training\n\u251c\u2500\u2500 ADI                        # All labeled patches belonging to the 1st class\n\u2502   \u251c\u2500\u2500 ADI-AAAFLCLY.tif\n\u2502   \u251c\u2500\u2500 ...\n\u251c\u2500\u2500 BACK                       # All labeled patches belonging to the 2nd class\n\u2502   \u251c\u2500\u2500 ...\n\u2514\u2500\u2500 ...\n\nNCT-CRC-HE-100K-NONORM         # All images used for training\n\u251c\u2500\u2500 ADI                        # All labeled patches belonging to the 1st class\n\u2502   \u251c\u2500\u2500 ADI-AAAFLCLY.tif\n\u2502   \u251c\u2500\u2500 ...\n\u251c\u2500\u2500 BACK                       # All labeled patches belonging to the 2nd class\n\u2502   \u251c\u2500\u2500 ...\n\u2514\u2500\u2500 ...\n\nCRC-VAL-HE-7K                  # All images used for validation\n\u251c\u2500\u2500 ...                        # identical structure as for NCT-CRC-HE-100K-NONORM\n\u2514\u2500\u2500 ...\n
"},{"location":"datasets/crc/#download-and-preprocessing","title":"Download and preprocessing","text":"

The CRC dataset class supports downloading the data during runtime by setting the init argument download=True.

Note that in the provided CRC-config files the download argument is set to false. To enable automatic download you will need to open the config and set download: true.

"},{"location":"datasets/crc/#relevant-links","title":"Relevant links","text":""},{"location":"datasets/crc/#license","title":"License","text":"

CC BY 4.0 LEGAL CODE

"},{"location":"datasets/mhist/","title":"MHIST","text":"

MHIST is a binary classification task comprising 3,152 hematoxylin and eosin (H&E)-stained, formalin-fixed paraffin-embedded (FFPE) fixed-size images (224 by 224 pixels) of colorectal polyps from the Department of Pathology and Laboratory Medicine at Dartmouth-Hitchcock Medical Center (DHMC).

The tissue classes are: Hyperplastic Polyp (HP) and Sessile Serrated Adenoma (SSA). This classification task focuses on the clinically important binary distinction between HPs and SSAs, a challenging problem with considerable inter-pathologist variability. HPs are typically benign, while SSAs are precancerous lesions that can turn into cancer if left untreated and therefore require earlier follow-up examinations. Histologically, HPs have a superficial serrated architecture and elongated crypts, whereas SSAs are characterized by broad-based crypts, often with complex structure and heavy serration.

"},{"location":"datasets/mhist/#raw-data","title":"Raw data","text":""},{"location":"datasets/mhist/#key-stats","title":"Key stats","text":"Modality Vision (WSI patches) Task Binary classification (2 classes) Cancer type Colorectal Polyp Data size 354 MB Image dimension 224 x 224 x 3 Magnification (\u03bcm/px) 5x (2.0) * Files format .png images Number of images 3,152 (2,175 train, 977 test) Splits in use annotations.csv (train / test)

* Downsampled from 40x to increase the field of view.

"},{"location":"datasets/mhist/#organization","title":"Organization","text":"

The contents from images.zip and the file annotations.csv from bmirds are organized as follows:

mhist                           # Root folder\n\u251c\u2500\u2500 images                      # All the dataset images\n\u2502   \u251c\u2500\u2500 MHIST_aaa.png\n\u2502   \u251c\u2500\u2500 MHIST_aab.png\n\u2502   \u251c\u2500\u2500 ...\n\u2514\u2500\u2500 annotations.csv             # The dataset annotations file\n
"},{"location":"datasets/mhist/#download-and-preprocessing","title":"Download and preprocessing","text":"

To download the dataset, please visit the access portal on BMIRDS and follow the instructions. You will then receive an email with all the relevant links to download the data (images.zip, annotations.csv, Dataset Research Use Agreement.pdf and MD5SUMs.txt).

Please create a root folder, e.g. mhist, and download all the files into it, then unzip the contents of images.zip into a directory named images inside your root folder (i.e. mhist/images). Afterwards, you can (optionally) delete the images.zip file.

"},{"location":"datasets/mhist/#splits","title":"Splits","text":"

We work with the splits provided by the data source. Since no \"validation\" split is provided, we use the \"test\" split as the validation split.

Splits Train Validation #Samples 2,175 (69%) 977 (31%)"},{"location":"datasets/mhist/#relevant-links","title":"Relevant links","text":""},{"location":"datasets/patch_camelyon/","title":"PatchCamelyon","text":"

The PatchCamelyon benchmark is an image classification dataset with 327,680 color images (96 x 96px) extracted from histopathologic scans of lymph node sections. Each image is annotated with a binary label indicating presence of metastatic tissue.

"},{"location":"datasets/patch_camelyon/#raw-data","title":"Raw data","text":""},{"location":"datasets/patch_camelyon/#key-stats","title":"Key stats","text":"Modality Vision (WSI patches) Task Binary classification Cancer type Breast Data size 8 GB Image dimension 96 x 96 x 3 Magnification (\u03bcm/px) 10x (1.0) * Files format h5 Number of images 327,680 (50% of each class)

* The slides were acquired and digitized at 2 different medical centers using a 40x objective but under-sampled to 10x to increase the field of view.

"},{"location":"datasets/patch_camelyon/#splits","title":"Splits","text":"

The data source provides train/validation/test splits.

Splits Train Validation Test #Samples 262,144 (80%) 32,768 (10%) 32,768 (10%)"},{"location":"datasets/patch_camelyon/#organization","title":"Organization","text":"

The PatchCamelyon data from zenodo is organized as follows:

\u251c\u2500\u2500 camelyonpatch_level_2_split_train_x.h5.gz               # train images\n\u251c\u2500\u2500 camelyonpatch_level_2_split_train_y.h5.gz               # train labels\n\u251c\u2500\u2500 camelyonpatch_level_2_split_valid_x.h5.gz               # val images\n\u251c\u2500\u2500 camelyonpatch_level_2_split_valid_y.h5.gz               # val labels\n\u251c\u2500\u2500 camelyonpatch_level_2_split_test_x.h5.gz                # test images\n\u251c\u2500\u2500 camelyonpatch_level_2_split_test_y.h5.gz                # test labels\n
"},{"location":"datasets/patch_camelyon/#download-and-preprocessing","title":"Download and preprocessing","text":"

The dataset class PatchCamelyon supports downloading the data during runtime by setting the init argument download=True.

Note that in the provided PatchCamelyon-config files the download argument is set to false. To enable automatic download you will need to open the config and set download: true.

Labels are provided by source files, splits are given by file names.
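
For illustration, instantiating the dataset class with automatic download enabled might look as follows; the import path and argument names are assumptions based on the Reference API conventions, so refer to the provided PatchCamelyon config files for the exact usage.

# Hypothetical usage sketch; check the provided PatchCamelyon config files for\n# the exact module path and arguments used by eva.\nfrom eva.vision.data import datasets as vision_datasets\n\ntrain_split = vision_datasets.PatchCamelyon(\n    root=\"./data/patch_camelyon\",  # where the .h5.gz files are stored\n    split=\"train\",\n    download=True,  # set to False if the files were downloaded manually\n)\ntrain_split.prepare_data()  # downloads/prepares the data (normally called by the DataModule)\n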

"},{"location":"datasets/patch_camelyon/#relevant-links","title":"Relevant links","text":""},{"location":"datasets/patch_camelyon/#citation","title":"Citation","text":"
@misc{b_s_veeling_j_linmans_j_winkens_t_cohen_2018_2546921,\n  author       = {B. S. Veeling, J. Linmans, J. Winkens, T. Cohen, M. Welling},\n  title        = {Rotation Equivariant CNNs for Digital Pathology},\n  month        = sep,\n  year         = 2018,\n  doi          = {10.1007/978-3-030-00934-2_24},\n  url          = {https://doi.org/10.1007/978-3-030-00934-2_24}\n}\n
"},{"location":"datasets/patch_camelyon/#license","title":"License","text":"

Creative Commons Zero v1.0 Universal

"},{"location":"datasets/total_segmentator/","title":"TotalSegmentator","text":"

The TotalSegmentator dataset is a radiology image-segmentation dataset with 1228 3D images and corresponding masks for 117 different anatomical structures. It can be used for segmentation and multilabel classification tasks.

"},{"location":"datasets/total_segmentator/#raw-data","title":"Raw data","text":""},{"location":"datasets/total_segmentator/#key-stats","title":"Key stats","text":"Modality Vision (radiology, CT scans) Task Segmentation / multilabel classification (117 classes) Data size total: 23.6GB Image dimension ~300 x ~300 x ~350 (number of slices) x 1 (grey scale) * Files format .nii (\"NIFTI\") images Number of images 1228 Splits in use one labeled split

* Image resolution and number of slices per image vary.

"},{"location":"datasets/total_segmentator/#organization","title":"Organization","text":"

The data Totalsegmentator_dataset_v201.zip from zenodo is organized as follows:

Totalsegmentator_dataset_v201\n\u251c\u2500\u2500 s0011                               # one image\n\u2502   \u251c\u2500\u2500 ct.nii.gz                       # CT scan\n\u2502   \u251c\u2500\u2500 segmentations                   # directory with segmentation masks\n\u2502   \u2502   \u251c\u2500\u2500 adrenal_gland_left.nii.gz   # segmentation mask 1st anatomical structure\n\u2502   \u2502   \u251c\u2500\u2500 adrenal_gland_right.nii.gz  # segmentation mask 2nd anatomical structure\n\u2502   \u2502   \u2514\u2500\u2500 ...\n\u2514\u2500\u2500 ...\n
"},{"location":"datasets/total_segmentator/#download-and-preprocessing","title":"Download and preprocessing","text":" Splits Train Validation Test #Samples 737 (60%) 246 (20%) 245 (20%)"},{"location":"datasets/total_segmentator/#relevant-links","title":"Relevant links","text":""},{"location":"datasets/total_segmentator/#license","title":"License","text":"

Creative Commons Attribution 4.0 International

"},{"location":"reference/","title":"Reference API","text":"

Here is the Reference API, describing the classes, functions, parameters and attributes of the eva package.

To learn how to use eva, however, it's best to get started with the User Guide.

"},{"location":"reference/core/callbacks/","title":"Callbacks","text":""},{"location":"reference/core/callbacks/#writers","title":"Writers","text":""},{"location":"reference/core/callbacks/#eva.core.callbacks.writers.EmbeddingsWriter","title":"eva.core.callbacks.writers.EmbeddingsWriter","text":"

Bases: BasePredictionWriter

Callback for writing generated embeddings to disk.

This callback writes the embedding files in a separate process to avoid blocking the main process where the model forward pass is executed.

Parameters:

Name Type Description Default output_dir str

The directory where the embeddings will be saved.

required backbone Module | None

A model to be used as feature extractor. If None, it will be expected that the input batch returns the features directly.

None dataloader_idx_map Dict[int, str] | None

A dictionary mapping dataloader indices to their respective names (e.g. train, val, test).

None group_key str | None

The metadata key to group the embeddings by. If specified, the embedding files will be saved in subdirectories named after the group_key. If specified, the key must be present in the metadata of the input batch.

None overwrite bool

Whether to overwrite the output directory. Defaults to True.

True Source code in src/eva/core/callbacks/writers/embeddings.py
def __init__(\n    self,\n    output_dir: str,\n    backbone: nn.Module | None = None,\n    dataloader_idx_map: Dict[int, str] | None = None,\n    group_key: str | None = None,\n    overwrite: bool = True,\n) -> None:\n    \"\"\"Initializes a new EmbeddingsWriter instance.\n\n    This callback writes the embedding files in a separate process to avoid blocking the\n    main process where the model forward pass is executed.\n\n    Args:\n        output_dir: The directory where the embeddings will be saved.\n        backbone: A model to be used as feature extractor. If `None`,\n            it will be expected that the input batch returns the features directly.\n        dataloader_idx_map: A dictionary mapping dataloader indices to their respective\n            names (e.g. train, val, test).\n        group_key: The metadata key to group the embeddings by. If specified, the\n            embedding files will be saved in subdirectories named after the group_key.\n            If specified, the key must be present in the metadata of the input batch.\n        overwrite: Whether to overwrite the output directory. Defaults to True.\n    \"\"\"\n    super().__init__(write_interval=\"batch\")\n\n    self._output_dir = output_dir\n    self._backbone = backbone\n    self._dataloader_idx_map = dataloader_idx_map or {}\n    self._group_key = group_key\n    self._overwrite = overwrite\n\n    self._write_queue: multiprocessing.Queue\n    self._write_process: eva_multiprocessing.Process\n
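
For illustration, a hypothetical configuration of this callback could look as follows (the output directory and backbone are placeholders):

from torch import nn\n\nfrom eva.core.callbacks import writers\n\n# `backbone` is a placeholder feature extractor; with `backbone=None` the input\n# batches are expected to contain the embeddings directly.\nbackbone = nn.Identity()\n\nembeddings_writer = writers.EmbeddingsWriter(\n    output_dir=\"./data/embeddings/my_run\",\n    backbone=backbone,\n    dataloader_idx_map={0: \"train\", 1: \"val\", 2: \"test\"},\n    overwrite=True,\n)\n# The writer is then passed to the trainer's list of callbacks.\n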
"},{"location":"reference/core/interface/","title":"Interface API","text":"

Reference information for the Interface API.

"},{"location":"reference/core/interface/#eva.Interface","title":"eva.Interface","text":"

A high-level interface for training and validating a machine learning model.

This class provides a convenient interface to connect a model, data, and trainer to train and validate a model.

"},{"location":"reference/core/interface/#eva.Interface.fit","title":"fit","text":"

Perform model training and evaluation out-of-place.

This method uses the specified trainer to fit the model using the provided data.

Example use cases:

Parameters:

Name Type Description Default trainer Trainer

The base trainer to use but not modify.

required model ModelModule

The model module to use but not modify.

required data DataModule

The data module.

required Source code in src/eva/core/interface/interface.py
def fit(\n    self,\n    trainer: eva_trainer.Trainer,\n    model: modules.ModelModule,\n    data: datamodules.DataModule,\n) -> None:\n    \"\"\"Perform model training and evaluation out-of-place.\n\n    This method uses the specified trainer to fit the model using the provided data.\n\n    Example use cases:\n\n    - Using a model consisting of a frozen backbone and a head, the backbone will generate\n      the embeddings on the fly which are then used as input features to train the head on\n      the downstream task specified by the given dataset.\n    - Fitting only the head network using a dataset that loads pre-computed embeddings.\n\n    Args:\n        trainer: The base trainer to use but not modify.\n        model: The model module to use but not modify.\n        data: The data module.\n    \"\"\"\n    trainer.run_evaluation_session(model=model, datamodule=data)\n
"},{"location":"reference/core/interface/#eva.Interface.predict","title":"predict","text":"

Perform model prediction out-of-place.

This method performs inference with a pre-trained foundation model to compute embeddings.

Parameters:

Name Type Description Default trainer Trainer

The base trainer to use but not modify.

required model ModelModule

The model module to use but not modify.

required data DataModule

The data module.

required Source code in src/eva/core/interface/interface.py
def predict(\n    self,\n    trainer: eva_trainer.Trainer,\n    model: modules.ModelModule,\n    data: datamodules.DataModule,\n) -> None:\n    \"\"\"Perform model prediction out-of-place.\n\n    This method performs inference with a pre-trained foundation model to compute embeddings.\n\n    Args:\n        trainer: The base trainer to use but not modify.\n        model: The model module to use but not modify.\n        data: The data module.\n    \"\"\"\n    eva_trainer.infer_model(\n        base_trainer=trainer,\n        base_model=model,\n        datamodule=data,\n        return_predictions=False,\n    )\n
"},{"location":"reference/core/interface/#eva.Interface.predict_fit","title":"predict_fit","text":"

Combines the predict and fit commands in one method.

This method performs the following two steps: 1. predict: perform inference with a pre-trained foundation model to compute embeddings. 2. fit: train the head network using the embeddings generated in step 1.

Parameters:

Name Type Description Default trainer Trainer

The base trainer to use but not modify.

required model ModelModule

The model module to use but not modify.

required data DataModule

The data module.

required Source code in src/eva/core/interface/interface.py
def predict_fit(\n    self,\n    trainer: eva_trainer.Trainer,\n    model: modules.ModelModule,\n    data: datamodules.DataModule,\n) -> None:\n    \"\"\"Combines the predict and fit commands in one method.\n\n    This method performs the following two steps:\n    1. predict: perform inference with a pre-trained foundation model to compute embeddings.\n    2. fit: training the head network using the embeddings generated in step 1.\n\n    Args:\n        trainer: The base trainer to use but not modify.\n        model: The model module to use but not modify.\n        data: The data module.\n    \"\"\"\n    self.predict(trainer=trainer, model=model, data=data)\n    self.fit(trainer=trainer, model=model, data=data)\n
"},{"location":"reference/core/data/dataloaders/","title":"Dataloaders","text":"

Reference information for the Dataloader classes.

"},{"location":"reference/core/data/dataloaders/#eva.data.DataLoader","title":"eva.data.DataLoader dataclass","text":"

The DataLoader combines a dataset and a sampler.

It provides an iterable over the given dataset.

"},{"location":"reference/core/data/dataloaders/#eva.data.DataLoader.batch_size","title":"batch_size: int | None = 1 class-attribute instance-attribute","text":"

How many samples per batch to load.

Set to None for iterable datasets where the dataset produces batches.

"},{"location":"reference/core/data/dataloaders/#eva.data.DataLoader.shuffle","title":"shuffle: bool = False class-attribute instance-attribute","text":"

Whether to shuffle the data at every epoch.

"},{"location":"reference/core/data/dataloaders/#eva.data.DataLoader.sampler","title":"sampler: samplers.Sampler | None = None class-attribute instance-attribute","text":"

Defines the strategy to draw samples from the dataset.

Can be any Iterable with __len__ implemented. If specified, shuffle must not be specified.

"},{"location":"reference/core/data/dataloaders/#eva.data.DataLoader.batch_sampler","title":"batch_sampler: samplers.Sampler | None = None class-attribute instance-attribute","text":"

Like sampler, but returns a batch of indices at a time.

Mutually exclusive with batch_size, shuffle, sampler and drop_last.

"},{"location":"reference/core/data/dataloaders/#eva.data.DataLoader.num_workers","title":"num_workers: int = multiprocessing.cpu_count() class-attribute instance-attribute","text":"

How many workers to use for loading the data.

By default, it will use the number of CPUs available.

"},{"location":"reference/core/data/dataloaders/#eva.data.DataLoader.collate_fn","title":"collate_fn: Callable | None = None class-attribute instance-attribute","text":"

The batching process.

"},{"location":"reference/core/data/dataloaders/#eva.data.DataLoader.pin_memory","title":"pin_memory: bool = True class-attribute instance-attribute","text":"

Will copy Tensors into CUDA pinned memory before returning them.

"},{"location":"reference/core/data/dataloaders/#eva.data.DataLoader.drop_last","title":"drop_last: bool = False class-attribute instance-attribute","text":"

Drops the last incomplete batch.

"},{"location":"reference/core/data/dataloaders/#eva.data.DataLoader.persistent_workers","title":"persistent_workers: bool = True class-attribute instance-attribute","text":"

Will keep the worker processes after a dataset has been consumed once.

"},{"location":"reference/core/data/dataloaders/#eva.data.DataLoader.prefetch_factor","title":"prefetch_factor: int | None = 2 class-attribute instance-attribute","text":"

Number of batches loaded in advance by each worker.
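
Putting the attributes above together, a typical instantiation might look like the following sketch (the values shown are illustrative, not defaults recommended by eva):

from eva import data\n\ndataloader = data.DataLoader(\n    batch_size=256,   # samples per batch\n    shuffle=True,     # reshuffle at every epoch\n    num_workers=8,    # instead of the default (number of available CPUs)\n    pin_memory=True,\n    drop_last=False,\n    prefetch_factor=2,\n)\n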

"},{"location":"reference/core/data/datamodules/","title":"Datamodules","text":"

Reference information for the Datamodule classes and functions.

"},{"location":"reference/core/data/datamodules/#eva.data.DataModule","title":"eva.data.DataModule","text":"

Bases: LightningDataModule

DataModule encapsulates all the steps needed to process data.

It will initialize and create the mapping between dataloaders and datasets. During the prepare_data, setup and teardown, the datamodule will call the respective methods from all datasets, given that they are defined.

Parameters:

Name Type Description Default datasets DatasetsSchema | None

The desired datasets.

None dataloaders DataloadersSchema | None

The desired dataloaders.

None Source code in src/eva/core/data/datamodules/datamodule.py
def __init__(\n    self,\n    datasets: schemas.DatasetsSchema | None = None,\n    dataloaders: schemas.DataloadersSchema | None = None,\n) -> None:\n    \"\"\"Initializes the datamodule.\n\n    Args:\n        datasets: The desired datasets.\n        dataloaders: The desired dataloaders.\n    \"\"\"\n    super().__init__()\n\n    self.datasets = datasets or self.default_datasets\n    self.dataloaders = dataloaders or self.default_dataloaders\n
"},{"location":"reference/core/data/datamodules/#eva.data.DataModule.default_datasets","title":"default_datasets: schemas.DatasetsSchema property","text":"

Returns the default datasets.

"},{"location":"reference/core/data/datamodules/#eva.data.DataModule.default_dataloaders","title":"default_dataloaders: schemas.DataloadersSchema property","text":"

Returns the default dataloader schema.

"},{"location":"reference/core/data/datamodules/#eva.data.datamodules.call.call_method_if_exists","title":"eva.data.datamodules.call.call_method_if_exists","text":"

Calls a desired method from the datasets if it exists.

Parameters:

Name Type Description Default objects Iterable[Any]

An iterable of objects.

required method str

The dataset method name to call if exists.

required Source code in src/eva/core/data/datamodules/call.py
def call_method_if_exists(objects: Iterable[Any], /, method: str) -> None:\n    \"\"\"Calls a desired `method` from the datasets if exists.\n\n    Args:\n        objects: An iterable of objects.\n        method: The dataset method name to call if exists.\n    \"\"\"\n    for _object in _recursive_iter(objects):\n        if hasattr(_object, method):\n            fn = getattr(_object, method)\n            fn()\n
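
For example, based on the source shown above, the datamodule-style hook calls could be reproduced manually as in this sketch:

from eva.data.datamodules import call\n\n# `train_dataset` and `val_dataset` are placeholders for any dataset objects;\n# the method is only invoked on objects that actually implement it.\ncall.call_method_if_exists([train_dataset, val_dataset], \"prepare_data\")\ncall.call_method_if_exists([train_dataset, val_dataset], \"setup\")\n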
"},{"location":"reference/core/data/datamodules/#eva.data.datamodules.schemas.DatasetsSchema","title":"eva.data.datamodules.schemas.DatasetsSchema dataclass","text":"

Datasets schema used in DataModule.

"},{"location":"reference/core/data/datamodules/#eva.data.datamodules.schemas.DatasetsSchema.train","title":"train: TRAIN_DATASET = None class-attribute instance-attribute","text":"

Train dataset.

"},{"location":"reference/core/data/datamodules/#eva.data.datamodules.schemas.DatasetsSchema.val","title":"val: EVAL_DATASET = None class-attribute instance-attribute","text":"

Validation dataset.

"},{"location":"reference/core/data/datamodules/#eva.data.datamodules.schemas.DatasetsSchema.test","title":"test: EVAL_DATASET = None class-attribute instance-attribute","text":"

Test dataset.

"},{"location":"reference/core/data/datamodules/#eva.data.datamodules.schemas.DatasetsSchema.predict","title":"predict: EVAL_DATASET = None class-attribute instance-attribute","text":"

Predict dataset.

"},{"location":"reference/core/data/datamodules/#eva.data.datamodules.schemas.DatasetsSchema.tolist","title":"tolist","text":"

Returns the dataclass as a list and optionally filters it given the stage.

Source code in src/eva/core/data/datamodules/schemas.py
def tolist(self, stage: str | None = None) -> List[EVAL_DATASET]:\n    \"\"\"Returns the dataclass as a list and optionally filters it given the stage.\"\"\"\n    match stage:\n        case \"fit\":\n            return [self.train, self.val]\n        case \"validate\":\n            return [self.val]\n        case \"test\":\n            return [self.test]\n        case \"predict\":\n            return [self.predict]\n        case None:\n            return [self.train, self.val, self.test, self.predict]\n        case _:\n            raise ValueError(f\"Invalid stage `{stage}`.\")\n
"},{"location":"reference/core/data/datasets/","title":"Datasets","text":"

Reference information for the Dataset base class.

"},{"location":"reference/core/data/datasets/#eva.core.data.Dataset","title":"eva.core.data.Dataset","text":"

Bases: TorchDataset

Base dataset class.

"},{"location":"reference/core/data/datasets/#eva.core.data.Dataset.prepare_data","title":"prepare_data","text":"

Encapsulates all disk related tasks.

This method is preferred for downloading and preparing the data, for example generating manifest files. If implemented, it will be called via :class:eva.core.data.datamodules.DataModule, which ensures that it is called only within a single process, making it multi-process safe.

Source code in src/eva/core/data/datasets/base.py
def prepare_data(self) -> None:\n    \"\"\"Encapsulates all disk related tasks.\n\n    This method is preferred for downloading and preparing the data, for\n    example generate manifest files. If implemented, it will be called via\n    :class:`eva.core.data.datamodules.DataModule`, which ensures that is called\n    only within a single process, making it multi-processes safe.\n    \"\"\"\n
"},{"location":"reference/core/data/datasets/#eva.core.data.Dataset.setup","title":"setup","text":"

Setups the dataset.

This method is preferred for creating datasets or performing train/val/test splits. If implemented, it will be called via :class:eva.core.data.datamodules.DataModule at the beginning of fit (train + validate), validate, test, or predict and it will be called from every process (i.e. GPU) across all the nodes in DDP.

Source code in src/eva/core/data/datasets/base.py
def setup(self) -> None:\n    \"\"\"Setups the dataset.\n\n    This method is preferred for creating datasets or performing\n    train/val/test splits. If implemented, it will be called via\n    :class:`eva.core.data.datamodules.DataModule` at the beginning of fit\n    (train + validate), validate, test, or predict and it will be called\n    from every process (i.e. GPU) across all the nodes in DDP.\n    \"\"\"\n    self.configure()\n    self.validate()\n
"},{"location":"reference/core/data/datasets/#eva.core.data.Dataset.configure","title":"configure","text":"

Configures the dataset.

This method is preferred to configure the dataset; assign values to attributes, perform splits etc. This would be called from the method ::method::setup, before calling the ::method::validate.

Source code in src/eva/core/data/datasets/base.py
def configure(self):\n    \"\"\"Configures the dataset.\n\n    This method is preferred to configure the dataset; assign values\n    to attributes, perform splits etc. This would be called from the\n    method ::method::`setup`, before calling the ::method::`validate`.\n    \"\"\"\n
"},{"location":"reference/core/data/datasets/#eva.core.data.Dataset.validate","title":"validate","text":"

Validates the dataset.

This method aims to check the integrity of the dataset and verify that it is configured properly. This would be called from the method ::method::setup, after calling the ::method::configure.

Source code in src/eva/core/data/datasets/base.py
def validate(self):\n    \"\"\"Validates the dataset.\n\n    This method aims to check the integrity of the dataset and verify\n    that is configured properly. This would be called from the method\n    ::method::`setup`, after calling the ::method::`configure`.\n    \"\"\"\n
"},{"location":"reference/core/data/datasets/#eva.core.data.Dataset.teardown","title":"teardown","text":"

Cleans up the data artifacts.

Used to clean-up when the run is finished. If implemented, it will be called via :class:eva.core.data.datamodules.DataModule at the end of fit (train + validate), validate, test, or predict and it will be called from every process (i.e. GPU) across all the nodes in DDP.

Source code in src/eva/core/data/datasets/base.py
def teardown(self) -> None:\n    \"\"\"Cleans up the data artifacts.\n\n    Used to clean-up when the run is finished. If implemented, it will\n    be called via :class:`eva.core.data.datamodules.DataModule` at the end\n    of fit (train + validate), validate, test, or predict and it will be\n    called from every process (i.e. GPU) across all the nodes in DDP.\n    \"\"\"\n
"},{"location":"reference/core/data/datasets/#embeddings-datasets","title":"Embeddings datasets","text":""},{"location":"reference/core/data/datasets/#eva.core.data.datasets.EmbeddingsClassificationDataset","title":"eva.core.data.datasets.EmbeddingsClassificationDataset","text":"

Bases: EmbeddingsDataset

Embeddings dataset class for classification tasks.

Expects a manifest file listing the paths of .pt files that contain tensor embeddings of shape [embedding_dim] or [1, embedding_dim].

Parameters:

Name Type Description Default root str

Root directory of the dataset.

required manifest_file str

The path to the manifest file, which is relative to the root argument.

required split Literal['train', 'val', 'test'] | None

The dataset split to use. The split column of the manifest file will be filtered based on this value.

None column_mapping Dict[str, str]

Defines the map between the variables and the manifest columns. It will overwrite the default_column_mapping with the provided values, so that column_mapping can contain only the values which are altered or missing.

default_column_mapping embeddings_transforms Callable | None

A function/transform that transforms the embedding.

None target_transforms Callable | None

A function/transform that transforms the target.

None Source code in src/eva/core/data/datasets/embeddings/classification/embeddings.py
def __init__(\n    self,\n    root: str,\n    manifest_file: str,\n    split: Literal[\"train\", \"val\", \"test\"] | None = None,\n    column_mapping: Dict[str, str] = base.default_column_mapping,\n    embeddings_transforms: Callable | None = None,\n    target_transforms: Callable | None = None,\n) -> None:\n    \"\"\"Initialize dataset.\n\n    Expects a manifest file listing the paths of .pt files that contain\n    tensor embeddings of shape [embedding_dim] or [1, embedding_dim].\n\n    Args:\n        root: Root directory of the dataset.\n        manifest_file: The path to the manifest file, which is relative to\n            the `root` argument.\n        split: The dataset split to use. The `split` column of the manifest\n            file will be splitted based on this value.\n        column_mapping: Defines the map between the variables and the manifest\n            columns. It will overwrite the `default_column_mapping` with\n            the provided values, so that `column_mapping` can contain only the\n            values which are altered or missing.\n        embeddings_transforms: A function/transform that transforms the embedding.\n        target_transforms: A function/transform that transforms the target.\n    \"\"\"\n    super().__init__(\n        root=root,\n        manifest_file=manifest_file,\n        split=split,\n        column_mapping=column_mapping,\n        embeddings_transforms=embeddings_transforms,\n        target_transforms=target_transforms,\n    )\n
"},{"location":"reference/core/data/datasets/#eva.core.data.datasets.MultiEmbeddingsClassificationDataset","title":"eva.core.data.datasets.MultiEmbeddingsClassificationDataset","text":"

Bases: EmbeddingsDataset

Dataset class for cases where a sample corresponds to multiple embeddings.

Example use case: Slide level dataset where each slide has multiple patch embeddings.

Expects a manifest file listing the paths of .pt files containing tensor embeddings.

The manifest must have a column_mapping[\"multi_id\"] column that contains the unique identifier of a group of embeddings. For oncology datasets, this would usually be the slide id. Each row in the manifest file points to a .pt file that can contain one or multiple embeddings. There can also be multiple rows for the same multi_id, in which case the embeddings from the different .pt files corresponding to that same multi_id will be stacked along the first dimension.

Parameters:

Name Type Description Default root str

Root directory of the dataset.

required manifest_file str

The path to the manifest file, which is relative to the root argument.

required split Literal['train', 'val', 'test']

The dataset split to use. The split column of the manifest file will be filtered based on this value.

required column_mapping Dict[str, str]

Defines the map between the variables and the manifest columns. It will overwrite the default_column_mapping with the provided values, so that column_mapping can contain only the values which are altered or missing.

default_column_mapping embeddings_transforms Callable | None

A function/transform that transforms the embedding.

None target_transforms Callable | None

A function/transform that transforms the target.

None Source code in src/eva/core/data/datasets/embeddings/classification/multi_embeddings.py
def __init__(\n    self,\n    root: str,\n    manifest_file: str,\n    split: Literal[\"train\", \"val\", \"test\"],\n    column_mapping: Dict[str, str] = base.default_column_mapping,\n    embeddings_transforms: Callable | None = None,\n    target_transforms: Callable | None = None,\n):\n    \"\"\"Initialize dataset.\n\n    Expects a manifest file listing the paths of `.pt` files containing tensor embeddings.\n\n    The manifest must have a `column_mapping[\"multi_id\"]` column that contains the\n    unique identifier group of embeddings. For oncology datasets, this would be usually\n    the slide id. Each row in the manifest file points to a .pt file that can contain\n    one or multiple embeddings. There can also be multiple rows for the same `multi_id`,\n    in which case the embeddings from the different .pt files corresponding to that same\n    `multi_id` will be stacked along the first dimension.\n\n    Args:\n        root: Root directory of the dataset.\n        manifest_file: The path to the manifest file, which is relative to\n            the `root` argument.\n        split: The dataset split to use. The `split` column of the manifest\n            file will be splitted based on this value.\n        column_mapping: Defines the map between the variables and the manifest\n            columns. It will overwrite the `default_column_mapping` with\n            the provided values, so that `column_mapping` can contain only the\n            values which are altered or missing.\n        embeddings_transforms: A function/transform that transforms the embedding.\n        target_transforms: A function/transform that transforms the target.\n    \"\"\"\n    super().__init__(\n        manifest_file=manifest_file,\n        root=root,\n        split=split,\n        column_mapping=column_mapping,\n        embeddings_transforms=embeddings_transforms,\n        target_transforms=target_transforms,\n    )\n\n    self._multi_ids: List[int]\n
"},{"location":"reference/core/data/transforms/","title":"Transforms","text":""},{"location":"reference/core/data/transforms/#eva.data.transforms.ArrayToTensor","title":"eva.data.transforms.ArrayToTensor","text":"

Converts a numpy array to a torch tensor.

"},{"location":"reference/core/data/transforms/#eva.data.transforms.ArrayToFloatTensor","title":"eva.data.transforms.ArrayToFloatTensor","text":"

Bases: ArrayToTensor

Converts a numpy array to a torch tensor and casts it to float.

"},{"location":"reference/core/data/transforms/#eva.data.transforms.Pad2DTensor","title":"eva.data.transforms.Pad2DTensor","text":"

Pads a 2D tensor to a fixed size along the first dimension.

Parameters:

Name Type Description Default pad_size int

The size to pad the tensor to. If the tensor is larger than this size, no padding will be applied.

required pad_value int | float

The value to use for padding.

float('-inf') Source code in src/eva/core/data/transforms/padding/pad_2d_tensor.py
def __init__(self, pad_size: int, pad_value: int | float = float(\"-inf\")):\n    \"\"\"Initialize the transformation.\n\n    Args:\n        pad_size: The size to pad the tensor to. If the tensor is larger than this size,\n            no padding will be applied.\n        pad_value: The value to use for padding.\n    \"\"\"\n    self._pad_size = pad_size\n    self._pad_value = pad_value\n
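A minimal sketch of the described behaviour using torch.nn.functional.pad; the actual transform may differ in implementation details:

import torch
import torch.nn.functional as F

def pad_first_dim(tensor: torch.Tensor, pad_size: int, pad_value: float = float("-inf")) -> torch.Tensor:
    # Pads an (N, D) tensor to (pad_size, D); no padding if N >= pad_size.
    missing = pad_size - tensor.shape[0]
    if missing <= 0:
        return tensor
    return F.pad(tensor, pad=(0, 0, 0, missing), value=pad_value)

padded = pad_first_dim(torch.randn(3, 8), pad_size=5)
print(padded.shape)  # torch.Size([5, 8])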
"},{"location":"reference/core/data/transforms/#eva.data.transforms.SampleFromAxis","title":"eva.data.transforms.SampleFromAxis","text":"

Samples n_samples entries from a tensor along a given axis.

Parameters:

Name Type Description Default n_samples int

The number of samples to draw.

required seed int

The seed to use for sampling.

42 axis int

The axis along which to sample.

0 Source code in src/eva/core/data/transforms/sampling/sample_from_axis.py
def __init__(self, n_samples: int, seed: int = 42, axis: int = 0):\n    \"\"\"Initialize the transformation.\n\n    Args:\n        n_samples: The number of samples to draw.\n        seed: The seed to use for sampling.\n        axis: The axis along which to sample.\n    \"\"\"\n    self._seed = seed\n    self._n_samples = n_samples\n    self._axis = axis\n    self._generator = self._get_generator()\n
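An equivalent sketch of seeded sampling along an axis with plain PyTorch; implementation details of the transform may differ:

import torch

def sample_along_axis(tensor: torch.Tensor, n_samples: int, seed: int = 42, axis: int = 0) -> torch.Tensor:
    # Draws n_samples entries along the given axis using a seeded generator.
    generator = torch.Generator().manual_seed(seed)
    indices = torch.randperm(tensor.shape[axis], generator=generator)[:n_samples]
    return tensor.index_select(axis, indices)

sampled = sample_along_axis(torch.randn(100, 8), n_samples=10)
print(sampled.shape)  # torch.Size([10, 8])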
"},{"location":"reference/core/loggers/loggers/","title":"Loggers","text":""},{"location":"reference/core/loggers/loggers/#eva.core.loggers.DummyLogger","title":"eva.core.loggers.DummyLogger","text":"

Bases: DummyLogger

Dummy logger class.

This logger is currently used as a placeholder when saving results to remote storage, as common Lightning loggers do not work with Azure Blob Storage:

https://github.com/Lightning-AI/pytorch-lightning/issues/18861 https://github.com/Lightning-AI/pytorch-lightning/issues/19736

Simply disabling the loggers when pointing to remote storage doesn't work because callbacks such as LearningRateMonitor or ModelCheckpoint require a logger to be present.

Parameters:

Name Type Description Default save_dir str

The save directory (this logger does not save anything, but callbacks might use this path to save their outputs).

required Source code in src/eva/core/loggers/dummy.py
def __init__(self, save_dir: str) -> None:\n    \"\"\"Initializes the logger.\n\n    Args:\n        save_dir: The save directory (this logger does not save anything,\n            but callbacks might use this path to save their outputs).\n    \"\"\"\n    super().__init__()\n    self._save_dir = save_dir\n
"},{"location":"reference/core/loggers/loggers/#eva.core.loggers.DummyLogger.save_dir","title":"save_dir: str property","text":"

Returns the save directory.

"},{"location":"reference/core/metrics/","title":"Metrics","text":"

Reference information for the Metrics classes.

"},{"location":"reference/core/metrics/average_loss/","title":"Average Loss","text":""},{"location":"reference/core/metrics/average_loss/#eva.metrics.AverageLoss","title":"eva.metrics.AverageLoss","text":"

Bases: Metric

Average loss metric tracker.

Source code in src/eva/core/metrics/average_loss.py
def __init__(self) -> None:\n    \"\"\"Initializes the metric.\"\"\"\n    super().__init__()\n\n    self.add_state(\"value\", default=torch.tensor(0), dist_reduce_fx=\"sum\")\n    self.add_state(\"total\", default=torch.tensor(0), dist_reduce_fx=\"sum\")\n
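The sum/count state pattern shown above can be illustrated with a minimal torchmetrics metric; the update signature below is an assumption and not necessarily identical to eva's implementation:

import torch
import torchmetrics

class RunningAverageLoss(torchmetrics.Metric):
    # Minimal sketch of the sum/count pattern; not eva's exact class.

    def __init__(self) -> None:
        super().__init__()
        self.add_state("value", default=torch.tensor(0.0), dist_reduce_fx="sum")
        self.add_state("total", default=torch.tensor(0), dist_reduce_fx="sum")

    def update(self, loss: torch.Tensor) -> None:
        self.value += loss.detach().sum()
        self.total += loss.numel()

    def compute(self) -> torch.Tensor:
        return self.value / self.total

metric = RunningAverageLoss()
metric.update(torch.tensor([0.5, 0.7]))
print(metric.compute())  # tensor(0.6000)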
"},{"location":"reference/core/metrics/binary_balanced_accuracy/","title":"Binary Balanced Accuracy","text":""},{"location":"reference/core/metrics/binary_balanced_accuracy/#eva.metrics.BinaryBalancedAccuracy","title":"eva.metrics.BinaryBalancedAccuracy","text":"

Bases: BinaryStatScores

Computes the balanced accuracy for binary classification.

"},{"location":"reference/core/metrics/binary_balanced_accuracy/#eva.metrics.BinaryBalancedAccuracy.compute","title":"compute","text":"

Computes the balanced accuracy based on the inputs previously passed to update.

Source code in src/eva/core/metrics/binary_balanced_accuracy.py
def compute(self) -> Tensor:\n    \"\"\"Compute accuracy based on inputs passed in to ``update`` previously.\"\"\"\n    tp, fp, tn, fn = self._final_state()\n    sensitivity = _safe_divide(tp, tp + fn)\n    specificity = _safe_divide(tn, tn + fp)\n    return 0.5 * (sensitivity + specificity)\n
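The compute method above reduces to the mean of sensitivity and specificity; a small numeric check, independent of the metric class:

# Example confusion-matrix counts: tp, fp, tn, fn
tp, fp, tn, fn = 8.0, 2.0, 5.0, 5.0

sensitivity = tp / (tp + fn)        # 8 / 13 = 0.6154
specificity = tn / (tn + fp)        # 5 / 7  = 0.7143
balanced_accuracy = 0.5 * (sensitivity + specificity)
print(round(balanced_accuracy, 3))  # 0.665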
"},{"location":"reference/core/metrics/core/","title":"Core","text":""},{"location":"reference/core/metrics/core/#eva.metrics.MetricModule","title":"eva.metrics.MetricModule","text":"

Bases: Module

The metrics module.

Allows storing and keeping track of the train, val and test metrics.

Parameters:

Name Type Description Default train MetricCollection | None

The training metric collection.

required val MetricCollection | None

The validation metric collection.

required test MetricCollection | None

The test metric collection.

required Source code in src/eva/core/metrics/structs/module.py
def __init__(\n    self,\n    train: collection.MetricCollection | None,\n    val: collection.MetricCollection | None,\n    test: collection.MetricCollection | None,\n) -> None:\n    \"\"\"Initializes the metrics for the Trainer.\n\n    Args:\n        train: The training metric collection.\n        val: The validation metric collection.\n        test: The test metric collection.\n    \"\"\"\n    super().__init__()\n\n    self._train = train or self.default_metric_collection\n    self._val = val or self.default_metric_collection\n    self._test = test or self.default_metric_collection\n
"},{"location":"reference/core/metrics/core/#eva.metrics.MetricModule.default_metric_collection","title":"default_metric_collection: collection.MetricCollection property","text":"

Returns the default metric collection.

"},{"location":"reference/core/metrics/core/#eva.metrics.MetricModule.training_metrics","title":"training_metrics: collection.MetricCollection property","text":"

Returns the metrics of the train dataset.

"},{"location":"reference/core/metrics/core/#eva.metrics.MetricModule.validation_metrics","title":"validation_metrics: collection.MetricCollection property","text":"

Returns the metrics of the validation dataset.

"},{"location":"reference/core/metrics/core/#eva.metrics.MetricModule.test_metrics","title":"test_metrics: collection.MetricCollection property","text":"

Returns the metrics of the test dataset.

"},{"location":"reference/core/metrics/core/#eva.metrics.MetricModule.from_metrics","title":"from_metrics classmethod","text":"

Initializes a metric module from a list of metrics.

Parameters:

Name Type Description Default train MetricModuleType | None

Metrics for the training stage.

required val MetricModuleType | None

Metrics for the validation stage.

required test MetricModuleType | None

Metrics for the test stage.

required separator str

The separator between the group name of the metric and the metric itself.

'/' Source code in src/eva/core/metrics/structs/module.py
@classmethod\ndef from_metrics(\n    cls,\n    train: MetricModuleType | None,\n    val: MetricModuleType | None,\n    test: MetricModuleType | None,\n    *,\n    separator: str = \"/\",\n) -> MetricModule:\n    \"\"\"Initializes a metric module from a list of metrics.\n\n    Args:\n        train: Metrics for the training stage.\n        val: Metrics for the validation stage.\n        test: Metrics for the test stage.\n        separator: The separator between the group name of the metric\n            and the metric itself.\n    \"\"\"\n    return cls(\n        train=_create_collection_from_metrics(train, prefix=\"train\" + separator),\n        val=_create_collection_from_metrics(val, prefix=\"val\" + separator),\n        test=_create_collection_from_metrics(test, prefix=\"test\" + separator),\n    )\n
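A hedged usage sketch, assuming that eva.metrics is the public import path and that a single torchmetrics metric is a valid MetricModuleType:

from torchmetrics import classification

from eva import metrics  # assumed import path

module = metrics.MetricModule.from_metrics(
    train=classification.MulticlassAccuracy(num_classes=4),
    val=classification.MulticlassAccuracy(num_classes=4),
    test=None,
    separator="/",
)
# Metric keys are prefixed per stage, e.g. "train/MulticlassAccuracy".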
"},{"location":"reference/core/metrics/core/#eva.metrics.MetricModule.from_schema","title":"from_schema classmethod","text":"

Initializes a metric module from the metrics schema.

Parameters:

Name Type Description Default schema MetricsSchema

The dataclass metric schema.

required separator str

The separator between the group name of the metric and the metric itself.

'/' Source code in src/eva/core/metrics/structs/module.py
@classmethod\ndef from_schema(\n    cls,\n    schema: schemas.MetricsSchema,\n    *,\n    separator: str = \"/\",\n) -> MetricModule:\n    \"\"\"Initializes a metric module from the metrics schema.\n\n    Args:\n        schema: The dataclass metric schema.\n        separator: The separator between the group name of the metric\n            and the metric itself.\n    \"\"\"\n    return cls.from_metrics(\n        train=schema.training_metrics,\n        val=schema.evaluation_metrics,\n        test=schema.evaluation_metrics,\n        separator=separator,\n    )\n
"},{"location":"reference/core/metrics/core/#eva.metrics.MetricsSchema","title":"eva.metrics.MetricsSchema dataclass","text":"

Metrics schema.

"},{"location":"reference/core/metrics/core/#eva.metrics.MetricsSchema.common","title":"common: MetricModuleType | None = None class-attribute instance-attribute","text":"

Holds the common train and evaluation metrics.

"},{"location":"reference/core/metrics/core/#eva.metrics.MetricsSchema.train","title":"train: MetricModuleType | None = None class-attribute instance-attribute","text":"

The exclusive training metrics.

"},{"location":"reference/core/metrics/core/#eva.metrics.MetricsSchema.evaluation","title":"evaluation: MetricModuleType | None = None class-attribute instance-attribute","text":"

The exclusive evaluation metrics.

"},{"location":"reference/core/metrics/core/#eva.metrics.MetricsSchema.training_metrics","title":"training_metrics: MetricModuleType | None property","text":"

Returns the training metrics.

"},{"location":"reference/core/metrics/core/#eva.metrics.MetricsSchema.evaluation_metrics","title":"evaluation_metrics: MetricModuleType | None property","text":"

Returns the evaluation metrics.

"},{"location":"reference/core/metrics/defaults/","title":"Defaults","text":""},{"location":"reference/core/metrics/defaults/#eva.metrics.BinaryClassificationMetrics","title":"eva.metrics.BinaryClassificationMetrics","text":"

Bases: MetricCollection

Default metrics for binary classification tasks.

The metrics instantiated here are: BinaryAUROC, BinaryAccuracy, BinaryBalancedAccuracy, BinaryF1Score, BinaryPrecision and BinaryRecall.

Parameters:

Name Type Description Default threshold float

Threshold for transforming probability to binary (0,1) predictions

0.5 ignore_index int | None

Specifies a target value that is ignored and does not contribute to the metric calculation.

None prefix str | None

A string to append in front of the keys of the output dict.

None postfix str | None

A string to append after the keys of the output dict.

None Source code in src/eva/core/metrics/defaults/classification/binary.py
def __init__(\n    self,\n    threshold: float = 0.5,\n    ignore_index: int | None = None,\n    prefix: str | None = None,\n    postfix: str | None = None,\n) -> None:\n    \"\"\"Initializes the binary classification metrics.\n\n    The metrics instantiated here are:\n\n    - BinaryAUROC\n    - BinaryAccuracy\n    - BinaryBalancedAccuracy\n    - BinaryF1Score\n    - BinaryPrecision\n    - BinaryRecall\n\n    Args:\n        threshold: Threshold for transforming probability to binary (0,1) predictions\n        ignore_index: Specifies a target value that is ignored and does not\n            contribute to the metric calculation.\n        prefix: A string to append in front of the keys of the output dict.\n        postfix: A string to append after the keys of the output dict.\n    \"\"\"\n    super().__init__(\n        metrics=[\n            classification.BinaryAUROC(\n                ignore_index=ignore_index,\n            ),\n            classification.BinaryAccuracy(\n                threshold=threshold,\n                ignore_index=ignore_index,\n            ),\n            binary_balanced_accuracy.BinaryBalancedAccuracy(\n                threshold=threshold,\n                ignore_index=ignore_index,\n            ),\n            classification.BinaryF1Score(\n                threshold=threshold,\n                ignore_index=ignore_index,\n            ),\n            classification.BinaryPrecision(\n                threshold=threshold,\n                ignore_index=ignore_index,\n            ),\n            classification.BinaryRecall(\n                threshold=threshold,\n                ignore_index=ignore_index,\n            ),\n        ],\n        prefix=prefix,\n        postfix=postfix,\n        compute_groups=[\n            [\n                \"BinaryAccuracy\",\n                \"BinaryBalancedAccuracy\",\n                \"BinaryF1Score\",\n                \"BinaryPrecision\",\n                \"BinaryRecall\",\n            ],\n            [\n                \"BinaryAUROC\",\n            ],\n        ],\n    )\n
"},{"location":"reference/core/metrics/defaults/#eva.metrics.MulticlassClassificationMetrics","title":"eva.metrics.MulticlassClassificationMetrics","text":"

Bases: MetricCollection

Default metrics for multi-class classification tasks.

The metrics instantiated here are: MulticlassAccuracy, MulticlassPrecision, MulticlassRecall, MulticlassF1Score and MulticlassAUROC.

Parameters:

Name Type Description Default num_classes int

Integer specifying the number of classes.

required average Literal['macro', 'weighted', 'none']

Defines the reduction that is applied over labels.

'macro' ignore_index int | None

Specifies a target value that is ignored and does not contribute to the metric calculation.

None prefix str | None

A string to append in front of the keys of the output dict.

None postfix str | None

A string to append after the keys of the output dict.

None Source code in src/eva/core/metrics/defaults/classification/multiclass.py
def __init__(\n    self,\n    num_classes: int,\n    average: Literal[\"macro\", \"weighted\", \"none\"] = \"macro\",\n    ignore_index: int | None = None,\n    prefix: str | None = None,\n    postfix: str | None = None,\n) -> None:\n    \"\"\"Initializes the multi-class classification metrics.\n\n    The metrics instantiated here are:\n\n    - MulticlassAccuracy\n    - MulticlassPrecision\n    - MulticlassRecall\n    - MulticlassF1Score\n    - MulticlassAUROC\n\n    Args:\n        num_classes: Integer specifying the number of classes.\n        average: Defines the reduction that is applied over labels.\n        ignore_index: Specifies a target value that is ignored and does not\n            contribute to the metric calculation.\n        prefix: A string to append in front of the keys of the output dict.\n        postfix: A string to append after the keys of the output dict.\n    \"\"\"\n    super().__init__(\n        metrics=[\n            classification.MulticlassAUROC(\n                num_classes=num_classes,\n                average=average,\n                ignore_index=ignore_index,\n            ),\n            classification.MulticlassAccuracy(\n                num_classes=num_classes,\n                average=average,\n                ignore_index=ignore_index,\n            ),\n            classification.MulticlassF1Score(\n                num_classes=num_classes,\n                average=average,\n                ignore_index=ignore_index,\n            ),\n            classification.MulticlassPrecision(\n                num_classes=num_classes,\n                average=average,\n                ignore_index=ignore_index,\n            ),\n            classification.MulticlassRecall(\n                num_classes=num_classes,\n                average=average,\n                ignore_index=ignore_index,\n            ),\n        ],\n        prefix=prefix,\n        postfix=postfix,\n        compute_groups=[\n            [\n                \"MulticlassAccuracy\",\n                \"MulticlassF1Score\",\n                \"MulticlassPrecision\",\n                \"MulticlassRecall\",\n            ],\n            [\n                \"MulticlassAUROC\",\n            ],\n        ],\n    )\n
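A usage sketch, assuming eva.metrics as the import path; the collection behaves like a regular torchmetrics MetricCollection:

import torch

from eva import metrics  # assumed import path

collection = metrics.MulticlassClassificationMetrics(num_classes=4)

probabilities = torch.randn(8, 4).softmax(dim=-1)  # model outputs for 8 samples
targets = torch.randint(0, 4, (8,))                # ground-truth labels
print(collection(probabilities, targets))          # dict of the five metrics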
"},{"location":"reference/core/models/modules/","title":"Modules","text":"

Reference information for the model Modules API.

"},{"location":"reference/core/models/modules/#eva.models.modules.ModelModule","title":"eva.models.modules.ModelModule","text":"

Bases: LightningModule

The base model module.

Parameters:

Name Type Description Default metrics MetricsSchema | None

The metric groups to track.

None postprocess BatchPostProcess | None

A list of helper functions applied to the model predictions and targets after the loss computation and before the metrics calculation.

None Source code in src/eva/core/models/modules/module.py
def __init__(\n    self,\n    metrics: metrics_lib.MetricsSchema | None = None,\n    postprocess: batch_postprocess.BatchPostProcess | None = None,\n) -> None:\n    \"\"\"Initializes the basic module.\n\n    Args:\n        metrics: The metric groups to track.\n        postprocess: A list of helper functions to apply after the\n            loss and before the metrics calculation to the model\n            predictions and targets.\n    \"\"\"\n    super().__init__()\n\n    self._metrics = metrics or self.default_metrics\n    self._postprocess = postprocess or self.default_postprocess\n\n    self.metrics = metrics_lib.MetricModule.from_schema(self._metrics)\n
"},{"location":"reference/core/models/modules/#eva.models.modules.ModelModule.default_metrics","title":"default_metrics: metrics_lib.MetricsSchema property","text":"

The default metrics.

"},{"location":"reference/core/models/modules/#eva.models.modules.ModelModule.default_postprocess","title":"default_postprocess: batch_postprocess.BatchPostProcess property","text":"

The default post-processes.

"},{"location":"reference/core/models/modules/#eva.models.modules.ModelModule.metrics_device","title":"metrics_device: torch.device property","text":"

Returns the device by which the metrics should be calculated.

We allocate the metrics to the CPU when operating on a single device, as it is much faster, but to the GPU when employing multiple devices, as the DDP strategy requires the metrics to be allocated on the module's GPU.

"},{"location":"reference/core/models/modules/#eva.models.modules.HeadModule","title":"eva.models.modules.HeadModule","text":"

Bases: ModelModule

Neural Net Head Module for training on features.

It can be used for downstream tasks trained with supervised (mini-batch) stochastic gradient descent, such as classification, regression and segmentation.

Parameters:

Name Type Description Default head MODEL_TYPE

The neural network that would be trained on the features.

required criterion Callable[..., Tensor]

The loss function to use.

required backbone MODEL_TYPE | None

The feature extractor. If None, the input batch is expected to contain the features directly.

None optimizer OptimizerCallable

The optimizer to use.

Adam lr_scheduler LRSchedulerCallable

The learning rate scheduler to use.

ConstantLR metrics MetricsSchema | None

The metric groups to track.

None postprocess BatchPostProcess | None

A list of helper functions applied to the model predictions and targets after the loss computation and before the metrics calculation.

None Source code in src/eva/core/models/modules/head.py
def __init__(\n    self,\n    head: MODEL_TYPE,\n    criterion: Callable[..., torch.Tensor],\n    backbone: MODEL_TYPE | None = None,\n    optimizer: OptimizerCallable = optim.Adam,\n    lr_scheduler: LRSchedulerCallable = lr_scheduler.ConstantLR,\n    metrics: metrics_lib.MetricsSchema | None = None,\n    postprocess: batch_postprocess.BatchPostProcess | None = None,\n) -> None:\n    \"\"\"Initializes the neural net head module.\n\n    Args:\n        head: The neural network that would be trained on the features.\n        criterion: The loss function to use.\n        backbone: The feature extractor. If `None`, it will be expected\n            that the input batch returns the features directly.\n        optimizer: The optimizer to use.\n        lr_scheduler: The learning rate scheduler to use.\n        metrics: The metric groups to track.\n        postprocess: A list of helper functions to apply after the\n            loss and before the metrics calculation to the model\n            predictions and targets.\n    \"\"\"\n    super().__init__(metrics=metrics, postprocess=postprocess)\n\n    self.head = head\n    self.criterion = criterion\n    self.backbone = backbone\n    self.optimizer = optimizer\n    self.lr_scheduler = lr_scheduler\n
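A minimal construction sketch; the feature dimension (384) and the number of classes (4) are illustrative, and the import paths are assumptions based on the dotted names above:

import torch.nn as nn

from eva.models import modules, networks  # assumed import paths

head_module = modules.HeadModule(
    head=networks.MLP(input_size=384, output_size=4, hidden_layer_sizes=(128,)),
    criterion=nn.CrossEntropyLoss(),
    backbone=None,  # the dataloader is expected to yield precomputed features
)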
"},{"location":"reference/core/models/modules/#eva.models.modules.InferenceModule","title":"eva.models.modules.InferenceModule","text":"

Bases: ModelModule

A lightweight model module to perform inference.

Parameters:

Name Type Description Default backbone MODEL_TYPE

The network to be used for inference.

required Source code in src/eva/core/models/modules/inference.py
def __init__(self, backbone: MODEL_TYPE) -> None:\n    \"\"\"Initializes the module.\n\n    Args:\n        backbone: The network to be used for inference.\n    \"\"\"\n    super().__init__(metrics=None)\n\n    self.backbone = backbone\n
"},{"location":"reference/core/models/networks/","title":"Networks","text":"

Reference information for the model Networks API.

"},{"location":"reference/core/models/networks/#eva.models.networks.MLP","title":"eva.models.networks.MLP","text":"

Bases: Module

A Multi-layer Perceptron (MLP) network.

Parameters:

Name Type Description Default input_size int

The number of input features.

required output_size int

The number of output features.

required hidden_layer_sizes tuple[int, ...] | None

A list specifying the number of units in each hidden layer.

None dropout float

Dropout probability for hidden layers.

0.0 hidden_activation_fn Type[Module] | None

Activation function to use for hidden layers. Default is ReLU.

ReLU output_activation_fn Type[Module] | None

Activation function to use for the output layer. Default is None.

None Source code in src/eva/core/models/networks/mlp.py
def __init__(\n    self,\n    input_size: int,\n    output_size: int,\n    hidden_layer_sizes: tuple[int, ...] | None = None,\n    hidden_activation_fn: Type[torch.nn.Module] | None = nn.ReLU,\n    output_activation_fn: Type[torch.nn.Module] | None = None,\n    dropout: float = 0.0,\n) -> None:\n    \"\"\"Initializes the MLP.\n\n    Args:\n        input_size: The number of input features.\n        output_size: The number of output features.\n        hidden_layer_sizes: A list specifying the number of units in each hidden layer.\n        dropout: Dropout probability for hidden layers.\n        hidden_activation_fn: Activation function to use for hidden layers. Default is ReLU.\n        output_activation_fn: Activation function to use for the output layer. Default is None.\n    \"\"\"\n    super().__init__()\n\n    self.input_size = input_size\n    self.output_size = output_size\n    self.hidden_layer_sizes = hidden_layer_sizes if hidden_layer_sizes is not None else ()\n    self.hidden_activation_fn = hidden_activation_fn\n    self.output_activation_fn = output_activation_fn\n    self.dropout = dropout\n\n    self._network = self._build_network()\n
"},{"location":"reference/core/models/networks/#eva.models.networks.MLP.forward","title":"forward","text":"

Defines the forward pass of the MLP.

Parameters:

Name Type Description Default x Tensor

The input tensor.

required

Returns:

Type Description Tensor

The output of the network.

Source code in src/eva/core/models/networks/mlp.py
def forward(self, x: torch.Tensor) -> torch.Tensor:\n    \"\"\"Defines the forward pass of the MLP.\n\n    Args:\n        x: The input tensor.\n\n    Returns:\n        The output of the network.\n    \"\"\"\n    return self._network(x)\n
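A quick usage sketch of the network defined above (import path assumed):

import torch

from eva.models import networks  # assumed import path

mlp = networks.MLP(input_size=384, output_size=4, hidden_layer_sizes=(256, 128), dropout=0.1)
features = torch.randn(8, 384)  # a batch of 8 embedding vectors
logits = mlp(features)
print(logits.shape)             # torch.Size([8, 4])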
"},{"location":"reference/core/models/networks/#wrappers","title":"Wrappers","text":""},{"location":"reference/core/models/networks/#eva.models.networks.wrappers.BaseModel","title":"eva.models.networks.wrappers.BaseModel","text":"

Bases: Module

Base class for model wrappers.

Parameters:

Name Type Description Default tensor_transforms Callable | None

The transforms to apply to the output tensor produced by the model.

None Source code in src/eva/core/models/networks/wrappers/base.py
def __init__(self, tensor_transforms: Callable | None = None) -> None:\n    \"\"\"Initializes the model.\n\n    Args:\n        tensor_transforms: The transforms to apply to the output\n            tensor produced by the model.\n    \"\"\"\n    super().__init__()\n\n    self._output_transforms = tensor_transforms\n
"},{"location":"reference/core/models/networks/#eva.models.networks.wrappers.BaseModel.load_model","title":"load_model abstractmethod","text":"

Loads the model.

Source code in src/eva/core/models/networks/wrappers/base.py
@abc.abstractmethod\ndef load_model(self) -> Callable[..., torch.Tensor]:\n    \"\"\"Loads the model.\"\"\"\n    raise NotImplementedError\n
"},{"location":"reference/core/models/networks/#eva.models.networks.wrappers.BaseModel.model_forward","title":"model_forward abstractmethod","text":"

Implements the forward pass of the model.

Parameters:

Name Type Description Default tensor Tensor

The input tensor to the model.

required Source code in src/eva/core/models/networks/wrappers/base.py
@abc.abstractmethod\ndef model_forward(self, tensor: torch.Tensor) -> torch.Tensor:\n    \"\"\"Implements the forward pass of the model.\n\n    Args:\n        tensor: The input tensor to the model.\n    \"\"\"\n    raise NotImplementedError\n
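A hedged sketch of a custom wrapper, mirroring the pattern used by the built-in wrappers below: a subclass loads its model in __init__ and implements model_forward. The TorchScript wrapper shown here is illustrative and not part of eva:

import torch
from torch import nn

from eva.models.networks import wrappers  # assumed import path

class TorchScriptModel(wrappers.BaseModel):
    # Illustrative wrapper for TorchScript checkpoints.

    def __init__(self, path: str, tensor_transforms=None) -> None:
        super().__init__(tensor_transforms=tensor_transforms)
        self._path = path
        self._model = self.load_model()

    def load_model(self) -> nn.Module:
        return torch.jit.load(self._path, map_location="cpu")

    def model_forward(self, tensor: torch.Tensor) -> torch.Tensor:
        return self._model(tensor)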
"},{"location":"reference/core/models/networks/#eva.models.networks.wrappers.ModelFromFunction","title":"eva.models.networks.wrappers.ModelFromFunction","text":"

Bases: BaseModel

Wrapper class for models which are initialized from functions.

This is helpful for initializing models in a .yaml configuration file.

Parameters:

Name Type Description Default path Callable[..., Module]

The path to the callable object (class or function).

required arguments Dict[str, Any] | None

The extra callable function / class arguments.

None checkpoint_path str | None

The path to the checkpoint to load the model weights from. This is currently only supported for torch model checkpoints. For other formats, the checkpoint loading should be handled within the callable object provided via path. None tensor_transforms Callable | None

The transforms to apply to the output tensor produced by the model.

None Source code in src/eva/core/models/networks/wrappers/from_function.py
def __init__(\n    self,\n    path: Callable[..., nn.Module],\n    arguments: Dict[str, Any] | None = None,\n    checkpoint_path: str | None = None,\n    tensor_transforms: Callable | None = None,\n) -> None:\n    \"\"\"Initializes and constructs the model.\n\n    Args:\n        path: The path to the callable object (class or function).\n        arguments: The extra callable function / class arguments.\n        checkpoint_path: The path to the checkpoint to load the model\n            weights from. This is currently only supported for torch\n            model checkpoints. For other formats, the checkpoint loading\n            should be handled within the provided callable object in <path>.\n        tensor_transforms: The transforms to apply to the output tensor\n            produced by the model.\n    \"\"\"\n    super().__init__()\n\n    self._path = path\n    self._arguments = arguments\n    self._checkpoint_path = checkpoint_path\n    self._tensor_transforms = tensor_transforms\n\n    self._model = self.load_model()\n
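A construction sketch using a torchvision constructor; the import path is an assumption, and the resulting wrapper is an nn.Module that can be used as a backbone:

from torchvision import models

from eva.models.networks import wrappers  # assumed import path

backbone = wrappers.ModelFromFunction(
    path=models.resnet18,
    arguments={"weights": None},  # keyword arguments forwarded to the callable
)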
"},{"location":"reference/core/models/networks/#eva.models.networks.wrappers.HuggingFaceModel","title":"eva.models.networks.wrappers.HuggingFaceModel","text":"

Bases: BaseModel

Wrapper class for loading HuggingFace transformers models.

Parameters:

Name Type Description Default model_name_or_path str

The model name or path to load the model from. This can be a local path or a model name from the HuggingFace model hub.

required tensor_transforms Callable | None

The transforms to apply to the output tensor produced by the model.

None Source code in src/eva/core/models/networks/wrappers/huggingface.py
def __init__(self, model_name_or_path: str, tensor_transforms: Callable | None = None) -> None:\n    \"\"\"Initializes the model.\n\n    Args:\n        model_name_or_path: The model name or path to load the model from.\n            This can be a local path or a model name from the `HuggingFace`\n            model hub.\n        tensor_transforms: The transforms to apply to the output tensor\n            produced by the model.\n    \"\"\"\n    super().__init__(tensor_transforms=tensor_transforms)\n\n    self._model_name_or_path = model_name_or_path\n    self._model = self.load_model()\n
"},{"location":"reference/core/models/networks/#eva.models.networks.wrappers.ONNXModel","title":"eva.models.networks.wrappers.ONNXModel","text":"

Bases: BaseModel

Wrapper class for loading ONNX models.

Parameters:

Name Type Description Default path str

The path to the .onnx model file.

required device Literal['cpu', 'cuda'] | None

The device to run the model on. This can be either \"cpu\" or \"cuda\".

'cpu' tensor_transforms Callable | None

The transforms to apply to the output tensor produced by the model.

None Source code in src/eva/core/models/networks/wrappers/onnx.py
def __init__(\n    self,\n    path: str,\n    device: Literal[\"cpu\", \"cuda\"] | None = \"cpu\",\n    tensor_transforms: Callable | None = None,\n):\n    \"\"\"Initializes the model.\n\n    Args:\n        path: The path to the .onnx model file.\n        device: The device to run the model on. This can be either \"cpu\" or \"cuda\".\n        tensor_transforms: The transforms to apply to the output tensor produced by the model.\n    \"\"\"\n    super().__init__(tensor_transforms=tensor_transforms)\n\n    self._path = path\n    self._device = device\n    self._model = self.load_model()\n
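A construction sketch; the .onnx path is a hypothetical placeholder:

from eva.models.networks import wrappers  # assumed import path

onnx_backbone = wrappers.ONNXModel(
    path="checkpoints/backbone.onnx",  # hypothetical path to an exported model
    device="cpu",
)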
"},{"location":"reference/core/trainers/functional/","title":"Functional","text":"

Reference information for the trainers Functional API.

"},{"location":"reference/core/trainers/functional/#eva.core.trainers.functional.run_evaluation_session","title":"eva.core.trainers.functional.run_evaluation_session","text":"

Runs a downstream evaluation session out-of-place.

It performs an evaluation run (fit and evaluate) on the model multiple times. Note that the input base_trainer and base_model are cloned, so the input objects are not modified.

Parameters:

Name Type Description Default base_trainer Trainer

The base trainer module to use.

required base_model ModelModule

The base model module to use.

required datamodule DataModule

The data module.

required n_runs int

The number of runs (fit and evaluate) to perform.

1 verbose bool

Whether to report the aggregated session metrics instead of those of each individual run, and vice versa.

True Source code in src/eva/core/trainers/functional.py
def run_evaluation_session(\n    base_trainer: eva_trainer.Trainer,\n    base_model: modules.ModelModule,\n    datamodule: datamodules.DataModule,\n    *,\n    n_runs: int = 1,\n    verbose: bool = True,\n) -> None:\n    \"\"\"Runs a downstream evaluation session out-of-place.\n\n    It performs an evaluation run (fit and evaluate) on the model\n    multiple times. Note that as the input `base_trainer` and\n    `base_model` would be cloned, the input object would not\n    be modified.\n\n    Args:\n        base_trainer: The base trainer module to use.\n        base_model: The base model module to use.\n        datamodule: The data module.\n        n_runs: The amount of runs (fit and evaluate) to perform.\n        verbose: Whether to verbose the session metrics instead of\n            these of each individual runs and vice-versa.\n    \"\"\"\n    recorder = _recorder.SessionRecorder(output_dir=base_trainer.default_log_dir, verbose=verbose)\n    for run_index in range(n_runs):\n        validation_scores, test_scores = run_evaluation(\n            base_trainer,\n            base_model,\n            datamodule,\n            run_id=f\"run_{run_index}\",\n            verbose=not verbose,\n        )\n        recorder.update(validation_scores, test_scores)\n    recorder.save()\n
"},{"location":"reference/core/trainers/functional/#eva.core.trainers.functional.run_evaluation","title":"eva.core.trainers.functional.run_evaluation","text":"

Fits and evaluates a model out-of-place.

Parameters:

Name Type Description Default base_trainer Trainer

The base trainer to use but not modify.

required base_model ModelModule

The model module to use but not modify.

required datamodule DataModule

The data module.

required run_id str | None

The run id to be appended to the output log directory. If None, it will use the log directory of the trainer as is.

None verbose bool

Whether to print the validation and test metrics at the end of the training.

True

Returns:

Type Description Tuple[_EVALUATE_OUTPUT, _EVALUATE_OUTPUT | None]

A tuple with the validation and the test metrics (the latter is None if no test set exists).

Source code in src/eva/core/trainers/functional.py
def run_evaluation(\n    base_trainer: eva_trainer.Trainer,\n    base_model: modules.ModelModule,\n    datamodule: datamodules.DataModule,\n    *,\n    run_id: str | None = None,\n    verbose: bool = True,\n) -> Tuple[_EVALUATE_OUTPUT, _EVALUATE_OUTPUT | None]:\n    \"\"\"Fits and evaluates a model out-of-place.\n\n    Args:\n        base_trainer: The base trainer to use but not modify.\n        base_model: The model module to use but not modify.\n        datamodule: The data module.\n        run_id: The run id to be appended to the output log directory.\n            If `None`, it will use the log directory of the trainer as is.\n        verbose: Whether to print the validation and test metrics\n            in the end of the training.\n\n    Returns:\n        A tuple of with the validation and the test metrics (if exists).\n    \"\"\"\n    trainer, model = _utils.clone(base_trainer, base_model)\n    trainer.setup_log_dirs(run_id or \"\")\n    return fit_and_validate(trainer, model, datamodule, verbose=verbose)\n
"},{"location":"reference/core/trainers/functional/#eva.core.trainers.functional.fit_and_validate","title":"eva.core.trainers.functional.fit_and_validate","text":"

Fits and evaluates a model in-place.

If the test set is set in the datamodule, it will evaluate the model on the test set as well.

Parameters:

Name Type Description Default trainer Trainer

The trainer module to use and update in-place.

required model ModelModule

The model module to use and update in-place.

required datamodule DataModule

The data module.

required verbose bool

Whether to print the validation and test metrics at the end of the training.

True

Returns:

Type Description Tuple[_EVALUATE_OUTPUT, _EVALUATE_OUTPUT | None]

A tuple with the validation and the test metrics (the latter is None if no test set exists).

Source code in src/eva/core/trainers/functional.py
def fit_and_validate(\n    trainer: eva_trainer.Trainer,\n    model: modules.ModelModule,\n    datamodule: datamodules.DataModule,\n    verbose: bool = True,\n) -> Tuple[_EVALUATE_OUTPUT, _EVALUATE_OUTPUT | None]:\n    \"\"\"Fits and evaluates a model in-place.\n\n    If the test set is set in the datamodule, it will evaluate the model\n    on the test set as well.\n\n    Args:\n        trainer: The trainer module to use and update in-place.\n        model: The model module to use and update in-place.\n        datamodule: The data module.\n        verbose: Whether to print the validation and test metrics\n            in the end of the training.\n\n    Returns:\n        A tuple of with the validation and the test metrics (if exists).\n    \"\"\"\n    trainer.fit(model, datamodule=datamodule)\n    validation_scores = trainer.validate(datamodule=datamodule, verbose=verbose)\n    test_scores = (\n        None\n        if datamodule.datasets.test is None\n        else trainer.test(datamodule=datamodule, verbose=verbose)\n    )\n    return validation_scores, test_scores\n
"},{"location":"reference/core/trainers/functional/#eva.core.trainers.functional.infer_model","title":"eva.core.trainers.functional.infer_model","text":"

Performs model inference out-of-place.

Note that the input base_model and base_trainer are not modified.

Parameters:

Name Type Description Default base_trainer Trainer

The base trainer to use but not modify.

required base_model ModelModule

The model module to use but not modify.

required datamodule DataModule

The data module.

required return_predictions bool

Whether to return the model predictions.

False Source code in src/eva/core/trainers/functional.py
def infer_model(\n    base_trainer: eva_trainer.Trainer,\n    base_model: modules.ModelModule,\n    datamodule: datamodules.DataModule,\n    *,\n    return_predictions: bool = False,\n) -> None:\n    \"\"\"Performs model inference out-of-place.\n\n    Note that the input `base_model` and `base_trainer` would\n    not be modified.\n\n    Args:\n        base_trainer: The base trainer to use but not modify.\n        base_model: The model module to use but not modify.\n        datamodule: The data module.\n        return_predictions: Whether to return the model predictions.\n    \"\"\"\n    trainer, model = _utils.clone(base_trainer, base_model)\n    return trainer.predict(\n        model=model,\n        datamodule=datamodule,\n        return_predictions=return_predictions,\n    )\n
"},{"location":"reference/core/trainers/trainer/","title":"Trainers","text":"

Reference information for the Trainers API.

"},{"location":"reference/core/trainers/trainer/#eva.core.trainers.Trainer","title":"eva.core.trainers.Trainer","text":"

Bases: Trainer

Core trainer class.

This is an extended version of lightning's core trainer class.

For the input arguments, refer to ::class::lightning.pytorch.Trainer.

Parameters:

Name Type Description Default args Any

Positional arguments of ::class::lightning.pytorch.Trainer.

() default_root_dir str

The default root directory to store the output logs. Unlike in ::class::lightning.pytorch.Trainer, this path takes priority as the output destination.

'logs' n_runs int

The number of runs (fit and evaluate) to perform in an evaluation session.

1 kwargs Any

Keyword arguments of ::class::lightning.pytorch.Trainer.

{} Source code in src/eva/core/trainers/trainer.py
@argparse._defaults_from_env_vars\ndef __init__(\n    self,\n    *args: Any,\n    default_root_dir: str = \"logs\",\n    n_runs: int = 1,\n    **kwargs: Any,\n) -> None:\n    \"\"\"Initializes the trainer.\n\n    For the input arguments, refer to ::class::`lightning.pytorch.Trainer`.\n\n    Args:\n        args: Positional arguments of ::class::`lightning.pytorch.Trainer`.\n        default_root_dir: The default root directory to store the output logs.\n            Unlike in ::class::`lightning.pytorch.Trainer`, this path would be the\n            prioritized destination point.\n        n_runs: The amount of runs (fit and evaluate) to perform in an evaluation session.\n        kwargs: Kew-word arguments of ::class::`lightning.pytorch.Trainer`.\n    \"\"\"\n    super().__init__(*args, default_root_dir=default_root_dir, **kwargs)\n\n    self._n_runs = n_runs\n\n    self._session_id: str = _logging.generate_session_id()\n    self._log_dir: str = self.default_log_dir\n\n    self.setup_log_dirs()\n
"},{"location":"reference/core/trainers/trainer/#eva.core.trainers.Trainer.default_log_dir","title":"default_log_dir: str property","text":"

Returns the default log directory.

"},{"location":"reference/core/trainers/trainer/#eva.core.trainers.Trainer.setup_log_dirs","title":"setup_log_dirs","text":"

Sets up the logging directory of the trainer and the experiment loggers in-place.

Parameters:

Name Type Description Default subdirectory str

The subdirectory to append to the output log directory.

'' Source code in src/eva/core/trainers/trainer.py
def setup_log_dirs(self, subdirectory: str = \"\") -> None:\n    \"\"\"Setups the logging directory of the trainer and experimental loggers in-place.\n\n    Args:\n        subdirectory: Whether to append a subdirectory to the output log.\n    \"\"\"\n    self._log_dir = os.path.join(self.default_root_dir, self._session_id, subdirectory)\n\n    enabled_loggers = []\n    if isinstance(self.loggers, list) and len(self.loggers) > 0:\n        for logger in self.loggers:\n            if isinstance(logger, (pl_loggers.CSVLogger, pl_loggers.TensorBoardLogger)):\n                if not cloud_io._is_local_file_protocol(self.default_root_dir):\n                    loguru.logger.warning(\n                        f\"Skipped {type(logger).__name__} as remote storage is not supported.\"\n                    )\n                    continue\n                else:\n                    logger._root_dir = self.default_root_dir\n                    logger._name = self._session_id\n                    logger._version = subdirectory\n            enabled_loggers.append(logger)\n\n    self._loggers = enabled_loggers or [eva_loggers.DummyLogger(self._log_dir)]\n
"},{"location":"reference/core/trainers/trainer/#eva.core.trainers.Trainer.run_evaluation_session","title":"run_evaluation_session","text":"

Runs an evaluation session out-of-place.

It performs an evaluation run (fit and evaluate) on the model self._n_runs times. Note that the input model is not modified, so its weights remain as they are.

Parameters:

Name Type Description Default model ModelModule

The base model module to evaluate.

required datamodule DataModule

The data module.

required Source code in src/eva/core/trainers/trainer.py
def run_evaluation_session(\n    self,\n    model: modules.ModelModule,\n    datamodule: datamodules.DataModule,\n) -> None:\n    \"\"\"Runs an evaluation session out-of-place.\n\n    It performs an evaluation run (fit and evaluate) the model\n    `self._n_run` times. Note that the input `base_model` would\n    not be modified, so the weights of the input model will remain\n    as they are.\n\n    Args:\n        model: The base model module to evaluate.\n        datamodule: The data module.\n    \"\"\"\n    functional.run_evaluation_session(\n        base_trainer=self,\n        base_model=model,\n        datamodule=datamodule,\n        n_runs=self._n_runs,\n        verbose=self._n_runs > 1,\n    )\n
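A hedged sketch of the trainer-level API; the import path is assumed, and the model and data module are placeholders that would normally come from an eva configuration:

from eva.core import trainers  # assumed import path

trainer = trainers.Trainer(max_epochs=10, default_root_dir="logs", n_runs=5)

# With a configured ModelModule and DataModule (not constructed here), a session
# of 5 fit-and-evaluate runs would then be started with:
# trainer.run_evaluation_session(model=model, datamodule=datamodule)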
"},{"location":"reference/core/utils/multiprocessing/","title":"Multiprocessing","text":"

Reference information for the utils Multiprocessing API.

"},{"location":"reference/core/utils/multiprocessing/#eva.core.utils.multiprocessing.Process","title":"eva.core.utils.multiprocessing.Process","text":"

Bases: Process

Multiprocessing wrapper with logic to propagate exceptions to the parent process.

Source: https://stackoverflow.com/a/33599967/4992248

Source code in src/eva/core/utils/multiprocessing.py
def __init__(self, *args: Any, **kwargs: Any) -> None:\n    \"\"\"Initialize the process.\"\"\"\n    multiprocessing.Process.__init__(self, *args, **kwargs)\n\n    self._parent_conn, self._child_conn = multiprocessing.Pipe()\n    self._exception = None\n
"},{"location":"reference/core/utils/multiprocessing/#eva.core.utils.multiprocessing.Process.exception","title":"exception property","text":"

Property that contains exception information from the process.

"},{"location":"reference/core/utils/multiprocessing/#eva.core.utils.multiprocessing.Process.run","title":"run","text":"

Run the process.

Source code in src/eva/core/utils/multiprocessing.py
def run(self) -> None:\n    \"\"\"Run the process.\"\"\"\n    try:\n        multiprocessing.Process.run(self)\n        self._child_conn.send(None)\n    except Exception as e:\n        tb = traceback.format_exc()\n        self._child_conn.send((e, tb))\n
"},{"location":"reference/core/utils/multiprocessing/#eva.core.utils.multiprocessing.Process.check_exceptions","title":"check_exceptions","text":"

Checks for an exception and propagates it to the parent process.

Source code in src/eva/core/utils/multiprocessing.py
def check_exceptions(self) -> None:\n    \"\"\"Check for exception propagate it to the parent process.\"\"\"\n    if not self.is_alive():\n        if self.exception:\n            error, traceback = self.exception\n            sys.stderr.write(traceback + \"\\n\")\n            raise error\n
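A usage sketch demonstrating the exception propagation described above (import path taken from the reference):

from eva.core.utils import multiprocessing  # assumed import path

def faulty_job() -> None:
    raise ValueError("something went wrong in the child process")

if __name__ == "__main__":
    process = multiprocessing.Process(target=faulty_job)
    process.start()
    process.join()
    process.check_exceptions()  # re-raises the ValueError in the parent process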
"},{"location":"reference/core/utils/workers/","title":"Workers","text":"

Reference information for the utils Workers API.

"},{"location":"reference/core/utils/workers/#eva.core.utils.workers.main_worker_only","title":"eva.core.utils.workers.main_worker_only","text":"

Function decorator which executes the wrapped function only on the main process / worker.

Source code in src/eva/core/utils/workers.py
def main_worker_only(func: Callable) -> Any:\n    \"\"\"Function decorator which will execute it only on main / worker process.\"\"\"\n\n    def wrapper(*args: Any, **kwargs: Any) -> Any:\n        \"\"\"Wrapper function for the decorated method.\"\"\"\n        if is_main_worker():\n            return func(*args, **kwargs)\n\n    return wrapper\n
"},{"location":"reference/core/utils/workers/#eva.core.utils.workers.is_main_worker","title":"eva.core.utils.workers.is_main_worker","text":"

Returns whether the main process / worker is currently used.

Source code in src/eva/core/utils/workers.py
def is_main_worker() -> bool:\n    \"\"\"Returns whether the main process / worker is currently used.\"\"\"\n    process = multiprocessing.current_process()\n    return process.name == \"MainProcess\"\n
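A short usage sketch of the decorator and helper above (import path assumed):

from eva.core.utils import workers  # assumed import path

@workers.main_worker_only
def log_progress(message: str) -> None:
    # Executed only in the main process; dataloader workers skip it.
    print(message)

log_progress("evaluation started")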
"},{"location":"reference/vision/","title":"Vision","text":"

Reference information for the Vision API.

If you have not already installed the Vision-package, install it with:

pip install 'kaiko-eva[vision]'\n

"},{"location":"reference/vision/utils/","title":"Utils","text":""},{"location":"reference/vision/utils/#eva.vision.utils.io.image","title":"eva.vision.utils.io.image","text":"

Image I/O related functions.

"},{"location":"reference/vision/utils/#eva.vision.utils.io.image.read_image","title":"read_image","text":"

Reads and loads the image from a file path as RGB.

Parameters:

Name Type Description Default path str

The path of the image file.

required

Returns:

Type Description NDArray[uint8]

The RGB image as a numpy array.

Raises:

Type Description FileExistsError

If the path does not exist or it is unreachable.

IOError

If the image could not be loaded.

Source code in src/eva/vision/utils/io/image.py
def read_image(path: str) -> npt.NDArray[np.uint8]:\n    \"\"\"Reads and loads the image from a file path as a RGB.\n\n    Args:\n        path: The path of the image file.\n\n    Returns:\n        The RGB image as a numpy array.\n\n    Raises:\n        FileExistsError: If the path does not exist or it is unreachable.\n        IOError: If the image could not be loaded.\n    \"\"\"\n    return read_image_as_array(path, cv2.IMREAD_COLOR)\n
"},{"location":"reference/vision/utils/#eva.vision.utils.io.image.read_image_as_array","title":"read_image_as_array","text":"

Reads and loads an image file as a numpy array.

Parameters:

Name Type Description Default path str

The path to the image file.

required flags int

Specifies the way in which the image should be read.

IMREAD_UNCHANGED

Returns:

Type Description NDArray[uint8]

The image as a numpy array.

Raises:

Type Description FileExistsError

If the path does not exist or it is unreachable.

IOError

If the image could not be loaded.

Source code in src/eva/vision/utils/io/image.py
def read_image_as_array(path: str, flags: int = cv2.IMREAD_UNCHANGED) -> npt.NDArray[np.uint8]:\n    \"\"\"Reads and loads an image file as a numpy array.\n\n    Args:\n        path: The path to the image file.\n        flags: Specifies the way in which the image should be read.\n\n    Returns:\n        The image as a numpy array.\n\n    Raises:\n        FileExistsError: If the path does not exist or it is unreachable.\n        IOError: If the image could not be loaded.\n    \"\"\"\n    _utils.check_file(path)\n    image = cv2.imread(path, flags=flags)\n    if image is None:\n        raise IOError(\n            f\"Input '{path}' could not be loaded. \"\n            \"Please verify that the path is a valid image file.\"\n        )\n\n    if image.ndim == 3:\n        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)\n\n    if image.ndim == 2 and flags == cv2.IMREAD_COLOR:\n        image = image[:, :, np.newaxis]\n\n    return np.asarray(image).astype(np.uint8)\n
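A usage sketch of the readers above; the image path is a hypothetical placeholder:

from eva.vision.utils.io import image  # assumed import path

rgb = image.read_image("data/sample_patch.png")  # hypothetical path
print(rgb.shape, rgb.dtype)  # e.g. (224, 224, 3) uint8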
"},{"location":"reference/vision/utils/#eva.vision.utils.io.nifti","title":"eva.vision.utils.io.nifti","text":"

NIfTI I/O related functions.

"},{"location":"reference/vision/utils/#eva.vision.utils.io.nifti.read_nifti_slice","title":"read_nifti_slice","text":"

Reads and loads a slice of a NIfTI image from a file path.

Parameters:

Name Type Description Default path str

The path to the NIfTI file.

required slice_index int

The image slice index to return.

required use_storage_dtype bool

Whether to cast the raw image array to the inferred type.

True

Returns:

Type Description NDArray[Any]

The image as a numpy array (height, width, channels).

Raises:

Type Description FileExistsError

If the path does not exist or it is unreachable.

ValueError

If the input channel is invalid for the image.

Source code in src/eva/vision/utils/io/nifti.py
def read_nifti_slice(\n    path: str, slice_index: int, *, use_storage_dtype: bool = True\n) -> npt.NDArray[Any]:\n    \"\"\"Reads and loads a NIfTI image from a file path as `uint8`.\n\n    Args:\n        path: The path to the NIfTI file.\n        slice_index: The image slice index to return.\n        use_storage_dtype: Whether to cast the raw image\n            array to the inferred type.\n\n    Returns:\n        The image as a numpy array (height, width, channels).\n\n    Raises:\n        FileExistsError: If the path does not exist or it is unreachable.\n        ValueError: If the input channel is invalid for the image.\n    \"\"\"\n    _utils.check_file(path)\n    image_data = nib.load(path)  # type: ignore\n    image_slice = image_data.slicer[:, :, slice_index : slice_index + 1]  # type: ignore\n    image_array = image_slice.get_fdata()\n    if use_storage_dtype:\n        image_array = image_array.astype(image_data.get_data_dtype())  # type: ignore\n    return image_array\n
"},{"location":"reference/vision/utils/#eva.vision.utils.io.nifti.fetch_total_nifti_slices","title":"fetch_total_nifti_slices","text":"

Fetches the total number of slices of a NIfTI image file.

Parameters:

Name Type Description Default path str

The path to the NIfTI file.

required

Returns:

Type Description int

The total number of available slices.

Raises:

Type Description FileExistsError

If the path does not exist or it is unreachable.

ValueError

If the input channel is invalid for the image.

Source code in src/eva/vision/utils/io/nifti.py
def fetch_total_nifti_slices(path: str) -> int:\n    \"\"\"Fetches the total slides of a NIfTI image file.\n\n    Args:\n        path: The path to the NIfTI file.\n\n    Returns:\n        The number of the total available slides.\n\n    Raises:\n        FileExistsError: If the path does not exist or it is unreachable.\n        ValueError: If the input channel is invalid for the image.\n    \"\"\"\n    _utils.check_file(path)\n    image = nib.load(path)  # type: ignore\n    image_shape = image.header.get_data_shape()  # type: ignore\n    return image_shape[-1]\n
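A sketch that iterates over the slices of a NIfTI file using the two helpers above; the file path is a hypothetical placeholder:

from eva.vision.utils.io import nifti  # assumed import path

path = "data/ct_scan.nii.gz"  # hypothetical path
n_slices = nifti.fetch_total_nifti_slices(path)
for slice_index in range(n_slices):
    image_slice = nifti.read_nifti_slice(path, slice_index)
    print(slice_index, image_slice.shape)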
"},{"location":"reference/vision/data/","title":"Vision Data","text":"

Reference information for the Vision Data API.

"},{"location":"reference/vision/data/datasets/","title":"Datasets","text":""},{"location":"reference/vision/data/datasets/#visiondataset","title":"VisionDataset","text":""},{"location":"reference/vision/data/datasets/#eva.vision.data.datasets.VisionDataset","title":"eva.vision.data.datasets.VisionDataset","text":"

Bases: Dataset, ABC, Generic[DataSample]

Base dataset class for vision tasks.

"},{"location":"reference/vision/data/datasets/#eva.vision.data.datasets.VisionDataset.filename","title":"filename abstractmethod","text":"

Returns the filename of the index'th data sample.

Note that this is the file path relative to the root.

Parameters:

Name Type Description Default index int

The index of the data-sample to select.

required

Returns:

Type Description str

The filename of the index'th data sample.

Source code in src/eva/vision/data/datasets/vision.py
@abc.abstractmethod\ndef filename(self, index: int) -> str:\n    \"\"\"Returns the filename of the `index`'th data sample.\n\n    Note that this is the relative file path to the root.\n\n    Args:\n        index: The index of the data-sample to select.\n\n    Returns:\n        The filename of the `index`'th data sample.\n    \"\"\"\n
"},{"location":"reference/vision/data/datasets/#classification-datasets","title":"Classification datasets","text":""},{"location":"reference/vision/data/datasets/#eva.vision.data.datasets.BACH","title":"eva.vision.data.datasets.BACH","text":"

Bases: ImageClassification

Dataset class for BACH images and corresponding targets.

The dataset is split into train and validation by taking into account the patient IDs to avoid any data leakage.

Parameters:

Name Type Description Default root str

Path to the root directory of the dataset. The dataset will be downloaded and extracted here, if it does not already exist.

required split Literal['train', 'val'] | None

Dataset split to use. If None, the entire dataset is used.

None download bool

Whether to download the data for the specified split. Note that the download will be executed only by additionally calling the :meth:prepare_data method and if the data does not yet exist on disk.

False image_transforms Callable | None

A function/transform that takes in an image and returns a transformed version.

None target_transforms Callable | None

A function/transform that takes in the target and transforms it.

None Source code in src/eva/vision/data/datasets/classification/bach.py
def __init__(\n    self,\n    root: str,\n    split: Literal[\"train\", \"val\"] | None = None,\n    download: bool = False,\n    image_transforms: Callable | None = None,\n    target_transforms: Callable | None = None,\n) -> None:\n    \"\"\"Initialize the dataset.\n\n    The dataset is split into train and validation by taking into account\n    the patient IDs to avoid any data leakage.\n\n    Args:\n        root: Path to the root directory of the dataset. The dataset will\n            be downloaded and extracted here, if it does not already exist.\n        split: Dataset split to use. If `None`, the entire dataset is used.\n        download: Whether to download the data for the specified split.\n            Note that the download will be executed only by additionally\n            calling the :meth:`prepare_data` method and if the data does\n            not yet exist on disk.\n        image_transforms: A function/transform that takes in an image\n            and returns a transformed version.\n        target_transforms: A function/transform that takes in the target\n            and transforms it.\n    \"\"\"\n    super().__init__(\n        image_transforms=image_transforms,\n        target_transforms=target_transforms,\n    )\n\n    self._root = root\n    self._split = split\n    self._download = download\n\n    self._samples: List[Tuple[str, int]] = []\n    self._indices: List[int] = []\n
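A hedged construction sketch; per the description above, the download is triggered by the prepare_data method, and additional setup steps handled by eva's data modules may be required before indexing the dataset:

from eva.vision.data import datasets  # assumed import path

dataset = datasets.BACH(root="data/bach", split="train", download=True)
dataset.prepare_data()  # downloads and extracts the data if not already on disk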
"},{"location":"reference/vision/data/datasets/#eva.vision.data.datasets.PatchCamelyon","title":"eva.vision.data.datasets.PatchCamelyon","text":"

Bases: ImageClassification

Dataset class for PatchCamelyon images and corresponding targets.

Parameters:

Name Type Description Default root str

The path to the dataset root. This path should contain the uncompressed h5 files and the metadata.

required split Literal['train', 'val', 'test']

The dataset split for training, validation, or testing.

required download bool

Whether to download the data for the specified split. Note that the download will be executed only by additionally calling the :meth:prepare_data method.

False image_transforms Callable | None

A function/transform that takes in an image and returns a transformed version.

None target_transforms Callable | None

A function/transform that takes in the target and transforms it.

None Source code in src/eva/vision/data/datasets/classification/patch_camelyon.py
def __init__(\n    self,\n    root: str,\n    split: Literal[\"train\", \"val\", \"test\"],\n    download: bool = False,\n    image_transforms: Callable | None = None,\n    target_transforms: Callable | None = None,\n) -> None:\n    \"\"\"Initializes the dataset.\n\n    Args:\n        root: The path to the dataset root. This path should contain\n            the uncompressed h5 files and the metadata.\n        split: The dataset split for training, validation, or testing.\n        download: Whether to download the data for the specified split.\n            Note that the download will be executed only by additionally\n            calling the :meth:`prepare_data` method.\n        image_transforms: A function/transform that takes in an image\n            and returns a transformed version.\n        target_transforms: A function/transform that takes in the target\n            and transforms it.\n    \"\"\"\n    super().__init__(\n        image_transforms=image_transforms,\n        target_transforms=target_transforms,\n    )\n\n    self._root = root\n    self._split = split\n    self._download = download\n
"},{"location":"reference/vision/data/datasets/#segmentation-datasets","title":"Segmentation datasets","text":""},{"location":"reference/vision/data/datasets/#eva.vision.data.datasets.ImageSegmentation","title":"eva.vision.data.datasets.ImageSegmentation","text":"

Bases: VisionDataset[Tuple[Image, Mask]], ABC

Image segmentation abstract dataset.

Parameters:

Name Type Description Default transforms Callable | None

A function/transforms that takes in an image and a label and returns the transformed versions of both.

None Source code in src/eva/vision/data/datasets/segmentation/base.py
def __init__(\n    self,\n    transforms: Callable | None = None,\n) -> None:\n    \"\"\"Initializes the image segmentation base class.\n\n    Args:\n        transforms: A function/transforms that takes in an\n            image and a label and returns the transformed versions of both.\n    \"\"\"\n    super().__init__()\n\n    self._transforms = transforms\n
"},{"location":"reference/vision/data/datasets/#eva.vision.data.datasets.ImageSegmentation.classes","title":"classes: List[str] | None property","text":"

Returns the list with the names of the dataset classes.

"},{"location":"reference/vision/data/datasets/#eva.vision.data.datasets.ImageSegmentation.class_to_idx","title":"class_to_idx: Dict[str, int] | None property","text":"

Returns a mapping of the class name to its target index.

"},{"location":"reference/vision/data/datasets/#eva.vision.data.datasets.ImageSegmentation.load_metadata","title":"load_metadata","text":"

Returns the dataset metadata.

Parameters:

Name Type Description Default index int | None

The index of the data sample to return the metadata of. If None, it will return the metadata of the current dataset.

required

Returns:

Type Description Dict[str, Any] | List[Dict[str, Any]] | None

The sample metadata.

Source code in src/eva/vision/data/datasets/segmentation/base.py
def load_metadata(self, index: int | None) -> Dict[str, Any] | List[Dict[str, Any]] | None:\n    \"\"\"Returns the dataset metadata.\n\n    Args:\n        index: The index of the data sample to return the metadata of.\n            If `None`, it will return the metadata of the current dataset.\n\n    Returns:\n        The sample metadata.\n    \"\"\"\n
"},{"location":"reference/vision/data/datasets/#eva.vision.data.datasets.ImageSegmentation.load_image","title":"load_image abstractmethod","text":"

Loads and returns the index'th image sample.

Parameters:

Name Type Description Default index int

The index of the data sample to load.

required

Returns:

Type Description Image

An image torchvision tensor (channels, height, width).

Source code in src/eva/vision/data/datasets/segmentation/base.py
@abc.abstractmethod\ndef load_image(self, index: int) -> tv_tensors.Image:\n    \"\"\"Loads and returns the `index`'th image sample.\n\n    Args:\n        index: The index of the data sample to load.\n\n    Returns:\n        An image torchvision tensor (channels, height, width).\n    \"\"\"\n
"},{"location":"reference/vision/data/datasets/#eva.vision.data.datasets.ImageSegmentation.load_mask","title":"load_mask abstractmethod","text":"

Returns the index'th target masks sample.

Parameters:

Name Type Description Default index int

The index of the data sample target masks to load.

required

Returns:

Type Description Mask

The semantic mask as a (H x W) shaped tensor with integer values which represent the pixel class id.

Source code in src/eva/vision/data/datasets/segmentation/base.py
@abc.abstractmethod\ndef load_mask(self, index: int) -> tv_tensors.Mask:\n    \"\"\"Returns the `index`'th target masks sample.\n\n    Args:\n        index: The index of the data sample target masks to load.\n\n    Returns:\n        The semantic mask as a (H x W) shaped tensor with integer\n        values which represent the pixel class id.\n    \"\"\"\n
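
To make the contract of the two abstract hooks concrete, here is a minimal, hypothetical subclass sketch; depending on the eva version, additional abstract methods of the VisionDataset base class may also need to be implemented:

import torch\nfrom torchvision import tv_tensors\n\nfrom eva.vision.data.datasets import ImageSegmentation\n\n\nclass ToySegmentationDataset(ImageSegmentation):\n    \"\"\"Hypothetical dataset that returns random images and masks.\"\"\"\n\n    def __len__(self) -> int:\n        return 8\n\n    def load_image(self, index: int) -> tv_tensors.Image:\n        # (channels, height, width) image tensor, as documented above.\n        return tv_tensors.Image(torch.rand(3, 64, 64))\n\n    def load_mask(self, index: int) -> tv_tensors.Mask:\n        # (H x W) integer mask holding the per-pixel class ids.\n        return tv_tensors.Mask(torch.randint(0, 2, (64, 64)))\n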
"},{"location":"reference/vision/data/datasets/#eva.vision.data.datasets.TotalSegmentator2D","title":"eva.vision.data.datasets.TotalSegmentator2D","text":"

Bases: ImageSegmentation

TotalSegmentator 2D segmentation dataset.

Parameters:

Name Type Description Default root str

Path to the root directory of the dataset. The dataset will be downloaded and extracted here, if it does not already exist.

required split Literal['train', 'val'] | None

Dataset split to use. If None, the entire dataset is used.

required version Literal['small', 'full'] | None

The version of the dataset to initialize. If None, it will use the files located at root as is and won't perform any checks.

'small' download bool

Whether to download the data for the specified split. Note that the download will be executed only by additionally calling the :meth:prepare_data method and if the data does not exist yet on disk.

False as_uint8 bool

Whether to convert and return the images as 8-bit.

True transforms Callable | None

A function/transforms that takes in an image and a target mask and returns the transformed versions of both.

None Source code in src/eva/vision/data/datasets/segmentation/total_segmentator.py
def __init__(\n    self,\n    root: str,\n    split: Literal[\"train\", \"val\"] | None,\n    version: Literal[\"small\", \"full\"] | None = \"small\",\n    download: bool = False,\n    as_uint8: bool = True,\n    transforms: Callable | None = None,\n) -> None:\n    \"\"\"Initialize dataset.\n\n    Args:\n        root: Path to the root directory of the dataset. The dataset will\n            be downloaded and extracted here, if it does not already exist.\n        split: Dataset split to use. If `None`, the entire dataset is used.\n        version: The version of the dataset to initialize. If `None`, it will\n            use the files located at root as is and wont perform any checks.\n        download: Whether to download the data for the specified split.\n            Note that the download will be executed only by additionally\n            calling the :meth:`prepare_data` method and if the data does not\n            exist yet on disk.\n        as_uint8: Whether to convert and return the images as a 8-bit.\n        transforms: A function/transforms that takes in an image and a target\n            mask and returns the transformed versions of both.\n    \"\"\"\n    super().__init__(transforms=transforms)\n\n    self._root = root\n    self._split = split\n    self._version = version\n    self._download = download\n    self._as_uint8 = as_uint8\n\n    self._samples_dirs: List[str] = []\n    self._indices: List[Tuple[int, int]] = []\n
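
A usage sketch analogous to the classification datasets above (the root path is a placeholder; the setup hook is again an assumption):

from eva.vision.data.datasets import TotalSegmentator2D\n\ndataset = TotalSegmentator2D(\n    root=\"./data/total_segmentator\",\n    split=\"train\",\n    version=\"small\",\n    download=True,\n)\ndataset.prepare_data()  # the download is only triggered through this call\ndataset.setup()  # assumed Lightning-style hook, as above\n\nimage, mask = dataset[0]  # an (image, mask) pair, see VisionDataset[Tuple[Image, Mask]]\n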
"},{"location":"reference/vision/data/transforms/","title":"Transforms","text":""},{"location":"reference/vision/data/transforms/#eva.core.data.transforms.dtype.ArrayToTensor","title":"eva.core.data.transforms.dtype.ArrayToTensor","text":"

Converts a numpy array to a torch tensor.

"},{"location":"reference/vision/data/transforms/#eva.core.data.transforms.dtype.ArrayToFloatTensor","title":"eva.core.data.transforms.dtype.ArrayToFloatTensor","text":"

Bases: ArrayToTensor

Converts a numpy array to a torch tensor and casts it to float.

"},{"location":"reference/vision/data/transforms/#eva.vision.data.transforms.ResizeAndCrop","title":"eva.vision.data.transforms.ResizeAndCrop","text":"

Bases: Compose

Resizes, crops and normalizes an input image while preserving its aspect ratio.

Parameters:

Name Type Description Default size int | Sequence[int]

Desired output size of the crop. If size is an int instead of sequence like (h, w), a square crop (size, size) is made.

224 mean Sequence[float]

Sequence of means for each image channel.

(0.5, 0.5, 0.5) std Sequence[float]

Sequence of standard deviations for each image channel.

(0.5, 0.5, 0.5) Source code in src/eva/vision/data/transforms/common/resize_and_crop.py
def __init__(\n    self,\n    size: int | Sequence[int] = 224,\n    mean: Sequence[float] = (0.5, 0.5, 0.5),\n    std: Sequence[float] = (0.5, 0.5, 0.5),\n) -> None:\n    \"\"\"Initializes the transform object.\n\n    Args:\n        size: Desired output size of the crop. If size is an `int` instead\n            of sequence like (h, w), a square crop (size, size) is made.\n        mean: Sequence of means for each image channel.\n        std: Sequence of standard deviations for each image channel.\n    \"\"\"\n    self._size = size\n    self._mean = mean\n    self._std = std\n\n    super().__init__(transforms=self._build_transforms())\n
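
For illustration, a minimal sketch applying the transform to a dummy image tensor (the exact set of composed transforms is defined internally; this only shows the call convention and the expected output shape):

import torch\n\nfrom eva.vision.data.transforms import ResizeAndCrop\n\ntransform = ResizeAndCrop(size=224, mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))\n\nimage = torch.rand(3, 1536, 2048)  # e.g. a BACH-sized image tensor with values in [0, 1]\noutput = transform(image)\nprint(output.shape)  # expected: torch.Size([3, 224, 224])\n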
"},{"location":"reference/vision/models/networks/","title":"Networks","text":""},{"location":"reference/vision/models/networks/#eva.vision.models.networks.ABMIL","title":"eva.vision.models.networks.ABMIL","text":"

Bases: Module

ABMIL network for multiple instance learning classification tasks.

Takes an array of patch-level embeddings per slide as input. This implementation supports batched inputs of shape (batch_size, n_instances, input_size). For slides with fewer than n_instances patches, you can pad the inputs with the configured pad_value; the corresponding mask is then derived automatically during the forward pass.

The original implementation from [1] was used as a reference: https://github.com/AMLab-Amsterdam/AttentionDeepMIL/blob/master/model.py

Notes

[1] Maximilian Ilse, Jakub M. Tomczak, Max Welling, \"Attention-based Deep Multiple Instance Learning\", 2018 https://arxiv.org/abs/1802.04712

Parameters:

Name Type Description Default input_size int

input embedding dimension

required output_size int

number of classes

required projected_input_size int | None

size of the projected input. if None, no projection is performed.

required hidden_size_attention int

hidden dimension in attention network

128 hidden_sizes_mlp tuple

dimensions for hidden layers in last mlp

(128, 64) use_bias bool

whether to use bias in the attention network

True dropout_input_embeddings float

dropout rate for the input embeddings

0.0 dropout_attention float

dropout rate for the attention network and classifier

0.0 dropout_mlp float

dropout rate for the final MLP network

0.0 pad_value int | float | None

Value indicating padding in the input tensor. If specified, entries with this value in the input tensor will be masked. If set to None, no masking is applied.

float('-inf') Source code in src/eva/vision/models/networks/abmil.py
def __init__(\n    self,\n    input_size: int,\n    output_size: int,\n    projected_input_size: int | None,\n    hidden_size_attention: int = 128,\n    hidden_sizes_mlp: tuple = (128, 64),\n    use_bias: bool = True,\n    dropout_input_embeddings: float = 0.0,\n    dropout_attention: float = 0.0,\n    dropout_mlp: float = 0.0,\n    pad_value: int | float | None = float(\"-inf\"),\n) -> None:\n    \"\"\"Initializes the ABMIL network.\n\n    Args:\n        input_size: input embedding dimension\n        output_size: number of classes\n        projected_input_size: size of the projected input. if `None`, no projection is\n            performed.\n        hidden_size_attention: hidden dimension in attention network\n        hidden_sizes_mlp: dimensions for hidden layers in last mlp\n        use_bias: whether to use bias in the attention network\n        dropout_input_embeddings: dropout rate for the input embeddings\n        dropout_attention: dropout rate for the attention network and classifier\n        dropout_mlp: dropout rate for the final MLP network\n        pad_value: Value indicating padding in the input tensor. If specified, entries with\n            this value in the will be masked. If set to `None`, no masking is applied.\n    \"\"\"\n    super().__init__()\n\n    self._pad_value = pad_value\n\n    if projected_input_size:\n        self.projector = nn.Sequential(\n            nn.Linear(input_size, projected_input_size, bias=True),\n            nn.Dropout(p=dropout_input_embeddings),\n        )\n        input_size = projected_input_size\n    else:\n        self.projector = nn.Dropout(p=dropout_input_embeddings)\n\n    self.gated_attention = GatedAttention(\n        input_dim=input_size,\n        hidden_dim=hidden_size_attention,\n        dropout=dropout_attention,\n        n_classes=1,\n        use_bias=use_bias,\n    )\n\n    self.classifier = MLP(\n        input_size=input_size,\n        output_size=output_size,\n        hidden_layer_sizes=hidden_sizes_mlp,\n        dropout=dropout_mlp,\n        hidden_activation_fn=nn.ReLU,\n    )\n
"},{"location":"reference/vision/models/networks/#eva.vision.models.networks.ABMIL.forward","title":"forward","text":"

Forward pass.

Parameters:

Name Type Description Default input_tensor Tensor

Tensor with expected shape of (batch_size, n_instances, input_size).

required Source code in src/eva/vision/models/networks/abmil.py
def forward(self, input_tensor: torch.Tensor) -> torch.Tensor:\n    \"\"\"Forward pass.\n\n    Args:\n        input_tensor: Tensor with expected shape of (batch_size, n_instances, input_size).\n    \"\"\"\n    input_tensor, mask = self._mask_values(input_tensor, self._pad_value)\n\n    # (batch_size, n_instances, input_size) -> (batch_size, n_instances, projected_input_size)\n    input_tensor = self.projector(input_tensor)\n\n    attention_logits = self.gated_attention(input_tensor)  # (batch_size, n_instances, 1)\n    if mask is not None:\n        # fill masked values with -inf, which will yield 0s after softmax\n        attention_logits = attention_logits.masked_fill(mask, float(\"-inf\"))\n\n    attention_weights = nn.functional.softmax(attention_logits, dim=1)\n    # (batch_size, n_instances, 1)\n\n    attention_result = torch.matmul(torch.transpose(attention_weights, 1, 2), input_tensor)\n    # (batch_size, 1, hidden_size_attention)\n\n    attention_result = torch.squeeze(attention_result, 1)  # (batch_size, hidden_size_attention)\n\n    return self.classifier(attention_result)  # (batch_size, output_size)\n
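
To make the expected shapes concrete, a minimal sketch of a forward pass with padded instances (the argument values are arbitrary and the padding relies on the default pad_value handling described above):

import torch\n\nfrom eva.vision.models.networks import ABMIL\n\nmodel = ABMIL(input_size=384, output_size=2, projected_input_size=None)\n\n# Two slides, padded to 100 instances of 384-dimensional embeddings each.\nembeddings = torch.randn(2, 100, 384)\n# Mark the last 30 instances of the first slide as padding with the default\n# pad value, so that they are masked out of the attention.\nembeddings[0, 70:] = float(\"-inf\")\n\nlogits = model(embeddings)\nprint(logits.shape)  # expected: torch.Size([2, 2])\n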
"},{"location":"reference/vision/utils/io/","title":"IO","text":""},{"location":"reference/vision/utils/io/#eva.vision.utils.io.image","title":"eva.vision.utils.io.image","text":"

Image I/O related functions.

"},{"location":"reference/vision/utils/io/#eva.vision.utils.io.image.read_image","title":"read_image","text":"

Reads and loads the image from a file path as an RGB image.

Parameters:

Name Type Description Default path str

The path of the image file.

required

Returns:

Type Description NDArray[uint8]

The RGB image as a numpy array.

Raises:

Type Description FileExistsError

If the path does not exist or it is unreachable.

IOError

If the image could not be loaded.

Source code in src/eva/vision/utils/io/image.py
def read_image(path: str) -> npt.NDArray[np.uint8]:\n    \"\"\"Reads and loads the image from a file path as a RGB.\n\n    Args:\n        path: The path of the image file.\n\n    Returns:\n        The RGB image as a numpy array.\n\n    Raises:\n        FileExistsError: If the path does not exist or it is unreachable.\n        IOError: If the image could not be loaded.\n    \"\"\"\n    return read_image_as_array(path, cv2.IMREAD_COLOR)\n
"},{"location":"reference/vision/utils/io/#eva.vision.utils.io.image.read_image_as_array","title":"read_image_as_array","text":"

Reads and loads an image file as a numpy array.

Parameters:

Name Type Description Default path str

The path to the image file.

required flags int

Specifies the way in which the image should be read.

IMREAD_UNCHANGED

Returns:

Type Description NDArray[uint8]

The image as a numpy array.

Raises:

Type Description FileExistsError

If the path does not exist or it is unreachable.

IOError

If the image could not be loaded.

Source code in src/eva/vision/utils/io/image.py
def read_image_as_array(path: str, flags: int = cv2.IMREAD_UNCHANGED) -> npt.NDArray[np.uint8]:\n    \"\"\"Reads and loads an image file as a numpy array.\n\n    Args:\n        path: The path to the image file.\n        flags: Specifies the way in which the image should be read.\n\n    Returns:\n        The image as a numpy array.\n\n    Raises:\n        FileExistsError: If the path does not exist or it is unreachable.\n        IOError: If the image could not be loaded.\n    \"\"\"\n    _utils.check_file(path)\n    image = cv2.imread(path, flags=flags)\n    if image is None:\n        raise IOError(\n            f\"Input '{path}' could not be loaded. \"\n            \"Please verify that the path is a valid image file.\"\n        )\n\n    if image.ndim == 3:\n        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)\n\n    if image.ndim == 2 and flags == cv2.IMREAD_COLOR:\n        image = image[:, :, np.newaxis]\n\n    return np.asarray(image).astype(np.uint8)\n
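
A usage sketch (the file paths are placeholders):

import cv2\n\nfrom eva.vision.utils.io import image as image_io\n\nrgb = image_io.read_image(\"path/to/patch.png\")  # (H, W, 3) uint8 array in RGB order\ngray = image_io.read_image_as_array(\"path/to/mask.png\", flags=cv2.IMREAD_GRAYSCALE)\nprint(rgb.shape, rgb.dtype, gray.ndim)\n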
"},{"location":"reference/vision/utils/io/#eva.vision.utils.io.nifti","title":"eva.vision.utils.io.nifti","text":"

NIfTI I/O related functions.

"},{"location":"reference/vision/utils/io/#eva.vision.utils.io.nifti.read_nifti_slice","title":"read_nifti_slice","text":"

Reads and loads a NIfTI image from a file path as uint8.

Parameters:

Name Type Description Default path str

The path to the NIfTI file.

required slice_index int

The image slice index to return.

required use_storage_dtype bool

Whether to cast the raw image array to the inferred type.

True

Returns:

Type Description NDArray[Any]

The image as a numpy array (height, width, channels).

Raises:

Type Description FileExistsError

If the path does not exist or it is unreachable.

ValueError

If the input channel is invalid for the image.

Source code in src/eva/vision/utils/io/nifti.py
def read_nifti_slice(\n    path: str, slice_index: int, *, use_storage_dtype: bool = True\n) -> npt.NDArray[Any]:\n    \"\"\"Reads and loads a NIfTI image from a file path as `uint8`.\n\n    Args:\n        path: The path to the NIfTI file.\n        slice_index: The image slice index to return.\n        use_storage_dtype: Whether to cast the raw image\n            array to the inferred type.\n\n    Returns:\n        The image as a numpy array (height, width, channels).\n\n    Raises:\n        FileExistsError: If the path does not exist or it is unreachable.\n        ValueError: If the input channel is invalid for the image.\n    \"\"\"\n    _utils.check_file(path)\n    image_data = nib.load(path)  # type: ignore\n    image_slice = image_data.slicer[:, :, slice_index : slice_index + 1]  # type: ignore\n    image_array = image_slice.get_fdata()\n    if use_storage_dtype:\n        image_array = image_array.astype(image_data.get_data_dtype())  # type: ignore\n    return image_array\n
"},{"location":"reference/vision/utils/io/#eva.vision.utils.io.nifti.fetch_total_nifti_slices","title":"fetch_total_nifti_slices","text":"

Fetches the total number of slices of a NIfTI image file.

Parameters:

Name Type Description Default path str

The path to the NIfTI file.

required

Returns:

Type Description int

The total number of available slices.

Raises:

Type Description FileExistsError

If the path does not exist or it is unreachable.

ValueError

If the input channel is invalid for the image.

Source code in src/eva/vision/utils/io/nifti.py
def fetch_total_nifti_slices(path: str) -> int:\n    \"\"\"Fetches the total slides of a NIfTI image file.\n\n    Args:\n        path: The path to the NIfTI file.\n\n    Returns:\n        The number of the total available slides.\n\n    Raises:\n        FileExistsError: If the path does not exist or it is unreachable.\n        ValueError: If the input channel is invalid for the image.\n    \"\"\"\n    _utils.check_file(path)\n    image = nib.load(path)  # type: ignore\n    image_shape = image.header.get_data_shape()  # type: ignore\n    return image_shape[-1]\n
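
For example, the two functions can be combined to iterate over all slices of a CT volume (the file path is a placeholder):

from eva.vision.utils.io import nifti\n\npath = \"path/to/ct_volume.nii.gz\"  # placeholder\n\nn_slices = nifti.fetch_total_nifti_slices(path)\nfor slice_index in range(n_slices):\n    image_array = nifti.read_nifti_slice(path, slice_index)\n    print(slice_index, image_array.shape)  # (height, width, 1)\n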
"},{"location":"user-guide/","title":"User Guide","text":"

Here you can find everything you need to install, understand and interact with eva.

"},{"location":"user-guide/#getting-started","title":"Getting started","text":"

Install eva on your machine and learn how to use eva.

"},{"location":"user-guide/#tutorials","title":"Tutorials","text":"

To familiarize yourself with eva, try out some of our tutorials.

"},{"location":"user-guide/#advanced-user-guide","title":"Advanced user guide","text":"

Get to know eva in more depth by studying our advanced user guides.

"},{"location":"user-guide/advanced/model_wrappers/","title":"Model Wrappers","text":"

This document shows how to use eva's Model Wrapper API (eva.models.networks.wrappers) to load different model formats from a series of sources such as PyTorch Hub, HuggingFace Model Hub and ONNX.

"},{"location":"user-guide/advanced/model_wrappers/#loading-pytorch-models","title":"Loading PyTorch models","text":"

The eva framework is built on top of PyTorch Lightning and thus naturally supports loading PyTorch models. You just need to specify the class path of your model in the backbone section of the .yaml config file.

backbone:\n  class_path: path.to.your.ModelClass\n  init_args:\n    arg_1: ...\n    arg_2: ...\n

Note that your ModelClass should subclass torch.nn.Module and implement the forward() method to return embedding tensors of shape [embedding_dim].
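
For example, a minimal, hypothetical ModelClass that satisfies this contract could look as follows:

import torch\nfrom torch import nn\n\n\nclass MyBackbone(nn.Module):\n    \"\"\"Hypothetical backbone that returns one embedding vector per image.\"\"\"\n\n    def __init__(self, embedding_dim: int = 384) -> None:\n        super().__init__()\n        self._encoder = nn.Sequential(\n            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),\n            nn.ReLU(),\n            nn.AdaptiveAvgPool2d(1),\n            nn.Flatten(),\n            nn.Linear(16, embedding_dim),\n        )\n\n    def forward(self, images: torch.Tensor) -> torch.Tensor:\n        # (batch_size, 3, H, W) -> (batch_size, embedding_dim)\n        return self._encoder(images)\n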

"},{"location":"user-guide/advanced/model_wrappers/#pytorch-hub","title":"PyTorch Hub","text":"

To load models from PyTorch Hub or other torch model providers, the easiest way is to use the ModelFromFunction wrapper class:

backbone:\n  class_path: eva.models.networks.wrappers.ModelFromFunction\n  init_args:\n    path: torch.hub.load\n    arguments:\n      repo_or_dir: facebookresearch/dino:main\n      model: dino_vits16\n      pretrained: false\n    checkpoint_path: path/to/your/checkpoint.torch\n

Note that if a checkpoint_path is provided, ModelFromFunction will automatically initialize the specified model using the provided weights from that checkpoint file.

"},{"location":"user-guide/advanced/model_wrappers/#timm","title":"timm","text":"

Similar to the above example, we can easily load models using the common vision library timm:

backbone:\n  class_path: eva.models.networks.wrappers.ModelFromFunction\n  init_args:\n    path: timm.create_model\n    arguments:\n      model_name: resnet18\n      pretrained: true\n

"},{"location":"user-guide/advanced/model_wrappers/#loading-models-from-huggingface-hub","title":"Loading models from HuggingFace Hub","text":"

For loading models from HuggingFace Hub, eva provides a custom wrapper class HuggingFaceModel which can be used as follows:

backbone:\n  class_path: eva.models.networks.wrappers.HuggingFaceModel\n  init_args:\n    model_name_or_path: owkin/phikon\n    tensor_transforms: \n      class_path: eva.models.networks.transforms.ExtractCLSFeatures\n

In the above example, the forward pass implemented by the owkin/phikon model returns an output tensor containing the hidden states of all input tokens. In order to extract the state corresponding to the CLS token only, we can specify a transformation via the tensor_transforms argument which will be applied to the model output.
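
Conceptually, such a transform is a callable that is applied to the model output; an illustrative, simplified equivalent (not the actual eva implementation) would be:

import torch\n\n\ndef extract_cls_token(hidden_states: torch.Tensor) -> torch.Tensor:\n    \"\"\"Illustrative only: keeps the first (CLS) token along the token dimension.\"\"\"\n    # hidden_states: (batch_size, num_tokens, hidden_dim)\n    return hidden_states[:, 0, :]\n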

"},{"location":"user-guide/advanced/model_wrappers/#loading-onnx-models","title":"Loading ONNX models","text":"

.onnx model checkpoints can be loaded using the ONNXModel wrapper class as follows:

class_path: eva.models.networks.wrappers.ONNXModel\ninit_args:\n  path: path/to/model.onnx\n  device: cuda\n
"},{"location":"user-guide/advanced/model_wrappers/#implementing-custom-model-wrappers","title":"Implementing custom model wrappers","text":"

You can also implement your own model wrapper classes, in case your model format is not supported by the wrapper classes that eva already provides. To do so, you need to subclass eva.models.networks.wrappers.BaseModel and implement the following abstract methods:

You can take the implementations of ModelFromFunction, HuggingFaceModel and ONNXModel wrappers as a reference.
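
A hypothetical skeleton is sketched below; note that the overridden method names are assumptions made for illustration and should be checked against the BaseModel source of your eva version:

import torch\n\nfrom eva.models.networks import wrappers\n\n\nclass MyCustomModel(wrappers.BaseModel):\n    \"\"\"Hypothetical wrapper sketch; the overridden method names are assumptions.\"\"\"\n\n    def load_model(self) -> None:\n        # Build or load the underlying model, e.g. from a custom file format.\n        self._model = torch.nn.Identity()\n\n    def model_forward(self, tensor: torch.Tensor) -> torch.Tensor:\n        # Map an input batch to embedding tensors.\n        return self._model(tensor)\n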

"},{"location":"user-guide/advanced/replicate_evaluations/","title":"Replicate evaluations","text":"

To produce the evaluation results presented here, you can run eva with the settings below.

Make sure to replace <task> in the commands below with bach, crc, mhist or patch_camelyon.

Note that to run the commands below you will need to first download the data. BACH, CRC and PatchCamelyon provide automatic download by setting the argument download: true in their respective config-files. In the case of MHIST you will need to download the data manually by following the instructions provided here.

"},{"location":"user-guide/advanced/replicate_evaluations/#dino-vit-s16-random-weights","title":"DINO ViT-S16 (random weights)","text":"

Evaluating the backbone with randomly initialized weights serves as a baseline to compare the pretrained FMs to an FM that produces embeddings without any prior learning on image tasks. To evaluate, run:

PRETRAINED=false \\\nEMBEDDINGS_ROOT=\"./data/embeddings/dino_vits16_random\" \\\neva predict_fit --config configs/vision/dino_vit/offline/<task>.yaml\n
"},{"location":"user-guide/advanced/replicate_evaluations/#dino-vit-s16-imagenet","title":"DINO ViT-S16 (ImageNet)","text":"

The next baseline model uses a pretrained ViT-S16 backbone with ImageNet weights. To evaluate, run:

EMBEDDINGS_ROOT=\"./data/embeddings/dino_vits16_imagenet\" \\\neva predict_fit --config configs/vision/dino_vit/offline/<task>.yaml\n
"},{"location":"user-guide/advanced/replicate_evaluations/#dino-vit-b8-imagenet","title":"DINO ViT-B8 (ImageNet)","text":"

To evaluate performance on the larger ViT-B8 backbone pretrained on ImageNet, run:

EMBEDDINGS_ROOT=\"./data/embeddings/dino_vitb8_imagenet\" \\\nDINO_BACKBONE=dino_vitb8 \\\nIN_FEATURES=768 \\\neva predict_fit --config configs/vision/dino_vit/offline/<task>.yaml\n

"},{"location":"user-guide/advanced/replicate_evaluations/#dinov2-vit-l14-imagenet","title":"DINOv2 ViT-L14 (ImageNet)","text":"

To evaluate performance of the DINOv2 ViT-L14 backbone pretrained on ImageNet, run:

PRETRAINED=true \\\nEMBEDDINGS_ROOT=\"./data/embeddings/dinov2_vitl14_kaiko\" \\\nREPO_OR_DIR=facebookresearch/dinov2:main \\\nDINO_BACKBONE=dinov2_vitl14_reg \\\nFORCE_RELOAD=true \\\nIN_FEATURES=1024 \\\neva predict_fit --config configs/vision/dino_vit/offline/<task>.yaml\n

"},{"location":"user-guide/advanced/replicate_evaluations/#lunit-dino-vit-s16-tcga","title":"Lunit - DINO ViT-S16 (TCGA)","text":"

Lunit released the weights for a DINO ViT-S16 backbone, pretrained on TCGA data, on GitHub. To evaluate, run:

PRETRAINED=false \\\nEMBEDDINGS_ROOT=\"./data/embeddings/dino_vits16_lunit\" \\\nCHECKPOINT_PATH=\"https://github.com/lunit-io/benchmark-ssl-pathology/releases/download/pretrained-weights/dino_vit_small_patch16_ep200.torch\" \\\nNORMALIZE_MEAN=[0.70322989,0.53606487,0.66096631] \\\nNORMALIZE_STD=[0.21716536,0.26081574,0.20723464] \\\neva predict_fit --config configs/vision/dino_vit/offline/<task>.yaml\n
"},{"location":"user-guide/advanced/replicate_evaluations/#owkin-ibot-vit-b16-tcga","title":"Owkin - iBOT ViT-B16 (TCGA)","text":"

Owkin released the weights for \"Phikon\", an FM trained with iBOT on TCGA data, via HuggingFace. To evaluate, run:

EMBEDDINGS_ROOT=\"./data/embeddings/dino_vitb16_owkin\" \\\neva predict_fit --config configs/vision/owkin/phikon/offline/<task>.yaml\n

Note: since eva provides the config files to evaluate tasks with the Phikon FM in \"configs/vision/owkin/phikon/offline\", it is not necessary to set the environment variables needed for the runs above.

"},{"location":"user-guide/advanced/replicate_evaluations/#uni-dinov2-vit-l16-mass-100k","title":"UNI - DINOv2 ViT-L16 (Mass-100k)","text":"

The UNI FM, introduced in [1], is available on HuggingFace. Note that access needs to be requested.

Unlike the other FMs evaluated for our leaderboard, the UNI model uses the vision library timm to load the model. To accommodate this, you will need to modify the config files (see also Model Wrappers).

Make a copy of the task-config you'd like to run, and replace the backbone section with:

backbone:\n    class_path: eva.models.ModelFromFunction\n    init_args:\n        path: timm.create_model\n        arguments:\n            model_name: vit_large_patch16_224\n            patch_size: 16\n            init_values: 1e-5\n            num_classes: 0\n            dynamic_img_size: true\n        checkpoint_path: <path/to/pytorch_model.bin>\n

Now evaluate the model by running:

EMBEDDINGS_ROOT=\"./data/embeddings/dinov2_vitl16_uni\" \\\nIN_FEATURES=1024 \\\neva predict_fit --config path/to/<task>.yaml\n

"},{"location":"user-guide/advanced/replicate_evaluations/#kaikoai-dino-vit-s16-tcga","title":"kaiko.ai - DINO ViT-S16 (TCGA)","text":"

To evaluate kaiko.ai's FM with a DINO ViT-S16 backbone, pretrained on TCGA data and available on GitHub, run:

PRETRAINED=false \\\nEMBEDDINGS_ROOT=\"./data/embeddings/dino_vits16_kaiko\" \\\nCHECKPOINT_PATH=[TBD*] \\\nNORMALIZE_MEAN=[0.5,0.5,0.5] \\\nNORMALIZE_STD=[0.5,0.5,0.5] \\\neva predict_fit --config configs/vision/dino_vit/offline/<task>.yaml\n

* path to public checkpoint will be added when available.

"},{"location":"user-guide/advanced/replicate_evaluations/#kaikoai-dino-vit-s8-tcga","title":"kaiko.ai - DINO ViT-S8 (TCGA)","text":"

To evaluate kaiko.ai's FM with a DINO ViT-S8 backbone, pretrained on TCGA data and available on GitHub, run:

PRETRAINED=false \\\nEMBEDDINGS_ROOT=\"./data/embeddings/dino_vits8_kaiko\" \\\nDINO_BACKBONE=dino_vits8 \\\nCHECKPOINT_PATH=[TBD*] \\\nNORMALIZE_MEAN=[0.5,0.5,0.5] \\\nNORMALIZE_STD=[0.5,0.5,0.5] \\\neva predict_fit --config configs/vision/dino_vit/offline/<task>.yaml\n

* path to public checkpoint will be added when available.

"},{"location":"user-guide/advanced/replicate_evaluations/#kaikoai-dino-vit-b16-tcga","title":"kaiko.ai - DINO ViT-B16 (TCGA)","text":"

To evaluate kaiko.ai's FM with the larger DINO ViT-B16 backbone, pretrained on TCGA data, run:

PRETRAINED=false \\\nEMBEDDINGS_ROOT=\"./data/embeddings/dino_vitb16_kaiko\" \\\nDINO_BACKBONE=dino_vitb16 \\\nCHECKPOINT_PATH=[TBD*] \\\nIN_FEATURES=768 \\\nNORMALIZE_MEAN=[0.5,0.5,0.5] \\\nNORMALIZE_STD=[0.5,0.5,0.5] \\\neva predict_fit --config configs/vision/dino_vit/offline/<task>.yaml\n

* path to public checkpoint will be added when available.

"},{"location":"user-guide/advanced/replicate_evaluations/#kaikoai-dino-vit-b8-tcga","title":"kaiko.ai - DINO ViT-B8 (TCGA)","text":"

To evaluate kaiko.ai's FM with the larger DINO ViT-B8 backbone, pretrained on TCGA data, run:

PRETRAINED=false \\\nEMBEDDINGS_ROOT=\"./data/embeddings/dino_vitb8_kaiko\" \\\nDINO_BACKBONE=dino_vitb8 \\\nCHECKPOINT_PATH=[TBD*] \\\nIN_FEATURES=768 \\\nNORMALIZE_MEAN=[0.5,0.5,0.5] \\\nNORMALIZE_STD=[0.5,0.5,0.5] \\\neva predict_fit --config configs/vision/dino_vit/offline/<task>.yaml\n

* path to public checkpoint will be added when available.

"},{"location":"user-guide/advanced/replicate_evaluations/#kaikoai-dinov2-vit-l14-tcga","title":"kaiko.ai - DINOv2 ViT-L14 (TCGA)","text":"

To evaluate kaiko.ai's FM with the larger DINOv2 ViT-L14 backbone, pretrained on TCGA data, run:

PRETRAINED=false \\\nEMBEDDINGS_ROOT=\"./data/embeddings/dinov2_vitl14_kaiko\" \\\nREPO_OR_DIR=facebookresearch/dinov2:main \\\nDINO_BACKBONE=dinov2_vitl14_reg \\\nFORCE_RELOAD=true \\\nCHECKPOINT_PATH=[TBD*] \\\nIN_FEATURES=1024 \\\nNORMALIZE_MEAN=[0.5,0.5,0.5] \\\nNORMALIZE_STD=[0.5,0.5,0.5] \\\neva predict_fit --config configs/vision/dino_vit/offline/<task>.yaml\n

* path to public checkpoint will be added when available.

"},{"location":"user-guide/advanced/replicate_evaluations/#references","title":"References","text":"

[1]: Chen: A General-Purpose Self-Supervised Model for Computational Pathology, 2023 (arxiv)

"},{"location":"user-guide/getting-started/how_to_use/","title":"How to use eva","text":"

Before starting to use eva, it's important to get familiar with the different workflows, subcommands and configurations.

"},{"location":"user-guide/getting-started/how_to_use/#eva-subcommands","title":"eva subcommands","text":"

To run an evaluation, we call:

eva <subcommand> --config <path-to-config-file>\n

The eva interface supports the subcommands: predict, fit and predict_fit.

"},{"location":"user-guide/getting-started/how_to_use/#online-vs-offline-workflows","title":"* online vs. offline workflows","text":"

We distinguish between the online and offline workflow:

The online workflow can be used to quickly run a complete evaluation without saving and tracking embeddings. The offline workflow runs faster (only one FM-backbone forward pass) and is ideal for experimenting with different decoders on the same FM-backbone.

"},{"location":"user-guide/getting-started/how_to_use/#run-configurations","title":"Run configurations","text":""},{"location":"user-guide/getting-started/how_to_use/#config-files","title":"Config files","text":"

The setup for an eva run is provided in a .yaml config file which is defined with the --config flag.

A config file specifies the setup for the trainer (including the callbacks for the model backbone), the model (setup of the trainable decoder) and the data module.

You can find the config files for the datasets and models that eva supports out of the box on GitHub (scroll to the bottom of the page). We recommend inspecting some of them to get a better understanding of their structure and content.

"},{"location":"user-guide/getting-started/how_to_use/#environment-variables","title":"Environment variables","text":"

To customize runs without creating custom config files, you can overwrite the config parameters listed below by setting them as environment variables.

Type Description OUTPUT_ROOT str The directory to store logging outputs and evaluation results EMBEDDINGS_ROOT str The directory to store the computed embeddings CHECKPOINT_PATH str Path to the FM-checkpoint to be evaluated IN_FEATURES int The input feature dimension (embedding) NUM_CLASSES int Number of classes for classification tasks N_RUNS int Number of fit runs to perform in a session, defaults to 5 MAX_STEPS int Maximum number of training steps (if early stopping is not triggered) BATCH_SIZE int Batch size for a training step PREDICT_BATCH_SIZE int Batch size for a predict step LR_VALUE float Learning rate for training the decoder MONITOR_METRIC str The metric to monitor for early stopping and final model checkpoint loading MONITOR_METRIC_MODE str \"min\" or \"max\", depending on the MONITOR_METRIC used REPO_OR_DIR str GitHub repo with format containing model implementation, e.g. \"facebookresearch/dino:main\" DINO_BACKBONE str Backbone model architecture if a facebookresearch/dino FM is evaluated FORCE_RELOAD bool Whether to force a fresh download of the github repo unconditionally PRETRAINED bool Whether to load FM-backbone weights from a pretrained model"},{"location":"user-guide/getting-started/installation/","title":"Installation","text":"
pip install \"kaiko-eva[vision]\"\n
"},{"location":"user-guide/getting-started/installation/#run-eva","title":"Run eva","text":"

Now you are all set and you can start running eva with:

eva <subcommand> --config <path-to-config-file>\n
To learn how the subcommands and configs work, we recommend you familiarize yourself with How to use eva and then proceed to running eva with the Tutorials.

"},{"location":"user-guide/tutorials/evaluate_resnet/","title":"Train and evaluate a ResNet","text":"

If you read How to use eva and followed the Tutorials to this point, you might ask yourself why you would not always use the offline workflow to run a complete evaluation. An offline run stores the computed embeddings and runs faster than the online workflow, which computes a backbone forward pass in every epoch.

One use case for the online-workflow is the evaluation of a supervised ML model that does not rely on a backbone/head architecture. To demonstrate this, let's train a ResNet 18 from PyTorch Image Models (timm).

To do this we need to create a new config-file:

Now let's adapt the new bach.yaml-config to the new model:

    head:\n      class_path: eva.models.ModelFromFunction\n      init_args:\n        path: timm.create_model\n        arguments:\n          model_name: resnet18\n          num_classes: &NUM_CLASSES 4\n          drop_rate: 0.0\n          pretrained: false\n
To reduce training time, let's overwrite some of the default parameters. Run the training & evaluation with:
OUTPUT_ROOT=logs/resnet/bach \\\nMAX_STEPS=50 \\\nLR_VALUE=0.01 \\\neva fit --config configs/vision/resnet18/bach.yaml\n
Once the run is complete, take a look at the results in logs/resnet/bach/<session-id>/results.json and check out the tensorboard with tensorboard --logdir logs/resnet/bach. How does the performance compare to the results observed in the previous tutorials?

"},{"location":"user-guide/tutorials/offline_vs_online/","title":"Offline vs. online evaluations","text":"

In this tutorial we run eva with the three subcommands predict, fit and predict_fit, and take a look at the difference between offline and online workflows.

"},{"location":"user-guide/tutorials/offline_vs_online/#before-you-start","title":"Before you start","text":"

If you haven't downloaded the config files yet, please download them from GitHub (scroll to the bottom of the page).

For this tutorial we use the BACH classification task which is available on Zenodo and is distributed under Attribution-NonCommercial-ShareAlike 4.0 International license.

To let eva automatically handle the dataset download, you can open configs/vision/dino_vit/offline/bach.yaml and set download: true. Before doing so, please make sure that your use case is compliant with the dataset license.

"},{"location":"user-guide/tutorials/offline_vs_online/#offline-evaluations","title":"Offline evaluations","text":""},{"location":"user-guide/tutorials/offline_vs_online/#1-compute-the-embeddings","title":"1. Compute the embeddings","text":"

First, let's use the predict-command to download the data and compute embeddings. In this example we use a randomly initialized dino_vits16 as backbone.

Open a terminal in the folder where you installed eva and run:

PRETRAINED=false \\\nEMBEDDINGS_ROOT=./data/embeddings/dino_vits16_random \\\neva predict --config configs/vision/dino_vit/offline/bach.yaml\n

Executing this command will:

Once the session is complete, verify that:

"},{"location":"user-guide/tutorials/offline_vs_online/#2-evaluate-the-fm","title":"2. Evaluate the FM","text":"

Now we can use the fit-command to evaluate the FM on the precomputed embeddings.

To ensure a quick run for the purpose of this exercise, we overwrite some of the default parameters. Run eva to fit the decoder classifier with:

N_RUNS=2 \\\nMAX_STEPS=20 \\\nLR_VALUE=0.1 \\\neva fit --config configs/vision/dino_vit/offline/bach.yaml\n

Executing this command will:

Once the session is complete:

"},{"location":"user-guide/tutorials/offline_vs_online/#3-run-a-complete-offline-workflow","title":"3. Run a complete offline-workflow","text":"

With the predict_fit-command, the two steps above can be executed in a single run. Let's do this, but this time let's use an FM pretrained on ImageNet.

Go back to the terminal and execute:

N_RUNS=1 \\\nMAX_STEPS=20 \\\nLR_VALUE=0.1 \\\nPRETRAINED=true \\\nEMBEDDINGS_ROOT=./data/embeddings/dino_vits16_pretrained \\\neva predict_fit --config configs/vision/dino_vit/offline/bach.yaml\n

Once the session is complete, inspect the evaluation results as you did in Step 2. Compare the performance metrics and training curves. Can you observe better performance with the ImageNet pretrained encoder?

"},{"location":"user-guide/tutorials/offline_vs_online/#online-evaluations","title":"Online evaluations","text":"

Alternatively to the offline workflow from Step 3, a complete evaluation can also be computed online. In this case we don't save and track embeddings and instead fit the ML model (encoder with frozen layers + trainable decoder) directly on the given task.

As in Step 3 above, we again use a dino_vits16 backbone pretrained on ImageNet.

Run a complete online workflow with the following command:

N_RUNS=1 \\\nMAX_STEPS=20 \\\nLR_VALUE=0.1 \\\nPRETRAINED=true \\\neva fit --config configs/vision/dino_vit/online/bach.yaml\n

Executing this command will:

Once the run is complete:

"}]} \ No newline at end of file +{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Introduction","text":""},{"location":"#_1","title":"Introduction","text":"

Oncology FM Evaluation Framework by kaiko.ai

With the first release, eva supports performance evaluation for vision Foundation Models (\"FMs\") and supervised machine learning models on WSI-patch-level image classification task. Support for radiology (CT-scans) segmentation tasks will be added soon.

With eva we provide the open-source community with an easy-to-use framework that follows industry best practices to deliver a robust, reproducible and fair evaluation benchmark across FMs of different sizes and architectures.

Support for additional modalities and tasks will be added in future releases.

"},{"location":"#use-cases","title":"Use cases","text":""},{"location":"#1-evaluate-your-own-fms-on-public-benchmark-datasets","title":"1. Evaluate your own FMs on public benchmark datasets","text":"

With a specified FM as input, you can run eva on several publicly available datasets & tasks. One evaluation run will download and preprocess the relevant data, compute embeddings, fit and evaluate a downstream head and report the mean and standard deviation of the relevant performance metrics.

Supported datasets & tasks include:

WSI patch-level pathology datasets

Radiology datasets

To evaluate FMs, eva provides support for different model-formats, including models trained with PyTorch, models available on HuggingFace and ONNX-models. For other formats custom wrappers can be implemented.

"},{"location":"#2-evaluate-ml-models-on-your-own-dataset-task","title":"2. Evaluate ML models on your own dataset & task","text":"

If you have your own labeled dataset, all that is needed is to implement a dataset class tailored to your source data. Start from one of our out-of-the box provided dataset classes, adapt it to your data and run eva to see how different FMs perform on your task.

"},{"location":"#evaluation-results","title":"Evaluation results","text":"

We evaluated the following FMs on the 4 supported WSI-patch-level image classification tasks. On the table below we report Balanced Accuracy for binary & multiclass tasks and show the average performance & standard deviation over 5 runs.

FM-backbone pretraining BACH CRC MHIST PCam/val PCam/test DINO ViT-S16 N/A 0.410 (\u00b10.009) 0.617 (\u00b10.008) 0.501 (\u00b10.004) 0.753 (\u00b10.002) 0.728 (\u00b10.003) DINO ViT-S16 ImageNet 0.695 (\u00b10.004) 0.935 (\u00b10.003) 0.831 (\u00b10.002) 0.864 (\u00b10.007) 0.849 (\u00b10.007) DINO ViT-B8 ImageNet 0.710 (\u00b10.007) 0.939 (\u00b10.001) 0.814 (\u00b10.003) 0.870 (\u00b10.003) 0.856 (\u00b10.004) DINOv2 ViT-L14 ImageNet 0.707 (\u00b10.008) 0.916 (\u00b10.002) 0.832 (\u00b10.003) 0.873 (\u00b10.001) 0.888 (\u00b10.001) Lunit - ViT-S16 TCGA 0.801 (\u00b10.005) 0.934 (\u00b10.001) 0.768 (\u00b10.004) 0.889 (\u00b10.002) 0.895 (\u00b10.006) Owkin - iBOT ViT-B16 TCGA 0.725 (\u00b10.004) 0.935 (\u00b10.001) 0.777 (\u00b10.005) 0.912 (\u00b10.002) 0.915 (\u00b10.003) UNI - DINOv2 ViT-L16 Mass-100k 0.814 (\u00b10.008) 0.950 (\u00b10.001) 0.837 (\u00b10.001) 0.936 (\u00b10.001) 0.938 (\u00b10.001) kaiko.ai - DINO ViT-S16 TCGA 0.797 (\u00b10.003) 0.943 (\u00b10.001) 0.828 (\u00b10.003) 0.903 (\u00b10.001) 0.893 (\u00b10.005) kaiko.ai - DINO ViT-S8 TCGA 0.834 (\u00b10.012) 0.946 (\u00b10.002) 0.832 (\u00b10.006) 0.897 (\u00b10.001) 0.887 (\u00b10.002) kaiko.ai - DINO ViT-B16 TCGA 0.810 (\u00b10.008) 0.960 (\u00b10.001) 0.826 (\u00b10.003) 0.900 (\u00b10.002) 0.898 (\u00b10.003) kaiko.ai - DINO ViT-B8 TCGA 0.865 (\u00b10.019) 0.956 (\u00b10.001) 0.809 (\u00b10.021) 0.913 (\u00b10.001) 0.921 (\u00b10.002) kaiko.ai - DINOv2 ViT-L14 TCGA 0.870 (\u00b10.005) 0.930 (\u00b10.001) 0.809 (\u00b10.001) 0.908 (\u00b10.001) 0.898 (\u00b10.002)

The runs use the default setup described in the section below.

eva trains the decoder on the \"train\" split and uses the \"validation\" split for monitoring, early stopping and checkpoint selection. Evaluation results are reported on the \"validation\" split and, if available, on the \"test\" split.

For more details on the FM-backbones and instructions to replicate the results, check out Replicate evaluations.

"},{"location":"#evaluation-setup","title":"Evaluation setup","text":"

Note that the current version of eva implements a task- and model-independent, fixed default setup, following the standard evaluation protocol proposed by [1] and described in the table below. We selected this approach to prioritize a reliable, robust and fair FM evaluation while staying in line with common literature. In future versions we plan to support cross-validation and hyper-parameter tuning to find the optimal setup and achieve the best possible performance on the implemented downstream tasks.

With a provided FM, eva computes embeddings for all input images (WSI patches) which are then used to train a downstream head consisting of a single linear layer in a supervised setup for each of the benchmark datasets. We use early stopping with a patience of 5% of the maximal number of epochs.

Backbone frozen Hidden layers none Dropout 0.0 Activation function none Number of steps 12,500 Base Batch size 4,096 Batch size dataset specific* Base learning rate 0.01 Learning Rate [Base learning rate] * [Batch size] / [Base batch size] Max epochs [Number of samples] * [Number of steps] / [Batch size] Early stopping 5% * [Max epochs] Optimizer SGD Momentum 0.9 Weight Decay 0.0 Nesterov momentum true LR Schedule Cosine without warmup

* For smaller datasets (e.g. BACH with 400 samples) we reduce the batch size to 256 and scale the learning rate accordingly.
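
As a worked example of the learning-rate scaling rule for such a reduced batch size:

# Learning rate scaling as defined above: [Base learning rate] * [Batch size] / [Base batch size]\nbase_learning_rate = 0.01\nbase_batch_size = 4096\nbatch_size = 256  # e.g. for BACH\n\nlearning_rate = base_learning_rate * batch_size / base_batch_size\nprint(learning_rate)  # 0.000625\n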

"},{"location":"#license","title":"License","text":"

eva is distributed under the terms of the Apache-2.0 license.

"},{"location":"#next-steps","title":"Next steps","text":"

Check out the User Guide to get started with eva

"},{"location":"CODE_OF_CONDUCT/","title":"Contributor Covenant Code of Conduct","text":""},{"location":"CODE_OF_CONDUCT/#our-pledge","title":"Our Pledge","text":"

In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation.

"},{"location":"CODE_OF_CONDUCT/#our-standards","title":"Our Standards","text":"

Examples of behavior that contributes to creating a positive environment include:

Examples of unacceptable behavior by participants include:

"},{"location":"CODE_OF_CONDUCT/#our-responsibilities","title":"Our Responsibilities","text":"

Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior.

Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful.

"},{"location":"CODE_OF_CONDUCT/#scope","title":"Scope","text":"

This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers.

"},{"location":"CODE_OF_CONDUCT/#enforcement","title":"Enforcement","text":"

Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project team at eva@kaiko.ai. All complaints will be reviewed and investigated and will result in a response that is deemed necessary and appropriate to the circumstances. The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately.

Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project's leadership.

"},{"location":"CODE_OF_CONDUCT/#attribution","title":"Attribution","text":"

This Code of Conduct is adapted from the Contributor Covenant, version 1.4, available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html

For answers to common questions about this code of conduct, see https://www.contributor-covenant.org/faq

"},{"location":"CONTRIBUTING/","title":"Contributing to eva","text":"

eva is open source and community contributions are welcome!

"},{"location":"CONTRIBUTING/#contribution-process","title":"Contribution Process","text":""},{"location":"CONTRIBUTING/#github-issues","title":"GitHub Issues","text":"

The eva contribution process generally starts with filing a GitHub issue.

eva defines four categories of issues: feature requests, bug reports, documentation fixes, and installation issues. In general, we recommend waiting for feedback from an eva maintainer or community member before proceeding to implement a feature or patch.

"},{"location":"CONTRIBUTING/#pull-requests","title":"Pull Requests","text":"

After you have agreed upon an implementation strategy for your feature or patch with an eva maintainer, the next step is to introduce your changes as a pull request against the eva repository.

Steps to make a pull request:

Once your pull request has been merged, your changes will be automatically included in the next eva release!

"},{"location":"DEVELOPER_GUIDE/","title":"Developer Guide","text":""},{"location":"DEVELOPER_GUIDE/#setting-up-a-dev-environment","title":"Setting up a DEV environment","text":"

We use PDM as a package and dependency manager. You can set up a local Python environment for development as follows: 1. Install the package and dependency manager PDM following the instructions here. 2. Install the system dependencies - For macOS: brew install cmake - For Linux (Debian): sudo apt-get install build-essential cmake 3. Run pdm install -G dev to install the Python dependencies. This will create a virtual environment in eva/.venv.

"},{"location":"DEVELOPER_GUIDE/#adding-new-dependencies","title":"Adding new dependencies","text":"

Add a new dependency to the core submodule: pdm add <package_name>

Add a new dependency to the vision submodule: pdm add -G vision -G all <package_name>

For more information about managing dependencies please look here.

"},{"location":"DEVELOPER_GUIDE/#continuous-integration-ci","title":"Continuous Integration (CI)","text":"

For testing automation, we use nox.

Installation: - with brew: brew install nox - with pip: pip install --user --upgrade nox (this way, you might need to run nox commands with python -m nox or specify an alias)

Commands: - nox to run all the automation tests. - nox -s fmt to run the code formatting tests. - nox -s lint to run the code linting tests. - nox -s check to run the type-annotation tests. - nox -s test to run the unit tests. - nox -s test -- tests/eva/metrics/test_average_loss.py to run specific tests

"},{"location":"STYLE_GUIDE/","title":"eva Style Guide","text":"

This document contains our style guides used in eva.

Our priority is consistency, so that developers can quickly ingest and understand the entire codebase without being distracted by style idiosyncrasies.

"},{"location":"STYLE_GUIDE/#general-coding-principles","title":"General coding principles","text":"

Q: How to keep code readable and maintainable? - Don't Repeat Yourself (DRY) - Use the lowest possible visibility for a variable or method (i.e. make private if possible) -- see Information Hiding / Encapsulation

Q: How big should a function be? - Single Level of Abstraction Principle (SLAP) - High Cohesion and Low Coupling

TL;DR: functions should usually be quite small, and _do one thing_\n
"},{"location":"STYLE_GUIDE/#python-style-guide","title":"Python Style Guide","text":"

In general we follow the following regulations: PEP8, the Google Python Style Guide and we expect type hints/annotations.

"},{"location":"STYLE_GUIDE/#docstrings","title":"Docstrings","text":"

Our docstring style is derived from Google Python style.

def example_function(variable: int, optional: str | None = None) -> str:\n    \"\"\"An example docstring that explains what this functions do.\n\n    Docs sections can be referenced via :ref:`custom text here <anchor-link>`.\n\n    Classes can be referenced via :class:`eva.data.datamodules.DataModule`.\n\n    Functions can be referenced via :func:`eva.data.datamodules.call.call_method_if_exists`.\n\n    Example:\n\n        >>> from torch import nn\n        >>> import eva\n        >>> eva.models.modules.HeadModule(\n        >>>     head=nn.Linear(10, 2),\n        >>>     criterion=nn.CrossEntropyLoss(),\n        >>> )\n\n    Args:\n        variable: A required argument.\n        optional: An optional argument.\n\n    Returns:\n        A description of the output string.\n    \"\"\"\n    pass\n
"},{"location":"STYLE_GUIDE/#module-docstrings","title":"Module docstrings","text":"

PEP-8 and PEP-257 indicate docstrings should have very specific syntax:

\"\"\"One line docstring that shouldn't wrap onto next line.\"\"\"\n
\"\"\"First line of multiline docstring that shouldn't wrap.\n\nSubsequent line or paragraphs.\n\"\"\"\n
"},{"location":"STYLE_GUIDE/#constants-docstrings","title":"Constants docstrings","text":"

Public constants should usually have docstrings; they are optional for private constants. Docstrings on constants go underneath the constant:

SOME_CONSTANT = 3\n\"\"\"Either a single-line docstring or multiline as per above.\"\"\"\n
"},{"location":"STYLE_GUIDE/#function-docstrings","title":"Function docstrings","text":"

All public functions should have docstrings following the pattern shown below.

Each section can be omitted if there are no inputs, outputs, or no notable exceptions raised, respectively.

def fake_datamodule(\n    n_samples: int, random: bool = True\n) -> eva.data.datamodules.DataModule:\n    \"\"\"Generates a fake DataModule.\n\n    It builds a :class:`eva.data.datamodules.DataModule` by generating\n    a fake dataset with generated data while fixing the seed. It can\n    be useful for debugging purposes.\n\n    Args:\n        n_samples: The number of samples of the generated datasets.\n        random: Whether to generated randomly.\n\n    Returns:\n        A :class:`eva.data.datamodules.DataModule` with generated random data.\n\n    Raises:\n        ValueError: If `n_samples` is `0`.\n    \"\"\"\n    pass\n
"},{"location":"STYLE_GUIDE/#class-docstrings","title":"Class docstrings","text":"

All public classes should have class docstrings following the pattern shown below.

class DataModule(pl.LightningDataModule):\n    \"\"\"DataModule encapsulates all the steps needed to process data.\n\n    It will initialize and create the mapping between dataloaders and\n    datasets. During the `prepare_data`, `setup` and `teardown`, the\n    datamodule will call the respectively methods from all the datasets,\n    given that they are defined.\n    \"\"\"\n\n    def __init__(\n        self,\n        datasets: schemas.DatasetsSchema | None = None,\n        dataloaders: schemas.DataloadersSchema | None = None,\n    ) -> None:\n        \"\"\"Initializes the datamodule.\n\n        Args:\n            datasets: The desired datasets. Defaults to `None`.\n            dataloaders: The desired dataloaders. Defaults to `None`.\n        \"\"\"\n        pass\n
"},{"location":"datasets/","title":"Datasets","text":"

eva provides native support for several public datasets. Where possible, the corresponding dataset classes facilitate automatic download to disk; otherwise, this documentation provides download instructions.

"},{"location":"datasets/#vision-datasets-overview","title":"Vision Datasets Overview","text":""},{"location":"datasets/#whole-slide-wsi-and-microscopy-image-datasets","title":"Whole Slide (WSI) and microscopy image datasets","text":"Dataset #Patches Patch Size Magnification (\u03bcm/px) Task Cancer Type BACH 400 2048x1536 20x (0.5) Classification (4 classes) Breast CRC 107,180 224x224 20x (0.5) Classification (9 classes) Colorectal PatchCamelyon 327,680 96x96 10x (1.0) * Classification (2 classes) Breast MHIST 3,152 224x224 5x (2.0) * Classification (2 classes) Colorectal Polyp

* Downsampled from 40x (0.25 \u03bcm/px) to increase the field of view.

"},{"location":"datasets/#radiology-datasets","title":"Radiology datasets","text":"Dataset #Images Image Size Task Download provided TotalSegmentator 1228 ~300 x ~300 x ~350 * Multilabel Classification (117 classes) Yes

* 3D images of varying sizes

"},{"location":"datasets/bach/","title":"BACH","text":"

The BACH dataset consists of microscopy and WSI images, of which we use only the microscopy images. These are 408 labeled images from 4 classes (\"Normal\", \"Benign\", \"Invasive\", \"InSitu\"). This dataset was used for the \"BACH Grand Challenge on Breast Cancer Histology images\".

"},{"location":"datasets/bach/#raw-data","title":"Raw data","text":""},{"location":"datasets/bach/#key-stats","title":"Key stats","text":"Modality Vision (microscopy images) Task Multiclass classification (4 classes) Cancer type Breast Data size total: 10.4GB / data in use: 7.37 GB (18.9 MB per image) Image dimension 1536 x 2048 x 3 Magnification (\u03bcm/px) 20x (0.42) Files format .tif images Number of images 408 (102 from each class) Splits in use one labeled split"},{"location":"datasets/bach/#organization","title":"Organization","text":"

The data ICIAR2018_BACH_Challenge.zip from zenodo is organized as follows:

ICIAR2018_BACH_Challenge\n\u251c\u2500\u2500 Photos                    # All labeled patches used by eva\n\u2502   \u251c\u2500\u2500 Normal\n\u2502   \u2502   \u251c\u2500\u2500 n032.tif\n\u2502   \u2502   \u2514\u2500\u2500 ...\n\u2502   \u251c\u2500\u2500 Benign\n\u2502   \u2502   \u2514\u2500\u2500 ...\n\u2502   \u251c\u2500\u2500 Invasive\n\u2502   \u2502   \u2514\u2500\u2500 ...\n\u2502   \u251c\u2500\u2500 InSitu\n\u2502   \u2502   \u2514\u2500\u2500 ...\n\u251c\u2500\u2500 WSI                       # WSIs, not in use\n\u2502   \u251c\u2500\u2500 ...\n\u2514\u2500\u2500 ...\n
"},{"location":"datasets/bach/#download-and-preprocessing","title":"Download and preprocessing","text":"

The BACH dataset class supports downloading the data during runtime by setting the init argument download=True.

Note that in the provided BACH-config files the download argument is set to false. To enable automatic download you will need to open the config and set download: true.
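
For illustration, the same behaviour can be sketched directly in Python; the exact import path (assumed here to be eva.vision.data.datasets.BACH) and the root directory are placeholders, so please check the Reference API before copying:

# Hedged sketch: the import path and root directory are assumptions.\nfrom eva.vision.data import datasets\n\ndataset = datasets.BACH(\n    root=\"./data/bach\",  # where the archive is downloaded and extracted\n    download=True,       # same effect as setting download: true in the config\n)\ndataset.prepare_data()   # the dataset hook responsible for downloading the data\n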

The splits are created from the indices specified in the BACH dataset class. These indices were picked to prevent data leakage from images belonging to the same patient. Because the small dataset size in combination with the patient-ID constraint does not allow splitting the data three ways with a sufficient amount of data in each split, we only create a train and val split and leave it to the user to submit predictions on the official test split to the BACH Challenge Leaderboard.

Splits Train Validation #Samples 268 (67%) 132 (33%)"},{"location":"datasets/bach/#relevant-links","title":"Relevant links","text":""},{"location":"datasets/bach/#license","title":"License","text":"

Attribution-NonCommercial-ShareAlike 4.0 International

"},{"location":"datasets/crc/","title":"CRC","text":"

The CRC-HE dataset consists of labeled patches (9 classes) from colorectal cancer (CRC) and normal tissue. We use the NCT-CRC-HE-100K dataset for training and validation and the CRC-VAL-HE-7K for testing.

The NCT-CRC-HE-100K-NONORM dataset consists of 100,000 images without color normalization applied. The CRC-VAL-HE-7K dataset consists of 7,180 image patches from 50 patients, with no overlap with NCT-CRC-HE-100K-NONORM.

The tissue classes are: Adipose (ADI), background (BACK), debris (DEB), lymphocytes (LYM), mucus (MUC), smooth muscle (MUS), normal colon mucosa (NORM), cancer-associated stroma (STR) and colorectal adenocarcinoma epithelium (TUM).

"},{"location":"datasets/crc/#raw-data","title":"Raw data","text":""},{"location":"datasets/crc/#key-stats","title":"Key stats","text":"Modality Vision (WSI patches) Task Multiclass classification (9 classes) Cancer type Colorectal Data size total: 11.7GB (train), 800MB (val) Image dimension 224 x 224 x 3 Magnification (\u03bcm/px) 20x (0.5) Files format .tif images Number of images 107,180 (100k train, 7.2k val) Splits in use NCT-CRC-HE-100K (train), CRC-VAL-HE-7K (val)"},{"location":"datasets/crc/#splits","title":"Splits","text":"

We use the splits according to the data sources:

Splits Train Validation #Samples 100,000 (93.3%) 7,180 (6.7%)

A test split is not provided. Because the patient information for the training data is not available, dividing the training data into a train/val split (and using the given val split as the test split) is not possible without risking data leakage. eva therefore reports evaluation results for CRC-HE on the validation split.

"},{"location":"datasets/crc/#organization","title":"Organization","text":"

The data NCT-CRC-HE-100K.zip, NCT-CRC-HE-100K-NONORM.zip and CRC-VAL-HE-7K.zip from zenodo are organized as follows:

NCT-CRC-HE-100K                # All images used for training\n\u251c\u2500\u2500 ADI                        # All labeled patches belonging to the 1st class\n\u2502   \u251c\u2500\u2500 ADI-AAAFLCLY.tif\n\u2502   \u251c\u2500\u2500 ...\n\u251c\u2500\u2500 BACK                       # All labeled patches belonging to the 2nd class\n\u2502   \u251c\u2500\u2500 ...\n\u2514\u2500\u2500 ...\n\nNCT-CRC-HE-100K-NONORM         # All images used for training\n\u251c\u2500\u2500 ADI                        # All labeled patches belonging to the 1st class\n\u2502   \u251c\u2500\u2500 ADI-AAAFLCLY.tif\n\u2502   \u251c\u2500\u2500 ...\n\u251c\u2500\u2500 BACK                       # All labeled patches belonging to the 2nd class\n\u2502   \u251c\u2500\u2500 ...\n\u2514\u2500\u2500 ...\n\nCRC-VAL-HE-7K                  # All images used for validation\n\u251c\u2500\u2500 ...                        # identical structure as for NCT-CRC-HE-100K-NONORM\n\u2514\u2500\u2500 ...\n
"},{"location":"datasets/crc/#download-and-preprocessing","title":"Download and preprocessing","text":"

The CRC dataset class supports downloading the data during runtime by setting the init argument download=True.

Note that in the provided CRC-config files the download argument is set to false. To enable automatic download you will need to open the config and set download: true.

"},{"location":"datasets/crc/#relevant-links","title":"Relevant links","text":""},{"location":"datasets/crc/#license","title":"License","text":"

CC BY 4.0 LEGAL CODE

"},{"location":"datasets/mhist/","title":"MHIST","text":"

MHIST is a binary classification task comprising 3,152 hematoxylin and eosin (H&E)-stained, formalin-fixed paraffin-embedded (FFPE), fixed-size images (224 by 224 pixels) of colorectal polyps from the Department of Pathology and Laboratory Medicine at Dartmouth-Hitchcock Medical Center (DHMC).

The two tissue classes are Hyperplastic Polyp (HP) and Sessile Serrated Adenoma (SSA). This classification task focuses on the clinically important binary distinction between HPs and SSAs, a challenging problem with considerable inter-pathologist variability. HPs are typically benign, while SSAs are precancerous lesions that can turn into cancer if left untreated and therefore require earlier follow-up examinations. Histologically, HPs have a superficial serrated architecture and elongated crypts, whereas SSAs are characterized by broad-based crypts, often with complex structure and heavy serration.

"},{"location":"datasets/mhist/#raw-data","title":"Raw data","text":""},{"location":"datasets/mhist/#key-stats","title":"Key stats","text":"Modality Vision (WSI patches) Task Binary classification (2 classes) Cancer type Colorectal Polyp Data size 354 MB Image dimension 224 x 224 x 3 Magnification (\u03bcm/px) 5x (2.0) * Files format .png images Number of images 3,152 (2,175 train, 977 test) Splits in use annotations.csv (train / test)

* Downsampled from 40x to increase the field of view.

"},{"location":"datasets/mhist/#organization","title":"Organization","text":"

The contents of images.zip and the file annotations.csv from BMIRDS are organized as follows:

mhist                           # Root folder\n\u251c\u2500\u2500 images                      # All the dataset images\n\u2502   \u251c\u2500\u2500 MHIST_aaa.png\n\u2502   \u251c\u2500\u2500 MHIST_aab.png\n\u2502   \u251c\u2500\u2500 ...\n\u2514\u2500\u2500 annotations.csv             # The dataset annotations file\n
"},{"location":"datasets/mhist/#download-and-preprocessing","title":"Download and preprocessing","text":"

To download the dataset, please visit the access portal on BMIRDS and follow the instructions. You will then receive an email with all the relevant links that you can use to download the data (images.zip, annotations.csv, Dataset Research Use Agreement.pdf and MD5SUMs.txt).

Please create a root folder, e.g. mhist, and download all the files there, then unzip the contents of images.zip into a directory named images inside your root folder (i.e. mhist/images). Afterwards, you can (optionally) delete the images.zip file.

"},{"location":"datasets/mhist/#splits","title":"Splits","text":"

We work with the splits provided by the data source. Since no \"validation\" split is provided, we use the \"test\" split as the validation split.

Splits Train Validation #Samples 2,175 (69%) 977 (31%)"},{"location":"datasets/mhist/#relevant-links","title":"Relevant links","text":""},{"location":"datasets/patch_camelyon/","title":"PatchCamelyon","text":"

The PatchCamelyon benchmark is an image classification dataset with 327,680 color images (96 x 96px) extracted from histopathologic scans of lymph node sections. Each image is annotated with a binary label indicating presence of metastatic tissue.

"},{"location":"datasets/patch_camelyon/#raw-data","title":"Raw data","text":""},{"location":"datasets/patch_camelyon/#key-stats","title":"Key stats","text":"Modality Vision (WSI patches) Task Binary classification Cancer type Breast Data size 8 GB Image dimension 96 x 96 x 3 Magnification (\u03bcm/px) 10x (1.0) * Files format h5 Number of images 327,680 (50% of each class)

* The slides were acquired and digitized at 2 different medical centers using a 40x objective but under-sampled to 10x to increase the field of view.

"},{"location":"datasets/patch_camelyon/#splits","title":"Splits","text":"

The data source provides train/validation/test splits.

Splits Train Validation Test #Samples 262,144 (80%) 32,768 (10%) 32,768 (10%)"},{"location":"datasets/patch_camelyon/#organization","title":"Organization","text":"

The PatchCamelyon data from zenodo is organized as follows:

\u251c\u2500\u2500 camelyonpatch_level_2_split_train_x.h5.gz               # train images\n\u251c\u2500\u2500 camelyonpatch_level_2_split_train_y.h5.gz               # train labels\n\u251c\u2500\u2500 camelyonpatch_level_2_split_valid_x.h5.gz               # val images\n\u251c\u2500\u2500 camelyonpatch_level_2_split_valid_y.h5.gz               # val labels\n\u251c\u2500\u2500 camelyonpatch_level_2_split_test_x.h5.gz                # test images\n\u251c\u2500\u2500 camelyonpatch_level_2_split_test_y.h5.gz                # test labels\n
"},{"location":"datasets/patch_camelyon/#download-and-preprocessing","title":"Download and preprocessing","text":"

The dataset class PatchCamelyon supports downloading the data during runtime by setting the init argument download=True.

Note that in the provided PatchCamelyon-config files the download argument is set to false. To enable automatic download you will need to open the config and set download: true.

Labels are provided by the source files; the splits are given by the file names.

"},{"location":"datasets/patch_camelyon/#relevant-links","title":"Relevant links","text":""},{"location":"datasets/patch_camelyon/#citation","title":"Citation","text":"
@misc{b_s_veeling_j_linmans_j_winkens_t_cohen_2018_2546921,\n  author       = {B. S. Veeling, J. Linmans, J. Winkens, T. Cohen, M. Welling},\n  title        = {Rotation Equivariant CNNs for Digital Pathology},\n  month        = sep,\n  year         = 2018,\n  doi          = {10.1007/978-3-030-00934-2_24},\n  url          = {https://doi.org/10.1007/978-3-030-00934-2_24}\n}\n
"},{"location":"datasets/patch_camelyon/#license","title":"License","text":"

Creative Commons Zero v1.0 Universal

"},{"location":"datasets/total_segmentator/","title":"TotalSegmentator","text":"

The TotalSegmentator dataset is a radiology image-segmentation dataset with 1228 3D images and corresponding masks for 117 different anatomical structures. It can be used for segmentation and multilabel classification tasks.

"},{"location":"datasets/total_segmentator/#raw-data","title":"Raw data","text":""},{"location":"datasets/total_segmentator/#key-stats","title":"Key stats","text":"Modality Vision (radiology, CT scans) Task Segmentation / multilabel classification (117 classes) Data size total: 23.6GB Image dimension ~300 x ~300 x ~350 (number of slices) x 1 (grey scale) * Files format .nii (\"NIFTI\") images Number of images 1228 Splits in use one labeled split

* Image resolution and number of slices per image vary.

"},{"location":"datasets/total_segmentator/#organization","title":"Organization","text":"

The data Totalsegmentator_dataset_v201.zip from zenodo is organized as follows:

Totalsegmentator_dataset_v201\n\u251c\u2500\u2500 s0011                               # one image\n\u2502   \u251c\u2500\u2500 ct.nii.gz                       # CT scan\n\u2502   \u251c\u2500\u2500 segmentations                   # directory with segmentation masks\n\u2502   \u2502   \u251c\u2500\u2500 adrenal_gland_left.nii.gz   # segmentation mask 1st anatomical structure\n\u2502   \u2502   \u251c\u2500\u2500 adrenal_gland_right.nii.gz  # segmentation mask 2nd anatomical structure\n\u2502   \u2502   \u2514\u2500\u2500 ...\n\u2514\u2500\u2500 ...\n
"},{"location":"datasets/total_segmentator/#download-and-preprocessing","title":"Download and preprocessing","text":" Splits Train Validation Test #Samples 737 (60%) 246 (20%) 245 (20%)"},{"location":"datasets/total_segmentator/#relevant-links","title":"Relevant links","text":""},{"location":"datasets/total_segmentator/#license","title":"License","text":"

Creative Commons Attribution 4.0 International

"},{"location":"reference/","title":"Reference API","text":"

Here is the Reference API, describing the classes, functions, parameters and attributes of the eva package.

To learn how to use eva, however, it's best to get started with the User Guide.

"},{"location":"reference/core/callbacks/","title":"Callbacks","text":""},{"location":"reference/core/callbacks/#writers","title":"Writers","text":""},{"location":"reference/core/callbacks/#eva.core.callbacks.writers.EmbeddingsWriter","title":"eva.core.callbacks.writers.EmbeddingsWriter","text":"

Bases: BasePredictionWriter

Callback for writing generated embeddings to disk.

This callback writes the embedding files in a separate process to avoid blocking the main process where the model forward pass is executed.

Parameters:

Name Type Description Default output_dir str

The directory where the embeddings will be saved.

required backbone Module | None

A model to be used as feature extractor. If None, it will be expected that the input batch returns the features directly.

None dataloader_idx_map Dict[int, str] | None

A dictionary mapping dataloader indices to their respective names (e.g. train, val, test).

None group_key str | None

The metadata key to group the embeddings by. If specified, the embedding files will be saved in subdirectories named after the group_key. If specified, the key must be present in the metadata of the input batch.

None overwrite bool

Whether to overwrite the output directory. Defaults to True.

True Source code in src/eva/core/callbacks/writers/embeddings.py
def __init__(\n    self,\n    output_dir: str,\n    backbone: nn.Module | None = None,\n    dataloader_idx_map: Dict[int, str] | None = None,\n    group_key: str | None = None,\n    overwrite: bool = True,\n) -> None:\n    \"\"\"Initializes a new EmbeddingsWriter instance.\n\n    This callback writes the embedding files in a separate process to avoid blocking the\n    main process where the model forward pass is executed.\n\n    Args:\n        output_dir: The directory where the embeddings will be saved.\n        backbone: A model to be used as feature extractor. If `None`,\n            it will be expected that the input batch returns the features directly.\n        dataloader_idx_map: A dictionary mapping dataloader indices to their respective\n            names (e.g. train, val, test).\n        group_key: The metadata key to group the embeddings by. If specified, the\n            embedding files will be saved in subdirectories named after the group_key.\n            If specified, the key must be present in the metadata of the input batch.\n        overwrite: Whether to overwrite the output directory. Defaults to True.\n    \"\"\"\n    super().__init__(write_interval=\"batch\")\n\n    self._output_dir = output_dir\n    self._backbone = backbone\n    self._dataloader_idx_map = dataloader_idx_map or {}\n    self._group_key = group_key\n    self._overwrite = overwrite\n\n    self._write_queue: multiprocessing.Queue\n    self._write_process: eva_multiprocessing.Process\n
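A minimal usage sketch (the output directory and dataloader mapping below are illustrative placeholders; the callback is then passed to the trainer's callbacks list):

# Hedged sketch: the values are placeholders, not a prescribed configuration.\nfrom eva.core.callbacks import writers\n\nembeddings_writer = writers.EmbeddingsWriter(\n    output_dir=\"./embeddings\",  # where the embedding files are written\n    backbone=None,              # the input batch is expected to contain features already\n    dataloader_idx_map={0: \"train\", 1: \"val\", 2: \"test\"},\n    overwrite=True,\n)\n# embeddings_writer would then be added to the trainer's callbacks.\n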
"},{"location":"reference/core/interface/","title":"Interface API","text":"

Reference information for the Interface API.

"},{"location":"reference/core/interface/#eva.Interface","title":"eva.Interface","text":"

A high-level interface for training and validating a machine learning model.

This class provides a convenient interface to connect a model, data, and trainer to train and validate a model.

"},{"location":"reference/core/interface/#eva.Interface.fit","title":"fit","text":"

Perform model training and evaluation out-of-place.

This method uses the specified trainer to fit the model using the provided data.

Example use cases: (i) using a model consisting of a frozen backbone and a head, where the backbone generates the embeddings on the fly which are then used as input features to train the head on the downstream task specified by the given dataset; (ii) fitting only the head network using a dataset that loads pre-computed embeddings.

Parameters:

Name Type Description Default trainer Trainer

The base trainer to use but not modify.

required model ModelModule

The model module to use but not modify.

required data DataModule

The data module.

required Source code in src/eva/core/interface/interface.py
def fit(\n    self,\n    trainer: eva_trainer.Trainer,\n    model: modules.ModelModule,\n    data: datamodules.DataModule,\n) -> None:\n    \"\"\"Perform model training and evaluation out-of-place.\n\n    This method uses the specified trainer to fit the model using the provided data.\n\n    Example use cases:\n\n    - Using a model consisting of a frozen backbone and a head, the backbone will generate\n      the embeddings on the fly which are then used as input features to train the head on\n      the downstream task specified by the given dataset.\n    - Fitting only the head network using a dataset that loads pre-computed embeddings.\n\n    Args:\n        trainer: The base trainer to use but not modify.\n        model: The model module to use but not modify.\n        data: The data module.\n    \"\"\"\n    trainer.run_evaluation_session(model=model, datamodule=data)\n
"},{"location":"reference/core/interface/#eva.Interface.predict","title":"predict","text":"

Perform model prediction out-of-place.

This method performs inference with a pre-trained foundation model to compute embeddings.

Parameters:

Name Type Description Default trainer Trainer

The base trainer to use but not modify.

required model ModelModule

The model module to use but not modify.

required data DataModule

The data module.

required Source code in src/eva/core/interface/interface.py
def predict(\n    self,\n    trainer: eva_trainer.Trainer,\n    model: modules.ModelModule,\n    data: datamodules.DataModule,\n) -> None:\n    \"\"\"Perform model prediction out-of-place.\n\n    This method performs inference with a pre-trained foundation model to compute embeddings.\n\n    Args:\n        trainer: The base trainer to use but not modify.\n        model: The model module to use but not modify.\n        data: The data module.\n    \"\"\"\n    eva_trainer.infer_model(\n        base_trainer=trainer,\n        base_model=model,\n        datamodule=data,\n        return_predictions=False,\n    )\n
"},{"location":"reference/core/interface/#eva.Interface.predict_fit","title":"predict_fit","text":"

Combines the predict and fit commands in one method.

This method performs the following two steps: 1. predict: perform inference with a pre-trained foundation model to compute embeddings. 2. fit: train the head network using the embeddings generated in step 1.

Parameters:

Name Type Description Default trainer Trainer

The base trainer to use but not modify.

required model ModelModule

The model module to use but not modify.

required data DataModule

The data module.

required Source code in src/eva/core/interface/interface.py
def predict_fit(\n    self,\n    trainer: eva_trainer.Trainer,\n    model: modules.ModelModule,\n    data: datamodules.DataModule,\n) -> None:\n    \"\"\"Combines the predict and fit commands in one method.\n\n    This method performs the following two steps:\n    1. predict: perform inference with a pre-trained foundation model to compute embeddings.\n    2. fit: training the head network using the embeddings generated in step 1.\n\n    Args:\n        trainer: The base trainer to use but not modify.\n        model: The model module to use but not modify.\n        data: The data module.\n    \"\"\"\n    self.predict(trainer=trainer, model=model, data=data)\n    self.fit(trainer=trainer, model=model, data=data)\n
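Putting the three methods together, a hedged sketch of a typical flow (trainer, model and data stand for fully configured Trainer, ModelModule and DataModule instances, which are normally built from a configuration file):

# Hedged sketch: trainer, model and data are placeholders for configured instances.\nimport eva\n\ninterface = eva.Interface()\n\ninterface.predict(trainer=trainer, model=model, data=data)  # 1. compute embeddings\ninterface.fit(trainer=trainer, model=model, data=data)      # 2. fit & evaluate the head\n\n# ...or run both steps with a single call:\ninterface.predict_fit(trainer=trainer, model=model, data=data)\n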
"},{"location":"reference/core/data/dataloaders/","title":"Dataloaders","text":"

Reference information for the Dataloader classes.

"},{"location":"reference/core/data/dataloaders/#eva.data.DataLoader","title":"eva.data.DataLoader dataclass","text":"

The DataLoader combines a dataset and a sampler.

It provides an iterable over the given dataset.

"},{"location":"reference/core/data/dataloaders/#eva.data.DataLoader.batch_size","title":"batch_size: int | None = 1 class-attribute instance-attribute","text":"

How many samples per batch to load.

Set to None for iterable datasets, where the dataset itself produces batches.

"},{"location":"reference/core/data/dataloaders/#eva.data.DataLoader.shuffle","title":"shuffle: bool = False class-attribute instance-attribute","text":"

Whether to shuffle the data at every epoch.

"},{"location":"reference/core/data/dataloaders/#eva.data.DataLoader.sampler","title":"sampler: samplers.Sampler | None = None class-attribute instance-attribute","text":"

Defines the strategy to draw samples from the dataset.

Can be any Iterable with __len__ implemented. If specified, shuffle must not be specified.

"},{"location":"reference/core/data/dataloaders/#eva.data.DataLoader.batch_sampler","title":"batch_sampler: samplers.Sampler | None = None class-attribute instance-attribute","text":"

Like sampler, but returns a batch of indices at a time.

Mutually exclusive with batch_size, shuffle, sampler and drop_last.

"},{"location":"reference/core/data/dataloaders/#eva.data.DataLoader.num_workers","title":"num_workers: int = multiprocessing.cpu_count() class-attribute instance-attribute","text":"

How many workers to use for loading the data.

By default, it will use the number of CPUs available.

"},{"location":"reference/core/data/dataloaders/#eva.data.DataLoader.collate_fn","title":"collate_fn: Callable | None = None class-attribute instance-attribute","text":"

The function used to collate a list of samples into a batch.

"},{"location":"reference/core/data/dataloaders/#eva.data.DataLoader.pin_memory","title":"pin_memory: bool = True class-attribute instance-attribute","text":"

Will copy Tensors into CUDA pinned memory before returning them.

"},{"location":"reference/core/data/dataloaders/#eva.data.DataLoader.drop_last","title":"drop_last: bool = False class-attribute instance-attribute","text":"

Drops the last incomplete batch.

"},{"location":"reference/core/data/dataloaders/#eva.data.DataLoader.persistent_workers","title":"persistent_workers: bool = True class-attribute instance-attribute","text":"

Will keep the worker processes after a dataset has been consumed once.

"},{"location":"reference/core/data/dataloaders/#eva.data.DataLoader.prefetch_factor","title":"prefetch_factor: int | None = 2 class-attribute instance-attribute","text":"

Number of batches loaded in advance by each worker.

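Since eva.data.DataLoader is a dataclass, a configuration can be sketched directly in Python (the values below are illustrative, not recommendations):

# Illustrative values only.\nfrom eva import data\n\ndataloader = data.DataLoader(\n    batch_size=256,   # samples per batch\n    shuffle=True,     # reshuffle at every epoch\n    num_workers=4,    # overrides the CPU-count default\n    pin_memory=True,\n    drop_last=False,\n)\n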
"},{"location":"reference/core/data/datamodules/","title":"Datamodules","text":"

Reference information for the Datamodule classes and functions.

"},{"location":"reference/core/data/datamodules/#eva.data.DataModule","title":"eva.data.DataModule","text":"

Bases: LightningDataModule

DataModule encapsulates all the steps needed to process data.

It will initialize and create the mapping between dataloaders and datasets. During the prepare_data, setup and teardown, the datamodule will call the respective methods from all datasets, given that they are defined.

Parameters:

Name Type Description Default datasets DatasetsSchema | None

The desired datasets.

None dataloaders DataloadersSchema | None

The desired dataloaders.

None Source code in src/eva/core/data/datamodules/datamodule.py
def __init__(\n    self,\n    datasets: schemas.DatasetsSchema | None = None,\n    dataloaders: schemas.DataloadersSchema | None = None,\n) -> None:\n    \"\"\"Initializes the datamodule.\n\n    Args:\n        datasets: The desired datasets.\n        dataloaders: The desired dataloaders.\n    \"\"\"\n    super().__init__()\n\n    self.datasets = datasets or self.default_datasets\n    self.dataloaders = dataloaders or self.default_dataloaders\n
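A minimal construction sketch, where train_dataset and val_dataset are placeholders for already instantiated eva dataset objects and the default dataloaders are used:

# Hedged sketch: train_dataset and val_dataset are placeholder dataset instances.\nfrom eva import data\nfrom eva.data.datamodules import schemas\n\ndatamodule = data.DataModule(\n    datasets=schemas.DatasetsSchema(\n        train=train_dataset,\n        val=val_dataset,\n    ),\n    # dataloaders=None -> the default dataloader schema is used\n)\n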
"},{"location":"reference/core/data/datamodules/#eva.data.DataModule.default_datasets","title":"default_datasets: schemas.DatasetsSchema property","text":"

Returns the default datasets.

"},{"location":"reference/core/data/datamodules/#eva.data.DataModule.default_dataloaders","title":"default_dataloaders: schemas.DataloadersSchema property","text":"

Returns the default dataloader schema.

"},{"location":"reference/core/data/datamodules/#eva.data.datamodules.call.call_method_if_exists","title":"eva.data.datamodules.call.call_method_if_exists","text":"

Calls a desired method from the datasets if it exists.

Parameters:

Name Type Description Default objects Iterable[Any]

An iterable of objects.

required method str

The dataset method name to call if exists.

required Source code in src/eva/core/data/datamodules/call.py
def call_method_if_exists(objects: Iterable[Any], /, method: str) -> None:\n    \"\"\"Calls a desired `method` from the datasets if exists.\n\n    Args:\n        objects: An iterable of objects.\n        method: The dataset method name to call if exists.\n    \"\"\"\n    for _object in _recursive_iter(objects):\n        if hasattr(_object, method):\n            fn = getattr(_object, method)\n            fn()\n
"},{"location":"reference/core/data/datamodules/#eva.data.datamodules.schemas.DatasetsSchema","title":"eva.data.datamodules.schemas.DatasetsSchema dataclass","text":"

Datasets schema used in DataModule.

"},{"location":"reference/core/data/datamodules/#eva.data.datamodules.schemas.DatasetsSchema.train","title":"train: TRAIN_DATASET = None class-attribute instance-attribute","text":"

Train dataset.

"},{"location":"reference/core/data/datamodules/#eva.data.datamodules.schemas.DatasetsSchema.val","title":"val: EVAL_DATASET = None class-attribute instance-attribute","text":"

Validation dataset.

"},{"location":"reference/core/data/datamodules/#eva.data.datamodules.schemas.DatasetsSchema.test","title":"test: EVAL_DATASET = None class-attribute instance-attribute","text":"

Test dataset.

"},{"location":"reference/core/data/datamodules/#eva.data.datamodules.schemas.DatasetsSchema.predict","title":"predict: EVAL_DATASET = None class-attribute instance-attribute","text":"

Predict dataset.

"},{"location":"reference/core/data/datamodules/#eva.data.datamodules.schemas.DatasetsSchema.tolist","title":"tolist","text":"

Returns the dataclass as a list and optionally filters it given the stage.

Source code in src/eva/core/data/datamodules/schemas.py
def tolist(self, stage: str | None = None) -> List[EVAL_DATASET]:\n    \"\"\"Returns the dataclass as a list and optionally filters it given the stage.\"\"\"\n    match stage:\n        case \"fit\":\n            return [self.train, self.val]\n        case \"validate\":\n            return [self.val]\n        case \"test\":\n            return [self.test]\n        case \"predict\":\n            return [self.predict]\n        case None:\n            return [self.train, self.val, self.test, self.predict]\n        case _:\n            raise ValueError(f\"Invalid stage `{stage}`.\")\n
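Following the match statement above, a small illustrative snippet (placeholder objects stand in for real datasets):

from eva.data.datamodules import schemas\n\n# Placeholder objects stand in for real eva dataset instances.\nschema = schemas.DatasetsSchema(train=\"train_ds\", val=\"val_ds\")\nschema.tolist(\"fit\")       # -> [\"train_ds\", \"val_ds\"]\nschema.tolist(\"validate\")  # -> [\"val_ds\"]\nschema.tolist()            # -> [\"train_ds\", \"val_ds\", None, None]\n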
"},{"location":"reference/core/data/datasets/","title":"Datasets","text":"

Reference information for the Dataset base class.

"},{"location":"reference/core/data/datasets/#eva.core.data.Dataset","title":"eva.core.data.Dataset","text":"

Bases: TorchDataset

Base dataset class.

"},{"location":"reference/core/data/datasets/#eva.core.data.Dataset.prepare_data","title":"prepare_data","text":"

Encapsulates all disk related tasks.

This method is preferred for downloading and preparing the data, for example generating manifest files. If implemented, it will be called via :class:eva.core.data.datamodules.DataModule, which ensures that it is called only within a single process, making it multi-process safe.

Source code in src/eva/core/data/datasets/base.py
def prepare_data(self) -> None:\n    \"\"\"Encapsulates all disk related tasks.\n\n    This method is preferred for downloading and preparing the data, for\n    example generate manifest files. If implemented, it will be called via\n    :class:`eva.core.data.datamodules.DataModule`, which ensures that is called\n    only within a single process, making it multi-processes safe.\n    \"\"\"\n
"},{"location":"reference/core/data/datasets/#eva.core.data.Dataset.setup","title":"setup","text":"

Sets up the dataset.

This method is preferred for creating datasets or performing train/val/test splits. If implemented, it will be called via :class:eva.core.data.datamodules.DataModule at the beginning of fit (train + validate), validate, test, or predict and it will be called from every process (i.e. GPU) across all the nodes in DDP.

Source code in src/eva/core/data/datasets/base.py
def setup(self) -> None:\n    \"\"\"Setups the dataset.\n\n    This method is preferred for creating datasets or performing\n    train/val/test splits. If implemented, it will be called via\n    :class:`eva.core.data.datamodules.DataModule` at the beginning of fit\n    (train + validate), validate, test, or predict and it will be called\n    from every process (i.e. GPU) across all the nodes in DDP.\n    \"\"\"\n    self.configure()\n    self.validate()\n
"},{"location":"reference/core/data/datasets/#eva.core.data.Dataset.configure","title":"configure","text":"

Configures the dataset.

This method is preferred for configuring the dataset: assigning values to attributes, performing splits, etc. It is called from the method ::method::setup, before calling ::method::validate.

Source code in src/eva/core/data/datasets/base.py
def configure(self):\n    \"\"\"Configures the dataset.\n\n    This method is preferred to configure the dataset; assign values\n    to attributes, perform splits etc. This would be called from the\n    method ::method::`setup`, before calling the ::method::`validate`.\n    \"\"\"\n
"},{"location":"reference/core/data/datasets/#eva.core.data.Dataset.validate","title":"validate","text":"

Validates the dataset.

This method aims to check the integrity of the dataset and verify that it is configured properly. It is called from the method ::method::setup, after calling ::method::configure.

Source code in src/eva/core/data/datasets/base.py
def validate(self):\n    \"\"\"Validates the dataset.\n\n    This method aims to check the integrity of the dataset and verify\n    that is configured properly. This would be called from the method\n    ::method::`setup`, after calling the ::method::`configure`.\n    \"\"\"\n
"},{"location":"reference/core/data/datasets/#eva.core.data.Dataset.teardown","title":"teardown","text":"

Cleans up the data artifacts.

Used to clean-up when the run is finished. If implemented, it will be called via :class:eva.core.data.datamodules.DataModule at the end of fit (train + validate), validate, test, or predict and it will be called from every process (i.e. GPU) across all the nodes in DDP.

Source code in src/eva/core/data/datasets/base.py
def teardown(self) -> None:\n    \"\"\"Cleans up the data artifacts.\n\n    Used to clean-up when the run is finished. If implemented, it will\n    be called via :class:`eva.core.data.datamodules.DataModule` at the end\n    of fit (train + validate), validate, test, or predict and it will be\n    called from every process (i.e. GPU) across all the nodes in DDP.\n    \"\"\"\n
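To illustrate the lifecycle described above, a skeletal custom dataset could look as follows (the class name and loading logic are placeholders):

# Hedged skeleton: the actual loading logic is application specific.\nfrom eva.core import data\n\n\nclass MyDataset(data.Dataset):\n    \"\"\"Toy dataset illustrating the prepare_data/configure/validate hooks.\"\"\"\n\n    def prepare_data(self) -> None:\n        # Download archives or generate manifest files here (single-process safe).\n        ...\n\n    def configure(self) -> None:\n        # Assign attributes, read manifests, perform splits, etc.\n        self._samples = []\n\n    def validate(self) -> None:\n        # Check dataset integrity; called by setup() after configure().\n        assert isinstance(self._samples, list)\n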
"},{"location":"reference/core/data/datasets/#embeddings-datasets","title":"Embeddings datasets","text":""},{"location":"reference/core/data/datasets/#eva.core.data.datasets.EmbeddingsClassificationDataset","title":"eva.core.data.datasets.EmbeddingsClassificationDataset","text":"

Bases: EmbeddingsDataset

Embeddings dataset class for classification tasks.

Expects a manifest file listing the paths of .pt files that contain tensor embeddings of shape [embedding_dim] or [1, embedding_dim].

Parameters:

Name Type Description Default root str

Root directory of the dataset.

required manifest_file str

The path to the manifest file, which is relative to the root argument.

required split Literal['train', 'val', 'test'] | None

The dataset split to use. The samples are filtered by matching this value against the split column of the manifest file.

None column_mapping Dict[str, str]

Defines the map between the variables and the manifest columns. It will overwrite the default_column_mapping with the provided values, so that column_mapping can contain only the values which are altered or missing.

default_column_mapping embeddings_transforms Callable | None

A function/transform that transforms the embedding.

None target_transforms Callable | None

A function/transform that transforms the target.

None Source code in src/eva/core/data/datasets/embeddings/classification/embeddings.py
def __init__(\n    self,\n    root: str,\n    manifest_file: str,\n    split: Literal[\"train\", \"val\", \"test\"] | None = None,\n    column_mapping: Dict[str, str] = base.default_column_mapping,\n    embeddings_transforms: Callable | None = None,\n    target_transforms: Callable | None = None,\n) -> None:\n    \"\"\"Initialize dataset.\n\n    Expects a manifest file listing the paths of .pt files that contain\n    tensor embeddings of shape [embedding_dim] or [1, embedding_dim].\n\n    Args:\n        root: Root directory of the dataset.\n        manifest_file: The path to the manifest file, which is relative to\n            the `root` argument.\n        split: The dataset split to use. The `split` column of the manifest\n            file will be splitted based on this value.\n        column_mapping: Defines the map between the variables and the manifest\n            columns. It will overwrite the `default_column_mapping` with\n            the provided values, so that `column_mapping` can contain only the\n            values which are altered or missing.\n        embeddings_transforms: A function/transform that transforms the embedding.\n        target_transforms: A function/transform that transforms the target.\n    \"\"\"\n    super().__init__(\n        root=root,\n        manifest_file=manifest_file,\n        split=split,\n        column_mapping=column_mapping,\n        embeddings_transforms=embeddings_transforms,\n        target_transforms=target_transforms,\n    )\n
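A construction sketch with placeholder paths (the manifest is expected to list the .pt embedding files, their targets and a split column):

# Hedged sketch: the paths below are placeholders.\nfrom eva.core.data import datasets\n\ntrain_dataset = datasets.EmbeddingsClassificationDataset(\n    root=\"./embeddings/patch_camelyon\",  # root directory of the embeddings\n    manifest_file=\"manifest.csv\",        # relative to root\n    split=\"train\",                       # selects rows via the manifest's split column\n)\n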
"},{"location":"reference/core/data/datasets/#eva.core.data.datasets.MultiEmbeddingsClassificationDataset","title":"eva.core.data.datasets.MultiEmbeddingsClassificationDataset","text":"

Bases: EmbeddingsDataset

Dataset class where a sample corresponds to multiple embeddings.

Example use case: Slide level dataset where each slide has multiple patch embeddings.

Expects a manifest file listing the paths of .pt files containing tensor embeddings.

The manifest must have a column_mapping[\"multi_id\"] column that contains the unique identifier of a group of embeddings. For oncology datasets, this would usually be the slide id. Each row in the manifest file points to a .pt file that can contain one or multiple embeddings. There can also be multiple rows for the same multi_id, in which case the embeddings from the different .pt files corresponding to that same multi_id will be stacked along the first dimension.

Parameters:

Name Type Description Default root str

Root directory of the dataset.

required manifest_file str

The path to the manifest file, which is relative to the root argument.

required split Literal['train', 'val', 'test']

The dataset split to use. The samples are filtered by matching this value against the split column of the manifest file.

required column_mapping Dict[str, str]

Defines the map between the variables and the manifest columns. It will overwrite the default_column_mapping with the provided values, so that column_mapping can contain only the values which are altered or missing.

default_column_mapping embeddings_transforms Callable | None

A function/transform that transforms the embedding.

None target_transforms Callable | None

A function/transform that transforms the target.

None Source code in src/eva/core/data/datasets/embeddings/classification/multi_embeddings.py
def __init__(\n    self,\n    root: str,\n    manifest_file: str,\n    split: Literal[\"train\", \"val\", \"test\"],\n    column_mapping: Dict[str, str] = base.default_column_mapping,\n    embeddings_transforms: Callable | None = None,\n    target_transforms: Callable | None = None,\n):\n    \"\"\"Initialize dataset.\n\n    Expects a manifest file listing the paths of `.pt` files containing tensor embeddings.\n\n    The manifest must have a `column_mapping[\"multi_id\"]` column that contains the\n    unique identifier group of embeddings. For oncology datasets, this would be usually\n    the slide id. Each row in the manifest file points to a .pt file that can contain\n    one or multiple embeddings. There can also be multiple rows for the same `multi_id`,\n    in which case the embeddings from the different .pt files corresponding to that same\n    `multi_id` will be stacked along the first dimension.\n\n    Args:\n        root: Root directory of the dataset.\n        manifest_file: The path to the manifest file, which is relative to\n            the `root` argument.\n        split: The dataset split to use. The `split` column of the manifest\n            file will be splitted based on this value.\n        column_mapping: Defines the map between the variables and the manifest\n            columns. It will overwrite the `default_column_mapping` with\n            the provided values, so that `column_mapping` can contain only the\n            values which are altered or missing.\n        embeddings_transforms: A function/transform that transforms the embedding.\n        target_transforms: A function/transform that transforms the target.\n    \"\"\"\n    super().__init__(\n        manifest_file=manifest_file,\n        root=root,\n        split=split,\n        column_mapping=column_mapping,\n        embeddings_transforms=embeddings_transforms,\n        target_transforms=target_transforms,\n    )\n\n    self._multi_ids: List[int]\n
"},{"location":"reference/core/data/transforms/","title":"Transforms","text":""},{"location":"reference/core/data/transforms/#eva.data.transforms.ArrayToTensor","title":"eva.data.transforms.ArrayToTensor","text":"

Converts a numpy array to a torch tensor.

"},{"location":"reference/core/data/transforms/#eva.data.transforms.ArrayToFloatTensor","title":"eva.data.transforms.ArrayToFloatTensor","text":"

Bases: ArrayToTensor

Converts a numpy array to a torch tensor and casts it to float.

"},{"location":"reference/core/data/transforms/#eva.data.transforms.Pad2DTensor","title":"eva.data.transforms.Pad2DTensor","text":"

Pads a 2D tensor to a fixed size across the first dimension.

Parameters:

Name Type Description Default pad_size int

The size to pad the tensor to. If the tensor is larger than this size, no padding will be applied.

required pad_value int | float

The value to use for padding.

float('-inf') Source code in src/eva/core/data/transforms/padding/pad_2d_tensor.py
def __init__(self, pad_size: int, pad_value: int | float = float(\"-inf\")):\n    \"\"\"Initialize the transformation.\n\n    Args:\n        pad_size: The size to pad the tensor to. If the tensor is larger than this size,\n            no padding will be applied.\n        pad_value: The value to use for padding.\n    \"\"\"\n    self._pad_size = pad_size\n    self._pad_value = pad_value\n
"},{"location":"reference/core/data/transforms/#eva.data.transforms.SampleFromAxis","title":"eva.data.transforms.SampleFromAxis","text":"

Samples n_samples entries from a tensor along a given axis.

Parameters:

Name Type Description Default n_samples int

The number of samples to draw.

required seed int

The seed to use for sampling.

42 axis int

The axis along which to sample.

0 Source code in src/eva/core/data/transforms/sampling/sample_from_axis.py
def __init__(self, n_samples: int, seed: int = 42, axis: int = 0):\n    \"\"\"Initialize the transformation.\n\n    Args:\n        n_samples: The number of samples to draw.\n        seed: The seed to use for sampling.\n        axis: The axis along which to sample.\n    \"\"\"\n    self._seed = seed\n    self._n_samples = n_samples\n    self._axis = axis\n    self._generator = self._get_generator()\n
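A small sketch chaining the transforms above, assuming each transform instance is applied by calling it on a single sample (the array shapes are illustrative):

# Hedged sketch: transform instances are assumed to be callables on a single sample.\nimport numpy as np\nfrom eva.data import transforms\n\nto_tensor = transforms.ArrayToFloatTensor()\nsample = transforms.SampleFromAxis(n_samples=100, seed=42, axis=0)\npad = transforms.Pad2DTensor(pad_size=200, pad_value=0.0)\n\nembeddings = to_tensor(np.random.rand(153, 384))  # e.g. 153 patch embeddings of dim 384\nembeddings = sample(embeddings)                   # -> 100 x 384\nembeddings = pad(embeddings)                      # -> 200 x 384, padded rows filled with 0.0\n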
"},{"location":"reference/core/loggers/loggers/","title":"Loggers","text":""},{"location":"reference/core/loggers/loggers/#eva.core.loggers.DummyLogger","title":"eva.core.loggers.DummyLogger","text":"

Bases: DummyLogger

Dummy logger class.

This logger is currently used as a placeholder when saving results to remote storage, as common Lightning loggers do not work with Azure Blob Storage:

https://github.com/Lightning-AI/pytorch-lightning/issues/18861 https://github.com/Lightning-AI/pytorch-lightning/issues/19736

Simply disabling the loggers when pointing to remote storage doesn't work because callbacks such as LearningRateMonitor or ModelCheckpoint require a logger to be present.

Parameters:

Name Type Description Default save_dir str

The save directory (this logger does not save anything, but callbacks might use this path to save their outputs).

required Source code in src/eva/core/loggers/dummy.py
def __init__(self, save_dir: str) -> None:\n    \"\"\"Initializes the logger.\n\n    Args:\n        save_dir: The save directory (this logger does not save anything,\n            but callbacks might use this path to save their outputs).\n    \"\"\"\n    super().__init__()\n    self._save_dir = save_dir\n
"},{"location":"reference/core/loggers/loggers/#eva.core.loggers.DummyLogger.save_dir","title":"save_dir: str property","text":"

Returns the save directory.

"},{"location":"reference/core/metrics/","title":"Metrics","text":"

Reference information for the Metrics classes.

"},{"location":"reference/core/metrics/average_loss/","title":"Average Loss","text":""},{"location":"reference/core/metrics/average_loss/#eva.metrics.AverageLoss","title":"eva.metrics.AverageLoss","text":"

Bases: Metric

Average loss metric tracker.

Source code in src/eva/core/metrics/average_loss.py
def __init__(self) -> None:\n    \"\"\"Initializes the metric.\"\"\"\n    super().__init__()\n\n    self.add_state(\"value\", default=torch.tensor(0), dist_reduce_fx=\"sum\")\n    self.add_state(\"total\", default=torch.tensor(0), dist_reduce_fx=\"sum\")\n
"},{"location":"reference/core/metrics/binary_balanced_accuracy/","title":"Binary Balanced Accuracy","text":""},{"location":"reference/core/metrics/binary_balanced_accuracy/#eva.metrics.BinaryBalancedAccuracy","title":"eva.metrics.BinaryBalancedAccuracy","text":"

Bases: BinaryStatScores

Computes the balanced accuracy for binary classification.

"},{"location":"reference/core/metrics/binary_balanced_accuracy/#eva.metrics.BinaryBalancedAccuracy.compute","title":"compute","text":"

Compute accuracy based on inputs passed in to update previously.

Source code in src/eva/core/metrics/binary_balanced_accuracy.py
def compute(self) -> Tensor:\n    \"\"\"Compute accuracy based on inputs passed in to ``update`` previously.\"\"\"\n    tp, fp, tn, fn = self._final_state()\n    sensitivity = _safe_divide(tp, tp + fn)\n    specificity = _safe_divide(tn, tn + fp)\n    return 0.5 * (sensitivity + specificity)\n
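As a tiny worked example: with predictions [1, 0, 1, 0] and targets [1, 0, 0, 0] we get tp=1, fp=1, tn=2, fn=0, hence sensitivity = 1.0, specificity = 2/3 and a balanced accuracy of 0.5 * (1.0 + 2/3), i.e. about 0.833. The illustrative snippet below assumes the usual torchmetrics update/compute interface inherited from BinaryStatScores:

# Illustration only; values chosen to match the worked example above.\nimport torch\nfrom eva import metrics\n\nmetric = metrics.BinaryBalancedAccuracy()\nmetric.update(torch.tensor([1, 0, 1, 0]), torch.tensor([1, 0, 0, 0]))\nprint(metric.compute())  # tensor(0.8333) = 0.5 * (sensitivity 1.0 + specificity 0.6667)\n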
"},{"location":"reference/core/metrics/core/","title":"Core","text":""},{"location":"reference/core/metrics/core/#eva.metrics.MetricModule","title":"eva.metrics.MetricModule","text":"

Bases: Module

The metrics module.

Allows storing and keeping track of train, val and test metrics.

Parameters:

Name Type Description Default train MetricCollection | None

The training metric collection.

required val MetricCollection | None

The validation metric collection.

required test MetricCollection | None

The test metric collection.

required Source code in src/eva/core/metrics/structs/module.py
def __init__(\n    self,\n    train: collection.MetricCollection | None,\n    val: collection.MetricCollection | None,\n    test: collection.MetricCollection | None,\n) -> None:\n    \"\"\"Initializes the metrics for the Trainer.\n\n    Args:\n        train: The training metric collection.\n        val: The validation metric collection.\n        test: The test metric collection.\n    \"\"\"\n    super().__init__()\n\n    self._train = train or self.default_metric_collection\n    self._val = val or self.default_metric_collection\n    self._test = test or self.default_metric_collection\n
"},{"location":"reference/core/metrics/core/#eva.metrics.MetricModule.default_metric_collection","title":"default_metric_collection: collection.MetricCollection property","text":"

Returns the default metric collection.

"},{"location":"reference/core/metrics/core/#eva.metrics.MetricModule.training_metrics","title":"training_metrics: collection.MetricCollection property","text":"

Returns the metrics of the train dataset.

"},{"location":"reference/core/metrics/core/#eva.metrics.MetricModule.validation_metrics","title":"validation_metrics: collection.MetricCollection property","text":"

Returns the metrics of the validation dataset.

"},{"location":"reference/core/metrics/core/#eva.metrics.MetricModule.test_metrics","title":"test_metrics: collection.MetricCollection property","text":"

Returns the metrics of the test dataset.

"},{"location":"reference/core/metrics/core/#eva.metrics.MetricModule.from_metrics","title":"from_metrics classmethod","text":"

Initializes a metric module from a list of metrics.

Parameters:

Name Type Description Default train MetricModuleType | None

Metrics for the training stage.

required val MetricModuleType | None

Metrics for the validation stage.

required test MetricModuleType | None

Metrics for the test stage.

required separator str

The separator between the group name of the metric and the metric itself.

'/' Source code in src/eva/core/metrics/structs/module.py
@classmethod\ndef from_metrics(\n    cls,\n    train: MetricModuleType | None,\n    val: MetricModuleType | None,\n    test: MetricModuleType | None,\n    *,\n    separator: str = \"/\",\n) -> MetricModule:\n    \"\"\"Initializes a metric module from a list of metrics.\n\n    Args:\n        train: Metrics for the training stage.\n        val: Metrics for the validation stage.\n        test: Metrics for the test stage.\n        separator: The separator between the group name of the metric\n            and the metric itself.\n    \"\"\"\n    return cls(\n        train=_create_collection_from_metrics(train, prefix=\"train\" + separator),\n        val=_create_collection_from_metrics(val, prefix=\"val\" + separator),\n        test=_create_collection_from_metrics(test, prefix=\"test\" + separator),\n    )\n
"},{"location":"reference/core/metrics/core/#eva.metrics.MetricModule.from_schema","title":"from_schema classmethod","text":"

Initializes a metric module from the metrics schema.

Parameters:

Name Type Description Default schema MetricsSchema

The dataclass metric schema.

required separator str

The separator between the group name of the metric and the metric itself.

'/' Source code in src/eva/core/metrics/structs/module.py
@classmethod\ndef from_schema(\n    cls,\n    schema: schemas.MetricsSchema,\n    *,\n    separator: str = \"/\",\n) -> MetricModule:\n    \"\"\"Initializes a metric module from the metrics schema.\n\n    Args:\n        schema: The dataclass metric schema.\n        separator: The separator between the group name of the metric\n            and the metric itself.\n    \"\"\"\n    return cls.from_metrics(\n        train=schema.training_metrics,\n        val=schema.evaluation_metrics,\n        test=schema.evaluation_metrics,\n        separator=separator,\n    )\n
"},{"location":"reference/core/metrics/core/#eva.metrics.MetricsSchema","title":"eva.metrics.MetricsSchema dataclass","text":"

Metrics schema.

"},{"location":"reference/core/metrics/core/#eva.metrics.MetricsSchema.common","title":"common: MetricModuleType | None = None class-attribute instance-attribute","text":"

Holds the common train and evaluation metrics.

"},{"location":"reference/core/metrics/core/#eva.metrics.MetricsSchema.train","title":"train: MetricModuleType | None = None class-attribute instance-attribute","text":"

The exclusive training metrics.

"},{"location":"reference/core/metrics/core/#eva.metrics.MetricsSchema.evaluation","title":"evaluation: MetricModuleType | None = None class-attribute instance-attribute","text":"

The exclusive evaluation metrics.

"},{"location":"reference/core/metrics/core/#eva.metrics.MetricsSchema.training_metrics","title":"training_metrics: MetricModuleType | None property","text":"

Returns the training metrics.

"},{"location":"reference/core/metrics/core/#eva.metrics.MetricsSchema.evaluation_metrics","title":"evaluation_metrics: MetricModuleType | None property","text":"

Returns the evaluation metrics.

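For illustration, a schema with a shared metric collection for all stages could be wired up as follows (the metric choice and number of classes are placeholders):

# Hedged sketch: the metric choice and num_classes are illustrative.\nfrom eva import metrics\n\nschema = metrics.MetricsSchema(\n    common=metrics.MulticlassClassificationMetrics(num_classes=9),\n)\nmetric_module = metrics.MetricModule.from_schema(schema, separator=\"/\")\n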
"},{"location":"reference/core/metrics/defaults/","title":"Defaults","text":""},{"location":"reference/core/metrics/defaults/#eva.metrics.BinaryClassificationMetrics","title":"eva.metrics.BinaryClassificationMetrics","text":"

Bases: MetricCollection

Default metrics for binary classification tasks.

The metrics instantiated here are: BinaryAUROC, BinaryAccuracy, BinaryBalancedAccuracy, BinaryF1Score, BinaryPrecision and BinaryRecall.

Parameters:

Name Type Description Default threshold float

Threshold for transforming probability to binary (0,1) predictions

0.5 ignore_index int | None

Specifies a target value that is ignored and does not contribute to the metric calculation.

None prefix str | None

A string to append in front of the keys of the output dict.

None postfix str | None

A string to append after the keys of the output dict.

None Source code in src/eva/core/metrics/defaults/classification/binary.py
def __init__(\n    self,\n    threshold: float = 0.5,\n    ignore_index: int | None = None,\n    prefix: str | None = None,\n    postfix: str | None = None,\n) -> None:\n    \"\"\"Initializes the binary classification metrics.\n\n    The metrics instantiated here are:\n\n    - BinaryAUROC\n    - BinaryAccuracy\n    - BinaryBalancedAccuracy\n    - BinaryF1Score\n    - BinaryPrecision\n    - BinaryRecall\n\n    Args:\n        threshold: Threshold for transforming probability to binary (0,1) predictions\n        ignore_index: Specifies a target value that is ignored and does not\n            contribute to the metric calculation.\n        prefix: A string to append in front of the keys of the output dict.\n        postfix: A string to append after the keys of the output dict.\n    \"\"\"\n    super().__init__(\n        metrics=[\n            classification.BinaryAUROC(\n                ignore_index=ignore_index,\n            ),\n            classification.BinaryAccuracy(\n                threshold=threshold,\n                ignore_index=ignore_index,\n            ),\n            binary_balanced_accuracy.BinaryBalancedAccuracy(\n                threshold=threshold,\n                ignore_index=ignore_index,\n            ),\n            classification.BinaryF1Score(\n                threshold=threshold,\n                ignore_index=ignore_index,\n            ),\n            classification.BinaryPrecision(\n                threshold=threshold,\n                ignore_index=ignore_index,\n            ),\n            classification.BinaryRecall(\n                threshold=threshold,\n                ignore_index=ignore_index,\n            ),\n        ],\n        prefix=prefix,\n        postfix=postfix,\n        compute_groups=[\n            [\n                \"BinaryAccuracy\",\n                \"BinaryBalancedAccuracy\",\n                \"BinaryF1Score\",\n                \"BinaryPrecision\",\n                \"BinaryRecall\",\n            ],\n            [\n                \"BinaryAUROC\",\n            ],\n        ],\n    )\n
"},{"location":"reference/core/metrics/defaults/#eva.metrics.MulticlassClassificationMetrics","title":"eva.metrics.MulticlassClassificationMetrics","text":"

Bases: MetricCollection

Default metrics for multi-class classification tasks.

The metrics instantiated here are: MulticlassAccuracy, MulticlassPrecision, MulticlassRecall, MulticlassF1Score and MulticlassAUROC.

Parameters:

Name Type Description Default num_classes int

Integer specifying the number of classes.

required average Literal['macro', 'weighted', 'none']

Defines the reduction that is applied over labels.

'macro' ignore_index int | None

Specifies a target value that is ignored and does not contribute to the metric calculation.

None prefix str | None

A string to append in front of the keys of the output dict.

None postfix str | None

A string to append after the keys of the output dict.

None Source code in src/eva/core/metrics/defaults/classification/multiclass.py
def __init__(\n    self,\n    num_classes: int,\n    average: Literal[\"macro\", \"weighted\", \"none\"] = \"macro\",\n    ignore_index: int | None = None,\n    prefix: str | None = None,\n    postfix: str | None = None,\n) -> None:\n    \"\"\"Initializes the multi-class classification metrics.\n\n    The metrics instantiated here are:\n\n    - MulticlassAccuracy\n    - MulticlassPrecision\n    - MulticlassRecall\n    - MulticlassF1Score\n    - MulticlassAUROC\n\n    Args:\n        num_classes: Integer specifying the number of classes.\n        average: Defines the reduction that is applied over labels.\n        ignore_index: Specifies a target value that is ignored and does not\n            contribute to the metric calculation.\n        prefix: A string to append in front of the keys of the output dict.\n        postfix: A string to append after the keys of the output dict.\n    \"\"\"\n    super().__init__(\n        metrics=[\n            classification.MulticlassAUROC(\n                num_classes=num_classes,\n                average=average,\n                ignore_index=ignore_index,\n            ),\n            classification.MulticlassAccuracy(\n                num_classes=num_classes,\n                average=average,\n                ignore_index=ignore_index,\n            ),\n            classification.MulticlassF1Score(\n                num_classes=num_classes,\n                average=average,\n                ignore_index=ignore_index,\n            ),\n            classification.MulticlassPrecision(\n                num_classes=num_classes,\n                average=average,\n                ignore_index=ignore_index,\n            ),\n            classification.MulticlassRecall(\n                num_classes=num_classes,\n                average=average,\n                ignore_index=ignore_index,\n            ),\n        ],\n        prefix=prefix,\n        postfix=postfix,\n        compute_groups=[\n            [\n                \"MulticlassAccuracy\",\n                \"MulticlassF1Score\",\n                \"MulticlassPrecision\",\n                \"MulticlassRecall\",\n            ],\n            [\n                \"MulticlassAUROC\",\n            ],\n        ],\n    )\n
"},{"location":"reference/core/models/modules/","title":"Modules","text":"

Reference information for the model Modules API.

"},{"location":"reference/core/models/modules/#eva.models.modules.ModelModule","title":"eva.models.modules.ModelModule","text":"

Bases: LightningModule

The base model module.

Parameters:

Name Type Description Default metrics MetricsSchema | None

The metric groups to track.

None postprocess BatchPostProcess | None

A list of helper functions applied to the model predictions and targets after the loss computation and before the metrics calculation.

None Source code in src/eva/core/models/modules/module.py
def __init__(\n    self,\n    metrics: metrics_lib.MetricsSchema | None = None,\n    postprocess: batch_postprocess.BatchPostProcess | None = None,\n) -> None:\n    \"\"\"Initializes the basic module.\n\n    Args:\n        metrics: The metric groups to track.\n        postprocess: A list of helper functions to apply after the\n            loss and before the metrics calculation to the model\n            predictions and targets.\n    \"\"\"\n    super().__init__()\n\n    self._metrics = metrics or self.default_metrics\n    self._postprocess = postprocess or self.default_postprocess\n\n    self.metrics = metrics_lib.MetricModule.from_schema(self._metrics)\n
"},{"location":"reference/core/models/modules/#eva.models.modules.ModelModule.default_metrics","title":"default_metrics: metrics_lib.MetricsSchema property","text":"

The default metrics.

"},{"location":"reference/core/models/modules/#eva.models.modules.ModelModule.default_postprocess","title":"default_postprocess: batch_postprocess.BatchPostProcess property","text":"

The default post-processes.

"},{"location":"reference/core/models/modules/#eva.models.modules.ModelModule.metrics_device","title":"metrics_device: torch.device property","text":"

Returns the device on which the metrics should be calculated.

We allocate the metrics to the CPU when operating on a single device, as it is much faster, but to the GPU when employing multiple ones, as the DDP strategy requires the metrics to be allocated on the module's GPU.

"},{"location":"reference/core/models/modules/#eva.models.modules.HeadModule","title":"eva.models.modules.HeadModule","text":"

Bases: ModelModule

Neural Net Head Module for training on features.

It can be used for downstream tasks trained with supervised (mini-batch) stochastic gradient descent, such as classification, regression and segmentation.

Parameters:

Name Type Description Default head MODEL_TYPE

The neural network that would be trained on the features.

required criterion Callable[..., Tensor]

The loss function to use.

required backbone MODEL_TYPE | None

The feature extractor. If None, the input batch is expected to contain the features directly.

None optimizer OptimizerCallable

The optimizer to use.

Adam lr_scheduler LRSchedulerCallable

The learning rate scheduler to use.

ConstantLR metrics MetricsSchema | None

The metric groups to track.

None postprocess BatchPostProcess | None

A list of helper functions applied to the model predictions and targets after the loss computation and before the metrics calculation.

None Source code in src/eva/core/models/modules/head.py
def __init__(\n    self,\n    head: MODEL_TYPE,\n    criterion: Callable[..., torch.Tensor],\n    backbone: MODEL_TYPE | None = None,\n    optimizer: OptimizerCallable = optim.Adam,\n    lr_scheduler: LRSchedulerCallable = lr_scheduler.ConstantLR,\n    metrics: metrics_lib.MetricsSchema | None = None,\n    postprocess: batch_postprocess.BatchPostProcess | None = None,\n) -> None:\n    \"\"\"Initializes the neural net head module.\n\n    Args:\n        head: The neural network that would be trained on the features.\n        criterion: The loss function to use.\n        backbone: The feature extractor. If `None`, it will be expected\n            that the input batch returns the features directly.\n        optimizer: The optimizer to use.\n        lr_scheduler: The learning rate scheduler to use.\n        metrics: The metric groups to track.\n        postprocess: A list of helper functions to apply after the\n            loss and before the metrics calculation to the model\n            predictions and targets.\n    \"\"\"\n    super().__init__(metrics=metrics, postprocess=postprocess)\n\n    self.head = head\n    self.criterion = criterion\n    self.backbone = backbone\n    self.optimizer = optimizer\n    self.lr_scheduler = lr_scheduler\n
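A minimal sketch of a linear-probing setup, assuming the import paths shown in the headings of this page; the hyper-parameter values are illustrative only.

import torch.nn as nn\nfrom eva.models.modules import HeadModule\nfrom eva.models.networks import MLP\n\n# Train a small classification head directly on pre-computed embeddings;\n# backbone=None means the input batches are expected to contain the features.\nhead_module = HeadModule(\n    head=MLP(input_size=384, output_size=4, hidden_layer_sizes=(128,)),\n    criterion=nn.CrossEntropyLoss(),\n    backbone=None,\n)\n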
"},{"location":"reference/core/models/modules/#eva.models.modules.InferenceModule","title":"eva.models.modules.InferenceModule","text":"

Bases: ModelModule

A lightweight model module to perform inference.

Parameters:

Name Type Description Default backbone MODEL_TYPE

The network to be used for inference.

required Source code in src/eva/core/models/modules/inference.py
def __init__(self, backbone: MODEL_TYPE) -> None:\n    \"\"\"Initializes the module.\n\n    Args:\n        backbone: The network to be used for inference.\n    \"\"\"\n    super().__init__(metrics=None)\n\n    self.backbone = backbone\n
"},{"location":"reference/core/models/networks/","title":"Networks","text":"

Reference information for the model Networks API.

"},{"location":"reference/core/models/networks/#eva.models.networks.MLP","title":"eva.models.networks.MLP","text":"

Bases: Module

A Multi-layer Perceptron (MLP) network.

Parameters:

Name Type Description Default input_size int

The number of input features.

required output_size int

The number of output features.

required hidden_layer_sizes tuple[int, ...] | None

A list specifying the number of units in each hidden layer.

None dropout float

Dropout probability for hidden layers.

0.0 hidden_activation_fn Type[Module] | None

Activation function to use for hidden layers. Default is ReLU.

ReLU output_activation_fn Type[Module] | None

Activation function to use for the output layer. Default is None.

None Source code in src/eva/core/models/networks/mlp.py
def __init__(\n    self,\n    input_size: int,\n    output_size: int,\n    hidden_layer_sizes: tuple[int, ...] | None = None,\n    hidden_activation_fn: Type[torch.nn.Module] | None = nn.ReLU,\n    output_activation_fn: Type[torch.nn.Module] | None = None,\n    dropout: float = 0.0,\n) -> None:\n    \"\"\"Initializes the MLP.\n\n    Args:\n        input_size: The number of input features.\n        output_size: The number of output features.\n        hidden_layer_sizes: A list specifying the number of units in each hidden layer.\n        dropout: Dropout probability for hidden layers.\n        hidden_activation_fn: Activation function to use for hidden layers. Default is ReLU.\n        output_activation_fn: Activation function to use for the output layer. Default is None.\n    \"\"\"\n    super().__init__()\n\n    self.input_size = input_size\n    self.output_size = output_size\n    self.hidden_layer_sizes = hidden_layer_sizes if hidden_layer_sizes is not None else ()\n    self.hidden_activation_fn = hidden_activation_fn\n    self.output_activation_fn = output_activation_fn\n    self.dropout = dropout\n\n    self._network = self._build_network()\n
"},{"location":"reference/core/models/networks/#eva.models.networks.MLP.forward","title":"forward","text":"

Defines the forward pass of the MLP.

Parameters:

Name Type Description Default x Tensor

The input tensor.

required

Returns:

Type Description Tensor

The output of the network.

Source code in src/eva/core/models/networks/mlp.py
def forward(self, x: torch.Tensor) -> torch.Tensor:\n    \"\"\"Defines the forward pass of the MLP.\n\n    Args:\n        x: The input tensor.\n\n    Returns:\n        The output of the network.\n    \"\"\"\n    return self._network(x)\n
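A short usage sketch of the MLP, assuming the eva.models.networks import path from the heading; the sizes are arbitrary.

import torch\nfrom eva.models.networks import MLP\n\nmlp = MLP(input_size=384, output_size=4, hidden_layer_sizes=(128, 64), dropout=0.1)\nfeatures = torch.randn(8, 384)  # a batch of 8 embeddings\nlogits = mlp(features)  # forward pass, output shape (8, 4)\n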
"},{"location":"reference/core/models/networks/#wrappers","title":"Wrappers","text":""},{"location":"reference/core/models/networks/#eva.models.networks.wrappers.BaseModel","title":"eva.models.networks.wrappers.BaseModel","text":"

Bases: Module

Base class for model wrappers.

Parameters:

Name Type Description Default tensor_transforms Callable | None

The transforms to apply to the output tensor produced by the model.

None Source code in src/eva/core/models/networks/wrappers/base.py
def __init__(self, tensor_transforms: Callable | None = None) -> None:\n    \"\"\"Initializes the model.\n\n    Args:\n        tensor_transforms: The transforms to apply to the output\n            tensor produced by the model.\n    \"\"\"\n    super().__init__()\n\n    self._output_transforms = tensor_transforms\n
"},{"location":"reference/core/models/networks/#eva.models.networks.wrappers.BaseModel.load_model","title":"load_model abstractmethod","text":"

Loads the model.

Source code in src/eva/core/models/networks/wrappers/base.py
@abc.abstractmethod\ndef load_model(self) -> Callable[..., torch.Tensor]:\n    \"\"\"Loads the model.\"\"\"\n    raise NotImplementedError\n
"},{"location":"reference/core/models/networks/#eva.models.networks.wrappers.BaseModel.model_forward","title":"model_forward abstractmethod","text":"

Implements the forward pass of the model.

Parameters:

Name Type Description Default tensor Tensor

The input tensor to the model.

required Source code in src/eva/core/models/networks/wrappers/base.py
@abc.abstractmethod\ndef model_forward(self, tensor: torch.Tensor) -> torch.Tensor:\n    \"\"\"Implements the forward pass of the model.\n\n    Args:\n        tensor: The input tensor to the model.\n    \"\"\"\n    raise NotImplementedError\n
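A hypothetical custom wrapper, sketched under the assumption that subclasses only need to provide load_model and model_forward as shown above; MyWrapper and its logic are illustrative and not part of the library.

import torch\nfrom eva.models.networks.wrappers import BaseModel\n\nclass MyWrapper(BaseModel):  # hypothetical example wrapper\n    \"\"\"Wraps an in-memory torch module behind the BaseModel interface.\"\"\"\n\n    def __init__(self, module: torch.nn.Module) -> None:\n        super().__init__()\n        self._module = module\n        self._model = self.load_model()\n\n    def load_model(self) -> torch.nn.Module:\n        # Nothing to fetch here; real wrappers would e.g. read a checkpoint.\n        return self._module\n\n    def model_forward(self, tensor: torch.Tensor) -> torch.Tensor:\n        return self._model(tensor)\n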
"},{"location":"reference/core/models/networks/#eva.models.networks.wrappers.ModelFromFunction","title":"eva.models.networks.wrappers.ModelFromFunction","text":"

Bases: BaseModel

Wrapper class for models which are initialized from functions.

This is helpful for initializing models in a .yaml configuration file.

Parameters:

Name Type Description Default path Callable[..., Module]

The path to the callable object (class or function).

required arguments Dict[str, Any] | None

The extra callable function / class arguments.

None checkpoint_path str | None

The path to the checkpoint to load the model weights from. This is currently only supported for torch model checkpoints. For other formats, the checkpoint loading should be handled within the provided callable object in path. None tensor_transforms Callable | None

The transforms to apply to the output tensor produced by the model.

None Source code in src/eva/core/models/networks/wrappers/from_function.py
def __init__(\n    self,\n    path: Callable[..., nn.Module],\n    arguments: Dict[str, Any] | None = None,\n    checkpoint_path: str | None = None,\n    tensor_transforms: Callable | None = None,\n) -> None:\n    \"\"\"Initializes and constructs the model.\n\n    Args:\n        path: The path to the callable object (class or function).\n        arguments: The extra callable function / class arguments.\n        checkpoint_path: The path to the checkpoint to load the model\n            weights from. This is currently only supported for torch\n            model checkpoints. For other formats, the checkpoint loading\n            should be handled within the provided callable object in <path>.\n        tensor_transforms: The transforms to apply to the output tensor\n            produced by the model.\n    \"\"\"\n    super().__init__()\n\n    self._path = path\n    self._arguments = arguments\n    self._checkpoint_path = checkpoint_path\n    self._tensor_transforms = tensor_transforms\n\n    self._model = self.load_model()\n
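A usage sketch instantiating the wrapper from Python (the same arguments map one-to-one to a .yaml configuration); torchvision.models.resnet18 is just an illustrative callable.

import torchvision\nfrom eva.models.networks.wrappers import ModelFromFunction\n\n# Builds the model by calling torchvision.models.resnet18(weights=None);\n# a checkpoint_path could additionally be passed to load torch weights.\nbackbone = ModelFromFunction(\n    path=torchvision.models.resnet18,\n    arguments={\"weights\": None},\n)\n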
"},{"location":"reference/core/models/networks/#eva.models.networks.wrappers.HuggingFaceModel","title":"eva.models.networks.wrappers.HuggingFaceModel","text":"

Bases: BaseModel

Wrapper class for loading HuggingFace transformers models.

Parameters:

Name Type Description Default model_name_or_path str

The model name or path to load the model from. This can be a local path or a model name from the HuggingFace model hub.

required tensor_transforms Callable | None

The transforms to apply to the output tensor produced by the model.

None Source code in src/eva/core/models/networks/wrappers/huggingface.py
def __init__(self, model_name_or_path: str, tensor_transforms: Callable | None = None) -> None:\n    \"\"\"Initializes the model.\n\n    Args:\n        model_name_or_path: The model name or path to load the model from.\n            This can be a local path or a model name from the `HuggingFace`\n            model hub.\n        tensor_transforms: The transforms to apply to the output tensor\n            produced by the model.\n    \"\"\"\n    super().__init__(tensor_transforms=tensor_transforms)\n\n    self._model_name_or_path = model_name_or_path\n    self._model = self.load_model()\n
"},{"location":"reference/core/models/networks/#eva.models.networks.wrappers.ONNXModel","title":"eva.models.networks.wrappers.ONNXModel","text":"

Bases: BaseModel

Wrapper class for loading ONNX models.

Parameters:

Name Type Description Default path str

The path to the .onnx model file.

required device Literal['cpu', 'cuda'] | None

The device to run the model on. This can be either \"cpu\" or \"cuda\".

'cpu' tensor_transforms Callable | None

The transforms to apply to the output tensor produced by the model.

None Source code in src/eva/core/models/networks/wrappers/onnx.py
def __init__(\n    self,\n    path: str,\n    device: Literal[\"cpu\", \"cuda\"] | None = \"cpu\",\n    tensor_transforms: Callable | None = None,\n):\n    \"\"\"Initializes the model.\n\n    Args:\n        path: The path to the .onnx model file.\n        device: The device to run the model on. This can be either \"cpu\" or \"cuda\".\n        tensor_transforms: The transforms to apply to the output tensor produced by the model.\n    \"\"\"\n    super().__init__(tensor_transforms=tensor_transforms)\n\n    self._path = path\n    self._device = device\n    self._model = self.load_model()\n
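A combined usage sketch for the HuggingFace and ONNX wrappers; the model name and file path below are placeholders to be replaced with real values.

from eva.models.networks.wrappers import HuggingFaceModel, ONNXModel\n\nhf_backbone = HuggingFaceModel(model_name_or_path=\"<org>/<model-name>\")  # placeholder hub name\nonnx_backbone = ONNXModel(path=\"/path/to/model.onnx\", device=\"cpu\")  # placeholder file path\n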
"},{"location":"reference/core/trainers/functional/","title":"Functional","text":"

Reference information for the trainers Functional API.

"},{"location":"reference/core/trainers/functional/#eva.core.trainers.functional.run_evaluation_session","title":"eva.core.trainers.functional.run_evaluation_session","text":"

Runs a downstream evaluation session out-of-place.

It performs an evaluation run (fit and evaluate) on the model multiple times. Note that the input base_trainer and base_model are cloned, so the input objects are not modified.

Parameters:

Name Type Description Default base_trainer Trainer

The base trainer module to use.

required base_model ModelModule

The base model module to use.

required datamodule DataModule

The data module.

required n_runs int

The number of runs (fit and evaluate) to perform.

1 verbose bool

Whether to report the session metrics instead of those of each individual run, and vice versa.

True Source code in src/eva/core/trainers/functional.py
def run_evaluation_session(\n    base_trainer: eva_trainer.Trainer,\n    base_model: modules.ModelModule,\n    datamodule: datamodules.DataModule,\n    *,\n    n_runs: int = 1,\n    verbose: bool = True,\n) -> None:\n    \"\"\"Runs a downstream evaluation session out-of-place.\n\n    It performs an evaluation run (fit and evaluate) on the model\n    multiple times. Note that as the input `base_trainer` and\n    `base_model` would be cloned, the input object would not\n    be modified.\n\n    Args:\n        base_trainer: The base trainer module to use.\n        base_model: The base model module to use.\n        datamodule: The data module.\n        n_runs: The amount of runs (fit and evaluate) to perform.\n        verbose: Whether to verbose the session metrics instead of\n            these of each individual runs and vice-versa.\n    \"\"\"\n    recorder = _recorder.SessionRecorder(output_dir=base_trainer.default_log_dir, verbose=verbose)\n    for run_index in range(n_runs):\n        validation_scores, test_scores = run_evaluation(\n            base_trainer,\n            base_model,\n            datamodule,\n            run_id=f\"run_{run_index}\",\n            verbose=not verbose,\n        )\n        recorder.update(validation_scores, test_scores)\n    recorder.save()\n
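A minimal sketch of how the function composes with the other eva objects; the trainer, model and datamodule are assumed to be configured elsewhere, and the import path follows the heading above.

from eva.core.trainers import functional\n\ndef evaluate_five_times(trainer, model, datamodule) -> None:\n    \"\"\"Fits and evaluates the model five times and records the session metrics.\"\"\"\n    functional.run_evaluation_session(\n        base_trainer=trainer,\n        base_model=model,\n        datamodule=datamodule,\n        n_runs=5,\n        verbose=True,  # report the aggregated session metrics\n    )\n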
"},{"location":"reference/core/trainers/functional/#eva.core.trainers.functional.run_evaluation","title":"eva.core.trainers.functional.run_evaluation","text":"

Fits and evaluates a model out-of-place.

Parameters:

Name Type Description Default base_trainer Trainer

The base trainer to use but not modify.

required base_model ModelModule

The model module to use but not modify.

required datamodule DataModule

The data module.

required run_id str | None

The run id to be appended to the output log directory. If None, it will use the log directory of the trainer as is.

None verbose bool

Whether to print the validation and test metrics at the end of the training.

True

Returns:

Type Description Tuple[_EVALUATE_OUTPUT, _EVALUATE_OUTPUT | None]

A tuple with the validation and the test metrics (if the test set exists).

Source code in src/eva/core/trainers/functional.py
def run_evaluation(\n    base_trainer: eva_trainer.Trainer,\n    base_model: modules.ModelModule,\n    datamodule: datamodules.DataModule,\n    *,\n    run_id: str | None = None,\n    verbose: bool = True,\n) -> Tuple[_EVALUATE_OUTPUT, _EVALUATE_OUTPUT | None]:\n    \"\"\"Fits and evaluates a model out-of-place.\n\n    Args:\n        base_trainer: The base trainer to use but not modify.\n        base_model: The model module to use but not modify.\n        datamodule: The data module.\n        run_id: The run id to be appended to the output log directory.\n            If `None`, it will use the log directory of the trainer as is.\n        verbose: Whether to print the validation and test metrics\n            in the end of the training.\n\n    Returns:\n        A tuple of with the validation and the test metrics (if exists).\n    \"\"\"\n    trainer, model = _utils.clone(base_trainer, base_model)\n    trainer.setup_log_dirs(run_id or \"\")\n    return fit_and_validate(trainer, model, datamodule, verbose=verbose)\n
"},{"location":"reference/core/trainers/functional/#eva.core.trainers.functional.fit_and_validate","title":"eva.core.trainers.functional.fit_and_validate","text":"

Fits and evaluates a model in-place.

If the test set is set in the datamodule, it will evaluate the model on the test set as well.

Parameters:

Name Type Description Default trainer Trainer

The trainer module to use and update in-place.

required model ModelModule

The model module to use and update in-place.

required datamodule DataModule

The data module.

required verbose bool

Whether to print the validation and test metrics at the end of the training.

True

Returns:

Type Description Tuple[_EVALUATE_OUTPUT, _EVALUATE_OUTPUT | None]

A tuple with the validation and the test metrics (if the test set exists).

Source code in src/eva/core/trainers/functional.py
def fit_and_validate(\n    trainer: eva_trainer.Trainer,\n    model: modules.ModelModule,\n    datamodule: datamodules.DataModule,\n    verbose: bool = True,\n) -> Tuple[_EVALUATE_OUTPUT, _EVALUATE_OUTPUT | None]:\n    \"\"\"Fits and evaluates a model in-place.\n\n    If the test set is set in the datamodule, it will evaluate the model\n    on the test set as well.\n\n    Args:\n        trainer: The trainer module to use and update in-place.\n        model: The model module to use and update in-place.\n        datamodule: The data module.\n        verbose: Whether to print the validation and test metrics\n            in the end of the training.\n\n    Returns:\n        A tuple of with the validation and the test metrics (if exists).\n    \"\"\"\n    trainer.fit(model, datamodule=datamodule)\n    validation_scores = trainer.validate(datamodule=datamodule, verbose=verbose)\n    test_scores = (\n        None\n        if datamodule.datasets.test is None\n        else trainer.test(datamodule=datamodule, verbose=verbose)\n    )\n    return validation_scores, test_scores\n
"},{"location":"reference/core/trainers/functional/#eva.core.trainers.functional.infer_model","title":"eva.core.trainers.functional.infer_model","text":"

Performs model inference out-of-place.

Note that the input base_model and base_trainer will not be modified.

Parameters:

Name Type Description Default base_trainer Trainer

The base trainer to use but not modify.

required base_model ModelModule

The model module to use but not modify.

required datamodule DataModule

The data module.

required return_predictions bool

Whether to return the model predictions.

False Source code in src/eva/core/trainers/functional.py
def infer_model(\n    base_trainer: eva_trainer.Trainer,\n    base_model: modules.ModelModule,\n    datamodule: datamodules.DataModule,\n    *,\n    return_predictions: bool = False,\n) -> None:\n    \"\"\"Performs model inference out-of-place.\n\n    Note that the input `base_model` and `base_trainer` would\n    not be modified.\n\n    Args:\n        base_trainer: The base trainer to use but not modify.\n        base_model: The model module to use but not modify.\n        datamodule: The data module.\n        return_predictions: Whether to return the model predictions.\n    \"\"\"\n    trainer, model = _utils.clone(base_trainer, base_model)\n    return trainer.predict(\n        model=model,\n        datamodule=datamodule,\n        return_predictions=return_predictions,\n    )\n
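A sketch that pairs infer_model with the InferenceModule described earlier on this page; the import paths are taken from the headings and the arguments are assumed to be configured elsewhere.

from eva.core.trainers import functional\nfrom eva.models.modules import InferenceModule\n\ndef predict(trainer, backbone, datamodule):\n    \"\"\"Wraps a backbone for inference and returns its predictions.\"\"\"\n    model = InferenceModule(backbone=backbone)\n    return functional.infer_model(\n        trainer, model, datamodule, return_predictions=True\n    )\n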
"},{"location":"reference/core/trainers/trainer/","title":"Trainers","text":"

Reference information for the Trainers API.

"},{"location":"reference/core/trainers/trainer/#eva.core.trainers.Trainer","title":"eva.core.trainers.Trainer","text":"

Bases: Trainer

Core trainer class.

This is an extended version of lightning's core trainer class.

For the input arguments, refer to ::class::lightning.pytorch.Trainer.

Parameters:

Name Type Description Default args Any

Positional arguments of ::class::lightning.pytorch.Trainer.

() default_root_dir str

The default root directory to store the output logs. Unlike in ::class::lightning.pytorch.Trainer, this path takes priority as the output destination.

'logs' n_runs int

The number of runs (fit and evaluate) to perform in an evaluation session.

1 kwargs Any

Keyword arguments of ::class::lightning.pytorch.Trainer.

{} Source code in src/eva/core/trainers/trainer.py
@argparse._defaults_from_env_vars\ndef __init__(\n    self,\n    *args: Any,\n    default_root_dir: str = \"logs\",\n    n_runs: int = 1,\n    **kwargs: Any,\n) -> None:\n    \"\"\"Initializes the trainer.\n\n    For the input arguments, refer to ::class::`lightning.pytorch.Trainer`.\n\n    Args:\n        args: Positional arguments of ::class::`lightning.pytorch.Trainer`.\n        default_root_dir: The default root directory to store the output logs.\n            Unlike in ::class::`lightning.pytorch.Trainer`, this path would be the\n            prioritized destination point.\n        n_runs: The amount of runs (fit and evaluate) to perform in an evaluation session.\n        kwargs: Kew-word arguments of ::class::`lightning.pytorch.Trainer`.\n    \"\"\"\n    super().__init__(*args, default_root_dir=default_root_dir, **kwargs)\n\n    self._n_runs = n_runs\n\n    self._session_id: str = _logging.generate_session_id()\n    self._log_dir: str = self.default_log_dir\n\n    self.setup_log_dirs()\n
"},{"location":"reference/core/trainers/trainer/#eva.core.trainers.Trainer.default_log_dir","title":"default_log_dir: str property","text":"

Returns the default log directory.

"},{"location":"reference/core/trainers/trainer/#eva.core.trainers.Trainer.setup_log_dirs","title":"setup_log_dirs","text":"

Sets up the logging directory of the trainer and the experiment loggers in-place.

Parameters:

Name Type Description Default subdirectory str

The subdirectory to append to the output log directory.

'' Source code in src/eva/core/trainers/trainer.py
def setup_log_dirs(self, subdirectory: str = \"\") -> None:\n    \"\"\"Setups the logging directory of the trainer and experimental loggers in-place.\n\n    Args:\n        subdirectory: Whether to append a subdirectory to the output log.\n    \"\"\"\n    self._log_dir = os.path.join(self.default_root_dir, self._session_id, subdirectory)\n\n    enabled_loggers = []\n    if isinstance(self.loggers, list) and len(self.loggers) > 0:\n        for logger in self.loggers:\n            if isinstance(logger, (pl_loggers.CSVLogger, pl_loggers.TensorBoardLogger)):\n                if not cloud_io._is_local_file_protocol(self.default_root_dir):\n                    loguru.logger.warning(\n                        f\"Skipped {type(logger).__name__} as remote storage is not supported.\"\n                    )\n                    continue\n                else:\n                    logger._root_dir = self.default_root_dir\n                    logger._name = self._session_id\n                    logger._version = subdirectory\n            enabled_loggers.append(logger)\n\n    self._loggers = enabled_loggers or [eva_loggers.DummyLogger(self._log_dir)]\n
"},{"location":"reference/core/trainers/trainer/#eva.core.trainers.Trainer.run_evaluation_session","title":"run_evaluation_session","text":"

Runs an evaluation session out-of-place.

It performs an evaluation run (fit and evaluate) on the model self._n_runs times. Note that the input model is not modified, so its weights remain as they are.

Parameters:

Name Type Description Default model ModelModule

The base model module to evaluate.

required datamodule DataModule

The data module.

required Source code in src/eva/core/trainers/trainer.py
def run_evaluation_session(\n    self,\n    model: modules.ModelModule,\n    datamodule: datamodules.DataModule,\n) -> None:\n    \"\"\"Runs an evaluation session out-of-place.\n\n    It performs an evaluation run (fit and evaluate) the model\n    `self._n_run` times. Note that the input `base_model` would\n    not be modified, so the weights of the input model will remain\n    as they are.\n\n    Args:\n        model: The base model module to evaluate.\n        datamodule: The data module.\n    \"\"\"\n    functional.run_evaluation_session(\n        base_trainer=self,\n        base_model=model,\n        datamodule=datamodule,\n        n_runs=self._n_runs,\n        verbose=self._n_runs > 1,\n    )\n
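A usage sketch of the extended trainer, assuming the eva.core.trainers import path from the heading; any lightning.pytorch.Trainer keyword argument can be forwarded, and the values here are illustrative.

from eva.core.trainers import Trainer\n\ndef evaluate(model, datamodule) -> None:\n    \"\"\"Fits and evaluates the model three times under logs/<session-id>/run_i.\"\"\"\n    trainer = Trainer(\n        default_root_dir=\"logs\",  # prioritized output destination\n        n_runs=3,\n        max_epochs=10,  # forwarded to lightning.pytorch.Trainer\n    )\n    trainer.run_evaluation_session(model=model, datamodule=datamodule)\n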
"},{"location":"reference/core/utils/multiprocessing/","title":"Multiprocessing","text":"

Reference information for the utils Multiprocessing API.

"},{"location":"reference/core/utils/multiprocessing/#eva.core.utils.multiprocessing.Process","title":"eva.core.utils.multiprocessing.Process","text":"

Bases: Process

Multiprocessing wrapper with logic to propagate exceptions to the parent process.

Source: https://stackoverflow.com/a/33599967/4992248

Source code in src/eva/core/utils/multiprocessing.py
def __init__(self, *args: Any, **kwargs: Any) -> None:\n    \"\"\"Initialize the process.\"\"\"\n    multiprocessing.Process.__init__(self, *args, **kwargs)\n\n    self._parent_conn, self._child_conn = multiprocessing.Pipe()\n    self._exception = None\n
"},{"location":"reference/core/utils/multiprocessing/#eva.core.utils.multiprocessing.Process.exception","title":"exception property","text":"

Property that contains exception information from the process.

"},{"location":"reference/core/utils/multiprocessing/#eva.core.utils.multiprocessing.Process.run","title":"run","text":"

Run the process.

Source code in src/eva/core/utils/multiprocessing.py
def run(self) -> None:\n    \"\"\"Run the process.\"\"\"\n    try:\n        multiprocessing.Process.run(self)\n        self._child_conn.send(None)\n    except Exception as e:\n        tb = traceback.format_exc()\n        self._child_conn.send((e, tb))\n
"},{"location":"reference/core/utils/multiprocessing/#eva.core.utils.multiprocessing.Process.check_exceptions","title":"check_exceptions","text":"

Check for an exception and propagate it to the parent process.

Source code in src/eva/core/utils/multiprocessing.py
def check_exceptions(self) -> None:\n    \"\"\"Check for exception propagate it to the parent process.\"\"\"\n    if not self.is_alive():\n        if self.exception:\n            error, traceback = self.exception\n            sys.stderr.write(traceback + \"\\n\")\n            raise error\n
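A small sketch of the exception-propagation behaviour: the child process raises, and after joining, check_exceptions re-raises the error in the parent. The __main__ guard is needed on spawn-based platforms.

from eva.core.utils.multiprocessing import Process\n\ndef _work() -> None:\n    raise RuntimeError(\"boom\")  # simulated failure in the child process\n\nif __name__ == \"__main__\":\n    process = Process(target=_work)\n    process.start()\n    process.join()\n    process.check_exceptions()  # re-raises the RuntimeError in the parent\n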
"},{"location":"reference/core/utils/workers/","title":"Workers","text":"

Reference information for the utils Workers API.

"},{"location":"reference/core/utils/workers/#eva.core.utils.workers.main_worker_only","title":"eva.core.utils.workers.main_worker_only","text":"

Function decorator which executes the wrapped function only on the main process / worker.

Source code in src/eva/core/utils/workers.py
def main_worker_only(func: Callable) -> Any:\n    \"\"\"Function decorator which will execute it only on main / worker process.\"\"\"\n\n    def wrapper(*args: Any, **kwargs: Any) -> Any:\n        \"\"\"Wrapper function for the decorated method.\"\"\"\n        if is_main_worker():\n            return func(*args, **kwargs)\n\n    return wrapper\n
"},{"location":"reference/core/utils/workers/#eva.core.utils.workers.is_main_worker","title":"eva.core.utils.workers.is_main_worker","text":"

Returns whether the main process / worker is currently used.

Source code in src/eva/core/utils/workers.py
def is_main_worker() -> bool:\n    \"\"\"Returns whether the main process / worker is currently used.\"\"\"\n    process = multiprocessing.current_process()\n    return process.name == \"MainProcess\"\n
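A usage sketch of the decorator, assuming the eva.core.utils.workers import path from the heading; the decorated function is illustrative.

from eva.core.utils import workers\n\n@workers.main_worker_only\ndef log_summary(message: str) -> None:\n    \"\"\"Runs only in the main process; silently skipped in worker processes.\"\"\"\n    print(message)\n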
"},{"location":"reference/vision/","title":"Vision","text":"

Reference information for the Vision API.

If you have not already installed the Vision-package, install it with:

pip install 'kaiko-eva[vision]'\n

"},{"location":"reference/vision/utils/","title":"Utils","text":""},{"location":"reference/vision/utils/#eva.vision.utils.io.image","title":"eva.vision.utils.io.image","text":"

Image I/O related functions.

"},{"location":"reference/vision/utils/#eva.vision.utils.io.image.read_image","title":"read_image","text":"

Reads and loads the image from a file path as an RGB image.

Parameters:

Name Type Description Default path str

The path of the image file.

required

Returns:

Type Description NDArray[uint8]

The RGB image as a numpy array.

Raises:

Type Description FileExistsError

If the path does not exist or it is unreachable.

IOError

If the image could not be loaded.

Source code in src/eva/vision/utils/io/image.py
def read_image(path: str) -> npt.NDArray[np.uint8]:\n    \"\"\"Reads and loads the image from a file path as a RGB.\n\n    Args:\n        path: The path of the image file.\n\n    Returns:\n        The RGB image as a numpy array.\n\n    Raises:\n        FileExistsError: If the path does not exist or it is unreachable.\n        IOError: If the image could not be loaded.\n    \"\"\"\n    return read_image_as_array(path, cv2.IMREAD_COLOR)\n
"},{"location":"reference/vision/utils/#eva.vision.utils.io.image.read_image_as_array","title":"read_image_as_array","text":"

Reads and loads an image file as a numpy array.

Parameters:

Name Type Description Default path str

The path to the image file.

required flags int

Specifies the way in which the image should be read.

IMREAD_UNCHANGED

Returns:

Type Description NDArray[uint8]

The image as a numpy array.

Raises:

Type Description FileExistsError

If the path does not exist or it is unreachable.

IOError

If the image could not be loaded.

Source code in src/eva/vision/utils/io/image.py
def read_image_as_array(path: str, flags: int = cv2.IMREAD_UNCHANGED) -> npt.NDArray[np.uint8]:\n    \"\"\"Reads and loads an image file as a numpy array.\n\n    Args:\n        path: The path to the image file.\n        flags: Specifies the way in which the image should be read.\n\n    Returns:\n        The image as a numpy array.\n\n    Raises:\n        FileExistsError: If the path does not exist or it is unreachable.\n        IOError: If the image could not be loaded.\n    \"\"\"\n    _utils.check_file(path)\n    image = cv2.imread(path, flags=flags)\n    if image is None:\n        raise IOError(\n            f\"Input '{path}' could not be loaded. \"\n            \"Please verify that the path is a valid image file.\"\n        )\n\n    if image.ndim == 3:\n        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)\n\n    if image.ndim == 2 and flags == cv2.IMREAD_COLOR:\n        image = image[:, :, np.newaxis]\n\n    return np.asarray(image).astype(np.uint8)\n
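A short usage sketch; the file path is a placeholder and the import follows the module path shown above.

from eva.vision.utils.io import image as image_io\n\nrgb = image_io.read_image(\"/path/to/patch.png\")  # placeholder path\nprint(rgb.shape, rgb.dtype)  # (height, width, 3), uint8\n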
"},{"location":"reference/vision/utils/#eva.vision.utils.io.nifti","title":"eva.vision.utils.io.nifti","text":"

NIfTI I/O related functions.

"},{"location":"reference/vision/utils/#eva.vision.utils.io.nifti.read_nifti_slice","title":"read_nifti_slice","text":"

Reads and loads a NIfTI image from a file path as uint8.

Parameters:

Name Type Description Default path str

The path to the NIfTI file.

required slice_index int

The image slice index to return.

required use_storage_dtype bool

Whether to cast the raw image array to the inferred type.

True

Returns:

Type Description NDArray[Any]

The image as a numpy array (height, width, channels).

Raises:

Type Description FileExistsError

If the path does not exist or it is unreachable.

ValueError

If the input channel is invalid for the image.

Source code in src/eva/vision/utils/io/nifti.py
def read_nifti_slice(\n    path: str, slice_index: int, *, use_storage_dtype: bool = True\n) -> npt.NDArray[Any]:\n    \"\"\"Reads and loads a NIfTI image from a file path as `uint8`.\n\n    Args:\n        path: The path to the NIfTI file.\n        slice_index: The image slice index to return.\n        use_storage_dtype: Whether to cast the raw image\n            array to the inferred type.\n\n    Returns:\n        The image as a numpy array (height, width, channels).\n\n    Raises:\n        FileExistsError: If the path does not exist or it is unreachable.\n        ValueError: If the input channel is invalid for the image.\n    \"\"\"\n    _utils.check_file(path)\n    image_data = nib.load(path)  # type: ignore\n    image_slice = image_data.slicer[:, :, slice_index : slice_index + 1]  # type: ignore\n    image_array = image_slice.get_fdata()\n    if use_storage_dtype:\n        image_array = image_array.astype(image_data.get_data_dtype())  # type: ignore\n    return image_array\n
"},{"location":"reference/vision/utils/#eva.vision.utils.io.nifti.fetch_total_nifti_slices","title":"fetch_total_nifti_slices","text":"

Fetches the total number of slices of a NIfTI image file.

Parameters:

Name Type Description Default path str

The path to the NIfTI file.

required

Returns:

Type Description int

The total number of available slices.

Raises:

Type Description FileExistsError

If the path does not exist or it is unreachable.

ValueError

If the input channel is invalid for the image.

Source code in src/eva/vision/utils/io/nifti.py
def fetch_total_nifti_slices(path: str) -> int:\n    \"\"\"Fetches the total slides of a NIfTI image file.\n\n    Args:\n        path: The path to the NIfTI file.\n\n    Returns:\n        The number of the total available slides.\n\n    Raises:\n        FileExistsError: If the path does not exist or it is unreachable.\n        ValueError: If the input channel is invalid for the image.\n    \"\"\"\n    _utils.check_file(path)\n    image = nib.load(path)  # type: ignore\n    image_shape = image.header.get_data_shape()  # type: ignore\n    return image_shape[-1]\n
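A short usage sketch combining the two NIfTI helpers; the file path is a placeholder.

from eva.vision.utils.io import nifti\n\npath = \"/path/to/scan.nii.gz\"  # placeholder path\ntotal = nifti.fetch_total_nifti_slices(path)\nmiddle = nifti.read_nifti_slice(path, slice_index=total // 2)\nprint(total, middle.shape)\n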
"},{"location":"reference/vision/data/","title":"Vision Data","text":"

Reference information for the Vision Data API.

"},{"location":"reference/vision/data/datasets/","title":"Datasets","text":""},{"location":"reference/vision/data/datasets/#visiondataset","title":"VisionDataset","text":""},{"location":"reference/vision/data/datasets/#eva.vision.data.datasets.VisionDataset","title":"eva.vision.data.datasets.VisionDataset","text":"

Bases: Dataset, ABC, Generic[DataSample]

Base dataset class for vision tasks.

"},{"location":"reference/vision/data/datasets/#eva.vision.data.datasets.VisionDataset.filename","title":"filename abstractmethod","text":"

Returns the filename of the index'th data sample.

Note that this is the file path relative to the root.

Parameters:

Name Type Description Default index int

The index of the data-sample to select.

required

Returns:

Type Description str

The filename of the index'th data sample.

Source code in src/eva/vision/data/datasets/vision.py
@abc.abstractmethod\ndef filename(self, index: int) -> str:\n    \"\"\"Returns the filename of the `index`'th data sample.\n\n    Note that this is the relative file path to the root.\n\n    Args:\n        index: The index of the data-sample to select.\n\n    Returns:\n        The filename of the `index`'th data sample.\n    \"\"\"\n
"},{"location":"reference/vision/data/datasets/#classification-datasets","title":"Classification datasets","text":""},{"location":"reference/vision/data/datasets/#eva.vision.data.datasets.BACH","title":"eva.vision.data.datasets.BACH","text":"

Bases: ImageClassification

Dataset class for BACH images and corresponding targets.

The dataset is split into train and validation by taking into account the patient IDs to avoid any data leakage.

Parameters:

Name Type Description Default root str

Path to the root directory of the dataset. The dataset will be downloaded and extracted here, if it does not already exist.

required split Literal['train', 'val'] | None

Dataset split to use. If None, the entire dataset is used.

None download bool

Whether to download the data for the specified split. Note that the download will be executed only by additionally calling the :meth:prepare_data method and if the data does not yet exist on disk.

False image_transforms Callable | None

A function/transform that takes in an image and returns a transformed version.

None target_transforms Callable | None

A function/transform that takes in the target and transforms it.

None Source code in src/eva/vision/data/datasets/classification/bach.py
def __init__(\n    self,\n    root: str,\n    split: Literal[\"train\", \"val\"] | None = None,\n    download: bool = False,\n    image_transforms: Callable | None = None,\n    target_transforms: Callable | None = None,\n) -> None:\n    \"\"\"Initialize the dataset.\n\n    The dataset is split into train and validation by taking into account\n    the patient IDs to avoid any data leakage.\n\n    Args:\n        root: Path to the root directory of the dataset. The dataset will\n            be downloaded and extracted here, if it does not already exist.\n        split: Dataset split to use. If `None`, the entire dataset is used.\n        download: Whether to download the data for the specified split.\n            Note that the download will be executed only by additionally\n            calling the :meth:`prepare_data` method and if the data does\n            not yet exist on disk.\n        image_transforms: A function/transform that takes in an image\n            and returns a transformed version.\n        target_transforms: A function/transform that takes in the target\n            and transforms it.\n    \"\"\"\n    super().__init__(\n        image_transforms=image_transforms,\n        target_transforms=target_transforms,\n    )\n\n    self._root = root\n    self._split = split\n    self._download = download\n\n    self._samples: List[Tuple[str, int]] = []\n    self._indices: List[int] = []\n
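A usage sketch, assuming the import path from the heading; the root directory is a placeholder and prepare_data is called explicitly to trigger the download, as described above.

from eva.vision.data.datasets import BACH\n\ndataset = BACH(root=\"data/bach\", split=\"train\", download=True)  # placeholder root\ndataset.prepare_data()  # downloads and extracts the data if it is not on disk yet\n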
"},{"location":"reference/vision/data/datasets/#eva.vision.data.datasets.PatchCamelyon","title":"eva.vision.data.datasets.PatchCamelyon","text":"

Bases: ImageClassification

Dataset class for PatchCamelyon images and corresponding targets.

Parameters:

Name Type Description Default root str

The path to the dataset root. This path should contain the uncompressed h5 files and the metadata.

required split Literal['train', 'val', 'test']

The dataset split for training, validation, or testing.

required download bool

Whether to download the data for the specified split. Note that the download will be executed only by additionally calling the :meth:prepare_data method.

False image_transforms Callable | None

A function/transform that takes in an image and returns a transformed version.

None target_transforms Callable | None

A function/transform that takes in the target and transforms it.

None Source code in src/eva/vision/data/datasets/classification/patch_camelyon.py
def __init__(\n    self,\n    root: str,\n    split: Literal[\"train\", \"val\", \"test\"],\n    download: bool = False,\n    image_transforms: Callable | None = None,\n    target_transforms: Callable | None = None,\n) -> None:\n    \"\"\"Initializes the dataset.\n\n    Args:\n        root: The path to the dataset root. This path should contain\n            the uncompressed h5 files and the metadata.\n        split: The dataset split for training, validation, or testing.\n        download: Whether to download the data for the specified split.\n            Note that the download will be executed only by additionally\n            calling the :meth:`prepare_data` method.\n        image_transforms: A function/transform that takes in an image\n            and returns a transformed version.\n        target_transforms: A function/transform that takes in the target\n            and transforms it.\n    \"\"\"\n    super().__init__(\n        image_transforms=image_transforms,\n        target_transforms=target_transforms,\n    )\n\n    self._root = root\n    self._split = split\n    self._download = download\n
"},{"location":"reference/vision/data/datasets/#segmentation-datasets","title":"Segmentation datasets","text":""},{"location":"reference/vision/data/datasets/#eva.vision.data.datasets.ImageSegmentation","title":"eva.vision.data.datasets.ImageSegmentation","text":"

Bases: VisionDataset[Tuple[Image, Mask]], ABC

Image segmentation abstract dataset.

Parameters:

Name Type Description Default transforms Callable | None

A function/transform that takes in an image and a label and returns the transformed versions of both.

None Source code in src/eva/vision/data/datasets/segmentation/base.py
def __init__(\n    self,\n    transforms: Callable | None = None,\n) -> None:\n    \"\"\"Initializes the image segmentation base class.\n\n    Args:\n        transforms: A function/transforms that takes in an\n            image and a label and returns the transformed versions of both.\n    \"\"\"\n    super().__init__()\n\n    self._transforms = transforms\n
"},{"location":"reference/vision/data/datasets/#eva.vision.data.datasets.ImageSegmentation.classes","title":"classes: List[str] | None property","text":"

Returns the list with the names of the dataset classes.

"},{"location":"reference/vision/data/datasets/#eva.vision.data.datasets.ImageSegmentation.class_to_idx","title":"class_to_idx: Dict[str, int] | None property","text":"

Returns a mapping of the class name to its target index.

"},{"location":"reference/vision/data/datasets/#eva.vision.data.datasets.ImageSegmentation.load_metadata","title":"load_metadata","text":"

Returns the dataset metadata.

Parameters:

Name Type Description Default index int | None

The index of the data sample to return the metadata of. If None, it will return the metadata of the current dataset.

required

Returns:

Type Description Dict[str, Any] | List[Dict[str, Any]] | None

The sample metadata.

Source code in src/eva/vision/data/datasets/segmentation/base.py
def load_metadata(self, index: int | None) -> Dict[str, Any] | List[Dict[str, Any]] | None:\n    \"\"\"Returns the dataset metadata.\n\n    Args:\n        index: The index of the data sample to return the metadata of.\n            If `None`, it will return the metadata of the current dataset.\n\n    Returns:\n        The sample metadata.\n    \"\"\"\n
"},{"location":"reference/vision/data/datasets/#eva.vision.data.datasets.ImageSegmentation.load_image","title":"load_image abstractmethod","text":"

Loads and returns the index'th image sample.

Parameters:

Name Type Description Default index int

The index of the data sample to load.

required

Returns:

Type Description Image

An image torchvision tensor (channels, height, width).

Source code in src/eva/vision/data/datasets/segmentation/base.py
@abc.abstractmethod\ndef load_image(self, index: int) -> tv_tensors.Image:\n    \"\"\"Loads and returns the `index`'th image sample.\n\n    Args:\n        index: The index of the data sample to load.\n\n    Returns:\n        An image torchvision tensor (channels, height, width).\n    \"\"\"\n
"},{"location":"reference/vision/data/datasets/#eva.vision.data.datasets.ImageSegmentation.load_mask","title":"load_mask abstractmethod","text":"

Returns the index'th target masks sample.

Parameters:

Name Type Description Default index int

The index of the data sample target masks to load.

required

Returns:

Type Description Mask

The semantic mask as a (H x W) shaped tensor with integer values which represent the pixel class id.

Source code in src/eva/vision/data/datasets/segmentation/base.py
@abc.abstractmethod\ndef load_mask(self, index: int) -> tv_tensors.Mask:\n    \"\"\"Returns the `index`'th target masks sample.\n\n    Args:\n        index: The index of the data sample target masks to load.\n\n    Returns:\n        The semantic mask as a (H x W) shaped tensor with integer\n        values which represent the pixel class id.\n    \"\"\"\n
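A hypothetical toy subclass sketching the two abstract loaders (plus the filename and __len__ hooks a concrete dataset typically needs); ToySegmentation and its contents are illustrative only.

import torch\nfrom torchvision import tv_tensors\nfrom eva.vision.data.datasets import ImageSegmentation\n\nclass ToySegmentation(ImageSegmentation):  # hypothetical example dataset\n    \"\"\"Yields a single random image with an all-background mask.\"\"\"\n\n    def load_image(self, index: int) -> tv_tensors.Image:\n        return tv_tensors.Image(torch.rand(3, 64, 64))\n\n    def load_mask(self, index: int) -> tv_tensors.Mask:\n        return tv_tensors.Mask(torch.zeros(64, 64, dtype=torch.long))\n\n    def filename(self, index: int) -> str:\n        return f\"sample_{index}.png\"\n\n    def __len__(self) -> int:\n        return 1\n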
"},{"location":"reference/vision/data/datasets/#eva.vision.data.datasets.TotalSegmentator2D","title":"eva.vision.data.datasets.TotalSegmentator2D","text":"

Bases: ImageSegmentation

TotalSegmentator 2D segmentation dataset.

Parameters:

Name Type Description Default root str

Path to the root directory of the dataset. The dataset will be downloaded and extracted here, if it does not already exist.

required split Literal['train', 'val'] | None

Dataset split to use. If None, the entire dataset is used.

required version Literal['small', 'full'] | None

The version of the dataset to initialize. If None, it will use the files located at root as is and won't perform any checks.

'small' download bool

Whether to download the data for the specified split. Note that the download will be executed only by additionally calling the :meth:prepare_data method and if the data does not exist yet on disk.

False as_uint8 bool

Whether to convert and return the images as 8-bit.

True transforms Callable | None

A function/transform that takes in an image and a target mask and returns the transformed versions of both.

None Source code in src/eva/vision/data/datasets/segmentation/total_segmentator.py
def __init__(\n    self,\n    root: str,\n    split: Literal[\"train\", \"val\"] | None,\n    version: Literal[\"small\", \"full\"] | None = \"small\",\n    download: bool = False,\n    as_uint8: bool = True,\n    transforms: Callable | None = None,\n) -> None:\n    \"\"\"Initialize dataset.\n\n    Args:\n        root: Path to the root directory of the dataset. The dataset will\n            be downloaded and extracted here, if it does not already exist.\n        split: Dataset split to use. If `None`, the entire dataset is used.\n        version: The version of the dataset to initialize. If `None`, it will\n            use the files located at root as is and wont perform any checks.\n        download: Whether to download the data for the specified split.\n            Note that the download will be executed only by additionally\n            calling the :meth:`prepare_data` method and if the data does not\n            exist yet on disk.\n        as_uint8: Whether to convert and return the images as a 8-bit.\n        transforms: A function/transforms that takes in an image and a target\n            mask and returns the transformed versions of both.\n    \"\"\"\n    super().__init__(transforms=transforms)\n\n    self._root = root\n    self._split = split\n    self._version = version\n    self._download = download\n    self._as_uint8 = as_uint8\n\n    self._samples_dirs: List[str] = []\n    self._indices: List[Tuple[int, int]] = []\n
"},{"location":"reference/vision/data/transforms/","title":"Transforms","text":""},{"location":"reference/vision/data/transforms/#eva.core.data.transforms.dtype.ArrayToTensor","title":"eva.core.data.transforms.dtype.ArrayToTensor","text":"

Converts a numpy array to a torch tensor.

"},{"location":"reference/vision/data/transforms/#eva.core.data.transforms.dtype.ArrayToFloatTensor","title":"eva.core.data.transforms.dtype.ArrayToFloatTensor","text":"

Bases: ArrayToTensor

Converts a numpy array to a torch tensor and casts it to float.

"},{"location":"reference/vision/data/transforms/#eva.vision.data.transforms.ResizeAndCrop","title":"eva.vision.data.transforms.ResizeAndCrop","text":"

Bases: Compose

Resizes, crops and normalizes an input image while preserving its aspect ratio.

Parameters:

Name Type Description Default size int | Sequence[int]

Desired output size of the crop. If size is an int instead of a sequence like (h, w), a square crop (size, size) is made.

224 mean Sequence[float]

Sequence of means for each image channel.

(0.5, 0.5, 0.5) std Sequence[float]

Sequence of standard deviations for each image channel.

(0.5, 0.5, 0.5) Source code in src/eva/vision/data/transforms/common/resize_and_crop.py
def __init__(\n    self,\n    size: int | Sequence[int] = 224,\n    mean: Sequence[float] = (0.5, 0.5, 0.5),\n    std: Sequence[float] = (0.5, 0.5, 0.5),\n) -> None:\n    \"\"\"Initializes the transform object.\n\n    Args:\n        size: Desired output size of the crop. If size is an `int` instead\n            of sequence like (h, w), a square crop (size, size) is made.\n        mean: Sequence of means for each image channel.\n        std: Sequence of standard deviations for each image channel.\n    \"\"\"\n    self._size = size\n    self._mean = mean\n    self._std = std\n\n    super().__init__(transforms=self._build_transforms())\n
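A usage sketch, assuming the transform can be applied directly to a float torchvision image in [0, 1]; the input tensor is synthetic.

import torch\nfrom torchvision import tv_tensors\nfrom eva.vision.data.transforms import ResizeAndCrop\n\ntransform = ResizeAndCrop(size=224, mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))\nimage = tv_tensors.Image(torch.rand(3, 300, 400))  # float image in [0, 1]\noutput = transform(image)  # resized, cropped and normalized to 224x224\n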
"},{"location":"reference/vision/models/networks/","title":"Networks","text":""},{"location":"reference/vision/models/networks/#eva.vision.models.networks.ABMIL","title":"eva.vision.models.networks.ABMIL","text":"

Bases: Module

ABMIL network for multiple instance learning classification tasks.

Takes an array of patch-level embeddings per slide as input. This implementation supports batched inputs of shape (batch_size, n_instances, input_size). For slides with fewer than n_instances patches, you can pad the input with the configured pad_value; the padded entries are then masked during the forward pass.

The original implementation from [1] was used as a reference: https://github.com/AMLab-Amsterdam/AttentionDeepMIL/blob/master/model.py

Notes

[1] Maximilian Ilse, Jakub M. Tomczak, Max Welling, \"Attention-based Deep Multiple Instance Learning\", 2018 https://arxiv.org/abs/1802.04712

Parameters:

Name Type Description Default input_size int

input embedding dimension

required output_size int

number of classes

required projected_input_size int | None

size of the projected input. If None, no projection is performed.

required hidden_size_attention int

hidden dimension in attention network

128 hidden_sizes_mlp tuple

dimensions for hidden layers in last mlp

(128, 64) use_bias bool

whether to use bias in the attention network

True dropout_input_embeddings float

dropout rate for the input embeddings

0.0 dropout_attention float

dropout rate for the attention network and classifier

0.0 dropout_mlp float

dropout rate for the final MLP network

0.0 pad_value int | float | None

Value indicating padding in the input tensor. If specified, entries with this value in the input will be masked. If set to None, no masking is applied.

float('-inf') Source code in src/eva/vision/models/networks/abmil.py
def __init__(\n    self,\n    input_size: int,\n    output_size: int,\n    projected_input_size: int | None,\n    hidden_size_attention: int = 128,\n    hidden_sizes_mlp: tuple = (128, 64),\n    use_bias: bool = True,\n    dropout_input_embeddings: float = 0.0,\n    dropout_attention: float = 0.0,\n    dropout_mlp: float = 0.0,\n    pad_value: int | float | None = float(\"-inf\"),\n) -> None:\n    \"\"\"Initializes the ABMIL network.\n\n    Args:\n        input_size: input embedding dimension\n        output_size: number of classes\n        projected_input_size: size of the projected input. if `None`, no projection is\n            performed.\n        hidden_size_attention: hidden dimension in attention network\n        hidden_sizes_mlp: dimensions for hidden layers in last mlp\n        use_bias: whether to use bias in the attention network\n        dropout_input_embeddings: dropout rate for the input embeddings\n        dropout_attention: dropout rate for the attention network and classifier\n        dropout_mlp: dropout rate for the final MLP network\n        pad_value: Value indicating padding in the input tensor. If specified, entries with\n            this value in the will be masked. If set to `None`, no masking is applied.\n    \"\"\"\n    super().__init__()\n\n    self._pad_value = pad_value\n\n    if projected_input_size:\n        self.projector = nn.Sequential(\n            nn.Linear(input_size, projected_input_size, bias=True),\n            nn.Dropout(p=dropout_input_embeddings),\n        )\n        input_size = projected_input_size\n    else:\n        self.projector = nn.Dropout(p=dropout_input_embeddings)\n\n    self.gated_attention = GatedAttention(\n        input_dim=input_size,\n        hidden_dim=hidden_size_attention,\n        dropout=dropout_attention,\n        n_classes=1,\n        use_bias=use_bias,\n    )\n\n    self.classifier = MLP(\n        input_size=input_size,\n        output_size=output_size,\n        hidden_layer_sizes=hidden_sizes_mlp,\n        dropout=dropout_mlp,\n        hidden_activation_fn=nn.ReLU,\n    )\n
"},{"location":"reference/vision/models/networks/#eva.vision.models.networks.ABMIL.forward","title":"forward","text":"

Forward pass.

Parameters:

Name Type Description Default input_tensor Tensor

Tensor with expected shape of (batch_size, n_instances, input_size).

required Source code in src/eva/vision/models/networks/abmil.py
def forward(self, input_tensor: torch.Tensor) -> torch.Tensor:\n    \"\"\"Forward pass.\n\n    Args:\n        input_tensor: Tensor with expected shape of (batch_size, n_instances, input_size).\n    \"\"\"\n    input_tensor, mask = self._mask_values(input_tensor, self._pad_value)\n\n    # (batch_size, n_instances, input_size) -> (batch_size, n_instances, projected_input_size)\n    input_tensor = self.projector(input_tensor)\n\n    attention_logits = self.gated_attention(input_tensor)  # (batch_size, n_instances, 1)\n    if mask is not None:\n        # fill masked values with -inf, which will yield 0s after softmax\n        attention_logits = attention_logits.masked_fill(mask, float(\"-inf\"))\n\n    attention_weights = nn.functional.softmax(attention_logits, dim=1)\n    # (batch_size, n_instances, 1)\n\n    attention_result = torch.matmul(torch.transpose(attention_weights, 1, 2), input_tensor)\n    # (batch_size, 1, hidden_size_attention)\n\n    attention_result = torch.squeeze(attention_result, 1)  # (batch_size, hidden_size_attention)\n\n    return self.classifier(attention_result)  # (batch_size, output_size)\n
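A usage sketch with a padded batch of patch embeddings, assuming the import path from the heading; the embedding sizes and padding below are synthetic.

import torch\nfrom eva.vision.models.networks import ABMIL\n\nmodel = ABMIL(input_size=384, output_size=2, projected_input_size=128)\nembeddings = torch.randn(4, 1000, 384)  # (batch_size, n_instances, input_size)\nembeddings[0, 800:] = float(\"-inf\")  # pad a slide that has fewer patches\nlogits = model(embeddings)  # output shape (4, 2)\n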
"},{"location":"reference/vision/utils/io/","title":"IO","text":""},{"location":"reference/vision/utils/io/#eva.vision.utils.io.image","title":"eva.vision.utils.io.image","text":"

Image I/O related functions.

"},{"location":"reference/vision/utils/io/#eva.vision.utils.io.image.read_image","title":"read_image","text":"

Reads and loads the image from a file path as an RGB image.

Parameters:

Name Type Description Default path str

The path of the image file.

required

Returns:

Type Description NDArray[uint8]

The RGB image as a numpy array.

Raises:

Type Description FileExistsError

If the path does not exist or it is unreachable.

IOError

If the image could not be loaded.

Source code in src/eva/vision/utils/io/image.py
def read_image(path: str) -> npt.NDArray[np.uint8]:\n    \"\"\"Reads and loads the image from a file path as a RGB.\n\n    Args:\n        path: The path of the image file.\n\n    Returns:\n        The RGB image as a numpy array.\n\n    Raises:\n        FileExistsError: If the path does not exist or it is unreachable.\n        IOError: If the image could not be loaded.\n    \"\"\"\n    return read_image_as_array(path, cv2.IMREAD_COLOR)\n
"},{"location":"reference/vision/utils/io/#eva.vision.utils.io.image.read_image_as_array","title":"read_image_as_array","text":"

Reads and loads an image file as a numpy array.

Parameters:

path (str, required): The path to the image file.

flags (int, default: IMREAD_UNCHANGED): Specifies the way in which the image should be read.

Returns:

NDArray[uint8]: The image as a numpy array.

Raises:

FileExistsError: If the path does not exist or it is unreachable.

IOError: If the image could not be loaded.

Source code in src/eva/vision/utils/io/image.py
def read_image_as_array(path: str, flags: int = cv2.IMREAD_UNCHANGED) -> npt.NDArray[np.uint8]:\n    \"\"\"Reads and loads an image file as a numpy array.\n\n    Args:\n        path: The path to the image file.\n        flags: Specifies the way in which the image should be read.\n\n    Returns:\n        The image as a numpy array.\n\n    Raises:\n        FileExistsError: If the path does not exist or it is unreachable.\n        IOError: If the image could not be loaded.\n    \"\"\"\n    _utils.check_file(path)\n    image = cv2.imread(path, flags=flags)\n    if image is None:\n        raise IOError(\n            f\"Input '{path}' could not be loaded. \"\n            \"Please verify that the path is a valid image file.\"\n        )\n\n    if image.ndim == 3:\n        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)\n\n    if image.ndim == 2 and flags == cv2.IMREAD_COLOR:\n        image = image[:, :, np.newaxis]\n\n    return np.asarray(image).astype(np.uint8)\n
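For illustration, a small usage sketch of both helpers (the file path is a placeholder):

import cv2\nfrom eva.vision.utils.io.image import read_image, read_image_as_array\n\nrgb = read_image(\"path/to/patch.png\")  # (height, width, 3) uint8 array in RGB order\ngray = read_image_as_array(\"path/to/patch.png\", flags=cv2.IMREAD_GRAYSCALE)  # (height, width)\n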
"},{"location":"reference/vision/utils/io/#eva.vision.utils.io.nifti","title":"eva.vision.utils.io.nifti","text":"

NIfTI I/O related functions.

"},{"location":"reference/vision/utils/io/#eva.vision.utils.io.nifti.read_nifti_slice","title":"read_nifti_slice","text":"

Reads and loads a NIfTI image from a file path as uint8.

Parameters:

path (str, required): The path to the NIfTI file.

slice_index (int, required): The image slice index to return.

use_storage_dtype (bool, default: True): Whether to cast the raw image array to the inferred type.

Returns:

NDArray[Any]: The image as a numpy array (height, width, channels).

Raises:

FileExistsError: If the path does not exist or it is unreachable.

ValueError: If the input channel is invalid for the image.

Source code in src/eva/vision/utils/io/nifti.py
def read_nifti_slice(\n    path: str, slice_index: int, *, use_storage_dtype: bool = True\n) -> npt.NDArray[Any]:\n    \"\"\"Reads and loads a NIfTI image from a file path as `uint8`.\n\n    Args:\n        path: The path to the NIfTI file.\n        slice_index: The image slice index to return.\n        use_storage_dtype: Whether to cast the raw image\n            array to the inferred type.\n\n    Returns:\n        The image as a numpy array (height, width, channels).\n\n    Raises:\n        FileExistsError: If the path does not exist or it is unreachable.\n        ValueError: If the input channel is invalid for the image.\n    \"\"\"\n    _utils.check_file(path)\n    image_data = nib.load(path)  # type: ignore\n    image_slice = image_data.slicer[:, :, slice_index : slice_index + 1]  # type: ignore\n    image_array = image_slice.get_fdata()\n    if use_storage_dtype:\n        image_array = image_array.astype(image_data.get_data_dtype())  # type: ignore\n    return image_array\n
"},{"location":"reference/vision/utils/io/#eva.vision.utils.io.nifti.fetch_total_nifti_slices","title":"fetch_total_nifti_slices","text":"

Fetches the total number of slices of a NIfTI image file.

Parameters:

path (str, required): The path to the NIfTI file.

Returns:

int: The total number of available slices.

Raises:

FileExistsError: If the path does not exist or it is unreachable.

ValueError: If the input channel is invalid for the image.

Source code in src/eva/vision/utils/io/nifti.py
def fetch_total_nifti_slices(path: str) -> int:\n    \"\"\"Fetches the total number of slices of a NIfTI image file.\n\n    Args:\n        path: The path to the NIfTI file.\n\n    Returns:\n        The total number of available slices.\n\n    Raises:\n        FileExistsError: If the path does not exist or it is unreachable.\n        ValueError: If the input channel is invalid for the image.\n    \"\"\"\n    _utils.check_file(path)\n    image = nib.load(path)  # type: ignore\n    image_shape = image.header.get_data_shape()  # type: ignore\n    return image_shape[-1]\n
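A short sketch combining the two helpers to iterate over all slices of a volume (the file path is a placeholder):

from eva.vision.utils.io.nifti import fetch_total_nifti_slices, read_nifti_slice\n\npath = \"path/to/volume.nii.gz\"\nfor index in range(fetch_total_nifti_slices(path)):\n    slice_array = read_nifti_slice(path, index)  # (height, width, 1)\n    # ... pass slice_array to downstream preprocessing\n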
"},{"location":"user-guide/","title":"User Guide","text":"

Here you can find everything you need to install, understand and interact with eva.

"},{"location":"user-guide/#getting-started","title":"Getting started","text":"

Install eva on your machine and learn how to use eva.

"},{"location":"user-guide/#tutorials","title":"Tutorials","text":"

To familiarize yourself with eva, try out some of our tutorials.

"},{"location":"user-guide/#advanced-user-guide","title":"Advanced user guide","text":"

Get to know eva in more depth by studying our advanced user guides.

"},{"location":"user-guide/advanced/model_wrappers/","title":"Model Wrappers","text":"

This document shows how to use eva's Model Wrapper API (eva.models.networks.wrappers) to load different model formats from a series of sources such as PyTorch Hub, HuggingFace Model Hub and ONNX.

"},{"location":"user-guide/advanced/model_wrappers/#loading-pytorch-models","title":"Loading PyTorch models","text":"

The eva framework is built on top of PyTorch Lightning and thus naturally supports loading PyTorch models. You just need to specify the class path of your model in the backbone section of the .yaml config file.

backbone:\n  class_path: path.to.your.ModelClass\n  init_args:\n    arg_1: ...\n    arg_2: ...\n

Note that your ModelClass should subclass torch.nn.Module and implement the forward() method to return embedding tensors of shape [embedding_dim].
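For illustration, a minimal (hypothetical) ModelClass satisfying this contract could look like the sketch below; any architecture that maps an image batch to one embedding per image works.

import torch\nfrom torch import nn\n\n\nclass ModelClass(nn.Module):\n    \"\"\"Hypothetical backbone returning one embedding per input image.\"\"\"\n\n    def __init__(self, embedding_dim: int = 384) -> None:\n        super().__init__()\n        self.encoder = nn.Sequential(\n            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1),\n            nn.ReLU(),\n            nn.AdaptiveAvgPool2d(1),\n            nn.Flatten(),\n            nn.Linear(32, embedding_dim),\n        )\n\n    def forward(self, tensor: torch.Tensor) -> torch.Tensor:\n        return self.encoder(tensor)  # (batch_size, embedding_dim)\n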

"},{"location":"user-guide/advanced/model_wrappers/#pytorch-hub","title":"PyTorch Hub","text":"

To load models from PyTorch Hub or other torch model providers, the easiest way is to use the ModelFromFunction wrapper class:

backbone:\n  class_path: eva.models.networks.wrappers.ModelFromFunction\n  init_args:\n    path: torch.hub.load\n    arguments:\n      repo_or_dir: facebookresearch/dino:main\n      model: dino_vits16\n      pretrained: false\n    checkpoint_path: path/to/your/checkpoint.torch\n

Note that if a checkpoint_path is provided, ModelFromFunction will automatically initialize the specified model using the provided weights from that checkpoint file.

"},{"location":"user-guide/advanced/model_wrappers/#timm","title":"timm","text":"

Similar to the above example, we can easily load models using the common vision library timm:

backbone:\n  class_path: eva.models.networks.wrappers.ModelFromFunction\n  init_args:\n    path: timm.create_model\n    arguments:\n      model_name: resnet18\n      pretrained: true\n

"},{"location":"user-guide/advanced/model_wrappers/#loading-models-from-huggingface-hub","title":"Loading models from HuggingFace Hub","text":"

For loading models from HuggingFace Hub, eva provides a custom wrapper class HuggingFaceModel which can be used as follows:

backbone:\n  class_path: eva.models.networks.wrappers.HuggingFaceModel\n  init_args:\n    model_name_or_path: owkin/phikon\n    tensor_transforms: \n      class_path: eva.models.networks.transforms.ExtractCLSFeatures\n

In the above example, the forward pass implemented by the owkin/phikon model returns an output tensor containing the hidden states of all input tokens. In order to extract the state corresponding to the CLS token only, we can specify a transformation via the tensor_transforms argument which will be applied to the model output.
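Conceptually, such a transform simply keeps the first (CLS) token from the hidden-states tensor; a minimal sketch (not the actual eva implementation) could look like this:

import torch\n\n\ndef extract_cls_features(hidden_states: torch.Tensor) -> torch.Tensor:\n    \"\"\"Keeps only the CLS token embedding from a (batch, n_tokens, dim) tensor.\"\"\"\n    return hidden_states[:, 0, :]\n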

"},{"location":"user-guide/advanced/model_wrappers/#loading-onnx-models","title":"Loading ONNX models","text":"

.onnx model checkpoints can be loaded using the ONNXModel wrapper class as follows:

class_path: eva.models.networks.wrappers.ONNXModel\ninit_args:\n  path: path/to/model.onnx\n  device: cuda\n
"},{"location":"user-guide/advanced/model_wrappers/#implementing-custom-model-wrappers","title":"Implementing custom model wrappers","text":"

You can also implement your own model wrapper classes, in case your model format is not supported by the wrapper classes that eva already provides. To do so, you need to subclass eva.models.networks.wrappers.BaseModel and implement the following abstract methods:

You can take the implementations of ModelFromFunction, HuggingFaceModel and ONNXModel wrappers as a reference.
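As an illustrative sketch only (the method names load_model and model_forward are assumptions here; check the BaseModel source for the exact abstract interface), a custom wrapper could be structured as follows:

import torch\nfrom eva.models.networks import wrappers\n\n\nclass MyCustomModel(wrappers.BaseModel):\n    \"\"\"Sketch of a wrapper for a model format not covered out of the box.\"\"\"\n\n    def __init__(self, checkpoint_path: str) -> None:\n        super().__init__()\n        self._checkpoint_path = checkpoint_path\n        self.load_model()\n\n    def load_model(self) -> None:\n        # Assumed abstract method: build the model and load the checkpoint weights.\n        self._model = torch.load(self._checkpoint_path)\n        self._model.eval()\n\n    def model_forward(self, tensor: torch.Tensor) -> torch.Tensor:\n        # Assumed abstract method: return the embeddings produced by the wrapped model.\n        return self._model(tensor)\n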

"},{"location":"user-guide/advanced/replicate_evaluations/","title":"Replicate evaluations","text":"

To produce the evaluation results presented here, you can run eva with the settings below.

Make sure to replace <task> in the commands below with bach, crc, mhist or patch_camelyon.

Note that to run the commands below you will need to first download the data. BACH, CRC and PatchCamelyon provide automatic download by setting the argument download: true in their respective config-files. In the case of MHIST you will need to download the data manually by following the instructions provided here.

"},{"location":"user-guide/advanced/replicate_evaluations/#dino-vit-s16-random-weights","title":"DINO ViT-S16 (random weights)","text":"

Evaluating the backbone with randomly initialized weights serves as a baseline to compare the pretrained FMs to an FM that produces embeddings without any prior learning on image tasks. To evaluate, run:

PRETRAINED=false \\\nEMBEDDINGS_ROOT=\"./data/embeddings/dino_vits16_random\" \\\neva predict_fit --config configs/vision/dino_vit/offline/<task>.yaml\n
"},{"location":"user-guide/advanced/replicate_evaluations/#dino-vit-s16-imagenet","title":"DINO ViT-S16 (ImageNet)","text":"

The next baseline model uses a pretrained ViT-S16 backbone with ImageNet weights. To evaluate, run:

EMBEDDINGS_ROOT=\"./data/embeddings/dino_vits16_imagenet\" \\\neva predict_fit --config configs/vision/dino_vit/offline/<task>.yaml\n
"},{"location":"user-guide/advanced/replicate_evaluations/#dino-vit-b8-imagenet","title":"DINO ViT-B8 (ImageNet)","text":"

To evaluate performance on the larger ViT-B8 backbone pretrained on ImageNet, run:

EMBEDDINGS_ROOT=\"./data/embeddings/dino_vitb8_imagenet\" \\\nDINO_BACKBONE=dino_vitb8 \\\nIN_FEATURES=768 \\\neva predict_fit --config configs/vision/dino_vit/offline/<task>.yaml\n

"},{"location":"user-guide/advanced/replicate_evaluations/#dinov2-vit-l14-imagenet","title":"DINOv2 ViT-L14 (ImageNet)","text":"

To evaluate performance on the DINOv2 ViT-L14 backbone pretrained on ImageNet, run:

PRETRAINED=true \\\nEMBEDDINGS_ROOT=\"./data/embeddings/dinov2_vitl14_kaiko\" \\\nREPO_OR_DIR=facebookresearch/dinov2:main \\\nDINO_BACKBONE=dinov2_vitl14_reg \\\nFORCE_RELOAD=true \\\nIN_FEATURES=1024 \\\neva predict_fit --config configs/vision/dino_vit/offline/<task>.yaml\n

"},{"location":"user-guide/advanced/replicate_evaluations/#lunit-dino-vit-s16-tcga","title":"Lunit - DINO ViT-S16 (TCGA)","text":"

Lunit released the weights for a DINO ViT-S16 backbone, pretrained on TCGA data, on GitHub. To evaluate, run:

PRETRAINED=false \\\nEMBEDDINGS_ROOT=\"./data/embeddings/dino_vits16_lunit\" \\\nCHECKPOINT_PATH=\"https://github.com/lunit-io/benchmark-ssl-pathology/releases/download/pretrained-weights/dino_vit_small_patch16_ep200.torch\" \\\nNORMALIZE_MEAN=[0.70322989,0.53606487,0.66096631] \\\nNORMALIZE_STD=[0.21716536,0.26081574,0.20723464] \\\neva predict_fit --config configs/vision/dino_vit/offline/<task>.yaml\n
"},{"location":"user-guide/advanced/replicate_evaluations/#owkin-ibot-vit-b16-tcga","title":"Owkin - iBOT ViT-B16 (TCGA)","text":"

Owkin released the weights for \"Phikon\", an FM trained with iBOT on TCGA data, via HuggingFace. To evaluate, run:

EMBEDDINGS_ROOT=\"./data/embeddings/dino_vitb16_owkin\" \\\neva predict_fit --config configs/vision/owkin/phikon/offline/<task>.yaml\n

Note: since eva provides dedicated config files for evaluating tasks with the Phikon FM in \"configs/vision/owkin/phikon/offline\", it is not necessary to set the environment variables used in the runs above.

"},{"location":"user-guide/advanced/replicate_evaluations/#uni-dinov2-vit-l16-mass-100k","title":"UNI - DINOv2 ViT-L16 (Mass-100k)","text":"

The UNI FM, introduced in [1], is available on HuggingFace. Note that access needs to be requested.

Unlike the other FMs evaluated for our leaderboard, the UNI model is loaded through the vision library timm. To accommodate this, you will need to modify the config files (see also Model Wrappers).

Make a copy of the task-config you'd like to run, and replace the backbone section with:

backbone:\n    class_path: eva.models.ModelFromFunction\n    init_args:\n        path: timm.create_model\n        arguments:\n            model_name: vit_large_patch16_224\n            patch_size: 16\n            init_values: 1e-5\n            num_classes: 0\n            dynamic_img_size: true\n        checkpoint_path: <path/to/pytorch_model.bin>\n

Now evaluate the model by running:

EMBEDDINGS_ROOT=\"./data/embeddings/dinov2_vitl16_uni\" \\\nIN_FEATURES=1024 \\\neva predict_fit --config path/to/<task>.yaml\n

"},{"location":"user-guide/advanced/replicate_evaluations/#kaikoai-dino-vit-s16-tcga","title":"kaiko.ai - DINO ViT-S16 (TCGA)","text":"

To evaluate kaiko.ai's FM with a DINO ViT-S16 backbone, pretrained on TCGA data and released on GitHub, run:

PRETRAINED=false \\\nEMBEDDINGS_ROOT=\"./data/embeddings/dino_vits16_kaiko\" \\\nCHECKPOINT_PATH=[TBD*] \\\nNORMALIZE_MEAN=[0.5,0.5,0.5] \\\nNORMALIZE_STD=[0.5,0.5,0.5] \\\neva predict_fit --config configs/vision/dino_vit/offline/<task>.yaml\n

* path to public checkpoint will be added when available.

"},{"location":"user-guide/advanced/replicate_evaluations/#kaikoai-dino-vit-s8-tcga","title":"kaiko.ai - DINO ViT-S8 (TCGA)","text":"

To evaluate kaiko.ai's FM with a DINO ViT-S8 backbone, pretrained on TCGA data and released on GitHub, run:

PRETRAINED=false \\\nEMBEDDINGS_ROOT=\"./data/embeddings/dino_vits8_kaiko\" \\\nDINO_BACKBONE=dino_vits8 \\\nCHECKPOINT_PATH=[TBD*] \\\nNORMALIZE_MEAN=[0.5,0.5,0.5] \\\nNORMALIZE_STD=[0.5,0.5,0.5] \\\neva predict_fit --config configs/vision/dino_vit/offline/<task>.yaml\n

* path to public checkpoint will be added when available.

"},{"location":"user-guide/advanced/replicate_evaluations/#kaikoai-dino-vit-b16-tcga","title":"kaiko.ai - DINO ViT-B16 (TCGA)","text":"

To evaluate kaiko.ai's FM with the larger DINO ViT-B16 backbone, pretrained on TCGA data, run:

PRETRAINED=false \\\nEMBEDDINGS_ROOT=\"./data/embeddings/dino_vitb16_kaiko\" \\\nDINO_BACKBONE=dino_vitb16 \\\nCHECKPOINT_PATH=[TBD*] \\\nIN_FEATURES=768 \\\nNORMALIZE_MEAN=[0.5,0.5,0.5] \\\nNORMALIZE_STD=[0.5,0.5,0.5] \\\neva predict_fit --config configs/vision/dino_vit/offline/<task>.yaml\n

* path to public checkpoint will be added when available.

"},{"location":"user-guide/advanced/replicate_evaluations/#kaikoai-dino-vit-b8-tcga","title":"kaiko.ai - DINO ViT-B8 (TCGA)","text":"

To evaluate kaiko.ai's FM with the larger DINO ViT-B8 backbone, pretrained on TCGA data, run:

PRETRAINED=false \\\nEMBEDDINGS_ROOT=\"./data/embeddings/dino_vitb8_kaiko\" \\\nDINO_BACKBONE=dino_vitb8 \\\nCHECKPOINT_PATH=[TBD*] \\\nIN_FEATURES=768 \\\nNORMALIZE_MEAN=[0.5,0.5,0.5] \\\nNORMALIZE_STD=[0.5,0.5,0.5] \\\neva predict_fit --config configs/vision/dino_vit/offline/<task>.yaml\n

* path to public checkpoint will be added when available.

"},{"location":"user-guide/advanced/replicate_evaluations/#kaikoai-dinov2-vit-l14-tcga","title":"kaiko.ai - DINOv2 ViT-L14 (TCGA)","text":"

To evaluate kaiko.ai's FM with the larger DINOv2 ViT-L14 backbone, pretrained on TCGA data, run:

PRETRAINED=false \\\nEMBEDDINGS_ROOT=\"./data/embeddings/dinov2_vitl14_kaiko\" \\\nREPO_OR_DIR=facebookresearch/dinov2:main \\\nDINO_BACKBONE=dinov2_vitl14_reg \\\nFORCE_RELOAD=true \\\nCHECKPOINT_PATH=[TBD*] \\\nIN_FEATURES=1024 \\\nNORMALIZE_MEAN=[0.5,0.5,0.5] \\\nNORMALIZE_STD=[0.5,0.5,0.5] \\\neva predict_fit --config configs/vision/dino_vit/offline/<task>.yaml\n

* path to public checkpoint will be added when available.

"},{"location":"user-guide/advanced/replicate_evaluations/#references","title":"References","text":"

[1] Chen et al.: A General-Purpose Self-Supervised Model for Computational Pathology, 2023 (arXiv)

"},{"location":"user-guide/getting-started/how_to_use/","title":"How to use eva","text":"

Before starting to use eva, it's important to get familiar with the different workflows, subcommands and configurations.

"},{"location":"user-guide/getting-started/how_to_use/#eva-subcommands","title":"eva subcommands","text":"

To run an evaluation, we call:

eva <subcommand> --config <path-to-config-file>\n

The eva interface supports the subcommands: predict, fit and predict_fit.

"},{"location":"user-guide/getting-started/how_to_use/#online-vs-offline-workflows","title":"* online vs. offline workflows","text":"

We distinguish between the online and offline workflow:

The online workflow can be used to quickly run a complete evaluation without saving and tracking embeddings. The offline workflow runs faster (only one FM-backbone forward pass) and is ideal for experimenting with different decoders on the same FM-backbone.

"},{"location":"user-guide/getting-started/how_to_use/#run-configurations","title":"Run configurations","text":""},{"location":"user-guide/getting-started/how_to_use/#config-files","title":"Config files","text":"

The setup for an eva run is provided in a .yaml config file which is defined with the --config flag.

A config file specifies the setup for the trainer (including the callback for the model backbone), the model (the setup of the trainable decoder) and the data module.

You can find the config files for the datasets and models that eva supports out of the box on GitHub. We recommend that you inspect some of them to get a better understanding of their structure and content.

"},{"location":"user-guide/getting-started/how_to_use/#environment-variables","title":"Environment variables","text":"

To customize runs without creating custom config files, you can overwrite the config parameters listed below by setting them as environment variables.

OUTPUT_ROOT (str): The directory to store logging outputs and evaluation results
EMBEDDINGS_ROOT (str): The directory to store the computed embeddings
CHECKPOINT_PATH (str): Path to the FM-checkpoint to be evaluated
IN_FEATURES (int): The input feature dimension (embedding)
NUM_CLASSES (int): Number of classes for classification tasks
N_RUNS (int): Number of fit runs to perform in a session, defaults to 5
MAX_STEPS (int): Maximum number of training steps (if early stopping is not triggered)
BATCH_SIZE (int): Batch size for a training step
PREDICT_BATCH_SIZE (int): Batch size for a predict step
LR_VALUE (float): Learning rate for training the decoder
MONITOR_METRIC (str): The metric to monitor for early stopping and final model checkpoint loading
MONITOR_METRIC_MODE (str): \"min\" or \"max\", depending on the MONITOR_METRIC used
REPO_OR_DIR (str): GitHub repo containing the model implementation, e.g. \"facebookresearch/dino:main\"
DINO_BACKBONE (str): Backbone model architecture if a facebookresearch/dino FM is evaluated
FORCE_RELOAD (bool): Whether to force a fresh download of the GitHub repo unconditionally
PRETRAINED (bool): Whether to load FM-backbone weights from a pretrained model"},{"location":"user-guide/getting-started/installation/","title":"Installation","text":"
pip install \"kaiko-eva[vision]\"\n
"},{"location":"user-guide/getting-started/installation/#run-eva","title":"Run eva","text":"

Now you are all set and you can start running eva with:

eva <subcommand> --config <path-to-config-file>\n
To learn how the subcommands and configs work, we recommend you familiarize yourself with How to use eva and then proceed to running eva with the Tutorials.

"},{"location":"user-guide/tutorials/evaluate_resnet/","title":"Train and evaluate a ResNet","text":"

If you read How to use eva and followed the Tutorials to this point, you might ask yourself why you would not always use the offline workflow to run a complete evaluation. An offline run stores the computed embeddings and runs faster than the online workflow, which computes a backbone forward pass in every epoch.

One use case for the online workflow is the evaluation of a supervised ML model that does not rely on a backbone/head architecture. To demonstrate this, let's train a ResNet 18 from PyTorch Image Models (timm).

To do this, we need to create a new config file:

Now let's adapt the new bach.yaml config to the new model:

head:\n  class_path: eva.models.ModelFromFunction\n  init_args:\n    path: timm.create_model\n    arguments:\n      model_name: resnet18\n      num_classes: &NUM_CLASSES 4\n      drop_rate: 0.0\n      pretrained: false\n
To reduce training time, let's overwrite some of the default parameters. Run the training & evaluation with:
OUTPUT_ROOT=logs/resnet/bach \\\nMAX_STEPS=50 \\\nLR_VALUE=0.01 \\\neva fit --config configs/vision/resnet18/bach.yaml\n
Once the run is complete, take a look at the results in logs/resnet/bach/<session-id>/results.json and check out the tensorboard with tensorboard --logdir logs/resnet/bach. How does the performance compare to the results observed in the previous tutorials?

"},{"location":"user-guide/tutorials/offline_vs_online/","title":"Offline vs. online evaluations","text":"

In this tutorial we run eva with the three subcommands predict, fit and predict_fit, and take a look at the difference between offline and online workflows.

"},{"location":"user-guide/tutorials/offline_vs_online/#before-you-start","title":"Before you start","text":"

If you haven't downloaded the config files yet, please download them from GitHub.

For this tutorial we use the BACH classification task which is available on Zenodo and is distributed under Attribution-NonCommercial-ShareAlike 4.0 International license.

To let eva automatically handle the dataset download, you can open configs/vision/dino_vit/offline/bach.yaml and set download: true. Before doing so, please make sure that your use case is compliant with the dataset license.

"},{"location":"user-guide/tutorials/offline_vs_online/#offline-evaluations","title":"Offline evaluations","text":""},{"location":"user-guide/tutorials/offline_vs_online/#1-compute-the-embeddings","title":"1. Compute the embeddings","text":"

First, let's use the predict-command to download the data and compute embeddings. In this example we use a randomly initialized dino_vits16 as backbone.

Open a terminal in the folder where you installed eva and run:

PRETRAINED=false \\\nEMBEDDINGS_ROOT=./data/embeddings/dino_vits16_random \\\neva predict --config configs/vision/dino_vit/offline/bach.yaml\n

Executing this command will:

Once the session is complete, verify that:

"},{"location":"user-guide/tutorials/offline_vs_online/#2-evaluate-the-fm","title":"2. Evaluate the FM","text":"

Now we can use the fit-command to evaluate the FM on the precomputed embeddings.

To ensure a quick run for the purpose of this exercise, we overwrite some of the default parameters. Run eva to fit the decoder classifier with:

N_RUNS=2 \\\nMAX_STEPS=20 \\\nLR_VALUE=0.1 \\\neva fit --config configs/vision/dino_vit/offline/bach.yaml\n

Executing this command will:

Once the session is complete:

"},{"location":"user-guide/tutorials/offline_vs_online/#3-run-a-complete-offline-workflow","title":"3. Run a complete offline-workflow","text":"

With the predict_fit-command, the two steps above can be executed with one command. Let's do this, but this time with an FM pretrained on ImageNet.

Go back to the terminal and execute:

N_RUNS=1 \\\nMAX_STEPS=20 \\\nLR_VALUE=0.1 \\\nPRETRAINED=true \\\nEMBEDDINGS_ROOT=./data/embeddings/dino_vits16_pretrained \\\neva predict_fit --config configs/vision/dino_vit/offline/bach.yaml\n

Once the session is complete, inspect the evaluation results as you did in Step 2. Compare the performance metrics and training curves. Can you observe better performance with the ImageNet pretrained encoder?

"},{"location":"user-guide/tutorials/offline_vs_online/#online-evaluations","title":"Online evaluations","text":"

As an alternative to the offline workflow from Step 3, a complete evaluation can also be run online. In this case we don't save and track embeddings; instead, we fit the ML model (encoder with frozen layers + trainable decoder) directly on the given task.

As in Step 3 above, we again use a dino_vits16 pretrained on ImageNet.

Run a complete online workflow with the following command:

N_RUNS=1 \\\nMAX_STEPS=20 \\\nLR_VALUE=0.1 \\\nPRETRAINED=true \\\neva fit --config configs/vision/dino_vit/online/bach.yaml\n

Executing this command will:

Once the run is complete:

"}]} \ No newline at end of file diff --git a/dev/user-guide/getting-started/how_to_use/index.html b/dev/user-guide/getting-started/how_to_use/index.html index c84725fe..9fe22ed1 100644 --- a/dev/user-guide/getting-started/how_to_use/index.html +++ b/dev/user-guide/getting-started/how_to_use/index.html @@ -2251,7 +2251,7 @@

Run configurations

Config files

The setup for an eva run is provided in a .yaml config file which is defined with the --config flag.

A config file specifies the setup for the trainer (including callback for the model backbone), the model (setup of the trainable decoder) and data module.

- The config files for the datasets and models that eva supports out of the box, you can find on GitHub (scroll to the bottom of the page). We recommend that you inspect some of them to get a better understanding of their structure and content.

+ The config files for the datasets and models that eva supports out of the box, you can find on GitHub. We recommend that you inspect some of them to get a better understanding of their structure and content.

Environment variables

To customize runs, without the need of creating custom config-files, you can overwrite the config-parameters listed below by setting them as environment variables.

diff --git a/dev/user-guide/tutorials/offline_vs_online/index.html b/dev/user-guide/tutorials/offline_vs_online/index.html index 4e274e32..1bf0db33 100644 --- a/dev/user-guide/tutorials/offline_vs_online/index.html +++ b/dev/user-guide/tutorials/offline_vs_online/index.html @@ -2231,7 +2231,7 @@

Offline vs. online evaluations

In this tutorial we run eva with the three subcommands predict, fit and predict_fit, and take a look at the difference between offline and online workflows.

Before you start

- If you haven't downloaded the config files yet, please download them from GitHub (scroll to the bottom of the page).

+ If you haven't downloaded the config files yet, please download them from GitHub.

For this tutorial we use the BACH classification task which is available on Zenodo and is distributed under Attribution-NonCommercial-ShareAlike 4.0 International license.

To let eva automatically handle the dataset download, you can open configs/vision/dino_vit/offline/bach.yaml and set download: true. Before doing so, please make sure that your use case is compliant with the dataset license.

Offline evaluations