Gregory Holste, Evangelos K. Oikonomou, Zhangyang Wang, Rohan Khera
[Preprint] [Project Page]
PanEcho is a view-agnostic, multi-task model that performs 39 reporting tasks from echocardiogram videos. The model consists of an image encoder, temporal frame Transformer, and task-specific output heads. Because PanEcho is view-agnostic, it can leverage multi-view echocardiography to integrate information across views. At test time, predictions are aggregated across echocardiograms acquired during the same study to generate study-level predictions for each task.
The pretrained model is available with a simple one-line import using PyTorch Hub:
import torch
# Import PanEcho
model = torch.hub.load('CarDS-Yale/PanEcho', 'PanEcho', force_reload=True)
# Demo inference on random video input
x = torch.rand(1, 3, 16, 224, 224)
print(model(x))
Input: PyTorch Tensor of shape (batch_size, 3, 16, 224, 224) containing a 16-frame echocardiogram video clip at 224 x 224 resolution that has been ImageNet-normalized. See our data loader for implementation details, though you are free to preprocess however you like.
Output: Dictionary whose keys are the tasks (see here for a breakdown of all tasks) and whose values are the predictions. The model outputs class-wise probabilities for classification tasks and direct estimates for regression tasks.
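If you prepare inputs yourself rather than using our data loader, the sketch below shows one way to build a valid input tensor. It assumes your video is a uint8 NumPy array of shape (num_frames, height, width, 3); the uniform frame sampling and bilinear resizing here are illustrative choices, so consult our data loader for the reference implementation.
import numpy as np
import torch
import torch.nn.functional as F

IMAGENET_MEAN = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1, 1)
IMAGENET_STD = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1, 1)

def preprocess(video: np.ndarray, clip_len: int = 16) -> torch.Tensor:
    # Uniformly sample clip_len frames across the video
    idx = np.linspace(0, len(video) - 1, clip_len).round().astype(int)
    clip = torch.from_numpy(video[idx]).float() / 255.0  # (T, H, W, 3) in [0, 1]
    clip = clip.permute(3, 0, 1, 2)  # (3, T, H, W)
    # Resize each frame to 224 x 224
    clip = F.interpolate(clip, size=(224, 224), mode='bilinear', align_corners=False)
    # ImageNet normalization
    clip = (clip - IMAGENET_MEAN) / IMAGENET_STD
    return clip.unsqueeze(0)  # (1, 3, clip_len, 224, 224)

x = preprocess(np.random.randint(0, 256, (50, 300, 400, 3), dtype=np.uint8))
print(model(x))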
If you want to run inference on video clips of more or fewer than 16 frames, you can control this through the clip_len argument:
import torch
# Import PanEcho
model = torch.hub.load('CarDS-Yale/PanEcho', 'PanEcho', force_reload=True, clip_len=32)
# Demo inference on random 32-frame video input
x = torch.rand(1, 3, 32, 224, 224)
print(model(x))
We recommend 16-frame video clips when using PanEcho directly for inference, consistent with how the model was trained and evaluated. However, feel free to experiment with video length, particularly if you are fine-tuning PanEcho on your own data (see "Transfer Learning" below).
If you want to use PanEcho but are only interested in a subset of tasks, you can easily select your tasks of interest as follows. Note: Be sure to consult this table to select the desired task names.
# Import PanEcho, but remove unwanted task heads
tasks = ['EF', 'LVSystolicFunction', 'AVStenosis']
model = torch.hub.load('CarDS-Yale/PanEcho', 'PanEcho', force_reload=True, tasks=tasks)
# Demo inference on random video input
x = torch.rand(1, 3, 16, 224, 224)
print(model(x)) # predictions only for EF, LVSystolicFunction, & AVStenosis
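As noted in the overview, study-level predictions are generated by aggregating video-level predictions across echocardiograms from the same study. The sketch below assumes simple mean aggregation over the output dictionaries; see the repository for the exact aggregation used in the paper.
import torch

model.eval()

# Hypothetical preprocessed clips from the same study, each (1, 3, 16, 224, 224)
clips = [torch.rand(1, 3, 16, 224, 224) for _ in range(3)]

with torch.no_grad():
    preds = [model(x) for x in clips]  # one output dictionary per video

# Average each task's output across videos of the study
study_preds = {task: torch.stack([p[task] for p in preds]).mean(dim=0) for task in preds[0]}
print(study_preds['EF'])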
To fine-tune PanEcho on your own data, we provide two options: (1) the full video encoder backbone without task-specific output heads, or (2) the image encoder alone.
# Import PanEcho backbone w/o task-specific output heads
model = torch.hub.load('CarDS-Yale/PanEcho', 'PanEcho', force_reload=True, backbone_only=True)
# Demo inference on random video input
x = torch.rand(1, 3, 16, 224, 224)
print(model(x)) # (1, 768) video embedding
Input: PyTorch Tensor of shape (batch_size, 3, 16, 224, 224) containing a 16-frame echocardiogram video clip, as described earlier.
Output: PyTorch Tensor of shape (batch_size, 768) containing an embedding (learned representation) of the video.
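For example, a new task head can be trained on top of the 768-dimensional video embedding. The sketch below is illustrative: the binary head, loss, learning rate, and random data are placeholders for your own task and data.
import torch
import torch.nn as nn

# Load the PanEcho backbone (video encoder without task heads)
backbone = torch.hub.load('CarDS-Yale/PanEcho', 'PanEcho', force_reload=True, backbone_only=True)

# Hypothetical new head for a binary classification task
head = nn.Linear(768, 1)
optimizer = torch.optim.AdamW(list(backbone.parameters()) + list(head.parameters()), lr=1e-4)
criterion = nn.BCEWithLogitsLoss()

# One illustrative training step on random data
x = torch.rand(8, 3, 16, 224, 224)  # batch of 16-frame clips
y = torch.randint(0, 2, (8, 1)).float()  # binary labels

logits = head(backbone(x))  # (8, 768) -> (8, 1)
loss = criterion(logits, y)
loss.backward()
optimizer.step()
optimizer.zero_grad()
print(loss.item())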
# Import PanEcho image encoder w/o temporal Transformer or task-specific output heads
model = torch.hub.load('CarDS-Yale/PanEcho', 'PanEcho', force_reload=True, image_encoder_only=True)
# Demo inference on random image input
x = torch.rand(1, 3, 224, 224)
print(model(x)) # (1, 768) image embedding
Input: PyTorch Tensor of shape (batch_size, 3, 224, 224) containing an echocardiogram image (one video frame).
Output: PyTorch Tensor of shape (batch_size, 768) containing an embedding (learned representation) of the image.
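If you prefer to add your own temporal modeling, the image encoder can embed frames independently. The sketch below mean-pools frame embeddings into a single video-level representation; the pooling is an illustrative stand-in for PanEcho's temporal Transformer, not a replication of it.
import torch

# Load the PanEcho image encoder only
encoder = torch.hub.load('CarDS-Yale/PanEcho', 'PanEcho', force_reload=True, image_encoder_only=True)

# Hypothetical 16-frame clip: embed each frame as a batch of images
frames = torch.rand(16, 3, 224, 224)
with torch.no_grad():
    frame_embs = encoder(frames)  # (16, 768)

# Mean-pool over time for a simple video-level representation
video_emb = frame_embs.mean(dim=0, keepdim=True)  # (1, 768)
print(video_emb.shape)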
Note: This repository is intended for use on a Linux machine with access to at least one NVIDIA GPU. While PanEcho was trained on NVIDIA A100 GPUs, the codebase is expected to function with virtually any recent NVIDIA GPU. Specifically, PanEcho was developed on a Linux machine running Ubuntu 22.04.4 LTS with 8 NVIDIA A100 GPUs. All code and the following demo were also tested on an older Linux machine running Ubuntu 20.04.6 LTS with 4 NVIDIA RTX 3090 GPUs, with identical results.
On a Linux machine with Anaconda or Miniconda installed, run the following at the command line. Installation of required packages may take a few minutes.
# Navigate to the repository
cd PanEcho/ # if you are one directory above
# Install required packages
conda env create -f content/panecho.yml # this can take up to 5-10 minutes
# Activate environment
conda activate panecho
While the YNHHS cohort used to develop PanEcho is not currently available for sharing due to IRB restrictions, the following commands were used to produce the results in the paper:
# Navigate to proper directory
cd src/
# Train PanEcho with distributed training across 8 GPUs
torchrun --nproc_per_node=8 train.py --output_dir panecho --model_name frame_transformer --arch convnext_tiny --fc_dropout 0.25 --normalization imagenet --augment --reduce_lr 3 --amp
### EchoNet-Dynamic fine-tuning experiments ###
# Fine-tune PanEcho on EchoNet-Dynamic
torchrun --nproc_per_node=8 train_echonetdynamic.py --model_dir_path panecho --output_dir echonetdynamic_ft --model_name frame_transformer --arch convnext_tiny --normalization imagenet --amp
# Fine-tune a randomly initialized PanEcho architecture
torchrun --nproc_per_node=8 train_echonetdynamic.py --output_dir echonetdynamic_ft --rand_init --model_name frame_transformer --arch convnext_tiny --normalization '' --amp
# Fine-tune an ImageNet-pretrained PanEcho architecture
torchrun --nproc_per_node=8 train_echonetdynamic.py --output_dir echonetdynamic_ft --model_name frame_transformer --arch convnext_tiny --normalization imagenet --amp
# Fine-tune an EchoCLIP-pretrained PanEcho architecture
torchrun --nproc_per_node=8 train_echonetdynamic.py --output_dir echonetdynamic_ft --model_name frame_transformer --arch echo-clip --normalization echo-clip --amp
# Fine-tune a Kinetics-400-pretrained 3DResNet18
torchrun --nproc_per_node=8 train_echonetdynamic.py --output_dir echonetdynamic_ft --model_name 3dresnet18 --normalization kinetics --amp
### EchoNet-Pediatric fine-tuning experiments ###
# Simply replace "echonetdynamic" with "echonetpediatric" in the previous five commands
MLA:
Holste, Gregory, et al. "PanEcho: Complete AI-enabled echocardiography interpretation with multi-task deep learning." medRxiv (2024): 2024-11.
BibTeX:
@article{holste2024panecho,
title={PanEcho: Complete AI-enabled echocardiography interpretation with multi-task deep learning},
author={Holste, Gregory and Oikonomou, Evangelos K and Wang, Zhangyang and Khera, Rohan},
journal={medRxiv},
pages={2024--11},
year={2024},
publisher={Cold Spring Harbor Laboratory Press}
}
Reach out to Greg Holste ([email protected]) and Rohan Khera ([email protected]) with any questions.