Video super-resolution (VSR) aims to restore a sequence of high-resolution (HR) frames from their low-resolution (LR) counterparts. Although some progress has been made, effectively utilizing temporal dependency across entire video sequences remains a grand challenge. Existing approaches usually align and aggregate information from a limited number of adjacent frames (e.g., 5 or 7), which prevents them from achieving satisfactory results. In this paper, we take one step further to enable effective spatio-temporal learning in videos. We propose a novel Trajectory-aware Transformer for Video Super-Resolution (TTVSR). In particular, we formulate video frames into several pre-aligned trajectories that consist of continuous visual tokens. For a query token, self-attention is learned only on relevant visual tokens along spatio-temporal trajectories. Compared with vanilla vision Transformers, such a design significantly reduces the computational cost and enables Transformers to model long-range features. We further propose a cross-scale feature tokenization module to overcome the scale-changing problem that often occurs in long-range videos. Extensive quantitative and qualitative evaluations on four widely-used video super-resolution benchmarks demonstrate the superiority of the proposed TTVSR over state-of-the-art models.
Paper: Learning Trajectory-Aware Transformer for Video Super-Resolution
Reference github repository (PyTorch)
After the export to MindSpore, the computation graph was optimized for best performance in both the training and inference phases.
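As a rough illustration of the trajectory-restricted attention described in the abstract, here is a minimal sketch (not the authors' implementation; it uses PyTorch tensors for brevity, mirroring the reference repository, simplifies away the hard/soft attention split, and all names are placeholders): for a query token, attention is computed only over the tokens gathered along its pre-computed trajectory, i.e., over T tokens instead of T x H x W.

```python
import torch
import torch.nn.functional as F

def trajectory_attention(query, tokens, trajectory):
    """Sketch of attention restricted to a single trajectory.

    query:      (C,)        feature of the query token in the current frame
    tokens:     (T, H*W, C) visual tokens of all T frames
    trajectory: (T,)        spatial index of the trajectory location in each frame
    Returns a (C,) aggregated feature.
    """
    # Gather only the tokens lying on the query's trajectory: (T, C)
    keys = tokens[torch.arange(tokens.size(0)), trajectory]
    # Attention scores over T trajectory tokens instead of all T*H*W tokens
    scores = keys @ query / query.size(0) ** 0.5   # (T,)
    weights = F.softmax(scores, dim=0)             # (T,)
    return weights @ keys                          # (C,)

# Toy usage: 50 frames of 64x64 tokens with 16 channels (shapes are illustrative)
tokens = torch.randn(50, 64 * 64, 16)
query = torch.randn(16)
trajectory = torch.randint(0, 64 * 64, (50,))
print(trajectory_attention(query, tokens, trajectory).shape)  # torch.Size([16])
```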
- Training set
- REDS dataset. We regroup the training and validation datasets into one folder. The original training dataset contains 240 clips numbered 000 to 239; the clips of the original validation dataset were renamed to 240 to 269 (a regrouping sketch follows the directory layout below).
- Make the REDS structure be:

  ├────REDS
  │    ├────trainval_sharp
  │    │    ├────000
  │    │    ├────...
  │    │    ├────269
  │    ├────trainval_sharp_bicubic
  │    │    ├────X4
  │    │    │    ├────000
  │    │    │    ├────...
  │    │    │    ├────269
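A minimal sketch of the regrouping step, assuming the original REDS archives were extracted into `REDS/train_sharp` and `REDS/val_sharp` (the source folder names and the use of `shutil.move` are assumptions; adapt them to your layout):

```python
import shutil
from pathlib import Path

# Assumed locations of the extracted REDS archives; adjust to your setup.
src_train = Path("REDS/train_sharp")   # clips 000-239
src_val = Path("REDS/val_sharp")       # clips 000-029, to be renamed 240-269
dst = Path("REDS/trainval_sharp")
dst.mkdir(parents=True, exist_ok=True)

# Training clips keep their names.
for clip in sorted(src_train.iterdir()):
    shutil.move(str(clip), str(dst / clip.name))

# Validation clips are renamed to 240-269 and appended.
for clip in sorted(src_val.iterdir()):
    new_name = f"{int(clip.name) + 240:03d}"
    shutil.move(str(clip), str(dst / new_name))

# Repeat the same regrouping for the X4 bicubic LR folders
# so that they end up under REDS/trainval_sharp_bicubic/X4.
```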
- Testing set
- REDS4 dataset. Clips 000, 011, 015, and 020 from the original REDS training set.
- Hardware (GPU)
- Prepare the hardware environment with a GPU processor
- Framework
- For details, see the following resources:
- Additional python packages:
- Install additional packages manually or by running the `pip install -r requirements.txt` command in the model directory.
- Hardware (Ascend)
- Prepare the hardware environment with an Ascend 910 processor (cann_5.1.2, euler_2.8.3, py_3.7)
- Framework
- MindSpore 2.0.0-alpha or later
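A quick sanity check that the prepared environment matches the requirements above (a minimal sketch; choose "GPU" or "Ascend" according to your setup):

```python
import mindspore as ms

print(ms.__version__)  # expected: 2.0.0-alpha or later
# Select the backend you prepared above; use "GPU" for the GPU setup.
ms.set_context(device_target="Ascend")
```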
Parameter | Value |
---|---|
Dataset name | REDS |
Batch size | 2 |
Sequence length (frames) | 50 |
Input patch size | 64x64 |
Upscale | x4 |
Number of iterations | 400000 |
Warm-up steps | 100 |
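For reference, the same settings expressed as a plain Python dictionary (the key names are hypothetical; only the values come from the table above):

```python
# Hypothetical key names; values mirror the training configuration table.
train_cfg = {
    "dataset": "REDS",
    "batch_size": 2,
    "sequence_length": 50,       # frames per training sample
    "patch_size": (64, 64),      # LR input patch
    "upscale": 4,
    "total_iterations": 400_000,
    "warmup_steps": 100,
}
```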
Model | Device type | Device | PSNR (dB) | SSIM | Train time (seconds per iter) | Train time (expected)** |
---|---|---|---|---|---|---|
TTVSR-PyTorch | GPU | 8 x A100 | 32.12 | 0.9021 | 4.2 | 20 days |
TTVSR-MS | GPU | 8 x A100 | 31.97* | 0.9016* | 3.8 | 18 days |
TTVSR-MS-Ascend | Ascend | Ascend-910A | 31.97* | 0.9016* | 15.6 | 72 days |
\* Metrics are measured with the original weights exported from PyTorch.
\*\* Training has not finished yet due to the long training time, but it is expected to reproduce the original result.
Parameter | Value |
---|---|
Dataset name | REDS |
Batch size | 1 |
Sequence length (frames) | 100 |
Input resolution (WxH) | 320x180 |
Output resolution (WxH) | 1280x720 |
Upscale | x4 |
Model | Device type | Device | Inference time (seconds per sequence) |
---|---|---|---|
TTVSR-PyTorch | GPU | V100 | 11 |
TTVSR-MS | GPU | V100 | 7.5 |
TTVSR-MS-Ascend | Ascend | Ascend-910A | 13 |
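Since each sequence contains 100 frames (see the inference configuration above), per-frame latency follows directly from the table; a short sketch of the arithmetic:

```python
# Per-frame latency derived from the inference table (100-frame sequences).
seconds_per_sequence = {
    "TTVSR-PyTorch (V100)": 11.0,
    "TTVSR-MS (V100)": 7.5,
    "TTVSR-MS-Ascend (Ascend-910A)": 13.0,
}
for model, secs in seconds_per_sequence.items():
    print(f"{model}: {secs / 100 * 1000:.0f} ms per 1280x720 output frame")
```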