Add stuff to readme #15

Merged (3 commits) on Nov 4, 2024
78 changes: 75 additions & 3 deletions README.md
@@ -7,11 +7,83 @@
![Coverage](https://img.shields.io/badge/coverage-97%25-brightgreen?logo=codecov)
![Python](https://img.shields.io/badge/python->=3.10-blue?logo=python)

This project encodes behavior videos and, depending on the user's request,
compresses them while ensuring that they are formatted for correct display
across many devices. This may include common image preprocessing steps, such as
gamma encoding, that are necessary for correct display but have to be done
post hoc for our behavior videos.

## Goals

This will attempt to compress videos so that the results:

* Retain the majority of the detail of the input video
* Take up as little space as possible for a target amount of visual detail
* Are in a format that can be widely viewed across devices, and streamed by
browsers
* Have pixel data in a color space that most players can properly display

This video compression is often lossy, and the original videos are not kept, so
this library will attempt to produce the highest-quality video for a target
compression ratio. The _speed_ of this compression is strictly secondary to the
_quality_ of the compression, as measured by the visual detail retained and the
compression ratio. See
[this section](#brief-benchmarks-on-video-compression-with-cpu-based-encoders-and-gpu-based-encoders)
for more details.


Additionally, this package should provide an easy-to-use interface that:

* Presents users with a curated set of compression settings, which have been
rigorously tested in terms of their visual quality using perception-based
metrics like VMAF
* Allows users to provide their own compression settings, if they have
specific requirements

## Non-goals

* Sacrifice the visual fidelity of videos in order to decrease encoding time.

## Usage
- The `BehaviorVideoJob.run_job` method in `transform_videos` should be the
primary method to call for processing video files.
- On a merge to main, this package will be published as a Singularity
container, which can easily be run on a SLURM cluster.

## Brief benchmarks on video compression with CPU-based encoders and GPU-based encoders

A surprising fact is that video encoders implementing the same algorithm, but
written for different compute resources, do _not_ have the same visual
performance: for a given compression ratio, or similar settings, they do not
retain the same amount of visual detail. This is also true for different presets
of the same encoder and compute resource, even if the other settings are
identical. For example, the presets `-preset fast` and `-preset veryslow` of the
encoder `libx264` can produce videos with the same compression ratio but
differing visual quality.
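
The preset comparison described above can be sketched as a pair of `ffmpeg`
invocations that differ only in `-preset`. This is a minimal illustration, not
the settings this package uses: the file names, the helper `x264_command`, and
the 2 Mbit/s target bitrate are all assumptions, and the bitrate is held fixed
only so that the two outputs end up at roughly the same size.

```python
from typing import List


def x264_command(src: str, dst: str, preset: str, bitrate: str = "2M") -> List[str]:
    """Build an ffmpeg argument list for libx264 (hypothetical helper)."""
    return [
        "ffmpeg",
        "-y",                # overwrite the output file if it exists
        "-i", src,           # input video
        "-c:v", "libx264",   # CPU-based H.264 encoder
        "-preset", preset,   # speed vs. compression-efficiency trade-off
        "-b:v", bitrate,     # target video bitrate (assumed value)
        dst,
    ]


# Same input and target bitrate; only the preset (and output name) differs.
fast_cmd = x264_command("input.mp4", "out_fast.mp4", preset="fast")
slow_cmd = x264_command("input.mp4", "out_veryslow.mp4", preset="veryslow")

print(" ".join(fast_cmd))
print(" ".join(slow_cmd))
```

To actually encode, each list can be passed to `subprocess.run(cmd, check=True)`
on a machine where `ffmpeg` with `libx264` is installed.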

This can be seen in the plot below, where the GPU encoder and CPU encoders
retain different amounts of visual detail, as assessed with the
perception-based metric
[VMAF](https://en.wikipedia.org/wiki/Video_Multimethod_Assessment_Fusion). Also
note the difference between presets for the same encoder and compute resource:
_CPU Fast_ and _CPU Slow_.
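
One way to obtain a VMAF score for an encoded video is the `libvmaf` filter in
`ffmpeg`, which compares a distorted video against its reference. The sketch
below only builds the command; the helper name `vmaf_command` and the file names
are assumptions, and it presumes an `ffmpeg` build with `libvmaf` enabled and
two inputs with matching resolution and frame rate. It is not necessarily how
the benchmark in this README was produced.

```python
from typing import List


def vmaf_command(distorted: str, reference: str, log_path: str = "vmaf.json") -> List[str]:
    """Build an ffmpeg argument list that scores `distorted` against `reference`."""
    return [
        "ffmpeg",
        "-i", distorted,   # first input: the compressed video
        "-i", reference,   # second input: the original video
        # Run the libvmaf filter and write per-frame and pooled scores as JSON.
        "-lavfi", f"libvmaf=log_path={log_path}:log_fmt=json",
        "-f", "null", "-",  # discard the decoded output; only the scores matter
    ]


cmd = vmaf_command("out_veryslow.mp4", "input.mp4")
print(" ".join(cmd))
```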

![visual performance vs compress ratio](/assets/compression-vs-quality.png)

This figure shows that for compression ratios greater than 100, it often makes
sense to take your time and use a slow preset of a CPU-based encoder to retain
as much visual information as possible for a given amount of compression.
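
For reference, the compression ratio discussed here can be computed from file
sizes. The sketch below assumes the ratio is defined as original size divided
by compressed size, which is consistent with values above 100 but is not stated
explicitly in this README; the function name and byte counts are hypothetical.

```python
def compression_ratio(original_bytes: int, compressed_bytes: int) -> float:
    """Return original size divided by compressed size (assumed definition)."""
    if original_bytes <= 0 or compressed_bytes <= 0:
        raise ValueError("file sizes must be positive")
    return original_bytes / compressed_bytes


# Hypothetical sizes: a 50 GB source video compressed to 400 MB.
ratio = compression_ratio(50_000_000_000, 400_000_000)
print(ratio)  # 125.0

# Above the threshold of 100 mentioned in the text.
print(ratio > 100)  # True
```

For real files, the byte counts can be read with `os.path.getsize`.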

While it may be tempting to select a faster preset, or a faster compute
resource such as a GPU, for the dramatic speedups shown below, doing so will
degrade the quality of the resulting video.

![throughput vs compress ratio](/assets/compression-vs-speed.png)

Because the outputs of this package are permanent video artifacts, the
compression is lossy, and the intent is to delete the original, taking the CPU
time to produce the highest-quality video possible might well be worth it.


## Development

@@ -84,7 +156,7 @@ The table below, from [semantic release](https://github.com/semantic-release/sem
### Documentation
To generate the rst source files for the documentation, run
```bash
sphinx-apidoc -o doc_template/source/ src
```
Then to create the documentation HTML files, run
```bash
Binary file added assets/compression-vs-quality.png
Binary file added assets/compression-vs-speed.png