diff --git a/README.md b/README.md
index 5197f80..4e1e648 100644
--- a/README.md
+++ b/README.md
@@ -7,11 +7,83 @@
 ![Coverage](https://img.shields.io/badge/coverage-97%25-brightgreen?logo=codecov)
 ![Python](https://img.shields.io/badge/python->=3.10-blue?logo=python)
+This project is used to encode behavior videos and, depending on the user
+request, will compress those videos while ensuring that they are formatted for
+correct display across many devices. This may include common image
+preprocessing steps, such as gamma encoding, that are necessary for correct
+display, but have to be done post-hoc for our behavior videos.
+## Goals
+
+This will attempt to compress videos so that the results:
+
+* Retain the majority of the detail of the input video
+* Take up as little space as possible for a target amount of visual detail
+* Are in a format that can be widely viewed across devices, and streamed by
+  browsers
+* Have pixel data in a color space that most players can properly display
+
+This video compression is often lossy, and the original videos are not kept, so
+this library will attempt to produce the highest-quality video for a target
+compression ratio. The _speed_ of this compression is strictly secondary to the
+_quality_ of the compression, as measured by the visual detail retained and the
+compression ratio. See
+[this section](#brief-benchmarks-on-video-compression-with-cpu-based-encoders-and-gpu-based-encoders)
+for more details.
+
+
+Additionally, this package should provide an easy-to-use interface that:
+
+* Presents users with a curated set of compression settings, which have been
+  rigorously tested in terms of their visual quality using perception-based
+  metrics like VMAF.
+* Allows users to also provide their own compression settings, if they have
+  specific requirements
+
+## Non-goals
+
+* Sacrifice the visual fidelity of videos in order to decrease encoding time.
 ## Usage
- - The BehaviorVideoJob.run_job method in the transform_videos should be the primary method to call for processing video files.
- - On a merge to main, this package will be published as a singularity container, which can easily be run on a SLURM cluster.
+ - The BehaviorVideoJob.run_job method in the transform_videos should be the
+   primary method to call for processing video files.
+ - On a merge to main, this package will be published as a singularity
+   container, which can easily be run on a SLURM cluster.
+
+## Brief benchmarks on video compression with CPU-based encoders and GPU-based encoders
+
+A surprising fact is that video encoders implementing the same algorithm, but
+written for different compute resources, do _not_ have the same visual
+performance; for a given compression ratio, or similar settings, they do not
+retain the same amount of visual detail. This is also true for different presets
+of the same encoder and compute resource even if the other settings are
+identical. For example, the presets `-preset fast` and `-preset veryslow` of the
+encoder `libx264` produce videos with the same compression ratio, but differing
+visual quality.
+
+This can be seen in the plot below, where the GPU encoder and CPU encoders
+retain different amounts of visual detail, as assessed with the visual
+perception-based metric
+[VMAF](https://en.wikipedia.org/wiki/Video_Multimethod_Assessment_Fusion). Also
+note the difference between presets for the same encoder and compute resource:
+_CPU Fast_ and _CPU Slow_.
+
+![visual performance vs compression ratio](/assets/compression-vs-quality.png)
+
+This figure shows that for compression ratios greater than 100, it often makes
+sense to take your time and use a slow preset of a CPU-based encoder to retain
+as much visual information as possible for a given amount of compression.
+
+While it may be tempting to select a faster preset, or a faster compute resource
+like a GPU, for the dramatic speedups shown below, doing so will degrade the
+quality of the resulting video.
+
+![throughput vs compression ratio](/assets/compression-vs-speed.png)
+
+Because the outputs of this package are permanent video artifacts, the
+compression is lossy, and the intent is to delete the original, taking the CPU
+time to produce the highest quality video possible might well be worth it.
+
 ## Development
@@ -84,7 +156,7 @@ The table below, from [semantic release](https://github.com/semantic-release/sem
 ### Documentation
 To generate the rst files source files for documentation, run
 ```bash
-sphinx-apidoc -o doc_template/source/ src
+sphinx-apidoc -o doc_template/source/ src
 ```
 Then to create the documentation HTML files, run
 ```bash
diff --git a/assets/compression-vs-quality.png b/assets/compression-vs-quality.png
new file mode 100644
index 0000000..f491bc7
Binary files /dev/null and b/assets/compression-vs-quality.png differ
diff --git a/assets/compression-vs-speed.png b/assets/compression-vs-speed.png
new file mode 100644
index 0000000..e4177c6
Binary files /dev/null and b/assets/compression-vs-speed.png differ
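
The preset comparison described in the benchmark section of the README can be sketched roughly as follows. This is a minimal illustration, not this package's API: it assumes the encodes are run through the `ffmpeg` CLI with `libx264` at a fixed target bitrate, so that both presets land at roughly the same compression ratio. The helper names `build_x264_command` and `compression_ratio`, the file names, and the 500 kbit/s target are all hypothetical.

```python
# Preset names accepted by libx264, ordered from fastest to slowest.
X264_PRESETS = (
    "ultrafast", "superfast", "veryfast", "faster", "fast",
    "medium", "slow", "slower", "veryslow", "placebo",
)


def build_x264_command(src, dst, preset, bitrate_kbps):
    """Return an ffmpeg argument list for a libx264 encode.

    Hypothetical helper for illustration only; the package's real
    entry point is BehaviorVideoJob.run_job.
    """
    if preset not in X264_PRESETS:
        raise ValueError(f"unknown libx264 preset: {preset!r}")
    return [
        "ffmpeg", "-y",              # overwrite the output if it exists
        "-i", str(src),              # input video
        "-c:v", "libx264",           # CPU-based H.264 encoder
        "-preset", preset,           # speed vs. quality trade-off
        "-b:v", f"{bitrate_kbps}k",  # fixed target bitrate (assumed setup)
        "-pix_fmt", "yuv420p",       # widely supported pixel format
        str(dst),
    ]


def compression_ratio(original_bytes, compressed_bytes):
    """Ratio of original size to compressed size (larger means smaller output)."""
    if compressed_bytes <= 0:
        raise ValueError("compressed size must be positive")
    return original_bytes / compressed_bytes


if __name__ == "__main__":
    # Same bitrate, different presets: similar size, different visual quality.
    for preset in ("fast", "veryslow"):
        cmd = build_x264_command("input.mp4", f"out_{preset}.mp4", preset, 500)
        print(" ".join(cmd))
```

Each returned list can be passed to `subprocess.run` on a machine with `ffmpeg` installed. Comparing the two outputs with a perceptual metric such as VMAF is what separates the presets, since their file sizes should be close.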