Internal Image Benchmarks

Why this benchmark?

This test aims to determine which quality setting and bits-per-pixel (BPP) value work best for each image codec, as well as how the codecs behave when the ideal resolution is not selected. We'll use the SSIMULACRA2 metric, which we consider very reliable, to find the best quality range for each codec.

How does it work?

We'll first encode the image at full resolution using the lowest quality being tested for each codec: CQ 45 for AVIF and Q30 for the codecs with a 0-100 quality scale. The file size of that encode becomes the target. Then we'll increase the quality and decrease the resolution until we have an image at each quality level that matches the target file size, so higher quality images will be lower resolution and vice versa. To compare against the original using the SSIMULACRA2 metric, we'll scale all images back up to the original resolution. Lower resolution images inherently score worse, but so do lower quality images, so each curve peaks somewhere in between. The peak of each generated curve is the best quality level to use when excess resolution is available.
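The procedure above can be sketched as a search for the resize factor that hits the target file size at each quality. This is a minimal illustration rather than our actual tooling: `find_scale_for_size`, `iso_size_plan`, and `toy_encode_size` are hypothetical names, and `toy_encode_size` is a made-up size model standing in for a real encoder call that would resize, encode, and measure the output file.

```python
from typing import Callable, Dict, Iterable


def toy_encode_size(quality: int, scale: float) -> int:
    """Hypothetical stand-in for "resize by `scale`, encode at `quality`,
    return the file size in bytes". Not a model of any real codec."""
    return int(1_000_000 * scale ** 2 * (0.2 + quality / 100))


def find_scale_for_size(
    encode_size: Callable[[int, float], int],
    quality: int,
    target_bytes: int,
    iterations: int = 30,
) -> float:
    """Binary-search the largest scale in (0, 1] whose encoded size does not
    exceed target_bytes. Assumes size grows with scale at a fixed quality."""
    if encode_size(quality, 1.0) <= target_bytes:
        return 1.0  # already fits at full resolution
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if encode_size(quality, mid) > target_bytes:
            hi = mid  # too big: shrink further
        else:
            lo = mid  # fits: try a larger image
    return lo


def iso_size_plan(
    encode_size: Callable[[int, float], int],
    qualities: Iterable[int],
    base_quality: int,
) -> Dict[int, float]:
    """Map each quality to the scale that matches the size of the
    full-resolution encode at base_quality."""
    target = encode_size(base_quality, 1.0)
    return {q: find_scale_for_size(encode_size, q, target) for q in qualities}


plan = iso_size_plan(toy_encode_size, [30, 50, 70, 90], base_quality=30)
for quality, scale in plan.items():
    print(f"quality {quality}: scale {scale:.3f}")
```

Each resulting image would then be scaled back up to the original resolution and scored against the original to produce one point on the curve.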

What do the different curve shapes mean?

A very flat curve indicates that the image codec is not very sensitive to being encoded at the wrong resolution. This means that AUTO-REZ™ technology will have less of an impact and naive implementations will perform better. It also permits a wider quality range when encoding to arbitrary sizes, as an image CDN would. A very peaked curve, on the other hand, means the codec is very particular about its quality range. AUTO-REZ™ technology will have a huge benefit over a naive implementation, and image CDNs will have to restrict themselves to the quality range at the peak of the curve. A flatter curve is preferable. Pay attention only to the shape and peak of each curve, not to the absolute SSIMULACRA2 values: the target sizes differ between codecs, so the absolute scores are not comparable.
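One way to put a number on "flat" versus "peaked" is to find the peak and then measure how wide a quality range stays within a small tolerance of it. The sketch below is only an illustration: `peak_and_range` is a hypothetical helper, the one-point tolerance is an arbitrary choice, and both curves are invented data, not our measurements.

```python
from typing import Dict, Tuple


def peak_and_range(
    curve: Dict[int, float], tolerance: float = 1.0
) -> Tuple[int, Tuple[int, int]]:
    """Return (peak_quality, (low, high)), where low..high is the contiguous
    run of tested qualities around the peak that score within `tolerance`
    SSIMULACRA2 points of the peak score."""
    qualities = sorted(curve)
    peak_q = max(qualities, key=lambda q: curve[q])
    threshold = curve[peak_q] - tolerance
    lo = hi = qualities.index(peak_q)
    while lo > 0 and curve[qualities[lo - 1]] >= threshold:
        lo -= 1
    while hi < len(qualities) - 1 and curve[qualities[hi + 1]] >= threshold:
        hi += 1
    return peak_q, (qualities[lo], qualities[hi])


# Invented example curves: quality -> SSIMULACRA2 score at equal file size.
flat_curve = {50: 64.0, 60: 64.6, 70: 65.0, 80: 64.5, 90: 58.0}
peaked_curve = {50: 55.0, 60: 59.0, 70: 62.0, 80: 65.0, 90: 57.0}

print(peak_and_range(flat_curve))    # (70, (50, 80)): wide usable range
print(peak_and_range(peaked_curve))  # (80, (80, 80)): only the peak qualifies
```

A wide range means the codec tolerates a wrong resolution choice. A range that collapses to the peak means the quality has to be chosen carefully.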

What image did you use to test this?

We used this image to test the behaviors of the codecs. It has a few things that can be a challenge to encode, like visually important red details, grain, flat dark areas, and concrete, which can "steal" bits from the rest of the image. The displayed image is lossy. Download the full quality image: PNG - WebP

A black mini ebike locked to a rack.

JPEG (Mozjpeg) test

We find that qualities 65-76 are the most efficient to use, with 70 being the best. Quality 70 comes out to a BPP of 0.574. Most image CDNs use a quality value in that range. Mozjpeg has a fairly flat curve in this test, so it can be useful even without AUTO-REZ™ technology. If excess resolution is available (i.e. the image isn't being encoded at its source resolution), it isn't a good idea to use a quality of 90 or higher. We find this to be consistent among all codecs.
A graph with X axis 'Mozjpeg Quality', Y axis 'SSIMULACRA2'. It is flat until Quality reaches 90 and then it drops dramatically.
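For readers who want to turn the BPP figures in this article into file sizes, BPP is simply the encoded size in bits divided by the pixel count. The helpers below are a small sketch with hypothetical names, and the 1920x1280 dimensions are an assumed example, not the size of our test image.

```python
def bits_per_pixel(file_size_bytes: int, width: int, height: int) -> float:
    """Encoded size in bits divided by the number of pixels."""
    return file_size_bytes * 8 / (width * height)


def bytes_for_bpp(bpp: float, width: int, height: int) -> int:
    """Approximate file size in bytes for a given BPP and resolution."""
    return round(bpp * width * height / 8)


# Assumed example resolution; 0.574 is the Mozjpeg quality 70 BPP from above.
size = bytes_for_bpp(0.574, 1920, 1280)
print(size, "bytes at 0.574 BPP")
print(f"{bits_per_pixel(size, 1920, 1280):.3f} BPP for that file size")
```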

WebP (CWebP M6) test

We find that qualities 80-85 are the most efficient to use, with 83 possibly being the best. Quality 83 comes out to a BPP of 0.856. The encoder is highly inconsistent and doesn't transition smoothly between quality levels. The curve is very peaked, meaning that AUTO-REZ™ technology is mandatory for target-size WebP encoding and image CDNs are stuck with qualities 80-85. For that reason, we don't recommend using WebP for most purposes unless you have the technology to tame this behavior. Because of this research, Autocompressor keeps quality above 50 and aims for 80-85, downscaling the image if necessary. Note that this isn't an indictment of Lossless WebP, which is a completely different technology that we highly recommend using.
A graph with X axis 'CWebP Quality', Y axis 'SSIMULACRA2'. It slopes up until Quality reaches 85 and then it drops substantially.

AVIF (Avifenc S4) test

We find that CQ-levels 32-35 are the most efficient to use, with 32 being the best. CQ-level 32 comes out to a BPP of 0.365. Note: CQ-level is the INVERSE of quality, so a lower CQ-level means higher quality. AVIF favors very low BPP levels and tolerates too high a resolution better than too high a quality. AUTO-REZ™ technology is not necessarily needed to exploit this because encoding at the source resolution regardless of quality is a workable solution. The target size is very low because our starting image, at CQ-level 45, has a very low BPP compared to Q30 with WebP, JPEG, or JXL. Later in this article, we'll test which image codecs have the highest quality at their peak efficiency so that image CDNs can determine which image codec to use.
A graph with X axis 'CQ-level', Y axis 'SSIMULACRA2'. It slopes up until CQ-level reaches 32 and then it drops very slightly. AVIF has an inverted quality scale.

JPEG XL (CJXL E7) test

We find that CJXL 0.8 behaves very much like JPEG. Qualities 63-73 are the most efficient to use, with quality 70 being the best. Quality 70 comes out to a BPP of 0.444. Quality 70 in CJXL 0.8 maps to a distance of 2.8. This encoder displays the smoothest curve yet, which suggests the quality scale is very well implemented. At low BPP, CJXL appears to suffer much more than AVIF, but it holds its own at higher BPP.
A graph with X axis 'cjxl -q', Y axis 'SSIMULACRA2'. It is flat until Quality reaches 90 and then it drops substantially.
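The quality-to-distance relationship mentioned above can be sketched as a simple linear mapping. This is an assumption on our part: to our knowledge cjxl uses distance = 0.1 + (100 - quality) * 0.09 for qualities from 30 up to just below 100, which agrees with the 70 to 2.8 figure above, but check it against the cjxl version you use. `quality_to_distance` is a hypothetical helper, and it deliberately refuses qualities below 30 because we are not certain of the mapping there.

```python
def quality_to_distance(quality: float) -> float:
    """Assumed cjxl mapping from -q quality to Butteraugli distance.

    Only covers 30 <= quality <= 100; the low-quality branch is left out
    because its exact formula is not confirmed here.
    """
    if quality > 100 or quality < 30:
        raise ValueError("this sketch only covers qualities 30 to 100")
    if quality == 100:
        return 0.0  # quality 100 means mathematically lossless
    return 0.1 + (100 - quality) * 0.09


for q in (63, 70, 73, 90):
    print(f"quality {q} -> distance {quality_to_distance(q):.2f}")
```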

So which image codec has the highest quality at its peak efficiency?

We encoded the image at full resolution using mozjpeg quality 70, cwebp quality 83, avifenc cq-level 32, and cjxl quality 70. This represents what an image CDN might do when following this research. The following SSIMULACRA2 scores were achieved:
MozJPEG: 66.64 at a BPP of 0.489
CWebP M6: 68.21 at a BPP of 0.581
Avifenc S4: 57.26 at a BPP of 0.224
CJXL E7: 68.43 at a BPP of 0.371
So CJXL has its peak efficiency at the highest quality, followed closely by CWebP, then MozJPEG, then Avifenc. This doesn't say much about the absolute efficiency of each codec. It's fairly well known that WebP is competitive with MozJPEG, and that JPEG XL is competitive with AVIF, while both JPEG XL and AVIF are up to 50% more efficient than JPEG and WebP. What it does mean is that, with any given format, these qualities will give you the most viewing enjoyment per bit at an arbitrary resolution. It also means that, when you have a target file size, you should avoid making tiny, ultra high quality images instead of larger, medium quality ones. Finally, it illustrates the diminishing returns of encoding "visually lossless" images at all.
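To make the ordering explicit, the four results can be sorted by score, with each BPP shown relative to MozJPEG for context. This is just a convenience sketch over the numbers listed above. It does not measure absolute efficiency, since each codec was encoded at a different quality and size.

```python
# (SSIMULACRA2 score, BPP) at full resolution, copied from the results above.
results = {
    "MozJPEG": (66.64, 0.489),
    "CWebP M6": (68.21, 0.581),
    "Avifenc S4": (57.26, 0.224),
    "CJXL E7": (68.43, 0.371),
}

# Highest score at peak efficiency first.
ranking = sorted(results, key=lambda name: results[name][0], reverse=True)

baseline_bpp = results["MozJPEG"][1]
for name in ranking:
    score, bpp = results[name]
    relative = bpp / baseline_bpp
    print(f"{name:<11} score {score:5.2f}  BPP {bpp:.3f}  ({relative:.0%} of MozJPEG's bits)")
```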