Lossless audio codec comparison revision 5 - multichannel part

Introduction

This document compares the performance of lossless audio codecs on multichannel material, i.e. PCM audio files with more than 2 channels. Various so-called channel orderings exist, but by far the most common is 5.1, which consists of front left, front right, center, low-frequency effects (LFE), surround left and surround right channels.

Most of the test corpus for this comparison consists of 5.1 surround sources, but it also includes one 4.0 (quad) source and two 7.1 surround sources. Because some codecs support the 5.1 surround channel ordering but no other orderings, the results are presented twice: first for all sources, with only the codecs that support all of them, and second for the 5.1 surround sources only, with all tested codecs.

Method

To compare the performance of each codec, the following steps are carried out for each combination of corpus file, codec and codec setting:

  • A WAV file is placed on a large enough ramdisk
  • An MD5sum is calculated for the WAV file excluding its header
  • The WAV file is encoded by the chosen codec provided with the required settings. The amount of CPU time required to do this conversion is measured and the resulting filesize is recorded
  • The encoded file is decoded by the chosen codec. The amount of CPU time required to do this conversion is measured
  • An MD5sum is calculated for the resulting decoded file excluding its header
  • The MD5sums of the provided WAV file and the decoded file are compared
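
The verification steps above can be sketched as follows. This is only an illustration, not the script used for this comparison: it assumes "excluding its header" means hashing only the payload of the RIFF `data` chunk, and the function names `pcm_md5` and `round_trip_ok` are made up for this example.

```python
import hashlib
import struct


def pcm_md5(wav_bytes: bytes) -> str:
    """Return the MD5 hex digest of the 'data' chunk payload of a RIFF/WAVE file.

    Assumption: hashing the file "excluding its header" is interpreted here
    as hashing only the audio samples, skipping every other chunk.
    """
    if wav_bytes[0:4] != b"RIFF" or wav_bytes[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12  # first chunk starts right after 'RIFF', the size field and 'WAVE'
    while pos + 8 <= len(wav_bytes):
        chunk_id = wav_bytes[pos:pos + 4]
        (size,) = struct.unpack_from("<I", wav_bytes, pos + 4)
        body = pos + 8
        if chunk_id == b"data":
            return hashlib.md5(wav_bytes[body:body + size]).hexdigest()
        pos = body + size + (size & 1)  # chunks are padded to an even length
    raise ValueError("no data chunk found")


def round_trip_ok(original: bytes, decoded: bytes) -> bool:
    """True when the original and the decoded file carry identical audio data."""
    return pcm_md5(original) == pcm_md5(decoded)
```

Comparing only the audio payload means that a decoder writing a different (but valid) header, for example with extra metadata chunks, does not cause a false mismatch.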

The following codecs and settings are used:

Codec | Settings used
FLAC | -0, -3, -5, -6, -7, -8
ALAC | [-]
WavPack | -f, [-], -h, -hh, -x4f, -x4, -x4h, -x4hh
TAK | -p0, -p0e, -p0m, -p1, -p1e, -p1m, -p2, -p2e, -p2m, -p3, -p3e, -p3m, -p4, -p4e, -p4m
Monkey's Audio | -c1000, -c2000, -c3000, -c4000, -c5000
MP4ALS | -a -o5, -a, -a -o20, -a -o40, -a -b, -a -b -o20, -a -b -o40, -a -b -o1023, -7
WMA | [-]
TTA | [-]

For each codec, the latest Windows binary provided by the author of the codec is used; no specially tuned builds are used. For ALAC, encoding is done with refalac64 as provided by QAAC. For WMA, encoding is done with WMAEncode.exe (which uses the encoder provided by Windows 10) and decoding with FFmpeg 5.0.

Measurements are made on a Windows 10 machine with an AMD A4-5000 CPU and 4GB of RAM. This CPU has all x86 instruction set extensions up to and including AVX (i.e. it lacks AVX2). The CPU time used is measured with timer64.exe, part of the 7-max/7-benchmark suite.

Timing is done per track (for music sources) or per chapter (for sources from a movie or broadcast). The measured time is divided by the content length, i.e. the CPU time of the encoding or decoding process in seconds divided by the playback length of the track or chapter in seconds. The result of this division is called CPU-usage. The filesize of the encoded file is divided by the filesize of the original WAV file to obtain the compression.
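
As a small illustration of these two ratios, the sketch below computes them for one track. The function names and the example numbers are invented for this illustration and are not measurements from this comparison.

```python
def cpu_usage(cpu_seconds: float, playback_seconds: float) -> float:
    """CPU time spent encoding or decoding, per second of audio played back."""
    return cpu_seconds / playback_seconds


def compression(encoded_bytes: int, wav_bytes: int) -> float:
    """Encoded file size as a fraction of the original WAV file size."""
    return encoded_bytes / wav_bytes


# Hypothetical track: 240 s long, encoded in 12 s of CPU time,
# shrinking a 100 MB WAV file to 60 MB.
usage = cpu_usage(12.0, 240.0)                  # 12 / 240
ratio = compression(60_000_000, 100_000_000)    # 60 MB / 100 MB
```

For both values lower is better: a CPU-usage of 0.05 means the codec runs 20 times faster than real time on one core, and a compression of 0.6 means the encoded file is 60% of the WAV size.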

Results per source are obtained by averaging the compression and CPU-usage so each track or chapter contributes the same amount to the average, i.e. the length of the track or chapter is not incorporated.

The average of all sources is obtained by averaging the results per source, again without any weighting. The total results are therefore not influenced by the length of the corpus content: each source contributes an equal amount to the average.
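
The two-level, unweighted averaging can be sketched like this. The helper names and the input layout (a mapping from source name to a list of per-track values) are assumptions made for this example.

```python
from statistics import mean


def source_average(track_values):
    """Average over the tracks or chapters of one source, ignoring their length."""
    return mean(track_values)


def corpus_average(per_source):
    """Average of the per-source averages, so every source counts equally.

    per_source maps a source name to its list of per-track values
    (compression or CPU-usage).
    """
    return mean([source_average(values) for values in per_source.values()])


# Hypothetical compression values: source "A" has two tracks, source "B" one.
example = {"A": [0.50, 0.60], "B": [0.70]}
overall = corpus_average(example)
```

With these made-up numbers the per-source averages are 0.55 and 0.70, giving an overall average of about 0.625. Pooling all three tracks directly would give 0.60 instead, which shows how a source with many tracks would otherwise dominate the result.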

Results (PDF)

The best results will be obtained in the bottom left corner of the graphs: this represents the best compression (smallest file size) and the lowest CPU-usage (fastest compression and/or decompression).

In the graph each codec is represented by a group of markers connected by a line. Each combination of settings as mentioned in the previous table corresponds to one marker, in the order listed in the table. The first combination of settings is (usually) the fastest and the last the slowest. Therefore, the marker closest to the upper left corner of the graph corresponds to the first listed combination of settings, and the marker closest to the lower right corner corresponds to the last listed combination of settings.

Discussion

Looking at the average of all 5.1 surround sources, TAK seems to be the clear winner. There is not a single combination of settings for TAK where it is beaten in both file size and either encoding or decoding speed by any other codec. Its best compressing setting (-p4m) also yields the best compression across all tested codecs. Its decoding speed is second only to FLAC, but the difference is small and TAK produces much smaller files.
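
The "not beaten" criterion can be written down as a small check. The sketch below follows one reading of that sentence: a result is beaten when another codec produces a smaller file and is also faster at encoding or at decoding. The `Result` type and the function names are hypothetical, and no measured data is included.

```python
from typing import NamedTuple


class Result(NamedTuple):
    codec: str
    setting: str
    compression: float  # encoded size / WAV size, lower is better
    enc_cpu: float      # encoding CPU-usage, lower is better
    dec_cpu: float      # decoding CPU-usage, lower is better


def beats(other: Result, point: Result) -> bool:
    """True when `other` gives a smaller file AND is faster at encoding or decoding."""
    return other.compression < point.compression and (
        other.enc_cpu < point.enc_cpu or other.dec_cpu < point.dec_cpu
    )


def unbeaten(results: list[Result], codec: str) -> bool:
    """True when no setting of `codec` is beaten by any setting of another codec."""
    mine = [r for r in results if r.codec == codec]
    others = [r for r in results if r.codec != codec]
    return all(not beats(o, m) for m in mine for o in others)
```

Feeding the averaged results for every codec and setting into `unbeaten(results, "TAK")` would reproduce the claim made above, under the stated reading of "beaten".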

When looking at achieved compression, ALS' -7 preset comes very close to TAK's -p4m, but encoding is more than 10 times as slow and decoding is almost 10 times as slow.

The worst performing codec is arguably TTA. Its average file size as a percentage of the (uncompressed) WAV file size is more than 1 percentage point larger than that of the next worst (WMA Lossless), while it is not particularly fast at either encoding or decoding. WMA Lossless does better than TTA, but it still produces much larger files than codecs with similar speed and is much slower than codecs with similar compression.

When also taking into account surround sources other than 5.1, some codecs no longer compete. TAK, ALAC, WMA Lossless and TTA do not support 7.1 surround. Contrary to the results for 5.1 sources, there is no clear winner here: ALS performs best in terms of compression but at the cost of slow encoding and decoding. FLAC doesn't compress as much as ALS, but is much faster in both encoding and decoding. The WavPack -x4 presets hold the middle ground: while encoding is rather slow, decoding is quite fast and compression is in between FLAC and ALS.

Monkey's Audio doesn't really shine here: while its normal preset is the fastest encoder for that particular compression, it decodes rather slowly. All other presets seem to perform worse than other codecs in all aspects. Strangely enough, Monkey's Audio's insane preset is outperformed by the extrahigh preset, while the latter is 3 times as fast in both encoding and decoding.

Usability

While any of the listed codecs is very much suitable for storing multichannel audio on its own, such audio is often stored along with a moving picture. Therefore, the possibility of embedding the audio alongside the video is of concern. The Matroska container supports FLAC, ALAC, TTA and WavPack. For MP4, ALAC is officially supported. FLAC in MP4 is supported by some tools (for example Firefox), but there is no official support.

Sources used

Sources are lossless except where marked Dolby Digital, which is lossy.

Music
Artist - Album | Format | Comment | Year
Adele - Live at the Royal Albert Hall | 5.1 48kHz 16-bit (Dolby Digital) | DVD rip | 2011
Coldplay - A Head Full of Dreams | 5.1 96kHz 24-bit | BD-A rip | 2016
Hans Zimmer - Inception | 5.1 48kHz 24-bit | BD-A rip | 2010
Lord of the Rings - The Return of the King (Complete Recordings) | 5.1 48kHz 16-bit | DVD-Audio rip | 2007
Mozart - Violin concerto in D major: Allegro (Marianne Thorsen / TrondheimSolistene) | 5.1 96kHz 24-bit | Digital download | 2006
Nightwish - Vehicle of Spirit | 5.1 48kHz 16-bit | BD rip | 2016
Pink Floyd - The Dark Side of the Moon | 4.0 96kHz 24-bit | BD-A rip | 1973
 
Movie soundtrack (music and dialogue)
Movie | Format | Comment | Year
Gravity | 5.1 48kHz 16-bit | BD rip | 2013
Incredibles 2 | 7.1 48kHz 24-bit | BD rip | 2018
Johnny English Reborn | 5.1 48kHz 24-bit | BD rip | 2011
Kingdom of Heaven | 5.1 48kHz 24-bit | BD rip | 2005
Tron: Legacy | 7.1 48kHz 16-bit | BD rip | 2012
 
Dialogue
Source | Format | Comment | Year
The Big Bang Theory, Season 1, Ep 1 & 2 | 5.1 48kHz 16-bit | BD rip | 2007