FA 0.11 release notes

Bulat-Ziganshin edited this page Oct 5, 2016 · 31 revisions

FA 0.11 has the following major drawbacks:

  • .arc format isn't yet supported (scheduled for FA 0.12)
  • existing archives cannot be modified; only archive creation is supported
  • on archive extraction, the entire archive is decompressed in memory (solid blocks not required for the current operation cannot be skipped)

Other missing features can be seen from a comparison of the command/option sets. The remaining sections briefly describe new features in FA'Next compared to FreeArc:

32 & 64 bit, Windows and Linux versions

At last!!! 64-bit executables mean that you can use memory-hungry compression methods. In particular, LZMA now supports dictionaries up to 4000 MB.

FA added support for 64-bit CLS DLLs. For a "foo" compression method, the 64-bit executable tries to load cls64-foo.dll and then cls-foo.dll, while the 32-bit executable tries cls32-foo.dll and then cls-foo.dll.
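The fallback order can be sketched as a small helper (hypothetical, not FA's actual code):

```python
def cls_dll_candidates(method: str, is_64bit: bool) -> list[str]:
    """Return CLS DLL names in the order FA tries them: the
    architecture-specific name first, then the generic fallback."""
    arch_specific = f"cls{'64' if is_64bit else '32'}-{method}.dll"
    return [arch_specific, f"cls-{method}.dll"]
```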

Deduplication

With the -dup option, full-archive deduplication is performed. It works like ZPAQ, but 1) it is much faster, and 2) since archive updating isn't yet implemented, it cannot be used for incremental backups.

Full-archive deduplication is a useful alternative to the REP filter. Usually, replacing REP with deduplication makes the archive larger but reduces the memory required for decompression. In my experiment compressing 4.7 GB, -m4 -dup produced an archive 4% larger than -m4, but decompression memory dropped from 735 MB to 179 MB, while compression and decompression speeds weren't affected. Unfortunately, the memory required for decompression cannot be controlled, and isn't known until the archive is created.

In deduplication mode, input data are split into fixed-size buffers (--bufsize option); each buffer's contents are split into chunks using the ZPAQ algorithm (the --chunk/--min-chunk options specify the average/minimum chunk size); duplicate chunks are eliminated while the remaining data are recombined into fixed-size buffers once again, and finally compressed with the selected algorithm.
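The pipeline above can be sketched roughly as follows. This is a minimal illustration of content-defined chunking plus duplicate elimination, assuming a simple multiplicative rolling hash rather than ZPAQ's actual predictor; the function names and constants are invented for the example, not FA's:

```python
import hashlib

def chunk_boundaries(data: bytes, avg_size: int = 4096, min_size: int = 512):
    """Content-defined chunking: cut where the rolling hash hits a magic
    value, so boundaries depend on content, not position (illustrative)."""
    h, start, cuts = 0, 0, []
    for i, b in enumerate(data):
        h = (h * 31 + b) & 0xFFFFFFFF
        if i - start + 1 >= min_size and h % avg_size == avg_size - 1:
            cuts.append(i + 1)
            start, h = i + 1, 0
    if start < len(data):
        cuts.append(len(data))          # final partial chunk
    return cuts

def deduplicate(data: bytes):
    """Split into chunks and keep only the first copy of each distinct one;
    refs records how to reassemble the original stream."""
    seen, unique, refs = {}, [], []
    start = 0
    for end in chunk_boundaries(data):
        digest = hashlib.sha256(data[start:end]).digest()
        if digest not in seen:
            seen[digest] = len(unique)
            unique.append(data[start:end])
        refs.append(seen[digest])
        start = end
    return unique, refs
```

Because boundaries are content-defined, inserting a few bytes early in the stream shifts only nearby chunks, so later duplicates are still found.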

FA doesn't try to compress deduplicated data whose order-0 entropy exceeds 99%; the --ratio option controls this threshold for skipping incompressible blocks:

  •           --ratio=PERCENTS         automatically store blocks with order-0 entropy >PERCENTS% (99 by default)
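
Order-0 entropy here is the Shannon entropy of the block's byte-frequency distribution, expressed as a percentage of the 8 bits/byte maximum. A minimal sketch of the store-vs-compress decision (function names are illustrative):

```python
import math
from collections import Counter

def order0_entropy_percent(block: bytes) -> float:
    """Shannon entropy of the byte distribution, as % of the 8-bit maximum."""
    if not block:
        return 0.0
    n = len(block)
    bits = -sum(c / n * math.log2(c / n) for c in Counter(block).values())
    return bits / 8 * 100

def should_store(block: bytes, ratio: float = 99.0) -> bool:
    """Mimics --ratio: store the block unchanged if entropy is above the threshold."""
    return order0_entropy_percent(block) > ratio
```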
    

The --threads option selects the number of worker threads performing CPU-intensive deduplication and compression operations (the default is the number of CPU threads); the --io-threads option specifies the number of I/O threads.

By default, chunks are deduplicated using the very fast VMAC hashing algorithm, and these hashes aren't stored in the archive. The --save-sha-hashes option employs SHA-256 hashes instead and stores them in the archive. These hashes are then checked on extraction, ensuring data integrity. Even if the archive contains SHA-256 hashes, you can disable checking them on extraction with the --no-check option.
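The extraction-time check can be pictured like this; a hypothetical sketch, assuming one stored digest per chunk (the function name and calling convention are invented for the example):

```python
import hashlib

def verify_chunks(chunks, stored_hashes, check=True):
    """Recompute SHA-256 of each extracted chunk and compare it with the
    digest stored in the archive; check=False corresponds to --no-check."""
    if not check:
        return True
    if len(chunks) != len(stored_hashes):
        return False
    return all(hashlib.sha256(c).digest() == h
               for c, h in zip(chunks, stored_hashes))
```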

ZSTD support

FA incorporates ZSTD 1.1, and the provided fa.ini replaces Tornado with ZSTD in the -m1/-m2 modes.

An example of a compression method with all parameters specified: zstd:19:d4m:c1m:h4m:l16:b4:t64:s7.

Description of all parameters:

  • :19 = compression level (profile), 1..22 : larger == more compression
  • :d4m = dictionary size in bytes, AKA largest match distance : larger == more compression, more memory needed during compression and decompression
  • :c1m = fully searched segment : larger == more compression, slower, more memory (useless for fast strategy)
  • :h4m = hash size in bytes : larger == faster and more compression, more memory needed during compression
  • :l16 = chain length, AKA number of searches : larger == more compression, slower
  • :b4 = minimum match length : larger == faster decompression, sometimes less compression
  • :t64 = target length AKA fast bytes - acceptable match size (only for optimal parser AKA btopt strategy) : larger == more compression, slower
  • :s7 = parsing strategy: fast=1, dfast=2, greedy=3, lazy=4, lazy2=5, btlazy2=6, btopt=7

By default, the compression level parameter is set to 1 and all other parameters to 0, which means "use the default values from ZSTD's built-in level table".
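A sketch of how such a method string could be decoded: the letter prefix selects a parameter, k/m/g suffixes scale sizes, and a bare number sets the level. This is an illustrative parser (parameter key names invented), not FA's actual one:

```python
def parse_size(s: str) -> int:
    """Parse '4m' / '16k' / '64' into bytes."""
    mult = {"k": 1 << 10, "m": 1 << 20, "g": 1 << 30}
    if s and s[-1].lower() in mult:
        return int(s[:-1]) * mult[s[-1].lower()]
    return int(s)

def parse_zstd_method(method: str) -> dict:
    """Turn e.g. 'zstd:19:d4m:l16:s7' into named parameters.
    Defaults: level=1, everything else 0 ('use the level table')."""
    names = {"d": "dictionary", "c": "segment", "h": "hashsize",
             "l": "chainlen", "b": "minmatch", "t": "targetlen", "s": "strategy"}
    params = {"level": 1, **{v: 0 for v in names.values()}}
    parts = method.split(":")
    if parts[0] != "zstd":
        raise ValueError("not a zstd method string")
    for p in parts[1:]:
        if p[0] in names:
            params[names[p[0]]] = parse_size(p[1:])
        else:
            params["level"] = int(p)   # bare number = compression level
    return params
```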

Lua programming

The major new feature of FA is its Lua programmability. You can use Lua code to add/change/remove options and to execute actions on operation start/finish and on warnings and errors, including analysis of the operation being performed and of the option settings. FA automatically executes [Lua code] sections from your fa.ini.

Part of the program options are already implemented by the built-in Lua code, and you can browse this code to learn how to add your own options.

The --lua-filter option allows selecting the files to process with an arbitrary Lua predicate.

Prefetching

12 years ago, FreeArc pioneered a read-ahead technique that significantly improved speed when lots of small files are compressed, by prefetching them into a large buffer (the so-called read-ahead cache) in parallel with the compression operation.

Now FA improves this technique by prefetching directly into the OS cache. This avoids an extra memory-copy operation, avoids allocating an extra buffer in the application, and allows reading ahead as much data as the OS cache can hold. My benchmarks show that the new technique allows FA to outperform the fastest NanoZip compression modes in real SSD operation.
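The idea can be sketched as background worker threads that simply read upcoming files and discard the bytes, leaving them resident in the OS page cache for the compressor's subsequent read. This illustration (function names invented, and omitting FA's per-option byte budget) assumes plain sequential reads are enough to warm the cache:

```python
import threading
from queue import Queue

def prefetch_worker(paths: Queue, chunk: int = 1 << 20) -> None:
    """Read each queued file sequentially and discard the data; the bytes
    stay in the OS page cache, so the compressor's later read is a hit."""
    while True:
        path = paths.get()
        if path is None:              # stop sentinel: no more files
            break
        try:
            with open(path, "rb") as f:
                while f.read(chunk):
                    pass
        except OSError:
            pass                      # unreadable files are simply skipped

def start_prefetch(file_list, n_threads: int = 1):
    """Roughly --prefetch:N in spirit: N threads reading ahead of the compressor."""
    q = Queue()
    for p in file_list:
        q.put(p)
    for _ in range(n_threads):
        q.put(None)                   # one stop sentinel per worker
    threads = [threading.Thread(target=prefetch_worker, args=(q,))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    return threads
```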

The default setting is equivalent to --prefetch:1:256mb, i.e. 1 thread prefetching up to 256 MB. The --prefetch option without parameters is equivalent to --prefetch:8:2gb. Overall, 1 thread is optimal for HDD and 8 threads for SSD, and the lookahead amount should fit into the OS cache available during the operation. Automatic tuning of this option may be a good application of Lua programming.

Prefetching can be disabled with --prefetch- option.

FA also includes the traditional read-ahead cache, which can be controlled via the --cache, --io-threads and --bufsize options. Essentially, cache_size = num_io_threads * buffer_size; for example, 8 I/O threads with 64 MB buffers give a 512 MB cache. So, if you are going to disable prefetching into the OS cache, you may want to raise the size of the read-ahead cache.