
How to use Glacier to store backups

posted Sep 26, 2013, 10:36 AM by Rene Stach   [ updated Jan 31, 2014, 5:19 AM by Kenneth Skovhede ]

The new storage engine for Duplicati 2.0 was designed with support for Amazon Glacier in mind. Supporting Glacier is challenging because Glacier does not allow files to be read without a delay of several hours, which in practice means that you cannot read files at all. That is why the new storage engine in Duplicati 2.0 introduces a local database. This local database contains all the information about your backup, so Duplicati does not need to read any backup files; at least not the dblock files, which contain the actual backup data.

Although we do not have a connector for Glacier at the moment, the new storage engine in Duplicati 2.0 already makes it possible to store the large backup files on Glacier using the S3-to-Glacier service. Here is a short description of how it works:

  • Get the experimental version of Duplicati 2.0 (command line only at the moment).
  • Configure your backup so that it works with Amazon S3. Try it out, run it a few times, restore some files, and make sure you are happy with it.
  • When your backup is set up properly, you will find three different file types in S3: dblock files which contain the actual data, dindex files which describe what is stored in the dblock files, and dlist files which describe the backup itself. 99% of the data is occupied by dblock files. The goal is now to move these large files to Glacier (next step).
  • Set up a prefix filter that regularly moves files from S3 to Glacier. The prefix filter should match all files named “duplicati-b*”. Once this filter has been applied, only the small dlist and dindex files remain on S3, which is entirely sufficient for Duplicati 2.0 to work reliably. (Updated 31-Jan-2014: the filter is "duplicati-b*", previously stated as "d".)
  • Now add the switches --no-backend-verification and --no-auto-compact to your backup command. Duplicati will no longer check for files on S3 but will take all required information from the local database. Furthermore, it will not compact your files. Compacting replaces dblock files that contain a lot of deleted data with new dblock files that no longer contain any deleted data; the new dblock files require less space and allow faster restores. This is turned off for Glacier to avoid having to read files.
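The prefix filter described above corresponds to an S3 lifecycle rule. A minimal sketch using the AWS CLI, assuming a bucket named my-backup-bucket (the bucket name, rule ID, and the 1-day transition delay are placeholders, not recommendations):

```shell
# Sketch: transition all dblock files (prefix "duplicati-b") to Glacier
# one day after upload. Bucket name and rule ID are placeholders.
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-backup-bucket \
  --lifecycle-configuration '{
    "Rules": [
      {
        "ID": "duplicati-dblock-to-glacier",
        "Prefix": "duplicati-b",
        "Status": "Enabled",
        "Transitions": [
          { "Days": 1, "StorageClass": "GLACIER" }
        ]
      }
    ]
  }'
```

The same rule can also be created in the S3 web console under the bucket's "Lifecycle" settings, which is what the prefix-filter step above describes.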

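With the two switches in place, a backup invocation might look like the following. This is a sketch only: the executable name, source path, and the exact S3 URL format (including how credentials are passed) are assumptions about the Duplicati 2.0 command line, not a definitive reference.

```shell
# Sketch of a Duplicati 2.0 backup run against S3; bucket, prefix,
# credentials, and source path are placeholders.
Duplicati.CommandLine.exe backup \
  "s3://my-backup-bucket/backup?aws_access_key_id=KEY&aws_secret_access_key=SECRET" \
  "C:\Users\me\Documents" \
  --no-backend-verification \
  --no-auto-compact
```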
When you need to restore files from your backup, just run the restore command. It will complain about missing files (those that were moved to Glacier). Copy the missing files from Glacier back to S3 and run the restore command again.
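Copying a file back from Glacier to S3 is a per-object restore request. A sketch using the AWS CLI, assuming the hypothetical object key duplicati-b1234.dblock.zip and a 5-day retention period for the temporary copy:

```shell
# Sketch: request a temporary S3 copy of an archived dblock file.
# Retrieval from Glacier takes several hours; the object key and
# the number of days are placeholders.
aws s3api restore-object \
  --bucket my-backup-bucket \
  --key duplicati-b1234.dblock.zip \
  --restore-request Days=5
```

Once the requested files are readable on S3 again, re-running the Duplicati restore command should succeed.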