Clarification on incremental / hard links logic

I just wanted to double check on the behavior of the incremental / hard links feature.

Does this properly describe how it works when pushing files from acrosync to a remote server (running openssh on linux/unix):
The first time a backup is run, if and only if it completes successfully, a "latest" symlink is made to the backup directory. Let's say the directory symlinked to is called "2015-1216-20" for this example.
The next time a backup runs, let's say for "2015-1217-20", it checks the "latest" symlink to determine which folder to use to create the hard link set, which is then used as the --link-dest parameter on the server side (--link-dest=2015-1216-20 in this example).
If and only if a backup is completed successfully, it replaces the "latest" symlink, pointing it to "2015-1217-20".
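
The three steps above can be sketched in a few lines of Python. This is only an illustration of the behavior being asked about, not Acrosync's actual code: the function names are made up, and it assumes a POSIX filesystem where the backup directories and the "latest" symlink sit side by side under one root.

```python
import os


def link_dest_from_latest(backup_root):
    """Return the directory name that 'latest' points to, or None if it is absent."""
    latest = os.path.join(backup_root, "latest")
    if not os.path.islink(latest):
        return None  # first run: nothing to hard-link against yet
    return os.readlink(latest)


def mark_latest(backup_root, backup_name):
    """Point 'latest' at backup_name; call this only after a backup succeeds."""
    latest = os.path.join(backup_root, "latest")
    tmp = latest + ".tmp"  # hypothetical temporary name
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(backup_name, tmp)  # relative target, e.g. latest -> 2015-1216-20
    os.replace(tmp, latest)       # rename over the old symlink in one step
```

Because the symlink is only replaced after a successful run, a crash partway through a backup leaves "latest" pointing at the previous good directory.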

Thanks.

Comments

  • edited December 2015
    OK, after investigating I'm fairly convinced that it does not work as I described above. Acrosync seems to use some other logic for finding the directory to pass to --link-dest. Perhaps you're checking the last modified directory, or just stepping backward date-wise until you find a directory that exists.

    The problem with this approach is that there is no guarantee that the directory it finds using this method actually corresponds to a successfully completed backup. If it points to a backup that failed partway through, you end up backing up a ton of stuff all over again, without using hard links to existing files.

    For me, this resulted in backing up hundreds of GBs of files that had already been backed up, because there was a failed backup between today and the last successful backup directory. (The failed backups happen because Acrosync crashes on occasion. I'm still trying to track that down, as indicated in my other thread.)

    I would suggest:
    Ensuring that you only create the "latest" symlink when the entire backup has completed successfully (I think you already do this).
    On next backup, check the target of the "latest" symlink to decide what to pass to --link-dest.

    This way, you can be assured that the directory passed to --link-dest points to a successful, full backup.
  • What you described in your first post is right.  Acrosync only creates/updates the 'latest' file after a completed backup.  If the backup fails, the 'latest' file will not be created/modified.  If you run the backup again, Acrosync will use multiple previous backups as the basis, starting from the one pointed to by 'latest' and ending with the last (incomplete) one.  However, the 'latest' one will be ignored if it is older than 5 backups -- in other words, at most 5 previous backups will be used as the basis.

    You can turn on Verbose Logging to see all the --link-dest options passed to the remote rsync.
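
One way to read the selection rule in this reply, as a minimal Python sketch. It is an interpretation of the description, not Acrosync's source: the function name is hypothetical, and it assumes backup directory names such as 2015-1210-20 sort chronologically as plain strings.

```python
def old_link_dests(backups, latest, limit=5):
    """Backups from 'latest' onward, cut down to the `limit` most recent ones."""
    ordered = sorted(backups)  # date-style names sort oldest to newest
    if latest in ordered:
        ordered = ordered[ordered.index(latest):]  # drop anything older than 'latest'
    return ordered[max(0, len(ordered) - limit):]


# Eight nightly directories, with 'latest' pointing at the oldest one.
backups = ["2015-12%02d-20" % day for day in range(10, 18)]
print(old_link_dests(backups, latest="2015-1210-20"))
```

With more than five directories from 'latest' onward, the directory that 'latest' points to falls outside the window and is not used as a basis.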

  • edited December 2015
    To be clear, are you saying that 2015-1210-20 (the directory pointed to by latest) won't get passed to --link-dest in this case, assuming the destination looks like below and I'm making a new incremental backup (will become 2015-1218-20)?
    drwxr--r--  32 user  group  40 Dec 10 20:03 2015-1210-20/ <--- last good backup
    drwxr--r--  32 user  group  40 Dec 11 20:03 2015-1211-20/ <--- failure
    drwxr--r--  32 user  group  40 Dec 12 20:03 2015-1212-20/ <--- failure
    drwxr--r--  32 user  group  40 Dec 13 20:03 2015-1213-20/ <--- failure
    drwxr--r--  32 user  group  40 Dec 14 20:03 2015-1214-20/ <--- failure
    drwxr--r--  32 user  group  40 Dec 15 20:03 2015-1215-20/ <--- failure
    drwxr--r--  32 user  group  40 Dec 16 20:03 2015-1216-20/ <--- failure
    drwxr--r--  32 user  group  40 Dec 17 20:03 2015-1217-20/ <--- failure
    lrw-r--r--   1 user  group  12 Dec 10 20:32 latest@ -> 2015-1210-20
    If so, I don't really care for that behavior. 2015-1210-20 represents the last good backup, and should always be included in a --link-dest. I believe I basically had this happen to me, and because 2015-1210-20 holds about 300GB of relatively unchanging files, and all the backups after that were failures, I ended up re-cloning about 300GB worth of files that for the most part could have been hard links into 2015-1210-20. Consider this my vote for having the directory pointed to by "latest" always be included in a --link-dest parameter, or at least have an option somewhere to enable this behavior.
    As things stand, I will have to constantly babysit acrosync to make sure it doesn't end up using only link-dest parameters to failed backup directories.
  • You're right.  My decision to discard the 'latest' backup that is too old was based on the thought that even partial backups have value -- they may contain big changes already uploaded, so discarding them will cause subsequent backups to upload these changes again.

    It is now apparent that Acrosync should keep the latest backup, always, and then 5 other backups.  I'll make the change in the next update.
  • edited December 2015
    Great, thanks! If you make a beta build at some point in the future and have a moment, please drop a line in this thread or elsewhere so I can give it a shot. Thanks again.
  • Think we might see a new build soon, even a beta? Thanks, hope you had a good holiday.
  • edited January 2016

    You can try this new build:

            32 bit: https://acrosync.com/win32/acrosync_installer_win32_1.5_532.exe

            64 bit: https://acrosync.com/win32/acrosync_installer_win64_1.5_532.exe

            XP:     https://acrosync.com/win32/acrosync_installer_winxp_1.5_532.exe


    The backup pointed to by 'latest' will always be added to --link-dest.  In addition, if there are backups newer than 'latest' they will be added too.  If there are more than 5 such backups, only the 5 most recent ones will be added.
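
The rule for the new build can be sketched as follows. This is a reading of the description above rather than the shipped code: the function name is hypothetical, directory names are assumed to sort chronologically as strings, and the option form (--link-dest=NAME) follows the way it was written earlier in this thread.

```python
def link_dest_options(backups, latest, limit=5):
    """Build --link-dest options: 'latest' first, then up to `limit` newer backups."""
    newer = sorted(name for name in backups if name > latest)
    newest = newer[max(0, len(newer) - limit):]  # keep only the most recent ones
    return ["--link-dest=" + name for name in [latest] + newest]


# The listing from earlier in the thread: one good backup, then seven failures.
backups = ["2015-12%02d-20" % day for day in range(10, 18)]
for option in link_dest_options(backups, latest="2015-1210-20"):
    print(option)
```

For that listing the result is six options: 2015-1210-20, plus 2015-1213-20 through 2015-1217-20. Turning on Verbose Logging should show whether the options actually passed to the remote rsync match.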

  • I have a lot of incomplete backups and I want to remove them, but I don't know the right way to distinguish between complete and incomplete backups. I have looked at the top-level folder count on the rsync destination, but this is not a reliable method.

    How can I see which backups completed successfully and which are incomplete? Is there something like an rsync log on the rsync server? Are there flags on the rsync target folder that can be read?
  • I can't think of a reliable method of telling which backup is complete.  Perhaps at the end of backup Acrosync should upload a blank file to indicate the success.
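
If such a marker file existed, sorting backups into complete and incomplete would be straightforward. The Python sketch below is purely hypothetical: nothing in this thread says Acrosync writes a marker, the file name ".backup-complete" is invented, and the code assumes it runs somewhere the backup directories are visible as a normal filesystem (on the server or on a mounted share).

```python
import os

MARKER = ".backup-complete"  # hypothetical marker name, not created by Acrosync


def mark_complete(backup_dir):
    """Create an empty marker file inside a backup that finished successfully."""
    open(os.path.join(backup_dir, MARKER), "w").close()


def split_by_completion(backup_root):
    """Return (complete, incomplete) lists of backup directory names."""
    complete, incomplete = [], []
    for name in sorted(os.listdir(backup_root)):
        path = os.path.join(backup_root, name)
        if os.path.islink(path) or not os.path.isdir(path):
            continue  # skip the 'latest' symlink and any stray files
        if os.path.exists(os.path.join(path, MARKER)):
            complete.append(name)
        else:
            incomplete.append(name)
    return complete, incomplete
```

Backups made before any marker was introduced would all show up as incomplete, so the existing ones would still need to be checked by hand.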
  • If you're using Acrosync for backup, I strongly suggest that you take a look at Duplicacy, a new network/cloud backup tool I have been working on.  Duplicacy has many advantages over a rsync-based solution, such as better deduplication, client-side encryption, cloud support, etc.  The only drawback of Duplicacy is that it doesn't simply mirror the directory structure to the server, but rather stores backups in a format that can be opened only by Duplicacy.  If this is not critical to you, then you should give it a try.
  • I am using Acrosync for backup. If a computer crashes, I can browse the backup on my NAS very easily, either directly with the NAS file explorer or remotely from any computer. What should I do if the computer where Duplicacy is installed crashes? How easy will it be to browse the Duplicacy backups?
  • You can run Duplicacy on a different computer to browse or restore previous backups.  Or you can run the CLI version of Duplicacy directly on your NAS -- the CLI version is written in Go, which has excellent cross-platform support.  If you don't see a version that is compatible with your NAS please let me know.