Bontmia (Backup Over Network To Multiple Incremental Archives)

Bontmia was written by John Enok Vollestad in April 2003 to merge the functionality of glastree and rsync in one application, with a more flexible selection of long-term storage. It has since gone through some changes to improve usability.

Why tapes are not a good idea anymore

The only reason to use tapes instead of backing up to disks over a network is that no network or extra machine is available.

Cost
Tapes cost more than disks and require a tape-drive and possibly also a tape robot.
Reliability
Disks have a higher MTBF than tapes, and tapes also require a tape drive of the same type to be read. Disks can be connected together in a RAID to increase the MTBF; tapes cannot. Incremental backup to a disk or a RAID volume does not change the MTBF, while incremental backup to tape drastically lowers it, because a restore then needs every tape in the chain to be readable. Sometimes tapes written in one drive are difficult to read in another drive. This is a silent error unless you test the tapes now and then. There is no such problem on file systems like ReiserFS.
Location
Tapes are removable media and can be stored elsewhere, which avoids the risk of localized disasters. A network connection to a remote machine does the same.
Time
Copying just the changes gives a fast and efficient backup. Tape drives are very slow, often slower than the network, so remote backup to disk will usually be faster. Faster backup and efficient storage mean one can back up more often; once every hour is usually not a problem. It also means rapid restore, which is even more important.
Space
Tapes split the data needlessly, since they cannot be combined with RAID to form bigger volumes as disks can. Using hard links on a file system gives very good storage efficiency, and the backups remain available as an ordinary file system with random access.

There is a lot of remote backup software out there. I wrote Bontmia since I could not find one that did all that I wanted.

Bontmia Functionality

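NB Files in the backup directories should NOT be changed. Unchanged files are hard links shared between backups, so changing a file in one backup changes it in every backup that shares it. Protecting against this is outside the scope of Bontmia. You can mount the backup device read-only to prevent it.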

NB Please note that Bontmia, like every other program I know of, cannot copy files that are currently changing in a sane manner. For databases, one should therefore not copy the database files but rather use the database's built-in functions to extract a backup and then back up those files with Bontmia.

NB When copying files from a remote host, please note that a user id or group id does not necessarily belong to the same user on both hosts. Either keep the backup directory inaccessible to ordinary users or keep user ids and passwords synchronized between the hosts. For the latter, PAM and LDAP are an excellent choice for small to mid-range sites.

NB Bontmia uses rsync, and rsync in its current form uses quite a lot of RAM on large directory structures. Dividing the backup into several smaller ones might speed things up considerably.

Archive structure

The archives are placed in a directory structure like this:

2003/05/06/04:00/
2003/05/07/04:00/
2003/05/08/04:00/
2003/05/08/05:00/
2003/05/08/06:44/

which is YYYY/MM/DD/HH:MM/ for the time the backup was started. Since the granularity of the backup archives is one minute, it is not possible to run two different backups within the same minute.
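
The naming scheme above can be sketched in a few lines of shell. This is only an illustration: the function names snapshot_stamp and new_snapshot_dir are hypothetical and are not taken from Bontmia itself. The second function shows why two backups in the same minute collide: the directory for that minute already exists.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not Bontmia's own.

# Print the current time as a YYYY/MM/DD/HH:MM path component.
snapshot_stamp() {
    date '+%Y/%m/%d/%H:%M'
}

# new_snapshot_dir ROOT [STAMP]
#   Create the directory for one snapshot under ROOT and print its path.
#   STAMP defaults to the current minute. Fails if a snapshot directory
#   for that minute already exists.
new_snapshot_dir() {
    root=$1
    stamp=${2:-$(snapshot_stamp)}
    # Create the YYYY/MM/DD part; it may already exist.
    mkdir -p "$root/$(dirname "$stamp")" || return 1
    # Plain mkdir (no -p) fails if the HH:MM directory already exists.
    mkdir "$root/$stamp" 2>/dev/null || {
        echo "snapshot $stamp already exists" >&2
        return 1
    }
    echo "$root/$stamp"
}
```

Calling new_snapshot_dir twice with the same stamp succeeds the first time and fails the second time.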

Under these directories, the archived files and directories are stored in a directory named after the host that was backed up and the absolute path, like this:

2003/05/06/04:00/foo:/home/jev
                /bar:/home/jev
                /baz:/home/jev
2003/05/07/04:00/foo:/home/jev
                /bar:/home/jev
                /baz:/home/jev

Operation

The following example shows how to back up once every day and keep the last 7 daily, 4 weekly, 12 monthly and 2 yearly backups. The hostname is changed.

$ bontmia --dest ./backup  --rotation \
                  0minutes0hours7days4weeks12month2years \
                  foo.bar.com:/home/jev \
                  foo.bar.com:/etc \
                  foo.bar.com:/usr/local \
                  foo.bar.com:/var

When this is run it prints the following on my system (the hostname is changed). Since the computer has not been on all the time, not every backup has been run, but the last x backups are saved for each filter. The output shows which filters keep each backup. A backup that is removed is no longer kept by any filter.

Making a hard-link replication of the last backup
  (/backup/2003/09/20/00:00)

Backing up by modifying the replication
  foo.bar.com:/home/jev
  foo.bar.com:/etc
  foo.bar.com:/usr/local
  foo.bar.com:/var

Deletes files that should not be in the latest snapshot

Moving the complete backup into the backup archive
  (/backup/unfinished_backup -> /backup/

Calculates which backups to save
  Saving /backup/2003/06/29/19:07 by filters:  month
  Saving /backup/2003/07/20/10:00 by filters:  month
  Saving /backup/2003/08/26/22:30 by filters:  weeks month
  Saving /backup/2003/09/07/00:00 by filters:  weeks
  Removing /backup/2003/09/13/00:00
  Saving /backup/2003/09/14/00:00 by filters:  days weeks
  Saving /backup/2003/09/15/00:00 by filters:  days
  Saving /backup/2003/09/16/00:00 by filters:  days
  Saving /backup/2003/09/17/00:00 by filters:  days
  Saving /backup/2003/09/18/00:00 by filters:  days
  Saving /backup/2003/09/19/00:00 by filters:  days
  Saving /backup/2003/09/20/00:00 by filters:  days weeks month years

As one can see, the last 7 daily backups are saved, the last 4 weekly backups are saved and 4 monthly backups are saved. The backup has only been running for about 3 months, so there are no older monthly backups. The same goes for the yearly backups.

NB! If you want to back up several directories, do not run Bontmia multiple times against the same '--dest'. Instead, list each source directory as an argument to a single command. Otherwise the incremental storage becomes impossible to maintain and one ends up copying all the data every time.

How the rotation works

A filter saves one backup per unit for the last x units, where x is specified by the user. A unit can be a minute, hour, day, week, month or year. It is the last backup within each unit that is saved.

Example 1
In the following example, filter A has a unit size indicated by the separation between the |'s. Filter A saves the last 4 backups.
backups     b b b b   b b       b                 b            b   b  b
filter A   |         |  s      |s        |        s|         |        s|
Example 2
This example shows how different filters work together. Filter A saves the last 4 backups, filter B saves the last 3 and filter C saves the last 2.
backups   b b b b   b   b       b         b     b b   b     b b b b   b
filter A   | | | | | | | | | | | | | | | | | | | | | | | | | |s|s|s| |s|
filter B   |         |         |s        |        s|        s|        s|
filter C   |                    s        |                            s|
------------------------------------------------------------------------
To save                         s                 s         s s s s   s
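
A single filter of this kind can be sketched with sort, awk and tail. This is an illustration under stated assumptions, not Bontmia's actual code: keep_last is a hypothetical name, the input is assumed to be snapshot paths in the YYYY/MM/DD/HH:MM form, and units that contain no backup are skipped rather than counted. Weeks are left out because a week is not a prefix of the path and would need date arithmetic.

```shell
#!/bin/sh
# Illustrative sketch; keep_last is a hypothetical name, not part of Bontmia.
#
# keep_last UNIT COUNT
#   Reads snapshot paths (YYYY/MM/DD/HH:MM, one per line) on stdin and
#   prints the last snapshot of each of the COUNT most recent units that
#   contain a snapshot. UNIT is one of: year month day hour minute.
keep_last() {
    # The unit is identified by a prefix of the path, e.g. "2003/05" for month.
    case $1 in
        year)   len=4 ;;
        month)  len=7 ;;
        day)    len=10 ;;
        hour)   len=13 ;;
        minute) len=16 ;;
        *) echo "unknown unit: $1" >&2; return 1 ;;
    esac
    LC_ALL=C sort | awk -v len="$len" '
        # Sorted input: a later line in the same unit replaces the earlier one.
        {
            key = substr($0, 1, len)
            if (!(key in last)) order[++n] = key
            last[key] = $0
        }
        END { for (i = 1; i <= n; i++) print last[order[i]] }
    ' | tail -n "$2"
}
```

The set of backups to save is then the union of the output of all filters, for example by concatenating the results of several keep_last calls and passing them through sort -u. Every snapshot not in that set is a candidate for removal.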

Short FAQ

Why is this written as a shell script and not in language X?
This program relies heavily on rsync, and it would be pointless to reimplement rsync. A shell script works fine.
How stable is this?
I have been using it at a couple of sites since late April 2003 with minor changes, and it has worked for me, but I do not take any responsibility for loss of data.
When I try to copy the destination of a symbolic link with '/.' I get a long list of error messages from rsync.
rsync does not support that notation, and therefore neither does Bontmia.
How do I avoid typing in the password each time the backup is run?
This is documented in the manual pages of the ssh client and server. On the local host, which is the one you store the backup on, generate a private/public key pair and append the public key to the list of authorized keys on the remote computer. For OpenSSH, this is done by appending the public key file ~/.ssh/id_dsa.pub from the local computer to the ~/.ssh/authorized_keys file on the remote computer. Here, local computer means the computer that stores the backup, and remote computer means the computer being backed up.
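
The append step can be sketched as a small helper. The function name append_key is hypothetical, and the commented commands assume an Ed25519 key, since current OpenSSH releases have DSA keys disabled by default; a key file named id_dsa.pub would be handled the same way. The helper avoids adding the same key twice.

```shell
#!/bin/sh
# Sketch with assumed names; append_key is not part of Bontmia or OpenSSH.

# append_key PUBKEY_FILE AUTHORIZED_KEYS_FILE
#   Append the public key to the authorized_keys file unless that exact
#   line is already present.
append_key() {
    pub=$1
    auth=$2
    mkdir -p "$(dirname "$auth")" && touch "$auth" || return 1
    # grep: -q quiet, -x match whole lines, -F fixed strings, -f patterns from file
    grep -qxF -f "$pub" "$auth" || cat "$pub" >> "$auth"
    chmod 600 "$auth"
}

# Typical manual use (commented out because it needs a real remote host;
# foo.bar.com is the placeholder hostname used elsewhere in this document):
#   ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519
#   ssh foo.bar.com 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys' \
#       < ~/.ssh/id_ed25519.pub
```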

Implementation

It is a single small shell script that uses cp -lR and rsync to maintain incremental backups, plus some additional shell code to keep old backups for a selected time in the same manner as ordinary tape rotation schemes. A file that has not changed since the last backup becomes a hard link to the version in the previous backup, which saves disk space. Making hard links also speeds up the backup: hard-linking a 2GB file is just as fast as hard-linking a 1KB file, so avoiding the transfer of files across the network saves a lot of time when changes are rare.
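
The combination of cp -lR and rsync described here can be sketched as follows. This is a minimal illustration, not the actual Bontmia script: the function name make_snapshot and its arguments are assumptions, and it relies on GNU cp (for -l) and rsync being installed.

```shell
#!/bin/sh
# Sketch only; make_snapshot is an illustrative name, not Bontmia's code.
# Requires GNU cp (for -l) and rsync.

# make_snapshot SRC PREV NEW
#   SRC   source directory (a local path, or host:/path for rsync over ssh)
#   PREV  previous snapshot directory, or "" for the first run
#   NEW   directory to create for this snapshot (must not exist yet)
make_snapshot() {
    src=$1
    prev=$2
    new=$3
    if [ -n "$prev" ] && [ -d "$prev" ]; then
        # Replicate the previous snapshot as hard links; no file data is copied.
        cp -lR "$prev" "$new" || return 1
    else
        mkdir -p "$new" || return 1
    fi
    # Update the replica. By default rsync writes a changed file to a new
    # file and renames it into place, so the previous snapshot keeps its own
    # version. --delete removes files that no longer exist in the source.
    rsync -a --delete "$src/" "$new/"
}
```

After two runs with one changed file in between, the unchanged files in both snapshots share an inode, while the changed file has a separate copy in each snapshot.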

Changes

0.14 (Latest release)
Added feature
0.13
Bug fix
0.12
Some changes in parameters and code rewrite
0.11
This is a clean-up of how bontmia does things.

Get Bontmia

The program itself is a single script and is available for download. To run it you need a POSIX shell like Bash (sh on Solaris does not work; I will make Bontmia work with it when time permits), GNU date, GNU cp, rsync, and ssh for remote access.

Contact

If you have any comments or suggestions, you can send them to john.enok@vollestad.no or visit my webpage.

You might also rate this program by visiting the corresponding FreshMeat page.


Last updated 2004-07-26.