The term Migration, as used in the context of Bacula, means moving data from one Volume to another. In particular it refers to a Job (similar to a backup job) that reads data that was previously backed up to a Volume and writes it to another Volume. As part of this process, the File catalog records associated with the first backup job are purged. In other words, Migration moves Bacula Job data from one Volume to another by reading the Job data from the Volume it is stored on, writing it to a different Volume in a different Pool, and then purging the database records for the first Job.
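The selection process for which Job or Jobs are migrated can be based on a number of different criteria, such as the Client, Volume, or Job name, the oldest or smallest Volume in the Pool, or the result of an SQL query.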
The details of these selection criteria will be defined below.
To run a Migration job, you must first define a Job resource very similar to a Backup Job but with Type = Migrate instead of Type = Backup. One of the key points to remember is that the Pool that is specified for the migration job is the only pool from which jobs will be migrated, with one exception noted below. In addition, the Pool to which the selected Job or Jobs will be migrated is defined by the Next Pool = ... in the Pool resource specified for the Migration Job.
Bacula permits pools to contain Volumes with different Media Types. However, when doing migration, this is a very undesirable condition. For migration to work properly, you should use pools containing only Volumes of the same Media Type for all migration jobs.
The migration job normally is either manually started or starts from a Schedule much like a backup job. It searches for a previous backup Job or Jobs that match the parameters you have specified in the migration Job resource, primarily a Selection Type (detailed a bit later). Then for each previous backup JobId found, the Migration Job will run a new Job which copies the old Job data from the previous Volume to a new Volume in the Migration Pool. It is possible that no prior Jobs are found for migration, in which case the Migration job will simply terminate having done nothing, but normally at a minimum, three jobs are involved during a migration: the currently running Migration control Job, the previous Backup Job (already run), and a new Migration Backup Job that moves the data from the previous Backup Job to the new Volume.
If the Migration control job finds a number of JobIds to migrate (e.g. it is asked to migrate one or more Volumes), it will start one new migration backup job for each JobId found on the specified Volumes. Please note that Migration does not scale well, since Migrations are done on a Job by Job basis. Thus, if you select a very large Volume or a number of Volumes for migration, you may have a large number of Jobs that start. Because each job must read the same Volume, they will run consecutively (not simultaneously).
The following directives can appear in a Director's Job resource, and they are used to define a Migration job.
For the OldestVolume and SmallestVolume, this Selection pattern is not used (ignored).
For the Client, Volume, and Job keywords, this pattern must be a valid regular expression that will filter the appropriate item names found in the Pool.
For the SQLQuery keyword, this pattern must be a valid SELECT SQL statement that returns JobIds.
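As a rough illustration of how these patterns are used, the sketch below shows two hypothetical Migration Jobs: one that uses Selection Type = OldestVolume and therefore sets no Selection Pattern, and one that uses Selection Type = SQLQuery. The Job names are invented, the remaining directives are borrowed from the examples later in this chapter, and the query assumes the usual catalog layout (a Job table with JobId, PoolId, Type and JobStatus columns, and a Pool table with PoolId and Name columns). Check it against your own catalog before relying on it.

```
# Hypothetical job: migrate the Jobs on the oldest Volume in the Pool.
# No Selection Pattern is given because it would be ignored.
Job {
  Name = "migrate-oldest"            # illustrative name
  Type = Migrate
  Level = Full
  Client = rufus-fd
  FileSet = "Full Set"
  Messages = Standard
  Pool = Default
  Selection Type = OldestVolume
}

# Hypothetical job: select the JobIds to migrate with an SQL query.
# Assumes catalog tables Job and Pool as described above; 'B' and 'T'
# are assumed to mark Backup jobs that terminated normally.
Job {
  Name = "migrate-sql"               # illustrative name
  Type = Migrate
  Level = Full
  Client = rufus-fd
  FileSet = "Full Set"
  Messages = Standard
  Pool = Default
  Selection Type = SQLQuery
  Selection Pattern = "SELECT DISTINCT Job.JobId FROM Job, Pool WHERE Pool.Name = 'Default' AND Pool.PoolId = Job.PoolId AND Job.Type = 'B' AND Job.JobStatus = 'T' ORDER BY Job.JobId"
}
```

The query is kept on a single line here; whatever statement you use, it only needs to return the JobIds to be migrated.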
The following directives can appear in a Director's Pool resource, and they are used to define a Migration job.
When you specify a Migration Job, you must specify all the standard directives as for a Job. However, certain directives, such as Level, Client, and FileSet, though they must be defined, are ignored by the Migration job because the values from the original job are used instead.
As an example, suppose you have the following Job that you run every night. Note the following: there is no Storage directive in the Job resource; there is a Storage directive in each of the Pool resources; and the Pool to be migrated (Default, which writes to File storage) contains a Next Pool directive that defines the output Pool (where the data is written by the migration job).
# Define the backup Job
Job {
  Name = "NightlySave"
  Type = Backup
  Level = Incremental                 # default
  Client=rufus-fd
  FileSet="Full Set"
  Schedule = "WeeklyCycle"
  Messages = Standard
  Pool = Default
}

# Default pool definition
Pool {
  Name = Default
  Pool Type = Backup
  AutoPrune = yes
  Recycle = yes
  Next Pool = Tape
  Storage = File
  LabelFormat = "File"
}

# Tape pool definition
Pool {
  Name = Tape
  Pool Type = Backup
  AutoPrune = yes
  Recycle = yes
  Storage = DLTDrive
}

# Definition of File storage device
Storage {
  Name = File
  Address = rufus
  Password = "ccV3lVTsQRsdIUGyab0N4sMDavui2hOBkmpBU0aQKOr9"
  Device = "File"          # same as Device in Storage daemon
  Media Type = File        # same as MediaType in Storage daemon
}

# Definition of DLT tape storage device
Storage {
  Name = DLTDrive
  Address = rufus
  Password = "ccV3lVTsQRsdIUGyab0N4sMDavui2hOBkmpBU0aQKOr9"
  Device = "HP DLT 80"      # same as Device in Storage daemon
  Media Type = DLT8000      # same as MediaType in Storage daemon
}
Here we have included only the essential information; that is, the Director, FileSet, Catalog, Client, Schedule, and Messages resources are omitted.
As you can see, by running the NightlySave Job, the data will be backed up to File storage using the Default pool to specify the Storage as File.
Now, if we add the following Job resource to this conf file.
Job {
  Name = "migrate-volume"
  Type = Migrate
  Level = Full
  Client = rufus-fd
  FileSet = "Full Set"
  Messages = Standard
  Pool = Default
  Maximum Concurrent Jobs = 4
  Selection Type = Volume
  Selection Pattern = "File"
}
and then run the job named migrate-volume, all volumes in the Pool named Default (as specified in the migrate-volume Job) that match the regular expression pattern File will be migrated to tape storage DLTDrive, because the Next Pool in the Default Pool specifies that Migrations should go to the pool named Tape, which uses Storage DLTDrive.
If instead, we use a Job resource as follows:
Job {
  Name = "migrate"
  Type = Migrate
  Level = Full
  Client = rufus-fd
  FileSet="Full Set"
  Messages = Standard
  Pool = Default
  Maximum Concurrent Jobs = 4
  Selection Type = Job
  Selection Pattern = ".*Save"
}
All jobs whose names end in Save will be migrated from the Default Pool to the Tape Pool, that is, from File storage to Tape storage.
Although Recycling and Backing Up to Disk Volume have been discussed in previous chapters, this chapter is meant to give you an overall view of possible backup strategies and to explain their advantages and disadvantages.
Probably the simplest strategy is to back everything up to a single tape and insert a new (or recycled) tape when it fills and Bacula requests a new one.
This system is very simple. When the tape fills and Bacula requests a new tape, you unmount the tape from the Console program, insert a new tape and label it. In most cases after the label, Bacula will automatically mount the tape and resume the backup. Otherwise, you simply mount the tape.
Using this strategy, one typically does a Full backup once a week followed by daily Incremental backups. To minimize the amount of data written to the tape, one can do (as I do) a Full backup once a month on the first Sunday of the month, a Differential backup on the 2nd-5th Sunday of the month, and incremental backups the rest of the week.
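The cycle just described might be written as a Schedule resource along the following lines. This is only a sketch: the resource name and the start time are invented for the example, and the day specifications (1st sun, 2nd-5th sun, mon-sat) should be checked against the Schedule documentation for your Bacula version.

```
Schedule {
  Name = "MonthlyCycle"                          # hypothetical name
  Run = Level=Full 1st sun at 1:05               # first Sunday of the month
  Run = Level=Differential 2nd-5th sun at 1:05   # remaining Sundays
  Run = Level=Incremental mon-sat at 1:05        # rest of the week
}
```

A backup Job would then refer to this resource with Schedule = "MonthlyCycle".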
If you use the strategy presented above, Bacula will ask you to change the tape, and you will unmount it and then remount it when you have inserted the new tape.
If you do not wish to interact with Bacula to change each tape, there are several ways to get Bacula to release the tape:
#!/bin/sh
/full-path/console -c /full-path/console.conf <<END_OF_DATA
release storage=your-storage-name
END_OF_DATA
In this example, you would have AlwaysOpen=yes, but the release command would tell Bacula to rewind the tape and on the next job assume the tape has changed. This strategy may not work on some systems, or on autochangers because Bacula will still keep the drive open.
#!/bin/sh
/full-path/console -c /full-path/console.conf <<END_OF_DATA
unmount storage=your-storage-name
END_OF_DATA
# the following is a shell command
mt eject
/full-path/console -c /full-path/console.conf <<END_OF_DATA
mount storage=your-storage-name
END_OF_DATA
This scheme is quite different from the one mentioned above in that a Full backup is done to a different tape every day of the week. Generally, the backup will cycle continuously through 5 or 6 tapes each week. Variations are to use a different tape each Friday, and possibly at the beginning of the month. Thus if backups are done Monday through Friday only, you need only 5 tapes, and by having two Friday tapes, you need a total of 6 tapes. Many sites run this way, or using modifications of it based on two week cycles or longer.
The simplest way to "force" Bacula to use a different tape each day is to define a different Pool for each day of the week on which a backup is done. In addition, you will need to specify appropriate Job and File retention periods so that Bacula will relabel and overwrite the tape each week rather than appending to it. Nic Bellamy has supplied an actual working model of this, which we include here.
What is important is to create a different Pool for each day of the week, and on the run statement in the Schedule, to specify which Pool is to be used. He has one Schedule that accomplishes this, and a second Schedule that does the same thing for the Catalog backup run each day after the main backup (Priorities were not available when this script was written). In addition, he uses a Max Start Delay of 22 hours so that if the wrong tape is premounted by the operator, the job will be automatically canceled, and the backup cycle will re-synchronize the next day. He has named his Friday Pool WeeklyPool because in that Pool, he wishes to have several tapes to be able to restore to a time older than one week.
And finally, in his Storage daemon's Device resource, he has Automatic Mount = yes and Always Open = No. This is necessary for the tape ejection to work in his end_of_backup.sh script below.
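For reference, the corresponding Device resource in the Storage daemon configuration could look roughly like the sketch below. Only the Automatic Mount and Always Open settings come from the description above; the Name and Media Type are taken from the Storage resource in his bacula-dir.conf, and the Archive Device path and the Removable Media setting are assumptions made for illustration.

```
Device {
  Name = Tandberg               # matches Device in the Director's Storage resource
  Media Type = MLR1             # matches Media Type in the Director's Storage resource
  Archive Device = /dev/nst0    # assumed device path
  Removable Media = yes         # assumed
  Automatic Mount = yes
  Always Open = no
}
```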
For example, his bacula-dir.conf file looks like the following:
# /etc/bacula/bacula-dir.conf
#
# Bacula Director Configuration file
#
Director {
  Name = ServerName
  DIRport = 9101
  QueryFile = "/etc/bacula/query.sql"
  WorkingDirectory = "/var/lib/bacula"
  PidDirectory = "/var/run"
  SubSysDirectory = "/var/lock/subsys"
  Maximum Concurrent Jobs = 1
  Password = "console-pass"
  Messages = Standard
}
#
# Define the main nightly save backup job
#
Job {
  Name = "NightlySave"
  Type = Backup
  Client = ServerName
  FileSet = "Full Set"
  Schedule = "WeeklyCycle"
  Storage = Tape
  Messages = Standard
  Pool = Default
  Write Bootstrap = "/var/lib/bacula/NightlySave.bsr"
  Max Start Delay = 22h
}
# Backup the catalog database (after the nightly save)
Job {
  Name = "BackupCatalog"
  Type = Backup
  Client = ServerName
  FileSet = "Catalog"
  Schedule = "WeeklyCycleAfterBackup"
  Storage = Tape
  Messages = Standard
  Pool = Default
  # This creates an ASCII copy of the catalog
  RunBeforeJob = "/usr/lib/bacula/make_catalog_backup -u bacula"
  # This deletes the copy of the catalog, and ejects the tape
  RunAfterJob = "/etc/bacula/end_of_backup.sh"
  Write Bootstrap = "/var/lib/bacula/BackupCatalog.bsr"
  Max Start Delay = 22h
}
# Standard Restore template, changed by Console program
Job {
  Name = "RestoreFiles"
  Type = Restore
  Client = ServerName
  FileSet = "Full Set"
  Storage = Tape
  Messages = Standard
  Pool = Default
  Where = /tmp/bacula-restores
}
# List of files to be backed up
FileSet {
  Name = "Full Set"
  Include = signature=MD5 {
    /
    /data
  }
  Exclude = {
    /proc
    /tmp
    /.journal
  }
}
#
# When to do the backups
#
Schedule {
  Name = "WeeklyCycle"
  Run = Level=Full Pool=MondayPool Monday at 8:00pm
  Run = Level=Full Pool=TuesdayPool Tuesday at 8:00pm
  Run = Level=Full Pool=WednesdayPool Wednesday at 8:00pm
  Run = Level=Full Pool=ThursdayPool Thursday at 8:00pm
  Run = Level=Full Pool=WeeklyPool Friday at 8:00pm
}
# This does the catalog. It starts after the WeeklyCycle
Schedule {
  Name = "WeeklyCycleAfterBackup"
  Run = Level=Full Pool=MondayPool Monday at 8:15pm
  Run = Level=Full Pool=TuesdayPool Tuesday at 8:15pm
  Run = Level=Full Pool=WednesdayPool Wednesday at 8:15pm
  Run = Level=Full Pool=ThursdayPool Thursday at 8:15pm
  Run = Level=Full Pool=WeeklyPool Friday at 8:15pm
}
# This is the backup of the catalog
FileSet {
  Name = "Catalog"
  Include = signature=MD5 {
    /var/lib/bacula/bacula.sql
  }
}
# Client (File Services) to backup
Client {
  Name = ServerName
  Address = dionysus
  FDPort = 9102
  Catalog = MyCatalog
  Password = "client-pass"
  File Retention = 30d
  Job Retention = 30d
  AutoPrune = yes
}
# Definition of file storage device
Storage {
  Name = Tape
  Address = dionysus
  SDPort = 9103
  Password = "storage-pass"
  Device = Tandberg
  Media Type = MLR1
}
# Generic catalog service
Catalog {
  Name = MyCatalog
  dbname = bacula; user = bacula; password = ""
}
# Reasonable message delivery -- send almost all to email address
# and to the console
Messages {
  Name = Standard
  mailcommand = "/usr/sbin/bsmtp -h localhost -f \"\(Bacula\) %r\"
     -s \"Bacula: %t %e of %c %l\" %r"
  operatorcommand = "/usr/sbin/bsmtp -h localhost -f \"\(Bacula\) %r\"
     -s \"Bacula: Intervention needed for %j\" %r"
  mail = root@localhost = all, !skipped
  operator = root@localhost = mount
  console = all, !skipped, !saved
  append = "/var/lib/bacula/log" = all, !skipped
}
# Pool definitions
#
# Default Pool for jobs, but will hold no actual volumes
Pool {
  Name = Default
  Pool Type = Backup
}
Pool {
  Name = MondayPool
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 6d
  Accept Any Volume = yes
  Maximum Volume Jobs = 2
}
Pool {
  Name = TuesdayPool
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 6d
  Accept Any Volume = yes
  Maximum Volume Jobs = 2
}
Pool {
  Name = WednesdayPool
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 6d
  Accept Any Volume = yes
  Maximum Volume Jobs = 2
}
Pool {
  Name = ThursdayPool
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 6d
  Accept Any Volume = yes
  Maximum Volume Jobs = 2
}
Pool {
  Name = WeeklyPool
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 12d
  Accept Any Volume = yes
  Maximum Volume Jobs = 2
}
# EOF
Note, the mailcommand and operatorcommand should be on a single line each. They were split to preserve the proper page width. In order to get Bacula to release the tape after the nightly backup, he uses a RunAfterJob script that deletes the ASCII copy of the database backup and then rewinds and ejects the tape. The following is a copy of end_of_backup.sh:
#! /bin/sh
/usr/lib/bacula/delete_catalog_backup
mt rewind
mt eject
exit 0
Finally, if you list his Volumes, you get something like the following:
*list media
Using default Catalog name=MyCatalog DB=bacula
Pool: WeeklyPool
+-----+-----------+-------+--------+-----------+-----------------+-------+------+
| MeId| VolumeName| MedTyp| VolStat| VolBytes  | LastWritten     | VolRet| Recyc|
+-----+-----------+-------+--------+-----------+-----------------+-------+------+
| 5   | Friday_1  | MLR1  | Used   | 2157171998| 2003-07-11 20:20| 103680| 1    |
| 6   | Friday_2  | MLR1  | Append | 0         | 0               | 103680| 1    |
+-----+-----------+-------+--------+-----------+-----------------+-------+------+
Pool: MondayPool
+-----+-----------+-------+--------+-----------+-----------------+-------+------+
| MeId| VolumeName| MedTyp| VolStat| VolBytes  | LastWritten     | VolRet| Recyc|
+-----+-----------+-------+--------+-----------+-----------------+-------+------+
| 2   | Monday    | MLR1  | Used   | 2260942092| 2003-07-14 20:20| 518400| 1    |
+-----+-----------+-------+--------+-----------+-----------------+-------+------+
Pool: TuesdayPool
+-----+-----------+-------+--------+-----------+-----------------+-------+------+
| MeId| VolumeName| MedTyp| VolStat| VolBytes  | LastWritten     | VolRet| Recyc|
+-----+-----------+-------+--------+-----------+-----------------+-------+------+
| 3   | Tuesday   | MLR1  | Used   | 2268180300| 2003-07-15 20:20| 518400| 1    |
+-----+-----------+-------+--------+-----------+-----------------+-------+------+
Pool: WednesdayPool
+-----+-----------+-------+--------+-----------+-----------------+-------+------+
| MeId| VolumeName| MedTyp| VolStat| VolBytes  | LastWritten     | VolRet| Recyc|
+-----+-----------+-------+--------+-----------+-----------------+-------+------+
| 4   | Wednesday | MLR1  | Used   | 2138871127| 2003-07-09 20:2 | 518400| 1    |
+-----+-----------+-------+--------+-----------+-----------------+-------+------+
Pool: ThursdayPool
+-----+-----------+-------+--------+-----------+-----------------+-------+------+
| MeId| VolumeName| MedTyp| VolStat| VolBytes  | LastWritten     | VolRet| Recyc|
+-----+-----------+-------+--------+-----------+-----------------+-------+------+
| 1   | Thursday  | MLR1  | Used   | 2146276461| 2003-07-10 20:50| 518400| 1    |
+-----+-----------+-------+--------+-----------+-----------------+-------+------+
Pool: Default
No results to list.
Note, I have truncated a number of the columns so that the information fits on the width of a page.
Kern Sibbald 2008-01-31