The Job

Definition

The Job is the basic unit in Bacula and is run by the Director. It ties together the following items, as sketched in the example after this list:
  • Who: The Client, the machine to backup
  • What: The File Set, which files to backup and not to backup
  • Where:
    • Storage: what physical device to backup to
    • Pool: which set of Volumes to use
    • Catalog: where to keep track of files
  • When: Schedule: when to run the Job
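For illustration, a minimal Job resource tying these items together might look like the following sketch; all resource names (client-fd, FullSet, File1, Default, s-Daily) are hypothetical placeholders:

  Job {
    Name = "BackupClient"
    Type = Backup
    Client = client-fd      # Who: the machine to back up
    FileSet = "FullSet"     # What: files to include and exclude
    Storage = File1         # Where: the physical device to back up to
    Pool = Default          # Where: the set of Volumes to use
    Messages = Standard
    Schedule = s-Daily      # When: when to run the Job
  }

The Catalog (where files are tracked) is usually bound through the Client resource rather than the Job itself.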

Types

There are several types of jobs in Bacula:
  • Backup
  • Restore
  • Admin
  • Verify
  • Copy
  • Migration
  • Archive
  • Console connection
  • Internal system job

Backup Job Description

When running a backup job:
  • the Bacula Director connects to the File Daemon and the Storage Daemon;
  • the Director then sends the File Daemon the information needed to actually perform the backup job, such as the Level, the File Set, the Storage Daemon to contact, etc.;
  • the File Daemon then contacts the Storage Daemon, identifies the data to be sent, and sends it, along with the related metadata (file attributes), to the Storage Daemon;
  • the Storage Daemon writes the data to one or more Bacula Volumes, according to the Pool resource chosen for the Job;
  • the Storage Daemon sends metadata back to the Director.
When Bacula runs a Backup Job depends on how the job is started:
  • manually through a Console, e.g. with the bconsole command run job=name-of-the-job
  • automatically by a defined and referenced Schedule
  • automatically by an external script or command such as
    bconsole -c bconsole.conf << EOF
    run job=name-of-the-job ...
    EOF
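When started manually, the run command also accepts modifiers that override the Job definition for that single execution. A small sketch, assuming a job named name-of-the-job; the level and priority values are arbitrary examples:

  # inside bconsole: override the level and priority for this run only;
  # the trailing "yes" skips the interactive confirmation prompt
  run job=name-of-the-job level=Full priority=10 yes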

Admin Jobs

An Admin Job is a job that does not back up or move any data, but can be used to execute scripts through the Run Script resource and its Console or Command directives. The following definition will launch the script /opt/bacula/scripts/my-script.pl, located on the Director machine, according to the Schedule AdminSchedule:
  Job {
    Type = Admin
    Run Script {
      Runs When = Before
      Runs on Client = No
      Command = "/opt/bacula/scripts/my-script.pl"
    }
    Schedule = AdminSchedule
    ...
  }
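The AdminSchedule referenced above must be defined as its own Schedule resource. A minimal sketch, with an assumed timing:

  Schedule {
    Name = AdminSchedule
    Run = daily at 06:30    # assumed timing, adjust as needed
  }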

Migration, Copy and Virtual Full Jobs

Bacula provides three kinds of Jobs that do not actually back up data, but move data around. They do very similar things:
  • Migration Jobs migrate job data from one volume to another. This is usually interesting when implementing a Disk-to-Disk-to-Tape strategy or, more generally, any sort of multi-tiered backup storage system.
  • Copy Jobs copy a Job's data to another volume. This is usually used when the requirements include keeping backup data in at least two places.
  • Virtual Full Jobs provide a way to consolidate several jobs into one, without requesting any data from the client, based only on existing backup data.
All these kinds of Jobs rely on a Next Pool directive, as sketched below.
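A minimal sketch of a Pool resource carrying the Next Pool directive; both pool names and the storage resource name are hypothetical. Migration, Copy and Virtual Full Jobs reading from this pool write their output to the pool named in Next Pool:

  Pool {
    Name = PoolDisk
    Pool Type = Backup
    Storage = File1        # assumed disk storage resource
    Next Pool = PoolTape   # destination pool for Migration/Copy/Virtual Full data
  }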

Virtual Full Considerations

Virtual Full backups are a way to run backups in an incremental-forever style while still providing Full backups.

When backing up data at the Incremental or Differential levels, Bacula (by default) does not do anything about removed or moved files or directories, which implies that the result of a restore could differ from the latest state of the machine. For that reason, we advise using Accurate mode, which is enabled by the directive of the same name. When it is set to yes in a Job, Bacula will record removed or missing files and directories and, depending on additional configuration, will also consider more criteria than just timestamps to determine whether a file needs to be backed up. In this case, Bacula will restore the machine to the exact state (from a backup content point of view) it was in at the time of the backup.
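A minimal sketch of enabling Accurate mode at the Job level; the job name is hypothetical and the remaining directives are elided:

  Job {
    Name = "j-accurate"
    Accurate = yes    # record removed/missing files; compare more than timestamps
    ...
  }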

Administrators will understand that Accurate mode takes additional resources and time when running backups.

To limit this cost when using Virtual Full backups, you may enable the Accurate directive only for the last Incremental before the Virtual Full itself. Activating it for every Incremental backup would give an even more faithful result, but could increase the backup time.

#
# A typical job definition
Job {
  Name = "j-bacula"
  JobDefs = "DefaultJob"
  FileSet = "BaculaFileSet"
  Client = client-fd
  Schedule = s-Data2Disk
  Max Full interval = 19 days
  Run Script {
      Runs When = after
      Runs on Client = no
      Runs on Failure = yes
      Command = "/opt/bacula/scripts/run_copyjob.pl %l %i %b t-rdx j-copy-full"
      # Launching the copy job as soon as the backup is done
      # %l is job level
      # %i is job id
      # %b is job bytes
      # j-copy-full is the job name (see below)
      # The script "run_copyjob.pl" issues a shell command like
      # bconsole -c bconsole.conf << END_OF_DATA
      #          run jobid=%i job=j-copy-full storage=t-rdx yes
      #          quit
      # END_OF_DATA
  }
}
And here is the Copy job that copies the data from disk to RDX:
Job {
    Name = "j-copy-full"
    Type = Copy
    Level = Full
    Client = client-fd
    File Set = "Empty Set"
    Messages = Standard
    Pool = PoolVFull
    Maximum Concurrent Jobs = 1
}
And here is the Schedule used to run the several levels:
  Schedule {
    Name = s-Data2Disk
    Run = Level=incremental monday-thursday,saturday at 21:00
    Run = Level=incremental accurate=yes friday at 12:30
    Run = Level=VirtualFull priority=15 friday at 12:35
  }

Pruning

By default, pruning occurs at the end of a Job. When doing tests, or if you only run a few jobs a day, this can be fine. But as the number of jobs grows, you may prefer to manage pruning differently, so that your Catalog spends its resources serving the backup jobs rather than database administrative tasks.

In such a situation, you will configure your clients not to activate the pruning algorithm, using the AutoPrune directive as in the following example:

  Client {
    Name = client-fd
    Address = bacula.example.com
    FDPort = 9102
    Catalog = Catalog
    Password = "do-you-want-a-very-strong-password?"
    File Retention = 15 days
    Job Retention = 50 days
    AutoPrune = no
    # No automatic pruning at the end of a job for this client
  }
If you don't do anything more, your Catalog will grow indefinitely. To keep it in good shape, you should define an Admin job like the following:
  Job {
     Name = "admin-manual-pruning"
     Type = Admin
     JobDefs = "DefaultJob"
     RunScript {
        Runs When = Before
        # below command relies on proper PATH!
        Command = "/bin/sh -c \"echo prune expired volume yes | bconsole\""
        Runs On Client = no
     }
     Schedule = s-Prune
  }
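The s-Prune Schedule referenced above must also exist as its own resource. A minimal sketch, with an assumed timing chosen outside the backup window:

  Schedule {
    Name = s-Prune
    Run = daily at 05:00    # assumed timing, outside the backup window
  }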
As a Bacula Volume can contain one or more jobs (or parts of jobs), and a job contains one or more files, the pruning process has side effects:
  • when pruning a Volume, all the jobs related to that Volume are pruned
  • when pruning a Job, all the files related to that job are pruned
That is why you should maintain the following inequality:

File Retention ≤ Job Retention ≤ Volume Retention
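To illustrate, with the Client above keeping File Retention = 15 days and Job Retention = 50 days, the Pool's Volume Retention should be at least as long. A hedged sketch, reusing the PoolVFull name from the Copy job example; the retention value and other directives are assumptions:

  Pool {
    Name = PoolVFull
    Pool Type = Backup
    Volume Retention = 60 days    # >= Job Retention (50 days) >= File Retention (15 days)
  }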