Bacula Enterprise Edition Documentation

New Features in 7.2.0

This chapter presents the new features that have been added to the various versions of Bacula.

In various places such as RunScripts, you now have access to %E to get the number of non-fatal errors for the current Job and %R to get the number of bytes read from disk or from the network during a job.

Enable/Disable commands

The bconsole enable and disable commands have been extended from enabling/disabling Jobs to include Clients, Schedules, and Storage devices. Examples:

   disable Job=NightlyBackup Client=Windows-fd

will disable the Job named NightlyBackup as well as the client named Windows-fd.

   disable Storage=LTO-changer Drive=1

will disable the first drive in the autochanger named LTO-changer.

Please note that doing a reload command will set any values changed by the enable/disable commands back to the values in the bacula-dir.conf file.

The Client and Schedule resources in the bacula-dir.conf file now permit the directive Enable = yes or Enable = no.

Bacula 7.2

Snapshot Management

Bacula 7.2 is now able to handle Snapshots on Linux/Unix systems. Snapshots can be automatically created and used to back up files. It is also possible to manage Snapshots from Bacula's bconsole tool through a single interface.

Snapshot Backends

The following Snapshot backends are supported:

BTRFS
ZFS
LVM (note: some restrictions, described in the LVM Backend Restrictions section, apply to the LVM backend)

By default, Snapshots are mounted (or directly available) under the .snapshots directory on the root filesystem. (On ZFS, the default is .zfs/snapshots.)

The Snapshot backend program is called bsnapshot and is available in the bacula-enterprise-snapshot package. In order to use the Snapshot Management feature, the package must be installed on the Client.

The bsnapshot program can be configured using the /opt/bacula/etc/bsnapshot.conf file. The following parameters can be adjusted in the configuration file:

trace=<file> Specify a trace file
debug=<num> Specify a debug level
sudo=<yes/no> Use sudo to run commands
disabled=<yes/no> Disable snapshot support
retry=<num> Configure the number of retries for some operations
snapshot_dir=<dirname> Use a custom name for the Snapshot directory. (.SNAPSHOT, .snapdir, etc...)
lvm_snapshot_size=<lvpath:size> Specify a custom snapshot size for a given LVM volume

# cat /opt/bacula/etc/bsnapshot.conf
trace=/tmp/snap.log
debug=10
lvm_snapshot_size=/dev/ubuntu-vg/root:5%

Application Quiescing

When using Snapshots, it is very important to quiesce applications that are running on the system. The simplest way to quiesce an application is to stop it. Usually, taking the Snapshot is very fast, and the downtime is only a couple of seconds. If downtime is not possible and/or the application provides a way to quiesce, a more advanced script can be used. An example RunScript is shown in the New Director Directives section.

New Director Directives

The use of the Snapshot Engine on the FileDaemon is determined by the new Enable Snapshot FileSet directive. The default is no.

FileSet {
  Name = LinuxHome

  Enable Snapshot = yes

  Include {
    Options = { Compression = LZO }
    File = /home
  }
}

By default, Snapshots are deleted from the Client at the end of the backup. To keep Snapshots on the Client and record them in the Catalog for a determined period, it is possible to use the Snapshot Retention directive in the Client or in the Job resource. The default value is 0 seconds. If, for a given Job, both Client and Job Snapshot Retention directives are set, the Job directive will be used.

Client {
   Name = linux1
   ...

   Snapshot Retention = 5 days
}
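The precedence rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text, not Bacula's implementation; the function name and the use of None to mean "directive not present in that resource" are assumptions of this sketch.

```python
from typing import Optional

def effective_snapshot_retention(job_retention: Optional[int],
                                 client_retention: Optional[int]) -> int:
    """Return the Snapshot retention, in seconds, that applies to a Job.

    None stands for "directive absent" (an assumption of this sketch).
    The documented default is 0 seconds.
    """
    if job_retention is not None:
        return job_retention          # the Job directive wins when both are set
    if client_retention is not None:
        return client_retention
    return 0

FIVE_DAYS = 5 * 24 * 3600             # "Snapshot Retention = 5 days" in seconds
print(effective_snapshot_retention(None, FIVE_DAYS))   # Client value applies
print(effective_snapshot_retention(3600, FIVE_DAYS))   # Job value applies
```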

To automatically prune Snapshots, it is possible to use the following RunScript command:

Job {
   ...
   Client = linux1
   ...
   RunScript {
      RunsOnClient = no
      Console = "prune snapshot client=%c yes"
      RunsAfter = yes
   }
}

In RunScripts, the AfterSnapshot keyword for the RunsWhen directive will allow a command to be run just after the Snapshot creation. AfterSnapshot is a synonym for the AfterVSS keyword.

Job {
 ...
  RunScript {
    Command = "/etc/init.d/mysql start"
    RunsWhen = AfterSnapshot
    RunsOnClient = yes
  }
  RunScript {
    Command = "/etc/init.d/mysql stop"
    RunsWhen = Before
    RunsOnClient = yes
  }
}

Job Output Information

Information about Snapshots is displayed in the Job output. The list of all devices used by the Snapshot Engine is displayed, and the Job summary indicates if Snapshots were available.

JobId 3:    Create Snapshot of /home/build
JobId 3:    Create Snapshot of /home/build/subvol
JobId 3:    Delete snapshot of /home/build
JobId 3:    Delete snapshot of /home/build/subvol
...
JobId 3: Bacula 127.0.0.1-dir 7.2.0 (23Jul15):
  Build OS:               x86_64-unknown-linux-gnu archlinux 
  JobId:                  3
  Job:                    Incremental.2015-02-24_11.20.27_08
  Backup Level:           Full
...
  Snapshot/VSS:           yes
...
  Termination:            Backup OK

New “snapshot” Bconsole Commands

The new snapshot command will display by default the following menu:

*snapshot
Snapshot choice:
     1: List snapshots in Catalog
     2: List snapshots on Client
     3: Prune snapshots
     4: Delete snapshot
     5: Update snapshot parameters
     6: Update catalog with Client snapshots
     7: Done
Select action to perform on Snapshot Engine (1-7):

The snapshot command can also have the following parameters:

[client=<client-name> | job=<job-name> | jobid=<jobid>]
 [delete | list | listclient | prune | sync | update]

It is also possible to use the traditional list, llist, update, prune or delete commands on Snapshots.

*llist snapshot jobid=5
 snapshotid: 1
       name: NightlySave.2015-02-24_12.01.00_04
 createdate: 2015-02-24 12:01:03
     client: 127.0.0.1-fd
    fileset: Full Set
      jobid: 5
     volume: /home/.snapshots/NightlySave.2015-02-24_12.01.00_04
     device: /home/btrfs
       type: btrfs
  retention: 30
    comment:

* snapshot listclient
Automatically selected Client: 127.0.0.1-fd
Connecting to Client 127.0.0.1-fd at 127.0.0.1:8102
Snapshot      NightlySave.2015-02-24_12.01.00_04:
  Volume:     /home/.snapshots/NightlySave.2015-02-24_12.01.00_04
  Device:     /home
  CreateDate: 2015-02-24 12:01:03
  Type:       btrfs
  Status:     OK
  Error:

With the Update catalog with Client snapshots option (or snapshot sync), the Director contacts the FileDaemon, lists snapshots of the system and creates catalog records of the Snapshots.

*snapshot sync
Automatically selected Client: 127.0.0.1-fd
Connecting to Client 127.0.0.1-fd at 127.0.0.1:8102
Snapshot      NightlySave.2015-02-24_12.35.47_06:
  Volume:     /home/.snapshots/NightlySave.2015-02-24_12.35.47_06
  Device:     /home
  CreateDate: 2015-02-24 12:35:47
  Type:       btrfs
  Status:     OK
  Error:
Snapshot added in Catalog

*llist snapshot
 snapshotid: 13
       name: NightlySave.2015-02-24_12.35.47_06
 createdate: 2015-02-24 12:35:47
     client: 127.0.0.1-fd
    fileset:
      jobid: 0
     volume: /home/.snapshots/NightlySave.2015-02-24_12.35.47_06
     device: /home
       type: btrfs
  retention: 0
    comment:

LVM Backend Restrictions

LVM Snapshots are quite primitive compared to ZFS, BTRFS, NetApp and other systems. For example, it is not possible to use Snapshots if the Volume Group (VG) is full. The administrator must keep some free space in the VG to create Snapshots. The amount of free space required depends on the activity of the Logical Volume (LV). bsnapshot uses 10% of the LV by default. This number can be configured per LV in the bsnapshot.conf file.

[root@system1]# vgdisplay
  --- Volume group ---
  VG Name               vg_ssd
  System ID
  Format                lvm2
...
  VG Size               29,81 GiB
  PE Size               4,00 MiB
  Total PE              7632
  Alloc PE / Size       125 / 500,00 MiB
  Free  PE / Size       7507 / 29,32 GiB
...

It is also not advisable to leave snapshots on the LVM backend. Having multiple snapshots of the same LV on LVM will slow down the system.
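The free-space requirement can be checked with simple arithmetic. The following Python sketch uses the figures from the vgdisplay output above (7507 free extents of 4 MiB) and the documented default of 10% of the LV. The function name and the 500 MiB LV size are hypothetical, and the sketch ignores LVM's rounding to whole extents.

```python
def snapshot_fits(lv_size_mib: float, free_pe: int, pe_size_mib: float,
                  percent: float = 10.0) -> bool:
    """True if a snapshot sized at `percent` of the LV fits in the VG's free extents."""
    needed_mib = lv_size_mib * percent / 100.0
    free_mib = free_pe * pe_size_mib
    return needed_mib <= free_mib

# Free space reported by vgdisplay: 7507 extents of 4 MiB each.
free_mib = 7507 * 4
print(free_mib)                        # 30028 MiB, about 29.32 GiB

# Hypothetical 500 MiB LV: the default 10% snapshot needs 50 MiB.
print(snapshot_fits(500, 7507, 4))     # True
```

A result of False would mean that the VG has to be extended, or a smaller lvm_snapshot_size configured, before a Snapshot can be created.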

Debug Options

To get low level information about the Snapshot Engine, the debug tag “snapshot” should be used in the setdebug command.

* setdebug level=10 tags=snapshot client
* setdebug level=10 tags=snapshot dir

Minor Enhancements

Storage Daemon Reports Disk Usage

The status storage command now reports the space available on disk devices:

...
Device status:

Device file: "FileStorage" (/bacula/arch1) is not open.
    Available Space=5.762 GB
==

Device file: "FileStorage1" (/bacula/arch2) is not open.
    Available Space=5.862 GB

Data Encryption Cipher Configuration

Bacula Enterprise version 8.0 and later now allows configuration of the data encryption cipher and the digest algorithm. Previously, the cipher was forced to AES 128, but it is now possible to choose between the following ciphers:

AES128 (default)
AES192
AES256
blowfish

The digest algorithm was set to SHA1 or SHA256 depending on the local OpenSSL options. We advise you not to modify the PkiDigest default setting. Please refer to the OpenSSL documentation to understand the pros and cons regarding these options.

  FileDaemon {
    ...
    PkiCipher = AES256
  }

New Option Letter “M” for Accurate Directive in FileSet

Version 8.0.5 adds the new “M” option letter for the Accurate directive in the FileSet Options block, which allows comparing the modification time and/or creation time against the last backup timestamp. This is in contrast to the existing option letters “m” and/or “c” (mtime and ctime), which are checked against the stored catalog values, which can vary across different machines when using the BaseJob feature.

The advantage of the new “M” option letter for Jobs that refer to BaseJobs is that it will instruct Bacula to back up files based on the last backup time, which is more useful because the mtime/ctime timestamps may differ on various Clients, causing files to be needlessly backed up.

  Job {
    Name = USR
    Level = Base
    FileSet = BaseFS
...
  }

  Job {
    Name = Full
    FileSet = FullFS
    Base = USR
...
  }

  FileSet {
    Name = BaseFS
    Include {
      Options {
        Signature = MD5
      }
      File = /usr
    }
  }

  FileSet {
    Name = FullFS
    Include {
      Options {
        Accurate = Ms      # check for mtime/ctime of last backup timestamp and Size
        Signature = MD5
      }
      File = /home
      File = /usr
    }
  }
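The idea behind the “M” option can be illustrated with a short Python sketch: a file is selected when its mtime or ctime is later than the time of the last backup, rather than when it differs from values stored in the catalog. This is a simplified illustration and not the File Daemon's code; the function name is hypothetical.

```python
import os
import tempfile
import time

def changed_since_last_backup(path: str, last_backup_ts: float) -> bool:
    """Compare the file's mtime/ctime with the last backup time (epoch seconds)."""
    st = os.stat(path)
    # st_ctime is the inode change time on Unix systems.
    return st.st_mtime > last_backup_ts or st.st_ctime > last_backup_ts

with tempfile.NamedTemporaryFile() as fh:
    hour_ago = time.time() - 3600
    # True: the file was created after that backup time.
    print(changed_since_last_backup(fh.name, hour_ago))
```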

Read Only Storage Devices

This version of Bacula allows you to define a Storage daemon device to be read-only. If the Read Only directive is specified and enabled, the drive can only be used for read operations. The Read Only directive can be defined in any bacula-sd.conf Device resource, and is most useful for reserving one or more drives for restores. An example is:

Read Only = yes

New Resume Command

The new resume command does exactly the same thing as a restart command, but for some users the name may be more logical because in general the restart command is used to resume running a Job that was incomplete.

New Prune “Expired” Volume Command

In Bacula Enterprise 6.4, it is now possible to prune all volumes (from a pool, or globally) that are “expired”. This option can be scheduled after or before the backup of the catalog and can be combined with the Truncate On Purge option. The prune expired volume command may be used instead of the manual_prune.pl script.

* prune expired volume

* prune expired volume pool=FullPool

To schedule this option automatically, it can be added to the Catalog backup job definition.

 Job {
   Name = CatalogBackup
   ...
   RunScript {
     Console = "prune expired volume yes"
     RunsWhen = Before
   }
 }

New Job Edit Codes %P %C

In various places such as RunScripts, you now have access to %P to get the current Bacula process ID (PID) and %C to know if the current job is a cloned job.

Enhanced Status and Error Messages

We have enhanced the Storage daemon status output to be more readable. This is important when there are a large number of devices. In addition to formatting changes, it also includes more details on which devices are reading and writing.

A number of error messages have been enhanced to have more specific data on what went wrong.

If a file changes size while being backed up, the old and new sizes are reported.

Miscellaneous New Features

Allow unlimited line lengths in .conf files (previously limited to 2000 characters).
Allow /dev/null in ChangerCommand to indicate a Virtual Autochanger.
Add a -fileprune option to the manual_prune.pl script.
Add a -m option to make_catalog_backup.pl to do maintenance on the catalog.
Safer code that cleans up the working directory when starting the daemons. It limits what files can be deleted, hence enhances security.
Added a new .ls command in bconsole to permit browsing a client's filesystem.
Fixed a number of bugs, including some obscure seg faults, and a race condition that occurred infrequently when running Copy, Migration, or Virtual Full backups.
Upgraded to a newer version of Qt4 for bat. All indications are that this will improve bat's stability on Windows machines.
The Windows installers now detect and refuse to install on an OS that does not match the 32/64 bit value of the installer.

FD Storage Address

When the Director is behind NAT or across a WAN, it connects to the StorageDaemon using an “external” IP address, while the FileDaemon should use an “internal” IP address to contact the StorageDaemon.

The normal way to handle this situation is to use a canonical name such as “storage-server” that will be resolved on the Director side as the WAN address and on the Client side as the LAN address. It is now possible to configure this parameter using the new directive FDStorageAddress in the Storage or Client resource.

Storage {
     Name = storage1
     Address = 65.1.1.1
     FD Storage Address = 10.0.0.1
     SD Port = 9103
     ...
}

 Client {
      Name = client1
      Address = 65.1.1.2
      FD Storage Address = 10.0.0.1
      FD Port = 9102
      ...
 }

Note that using the Client FDStorageAddress directive prevents the use of multiple Storage Daemons: all Backup or Restore requests will be sent to the specified FDStorageAddress.
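The address selection can be pictured with the following Python sketch. It is an interpretation of the text rather than Bacula's code: the function name is hypothetical, and the precedence of the Client directive over the Storage directive is an assumption drawn from the note above.

```python
from typing import Optional

def storage_address_for_fd(storage_address: str,
                           storage_fd_address: Optional[str] = None,
                           client_fd_address: Optional[str] = None) -> str:
    """Pick the address the FileDaemon uses to contact the StorageDaemon."""
    if client_fd_address:        # assumed: the Client directive overrides everything
        return client_fd_address
    if storage_fd_address:       # FD Storage Address set in the Storage resource
        return storage_fd_address
    return storage_address       # fall back to the normal Address directive

# Values taken from the Storage example: external 65.1.1.1, internal 10.0.0.1.
print(storage_address_for_fd("65.1.1.1", storage_fd_address="10.0.0.1"))
print(storage_address_for_fd("65.1.1.1"))
```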

Maximum Concurrent Read Jobs

This is a new directive that can be used in the bacula-dir.conf file in the Storage resource. The main purpose is to limit the number of concurrent Copy, Migration, and VirtualFull jobs so that they don't monopolize all the Storage drives causing a deadlock situation where all the drives are allocated for reading but none remain for writing. This deadlock situation can occur when running multiple simultaneous Copy, Migration, and VirtualFull jobs.

The default value is set to 0 (zero), which means there is no limit on the number of read jobs. Note that limiting the read jobs does not apply to Restore jobs, which are normally started by hand. A reasonable value for this directive is one half the number of drives that the Storage resource has, rounded down. Doing so will leave the same number of drives for writing and will generally avoid overcommitting drives and a deadlock.
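As a quick illustration of the suggested value, the following Python sketch computes half the number of drives, rounded down, and shows how many drives remain for writing. The function name is hypothetical.

```python
def suggested_max_read_jobs(drive_count: int) -> int:
    """Half the number of drives, rounded down."""
    return drive_count // 2

for drives in (2, 4, 5):
    reads = suggested_max_read_jobs(drives)
    # Columns: drives, suggested read limit, drives left for writing.
    print(drives, reads, drives - reads)
```

Note that for a single drive the formula yields 0, which the directive interprets as "no limit", so the rule of thumb is only meaningful for Storage resources with two or more drives.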

Incomplete Jobs

During a backup, if the Storage daemon loses its connection to the File daemon (normally a comm line problem or possibly an FD failure), then under conditions that the SD determines to be safe it will mark the failed job as Incomplete rather than Failed. This is done only if sufficient valid backup data was written to the Volume. The advantage of an Incomplete job is that it can be restarted by the new bconsole restart command from the point where it left off rather than from the beginning of the job, as is the case with a cancel.

The Stop Command

Bacula has been enhanced to provide a stop command, very similar to the cancel command, with the main difference that the Job that is stopped is marked as Incomplete so that it can be restarted later by the restart command from where it left off (see below). The stop command with no arguments will, like the cancel command, prompt you with the list of running jobs, allowing you to select one, which might look like the following:

*stop
Select Job:
     1: JobId=3 Job=Incremental.2012-03-26_12.04.26_07
     2: JobId=4 Job=Incremental.2012-03-26_12.04.30_08
     3: JobId=5 Job=Incremental.2012-03-26_12.04.36_09
Choose Job to stop (1-3): 2
2001 Job "Incremental.2012-03-26_12.04.30_08" marked to be stopped.
3000 JobId=4 Job="Incremental.2012-03-26_12.04.30_08" marked to be stopped.

The Restart Command

The new Restart command allows console users to restart a canceled, failed, or incomplete Job. For canceled and failed Jobs, the Job will restart from the beginning. For incomplete Jobs the Job will restart at the point that it was stopped either by a stop command or by some recoverable failure.

If you enter the restart command in bconsole, you will get the following prompts:

*restart
You have the following choices:
     1: Incomplete
     2: Canceled
     3: Failed
     4: All
Select termination code:  (1-4):

If you select the All option, you may see something like:

Select termination code:  (1-4): 4
+-------+-------------+---------------------+------+-------+----------+-----------+-----------+
| jobid | name        | starttime           | type | level | jobfiles | jobbytes  | jobstatus |
+-------+-------------+---------------------+------+-------+----------+-----------+-----------+
|     1 | Incremental | 2012-03-26 12:15:21 | B    | F     |        0 |         0 | A         |
|     2 | Incremental | 2012-03-26 12:18:14 | B    | F     |      350 | 4,013,397 | I         |
|     3 | Incremental | 2012-03-26 12:18:30 | B    | F     |        0 |         0 | A         |
|     4 | Incremental | 2012-03-26 12:18:38 | B    | F     |      331 | 3,548,058 | I         |
+-------+-------------+---------------------+------+-------+----------+-----------+-----------+
Enter the JobId list to select:

Then you may enter one or more JobIds to be restarted, which may take the form of a list of JobIds separated by commas, and/or JobId ranges such as 1-4, which indicates you want to restart JobIds 1 through 4, inclusive.
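The list format described above (comma-separated JobIds and inclusive ranges) can be expanded as in the following Python sketch. It only illustrates the syntax; the function name is hypothetical and the error handling is an assumption, not bconsole's behavior.

```python
from typing import List

def parse_jobid_list(text: str) -> List[int]:
    """Expand a list such as "1,3-4" into individual JobIds."""
    jobids: List[int] = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue                      # tolerate stray commas
        if "-" in part:
            first, last = (int(p) for p in part.split("-", 1))
            if first > last:
                raise ValueError(f"invalid range: {part}")
            jobids.extend(range(first, last + 1))   # ranges are inclusive
        else:
            jobids.append(int(part))
    return jobids

print(parse_jobid_list("1-4"))     # [1, 2, 3, 4]
print(parse_jobid_list("2, 4"))    # [2, 4]
```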

Job Bandwidth Limitation

The new Job Bandwidth Limitation directive may be added to the File daemon's and/or Director's configuration to limit the bandwidth used by a Job on a Client. It can be set in the File daemon's conf file for all Jobs run in that File daemon, or it can be set for each Job in the Director's conf file. The speed is always specified in bytes per second.

For example:

FileDaemon {
  Name = localhost-fd
  Working Directory = /some/path
  Pid Directory = /some/path
  ...
  Maximum Bandwidth Per Job = 5Mb/s
}

The above example would cause any jobs running with the FileDaemon to not exceed 5 megabytes per second of throughput when sending data to the Storage Daemon. Note, the speed is always specified in bytes per second (not in bits per second), and the case (upper/lower) of the specification characters is ignored (i.e. 1MB/s = 1Mb/s).

You may specify the following speed parameter modifiers: k/s (1,000 bytes per second), kb/s (1,024 bytes per second), m/s (1,000,000 bytes per second), or mb/s (1,048,576 bytes per second).

For example:

Job {
  Name = localhost-data
  FileSet = FS_localhost
  Accurate = yes
  ...
  Maximum Bandwidth = 5Mb/s
  ...
}

The above example would cause Job localhost-data to not exceed 5MB/s of throughput when sending data from the File daemon to the Storage daemon.
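To make the speed modifiers concrete, the following Python sketch converts a specification such as 5Mb/s into bytes per second using the four documented multipliers and ignoring case. It is an illustration only: the function name is hypothetical, and the sketch accepts only whole numbers followed by one of the documented suffixes.

```python
import re

# Multipliers as listed in the text (case is ignored).
_MULTIPLIERS = {
    "k/s": 1_000,
    "kb/s": 1_024,
    "m/s": 1_000_000,
    "mb/s": 1_048_576,
}

def bandwidth_to_bytes_per_second(spec: str) -> int:
    """Convert a value such as "5Mb/s" into bytes per second."""
    match = re.fullmatch(r"\s*(\d+)\s*([a-z/]+)\s*", spec.lower())
    if match is None or match.group(2) not in _MULTIPLIERS:
        raise ValueError(f"unsupported bandwidth specification: {spec!r}")
    return int(match.group(1)) * _MULTIPLIERS[match.group(2)]

print(bandwidth_to_bytes_per_second("5Mb/s"))     # 5242880
print(bandwidth_to_bytes_per_second("100k/s"))    # 100000
```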

A new console command, setbandwidth, dynamically sets the maximum throughput of a running Job or of future Jobs of a Client.

* setbandwidth limit=1000 jobid=10

Please note that the value specified for the limit command line parameter is always in units of 1024 bytes (i.e. the number is multiplied by 1024 to give the number of bytes per second). As a consequence, the above limit of 1000 will be interpreted as a limit of 1000 * 1024 = 1,024,000 bytes per second.
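The conversion can be written out as a small Python sketch. Both helper names are hypothetical; the second one simply inverts the multiplication to find the limit value for a desired rate.

```python
def setbandwidth_limit_to_bytes(limit: int) -> int:
    """The limit parameter is expressed in units of 1024 bytes per second."""
    return limit * 1024

def limit_for_target(bytes_per_second: int) -> int:
    """Largest limit value whose effective rate does not exceed the target."""
    return bytes_per_second // 1024

print(setbandwidth_limit_to_bytes(1000))    # 1024000
print(limit_for_target(5 * 1_048_576))      # 5120
```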

Always Backup a File

When the Accurate mode is turned on, you can decide to always back up a file by using the new A Accurate option in your FileSet. For example:

Job {
   Name = ...
   FileSet = FS_Example
   Accurate = yes
   ...
}

FileSet {
 Name = FS_Example
 Include {
   Options {
     Accurate = A
   }
   File = /file
   File = /file2
 }
 ...
}

This project was funded by Bacula Systems based on an idea of James Harper and is available with the Bacula Enterprise Edition.

Setting Accurate Mode at Runtime

You are now able to specify the Accurate mode on the run command and in the Schedule resource.

* run accurate=yes job=Test

Schedule {
  Name = WeeklyCycle
  Run = Full 1st sun at 23:05
  Run = Differential accurate=yes 2nd-5th sun at 23:05
  Run = Incremental  accurate=no  mon-sat at 23:05
}

It can allow you to save memory and CPU resources on the catalog server in some cases.

These advanced tuning options are available with the Bacula Enterprise Edition.

Additions to RunScript variables

You can have access to JobBytes, JobFiles and Director name using %b, %F and %D in your runscript command. The Client address is now available through %h.

RunAfterJob = "/bin/echo Job=%j JobBytes=%b JobFiles=%F ClientAddress=%h Dir=%D"
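To show what such a command looks like once the edit codes are replaced, here is a small Python sketch of the substitution. It is not Bacula's implementation, the function name is hypothetical, and all sample values are made up for illustration.

```python
def expand_edit_codes(template: str, values: dict) -> str:
    """Replace each %<letter> found in `values`; leave anything else untouched."""
    out = []
    i = 0
    while i < len(template):
        if template[i] == "%" and i + 1 < len(template) and template[i + 1] in values:
            out.append(str(values[template[i + 1]]))
            i += 2
        else:
            out.append(template[i])
            i += 1
    return "".join(out)

command = "/bin/echo Job=%j JobBytes=%b JobFiles=%F ClientAddress=%h Dir=%D"
# Hypothetical values for one finished Job.
sample = {"j": "NightlySave.2015-02-24_12.01.00_04", "b": 4013397,
          "F": 350, "h": "127.0.0.1", "D": "127.0.0.1-dir"}
print(expand_edit_codes(command, sample))
```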

LZO Compression

LZO compression was added to the Unix File Daemon. From the user's point of view, it works like the GZIP compression (just replace compression=GZIP with compression=LZO).

For example:

Include {
   Options { compression=LZO }
   File = /home
   File = /data
}

LZO provides much faster compression and decompression speed but a lower compression ratio than GZIP. It is a good option when you back up to disk. For tape, the built-in compression may be a better option.

LZO is a good alternative to GZIP1 when you don't want to slow down your backup. On a modern CPU it should be able to run almost as fast as:

your client can read data from disk, unless you have very fast disks such as SSDs or a large/fast RAID array.
the data transfers between the File daemon and the Storage daemon, even on a 1Gb/s link.

Note that Bacula uses only one LZO compression level: LZO1X-1.

The code for this feature was contributed by Laurent Papier.

Purge Migration Job

The new Purge Migration Job directive may be added to the Migration Job definition in the Director's configuration file. When it is enabled the Job that was migrated during a migration will be purged at the end of the migration job.

For example:

Job {
  Name = "migrate-job"
  Type = Migrate
  Level = Full
  Client = localhost-fd
  FileSet = "Full Set"
  Messages = Standard
  Storage = DiskChanger
  Pool = Default
  Selection Type = Job
  Selection Pattern = ".*Save"
...
  Purge Migration Job = yes
}

This project was submitted by Dunlap Blake; testing and documentation was funded by Bacula Systems.

Changes in the Pruning Algorithm

We rewrote the job pruning algorithm in this version. Previously, some users reported that the pruning process at the end of jobs took a very long time. This should no longer be the case. Now, Bacula won't automatically prune a Job if that Job is needed to restore data. Example:

JobId: 1  Level: Full
JobId: 2  Level: Incremental
JobId: 3  Level: Incremental
JobId: 4  Level: Differential
.. Other incrementals up to now

In this example, if the Job Retention defined in the Pool or in the Client resource allows JobIds 1, 2, 3 and 4 to be pruned, Bacula will detect that JobIds 1 and 4 are essential to restore data to its current state and will prune only JobIds 2 and 3.

Important: this change affects only the automatic pruning step after a Job and the prune jobs Bconsole command. If a volume expires after the VolumeRetention period, important jobs can still be pruned.
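The selection in the example can be reproduced with the following Python sketch. It is a simplified model and not Bacula's actual algorithm: it assumes one Client and FileSet, jobs listed in chronological order, and a restore chain made of the latest Full, the latest Differential after it, and the Incrementals after that. The function name and JobId 5 (standing in for the later incrementals) are hypothetical.

```python
from typing import List, Optional, Set, Tuple

def jobs_needed_for_restore(jobs: List[Tuple[int, str]]) -> Set[int]:
    """Return the JobIds required to restore the current state."""
    full: Optional[int] = None
    diff: Optional[int] = None
    incrementals: List[int] = []
    for jobid, level in jobs:
        if level == "Full":
            full, diff, incrementals = jobid, None, []   # a new Full restarts the chain
        elif level == "Differential":
            diff, incrementals = jobid, []               # supersedes earlier Incrementals
        elif level == "Incremental":
            incrementals.append(jobid)
    return {j for j in [full, diff, *incrementals] if j is not None}

jobs = [(1, "Full"), (2, "Incremental"), (3, "Incremental"),
        (4, "Differential"), (5, "Incremental")]
expired = {1, 2, 3, 4}                 # JobIds past their Job Retention
prunable = expired - jobs_needed_for_restore(jobs)
print(sorted(prunable))                # [2, 3]
```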

Ability to Verify any specified Job

You now have the ability to tell Bacula which Job to verify, instead of automatically verifying only the last one.

This feature can be used with the VolumeToCatalog, DiskToCatalog and Catalog levels.

To verify a given job, just specify its jobid as an argument when starting the job.

*run job=VerifyVolume jobid=1 level=VolumeToCatalog
Run Verify job
JobName:     VerifyVolume
Level:       VolumeToCatalog
Client:      127.0.0.1-fd
FileSet:     Full Set
Pool:        Default (From Job resource)
Storage:     File (From Job resource)
Verify Job:  VerifyVol.2010-09-08_14.17.17_03
Verify List: /tmp/regress/working/VerifyVol.bsr
When:        2010-09-08 14:17:31
Priority:    10
OK to run? (yes/mod/no):