How Bacula Works

Introduction

The Job

The Job is probably one of the most important terms to understand when working with Bacula. There are several types of Jobs; the best known and most used (going by the number of Jobs) is the "Backup" Job. The most important is, in our opinion, the "Restore" Job. Some others, like "Admin" Jobs, are also quite useful. All of them are described in more detail in the chapter (here).

The Volume

Another important term is the Volume: a Volume is the place where Bacula stores backed up data. When backing up to tapes, a Volume is identical to a tape and when backing up to disks, a Volume is a file. To fully understand Volumes, please read the chapter (here) dedicated to their description.

Good to Know

Bacula backs up files
By default, when working without special plugins, Bacula backs up files and nothing else.

Failed Jobs
When running a backup Job, the Bacula client sends data and metadata to the Storage Daemon. In some situations a backup Job may fail. It is important to know that one can restore the already saved data from failed Jobs.

Jobs Step by Step

Opening Sessions

Bacula is network based and uses TCP/IP connections to exchange commands, information and data between its components. It needs to open sessions, and these sessions are opened from one component to another as shown in figure (here).

TCP/IP session "directions"

Data Stream

When running a backup Job, the stream of data (data plus metadata) is sent from the File Daemon to the Storage Daemon. At the end of a backup Job, the Storage Daemon sends the metadata to the Director to be inserted into the Catalog (see figure (here)).

Data and metadata streams

Catalog Database Management

As mentioned earlier, Bacula stores its metadata in an SQL Catalog. PostgreSQL, MySQL and SQLite are the database engines supported by Bacula. We recommend using SQLite only for basic testing purposes.
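
As an illustration, the Director refers to its Catalog through a Catalog resource. The sketch below assumes a database named bacula and a database user named bacula; these names and the password are placeholders, not values taken from this guide. The resource name matches the Catalog directive used in the Client example of the Configuration section.

  Catalog {
     Name = BaculaCatalog
     # Placeholder database name and credentials (assumptions)
     dbname = "bacula"
     dbuser = "bacula"
     dbpassword = "password-for-the-catalog-database"
  }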

Catalog management is directly related to keeping backup Job data available, and it is often triggered automatically when Jobs finish.

Retention periods

There are various kinds of retention periods that Bacula uses: File Retention Period, Job Retention Period, and Volume Retention Period. Each of these retention periods defines the minimum time that specific records will be kept in the Catalog database. This should not be confused with the time that the data saved to a Volume is valid and available for restore: in many cases, data will be available much longer than any of the Retention Periods configured, because the data remains on the Volume until the Volume is recycled or truncated.

The File Retention Period
determines the time that File records are kept in the Catalog database. This period is important for two reasons. The first is that as long as File records remain in the database, you can "browse" the database with a console program and restore any individual file. Once the File records are removed or pruned from the database, the individual files of a backup Job can no longer be "browsed".

The second reason for carefully choosing the File Retention Period is that the File records typically use the most storage space in the database. As a consequence, you must ensure that regular "pruning" of the database File records is done to keep your database from growing too big.

The Job Retention Period
is the length of time that Job records will be kept in the database. Note, all the File records are tied to the Job that saved those files. The File records can be purged, leaving the Job records. In this case, information will be available about the Jobs that ran, but not the details of the files that were backed up. When pruning a Job, Bacula will purge all its File records.

The Volume Retention Period
is the minimum time, counted from the last write, that a Volume is kept before it can be reused. Bacula will normally not overwrite a Volume that contains data still inside its Retention Period, but if Bacula runs out of usable Volumes, it can select any Volume that is past its Retention Period for recycling. At that point it automatically removes the related Job and File information from the Catalog.

Pruning

To keep the Catalog to a manageable size, the backup information should be removed (pruned) from the Catalog after the defined File and Job Retention Periods. Bacula by default automatically prunes the catalog database entries according to the retention periods defined.
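
Automatic pruning is controlled by the AutoPrune directive, which appears in the Client and Pool resources. A minimal sketch, reusing the resource names of the Configuration section and omitting the other directives those resources need, could look like this:

  Client {
     Name = client-to-back-up-fd
     # ... Address, Password and Catalog omitted in this sketch ...
     File Retention = 10 days
     Job Retention = 25 days
     # Prune expired File and Job records of this client automatically
     AutoPrune = yes
  }

  Pool {
     Name = the-pool-name
     Pool Type = Backup
     Volume Retention = 30 days
     # Let Bacula prune Volumes of this pool when it needs a Volume
     AutoPrune = yes
     Recycle = yes
  }

Note that the examples in the Configuration section set AutoPrune = no, which leaves pruning to explicit commands or scripts.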

Purging

Once all the database records that concern a particular Volume have been "pruned" as described above, respecting the retention periods, the Volume is said to be "purged" (i.e. it has no more Catalog entries).


It is, however, possible to submit commands to Bacula to purge specific information, which will not respect configured retention periods. Naturally, this is something that should only be done with the greatest care.

Good to Know

Getting space back
Bacula will try to keep your data safe as long as possible, thus purging a Volume will not automatically reclaim the used space. If you want to reuse space, you must configure Bacula accordingly (see chapter (here)).
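
One possible way to prepare for this with disk Volumes, sketched here under the assumption that the Pool from the Configuration section is used, is the Action On Purge directive of the Pool resource. It only marks purged Volumes as candidates for truncation; the truncation itself still has to be requested, for instance from the Console or from a script.

  Pool {
     Name = the-pool-name
     Pool Type = Backup
     Recycle = yes
     Volume Retention = 30 days
     # Allow purged disk Volumes of this pool to be truncated
     Action On Purge = Truncate
  }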


Bacula Users and Administrators

When considering enterprise backup and recovery environments, it is often useful to distinguish between different types or classes of people interacting with the backup and recovery tool:
Administrators
can control all aspects of Bacula, and modify its configuration.
Operators
interact with Bacula, following defined procedures, are responsible for certain operational aspects, but do not touch the configuration.
Users
or end-users are people who have no access to the Bacula configuration or to all of Bacula's features, but may access certain of its functions. A typical example is that a user can restore their own data to their own computer, but cannot see other backup data or access computers for which they are not responsible.

Bacula itself does not have the concept of users or administrators, but has "Consoles" (see section (here)) that are designed to allow some users to have limited permissions regarding Bacula. However, the BWeb Management Suite, the Bacula Enterprise Web-based tool for management and configuration, includes the concepts of users, groups and permissions.


Consoles

Overview

To allow interaction from administrators or users, Bacula uses Consoles. The Bacula Console (sometimes called the User Agent) is a program that allows the user or the System Administrator to interact with the Bacula Director Daemon while the daemon is running. Note that, even when managing storage or checking client status, the Console interacts with the Director only, which in turn contacts the other daemons as needed.

The current Bacula Console comes in multiple versions:

All permit the administrator or authorized users to interact with Bacula. You can determine the status of a particular Job, examine the contents of the Catalog, and perform certain tape manipulations with the Console program. Running Jobs is one of the more central tasks done with the Console.

Since the Console program interacts with the Director through the network, the Console and Director programs do not necessarily need to run on the same machine. In fact, in an installation containing a single tape drive, a certain minimal knowledge of the Console program may be needed in order for Bacula to be able to write on more than one Volume: when Bacula requests a new Volume, it waits until the user, via the Console program, indicates that the new Volume is mounted or labeled and ready to be used.

Integration

Like other Bacula components, the Console requires configuration. To learn more about Console configuration and management, please see the Console Configuration chapter of the Bacula main manual (chapter 19).

User Restrictions

If you want to give access to a particular user (who is not an overall Bacula administrator), you need to configure a Console for that user. In particular, it may be desirable to implement specific Access Control Lists to prevent users from accessing data they are not authorized for. This part is covered in the Console Configuration chapter of the Bacula main manual (chapter 19).
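
As a rough sketch of such a restriction, a Console resource in the Director configuration can carry Access Control Lists. The Console name, its password and the restore location below are hypothetical; the other names refer to the resources of the example in the Configuration section. The main manual remains the reference for the complete list of ACL directives.

  Console {
     Name = restricted-user-console
     Password = "password-for-the-restricted-console"
     # Only the resources listed here are visible to this Console
     JobACL = "back-up-job"
     ClientACL = client-to-back-up-fd
     StorageACL = Remote-Disk-Storage
     PoolACL = the-pool-name
     FileSetACL = "fs-websites"
     CatalogACL = BaculaCatalog
     # Only these Console commands may be used
     CommandACL = run, restore, status, list, messages
     # Restores may only be directed to this location (hypothetical path)
     WhereACL = "/tmp/bacula-restores"
  }

A Console connecting under this name is then limited to the listed Jobs, Clients and commands, while the default Console defined by the Director resource keeps full access.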

Non Bacula Enterprise Consoles

There are many other GUI consoles available, designed and proposed by the Bacula community. Here is a non-exhaustive list; ask the community if you want more information on any of them.

Configuration

Installation Directory tree

When installing the Bacula Enterprise version, the following directory tree is created (the Bacula Community version has its own installation process and default locations; some additional directories under /opt/bacula exist and are not presented here):

Configuration Files

Bacula's configuration is stored in plain text files, and a configuration file can include other files, so it is possible to have a structured configuration file repository. It is also possible to create configuration parts by program or script, while the daemon reads its configuration or while a Job is executed.

Each daemon has its own configuration consisting of a set of Resource definitions. These resources are very similar from one service to another, but may contain different Directives (records) depending on the service. For example, in the Director's configuration file, the Director resource defines the name of the Director, a number of global parameters and the password needed to access it from a Console. In the File Daemon configuration file, the Director resource specifies which Directors are permitted to use the File Daemon.
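
Both mechanisms can be sketched as follows. The first line includes a single file; the second runs a shell command and reads its output as configuration, here one include line per file found in a directory. The paths are hypothetical and merely follow the conf.d layout used as an example later in this guide.

  # Include one file
  @/opt/bacula/etc/included-configuration-file.conf
  # Include every Client file of a directory (output of a command)
  @|"sh -c 'for f in /opt/bacula/etc/conf.d/Director/v8-dir/Client/*.cfg ; do echo @${f} ; done'"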

The configuration files must be written as plain UTF-8-encoded text files, which implies that any ASCII file is suitable.

Director

The Director configuration file defines many resources, such as the ones below. A Director definition:
  Director {
     Name = the-name-of-the-director-dir
     DIRport = 9101
     QueryFile = "/opt/bacula/scripts/query.sql"
     WorkingDirectory = "/opt/bacula/working"
     PidDirectory = "/opt/bacula/working"
     Maximum Concurrent Jobs = 10
     Password = "password-for-the-console-to-access-the-director"
     Messages = Daemon
     Heartbeat Interval = 10
  }

A Schedule definition that specifies a Full backup on the second Sunday of every other month starting in January, a Differential backup on every other Sunday, and an Incremental backup six days a week (Monday to Saturday):

  Schedule {
     Name = "NightlyCycle"
     Run = Full         jan,mar,may,jul,sep,nov 2nd     sunday  at 21:00
     Run = Differential feb,apr,jun,aug,oct,dec 2nd     sunday  at 21:00
     Run = Differential                     1st,3rd-5th sunday  at 21:00
     Run = Incremental                                  mon-sat at 21:00
  }

A FileSet definition, backing up /etc, /opt, /home and a few other directories, while excluding some subdirectories:

  FileSet {
     Name = "fs-websites"
     Include {
        Options {
           Signature = MD5
           Compression = GZIP
        }
        File = /etc
        File = /opt
        File = /root
        File = /home
        File = /var/log
     }
     Exclude {
        File = /home/websites/tmp
        File = /home/websites/www/tmp/cache
        File = /opt/bacula/working
        File = /opt/bacula/archive
        File = /.journal
        File = /.fsck
     }
  }

A Storage definition, as required by the Director, i.e. a name, an address, a port, and a media type. The Director does not know about the hardware; only the Storage Daemon does:

  Storage {
     Name = Remote-Disk-Storage
     Address = sd.bacula6.org
     SDPort = 9103
     Password = "password-for-the-director-to-access-the-storage"
     Device = disk-autochanger
     Media Type = da-mt
     Maximum Concurrent Jobs = 50
     Autochanger = Remote-Disk-Storage
  }

A client to back up:

  Client {
     Name = client-to-back-up-fd
     Address = client.bacula6.org
     FDPort = 9102
     Catalog = BaculaCatalog
     Password = "password-for-the-director-to-access-the-client"
     File Retention = 10 days
     Job Retention = 25 days
     AutoPrune = no
  }

A pool definition:

  Pool {
     Name = the-pool-name
     Pool Type = Backup
     Recycle = yes
     AutoPrune = no
     Volume Retention = 30 days
     Label Format = "pooldef-"
     Maximum Volume Bytes = 8G
     Maximum Volumes = 6
     Storage = Remote-Disk-Storage
  }

A JobDefs resource to handle common definitions for several Jobs:

  JobDefs {
     Name = "common-job-definitions"
     Type = Backup
     Level = Incremental
     Messages = Standard
     Schedule = NightlyCycle
     Priority = 10
     #
     # The following setting saves some time by sending
     # all the metadata at the end of the job
     Spool Attributes = yes
     #
     # A way to keep all of your BSR (bootstrap) files
     # in one place with the same naming conventions
     Write Bootstrap = "/opt/bacula/bsr/%c_%n.bsr"
  }

And then a Job using the above JobDefs:

  Job {
     Name = "back-up-job"
     JobDefs = "common-job-definitions"
     #
     # The storage is defined inside the Pool Resource
     # this is a best practice.
     Pool = the-pool-name
     Client = client-to-back-up-fd
     FileSet = "fs-websites"
     Schedule = NightlyCycle
     #
     # Below you have an example of how to include a file
     # notice the "@" sign as first character
     @/opt/bacula/etc/included-configuration-file.conf
  }

File Daemon

If you read carefully you will notice, on the Bacula client side, a Director Resource similar to this one:
  Director {
     Name = the-name-of-the-director-dir
     Password = "password-for-the-director-to-access-the-client"
  }
which authorizes the Director the-name-of-the-director-dir, provided it knows the client password, to access the client defined below:
  FileDaemon {
     Name = client-to-back-up-fd
     FDport = 9102
     WorkingDirectory = /opt/bacula/working
     Pid Directory = /opt/bacula/working
     Maximum Concurrent Jobs = 20
  }

Storage Daemon

Same thing here, the Director Resource
  Director {
     Name = the-name-of-the-director-dir
     Password = "password-for-the-director-to-access-the-storage"
  }
authorizes the Director the-name-of-the-director-dir, provided it knows the storage password, to access the Storage Daemon:
  Storage {                             # definition of myself
     Name = bacula-storage-daemon-definition-sd
     SDPort = 9103                  # port the Storage Daemon listens on
     WorkingDirectory = "/opt/bacula/working"
     Pid Directory = "/opt/bacula/working"
     Maximum Concurrent Jobs = 200
     Heartbeat Interval = 10
  }
and therefore to use the devices defined on the Storage Daemon side, for example:
  #
  # This is a two drive autochanger definition
  Autochanger {
     Name = disk-autochanger
     Device = drive1, drive2
     # No changer command needed for virtual autochangers
     Changer Command = ""
     # No changer device needed either
     Changer Device = "/dev/null"
  }
  #
  # Drive 1 definition
  Device {
     Name = drive1
     Archive Device = /bacula/da1
     Media Type = da-mt
     Drive Index = 0
     Label Media = yes
     Random Access = yes
     AutomaticMount = yes
     RemovableMedia = no
     AlwaysOpen = no
     Maximum Concurrent Jobs = 20
  }
  #
  # Drive 2 is pretty much the same in this case
  Device {
     Name = drive2
     #
     # This is the same archive device as in drive1 definition
     Archive Device = /bacula/da1
     Media Type = da-mt
     # Another drive, another drive index
     Drive Index = 1
     Label Media = yes
     Random Access = yes
     AutomaticMount = yes
     RemovableMedia = no
     AlwaysOpen = no
     Maximum Concurrent Jobs = 20
  }

bconsole

The following bconsole.conf content allows the Console to connect to the Director defined above:
  Director {
     Name = the-name-of-the-director-dir
     DIRport = 9101
     address = localhost
     Password = "password-for-the-console-to-access-the-director"
  }

BWeb Management Suite considerations

The BWeb Management Suite (BMS) interface allows the administrator to manage the Bacula configuration through a Web interface. When using BMS for this purpose, the Bacula configuration files are split into pieces, all of them under the conf.d directory (/opt/bacula/etc/conf.d).

Figure (here) presents an example of a conf.d directory organization while the following is a tree representation of the same example.

conf.d/
|--- Console
|   `-- v8-dir
|       |--- bconsole.conf
|       `-- Director
|           `-- v8-dir.cfg
|--- Director
|   `-- v8-dir
|       |--- bacula-dir.conf
|       |--- Catalog
|       |   `-- MyCatalog.cfg
|       |--- Client
|       |   |--- v8-c1-fd.cfg
|       |   |--- v8-c2-fd.cfg
|       |   `-- v8-c3-fd.cfg
|       |--- Console
|       |   `-- v8-mon.cfg
|       |--- Director
|       |   `-- v8-dir.cfg
|       |--- Fileset
|       |   |--- Catalog.cfg
|       |   |--- fs-postgres.cfg
|       |   `-- fs-tls.cfg
|       |--- Job
|       |   |--- job-catalog.cfg
|       |   |--- job-postgres.cfg
|       |   |--- job-RestoreFiles.cfg
|       |   `-- job-tls.cfg
|       |--- JobDefs
|       |   |--- defaultjob.cfg
|       |   `-- common.cfg
|       |--- Messages
|       |   |--- Daemon.cfg
|       |   `-- Standard.cfg
|       |--- Pool
|       |   |--- Dedup.cfg
|       |   |--- File.cfg
|       |   |--- common.cfg
|       |   `-- Scratch.cfg
|       |--- Schedule
|       |   |--- FiveDays.cfg
|       |   |--- WeeklyCycleAfterBackup.cfg
|       |   `-- WeeklyCycle.cfg
|       `-- Storage
|           |--- Remote-Disk-Storage.cfg
|           `-- File.cfg
|--- FileDaemon
|   |--- v8-c1-fd
|   |   |--- bacula-fd.conf
|   |   |--- Director
|   |   |   `-- v8-dir.cfg
|   |   |--- FileDaemon
|   |   |   |--- v8-c1-fd.cfg
|   |   |   `-- v8-mon.cfg
|   |   `-- Messages
|   |       `-- Standard.cfg
|   |--- v8-c2-fd
|   |   |--- bacula-fd.conf
|   |   |--- Director
|   |   |   |--- v8-dir.cfg
|   |   |   `-- v8-mon.cfg
|   |   |--- FileDaemon
|   |   |   `-- v8-c2-fd.cfg
|   |   `-- Messages
|   |       `-- Standard.cfg
|   `-- v8-c3-fd
|       |--- bacula-fd.conf
|       |--- Director
|       |   `-- v8-dir.cfg
|       |--- FileDaemon
|       |   `-- v8-c3-fd.cfg
|       `-- Messages
|           `-- Standard.cfg
`-- Storage
    `-- v8-sd
        |--- Autochanger
        |   `-- disk-autochanger.cfg
        |--- bacula-sd.conf
        |--- Device
        |   |--- drive1.cfg
        |   |--- drive2.cfg
        |   `-- File.cfg
        |--- Director
        |   |--- v8-dir.cfg
        |   `-- v8-mon.cfg
        |--- Messages
        |   `-- Standard.cfg
        `-- Storage
            `-- v8-sd.cfg

Typical graphical representation of a conf.d directory (example)

Good to Know

Installing, understanding and managing Bacula configuration files is covered in depth during the Bacula Administration Course I. While this concept guide gives a quick overview of all concepts, attending this course, which is given in the US, Europe (Switzerland, Belgium, France) and Japan, is strongly recommended.

Encryption

Bacula can encrypt the data it backs up (Data Encryption), and independently of that, it can encrypt the network connections it uses (Transport Encryption).

Both Data and Transport Encryption make use of industry-standard X.509 Public Key Infrastructure, but their requirements differ. In general, Data Encryption needs a minimum of infrastructure and configuration, while Transport Encryption may require much more effort from the Backup Administrator.

Data Encryption

Data Encryption happens on the client and needs a key pair in files stored on the client. Thus, the encrypted data may be inaccessible even to the backup administrators. If a Master Key is desirable, so that data can be decrypted even if the client machine owner loses or has changed the keys, Bacula can be configured accordingly.
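
A minimal sketch of the client side, assuming the File Daemon from the Configuration section, could look like the following. The key file paths are hypothetical; the key pair and the optional Master Key have to be created and distributed separately.

  FileDaemon {
     Name = client-to-back-up-fd
     FDport = 9102
     WorkingDirectory = /opt/bacula/working
     Pid Directory = /opt/bacula/working
     # Sign and encrypt the data before it leaves the client
     PKI Signatures = yes
     PKI Encryption = yes
     # Key pair of this client (hypothetical path)
     PKI Keypair = "/opt/bacula/etc/client-to-back-up-fd.pem"
     # Optional Master Key, public part only (hypothetical path)
     PKI Master Key = "/opt/bacula/etc/master.cert"
  }

Because the keys live only in these files, losing them without a Master Key means the encrypted data can no longer be restored.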

The client machine owner or administrator can be the only person responsible for the key files, which implies that additional tasks come into their realm of responsibility.

It is important to be aware that, while data itself is encrypted, metadata that is stored in the Catalog remains unencrypted - file and path names, in particular, are easily visible to third parties with access to the network or the catalog database.

Transport Encryption

Transport Encryption, on the other hand, requires a full-blown, managed Public Key Infrastructure including a trusted and securely operated Certification Authority, secure deployment of keys and certificates, explicit configuration of trust relationships and a few lines of configuration in each Bacula configuration resource that refers to any Bacula component. Experience shows that configuring transport encryption for the first time is sometimes a challenge even for experienced Bacula administrators. Fortunately, experience also shows that once over the first hurdle, things go considerably more smoothly.

The overhead to operate and set up such an environment can be considerable and is, in general, not an area of expertise for backup and storage administrators. However, if a maximum of security is desired, it provides industry-standard protection against any sort of unauthorized data snooping and as such is particularly important when backups are done through insecure networks like the internet.
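
To give an idea of the "few lines of configuration" involved, the following sketch adds TLS directives to the Client resource of the Director from the Configuration section. The certificate and key paths are hypothetical, and matching directives are needed in the corresponding Director resource on the File Daemon side.

  Client {
     Name = client-to-back-up-fd
     Address = client.bacula6.org
     FDPort = 9102
     Catalog = BaculaCatalog
     Password = "password-for-the-director-to-access-the-client"
     # Transport Encryption towards this client (hypothetical paths)
     TLS Enable = yes
     TLS Require = yes
     TLS CA Certificate File = "/opt/bacula/etc/tls/ca.pem"
     TLS Certificate = "/opt/bacula/etc/tls/director.pem"
     TLS Key = "/opt/bacula/etc/tls/director.key"
  }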

Good to Know

Data and Transport Encryption can be freely combined, but keep in mind that each layer of encryption adds CPU overhead and can thus decrease throughput and increase resource consumption.

Bacula can also use certificate-based authentication of its components, which is particularly useful if there are already certificates deployed for the involved computers.


bsbuild 2024-02-12