Record Header

Each binary data record is preceded by a Record Header. The Record Header has a fixed length and a fixed format, whereas the binary data record is of variable length. The Record Header is written using the Bacula serialization routines and is therefore guaranteed to be in a machine-independent format.

The format of the Record Header (version 1.27 or later) is:

  int32_t FileIndex;   /* File index supplied by File daemon */
  int32_t Stream;      /* Stream number supplied by File daemon */
  uint32_t DataSize;   /* size of following data record in bytes */

The Record Header is followed by DataSize bytes of binary Stream data, then by the next Record Header and its binary Stream data, and so on. For the authoritative definition of this record, see record.h in the src/stored directory.
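As an illustration of the layout above, the following is a minimal sketch of decoding one Record Header from a raw byte buffer. It is not taken from the Bacula sources: the names record_header, RECORD_HEADER_SIZE, get_be32 and parse_record_header are hypothetical, and the sketch assumes that the serialization routines store each field as 4 bytes in big-endian (network) byte order, with the three fields packed back to back.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name: three 32-bit fields packed back to back. */
#define RECORD_HEADER_SIZE 12

struct record_header {
    int32_t  FileIndex;  /* File index supplied by File daemon */
    int32_t  Stream;     /* Stream number supplied by File daemon */
    uint32_t DataSize;   /* size of following data record in bytes */
};

/* Read one 32-bit value, assuming big-endian byte order on the volume. */
static uint32_t get_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Decode a header from buf; returns 0 on success, -1 if buf is too short. */
static int parse_record_header(const unsigned char *buf, size_t len,
                               struct record_header *hdr)
{
    if (len < RECORD_HEADER_SIZE) {
        return -1;
    }
    /* The casts rely on the usual two's complement representation. */
    hdr->FileIndex = (int32_t)get_be32(buf);
    hdr->Stream    = (int32_t)get_be32(buf + 4);
    hdr->DataSize  = get_be32(buf + 8);
    return 0;
}
```

Decoding byte by byte, rather than casting the buffer to a struct pointer, keeps the sketch independent of the host byte order and of struct padding.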

Additional notes on the above:

The VolSessionId is a unique sequential number that is assigned by the Storage Daemon to a particular Job. This number is sequential since the start of execution of the daemon.

The VolSessionTime is the time/date that the current execution of the Storage Daemon started. It ensures that the combination of VolSessionId and VolSessionTime is unique for every job written to the tape, even if there was a machine crash between two writes.

The FileIndex is a sequential file number within a job. The Storage daemon requires this index to be greater than zero and sequential. Note, however, that the File daemon may send multiple Streams for the same FileIndex. In addition, the Storage daemon uses negative FileIndices to hold the Begin Session Label, the End Session Label, and the End of Volume Label.

The Stream is defined by the File daemon and is used to identify separate parts of the data saved for each file (Unix attributes, Win32 attributes, file data, compressed file data, sparse file data, ...). The Storage Daemon does not interpret a Stream or its contents; it only requires that the Stream be a positive integer. Negative Stream numbers are used internally by the Storage daemon to indicate that the record is a continuation of the previous record (the previous record did not entirely fit in the block).

For Start Session and End Session Labels (where the FileIndex is negative), the Storage daemon uses the Stream field to contain the JobId. The current stream definitions are:

#define STREAM_UNIX_ATTRIBUTES    1    /* Generic Unix attributes */
#define STREAM_FILE_DATA          2    /* Standard uncompressed data */
#define STREAM_MD5_SIGNATURE      3    /* MD5 signature for the file */
#define STREAM_GZIP_DATA          4    /* GZip compressed file data */
/* Extended Unix attributes with Win32 Extended data.  Deprecated. */
#define STREAM_UNIX_ATTRIBUTES_EX 5    /* Extended Unix attr for Win32 EX */
#define STREAM_SPARSE_DATA        6    /* Sparse data stream */
#define STREAM_SPARSE_GZIP_DATA   7
#define STREAM_PROGRAM_NAMES      8    /* program names for program data */
#define STREAM_PROGRAM_DATA       9    /* Data needing program */
#define STREAM_SHA1_SIGNATURE    10    /* SHA1 signature for the file */
#define STREAM_WIN32_DATA        11    /* Win32 BackupRead data */
#define STREAM_WIN32_GZIP_DATA   12    /* Gzipped Win32 BackupRead data */
#define STREAM_MACOS_FORK_DATA   13    /* Mac resource fork */
#define STREAM_HFSPLUS_ATTRIBUTES 14   /* Mac OS extra attributes */
#define STREAM_UNIX_ATTRIBUTES_ACCESS_ACL 15 /* Standard ACL attributes on UNIX */
#define STREAM_UNIX_ATTRIBUTES_DEFAULT_ACL 16 /* Default ACL attributes on UNIX */
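The sign conventions for FileIndex and Stream described above can be sketched as a small classifier. This is only an illustration: the enum record_kind and the function classify_record are hypothetical names, and the order of the checks (continuation first, then label) is an assumption rather than something the text specifies.

```c
#include <stdint.h>

/* Hypothetical classification of a record by the signs of its header fields. */
enum record_kind {
    REC_DATA,          /* FileIndex > 0 and Stream > 0: ordinary file data */
    REC_LABEL,         /* FileIndex < 0: session or volume label */
    REC_CONTINUATION,  /* Stream < 0: continues the previous record */
    REC_INVALID        /* anything else, e.g. a zero FileIndex or Stream */
};

static enum record_kind classify_record(int32_t file_index, int32_t stream)
{
    if (stream < 0) {
        return REC_CONTINUATION;
    }
    if (file_index < 0) {
        /* For session labels the Stream field carries the JobId. */
        return REC_LABEL;
    }
    if (file_index > 0 && stream > 0) {
        return REC_DATA;
    }
    return REC_INVALID;
}
```

For example, classify_record(1, STREAM_FILE_DATA) would yield REC_DATA, while any negative FileIndex with a positive Stream would yield REC_LABEL.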

The DataSize is the size in bytes of the binary data record that follows the Record Header. The Storage Daemon does not interpret the contents of the binary data record. For standard Unix files, the data record typically contains the file attributes or the file data. For a sparse file, the first 64 bits of the file data contain the storage address for the data block.
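To make the sparse-file case concrete, here is a minimal sketch that separates the leading 64-bit storage address from the rest of a sparse data record. The function name split_sparse_record is hypothetical, and the sketch assumes the address is serialized in big-endian byte order like the header fields; that byte order is an assumption, not something stated above.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Split a sparse data record into its 64-bit storage address and payload.
 * Returns 0 on success, -1 if the record is too short to hold an address.
 * Big-endian byte order for the address is an assumption.
 */
static int split_sparse_record(const unsigned char *data, uint32_t data_size,
                               uint64_t *addr,
                               const unsigned char **payload,
                               uint32_t *payload_size)
{
    uint64_t a = 0;
    int i;

    if (data_size < 8) {
        return -1;
    }
    for (i = 0; i < 8; i++) {
        a = (a << 8) | data[i];   /* most significant byte first */
    }
    *addr = a;
    *payload = data + 8;          /* file data starts after the address */
    *payload_size = data_size - 8;
    return 0;
}
```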

The Record Header is never split across two blocks. If there is not enough room in a block for the full Record Header, the rest of the block is padded with zeros and the Record Header begins in the next block. The data record, on the other hand, may be split across multiple blocks and even multiple physical volumes. When a data record is split, the second (and each subsequent) piece of the data is preceded by a new Record Header. Thus each piece of data is always immediately preceded by a Record Header. When reading a record, if Bacula finds only part of the data in the first record, it automatically reads the following records and concatenates the pieces to form the full data record.
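The concatenation step can be sketched as a growable buffer to which each piece of a split data record is appended in the order it is read. This is an illustrative sketch only, not Bacula's implementation: struct record_buf and record_buf_append are hypothetical names, and the caller is assumed to have already recognized continuation pieces (for example by their negative Stream number) and to pass in each piece's DataSize.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical accumulator for the pieces of one split data record. */
struct record_buf {
    unsigned char *data;  /* concatenated data so far, or NULL if empty */
    size_t len;           /* number of valid bytes in data */
};

/*
 * Append one piece (the data following one Record Header) to the buffer.
 * Returns 0 on success, -1 if memory could not be allocated; on failure
 * the buffer is left unchanged.
 */
static int record_buf_append(struct record_buf *rb,
                             const unsigned char *piece, uint32_t piece_size)
{
    unsigned char *grown;

    if (piece_size == 0) {
        return 0;  /* nothing to add */
    }
    grown = (unsigned char *)realloc(rb->data, rb->len + piece_size);
    if (grown == NULL) {
        return -1;
    }
    memcpy(grown + rb->len, piece, piece_size);
    rb->data = grown;
    rb->len += piece_size;
    return 0;
}
```

A reader would start with an empty record_buf, call record_buf_append once per piece, and release the result with free(rb.data) when the full record has been processed.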

Kern Sibbald 2010-08-30