Some Notes on DAQ Data Formats¶
Versioning Data Formats¶
We currently do not have an explicit versioning scheme in place, however we are planning on having a "version" field as the first member of each struct (after an optional "magic bytes" string), so that the version can always be read from a struct without knowing which version the specific instance is.
Variable-Size Data in DAQ Data Formats¶
For data types where the size of the payload is variable (e.g. Fragment
, TriggerRecordHeader
), the class is represented internally as a flat array of bytes. The start of this array is a header object (e.g. FragmentHeader
or TriggerRecordHeaderData
), which contains the information about the object and a size member which describes the size of the rest of the memory block. This allows us to store these objects as flat HDF5 DataSets directly.
Padding and Alignment in Struct Definitions¶
Problem Statement¶
According to the C++ Standard (See https://en.cppreference.com/w/cpp/language/object subsection Alignment), members of objects are aligned to addresses based on their size (e.g. an int
will be aligned to a 4-byte address), and padding will be inserted between members as necessary to satisfy this requirement.
For example:
struct MyStruct {
int mem001; // 4 bytes
char mem002; // 1 byte, with 3 bytes padding added
int mem003; // 4 bytes
};
static_assert(sizeof(MyStruct) == 12, "MyStruct size different than expected");
Objects will also be padded so that their size is a multiple of the size of their largest member, for most DUNE-DAQ structs, this means 64 bits.
Implications for DUNE-DAQ Code¶
These considerations have some implications for the data structures that may be persisted to disk from DUNE-DAQ:
-
All padding should be explicitly specified in structs
-
Structs should have members sorted so that there is no internal padding
-
Developers should statically assert the expected size of their structs, and update that assertion when the struct is updated
-
All struct members should have explicit sizes (i.e.
enum class MyEnum : uint16_t {};
)
For sorting members, the following is fine:
struct MySortedStruct {
uint16_t version; // These three fields together form a 64-bit block
uint16_t first_field;
uint32_t second_field;
uint64_t third_field;
uint32_t fourth_field;
uint16_t fifth_field;
uint8_t sixth_field;
uint8_t padding;
};
static_assert(sizeof(MySortedStruct) == 24, "MySortedStruct size different than expected");
In the general case, sorting members from largest to smallest will guarantee no internal padding is inserted by the compiler.
In the above struct, if first_field
and second_field
are swapped, the compiler will insert 8 bytes of padding: 2 bytes between version
and second_field
to align second_field
to a 32-bit address, then 6 bytes after first_field
to align third_field
to a 64-bit address.
Last git commit to the markdown source of this page:
Author: Eric Flumerfelt
Date: Mon Feb 28 15:48:51 2022 -0600
If you see a problem with the documentation on this page, please file an Issue at https://github.com/DUNE-DAQ/daqdataformats/issues