2.1 ACC dataset preparation and behaviour labels
Segments of continuous ACC data will need to be translated into
meaningful behaviours. For raw ACC data segmentation, there are two
choices: even-length segmentation and variable-length segmentation (Bom,
Bouten, Piersma, Oosterbeek, & van Gils, 2014). Variable-length
segmentation requires an algorithm to detect behaviour change points and
may thus be prone to error. Even-length segmentation does not require
these additional calculations and is therefore much easier to implement.
However, some even-length ACC segments will inevitably contain behaviour
change points (and thus multiple behaviours), which affects downstream
processing and behaviour classification. An ACC segment should be
sufficiently long to contain enough data to be representative of a
behaviour (and, thus, interpretable as a specific behaviour type),
whereas its length should be limited to avoid inclusion of multiple
behaviours as much as possible. We recommend retaining the segments in
which behaviour transitions take place when training the model. Although
these segments may lower the measured accuracy of the classification
model, they make the model more robust and guard against overestimating
its performance. The rabc package only supports
even-length segments. The input data should be a data.frame or
tibble in which each row holds the raw ACC data of one segment, followed
by the behaviour associated with that segment. All rows must be of equal
length. For tri-axial ACC data, each row should be
arranged as “x,y,z,x,y,z,…,behaviour”, where “behaviour” is the
(primary) behaviour observed during that segment. For dual-axial ACC
data, each row should be arranged as “x,y,x,y,…,behaviour” and for
single-axial ACC data as “x,x,…,behaviour”.
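The row layout described above can be illustrated with a minimal base R sketch. It assumes a continuous tri-axial recording stored with one row per sample and a per-sample behaviour annotation. The simulated data, the column names and the helper segment_stream() are hypothetical and serve only as illustration; they are not part of rabc. The most frequent annotation within a segment is taken here as its primary behaviour, which is one possible convention for segments that contain a transition.

```r
# Simulated continuous tri-axial recording (illustrative, not real data):
# one row per sample, with a per-sample behaviour annotation.
set.seed(1)
n_samples <- 130
stream <- data.frame(
  x = rnorm(n_samples),
  y = rnorm(n_samples),
  z = rnorm(n_samples),
  behaviour = rep(c("STND", "WALK"), times = c(70, 60)),
  stringsAsFactors = FALSE
)

# Cut a continuous stream into even-length segments, one segment per row.
# segment_stream() is a hypothetical helper, not a function of rabc.
segment_stream <- function(stream, seg_len, axes = c("x", "y", "z")) {
  n_seg <- nrow(stream) %/% seg_len  # incomplete trailing samples are dropped
  if (n_seg == 0) stop("stream is shorter than one segment")

  # Row indices of the samples belonging to segment i
  seg_index <- function(i) ((i - 1) * seg_len + 1):(i * seg_len)

  rows <- lapply(seq_len(n_seg), function(i) {
    # t() followed by as.vector() interleaves the axes:
    # x1, y1, z1, x2, y2, z2, ...
    as.vector(t(as.matrix(stream[seg_index(i), axes])))
  })
  acc <- as.data.frame(do.call(rbind, rows))
  names(acc) <- paste0(rep(axes, times = seg_len),
                       rep(seq_len(seg_len), each = length(axes)))

  # Primary behaviour: the most frequent annotation within the segment
  acc$behaviour <- vapply(seq_len(n_seg), function(i) {
    names(which.max(table(stream$behaviour[seg_index(i)])))
  }, character(1))
  acc
}

segments <- segment_stream(stream, seg_len = 40)
dim(segments)       # 3 segments (rows), 120 ACC values + 1 behaviour column
segments$behaviour  # the second segment spans the STND-to-WALK transition
```

In this sketch the second segment contains 30 standing and 10 walking samples and is therefore labelled "STND". It is kept in the data set rather than discarded, in line with the recommendation above.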
The tri-axial ACC demo dataset used here, from the white stork (Ciconia
ciconia) (data accessible from the AcceleRater website:
http://accapp.move-ecol-minerva.huji.ac.il/, see Resheff et al., 2014)
was measured at 10.54 Hz. Forty tri-axial measurements, totalling 3.8
seconds, were used to form a behaviour segment. The dataset includes
1746 segments, each forming one row. Each row contains 121
columns. The first 120 columns are ACC measurements from three
orthogonal axes, arranged as x,y,z,x,y,z,…,x,y,z. The final column is
of type character and contains the corresponding behaviour. The dataset
contains five behaviours: “A_FLIGHT” - active flight
(77 cases), “P_FLIGHT” - passive flight (96), “WALK” - walking (437),
“STND” - standing (863) and “SITTING” - sitting (273).
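The figures given for the demo dataset can be cross-checked with a few lines of base R. The sketch below uses only the numbers stated in the text; the variable names are our own and no package data are loaded.

```r
# Values as stated in the text
sampling_rate <- 10.54        # Hz
samples_per_segment <- 40     # tri-axial measurements per segment
n_axes <- 3

# Duration of one segment in seconds
segment_duration <- samples_per_segment / sampling_rate

# Columns per row: all ACC values plus one behaviour label
n_columns <- samples_per_segment * n_axes + 1

# Number of segments per behaviour
behaviour_counts <- c(A_FLIGHT = 77, P_FLIGHT = 96, WALK = 437,
                      STND = 863, SITTING = 273)

round(segment_duration, 1)    # 3.8
n_columns                     # 121
sum(behaviour_counts)         # 1746

# Share of each behaviour in percent, to inspect class balance
round(100 * behaviour_counts / sum(behaviour_counts), 1)
```

The behaviour counts are unbalanced: standing accounts for roughly half of all segments, whereas active flight is the rarest behaviour.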