Data Preprocessing
To improve performance, two data preprocessing techniques were employed in this study.
First, we removed duplicate data from the dataset, as some GPS points were recorded more than once
due to recording errors on the GPS device. Second, we removed outlier trajectories whose average
speed was implausible for the labeled mode. For instance, if the average speed of a trajectory
marked as “walking” exceeded 10 m/s, or if the average speed of a trajectory marked as “biking”
exceeded 25 m/s, we identified it as an abnormal trajectory and removed it from the dataset.
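The two preprocessing steps can be sketched as follows. This is a minimal illustration rather than the implementation used in the study: the point layout `(t_seconds, lat, lon)`, the haversine distance, and the function names are assumptions made for the example; only the two speed limits come from the text.

```python
from math import radians, sin, cos, asin, sqrt

# Average-speed limits (m/s) quoted in the text; other modes are not filtered here.
MAX_AVG_SPEED = {"walking": 10.0, "biking": 25.0}

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in metres
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * r * asin(sqrt(a))

def drop_duplicates(points):
    """Remove repeated (t, lat, lon) records, keeping the first occurrence and the order."""
    seen = set()
    unique = []
    for p in points:
        if p not in seen:
            seen.add(p)
            unique.append(p)
    return unique

def average_speed(points):
    """Total path length divided by elapsed time, in m/s (0.0 if undefined)."""
    if len(points) < 2:
        return 0.0
    dist = sum(haversine_m(a[1], a[2], b[1], b[2]) for a, b in zip(points, points[1:]))
    elapsed = points[-1][0] - points[0][0]
    return dist / elapsed if elapsed > 0 else 0.0

def is_abnormal(points, mode):
    """True if the trajectory's average speed exceeds the limit for its labeled mode."""
    limit = MAX_AVG_SPEED.get(mode)
    return limit is not None and average_speed(points) > limit
```

A trajectory would first be passed through `drop_duplicates` and then discarded if `is_abnormal` returns true for its label.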
Feature Extraction
Feature engineering plays an important role in recognizing transportation modes. Feature
engineering is the process of using domain knowledge to create features that enable machine learning
methods to work well. To achieve differentiated results in the recognition of transportation modes,
we extracted a large number of features from the processed GPS trajectories. These features can be
categorized into global features and local features. Global features are descriptive statistics for
the entire trajectory, which make trajectories more comparable, whereas the local features extracted by
profile decomposition reveal more detail about movement behavior. The following subsections explain
each feature category in detail.
(1) Mean: This is a measure of the central tendency of the data sets.
(2) Standard deviation: This is a measure that is used to describe the dispersion of a set of data values.
(3) Mode: The mode represents the value that occurs most frequently in the set; because the
parameters are continuous values, they are first rounded to integers.
(4) Top three values, minimum three values: These parameters reduce the error introduced by
abnormal points with positional errors.
(5) Value ranges: The maximum value minus the minimum value.
(6) Percentile: A statistical measure that gives the value below which a given percentage of
observations fall. We selected the 25th percentile (lower quartile) and the 75th percentile
(upper quartile).
(7) Interquartile range: This parameter is equal to the difference between the lower and
upper quartiles.
(8) Skewness: This is a measure of asymmetry of the probability distribution of a real-valued random
variable about its mean, which can be used to describe the distribution of movement parameters.
(9) Kurtosis: This is a measure of the “tailedness” of the probability distribution of a real-valued
random variable, which can also be used to describe the distribution of movement parameters
compared to the normal distribution.
(10) Coefficient of variation: This is a standardized measure of the dispersion of a probability
distribution or frequency distribution. When the difference in measurement scale is too large, it is
inappropriate to use the standard deviation to compare parameters, as the standard deviation must
always be understood in the context of the mean of the data, whereas the CV is independent of the
unit. Therefore, it can handle the problem of different units or large measurement scales.
(11) Autocorrelation coefficient: This is the correlation of a signal with a delayed copy of itself,
that is, the cross-correlation of the signal with itself at different points in time.
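As an illustration of the statistics listed above, the following sketch computes them for a single movement parameter (for example, the velocity series of one trajectory) using only the Python standard library. The function names, the use of population rather than sample moments, linear-interpolation percentiles, excess kurtosis, and a lag of 1 for the autocorrelation are assumptions made for the example; the text does not specify these choices.

```python
import math
import statistics
from collections import Counter

def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of a pre-sorted list."""
    k = (len(sorted_vals) - 1) * q / 100.0
    lo, hi = math.floor(k), math.ceil(k)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)

def autocorrelation(values, lag=1):
    """Autocorrelation coefficient of the series at the given lag."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0:
        return 0.0
    num = sum((values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag))
    return num / denom

def global_features(values, lag=1):
    """Descriptive statistics (1)-(11) for one movement parameter."""
    s = sorted(values)
    n = len(s)
    mean = statistics.fmean(s)
    std = statistics.pstdev(s)  # population standard deviation (assumption)
    q25, q75 = percentile(s, 25), percentile(s, 75)
    m2 = sum((v - mean) ** 2 for v in s) / n  # central moments
    m3 = sum((v - mean) ** 3 for v in s) / n
    m4 = sum((v - mean) ** 4 for v in s) / n
    return {
        "mean": mean,
        "std": std,
        # continuous values are rounded to integers before taking the mode
        "mode": Counter(round(v) for v in values).most_common(1)[0][0],
        "top3": s[-3:][::-1],
        "min3": s[:3],
        "range": s[-1] - s[0],
        "p25": q25,
        "p75": q75,
        "iqr": q75 - q25,
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
        "kurtosis": m4 / m2 ** 2 - 3.0 if m2 > 0 else 0.0,  # excess kurtosis
        "cv": std / mean if mean != 0 else float("nan"),
        "autocorr": autocorrelation(values, lag),
    }
```

Calling `global_features` once per movement parameter and concatenating the resulting dictionaries would give one global feature vector per trajectory.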
Secondly, in addition to these statistical features, we used other features proposed by
Zheng et al. [15], which proved robust to traffic conditions and are listed as follows:
(1) Heading change rate: This measure is a discriminative feature for distinguishing different
transportation modes, as proposed by Zheng et al. [15]. It can be considered as the frequency with
which people change their direction by more than a certain threshold within a unit distance, and it
can be used to distinguish motorized from non-motorized transportation modes.
(2) Stop rate: The stop rate is the number of points at which the velocity of the user falls
below a certain threshold per unit distance, as defined by Zheng et al. [15].
(3) Velocity change rate: This feature was also proposed by Zheng et al. [15]; it is the number of
points at which the relative change in velocity exceeds a certain threshold per unit distance.
(4) Trajectory length: The total distance of the trajectory.
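The three rate features share one pattern: count the points that satisfy a threshold condition and divide by the trajectory length. The sketch below illustrates this under stated assumptions: per-point speeds (m/s) and headings (degrees) are taken as already computed, the function and parameter names are hypothetical, and the thresholds are left as required arguments because the text does not give their values.

```python
def heading_change(h1, h2):
    """Smallest absolute angle in degrees between two compass headings."""
    d = abs(h2 - h1) % 360.0
    return 360.0 - d if d > 180.0 else d

def rate_features(speeds, headings, length_m, *, heading_thr_deg, stop_thr_ms, vrate_thr):
    """Heading change rate, stop rate, and velocity change rate per metre travelled.

    All three thresholds are caller-supplied placeholders, not values from the text.
    """
    if length_m <= 0:
        return {"hcr": 0.0, "sr": 0.0, "vcr": 0.0}
    # points where the direction changes by more than the heading threshold
    turns = sum(1 for a, b in zip(headings, headings[1:])
                if heading_change(a, b) > heading_thr_deg)
    # points where the speed is below the stop threshold
    stops = sum(1 for v in speeds if v < stop_thr_ms)
    # points where the relative speed change |v2 - v1| / v1 exceeds the threshold
    jumps = sum(1 for v1, v2 in zip(speeds, speeds[1:])
                if v1 > 0 and abs(v2 - v1) / v1 > vrate_thr)
    return {"hcr": turns / length_m, "sr": stops / length_m, "vcr": jumps / length_m}
```

Normalizing by distance rather than by the number of points keeps the rates comparable between trajectories of different lengths and sampling densities.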
Local Features
We adopted the profile decomposition algorithm proposed by Dodge [23] to generate several
local features. As the movement parameters (velocity, acceleration, and turning angle) change over
time when a person is moving in space, each parameter forms a profile, that is, a function of time.
First, the movement parameter is expressed as a time series, whose amplitude and change frequency
provide evidence to describe the movement behavior and physics of the person. Second, we used two
measures to decompose the movement parameters: the deviation from the central line, which represents
the amplitude variation over time, and the sinuosity, which reflects the frequency variation over
time. The points were
then divided into four categories including “low sinuosity and low deviation”, “high sinuosity and low
deviation”, “low sinuosity and high deviation” and “high sinuosity and high deviation”, which are
labeled 0, 1, 2, and 3, respectively. More details can be found in Reference [23].
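The labeling step can be sketched as follows. This is a simplified reading of the decomposition, not the algorithm of Reference [23]: the central line is taken to be the median of the profile, sinuosity is approximated as the ratio of the profile's path length to its chord length inside a sliding window with a unit time step, and the thresholds `dev_thr` and `sin_thr` and the window size are hypothetical parameters.

```python
import math
import statistics

def sinuosity(values, i, half_window):
    """Path length of the profile in a window around i divided by the
    straight-line distance between the window's endpoints (time step = 1)."""
    lo = max(0, i - half_window)
    hi = min(len(values) - 1, i + half_window)
    path = sum(math.hypot(1.0, values[k + 1] - values[k]) for k in range(lo, hi))
    chord = math.hypot(hi - lo, values[hi] - values[lo])
    return path / chord if chord > 0 else 1.0

def decompose(values, dev_thr, sin_thr, half_window=2):
    """Label each point of a movement-parameter profile with a class 0-3.

    0: low sinuosity, low deviation    1: high sinuosity, low deviation
    2: low sinuosity, high deviation   3: high sinuosity, high deviation
    """
    centre = statistics.median(values)  # central line (assumption: the median)
    labels = []
    for i, v in enumerate(values):
        high_dev = abs(v - centre) > dev_thr
        high_sin = sinuosity(values, i, half_window) > sin_thr
        labels.append(2 * int(high_dev) + int(high_sin))
    return labels
```

With this encoding, a steady profile is labeled 0 throughout, a rapidly oscillating profile close to the central line is labeled 1, and a sustained excursion away from the central line is labeled 2.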
Next, a series of features were extracted from the profile decomposition algorithm. This included
the mean and standard statistics of segment length per decomposition class and per parameter;