Discussion
We found that including school absences in seasonal models improved
community-level confirmed influenza predictions over multiple seasons
within Allegheny County. All-school absence models modestly improved
predictions, reducing MAE by 5% across multiple validations, but
school- and grade-specific absence models produced better predictions,
reflecting underlying age-specific differences in infections. Elementary
school (K-5th grade) absence models decreased MAEs
by 1-16% compared with models using 6th-12th grade absences, suggesting
that younger students' absences were more often illness-related, whereas
older children's absences were more often unrelated to influenza or
illness. In the school cohort data, ILI-specific and all-cause absences
performed better in single-season (2007-2008 and 2012-2013) validations
and when pooled across seasons. Elementary school and K-5th
grade-specific all-cause absences, and potentially ILI-specific
absences, may serve as surveillance indicators for the larger community.
Compared with seasonal models, those including all-cause absences improved
MAE and R2 estimates, suggesting that, after
accounting for seasonal factors, school absences improved influenza
predictions. Our analysis is one of few using weekly all-cause absences
at various administrative levels (i.e., school type and grade) to
predict influenza. Whereas other studies used cause-specific absences to
detect elementary school influenza outbreaks(6), ours evaluated how
all-cause absences from different school types and grades performed as predictors.
As evidenced by higher R2 and lower relMAEs from
elementary school absence models, absences among younger school-aged
children better reflect infections during the influenza season and are a
proxy for the younger age groups that experience higher infection rates and
increased susceptibility(5, 25, 26). In contrast, middle and high
schools' absences were noisier prediction signals, possibly because
older students had more non-influenza-related absences (consistent with
the overall higher absenteeism rates observed in these schools over
time). Lower relMAEs from individual lower-grade
(K-5th) absence models across multiple
validations further support these findings. Hence, elementary school
absences could be useful for influenza surveillance.
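For reference, the error metrics can be written as follows; this is a minimal sketch assuming the standard definitions, with the seasonal-only model taken as the comparison baseline:
\[
\mathrm{MAE}_m = \frac{1}{T}\sum_{t=1}^{T}\left|\hat{y}_{m,t}-y_t\right|,
\qquad
\mathrm{relMAE}_m = \frac{\mathrm{MAE}_m}{\mathrm{MAE}_{\mathrm{seasonal}}},
\]
where \(y_t\) is the observed weekly confirmed influenza count, \(\hat{y}_{m,t}\) is model \(m\)'s one-week-ahead prediction, and relMAE below 1 indicates improvement over the seasonal-only model.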
When we evaluated predictions from weekly all-cause and ILI-specific
absence models (using school-based cohort studies), ILI-specific
absences predicted influenza better than all-cause absences, based on
lower MAEs and higher R2 for specific seasons and when pooled.
Other studies also found ILI-specific absences were a proxy for
influenza when evaluating vaccine impacts(27), suggesting ILI-specific
absences likely capture actual influenza infections. We could not
conduct cause-specific absence surveillance for more than one influenza
season for each study, nor could we perform school-type- and
grade-specific comparisons of all-cause and ILI-specific absences
because of the short study periods, but these may also be important
predictors of influenza incidence.
Our study has some limitations. We did not evaluate our predictions
during the 2009 pandemic because our county absence data were either
limited to single seasons or only available after 2009, when
participating schools' electronic absence surveillance began. Similarly,
the cohort studies were funded for and conducted during the 2007, 2012, and
2015 seasons; therefore, we could not assess predictions during the 2009
pandemic. In the school-based cohort studies, not all absences were
identified because of challenges contacting parents about absences;
consequently, our studies may underestimate the number of all-cause and,
possibly, ILI-specific absences. Our predictions used school-based data
from school districts within Allegheny County only; therefore, our
results may not be generalizable to influenza transmission in other US
counties. Additional data from other Pennsylvania counties or a
representative sample of counties from other states would improve the
generalizability of our predictions.
Others, such as teams participating in the CDC FluSight Challenge, an
influenza prediction competition, have recently used climate data, past
influenza incidence, and other data streams to predict influenza. In the CDC
FluSight Challenge, external research teams predict weekly influenza
cases, and evaluation metrics include the mean absolute scaled error, a
measure of forecast accuracy(28, 29). Our MAE decreased by 5% when
using county-level all-cause absence models, which is equivalent to
including an additional 8 weeks of data in a nowcast model, like those used
in the FluSight Challenge; this equates to a 5% reduction in mean
absolute scaled error(30). Our results suggest that models including
lower grades' absences may improve predictions, as seen by the 10% MAE
decrease, and may yield further gains when incorporated into
ensemble models, like those used in FluSight(29).
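For context, the mean absolute scaled error is conventionally defined as follows (a sketch of the standard definition; the FluSight evaluation may differ in detail):
\[
\mathrm{MASE} = \frac{\dfrac{1}{T}\sum_{t=1}^{T}\left|\hat{y}_t-y_t\right|}
{\dfrac{1}{T-1}\sum_{t=2}^{T}\left|y_t-y_{t-1}\right|},
\]
i.e., the forecast MAE scaled by the in-sample MAE of a naive last-observation forecast, so values below 1 indicate that the model outperforms that naive forecast.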
Our findings suggest that models using younger students' absences improve
predictive performance. Real-time, day-to-day absence data are easy to
collect, are readily available in many schools, and can provide more
accurate predictions than other surveillance mechanisms that rely on
virologic confirmation and are susceptible to laboratory testing delays.
Future studies could apply absence data to other prediction
methodologies, like ensemble methods and machine-learning algorithms,
which may improve prediction accuracy and identify absence-related
patterns not considered here. We demonstrate that grade-specific all-cause
absences can predict community-level influenza one week ahead when
influenza- or cause-specific absences are unavailable, and we suggest that
elementary school or lower-grade absenteeism during the influenza season
can reflect influenza circulation. Using school indicators can inform
influenza surveillance and control efforts, including annual
vaccination; antiviral treatment or prophylaxis; and promotion of
everyday preventive measures (e.g., staying home when sick, respiratory
hygiene, and hand hygiene) to reduce school- and community-level
influenza transmission.
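To make the one-week-ahead idea concrete, the sketch below shows how a lagged grade-specific absence rate could be added to a seasonal regression and compared against a seasonal-only baseline. It is purely illustrative: the Poisson regression form, synthetic data, and variable names are assumptions for exposition, not our fitted models or data.

# Illustrative sketch (not the fitted models from this study): a one-week-ahead
# Poisson regression of weekly confirmed influenza counts on harmonic seasonal
# terms plus last week's grade-specific absence rate. All data below are synthetic.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
weeks = np.arange(1, 105)  # two hypothetical seasons of weekly data
seasonality = np.exp(1.5 * np.sin(2 * np.pi * weeks / 52.0 - 1.5))
absence_rate = 0.03 + 0.02 * seasonality + rng.normal(0, 0.005, weeks.size)  # synthetic K-5 absence proportion
flu_cases = rng.poisson(5 * seasonality * (1 + 20 * absence_rate))

df = pd.DataFrame({
    "cases": flu_cases,
    "sin52": np.sin(2 * np.pi * weeks / 52.0),
    "cos52": np.cos(2 * np.pi * weeks / 52.0),
    "absence_lag1": pd.Series(absence_rate).shift(1),  # last week's absences predict this week
}).dropna()

# Seasonal model augmented with the lagged absence covariate
X = sm.add_constant(df[["sin52", "cos52", "absence_lag1"]])
model = sm.GLM(df["cases"], X, family=sm.families.Poisson()).fit()

# In-sample MAE comparison against a seasonal-only baseline (for illustration)
pred = model.predict(X)
baseline = sm.GLM(df["cases"], sm.add_constant(df[["sin52", "cos52"]]),
                  family=sm.families.Poisson()).fit().predict()
mae = np.mean(np.abs(df["cases"] - pred))
mae_baseline = np.mean(np.abs(df["cases"] - baseline))
print(f"relMAE = {mae / mae_baseline:.2f}")  # values below 1 indicate the absence covariate helps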