Background
In healthcare, many complex interventions are designed with the aim of
changing the behaviour of individuals or groups of individuals. When
designing new interventions, it is helpful to know which behaviour
change techniques are most effective, and in which context. The
behaviour change technique taxonomy1 has identified and
classified 93 distinct behaviour change techniques, and each of these may be
used alongside other techniques to form a complex intervention. The
types of behaviours that these interventions could be targeting are
numerous and varied: health behaviours, such as eating a low-calorie
diet or stopping smoking; and clinical behaviours, such as following
government guidelines, prescribing drugs or washing hands. When
summarising behaviour change research, one option would be to consider
the effect of a specific intervention on a specific behaviour, but the
large number of targeted behaviours would lead to a huge number of
potential systematic reviews (or comparisons within a systematic
review); each aiming to answer a different question, but unlikely to
have enough statistical power to do so. In addition, it may be hard to
interpret evidence from multiple systematic reviews of similar
interventions that report conflicting conclusions, and present evidence
in different ways. As described by Melendez-Torres2, there are situations
where it makes sense to group together interventions as ‘clinically
meaningful units’ with a similar expected ‘theory of change’. In terms
of behaviour change techniques, it can be informative to combine
evidence to answer a broad question about how well a particular
behaviour change technique (or group of techniques) has performed, on
average, on any type of behaviour, and to use this information to
identify which techniques are effective. This can be supplemented with
analysis of effect moderators, to identify the contexts in which the
technique is more or less effective.
Interventions to change healthcare professional (HCP) behaviour can be
designed to target an individual HCP or a team of HCPs. Trials of this
type of intervention vary in the unit of randomisation, which can be
either the individual HCP or a group of HCPs, such as those working
within the same site (surgery, nursing home, ward, hospital). The unit
of analysis in these trials can also vary and is not necessarily the
same as the unit of randomisation; for example, randomisation may be at
the level of the GP surgery, with data recorded for each individual
patient. The outcomes could be measured using a variety of
denominators, such as the individual patient (e.g. binary measure of
whether a test was ordered), individual HCPs (e.g. number of tests
ordered per GP), or at site-level (e.g. proportion of patients with an
appropriate test order on a hospital ward). These multiple and varied
layers need to be considered when adjusting for clustering, combining
data across studies and interpreting results.
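One common way to adjust for clustering, when an analysis has treated patients within a cluster as if they were independent, is to inflate the standard error by the square root of the design effect, DE = 1 + (m − 1) × ICC, where m is the average cluster size and ICC is the intracluster correlation coefficient. The sketch below illustrates this standard adjustment; the cluster size and ICC values are hypothetical, chosen only for illustration:

```python
import math

def design_effect(avg_cluster_size: float, icc: float) -> float:
    """Design effect for a cluster randomised trial:
    DE = 1 + (m - 1) * ICC, where m is the average cluster size."""
    return 1.0 + (avg_cluster_size - 1.0) * icc

def adjust_se_for_clustering(se: float, avg_cluster_size: float, icc: float) -> float:
    """Inflate a standard error computed under independence to account
    for clustering of patients within sites."""
    return se * math.sqrt(design_effect(avg_cluster_size, icc))

# Hypothetical example: 20 patients per GP surgery, assumed ICC of 0.05.
se_naive = 0.10
se_adj = adjust_se_for_clustering(se_naive, avg_cluster_size=20, icc=0.05)
```

With these illustrative numbers the design effect is 1.95, so the clustering-adjusted standard error is noticeably larger than the naive one.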
Several methods have been proposed for summarising mixed measures of
behavioural outcomes. Higgins et al.3 provide an overview of
methods to synthesise quantitative evidence in systematic reviews of
complex health interventions. They describe and compare a number of
graphical methods for combining different outcomes, as well as synthesis
methods using effect size estimates, which are suitable for complex
interventions. One approach is to combine effect sizes, expressed as
standardised mean differences (SMDs), using standard errors to derive
study weights in meta-analysis. In addition to allowing for different
measurements (both binary and continuous) to be combined, this approach
can accommodate a mixture of individually randomised and cluster
randomised trials using weights based on adjusted standard errors. In
some systematic reviews, binary measures of the same outcome are
analysed and reported separately from continuous ones4,5.
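As a minimal sketch of the inverse-variance approach described above, the code below pools SMDs using weights of 1/SE²; the effect sizes and standard errors are invented for illustration, and cluster randomised trials would enter with their clustering-adjusted standard errors:

```python
def pool_fixed_effect(smds, ses):
    """Fixed-effect inverse-variance meta-analysis: each study is
    weighted by 1/SE^2, and the pooled SMD is the weighted mean."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * d for w, d in zip(weights, smds)) / sum(weights)
    pooled_se = (1.0 / sum(weights)) ** 0.5
    return pooled, pooled_se

# Three hypothetical studies (SMDs and standard errors are illustrative).
smds = [0.30, 0.10, 0.45]
ses = [0.12, 0.20, 0.15]
pooled, pooled_se = pool_fixed_effect(smds, ses)
```

Precisely estimated studies (small standard errors) dominate the pooled estimate, which is why adjusting cluster randomised trials' standard errors before pooling matters: unadjusted, they would receive too much weight.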
An alternative approach6 to weighting by standard errors is to weight
studies by the number of health professionals included in each study. The
albatross plot7 is a graphical method
which allows synthesis of summary data in a variety of formats, using
only p-values plotted against sample size; this allows the consistency
of results to be assessed visually and approximate average effect sizes
to be estimated.
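The alternative weighting scheme described above amounts to a simple weighted mean with weights proportional to the number of health professionals in each study. A minimal sketch, with all numbers hypothetical:

```python
def pool_by_sample_size(smds, n_hcps):
    """Pool study effect sizes using weights proportional to the number
    of health professionals each study included, rather than 1/SE^2."""
    total = sum(n_hcps)
    return sum(n * d for n, d in zip(n_hcps, smds)) / total

# Hypothetical studies: effect sizes and HCP counts are illustrative.
smds = [0.30, 0.10, 0.45]
n_hcps = [40, 15, 25]
pooled = pool_by_sample_size(smds, n_hcps)
```

Unlike inverse-variance weighting, this scheme does not require a standard error for each study, which can be convenient when studies report results in formats that do not permit one to be derived.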