Background
In healthcare, many complex interventions are designed with the aim of changing the behaviour of individuals or groups of individuals. When designing new interventions, it is helpful to know which behaviour change techniques are most effective, and in which context. The behaviour change technique taxonomy1 has identified and classified 93 distinct behaviour change techniques, each of which may be used alongside other techniques to form a complex intervention. The behaviours that these interventions target are numerous and varied: health behaviours, such as eating a low-calorie diet or stopping smoking, and clinical behaviours, such as following government guidelines, prescribing drugs or washing hands. When summarising behaviour change research, one option would be to consider the effect of a specific intervention on a specific behaviour, but the large number of targeted behaviours would lead to a huge number of potential systematic reviews (or comparisons within a systematic review), each aiming to answer a different question but unlikely to have enough statistical power to do so. In addition, it may be hard to interpret evidence from multiple systematic reviews of similar interventions that report conflicting conclusions and present evidence in different ways. As described by Melendez-Torres2, there are situations where it makes sense to group together interventions as ‘clinically meaningful units’ with a similar expected ‘theory of change’. In terms of behaviour change techniques, it can be informative to combine evidence to answer a broad question about how well a particular behaviour change technique (or group of techniques) has performed, on average, on any type of behaviour, and to use this information to identify which techniques are effective. This can be supplemented with analysis of effect moderators, to identify the contexts in which the technique is more or less effective.
Interventions to change healthcare professional (HCP) behaviour can be designed to target the individual HCP or a team of HCPs. Trials of this type of intervention can vary in terms of the unit of randomisation, which can be either the individual HCP or a group of HCPs, such as those working within the same site (surgery, nursing home, ward, hospital). The unit of analysis in these trials can also vary and is not necessarily the same as the unit of randomisation; for example, randomisation may be at the level of the GP surgery while data are recorded for each individual patient. The outcomes could be measured using a variety of denominators, such as the individual patient (e.g. a binary measure of whether a test was ordered), the individual HCP (e.g. number of tests ordered per GP), or the site (e.g. proportion of patients with an appropriate test order on a hospital ward). These multiple and varied layers need to be considered when adjusting for clustering, combining data and interpreting results.
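The adjustment for clustering mentioned above is commonly made via the design effect. The sketch below illustrates the standard formula, DE = 1 + (m − 1) × ICC, where m is the average cluster size and ICC is the intracluster correlation coefficient; the numbers in the example are purely illustrative and do not come from any trial discussed here.

```python
def design_effect(mean_cluster_size: float, icc: float) -> float:
    """Variance inflation factor for a cluster randomised trial:
    DE = 1 + (m - 1) * ICC."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


def effective_sample_size(n: int, mean_cluster_size: float, icc: float) -> float:
    """Sample size deflated for clustering, so the trial can be combined
    with individually randomised trials on a comparable footing."""
    return n / design_effect(mean_cluster_size, icc)


# Illustrative example: 600 patients across GP surgeries averaging
# 30 patients each, with an assumed ICC of 0.05.
de = design_effect(30, 0.05)            # 1 + 29 * 0.05 = 2.45
ess = effective_sample_size(600, 30, 0.05)
```

Dividing the nominal sample size by the design effect (or, equivalently, inflating the standard error by its square root) prevents cluster randomised trials from receiving more weight in a synthesis than their information content warrants.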
There are several proposed methods of summarising mixed measures of behavioural outcomes. Higgins et al.3 provide an overview of methods to synthesise quantitative evidence in systematic reviews of complex health interventions. They describe and compare a number of graphical methods for combining different outcomes, as well as synthesis methods using effect size estimates that are suitable for complex interventions. One approach is to combine effect sizes (standardised mean differences, SMDs), using standard errors to derive the study weights in a meta-analysis. In addition to allowing different types of measurement (both binary and continuous) to be combined, this approach can accommodate a mixture of individually randomised and cluster randomised trials by basing the weights on adjusted standard errors. In some systematic reviews, binary measures of the same outcome are analysed and reported separately from continuous ones4,5. An alternative approach6 to using weights based on standard errors is to use study weights based on the number of health professionals included in the study. The albatross plot7 is a graphical method that allows synthesis of summary data reported in a variety of formats, using only p-values plotted against sample size; it can be used to assess the consistency of results visually and allows approximate effect sizes to be estimated.
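The inverse-variance weighting described above can be sketched in a few lines. This is a minimal fixed-effect illustration, not the analysis used in any of the cited reviews; the SMDs and standard errors are hypothetical, and any cluster randomised trial is assumed to have had its standard error inflated by the square root of the design effect before entering the analysis.

```python
def pooled_smd(smds, ses):
    """Fixed-effect inverse-variance pooling of standardised mean
    differences. Each study's weight is 1 / SE^2; the pooled standard
    error is the square root of the reciprocal of the total weight."""
    weights = [1.0 / se**2 for se in ses]
    pooled = sum(w * d for w, d in zip(weights, smds)) / sum(weights)
    pooled_se = (1.0 / sum(weights)) ** 0.5
    return pooled, pooled_se


# Three hypothetical trials: the third might be a cluster randomised
# trial whose SE already reflects the clustering adjustment.
smds = [0.30, 0.45, 0.10]
ses = [0.10, 0.15, 0.20]
estimate, se = pooled_smd(smds, ses)
```

Because the weights are inversely proportional to the squared standard errors, precisely estimated studies dominate the pooled estimate, and the pooled standard error is always smaller than that of any individual study.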