Strengths and Limitations
This is the first study to provide a practical tool for assessing the usefulness of clinical research. We also demonstrated the tool in 350 clinical trials in PTB. This assessment not only shows that the tool can be applied in practice, but also provides an overview of usefulness in the field of PTB research.
There are some limitations that need to be addressed. First, RCTs included in Cochrane reviews do not represent all RCTs on PTB prevention. However, pregnancy is the earliest field systematically addressed by Cochrane and its coverage of relevant trials is probably very high.
Second, the collection of usefulness data depends on complete and faithful reporting of those features in published articles. One can, for example, argue that ‘value for money’ considerations might be described by the research group in their funding application and not in their published articles. An underestimation of the prevalence of this item is therefore possible. Conversely, some items may be overestimated. For example, power calculations may have been added post hoc, and some multi-center, unmasked trials of existing interventions may still violate pragmatism, contrary to the authors’ claims. Our estimate of the proportion of pragmatic trials is therefore an upper bound.
Third, the usefulness features are not meant as a ‘checkbox’ to ensure high quality and low bias. A study scoring ‘high’ on all usefulness items can still provide highly biased or even false data. In addition, some usefulness items are not always ‘good’ or ‘bad’. One such example is pragmatism. Not all clinical research questions require a pragmatic trial design (12), and it is typically reasonable to conduct some explanatory trials before attempting to demonstrate usefulness through pragmatic trials.
Fourth, for information gain we used an approach focused on power calculations and the use of relevant outcomes. However, one can also measure how extensively the results of a study change prior perceptions of the evidence (“entropy change”) (16). A well-powered study may not change the prior evidence much if it fully agrees with what was already known and if the evidence was already conclusive before the new study was run.
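The contrast between a well-powered study and an informative one can be illustrated with a minimal sketch. It is not the measure used in reference (16) and not part of our tool. It assumes, purely for illustration, a discrete grid of candidate event rates, a binomial likelihood, and Shannon entropy in bits as the measure of uncertainty. All names and numbers below (`entropy_change`, the grid, the hypothetical trial with 100 events among 1000 participants) are illustrative assumptions.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy (in bits) of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def update(prior, grid, events, n):
    """Posterior over candidate event rates after observing `events` out of `n`.

    Assumes a binomial likelihood and a strictly positive prior on a grid of
    rates in (0, 1). Works in log space to avoid numerical underflow.
    """
    log_w = [
        math.log(p) + events * math.log(t) + (n - events) * math.log(1 - t)
        for p, t in zip(prior, grid)
    ]
    m = max(log_w)
    w = [math.exp(v - m) for v in log_w]
    total = sum(w)
    return [v / total for v in w]

def entropy_change(prior, grid, events, n):
    """Reduction in entropy (bits) from prior to posterior."""
    post = update(prior, grid, events, n)
    return shannon_entropy(prior) - shannon_entropy(post)

# Hypothetical grid of candidate event rates: 0.05, 0.10, ..., 0.95
grid = [i / 20 for i in range(1, 20)]

# Inconclusive prior: every candidate rate is equally plausible.
vague_prior = [1 / len(grid)] * len(grid)

# Conclusive prior: most of the belief already sits on a rate of 0.10.
conclusive_prior = [0.005] * len(grid)
conclusive_prior[1] = 0.91  # index 1 corresponds to a rate of 0.10

# The same hypothetical large trial is evaluated under both priors.
events, n = 100, 1000
gain_vague = entropy_change(vague_prior, grid, events, n)
gain_conclusive = entropy_change(conclusive_prior, grid, events, n)

print(f"Entropy change, inconclusive prior: {gain_vague:.2f} bits")
print(f"Entropy change, conclusive prior:   {gain_conclusive:.2f} bits")
```

Under these assumptions, the same trial yields a much smaller entropy change when the prior evidence was already concentrated on the observed rate, although its sample size, and hence its power, is identical in both scenarios.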
Fifth, we operationalized the eight criteria of usefulness with the aim of applying them in a specific field, in this case PTB trials, for demonstration purposes. For most of the eight criteria, the same operationalized definitions can be applied to any other clinical research field. The one exception is the definition of problem base: depending on the clinical problem, different problem-specific and field-specific definitions would need to be developed.
Finally, we have not yet examined how the 13 items correlate with one another. Providing a total usefulness score might therefore not be appropriate, as each criterion provides its own perspective on usefulness and the criteria are not interchangeable.