
Mindstretcher - being indirect


One of the toughest tasks we face is evaluating the place of new treatments. It is relatively straightforward when we have no current treatment and a new one comes along. But when we have several treatments and want to know how a new one compares, it is not so easy. What we would like, of course, is large, randomised, valid, head-to-head comparisons. These are almost never available; what we have instead is a series of trials with different comparators. What can we do with this?

Indirect method?

Back in 1997 a group from McMaster came up with a method that allows the calculation of the odds ratio or relative risk of A versus B when we have only A versus C and B versus C trials [1]. Essentially, the indirect odds ratio for A versus B is the ratio of the odds ratios from the A versus C and B versus C studies, or equivalently the difference of their log odds ratios.

There are lots of equations, and it is not easy to get one's head around them. Even though the method looks sensible, much still depends on the data sets to which it might be applied. Statistics can't rescue us from inadequate or insufficient evidence. So here are three examples to expand our thinking on indirect comparisons.
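As a concrete sketch of the calculation (a minimal reconstruction, assuming the pooled relative risks and their 95% confidence intervals are the only inputs; the published method [1] works from the full trial data):

```python
import math

def indirect_comparison(rr_ac, ci_ac, rr_bc, ci_bc, z=1.96):
    """Adjusted indirect comparison of A versus B via a common
    comparator C, working on the log relative risk scale."""
    # Recover standard errors from the reported 95% CIs
    se_ac = (math.log(ci_ac[1]) - math.log(ci_ac[0])) / (2 * z)
    se_bc = (math.log(ci_bc[1]) - math.log(ci_bc[0])) / (2 * z)
    # Indirect estimate: difference of the log relative risks
    log_rr = math.log(rr_ac) - math.log(rr_bc)
    # Variances add, so the indirect CI is wider than either input
    se = math.sqrt(se_ac ** 2 + se_bc ** 2)
    return (math.exp(log_rr),
            (math.exp(log_rr - z * se), math.exp(log_rr + z * se)))

# Example inputs: relative risks for two treatments each
# compared with the same comparator
rr, (lo, hi) = indirect_comparison(0.35, (0.23, 0.53), 0.89, (0.62, 1.27))
```

Note how the confidence interval widens: the indirect estimate inherits the uncertainty of both sets of trials, which is one reason why direct comparisons remain preferable.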

1 P carinii pneumonia [1]

This was the original data set on which the indirect calculation of odds ratios was based. The setting was a systematic review of antibiotic regimens for the prevention of P carinii pneumonia in patients with HIV infection. There were two experimental therapies, trimethoprim-sulphamethoxazole (TMP) and dapsone/pyrimethamine (DP) and a standard therapy of aerosolised pentamidine (AP).

Results of the trials in direct comparisons are shown in Table 1. TMP was better than AP, DP was no different from AP, and TMP was better than DP. Odds ratios calculated using the new method from indirect comparisons were close to those from direct trials. Figure 1 shows an abacus plot of the single treatment arms for TMP, DP and AP. Overall, P carinii pneumonia occurred in 5.5% (95% CI 4.4 to 6.7%) of 1484 patients taking TMP, 9.0% (7.6 to 10.4%) of 1547 patients taking DP and 9.9% (8.3 to 11.5%) of 1331 patients taking AP.

Table 1: Information on P carinii pneumonia in direct and indirect randomised comparisons of TMP, DP and AP

Comparison   Number of trials   Treatment n/N (%)   Relative risk (95% CI)   NNT (95% CI)
TMP vs AP    9                  26/681 (3.8)        0.35 (0.23 to 0.53)      12 (9 to 19)
DP vs AP     5                  51/732 (7.0)        0.89 (0.62 to 1.27)      N/A
TMP vs DP    8                  56/803 (7.0)        0.66 (0.48 to 0.90)      26 (15 to 96)
N/A = not applicable

Figure 1: Percent with P carinii pneumonia with AP, TMP and DP in direct and indirect randomised trials

The authors rightly spent some time worrying about whether the direct comparison trials were different from the other trials because, for instance, they were longer. Most trials were small and had few events; some had none. The indirect comparison, the direct comparison, and a simple graphical representation of the trial arm data all gave the same overall conclusion: on this measure of efficacy TMP was the clear winner.
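The single-arm rates quoted above can be turned into a rough NNT by simple arithmetic. This is an illustration only, since pooling arms in this way discards randomisation:

```python
# Pooled single-arm event rates for P carinii pneumonia (from the text)
rate_tmp = 0.055   # 5.5% of 1484 patients on TMP
rate_dp = 0.090    # 9.0% of 1547 patients on DP
rate_ap = 0.099    # 9.9% of 1331 patients on AP

# Rough NNT for TMP against AP from the pooled arm rates
arr = rate_ap - rate_tmp        # absolute risk reduction, 0.044
nnt_tmp_vs_ap = 1 / arr         # about 23
```

The answer, an NNT of about 23, is noticeably larger than the 12 obtained from the direct trials in Table 1, a reminder that single-arm comparisons need handling with caution.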

2 Newer antipsychotics for schizophrenia [2]

This topic is a great deal more difficult. For a start, efficacy in schizophrenia trials is measured using several different scales, few of which make much sense or are interpretable for everyday clinical practice. In consequence, outcomes like discontinuation (for adverse events or lack of efficacy) are often the most useful measures. Then there is the question of the dose of older or newer antipsychotics. Doses are often titrated, but trials may use fixed doses, some of which are less effective than older antipsychotics used at fixed dose. And there are many nuances that make this even more complicated.

Anyway, the issue here was between risperidone and olanzapine, both of which had been compared with haloperidol in a number of trials, and for which there was one direct comparison. Table 2 shows the data for withdrawal due to lack of efficacy.

Table 2: Discontinuation because of lack of efficacy in randomised comparisons of risperidone, olanzapine and haloperidol

Comparison                  Number of trials   Treatment n/N (%)   Relative risk (95% CI)   NNT (95% CI)
Risperidone v haloperidol   7                  225/1573 (14)       0.81 (0.63 to 1.03)      N/A
Olanzapine v haloperidol    3                  375/1860 (20)       0.68 (0.59 to 0.78)      10 (7 to 15)
Risperidone v olanzapine    1                  24/172 (14)         0.83 (0.50 to 1.37)      N/A
N/A = not applicable

In the comparisons against haloperidol for this outcome, risperidone failed to beat haloperidol, while olanzapine did beat it. There was no difference in the single direct risperidone versus olanzapine comparison. A difficulty was that the rate of discontinuation for lack of efficacy with haloperidol in the olanzapine trials was quite a lot higher than in the risperidone trials.

An abacus plot of data from all treatment arms (Figure 2) emphasises the dependency on some large trials. Overall, lack of efficacy withdrawal occurred in 14% (13 to 16%) of 1745 patients on risperidone, 20% (18 to 22%) of 2027 patients on olanzapine and 26% (23 to 28%) of patients on haloperidol.

Figure 2: Percent discontinued because of lack of efficacy in randomised comparisons of risperidone, olanzapine and haloperidol

The review [2] concluded that over all outcomes the indirect and direct comparisons gave the same answer. That is probably correct. Is olanzapine better than risperidone? That's more difficult, though if for this example anticholinergic drug use had been chosen as an outcome, it would have shown more use with risperidone than olanzapine.

3 Paracetamol and codeine [3]

Now a subtly different examination of indirect comparisons, using the combination of paracetamol 1000 mg plus codeine 60 mg. This combination has been shown to have a low (good) NNT in standard high-quality acute pain trials, but we have only three trials with under 200 patients, only 114 of whom had paracetamol and codeine. How can this result be buttressed?

One way is to look at other paracetamol/codeine combinations in a similar setting (Table 3). Demonstrating a dose-response relationship is helpful, and we get a similar dose-response if we look at an abacus plot (Figure 3) from all placebo controlled trials.

Table 3: Comparisons of different combinations of paracetamol and codeine with placebo in acute pain studies, with outcome of at least 50% pain relief over 4-6 hours

Comparison           Number of trials   Treatment n/N (%)   Relative risk (95% CI)   NNT (95% CI)
1000 mg + 60 mg      3                  65/114 (57)         4.8 (2.6 to 8.8)         2.2 (1.7 to 2.9)
600/650 mg + 60 mg   13                 191/398 (48)        2.5 (2.0 to 3.1)         3.4 (2.8 to 4.3)
300 mg + 30 mg       4                  56/215 (26)         3.2 (1.8 to 5.6)         5.6 (4.0 to 9.8)

Figure 3: Percent with outcome of at least 50% pain relief over 4-6 hours for placebo, paracetamol 300 + codeine 30, paracetamol 600 + codeine 60 and paracetamol 1000 + codeine 60

Other things that could be done include assessing how accurate we can be with this level of efficacy and amount of data: with the data available we can be 92% confident that we are within ±0.5 of the true NNT. We'd need data from 100 more patients to be at least 95% confident.
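A confidence statement of that sort can be explored with a simulation along these lines. This is a sketch only: the event rates are assumptions chosen to give a true NNT near 2.2, and it is not necessarily the calculation the authors used.

```python
import random

def nnt_within_tolerance(p_active, p_placebo, n, tol=0.5,
                         reps=10000, seed=1):
    """Monte Carlo estimate of how often the NNT observed in a trial
    of n patients per arm lands within +/- tol of the true NNT."""
    random.seed(seed)
    true_nnt = 1 / (p_active - p_placebo)
    hits = 0
    for _ in range(reps):
        # Simulate the number of responders in each arm
        a = sum(random.random() < p_active for _ in range(n))
        b = sum(random.random() < p_placebo for _ in range(n))
        arr = (a - b) / n            # observed absolute risk reduction
        if arr > 0 and abs(1 / arr - true_nnt) <= tol:
            hits += 1
    return hits / reps

# Assumed rates: ~57% with at least 50% pain relief on treatment and
# ~11.5% on placebo, giving a true NNT of about 2.2
confidence = nnt_within_tolerance(0.57, 0.115, n=114)
```

With rates like these and roughly 114 patients per arm, the simulated confidence comes out in the region the text describes; adding patients tightens it further.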

In three trials with active controls there was information on another 117 patients given paracetamol/codeine at the doses interesting to us, and they had a similar rate of pain relief. In another six trials omitted from the meta-analysis because of technical problems with measurement scales rather than design issues that could lead to bias, the combination of paracetamol and codeine was better than placebo or comparator on at least one measure.

So the sparse data in indirect studies can be supplemented with considerable amounts of information from other high-quality trials.


Indirect comparisons make for the biggest problems and arguments. The bottom line is that there is no doubt that the best information will come from large, properly constructed randomised trials, using valid outcomes, and done in a way that is meaningful to clinical practice. In most cases this is nothing more than baying for the moon.

When we need to make decisions now based on the information we have, we will be forced to look at indirect comparisons. The simple rule is that quality cannot be compromised. Data from trials prone to bias because of faulty design won't help us, and may drive us to an incorrect conclusion. Then we have to use outcomes that make sense. And we need sufficient numbers of patients and events to overcome any random effects.

After that, we're probably on our own, though indirect odds ratio calculations may be useful [1] in some circumstances. Abacus plots of single trial arms can be useful back-ups, but have the problem of losing the advantage of randomisation unless there is excellent clinical homogeneity to begin with (and even then use them with caution until we know more).


  1. HC Bucher et al. The results of direct and indirect treatment comparisons in meta-analysis of randomized controlled trials. J Clin Epidemiol 1997 50: 683-691.
  2. L Sauriol et al. Meta-analysis comparing newer antipsychotic drugs for the treatment of schizophrenia: evaluating the indirect approach. Clinical Therapeutics 2001 23: 942-956.
  3. LA Smith et al. Using evidence from different sources: an example using paracetamol 1000 mg plus codeine 60 mg. BMC Medical Research Methodology 2001 1:1.