
Funnel plots and heterogeneity


Systematic reviews and meta-analyses look for evident sources of bias, or other reasons to doubt the veracity of the results of clinical trials. Funnel plots are thought to detect publication bias. Heterogeneity is thought to detect fundamental differences between studies. New evidence suggests that both of these common beliefs are badly flawed.

Bandolier generally eschews heavy stuff on complex methodological issues in systematic reviews and meta-analysis. The reasons are their complexity, and the fact that the details mostly interest only the anoraks among us.

The trouble is that some folks seem to think that these methodological topics are cast in stone, and that any meta-analysis that has an asymmetric funnel plot or demonstrates statistical heterogeneity is so badly flawed that the treatment it reviews must be dismissed. What's worse is that these are often the reasons used to advise against prescribing treatments that are effective. So this month, a light diversion into the heavy clay of maths and stats.

Funnel plots


A funnel plot is a graph in which some trial-specific effect (odds ratio, relative risk) is plotted against some measure of its precision. Precision may be defined in different ways; commonly used are the number of subjects in a trial, or some function of the standard error. If the plot is symmetric, like an inverted V, this is interpreted as demonstrating that there is probably no publication bias. If the plot is asymmetric, the interpretation is that publication bias is likely.
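
To make this concrete, here is a minimal sketch in Python of one set of simulated trials plotted both ways. Everything here (the true effect, the trial sizes, the rough standard error formula) is invented for illustration, not taken from any of the studies discussed:

    # Sketch: one set of simulated trials, two funnel plot conventions.
    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(1)
    n = rng.integers(50, 2000, size=30)   # subjects per trial (assumed)
    se = 2 / np.sqrt(n)                   # rough SE of a log odds ratio
    log_or = rng.normal(0.5, se)          # trial effects around a true 0.5

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 4))
    ax1.scatter(log_or, n)                # precision as trial size
    ax1.set(xlabel="log odds ratio", ylabel="number of subjects")
    ax2.scatter(log_or, 1 / se)           # precision from the standard error
    ax2.set(xlabel="log odds ratio", ylabel="precision (1/SE)")
    plt.tight_layout()
    plt.show()

The point of the two panels is that symmetry is judged by eye, and the shape of the funnel depends on which measure of precision goes on the vertical axis.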

There is no empirical evidence to support this notion. Yet the interpretation of asymmetrical plots is often that there must be unpublished negative trials that would negate the positive findings of a meta-analysis, if only they could be found. Philosophers have a word for this sort of claim: unfalsifiable. You can't prove that it's wrong, even though its stupidity stares you in the face. So some evidence that funnel plots are not what they seem is welcome [1].

Methods


Researchers in Hong Kong [1] examined 198 meta-analyses from the 1998 Cochrane Library, excluding those with continuous variables or fewer than five trials. They then produced funnel plots in two different ways, one using trial size and one using standard error. In other words, the same information from the same meta-analyses was used to produce two funnel plots for each meta-analysis.

Results


Of the 198 meta-analyses, 43 (22% of the total) could be construed as showing publication bias because of their asymmetry. Of these 43, 37 (86%) had a symmetrical funnel plot when the other method of plotting the results was used. As the number of trials in a meta-analysis fell, the proportion in which one of the two methods of plotting showed symmetry rose, reaching 100% in those with six trials. There was also a suggestion that asymmetry in at least one of the plots was present in those meta-analyses where it was least expected.

Heterogeneity


If you have a number of trials with the same sort of patients, the same disease severity, the same intervention, and the same outcome over the same time, then what you have is a homogeneous data set. Right? Well no, actually, because heterogeneity tests are conventionally run at a 10% significance level, so even a truly homogeneous data set will be called heterogeneous 10% of the time by chance alone. That's how the tests are set up. So if you have a homogeneous data set, how well do the tests for heterogeneity hold up?
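
For reference, the commonest such test is Cochran's Q, compared against a chi-squared distribution and conventionally declared significant at the 10% level. A minimal sketch in Python (the trial effects and variances below are invented for illustration):

    # Sketch: Cochran's Q test for heterogeneity, judged at the 10% level.
    import numpy as np
    from scipy import stats

    def cochran_q(effects, variances):
        """Q statistic and p-value against chi-squared with k-1 df."""
        effects = np.asarray(effects)
        w = 1.0 / np.asarray(variances)           # inverse-variance weights
        pooled = np.sum(w * effects) / np.sum(w)  # fixed-effect pooled estimate
        q = np.sum(w * (effects - pooled) ** 2)
        return q, stats.chi2.sf(q, df=len(effects) - 1)

    # Five hypothetical trials: log odds ratios and their variances.
    q, p = cochran_q([0.4, 0.6, 0.5, 0.3, 0.7], [0.04, 0.09, 0.02, 0.06, 0.05])
    print(f"Q = {q:.2f}, p = {p:.3f}; called heterogeneous if p < 0.10")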

Methods


Researchers in Oxford [2] used individual patient data to simulate clinical trials, and from these simulated 10,000 meta-analyses, using different numbers of trials per meta-analysis and different event rates. This was done for five commonly used methods of calculating heterogeneity.
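
The flavour of such a simulation can be sketched in a few lines of Python. The design below (binary outcomes, six trials of 100 subjects per arm, identical true event rates everywhere, Cochran's Q as the test) is an assumption for illustration, not the paper's actual protocol:

    # Sketch: how often a heterogeneity test set at 10% cries "heterogeneous"
    # on 10,000 meta-analyses that are homogeneous by construction.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n_meta, n_trials, n_arm = 10_000, 6, 100
    p_trt, p_ctrl = 0.2, 0.3        # same true event rates in every trial

    flagged = 0
    for _ in range(n_meta):
        e_t = rng.binomial(n_arm, p_trt, n_trials)   # events on treatment
        e_c = rng.binomial(n_arm, p_ctrl, n_trials)  # events on control
        # Log odds ratios, with 0.5 added to every cell to avoid zero counts.
        a, b = e_t + 0.5, n_arm - e_t + 0.5
        c, d = e_c + 0.5, n_arm - e_c + 0.5
        log_or = np.log((a * d) / (b * c))
        w = 1.0 / (1/a + 1/b + 1/c + 1/d)            # inverse-variance weights
        pooled = np.sum(w * log_or) / np.sum(w)
        q = np.sum(w * (log_or - pooled) ** 2)       # Cochran's Q
        if stats.chi2.sf(q, n_trials - 1) < 0.10:
            flagged += 1

    print(f"Called heterogeneous: {flagged / n_meta:.1%} (nominal rate 10%)")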

Results


The most commonly used statistical tests did not flag the expected proportion (that is, 10%) of truly homogeneous data sets as heterogeneous. They either over-estimated heterogeneity (finding up to 20% of the meta-analyses heterogeneous) or under-estimated it (finding fewer than 1% when the test was set to find 10%). When heterogeneity was introduced, they could not detect it until the data sets were very heterogeneous indeed.

The conclusion was that homogeneity tests (homogeneity being what the tests really measure) were of very limited use: they could reliably detect neither homogeneity nor heterogeneity. The fallback is to use fixed and clearly defined inclusion criteria and fixed and clearly defined outcomes.

Comment


All these tools have their place as we try and make the best sense of clinical trial data in systematic reviews and meta-analyses. But the emphasis must be on tools, not rules. Funnel plots and heterogeneity tests are not cast in stone. They have their problems.

It is not their use that is the issue, because that may well be legitimate in any particular circumstance. The worry is that people with fixed agendas (and perhaps fixed budgets) use them as rules, and inappropriately, to try and influence prescribing decisions.

If there is any rule here, it is that clinical common sense is the best indicator of whether reviews and meta-analyses make sense. Are the patients in the trials like yours? Are the inclusion criteria sensible? Do the outcomes make sense? Are they useful? And so on. No exercise of evidence gathering was harmed by adding clinical common sense.

References:

  1. J Tang, JL Liu. Misleading funnel plot for detection of bias in meta-analysis. Journal of Clinical Epidemiology 2000; 53: 477-484.
  2. DJ Gavaghan, RA Moore, HJ McQuay. An evaluation of homogeneity tests in meta-analyses in pain using simulations of individual patient data. Pain 2000; 85: 415-424.