
HRT and fracture risk


Hormone replacement therapy in older women increases bone mineral density, and therefore should help prevent fractures. This is important because nonvertebral fractures occur in 1% of women every year in the decade after their 65th birthday, increasing to 5% a year in women over 85 years (Bandolier 94). Strengthening the skeleton should help. We know that HRT increases bone mineral density. Other treatments that increase bone mineral density reduce fractures. Therefore HRT should reduce fractures. A large case control survey seemed to confirm this (Bandolier 62), with the key message that HRT protects against hip fracture while it is being taken and for a few years afterwards. Continued protection needs continued use.

What happens when we look at two meta-analyses of randomised trials for nonvertebral and vertebral fractures [1,2]? The message is about the same, though perhaps not quite as strong.


Both reviews came from the same team at York. Both had exemplary searching, including quizzing folk at international conferences to try to ensure that all published and unpublished trials were found. The key inclusion criterion was that women had been randomised to at least 12 months of treatment, with a control of placebo, no treatment, or calcium with or without vitamin D. Most trials reported only bone mineral density, so fracture data were sought from investigators in most cases.

A sensibly conservative analysis was done, with pre-specified sensitivity analyses because of clinical heterogeneity in the nature of the HRT preparation (for instance with or without progestins), in dose and duration, and in the women randomised.


Nonvertebral fractures [1]

Twenty-two randomised trials with 556 nonvertebral fractures in 8,776 women were found, eight with unpublished fracture data, and only one setting out to measure fractures as an outcome. Individual trials included 23 to 2,763 women and trial quality was generally good. Duration was 12 to 120 months. There was a wide range of results (Figure 1), with some trials having more fractures with control than HRT (below the line of equality in Figure 1), some having the same, and others having more fractures with HRT than control (above the line of equality). Some trials had no fractures in HRT or control. A data set that some might say was heterogeneous in results.

Figure 1: L'Abbé plot of nonvertebral fracture rates for HRT and control.

Size of symbol represents the size of the study

Using all trials there was a reduced risk with HRT (Table 1), but a significant result was seen only in trials with a mean age of less than 60 years when starting HRT. Pre-specified analysis of placebo-controlled trials showed a significant reduction in risk, as did trials with hip or wrist fractures. Published fracture data showed a significant reduction in fracture risk, while unpublished fracture data did not.

Table 1: Summary of results for nonvertebral and vertebral fractures

| Data set | Number of trials | Number of women | Fractures (%) with HRT | Fractures (%) with control | Relative risk (95% CI) | NNT (95% CI) |
|---|---|---|---|---|---|---|
| Nonvertebral fractures | | | | | | |
| All data | 22 | 8776 | 5.3 | 8.0 | 0.72 (0.56 to 0.93) | 36 (26 to 59) |
| Trials with at least one fracture per group | 19 | 9351 | 5.5 | 8.2 | 0.74 (0.57 to 0.95) | 37 (26 to 62) |
| Trials with at least 500 women | 5 | 5832 | 6.1 | 8.4 | 0.50 (0.36 to 0.71) | 43 (27 to 104) |
| Vertebral fractures | | | | | | |
| All data | 13 | 6726 | 1.3 | 2.0 | 0.67 (0.45 to 0.98) | 130 (72 to 645) |
| Trials with at least one fracture per group | 8 | 4692 | 1.6 | 2.6 | 0.64 (0.40 to 1.01) | not calculated |
| Trials with at least 500 women | 2 | 3766 | 1.1 | 1.2 | 1.05 (0.37 to 2.94) | not calculated |

Additional sensitivity analyses, restricted to trials with at least one fracture per treatment group or to trials with at least 500 women, also showed a significant reduction in the risk of nonvertebral fractures (Table 1).

Vertebral fractures [2]

Here 13 trials were found, with 98 vertebral fractures in 6,726 women; one study was an abstract. Trial quality was generally good. Duration was 12 to 60 months.

Overall there was a statistically significant reduction in vertebral fractures (Table 1). Most of the effect was in three trials of women with established osteoporosis (though with small numbers), but there was no effect in women without osteoporosis. In five trials with women with a mean age of more than 60 years there was a significant reduction in risk, but not in women starting HRT aged less than 60 years.

Additional sensitivity analyses, restricted to trials with at least one fracture per treatment group or to trials with at least 500 women, showed no significant reduction in the risk of vertebral fractures (Table 1).


These are two really excellent systematic reviews tackling an important yet difficult problem. When the rate at which events happen is low, even really effective interventions will need trials with huge numbers to properly demonstrate statistical and clinical significance. This is hardly going to be likely with trials of 23 women, however long the trials.
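To put a rough number on "huge", here is a minimal sketch of a standard sample size calculation for comparing two proportions. The event rates are the pooled percentages from Table 1; the 5% two-sided significance level, 80% power, equal arm sizes, the normal approximation, and the function name `women_per_arm` are all assumptions made for illustration, not anything taken from the reviews.

```python
from math import ceil
from statistics import NormalDist


def women_per_arm(p_control, p_treated, alpha=0.05, power=0.80):
    """Approximate number of women needed in each arm to detect a
    difference between two fracture rates (two-sided test, normal
    approximation, equal arm sizes). All defaults are assumptions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_treated) ** 2)


# Pooled fracture rates from Table 1 ("All data" rows), used as assumed true rates
n_nonvertebral = women_per_arm(0.080, 0.053)
n_vertebral = women_per_arm(0.020, 0.013)

print("Nonvertebral fractures, women per arm:", n_nonvertebral)
print("Vertebral fractures, women per arm:", n_vertebral)
```

Under these assumptions a single trial needs well over a thousand women per arm for nonvertebral fractures, and several thousand per arm for the rarer vertebral fractures.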

In the absence of large, carefully designed and executed randomised trials, meta-analysis of small trials is our second best approach. Where there is obvious clinical heterogeneity, as here, we end up pooling information about different (or differing) treatments over different durations. Then trials may be of different quality, and women entering the trials may be different (with or without established osteoporosis, perhaps, or at different ages). The result can be that we get heterogeneous results.

But the main problem is size (Figure 2). Low fracture rates in small studies mean that event rates are all over the place. Only when the number of women is well over 1,000 and we have about 100 fractures can any sense be made of it.

Figure 2: Fracture rate with control (placebo) against number of women given placebo
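The scatter of control event rates in small studies can be illustrated with a little arithmetic. This sketch assumes a true control fracture rate of 8% (the pooled nonvertebral control rate in Table 1) and uses a crude normal approximation to the binomial, which is poor for very small studies; the function name `plausible_range` and the chosen study sizes are illustrative only.

```python
from math import sqrt


def plausible_range(true_rate, n_women, z=1.96):
    """Approximate 95% range for the fracture rate observed in a study of
    n_women, if the true rate is true_rate (normal approximation)."""
    se = sqrt(true_rate * (1 - true_rate) / n_women)
    low = max(0.0, true_rate - z * se)
    high = min(1.0, true_rate + z * se)
    return low, high


CONTROL_RATE = 0.08  # assumed true rate, taken from Table 1 (all data, control)

ranges = {n: plausible_range(CONTROL_RATE, n) for n in (23, 100, 500, 1250)}
for n, (low, high) in ranges.items():
    print(f"{n:>5} women: {100 * low:.1f}% to {100 * high:.1f}%")
```

With 1,250 women an 8% rate corresponds to 100 fractures, the point at which the range narrows to a few percentage points; with 23 women almost any observed rate from zero to nearly 20% is compatible with the same underlying risk.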

The dilemma is between pooling enough information and accepting a degree of heterogeneity, or salami slicing down to clinical identity and having too little information to make sense. Sensitivity analysis can help, because it can tell us whether the same order of effect exists (or does not exist) whatever we do. This is one of the best current examples. The authors use some prespecified criteria for sensitivity analysis, and in Table 1 we suggest some others. Together these form a good example for teachers. For those interested in different ways of calculating NNTs, it also shows how clinically heterogeneous trials can sometimes give different results using different methods.
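As a hedged illustration of how the method matters, the sketch below calculates an NNT for nonvertebral fractures in two common ways from the "All data" row of Table 1: directly from the pooled event rates, and by applying the pooled relative risk to the control event rate. The function names are illustrative, and this is not necessarily how the NNTs in Table 1 were derived; the inputs are the rounded figures printed in the table.

```python
def nnt_from_rates(control_rate, treated_rate):
    """NNT as the reciprocal of the absolute risk reduction."""
    return 1 / (control_rate - treated_rate)


def nnt_from_relative_risk(control_rate, relative_risk):
    """NNT from a pooled relative risk applied to a control event rate."""
    return 1 / (control_rate * (1 - relative_risk))


# Table 1, nonvertebral fractures, all data (rounded published values)
CONTROL_RATE = 0.080
HRT_RATE = 0.053
RELATIVE_RISK = 0.72

nnt_rates = nnt_from_rates(CONTROL_RATE, HRT_RATE)
nnt_rr = nnt_from_relative_risk(CONTROL_RATE, RELATIVE_RISK)

print(f"NNT from pooled rates:  {nnt_rates:.1f}")
print(f"NNT from relative risk: {nnt_rr:.1f}")
```

The two answers differ because the pooled relative risk weights trials differently from a simple pooling of fractures and women, which is exactly what happens when trials are clinically heterogeneous.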

All very academic, but not much use to clinicians making decisions today. What is the answer? Does HRT reduce fractures or doesn't it? On balance it probably does, but the best guess is that about 40 women have to be treated for one to ten years to prevent a fracture in one of them.


  1. DJ Torgerson & SE Bell-Syer. Hormone replacement therapy and prevention of nonvertebral fractures. JAMA 2001 285: 2891-2897.
  2. DJ Torgerson & SE Bell-Syer. Hormone replacement therapy and prevention of vertebral fractures: a meta-analysis of randomised trials. BMC Musculoskeletal Disorders 2001 2: 7.



Thanks for your very complimentary review of our HRT reviews. However, I think there may be a problem with the NNTs in your article. One issue that may make your NNTs too conservative is that in our meta-analyses we did not include trials where there were no events. Excluding such studies will not affect the relative risk of an event, but it will affect the NNTs if you used the numbers of fractures in our meta-analyses to calculate them. Some of the trials we excluded will not have had any events (some will, but the data were either not collected or not available).

Including only trials with reported events (whether published or unpublished) will overestimate the actual event rate and give too low an NNT. Had we included all the trials, this would not have solved the problem, because some trials had events but did not collect the data, so this would have given too high an NNT. In this instance the best method, I think, for estimating the NNT is to apply the relative risks to epidemiological data on fracture rates for given populations.

This is the reason we did not calculate NNTs in our papers.

Yours sincerely

David Torgerson
Dept of Health Sciences & Centre for Health Economics
University of York
York YO10 5DD
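The approach suggested in the letter can be sketched in a few lines. The annual nonvertebral fracture rates are those quoted at the start of this article (1% a year in the decade after age 65, 5% a year over 85), and the relative risk is the pooled value from Table 1. Treating that single relative risk as applicable to both age groups is an assumption made purely for illustration, as is the function name `nnt_per_year`.

```python
def nnt_per_year(annual_fracture_rate, relative_risk):
    """Women who need treating for one year to prevent one fracture,
    given a population fracture rate and a relative risk with treatment."""
    return 1 / (annual_fracture_rate * (1 - relative_risk))


# Annual nonvertebral fracture rates quoted earlier in the article
baseline_rates = {"65 to 75 years": 0.01, "over 85 years": 0.05}

# Pooled relative risk for nonvertebral fractures (Table 1, all data);
# assumed here, for illustration only, to hold in both age groups
RELATIVE_RISK = 0.72

estimates = {
    group: nnt_per_year(rate, RELATIVE_RISK)
    for group, rate in baseline_rates.items()
}
for group, nnt in estimates.items():
    print(f"Women aged {group}: NNT about {nnt:.0f} for one year of treatment")
```

These are NNTs for a single year of treatment, so they are not directly comparable with figures pooled across trials lasting one to ten years.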
