INTRODUCTION

Twenty-three years represents one generation. Women who entered menopause in 2002, facing debilitating vasomotor symptoms, sleep disruption, and genitourinary syndrome, are now in their seventies, many having suffered preventable cardiovascular disease, osteoporotic fractures, and cognitive decline. On November 10, 2025, FDA Commissioner Marty Makary acknowledged the hormone replacement therapy (HRT) debacle as possibly one of the greatest medical errors of modern times.1,2 This admission, while overdue, demands interrogation: How did a single flawed study override decades of clinical experience and biological plausibility? Why did the correction require 23 years when methodological critiques emerged within 24 months? What systemic failures enabled this catastrophe?

The quantifiable toll staggers comprehension. Between 2002 and 2012 alone, as HRT prescriptions plummeted from 90 million to 21 million annually in the United States, an estimated 91,000 postmenopausal women aged 50-59 died prematurely.3 Approximately 50 million women were denied treatment.1 Current evidence demonstrates unequivocally that HRT initiated within 10 years of menopause reduces fatal cardiovascular events by 25-50%, cognitive decline by 64%, and Alzheimer’s disease by 35%.1 With the exception of antibiotics and vaccines, few interventions in modern medicine offer comparable population-level benefit for older women.1

This was not merely an error; it was a systemic failure. This analysis examines the structural, statistical, and psychological barriers that prevented timely correction, documents the human cost of regulatory paralysis, and proposes mandatory reforms to prevent recurrence.

The original error: extrapolation without justification

The 2002 Women’s Health Initiative (WHI) study fundamentally altered medical practice worldwide within months of publication.4 The study reported increased cardiovascular disease and breast cancer in postmenopausal women receiving combined HRT. Media coverage transformed hazard ratios into categorical contraindications. Within six months, the FDA implemented class-wide black-box warnings. Prescription rates collapsed.

The methodological flaws in the study were immediately apparent to reproductive endocrinologists. In 2004, we published comprehensive critiques identifying the fatal extrapolation error.5,6 The WHI tested one specific formulation (conjugated equine estrogen combined with medroxyprogesterone acetate, a synthetic progestogen now rarely used) in women averaging 63 years old, more than a decade past menopause. Yet regulators extrapolated these findings to all hormone formulations, all doses, all routes of administration, and all patient ages. This violated basic principles of evidence-based medicine.

Biological plausibility contradicted the interpretation. Women undergoing bilateral oophorectomy before natural menopause had long demonstrated increased cardiovascular risk, suggesting estrogen’s cardioprotective role. The “timing hypothesis” (that HRT benefits depend critically on age at initiation and time since menopause) had substantial pre-WHI support. The WHI’s population, predominantly women in their seventh decade, represented the worst possible cohort to assess therapy normally initiated in women’s fifties. Administering HRT years after permanent vascular changes occur differs fundamentally from initiating it during the perimenopausal transition, when the vascular endothelium remains responsive.

The timing of initiation was identified as crucial from the outset, yet this critical factor disappeared in regulatory translation.6 Table 1 demonstrates the staggering magnitude of the extrapolation error.

Table 1. Women’s Health Initiative Study Design vs. Regulatory Application
Aspect | WHI Study Design | FDA Regulatory Action
Formulation | One: CEE + MPA | All estrogen formulations
Dose | Fixed dose | All doses
Age | Mean 63 years | All ages
Time from menopause | >10 years | All timepoints

CEE = conjugated equine estrogen; MPA = medroxyprogesterone acetate. This table demonstrates the extrapolation error: warnings were applied to all HRT formulations on the basis of one formulation tested in one population.

Three systemic barriers to timely correction

Statistical misrepresentation as a policy tool

The presentation of WHI findings exemplifies how statistical framing shapes or distorts medical policy. A hazard ratio of 1.26 for coronary heart disease generated headlines proclaiming “26% increased risk.”4 This relative risk presentation obscured the absolute risk increase: 0.08% annually, or 8 additional cases per 10,000 women-years. For breast cancer, the hazard ratio of 1.24 likewise represented 8 additional cases per 10,000 women-years, and critically, the increase in breast cancer cases was not accompanied by a significant increase in breast cancer mortality.

The failure to communicate absolute alongside relative risk was not an inadvertent oversight; it represented a deliberate choice with catastrophic consequences. Presenting “26% increased risk” without the context that annual absolute risk increased by less than 0.1% transformed a modest, statistically marginal finding into an apparently categorical contraindication.
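The arithmetic linking the two framings can be made explicit. The following is a minimal sketch, not the WHI investigators' method: it treats the hazard ratio as a simple ratio of annual event rates (an approximation) and back-calculates the baseline rates implied by the figures quoted above, since the text does not report those baselines directly. All function names are illustrative.

```python
def implied_baseline_per_10k(excess: float, hazard_ratio: float) -> float:
    """Baseline events per 10,000 women-years implied by an excess and a hazard ratio.

    Assumes excess = baseline * (HR - 1), i.e. the HR is treated as a rate ratio.
    """
    return excess / (hazard_ratio - 1.0)


def excess_per_10k(baseline_per_10k: float, hazard_ratio: float) -> float:
    """Absolute excess events per 10,000 women-years under the same assumption."""
    return baseline_per_10k * (hazard_ratio - 1.0)


def women_years_per_extra_event(excess: float) -> float:
    """Women-years of treatment associated with one additional event."""
    return 10_000 / excess


# Figures quoted in the text: hazard ratio and excess cases per 10,000 women-years.
outcomes = {"CHD": (1.26, 8.0), "Breast cancer": (1.24, 8.0)}

for name, (hr, excess) in outcomes.items():
    baseline = implied_baseline_per_10k(excess, hr)  # back-calculated, not reported
    print(
        f"{name}: relative increase {100 * (hr - 1):.0f}%, "
        f"implied baseline {baseline:.1f} per 10,000, "
        f"absolute increase {excess / 10_000 * 100:.2f}% per year, "
        f"one extra event per {women_years_per_extra_event(excess):.0f} women-years"
    )
```

On these inputs, the same finding reads either as a "26% increased risk" or as one additional coronary event per 1,250 women-years of treatment; which figure a reader sees determines the impression formed.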

Context matters profoundly. The annual risk of death from motor vehicle accidents in the United States approximates 11 per 10,000 individuals, exceeding the absolute risk increase from HRT. Yet no regulatory body issued black-box warnings against automobile travel. The differential response reveals how risk perception, not risk magnitude, drove policy.

Moreover, the WHI demonstrated reduced all-cause mortality in women aged 50-59 who received therapy, a finding that should have dominated the discussion but was instead buried in supplementary analyses. A therapy that reduces total mortality while causing a non-significant increase in non-fatal events presents a profoundly different risk-benefit profile than the headlines suggested. Table 2 illustrates this risk communication failure.

Table 2. Risk Communication: Relative vs. Absolute Risk in WHI Study
Outcome | Hazard Ratio (Relative Risk) | Absolute Risk Increase (per 10,000 women-years)
CHD | 1.26 (26% increase) | 8 additional cases
Breast cancer | 1.24 (24% increase) | 8 additional cases
All-cause mortality (age 50-59) | 0.70 (30% decrease) | NS in original publication

CHD = coronary heart disease; NS = not significant. Relative risk creates different impressions than absolute risk. For context, the annual risk of death from motor vehicle accidents is approximately 11 per 10,000 individuals.

Institutional structures rewarding restriction over reassessment

Regulatory agencies face asymmetric incentives: swift action to restrict earns praise for “protecting public health,” while liberalization invites accusations of industry capture. After the January 2003 warnings, reversing course became institutionally untenable regardless of accumulating contrary evidence.

The institutional architecture lacks reassessment mechanisms. No mandate required agencies to revisit the WHI interpretation as subsequent studies emerged. No formal process existed for external methodological critique to trigger regulatory review. No sunset clauses forced periodic re-evaluation. The system structurally favors maintaining the status quo.

The extrapolation from one formulation in one population to class-wide warnings exemplifies regulatory overreach.1 The WHI tested conjugated equine estrogen with medroxyprogesterone acetate, a combination now rarely prescribed. Yet warnings applied equally to bioidentical estradiol, to different progestogens, to transdermal administration, and to low-dose vaginal preparations. Low-dose vaginal estrogen, with minimal systemic absorption and unequivocal benefit for genitourinary syndrome, carried the same black-box warnings as systemic therapy. The absurdity of this equivalence prompted no regulatory review.

Pharmaceutical companies, facing potential liability, added warnings to all estrogen-containing products. The resulting labels frightened physicians and patients alike, regardless of the specific product’s evidence base. The regulatory cascade (from one study to class-wide warnings to defensive labelling) created a climate in which offering HRT invited medicolegal risk, while denying it did not.

Cognitive biases privileging fear over evidence

Once institutional fear became established, contradictory evidence faced insurmountable barriers. The psychological phenomenon is well-documented: negative information carries disproportionate weight, and reversing an initial judgment requires far more evidence than was needed to form it. The WHI created a “common knowledge” framework that HRT causes heart disease and breast cancer, which subsequent studies could not dislodge despite superior methodology and contradictory findings.

The Danish Osteoporosis Prevention Study, published in 2012, demonstrated a 52% reduction in cardiovascular events among women randomized to HRT initiated at menopause, a finding diametrically opposite to WHI.7 This randomized controlled trial, methodologically equivalent to WHI but enrolling women from the appropriate population, should have prompted immediate regulatory reassessment. It did not. The study was acknowledged, cited, and then ignored in policy formation.

Similarly, WHI re-analyses stratified by age consistently showed HRT benefit in younger women and harm in older women, precisely what the timing hypothesis predicted.8 These findings were published in high-impact journals, discussed at conferences, and incorporated into guidelines, yet the black-box warnings remained unchanged for another decade.

The medicolegal environment amplified the bias. Physicians prescribing HRT to a woman who subsequently developed breast cancer faced potential litigation; physicians declining to prescribe therapy to women who subsequently suffered cardiovascular disease or fractures faced no comparable liability. The asymmetric legal risk created a powerful incentive structure that favored non-treatment, regardless of net benefit.

An entire generation of physicians completed training without adequate education in menopausal medicine.2 Residency programs eliminated didactic sessions on HRT. Young physicians learned that “HRT causes cancer,” full stop. The nuances of timing, formulation, route, and individualization disappeared from medical education. When regulatory reversal occurred in 2025, much of the medical workforce lacked the knowledge base to implement the change appropriately.

What finally enabled policy correction

Leadership matters profoundly in institutional change. Commissioner Makary explicitly framed the reversal as correcting an error rather than updating evidence, language that acknowledged systemic failure and created space for change.1,2 This rhetorical positioning neutralized accusations of industry influence that might otherwise have accompanied HRT prescription liberalization.

The July 2025 Expert Panel provided formal mechanisms for comprehensive evidence review and stakeholder input.1 Critically, the panel included not only regulators but also methodologists, practicing clinicians, patient advocates, and independent scientists. Public engagement via the Federal Register docket allowed affected individuals to document harm from denied treatment, testimony that quantified the human cost of regulatory paralysis.

The FDA conducted comprehensive drug utilization analyses examining prescribing patterns over two decades, documenting the collapse in HRT use and correlating it with outcomes data. Re-evaluation of WHI and post-WHI publications, with particular attention to timing, duration, and age-dependent risks, revealed the consistent pattern obscured by class-wide warnings: benefit in women treated within 10 years of menopause, neutral or harmful effects when initiated later.

Current evidence is unequivocal. HRT initiated within 10 years of menopause reduces fatal cardiovascular events by 25-50%, cognitive decline by 64%, Alzheimer’s disease by 35%, and fractures by 50-60%.1 The magnitude of benefit rivals that of statins for cardiovascular prevention and exceeds most interventions for cognitive preservation. The evidence base now includes randomized trials, large observational studies with careful confounding control, and meta-analyses demonstrating remarkable consistency when analyses properly account for the timing of initiation.

Five mandatory reforms to prevent recurrence

The HRT catastrophe reveals systemic failures requiring structural reform, not merely improved evidence interpretation.

First, proportional regulatory response. Class-wide warnings require class-wide evidence. When a single study of one formulation in one population generates restrictions across all formulations, all routes, all doses, and all patient populations, the response has exceeded the evidence. Regulatory precision must match scientific specificity. Black-box warnings should specify the exact population, formulation, dose, and clinical context in which the risk was demonstrated. Extrapolation beyond studied conditions should be explicitly flagged as such, not presented as an established fact.

Second, mandatory reassessment with sunset clauses. Every black-box warning should include a five-year sunset provision requiring formal re-evaluation. The default should be review, not permanence. Regulatory agencies must document that warnings remain evidence-based as new data accumulate. If reassessment is not performed, the warning should automatically expire, forcing either active renewal with updated justification or removal. This creates institutional incentive for ongoing evidence synthesis rather than regulatory inertia.
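The default-expiry rule described above can be stated precisely. The sketch below is purely illustrative and does not describe any existing FDA process: the function names are hypothetical, and the day of the month used for the January 2003 warnings is a placeholder.

```python
from datetime import date

SUNSET_YEARS = 5  # the five-year provision proposed in the text


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years, mapping 29 February to 28 February if needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def warning_status(last_reviewed: date, today: date) -> str:
    """Return 'active' before the sunset date and 'expired' from that date on.

    A formal re-evaluation would reset last_reviewed and restart the clock.
    """
    sunset = add_years(last_reviewed, SUNSET_YEARS)
    return "active" if today < sunset else "expired"


# Hypothetical illustration: a warning issued in January 2003 and never re-reviewed.
issued = date(2003, 1, 1)  # day of month is a placeholder
print(warning_status(issued, date(2007, 12, 31)))  # active
print(warning_status(issued, date(2008, 1, 1)))    # expired
```

Under such a rule, the burden shifts to the agency: inaction removes the warning instead of preserving it.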

Third, formal mechanisms for external methodological critique. When studies drive major policy changes, regulatory agencies must establish expedited processes for independent methodological review. The 2025 Expert Panel model demonstrates effective multi-stakeholder engagement; it should become standard practice, not an exceptional response. External scientific critique, particularly when published in peer-reviewed literature, as occurred with the 2004 WHI critiques, should trigger mandatory regulatory reassessment. Scientists identifying methodological flaws in policy-driving research should have formal channels to request review, with agencies required to respond within defined timeframes.

Fourth, transparent absolute risk communication as a regulatory requirement. All regulatory communications regarding risks must present both relative and absolute risk measures, with context comparing the magnitude to familiar risks. Presenting hazard ratios without absolute risk constitutes misrepresentation, whether intentional or not. The FDA’s recent revision of HRT labels to remove statements like “lowest effective dose for shortest duration” (language that implied universal harm regardless of dose or duration) represents progress, but broader reform is needed across all drug classes.

Fifth, urgent reinvestment in the menopausal medicine subspecialty. Two decades of educational neglect created a knowledge vacuum. Medical schools must restore comprehensive training in reproductive endocrinology across the lifespan, not merely reproductive years. Residency programs should include structured curricula on menopausal physiology, HRT pharmacology, individualized risk assessment, and shared decision-making. Continuing medical education must address the knowledge deficit in practicing physicians. Professional societies should develop competency standards and certification pathways in menopausal medicine.

Clinical implications for practicing physicians

The 2025 policy reversal provides unambiguous guidance, yet implementation requires physician re-education after two decades of fear-based avoidance. HRT initiated within 10 years of menopause or before age 60 should be offered to symptomatic women after individualized discussion of benefits and risks.1 The “window of opportunity” concept (that therapy provides maximum benefit when initiated during the menopausal transition before permanent vascular and neurological changes occur) must become standard knowledge.

For women with moderate to severe vasomotor symptoms, night sweats disrupting sleep, or genitourinary syndrome of menopause, HRT represents first-line treatment with unmatched efficacy. The reduction in fracture risk (50-60%) exceeds that of bisphosphonates. The cardiovascular benefit (25-50% reduction in fatal events) in appropriately selected women rivals that of statin therapy. The cognitive protection (a 64% reduction in cognitive decline and a 35% reduction in Alzheimer’s disease) has no pharmaceutical equivalent.

Vaginal estrogen for genitourinary symptoms deserves particular emphasis. Low-dose vaginal preparations achieve tissue-level therapeutic concentrations with minimal systemic absorption. The safety profile is exceptional, yet these products carried black-box warnings for two decades. Every woman experiencing vaginal dryness, dyspareunia, or recurrent urinary tract infections deserves discussion of vaginal estrogen as first-line therapy.

Shared decision-making requires accurate risk communication. Physicians must master absolute risk presentation, contextual comparison, and acknowledgment of uncertainty. The conversation should include individual risk factors (personal and family history of breast cancer, cardiovascular disease, thromboembolism), symptom severity, patient preferences and values, and alternative treatments. But the conversation must also acknowledge the evidence: for appropriate candidates, HRT offers substantial benefit with acceptable risk.

The FDA’s removal of the recommendation to prescribe “at the lowest effective dose for the shortest duration” explicitly acknowledges that treatment decisions require clinical judgment, not algorithmic rules.1 Duration of therapy should be individualized based on ongoing symptoms, continuing benefits, and evolving risk profile, not arbitrary time limits.

The moral imperative: accountability and reform

Science self-corrects, but at what pace and at what cost? Women entering menopause in 2002 are now beyond the therapeutic window. They traversed postmenopausal life without therapy that might have prevented the fractures they suffered, the cardiovascular disease that killed their peers, or the cognitive decline they now experience. The quantifiable toll (91,000 premature deaths between 2002 and 2012, and 50 million women denied potentially life-saving treatment1,3) represents only the measurable harm. It excludes the immeasurable: decades of hot flashes, sleep disruption, sexual dysfunction, and diminished quality of life. It excludes the women who died from cardiovascular disease between 2013 and 2025. It excludes the global impact, as regulatory agencies worldwide followed FDA guidance.

This was preventable. The methodological critiques emerged in 2004, 21 years before regulatory reversal. The Danish study demonstrating cardiovascular benefit appeared in 2012, 13 years before reversal. The accumulating evidence was disregarded not out of ignorance but because of institutional paralysis, asymmetric incentives, and a risk aversion that privileged bureaucratic safety over patient welfare.

Accountability requires uncomfortable questions. Who decided that class-wide warnings based on one formulation represented an appropriate regulatory response? Who reviewed and reaffirmed these warnings annually for 23 years despite contradictory evidence? What mechanisms failed such that external scientific critique, published in peer-reviewed literature, triggered no institutional reassessment? Why did it require a new commissioner explicitly calling this “one of the greatest medical errors of modern times” before change occurred?

The medical establishment (regulatory agencies, professional societies, medical educators, practicing physicians) failed a generation of women. Commissioner Makary’s acknowledgment represents a crucial first step, but acknowledgment without structural reform ensures recurrence. The question is not whether factors delayed this change; the factors are documented above. The question is whether these factors have themselves changed, or whether the next regulatory overreach will require another 23-year correction cycle.

In medicine, time translates directly to outcomes. The therapy that prevents fractures prevents them only if administered before the fracture occurs. Cardiovascular protection matters only if provided before the myocardial infarction. Cognitive preservation requires intervention before neurodegeneration. Regulatory delay does not merely postpone benefit; it eliminates it for the cohort denied timely treatment.

Changing course when evidence indicates current policy harms more than helps is not merely scientific; it is an ethical imperative. The burden of proof should not fall exclusively on those advocating for therapy, but should be shared by those maintaining restrictions. When black-box warnings persist despite accumulating contrary evidence, the restriction itself requires justification, not merely its removal.

We must do better. The reforms proposed above (proportional regulatory response, mandatory reassessment with sunset clauses, formal critique mechanisms, transparent risk communication, educational reinvestment) represent minimum requirements, not aspirational goals. Implementation should begin immediately, not await the next catastrophe. And critically, these reforms must extend beyond HRT to all areas where regulatory action shapes medical practice. The structural failures that enabled this disaster operate across therapeutic domains.

The women who should have received HRT in 2002 cannot be made whole. But we owe them, and the generations that follow, a medical regulatory system that privileges evidence over fear, that corrects course promptly when wrong, and that values human outcomes over institutional inertia.


Acknowledgements

None.

Funding Statement

This work received no specific funding from any source.

Competing Interests

The authors report no financial conflicts of interest. Z.S. published critical analyses of the Women’s Health Initiative study interpretation in 2004 and has advocated for evidence-based HRT use throughout the 23-year period discussed. Z.S. is also Editor-in-Chief of the Journal of IVF-Worldwide (JIVFww).

CRediT Authorship Contribution Statement

Conceptualization: Zeev Shoham (Equal), Ariel Weissman (Equal), Eli Y. Adashi (Equal). Investigation: Zeev Shoham (Equal), Ariel Weissman (Equal), Eli Y. Adashi (Equal). Writing – original draft: Zeev Shoham (Equal), Ariel Weissman (Equal), Eli Y. Adashi (Equal). Writing – review & editing: Zeev Shoham (Equal), Ariel Weissman (Equal), Eli Y. Adashi (Equal).

Attestation Statements

This analysis does not report original data from human or animal subjects and, therefore, did not require institutional review board approval. The authors confirm that all data and statements are derived from published peer-reviewed literature, FDA announcements, and public proceedings as cited in the references.

Data Sharing Statement

All data supporting this analysis are available in the published literature cited in the references. This manuscript contains no original research data.

Capsule

The 23-year delay in correcting hormone replacement therapy (HRT) policy caused 91,000 premature deaths; mandatory regulatory reforms are essential to prevent the recurrence of such catastrophic oversights.