In vitro fertilization (IVF) has always been a story of innovation. From its inception, the field has relied on technological advances to turn what was once an insurmountable medical obstacle into a treatable condition for millions of patients worldwide. Yet, with the proliferation of a new generation of “add-on” technologies, including time-lapse imaging, preimplantation genetic testing for aneuploidy (PGT-A), and artificial intelligence (AI)-based embryo selection, among a growing list of laboratory and clinical interventions, the central question has shifted: it is no longer whether to innovate, but how to innovate responsibly, under what evidentiary standards, at what cost, and with what consequences for equity and trust.
In this issue, Weissman et al. argue that IVF exemplifies a broader “add-on crisis” in contemporary medicine, in which expensive technologies are adopted based on biological plausibility and surrogate outcomes rather than on robust demonstration of benefit in patient-important endpoints such as cumulative live-birth rate (CLBR).1 They show that key IVF add-ons have not improved live-birth rates per initiated cycle and that their adoption may, in some settings, have coincided with stagnating or declining outcomes even as costs have risen sharply.1 In response, Bamford calls for greater nuance, suggesting that some add-ons may confer benefits not fully captured by traditional trial designs or outcome measures, including expanding pools of usable embryos, improving laboratory standardization, and potentially reducing miscarriages.2
This journal does not intend to take sides in this debate. Rather, our responsibility is to use this exchange as an opportunity to clarify which standards we should apply to emerging technologies, to delineate the outcomes that matter, and to define concrete expectations for future research and clinical practice.
What outcomes should count, and why does the field so often struggle to apply the CLBR benchmark in practice?
Much of the disagreement around add-ons reflects a deeper, unresolved question: what counts as a meaningful outcome in IVF? But beneath this ostensibly methodological dispute lies a more troubling pattern: even when the answer to this question is clear, the field has frequently fallen short of applying it consistently.
Weissman et al. place cumulative live-birth rate per initiated cycle at the center of their argument.1 From this vantage point, they show that time-lapse imaging, universal PGT-A, and various AI-based embryo selection platforms have not translated into higher take-home baby rates despite substantial incremental cost to patients and payors. Their analysis highlights that, in some large datasets, the widespread introduction of add-ons has coincided with plateauing or declining outcomes in autologous fresh cycles, while per-cycle costs have multiplied.3 Although such ecological associations cannot establish causality at the individual patient level, they underscore the absence of demonstrable population-level benefit despite widespread adoption. This is the essence of the “paying more for no better outcome” critique.
Bamford, by contrast, cautions against judging all add-ons solely by their impact on cumulative live-birth rate in unselected populations.2 He notes that time-lapse imaging has changed embryology practice by enabling continuous monitoring of embryo development, detection of abnormal cleavage patterns, and identification of fertilization events that might otherwise go unnoticed. PGT-A, while not increasing cumulative live-birth rates in randomized trials of unselected patients, may reduce miscarriage risk and optimize per-transfer outcomes in particular subgroups.4–6 The recent committee opinion of the American Society for Reproductive Medicine (ASRM) and the Society for Assisted Reproductive Technology (SART) emphasizes that any potential benefits of PGT-A are population-specific and that routine use across all IVF patients is not justified.7 Although it has not yet demonstrated superiority in live-birth outcomes, AI-based embryo selection shows promising performance in predictive modelling and in reducing inter-observer variability.8,9
These observations raise a legitimate question: should we consider only cumulative live-birth rate per initiated cycle, or should validated improvements in miscarriage risk, time to pregnancy, laboratory efficiency, and psychological burden also be weighed, especially when patients explicitly value these outcomes?
The answer must be both principled and pragmatic. At the population level, cumulative live-birth rate per initiated cycle remains the primary endpoint by which IVF success should be measured, because it integrates both the probability of achieving a live birth and the impact of interventions on the number of embryos transferred and the number of cycles undertaken.10,11 Secondary outcomes, such as miscarriage, time to pregnancy, treatment burden, and validated psychological measures, are also clinically important. They should not replace live birth as the primary endpoint; rather, they should be rigorously incorporated as secondary endpoints and reported transparently.
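For readers less familiar with this endpoint, a minimal formulation, assuming the conventional definition in which all fresh and frozen transfers arising from a single ovarian stimulation are counted toward that initiated cycle, is:

\[
\text{CLBR}_{\text{per initiated cycle}} \;=\; \frac{\text{initiated cycles achieving} \ge 1 \text{ live birth from any resulting fresh or frozen transfer}}{\text{all initiated cycles}}
\]

This is precisely why the denominator matters: per-transfer success rates can rise while this ratio stays flat, for example when an intervention reduces the number of embryos deemed usable and thereby removes transfer opportunities from the same initiated cycle.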
The more pressing question is why the CLBR benchmark, which is broadly accepted in principle, is so rarely applied in practice. The answer lies not in scientific confusion but in systemic incentives. Technologies that offer plausible biological mechanisms and improve surrogate markers are easier to sell to patients, clinics, and investors than to validate. The cost of a well-powered randomized controlled trial (RCT) falls on the developer; the cost of an unvalidated add-on falls on the patient. Until that asymmetry is corrected by regulatory expectations, professional society standards, and journal policies, the field will continue to accumulate technologies that are commercially successful yet weakly evidenced.
Evidence Gaps and Commercial Reality
Despite their differing emphases, Weissman et al. and Bamford share several points of agreement. Both acknowledge that numerous add-ons entered widespread clinical practice before robust evidence from either adequately powered RCTs or high-quality observational studies was available.1,2 Both accept that the largest RCTs to date of time-lapse imaging systems and PGT-A in unselected populations have not demonstrated improved live-birth outcomes. Both recognize that, at present, AI-based embryo selection remains experimental with respect to live-birth outcomes, albeit promising in terms of diagnostic performance and workflow standardization.8,9,12
Where they diverge is in what these facts imply. For Weissman et al., the failure of these technologies to improve cumulative live-birth rates, despite considerable added cost, signals a systemic failure in how innovation is evaluated and adopted, particularly in non-life-saving fields such as fertility. They underscore that IVF is uniquely well-positioned to demand higher evidentiary standards, given its clear primary outcome (live birth) and the existence of large, high-quality national registries such as the SART database in the United States and the Human Fertilisation and Embryology Authority (HFEA) Register of Information in the United Kingdom.1,13,14
Bamford, by contrast, emphasizes that technologies such as time-lapse imaging and PGT-A may have legitimate roles in selected populations or specific clinical contexts, and that their contributions may be underestimated when trials focus narrowly on first-transfer live birth or on unselected cohorts.2 He reminds us that IVF laboratories operate within complex systems, where marginal improvements in standardization or reductions in inter-observer variability can translate into meaningful, if difficult-to-measure, benefits over time.
Overlaying these scientific disagreements is an inescapable commercial reality. The IVF sector has undergone a marked shift from physician-owned clinics to corporate and private-equity-owned networks, within which add-ons often generate new revenue streams. Frequently paying out of pocket, patients encounter direct-to-consumer marketing that frames add-ons as “cutting-edge” or “personalized,” even when the evidence for improved live-birth outcomes is weak or absent.15 This convergence of emotional vulnerability, high willingness to pay, and aggressive marketing creates fertile ground for premature adoption and persistent use of interventions that may be, at best, neutral and, at worst, harmful or wasteful.
In such environments, uncertainty cannot be resolved by wishful thinking or by appeals to biological plausibility alone. Nor can it justify a reflexive rejection of all unproven technologies. Instead, it demands transparent communication, explicit labelling of experimental interventions, and a disciplined commitment to generating high-quality evidence.
Ethical Duties: Truthfulness, Equity, and Trust
Beyond methodology, the add-on debate is fundamentally about professional ethics. Weissman et al. highlight how overselling unvalidated add-ons may ultimately erode public trust in IVF clinicians and in medical innovation more broadly.1 When patients discover that costly interventions that they believed were essential may not improve, and may even diminish, their chances of success, their disappointment can crystallize into disillusionment, not only with individual practitioners but with the health care system as a whole. At the same time, Bamford’s response reminds us that miscarriage and treatment burden carry substantial psychological tolls, and that many patients may reasonably prioritize reductions in these risks even if the cumulative live-birth rate remains unchanged.2 Ethically, both perspectives matter.
From an editorial perspective, three duties come into particular focus:
- Truthful communication and labelling of uncertainty. Interventions for which evidence of benefit on cumulative live-birth rate is absent, weak, or population-specific should neither be presented as the standard of care nor marketed beyond what the data support. Treatment professionals must explain clearly and explicitly, during both counselling and consent, when benefits are limited to specific subgroups or secondary outcomes (e.g., miscarriage reduction in particular age or prognosis categories).
- Equity of access and opportunity cost. Many patients fund add-ons entirely out of pocket, with per-cycle costs that can rise to 20% or more of total IVF expenses. Money devoted to unvalidated add-ons may reduce patients’ ability to finance either additional cycles, if necessary, or proven interventions, thereby exacerbating socioeconomic disparities and potentially reducing overall live-birth chances for those with limited resources.
Economic evaluation must accompany clinical evaluation. The relevant metrics are not the additional fees charged per cycle, but the incremental cost per additional live birth achieved. In health economics, this is expressed as the incremental cost-effectiveness ratio (ICER), which compares the added cost of an intervention with the additional benefit it delivers. In IVF, where cumulative live birth per initiated cycle is the principal outcome, add-ons that increase per-cycle costs without improving this endpoint, by definition, raise the cost per live birth and may reduce overall access by diverting finite patient resources away from additional cycles that have proven benefit.3,10,15 Without transparent reporting of costs alongside outcomes, claims of innovation remain incomplete, because value in medicine is defined by outcomes achieved relative to resources expended.
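To make this concrete, the ICER can be written, in one common formulation (the worked numbers below are purely hypothetical and are not drawn from the cited studies), as:

\[
\text{ICER} \;=\; \frac{C_{\text{add-on}} - C_{\text{standard}}}{\text{CLBR}_{\text{add-on}} - \text{CLBR}_{\text{standard}}}
\]

For illustration only: an add-on adding \$3{,}000 per initiated cycle while raising CLBR from 40% to 42% would yield an ICER of \$3{,}000 / 0.02 = \$150{,}000 per additional live birth; if CLBR is unchanged, the denominator is zero, no finite ICER exists, and every additional dollar spent simply raises the cost per live birth.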
- Vigilance regarding conflicts of interest and systemic incentives. Ownership structures and revenue models influence how strongly clinics promote particular add-ons. Observational data suggest that PGT-A utilization is higher in clinic networks than in physician-owned centers, without corresponding gains in live-birth rates.16 It is not sufficient to disclose individual conflicts of interest (such as care provider investment in technologies they are promoting); journals, professional societies, and regulators must also acknowledge and, where feasible, counteract systemic incentives that favor rapid adoption and slow de-implementation.
Addressing these ethical duties is not an academic exercise. It is central to maintaining the fragile trust that underpins the therapeutic alliance in fertility care.
A Constructive Agenda: What This Journal Will Expect
Looking forward, the choice is not between embracing innovation and rejecting it. Rather, it is between continuing a status quo that tolerates weak evidentiary standards for expensive technologies, and constructing a more disciplined, transparent, and equitable model of innovation.
IVF is uniquely placed to pioneer such a model. Live birth is a clear and measurable primary endpoint. Large registries such as the SART database and the HFEA Register already collect detailed data on cycles, procedures, and outcomes.10,13,14 Professional societies and regulators, including the ASRM and the HFEA, have begun to articulate grading systems and guidance for add-ons, such as traffic-light classifications that distinguish interventions by evidence of benefit, uncertain benefit, or evidence of harm.7,13,14
Building on these insights, this journal will encourage work that advances methodological rigor across several interconnected dimensions. Manuscript authors should define cumulative live-birth rate per initiated cycle as the primary outcome, explicitly distinguishing it from per-transfer outcomes, with secondary outcomes pre-specified accordingly. Interventions lacking robust evidence should be described as experimental, with clear justification and transparent acknowledgement of uncertainties. These expectations apply equally to add-ons that have already entered routine practice or become the de facto standard through commercial pressure despite the absence of adequate evidence.
AI-driven tools for embryo selection, sperm assessment, and cycle prediction offer genuine promise but are susceptible to overfitting and premature deployment; accordingly, this journal will require rigorous methodological transparency and clearly defined clinical endpoints for all AI-related submissions.
In the context of AI-driven tools, additional safeguards are required beyond conventional evidentiary standards. Predictive accuracy on retrospective datasets does not equate to clinical utility. AI systems must demonstrate external validation in independent cohorts that are geographically and temporally distinct from the original training datasets. They must also undergo rigorous prospective evaluation against pre-specified clinical endpoints, with evidence of improvement in cumulative live-birth rates; gains in intermediate morphokinetic or classification metrics alone are insufficient to establish true clinical value. Transparent reporting compliant with the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis Or Diagnosis (TRIPOD)+AI and Consolidated Standards of Reporting Trials (CONSORT)-AI guidelines is mandatory, and the intended regulatory pathway (e.g., whether the tool is classified as software as a medical device) should be explicitly disclosed. Post-deployment monitoring for performance drift and bias across demographic subgroups is required, particularly in a field where patient populations differ substantially in age, prognosis, and access to care. Without such safeguards, algorithmic sophistication risks being mistaken for demonstrated clinical value.
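As one concrete, non-prescriptive illustration of what post-deployment monitoring might look like, the minimal sketch below compares a model’s discrimination for live birth across age subgroups against a validation baseline; the column names, age bands, baseline value, and drift tolerance are hypothetical assumptions introduced only for illustration, not a reference implementation or any vendor’s actual pipeline.

```python
# Minimal illustrative sketch of post-deployment drift monitoring for an
# AI embryo-selection score. All column names, age bands, baseline values,
# and thresholds are hypothetical assumptions, not a vendor pipeline.
import numpy as np
import pandas as pd
from sklearn.metrics import roc_auc_score


def auc_by_group(df: pd.DataFrame, group_col: str) -> pd.Series:
    """AUC of the predicted score against observed live birth, per subgroup."""
    def _auc(g: pd.DataFrame) -> float:
        # AUC is undefined if a subgroup contains only one outcome class.
        if g["live_birth"].nunique() < 2:
            return float("nan")
        return roc_auc_score(g["live_birth"], g["model_score"])
    return df.groupby(group_col).apply(_auc)


def flag_drift(baseline: pd.Series, current: pd.Series, tolerance: float = 0.05) -> pd.Series:
    """Flag subgroups whose AUC has dropped more than `tolerance` below baseline."""
    return (baseline - current) > tolerance


# Synthetic post-deployment log standing in for prospectively collected outcomes.
rng = np.random.default_rng(0)
log = pd.DataFrame({
    "age_band": rng.choice(["<35", "35-39", ">=40"], size=600),
    "model_score": rng.uniform(0.0, 1.0, size=600),
    "live_birth": rng.integers(0, 2, size=600),
})

# Baseline AUC per subgroup, e.g., taken from the external validation study.
baseline_auc = pd.Series(0.70, index=["<35", "35-39", ">=40"])
current_auc = auc_by_group(log, "age_band")

report = pd.DataFrame({
    "baseline_auc": baseline_auc,
    "current_auc": current_auc,
    "drift_flag": flag_drift(baseline_auc, current_auc),
})
print(report)
```

The schema itself is immaterial; the point is that subgroup-level baselines, tolerances, and triggers for re-validation should be pre-specified before deployment rather than improvised after performance has already drifted.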
This journal will also welcome pragmatic trials, cluster-randomized designs, and registry-based observational studies. Post-marketing surveillance will become the norm for newly introduced technologies. Manuscripts that present findings about add-ons should report direct costs and who bears them; submissions should explicitly describe relationships with device and diagnostic manufacturers alongside relevant organizational and institutional incentives. Without cost data and full transparency, meaningful value assessments remain impossible.
By articulating these expectations, the journal aims to foster a culture that rewards and sharpens genuine innovation and accelerates the identification, dissemination, and adoption of technologies that truly add value, while ensuring that those that do not are promptly de-emphasized or de-implemented.
Conclusion: IVF as a Stress-Test for Responsible Innovation
IVF sits at the junction of hope, technology, commerce, and ethics. It is an elective, non-acute, but emotionally charged, high-impact field in which patient distress is high, alternatives are limited, and patient willingness to pay for even marginal perceived advantages is substantial. These characteristics make it particularly vulnerable to an “add-on phenotype” of innovation, but they also make it an ideal testing ground for a more rigorous and compassionate model.
The debate between Weissman et al. and Bamford underscores that we cannot rely on anecdotes, enthusiasm, or marketing narratives to guide the future of IVF add-ons.1,2 Nor should we dismiss all emerging technologies as lacking merit or as inherently harmful. The challenge is systemic as much as scientific: as long as the commercial incentives to adopt innovations are stronger than the regulatory and professional incentives to validate them, new generations of add-ons will follow the same trajectory as those that came before them.
Therefore, our task is to insist on clarity, rigor, and transparency: about which outcomes matter, what the current evidence shows, where the uncertainties lie, what the interventions cost, and who benefits, or does not, from their use. This demands particular vigilance in the emerging AI era, in which algorithmic sophistication can be mistaken for clinical utility and where the distance between a promising validation study and widespread clinical deployment has never been shorter.
If a field with as clear an endpoint as live birth cannot align innovation with evidence, cost, equity, and transparency, it is difficult to see how other non-life-saving disciplines will do better. Conversely, if fertility medicine can pioneer robust standards for evaluating and implementing add-ons, including the next generation of AI-driven tools, it may offer a template for responsible innovation across modern health care.
This journal is committed to supporting that effort.
DECLARATION OF GENERATIVE AI AND AI-ASSISTED TECHNOLOGIES IN THE WRITING PROCESS
To prepare this manuscript, the author(s) used artificial intelligence software, including Grok, Claude AI, ChatGPT, and Open Science, to organize references. The author reviewed and edited the content as needed after using the tools and takes full responsibility for the article’s content.
FUNDING STATEMENT
No specific funding was received for this editorial.
DISCLOSURE STATEMENT
Z.S. is Editor-in-Chief of the Journal of IVF-Worldwide.
ATTESTATION STATEMENT
This editorial does not involve human participants or patient data; therefore, ethics approval by an institutional review board was not required.
DATA SHARING STATEMENT
The datasets analyzed for this editorial are available from the corresponding author upon reasonable request. Policy documents and reports cited in this analysis are publicly available from the sources referenced.
CREDIT AUTHORSHIP CONTRIBUTION STATEMENT
Conceptualization: Zeev Shoham. Data curation: Zeev Shoham. Formal Analysis: Zeev Shoham. Investigation: Zeev Shoham. Methodology: Zeev Shoham. Project administration: Zeev Shoham. Resources: Zeev Shoham. Validation: Zeev Shoham. Writing – original draft: Zeev Shoham. Writing – review & editing: Zeev Shoham.
