Can RCTs be misleading and biased?


Frank J. Veith

Randomized controlled trials (RCTs) constitute level 1 evidence, which is widely considered the best data upon which to base medical practice. This is particularly true when the RCTs are published in leading journals like the New England Journal of Medicine or Lancet. Such trials are viewed by many as the Holy Grail of medicine and thus infallible and inviolate.

However, RCTs can have many flaws that render them obsolete, non-applicable or outright misleading. More importantly, RCTs can be misinterpreted or spun by their authors or others so that they exert an effect on practice trends or standards that their data do not justify.

Possible flaws in RCTs are of two types:

1. Timeliness flaws can occur when progress is made in either the treatment arm under evaluation or the control arm. Examples would be the early trials of carotid stenting (CAS) vs. carotid endarterectomy (CEA). If progress in CAS technology or patient selection occurs, a trial showing CAS inferiority becomes invalid. Conversely, the landmark trials showing CEA to be superior to medical treatment in preventing strokes have become obsolete because dramatic progress has been made with medical treatment.

2. Many design flaws can impair the validity of RCTs. These include patient selection flaws (e.g. in SAPPHIRE, patients were selected for randomization only if they were high risk for CEA). SAPPHIRE also included 71% asymptomatic patients in whom the high adverse event rates for both CEA and CAS were unjustified. Good medical treatment would have served these patients better. CREST also had patient selection flaws. It was originally designed to compare CAS and CEA only in symptomatic patients. When adequate numbers of patients could not be recruited, asymptomatic patients were added, thereby diluting the power of the study and impairing the statistical significance of some of its results.

Other design flaws include questionable competence of operators in a trial (e.g. the CAS operators in the EVA-3S and ICSS trials); problems with randomization (e.g. SAPPHIRE, in which only 10% of eligible patients were randomized); and questionable applicability of RCT results to real-world practice (e.g. CAS operators in CREST were highly vetted and more skilled than others performing the procedure).

There are also idiosyncratic flaws, as in the EVAR 2 trial in patients unfit for open repair. Although this trial, published in Lancet, showed EVAR to have similar mortality to no treatment, half the deaths in the group randomized to EVAR occurred from rupture during a lengthy (average 57 days) waiting period before treatment. Had these deaths been prevented by a more timely EVAR, the conclusion of EVAR 2 might have been different.

Inappropriate or questionable primary endpoints in RCTs are another design flaw that can lead to misleading conclusions. An example is the inclusion of minor myocardial infarctions (MIs) with strokes and deaths as a composite endpoint in a CAS vs. CEA trial (e.g. SAPPHIRE and CREST).

The components of the primary endpoint in the CAS and CEA arms of CREST were death, stroke, and myocardial infarction. Total strokes and minor strokes were both significantly different between the two groups, in favor of CEA. Deaths and major strokes, although not significantly different between the two groups, were both numerically higher for CAS. (See complete table online at www.

Although it is arguable, it is hard to understand how minor MIs are the equivalent of strokes and deaths, and only when MIs were included were the adverse event rates in the two groups similar (7.2% for CAS vs. 6.8% for CEA, P = .51).

So much for the flaws in RCTs. What about good trials or those with only minor weaknesses? Even these can result in misleading conclusions when the authors reach conclusions unjustified by their own data. SAPPHIRE and CREST are two recent examples.

Despite the flaws in these trials, both of which were reported in the New England Journal of Medicine, the authors concluded that “with high risk patients CAS and CEA are equivalent treatments” (SAPPHIRE) and “among patients with symptomatic and asymptomatic carotid stenosis, the risk of the composite primary end-point … did not differ significantly in the group undergoing CAS and the group undergoing CEA” (CREST).

Although the CREST authors pointed out the higher incidence of stroke with stenting, others have used the CREST study to claim equivalence of CAS and CEA. Nowhere is this more apparent than in the recent American Heart Association (AHA) Guideline on the management of patients with extracranial carotid and vertebral artery disease.

This important and influential document, which was also approved by 13 other organizations including the SVS, stated that “CAS is indicated as an alternative to CEA for symptomatic patients at average or low risk of complications associated from endovascular interventions….” In Webster’s Dictionary one definition of “alternative” is “a choice between 2 things”.

This clearly implies equivalence, and it has been so interpreted by many others, particularly those biased toward catheter-based treatment. Of note, the AHA Guideline appears to be based largely on CREST, and did not even consider the findings of the ICSS trial, published in Lancet the same day as the main article reporting CREST.

Although ICSS may also have flaws, it showed, in a large group of exclusively symptomatic patients, that CAS produced significantly more strokes and diffusion-weighted MRI defects than did CEA. It is hard to understand why these ICSS results did not have more influence on the AHA Guideline.

Although my bias as a CAS enthusiast makes me believe that CAS will ultimately have a major role in the treatment of carotid stenosis patients, that bias is not yet sufficient for me to spin the data and believe we are now there. One has to wonder if bias more intense than mine was involved in the conclusion reached in the AHA Guideline.

Thus, it is apparent that misleading conclusions can be reached in articles reporting RCTs in leading journals. These can be the result of flaws in the RCTs and/or unrecognized author bias. More importantly, the results of even good trials can be further misinterpreted by others to guide practice standards in a way unjustified by the data.

It is important for all to recognize the possible role of bias in these misinterpretations. By recognizing the possible flaws in RCTs and that physicians, like all other people, are influenced by bias, we can exercise the judgment to use RCTs fairly to help us treat individual patients optimally.

Frank J. Veith, MD, is professor of surgery at New York University Medical Center and professor of surgery and William J. von Liebig Chair in vascular surgery at Case Western Reserve University and The Cleveland Clinic.

