How often does correlation=causality? "Contradicted and Initially Stronger Effects in Highly Cited Clinical Research", Ioannidis 2005:
"5 of 6 highly-cited nonrandomized studies had been contradicted or had found stronger effects vs 9 of 39 randomized controlled trials (P=.008)...Matched control studies did not have a significantly different share of refuted results than highly cited studies, but they included more studies with “negative” results.
Similarly, there is some evidence on disagreements between epidemiological studies and randomized trials. 3-5
For highly cited nonrandomized studies, subsequently published pertinent randomized trials and metaanalyses thereof were eligible regardless of sample size; nonrandomized evidence was also considered, if randomized trials were not available.
5 of 6 highly cited nonrandomized studies had been contradicted or had initially stronger effects while this was seen in only 9 of 39 highly cited randomized trials (P=.008). Table 3 shows that trials with contradicted or initially stronger effects had significantly smaller sample sizes and tended to be older than those with replicated or unchallenged findings. There were no significant differences on the type of disease. The proportion of contradicted or initially stronger effects did not differ significantly across journals (P = .60).
Small studies using surrogate markers may also sometimes lead to erroneous clinical inferences. 158 There were only 2 studies with typical surrogate markers among the highly cited studies examined herein, but both were subsequently contradicted in their clinical extrapolations about the efficacy of nitric oxide 22 and hormone therapy. 42
Box 2. Contradicted and Initially Stronger Effects in Control Studies
Contradicted Findings
In a prospective cohort, 91 vitamin A was inversely related to breast cancer (relative risk in the highest quintile, 0.84; 95% confidence interval [CI], 0.71-0.98) and vitamin A supplementation was associated with a reduced risk (P=.03) in women at the lowest quintile group; in a randomized trial 128 exploring further the retinoid-breast cancer hypothesis, fenretinide treatment of women with breast cancer for 5 years had no effect on the incidence of second breast malignancies.
A trial (n = 51) showed that cladribine significantly improved the clinical scores of patients with chronic progressive multiple sclerosis. 119 In a larger trial of 159 patients, no significant treatment effects were found for cladribine in terms of changes in clinical scores. 129
Initially Stronger Effects
A trial (n = 28) of aerosolized ribavirin in infants receiving mechanical ventilation for severe respiratory syncytial virus infection 82 showed significant decreases in mechanical ventilation (4.9 vs 9.9 days) and hospital stay (13.3 vs 15.0 days). A metaanalysis of 3 trials (n = 104) showed a decrease of only 1.8 days in the duration of mechanical ventilation and a nonsignificant decrease of 1.9 days in duration of hospitalization. 130
A trial (n = 406) of intermittent diazepam administered during fever to prevent recurrence of febrile seizures 90 showed a significant 44% relative risk reduction in seizures. The effect was smaller in other trials and the overall risk reduction was no longer formally significant 131 ; moreover, the safety profile of diazepam was deemed unfavorable to recommend routine preventive use.
A case-control and cohort study evaluation 92 showed that the increased risk of sudden infant death syndrome among infants who sleep prone is increased by use of natural-fiber mattresses, swaddling, and heating in bedrooms. Several observational studies have been done since, and they have provided inconsistent results on these interventions; in particular, they disagree on the possible role of overheating. 132
A trial of 54 children 95 showed that the steroid budesonide significantly reduced the croup score by 2 points at 4 hours, and significantly decreased readmissions by 86%. A meta-analysis (n=3736) 133 showed a significant improvement in the Westley score at 6 hours (1.2 points), and 12 hours (1.9 points), but not at 24 hours. Fewer return visits and/or (re)admissions occurred in patients treated with glucocorticoids, but the relative risk reduction was only 50% (95% CI, 24%-64%).
A trial (n = 55) showed that misoprostol was as effective as dinoprostone for termination of second-trimester pregnancy and was associated with fewer adverse effects than dinoprostone. 96 A subsequent trial 134 showed equal efficacy, but a higher rate of adverse effects with misoprostol (74%) than with dinoprostone (47%).
A trial (n=50) comparing botulinum toxin vs glyceryl trinitrate for chronic anal fissure concluded that both are effective alternatives to surgery but botulinum toxin is the more effective nonsurgical treatment (1 failure vs 9 failures with nitroglycerin). 109 In a meta-analysis 135 of 31 trials, botulinum toxin compared with placebo showed no significant efficacy (relative risk of failure, 0.75; 95% CI, 0.32-1.77), and was also no better than glyceryl trinitrate (relative risk of failure, 0.48; 95% CI, 0.21-1.10); surgery was more effective than medical therapy in curing fissure (relative risk of failure, 0.12; 95% CI, 0.07-0.22).
A trial of acetylcysteine (n = 83) showed that it was highly effective in preventing contrast nephropathy (90% relative risk reduction). 110 There have been many more trials and many meta-analyses on this topic. The latest meta-analysis 136 shows a nonsignificant 27% relative risk reduction with acetylcysteine.
A trial of 129 stunted Jamaican children found that both nutritional supplementation and psychosocial stimulation improved the mental development of stunted children; children who got both interventions had additive benefits and achieved scores close to those of nonstunted children. 117 With long-term follow-up, however, it was found that the benefits were small and the 2 interventions no longer had additive effects. 137
It is possible that high-profile journals may tend to publish occasionally very striking findings and that this may lead to some difficulty in replicating some of these findings. 163 Poynard et al ["Truth Survival in Clinical Research: An Evidence-Based Requiem?" http://www.planetadoctor.com/documentos/MBE-herramienta/13.pdf ] evaluated the conclusions of hepatology-related articles published between 1945 and 1999 and found that, overall, 60% of these conclusions were considered to be true in 2000 and that there was no difference between randomized and nonrandomized studies or high- vs low-quality studies. Allowing for somewhat different definitions, the higher rates of refutation and the generally worse performance of nonrandomized studies in the present analysis may stem from the fact that I focused on a selected sample of the most noticed and influential clinical research. For such highly cited studies, the turnaround of “truth” may be faster; in particular non-randomized studies may be more likely to be probed and challenged than non-randomized studies published in the general literature."
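The headline comparison (5 of 6 nonrandomized studies vs 9 of 39 randomized trials, P=.008) can be checked by hand. The excerpt does not say which test produced that P value, so the sketch below assumes a two-sided Fisher exact test on the 2x2 table; the function name `fisher_exact_two_sided` is my own, and only the Python standard library is used.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Assumption: this is the test behind the quoted P=.008; the excerpt
    itself does not name the test.
    """
    row1 = a + b          # e.g. all nonrandomized studies
    col1 = a + c          # all contradicted / initially-stronger studies
    n = a + b + c + d     # all studies
    denom = comb(n, row1)

    def prob(k):
        # Hypergeometric probability of k "contradicted" studies in row 1,
        # holding the row and column totals fixed.
        return comb(col1, k) * comb(n - col1, row1 - k) / denom

    k_min = max(0, row1 - (n - col1))
    k_max = min(row1, col1)
    p_obs = prob(a)
    # Sum every table at least as improbable as the observed one
    # (small relative tolerance guards against float round-off).
    return sum(prob(k) for k in range(k_min, k_max + 1)
               if prob(k) <= p_obs * (1 + 1e-9))

# Rows: nonrandomized (5 contradicted, 1 not), randomized (9 contradicted, 30 not).
p_value = fisher_exact_two_sided(5, 1, 9, 30)
print(f"{5/6:.0%} vs {9/39:.0%}, two-sided P = {p_value:.3f}")
```

Under that assumption the result rounds to the reported .008: roughly 83% of the highly cited nonrandomized studies were contradicted or initially stronger, against roughly 23% of the randomized trials. With only 6 nonrandomized studies in the sample, the gap is statistically significant but rests on very few observations.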