### gwern branwen

...For example, weightlifting enhances brain function, reverses sarcopenia, and lowers the death rate in cancer survivors. Take this last item,

*lowering death rate in cancer survivors: garden-variety aerobic exercise had no effect on survival, while resistance training lowered death rates by one third*... --http://roguehealthandfitness.com/case-for-weightlifting-as-anti-aging/

[paper in question: "The Effect of Resistance Exercise on All-Cause Mortality in Cancer Survivors", Hardee et al 2014; fulltext: https://www.dropbox.com/s/vkuvrpyfftm4onm/2014-hardee.pdf / http://libgen.org/scimag/get.php?doi=10.1016%2Fj.mayocp.2014.03.018 ]

This is a bad study, but sadly the problems are common to the field. Claiming that this study shows 'weight lifting lowered death rates and aerobic exercise did not change survival' is making at least 5 errors:

1. correlation!=causation; this is simply your usual correlational study (you know, of the sort which is always wrong in diet studies?), where you look at some health records and crank out some p-values. There should be no expectation that this will prove to be causally valid; in particular, reverse causation and confounding by health are pretty obvious here (survivors who are sicker are less able to lift weights in the first place), and should remind people of the debate about weight and mortality. (Ah, but you say that the difference they found between aerobic and resistance shows that it's **not** confounding because health bias should operate equally? Well, read on...)

2. power: with only 121 total deaths (~4% of the sample), this is inadequate to detect anything but comically large correlates of health, as the estimate of a third less mortality indicates.

3. p-hacking/multiplicity, type S errors, exaggeration factor: take a look at that 95% confidence interval for resistance exercise (which is the only result they report in the abstract), which is an HR of 0.45-0.99. In other words, if the upper bound were even the tiniest bit higher, it would no longer have the magical 'statistical significance at p<0.05'. There are at least 16 covariates and 3 full models tested. By the statistical significance filter, an HR of 0.67 will be a serious exaggeration (because only exaggerated estimates would - just barely - reach p=0.05 on this small dataset with only 121 deaths).

4. "The Difference Between 'Significant' and 'Not Significant' is Not Itself Statistically Significant" (http://www.stat.columbia.edu/~gelman/research/published/signif4.pdf): the difference between aerobic exercise and resistance exercise is **not** statistically-significant in this study. The 95% CI for the HR in model 1 for aerobic exercise is (0.63-1.32), and for resistance exercise, (0.46-0.99). That is, the confidence intervals overlap. (Specifically, comparing the proportion of aerobic exercisers who died with the proportion of resistance exercisers who died, I get `prop.test(c(39,75), c(1251,1746))` = p=0.12; to compute a survival curve I would need more data, I think.) The study itself does not anywhere seem to directly compare aerobic with resistance but always works in a stratified setting; I don't know if they don't realize this point about the null hypotheses they're testing, or if they did do the logrank test and it came out non-significant and they quietly dropped it from the paper.

5. the fallacy of controlling for intermediate variables: in the models they fit, they include as covariates "body mass index, current smoking (yes or no), heavy drinking (yes or no), hypertension (present or not), diabetes (present or not), hypercholesterolemia (yes or no), and parental history of cancer (yes or no)." This makes no sense. Both resistance exercise and aerobic exercise will themselves influence BMI, smoking status, hypertension, diabetes, and hypercholesterolemia. What does it mean to estimate the correlation of exercise with health which excludes all impact it has on your health through BMI, blood pressure, etc? You might as well say, 'controlling for muscle percentage and body fat, we find weight lifting has no estimated benefits', or 'controlling for education, we find no benefits to IQ' or 'controlling for local infection rates, we find no mortality benefits to public vaccination'. This makes the results particularly nonsensical for the aerobic estimates if you want to interpret them as direct causal estimates - at most, the HR estimates here are an estimate of weird indirect effects ('the remaining effect of exercise after removing all effects mediated by the covariates'). Unfortunately, structural equation models and Bayesian networks are a lot harder to use and justify than just dumping a list of covariates into your survival analysis package, so expect to see a lot more controlling for intermediate variables in the future.
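To make point 3 concrete, here is a rough simulation of the significance filter. It is only a sketch under stated assumptions: the standard error is backed out of the reported 95% CI (0.45-0.99) on the log-HR scale, estimates are treated as normally distributed, and the "true" HR of 0.9 in the usage line is a purely hypothetical value I picked for illustration, not anything from the paper. The function name `significance_filter` is likewise made up.

```python
import math
import random

Z_95 = 1.959964  # two-sided 95% critical value of the standard normal


def significance_filter(true_hr, ci_95, n_sims=200_000, seed=0):
    """Simulate replications of a study and keep only the 'significant' ones.

    true_hr : ASSUMED true hazard ratio (hypothetical; must differ from 1)
    ci_95   : reported 95% CI for the HR, used only to back out the SE
    Returns (power, wrong-sign share, exaggeration ratio), where the last
    two are computed among the significant replications only.
    """
    lo, hi = math.log(ci_95[0]), math.log(ci_95[1])
    se = (hi - lo) / (2 * Z_95)          # SE of the log-HR implied by the CI
    true_log = math.log(true_hr)

    rng = random.Random(seed)
    significant = []
    for _ in range(n_sims):
        estimate = rng.gauss(true_log, se)
        if abs(estimate) > Z_95 * se:    # would be reported as p < 0.05
            significant.append(estimate)

    power = len(significant) / n_sims
    wrong_sign = sum(e * true_log < 0 for e in significant) / len(significant)
    mean_abs = sum(abs(e) for e in significant) / len(significant)
    return power, wrong_sign, mean_abs / abs(true_log)


power, wrong_sign, exaggeration = significance_filter(0.9, (0.45, 0.99))
print(f"power={power:.3f}  wrong sign={wrong_sign:.3f}  exaggeration={exaggeration:.1f}x")
```

If the true effect were that modest, power would be low and the replications that clear p<0.05 would overstate the log-HR several-fold; a different assumed true HR changes the numbers but not the direction of the bias.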
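The arithmetic in point 4 can be sketched in Python as well (standard library only). `two_proportion_test` is meant to mirror what R's `prop.test` does by default for two groups (a chi-squared test with Yates' continuity correction). `compare_hazard_ratios` is my own rough addition: it backs the standard errors out of the two reported CIs and treats the two estimates as independent, which is an assumption that cannot be exactly right for overlapping groups in one cohort, so take it as a ballpark and not as a substitute for a proper logrank test. Both function names are made up for this example.

```python
import math

Z_95 = 1.959964  # two-sided 95% critical value of the standard normal


def two_proportion_test(events, totals, correct=True):
    """Two-sided test that two proportions are equal (chi-squared, 1 df)."""
    (x1, x2), (n1, n2) = events, totals
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    diff = abs(p1 - p2)
    if correct:
        # Yates' continuity correction, capped so it cannot overshoot zero
        diff -= min(0.5 * (1 / n1 + 1 / n2), diff)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = diff / se
    # Two-sided normal tail area; identical to the chi-squared(1) tail of z**2
    return p1, p2, math.erfc(z / math.sqrt(2))


def compare_hazard_ratios(ci_a, ci_b):
    """Approximate z-test for the difference of two log-HRs given 95% CIs.

    ASSUMPTION: the two estimates are independent and normal on the log scale.
    """
    def log_estimate(ci):
        lo, hi = math.log(ci[0]), math.log(ci[1])
        return (lo + hi) / 2, (hi - lo) / (2 * Z_95)

    est_a, se_a = log_estimate(ci_a)
    est_b, se_b = log_estimate(ci_b)
    z = (est_a - est_b) / math.hypot(se_a, se_b)
    return z, math.erfc(abs(z) / math.sqrt(2))


# Death counts and group sizes as given in point 4
p1, p2, p_prop = two_proportion_test((39, 75), (1251, 1746))
# Model 1 CIs as given in point 4: aerobic, then resistance
z, p_hr = compare_hazard_ratios((0.63, 1.32), (0.46, 0.99))
print(f"proportions: {p1:.3f} vs {p2:.3f}, p={p_prop:.2f}")
print(f"log-HR difference: z={z:.2f}, p={p_hr:.2f}")
```

The first test reproduces the p=0.12 quoted above, and the cruder CI-based comparison comes out around p=0.27; either way, nowhere near a significant difference between the two kinds of exercise.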
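Point 5 is easy to demonstrate with a toy simulation. Everything here is invented for illustration: a world in which exercise lowers a continuous 'risk score' *only* by lowering BMI, with made-up effect sizes, and plain least squares standing in for the Cox models the paper actually fits. The names `cov` and `mediator_demo` are hypothetical helpers, not anything from the paper.

```python
import random


def cov(xs, ys):
    """Sample covariance of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def mediator_demo(n=20_000, seed=1):
    """Compare the exercise coefficient with and without adjusting for BMI."""
    rng = random.Random(seed)
    exercise, bmi, risk = [], [], []
    for _ in range(n):
        e = 1.0 if rng.random() < 0.5 else 0.0
        b = 28.0 - 2.0 * e + rng.gauss(0, 3)   # ASSUMED: exercise lowers BMI by 2
        r = 0.1 * b + rng.gauss(0, 1)          # ASSUMED: risk depends on BMI only
        exercise.append(e)
        bmi.append(b)
        risk.append(r)

    s_ee, s_bb = cov(exercise, exercise), cov(bmi, bmi)
    s_eb = cov(exercise, bmi)
    s_er, s_br = cov(exercise, risk), cov(bmi, risk)

    # Simple regression of risk on exercise: the total effect (true value -0.2)
    unadjusted = s_er / s_ee
    # Exercise coefficient from the two-predictor OLS fit that also has BMI
    adjusted = (s_er * s_bb - s_br * s_eb) / (s_ee * s_bb - s_eb ** 2)
    return unadjusted, adjusted


unadjusted, adjusted = mediator_demo()
print(f"exercise effect, unadjusted: {unadjusted:+.3f}")
print(f"exercise effect, 'controlling for' BMI: {adjusted:+.3f}")
```

In this toy world the unadjusted coefficient lands near the true total effect of -0.2, while the BMI-adjusted one lands near zero: 'controlling for' the mediator makes a real benefit vanish from the estimate.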

Any of these is sufficient. This sort of problem is why one should put more weight on meta-analyses of RCTs - for example, "Progressive resistance strength training for improving physical function in older adults" http://onlinelibrary.wiley.com/enhanced/doi/10.1002/14651858.CD002759.pub2

[above comment is still in moderation, so I'm putting it here as a copy.] #statistics #weightlifting
