How often does correlation=causality? While I'm at it, here's an example of how not to do it... "A weight of evidence approach to causal inference", Swaen & van Amelsvoort 2009:
"_Objective_: The Bradford Hill criteria are the best available criteria for causal inference. However, there is no information on how the criteria should be weighed and they cannot be combined into one probability estimate for causality. Our objective is to provide an empirical basis for weighing the Bradford Hill criteria and to develop a transparent method to estimate the probability for causality.
Study Design and Setting: All 159 agents classified by the International Agency for Research on Cancer as category 1 or 2A carcinogens were evaluated by applying the nine Bradford Hill criteria. Discriminant analysis was used to estimate the weights for each of the nine Bradford Hill criteria.
Results: The discriminant analysis yielded weights for the nine causality criteria. These weights were used to combine the nine criteria into one overall assessment of the probability that an association is causal. The criteria strength, consistency of the association and experimental evidence were the three criteria with the largest impact. The model correctly predicted 130 of the 159 (81.8%) agents.
Conclusion: The proposed approach enables using the Bradford Hill criteria in a quantitative manner resulting in a probability estimate of the probability that an association is causal."
Sounds reasonable, right? Take this IARC database, presumably of carcinogens known to be such by randomized experiment, and see how well the correlational studies predict the classifications after training with https://en.wikipedia.org/wiki/Linear_discriminant_analysis - a discriminant model rather than a regular linear regression, since ordinary regression models are built for inference and tend to be weak at prediction. It's not clear what they did to prevent overfitting, but reading through, something else strikes me:
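To make the setup concrete, here is a toy sketch of what the paper appears to be doing: a Fisher linear discriminant trained on criterion scores to separate "carcinogen" from "non-carcinogen" labels. The data below are synthetic stand-ins (just two binary criteria instead of their nine, and made-up labels), not the paper's actual ratings; the point is the mechanics - the analysis yields one weight per criterion and a threshold, and the in-sample accuracy it reports (their 130/159) will generally overstate out-of-sample performance if nothing like cross-validation was done:

```python
# Toy Fisher linear discriminant in pure Python. Two synthetic binary
# "criteria" (stand-ins for e.g. strength and consistency); labels are
# generated from them plus noise, so this is NOT the paper's data.
import random

random.seed(0)

def mean(rows):
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(len(rows[0]))]

# 159 synthetic "agents", matching the paper's count.
data = []
for _ in range(159):
    x = [random.randint(0, 1), random.randint(0, 1)]
    y = 1 if (x[0] + x[1] + random.gauss(0, 0.8)) > 1.0 else 0
    data.append((x, y))

pos = [x for x, y in data if y == 1]
neg = [x for x, y in data if y == 0]
m1, m0 = mean(pos), mean(neg)

# Within-class scatter matrix (2x2), then its closed-form inverse.
S = [[0.0, 0.0], [0.0, 0.0]]
for grp, m in ((pos, m1), (neg, m0)):
    for x in grp:
        d = [x[0] - m[0], x[1] - m[1]]
        for i in range(2):
            for j in range(2):
                S[i][j] += d[i] * d[j]
det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
Sinv = [[S[1][1] / det, -S[0][1] / det],
        [-S[1][0] / det, S[0][0] / det]]

# Fisher weights w = S^-1 (m1 - m0); threshold at the midpoint projection.
dm = [m1[0] - m0[0], m1[1] - m0[1]]
w = [Sinv[0][0] * dm[0] + Sinv[0][1] * dm[1],
     Sinv[1][0] * dm[0] + Sinv[1][1] * dm[1]]
mid = [(m1[i] + m0[i]) / 2 for i in range(2)]
c = w[0] * mid[0] + w[1] * mid[1]

# In-sample accuracy - the analogue of the paper's "130 of 159" figure,
# which, absent cross-validation, is an optimistic estimate.
correct = sum(1 for x, y in data
              if (w[0] * x[0] + w[1] * x[1] > c) == (y == 1))
print(f"weights: {w}, in-sample accuracy: {correct}/159")
```

Note that even this toy version would "work" no matter where the labels came from: the discriminant recovers whatever signal is in the labels, causal or not.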
"The IARC has evaluated the carcinogenicity of a substantial number of chemicals, mixtures, and exposure circumstances. These evaluations have been carried out by expert interdisciplinary panels of scientists and have resulted in classification of these agents or exposure conditions into human carcinogens (category 1), probable human carcinogens (category 2A), possible human carcinogens (category 2B), not classifiable agents (category 3), and chemicals that are probably not carcinogenic to humans (category 4) (IARC, 2006). Although the IARC Working Groups do not formally use the Bradford Hill criteria to draw causal inferences many of the criteria are mentioned in the individual reports. For instance, the preamble specifically mentions that the presence of a dose-response is an important consideration for causal inference. In this analysis, the IARC database serves as the reference database although we recognize that it may contain some disputable classifications. However, to our knowledge there is no other database containing causal inferences that were compiled by such a systematic process involving leading experts in the areas of toxicology and epidemiology."
Wait.
"These evaluations have been carried out by expert interdisciplinary panels of scientists and have resulted in classification of these agents or exposure conditions into human carcinogens"
"evaluations have been carried out by expert interdisciplinary panels"
"IARC Working Groups do not formally use the Bradford Hill criteria to draw causal inferences many of the criteria are mentioned"
Wait. So their database with causality/non-causality classifications is... based on... opinion. They got some experts together and asked them.
And the experts use the same criteria that the authors are using to predict the classifications.
What. So it's circular. Worse than circular, randomization and causality never even enter the picture. They're not doing 'causal inference', nor are they giving an 'overall assessment of the probability that an association is causal'. And their conclusion ("The proposed approach enables using the Bradford Hill criteria in a quantitative manner resulting in a probability estimate of the probability that an association is causal.") certainly is not correct - at best, they are predicting expert opinion (and maybe not even that well), they have no idea how well they're predicting causality.
But wait, maybe the authors aren't cretins or con artists, and have a good justification for this approach, so let's check out the Discussion section where they discuss RCTs:
"Using the results from randomized controlled clinical trials as the gold standard instead of the IARC database could have been an alternative approach for our analysis. However, this alternative approach has several disadvantages. First, only a selection of risk factors reported in the literature have been investigated by means of trials, certainly not the occupational and environmental chemicals. Second, there are instances in which randomized trials have yielded contradictory results, for instance, in case of several vitamin supplements and cancer outcomes."
You see, randomized trials are bad because sometimes we haven't done them but we still really really want to make causal inferences so we'll just pretend we can do that; and sometimes they disagree with each other, while the IARC database never disagrees with itself! Thank goodness we have official IARC doctrine to guide us in our confusion...
This must be one of the most brazen "it's not a bug, it's a feature!" moves I've ever seen.
Mon chapeau, Gerard, Ludovic; mon chapeau.
Incidentally, Google Scholar says this paper has been cited at least 40 times; looking at some, it seems the citations are generally positive. These are the sort of people deciding what's a healthy diet and what substances are dangerous and what should be permitted or banned.
Enjoy your dinners.