gwern branwen
2,465 followers|1,440,071 views


gwern branwen

Shared publicly  - 
Fun list of papers. Lots of interesting results, with only a few stinkers.

(Specifically: the Sesame Street claim is obviously bogus, as the discontinuity argument makes no sense - of course TV station type correlates with stuff - and the effect is implausibly high compared to the actual RCTs on early-childhood education; and the 16 And Pregnant claim seems to be falsified by continued drops in pregnancy rates even after it went off the air.)

#statistics #economics  
Many researchers consider randomized controlled trials (RCTs) to be the gold standard methodology in the social sciences. Figure 1, taken from Test, Learn Adapt (2010), shows how RCTs work. Starting with a group of people, ra...

gwern branwen

Shared publicly  - 
One of the monuments of doujinshi culture receives a fresh update.

This is the 18th version of Touhou Lossless Music Collection torrent, current size ~1.35 TiB. Download and seed. If you have an older version, you can update it with this script. Main torrent (7z with cues and *.tta files): Touhou lossless music collection v.18 (1.35 TiB or 1 482 171 635 871 ...

gwern branwen

Shared publicly  - 
An attempt to combine the old neuroscience debate about whether there is a master trick to learning with some of the recent deep learning accomplishments and architectures.
(The image below is from a recent mysterious post to r/machinelearning, probably from a Google project that generates art based on a visualization tool used to inspect the patterns learned by convolutional neural networks. I a...

gwern branwen

Shared publicly  - 
Who saw that coming? Trippy paper.

"Hunting and Hallucinogens: The use psychoactive and other Plants to improve the Hunting ability of dogs", Bennett & Alarcón 2015 / /!zl1wmAgJ!sGzDUPS5ZPXHBA9jLamReqbp7Bj6pbRwWgBFBeylrNU ; excerpts:

"*Ethnopharmacological relevance*: Cultures throughout the world give plants to their dogs in order to improve hunting success. These practices are best developed in lowland Ecuador and Peru. There is no experimental evidence for the efficacy of these practices nor critical reviews that consider possible pharmacological effects on dogs based on the chemistry of the ethnoveterinary plants.
Aim: This review has three specific aims: 1. Determine what plants the Ecuadorian Shuar and Quichua give to dogs to improve their hunting abilities, 2. Determine what plants other cultures give to dogs for the same purpose, and 3. Assess the possible pharmacological basis for the use of these plants, particularly the psychoactive ones.
Methods: We gathered Shuar (Province of Morona-Santiago) and Quichua (Napo and Orellana Provinces) data from our previous publications and field notes. All specimens were vouchered and deposited in QCNE with duplicates sent to NY and MO. Data presented from other cultures derived from published studies on ethnoveterinary medicine. Species names were updated, when necessary, and family assignments follow APG III (Angiosperm Phylogeny Group 2009). Chemical data were found using PubMed and SciFinder.
Results: The Shuar and Quichua of Ecuador use at least 22 species for ethnoveterinary purposes, including all but one of their principal hallucinogens. Literature surveys identified 43 species used in other cultures to improve hunting ability. No published studies have examined the pharmacological activity of these plant species in dogs. We, thus, combined phytochemical data with the ethnobotanical reports of each plant and then classified each species into a likely pharmacological category: depuratives/deodorant, olfactory sensitizer, ophthalmic, or psychoactive.
Conclusions: The use of psychoactive substances to improve a dog’s hunting ability seems counterintuitive, yet its prevalence suggests that it is both adaptive and that it has an underlying pharmacological explanation. We hypothesize that hallucinogenic plants alter perception in hunting dogs by diminishing extraneous signals and by enhancing sensory perception (most likely olfaction) that is directly involved in the detection and capture of game. If this is true, plant substances also might enhance the ability of dogs to detect explosives, drugs, human remains, or other targets for which they are valued.

In lowland areas of the Neotropics, the primary role of canines is to assist in hunting wild game. Hunting efficiency using dogs compares favorably to other forms of hunting (Koster 2009). The percentage of hunting trips that included dogs varies widely across cultures from a high of 83% (Mayangna and Miskito of Nicaragua) to 3% (Piro of Peru). Hunting success with dogs depends in large part on the targeted species. Although canines can be employed for any terrestrial species, they are particularly effective against pacas (Cuniculus paca, Fig. 2), agoutis (Dasyprocta spp.), and other animals that thrive in anthropogenic environments. The absence of dogs among some lowland cultures may be due to high mortality rates of dogs, rather than a canine aversion.

Within many cultures, hunting dogs receive particularly good care (Koster 2009). A Shuar woman, for example, may nurse a pup along with her own children (Bennett et al. 2002). In training dogs, both the Shuar and Quichua maintain the animals with a minimal diet supplemented with wild plants. While many plant species are employed to target canine illnesses, the majority are used to enhance the hunting ability of dogs. In a study that focused exclusively on ethnoveterinary practices, Jernigan (2009) identified 34 plants that the Peruvian Aguaruna give to their dogs, most often to improve their hunting prowess. Plants are employed in baths to reduce the dogs' scent or to mask odors, thus decreasing their detectability by the targeted prey. Plants also function to clean buccal and nasal cavities to enhance olfaction (e.g., Lans et al. 2001, Sanz-Biset et al. 2009) or to enhance night vision (Wilbert 1987).

Koster (2009) notes the “occasional” use of hallucinogens, but the use of psychoactive plants is actually frequent and widespread in many parts of the Old and, especially, the New World tropics (e.g., Bennett et al. 2002).

The Shuar and Quichua are the largest indigenous groups in lowland Ecuador. They mostly reside at elevations from 300 to 1,200 m in terra firme forests. This territory spans two of Holdridge's (1967) life zones, tropical moist forest and premontane tropical wet forest. Study sites were located in the Provinces of Morona-Santiago and Napo (Fig. 3). Both groups are horticulturists, growing manioc (Manihot esculenta) and plantains (Musa × paradisiaca L.) as their principal starches. Hunting (Fig. 4) and fishing supplement animal sources of protein from domesticated chickens and pigs.

The Shuar and Quichua employ at least 22 species for dogs (Table 1). The studies from which these data were drawn did not focus on ethnoveterinary medicines. It is therefore likely that more exist. In most cases, the plants have corresponding human uses. However, some species or varieties are especially designated for canines. Four Shuar ethnoveterinary plants carry the name yawá, which means dog in the Shuar language: yawá kunkunari (Justicia pectoralis), yawá urints (Alternanthera paronychioides), yawá piripiri (Cyperus sp.) and yawá maikua (Brugmansia versicolor). With the exception of Brunfelsia grandiflora D. Don., all the principal Shuar hallucinogens are given to dogs.
Seven of the plants were utilized for purely medical reasons, mostly to treat botfly or other infections. The remaining plants are given to hunting dogs specifically to enhance their hunting prowess. Nine were used for the general purpose of improving hunting ability. A mixture of manioc and akapmas (Fittonia albivenis) was said to improve the ability to track game. Kunápik (Tabernaemontana sananho) and yawá piripiri (Cyperus sp.) appear to initiate hunting predilections in dogs. Quichua give payanshi (Abuta grandifolia) to their hunting dogs to keep them quiet and both the Quichua and the Shuar give the potent stimulant wais (Ilex guayusa, Fig. 5) to their hunting dogs so that “they will not be lazy.”

The use of plants to improve the hunting ability is best documented in Ecuador and Peru but examples can be found in other South American countries as well as the Caribbean, Indochina, Papua New Guinea, and the Solomon Islands (Table 2). Examples from the literature revealed 71 citations of 65 species that are used in 54 combinations. Of these, the majority (43) are said to improve hunting ability (e.g., Dendrobium pulchellum). Five enhance hunting success for specific game (e.g., Xanthosoma brasiliense for wild hogs). Four are believed to make hunting dogs more alert (e.g., Petiveria alliacea) and four are said to specifically enhance olfaction. The Secoya of Ecuador apply latex from Tabernaemontana sananho fruits to a dog’s nose so that “it can smell far.” A mixture of ginger (Zingiber officinale) and tobacco (Nicotiana tabacum, Fig. 6) is thought to enhance night vision in both hunters and their dogs. Ten plants are employed in baths for hunting dogs. A mixture that includes Tabernaemontana sananho and Brugmansia sp. is given to dogs so that they “can communicate with their masters.”
The combined data from the Shuar and Quichua (Table 1) and the literature (Table 2), omitting those species that are not related directly to hunting or those species that have not been determined to at least the genus, reveal 71 species in 34 families that are given to dogs to improve their hunting ability (Table 3). There are some chemical data for most of the species or for close relatives. By combining the phytochemical data with the ethnobotanical reports of plant use, we classified each species into a likely pharmacological category. Twenty-six are depuratives/deodorants (e.g., Siparuna guianensis), and many of these also have antimicrobial or anti-inflammatory activity. Ten species are classified as olfactory sensitizers (e.g., Caladium schomburgkii, Araceae). The largest category was psychoactives, with 25 species. Nineteen of these species are hallucinogens (e.g., Banisteriopsis caapi, Fig. 7) and most of the remaining are stimulants (e.g., Ilex guayusa). Two are ophthalmic (discussed previously). The remaining are either unknown or difficult to classify.

3.3.1 Depuratives/Deodorants
More than half of the depuratives/deodorants have noticeably strong odors (Table 4). Dendropanax arboreus has a distinctive odor due to the presence of polyacetylenes. The specific epithets of Mansoa alliacea and Petiveria alliacea, together with some of their common names, refer to the plants’ garlic-like odor. Siparuna and Piper spp. possess abundant volatile terpenoid compounds that contribute to their strong and distinctive aromas. Cucurbitacins found in Momordica charantia produce its characteristic and pungent smell. Sesquiterpenes and monoterpenes in Renealmia alpinia and Zingiber officinale contribute to the distinctive ginger aroma of these plants.

3.3.3 Psychoactives
The psychoactive plants given to dogs are dominated by hallucinogens (Table 6). While most of these are well-known as hallucinogenic plants and are commonly used in shamanistic rituals (e.g., Anadenanthera peregrina, Banisteriopsis caapi), the activity of others is yet to be determined (e.g., Fittonia albivenis, Dendrobium pulchellum). The stimulants Ilex guayusa (caffeine & other methylxanthine alkaloids) and Nicotiana tabacum also are given to hunting dogs. Quichua heat tobacco leaves, then administer them through the noses of their dogs to keep them active and resilient during the hunting trips.

Dogs respond to commonly-used hallucinogens in a similar manner to humans. Vaupel et al. (1978a) showed that beta-phenethylamine and d-amphetamine increased respiration, dilated pupils and produced restlessness in chronically spinalized dogs. Frith et al. (1987) recorded circling, dilated pupils, hyperactivity, rapid breathing, and salivation in dogs given methylenedioxymethamphetamine. These effects potentially could enhance a dog’s hunting ability.

Only two species were cited as nocturnal ophthalmics – agents that improve vision. Wilbert (1987) reported that a mixture of tobacco and ginger is applied to the eyes of both hunters and their dogs to improve night vision. Few studies have examined the effects of plant extracts on night vision. Tetrahydrocannabinol from Cannabis sativa L. has been shown to enhance night vision in some studies (e.g., Russo et al. 2004). The effects of tobacco are less clear. While some studies have shown that tobacco smoke decreases night vision, others have shown that nicotine enhances night vision, presumably due to the stimulating effects of nicotine (Anonymous 2011). There are no studies on the effects of ginger on night vision. Though the atropine-containing genus Brugmansia was one of the more frequently cited psychoactive plants given to hunting dogs, the reason for its use was never explicitly said to be related to improvement of vision. Atropine is well-known as a mydriatic and homatropine has been shown to improve nocturnal myopia (Koomen et al. 1951).

Both the Shuar and Quichua give Ilex guayusa, which contains methylxanthines, to their hunting dogs. The methylxanthine alkaloid theobromine is toxic to dogs (Strachan and Bennet, 1994). Small doses of the related alkaloid caffeine generate benign arrhythmias in dogs; higher doses cause severe arrhythmias (Mehta et al. 1997). There is clearly a dose-dependent response in canines. Small doses may induce alertness in habituated animals. Quichua deliver small nasal doses to their dogs.

Nonetheless, the practice of administering psychoactive plants to canines is well-established. Could such a practice persist if it impaired hunting success? This is unlikely as hunting is a crucial complement to subsistence practices in the lowland tropics. Vollenweider (1994) hypothesized a disruptive effect of psychedelic substances on sensory gating–the filtering of redundant or superfluous stimuli. Riba et al. (2002), in contrast, suggest ayahuasca has a P50 suppressing effect on sensory gating in humans. We hypothesize that hallucinogenic plants alter perception in hunting dogs by diminishing ancillary signals and enhancing others that aid in the detection and capture of game (Fig. 8). If this is true, the implications are significant. Perhaps plant substances could enhance the ability of dogs to detect explosives, drugs, human remains, or enhance the scores of other abilities for which dogs are valued."
4 comments on original post
There aren't even any good synonyms which start with 'n' either! (I looked it up on after shapr commented, hoping I could top it.) Clearly this situation must be changed with some sort of n* neologism for dogs. 'Nosers'?

gwern branwen

Shared publicly  - 
On walking down a driveway strewn thick with annelid corpses:

    earthworm ragnarök:
    a second summer storm falls
    ceaseless and careless
    another cool rain,
    uncaring and unceasing
    - earthworm ragnarök
    Another spring rain,
    uncaring and unceasing
    - earthworm ragnarök
    Ceaseless and careless
    a second summer storm falls
    - earthworm ragnarök
    Ceaseless and careless
    a second spring storm showers
    - earthworm ragnarök
    Ceaseless and careless
    a second spring storm showers
    - earthworm ragnarök
    Ceaseless and careless,
    a cool summer storm shower
    - earthworm ragnarök
    Ceaseless and careless
    is the welcome summer rain
    - earthworm ragnarök
    Cool, ceaseless, careless,
    is the welcome summer rain
    - earthworm ragnarök
    Cool and unceasing,
    we welcome the summer rain
    - earthworm ragnarök
    Cool and unceasing,
    we welcome rain, forgetting
    - earthworm ragnarök
    Cool and unceasing,
    we carelessly welcome rain
    - earthworm ragnarök
    Cool and unceasing,
    we rejoice in summer rain
    - earthworm ragnarök
    Cooling and ceaseless,
    some rejoice in summer rain
    - earthworm ragnarök
    Cooling and ceaseless,
    some rejoice in summer rain
    - earth worms' worlds' ending

    Uncaring, unceasing,
    some rejoice in summer rain
    - earth worms' worlds' ending

    earth worms' worlds' ending,
    while - cooling and ceaseless -
    some rejoice in rain

    earth worms' world ending,
    while - cooling and ceaseless -
    some greet the spring rain

    earth worms' world ending,
    while - cooling and ceaseless -
    some greet gentle rain

Dang, writing poetry is hard. 19 versions and I'm not completely satisfied with any.
Ashley: I think you read too much avant garde poetry if you read that as one long poem and not a bunch of stabs at the same core theme :)

Martin: 'Earthworm Ragnarök' would also make a good band name.

gwern branwen

Shared publicly  - 
The question is of course ill-defined, since “largest”, “possible”, “inhabitable” and “world” are slippery terms. But let us aim at something with maximal surface area that can be inhabited by at least terrestrial-style organic life of human size and is allowed by the known laws of physics.
3 comments on original post

gwern branwen

Shared publicly  - 
"Epidemiology, genetics and the ‘Gloomy Prospect’: embracing randomness in population health research and practice", Smith 2011; excerpts:

"Epidemiologists aim to identify modifiable causes of disease, this often being a
prerequisite for the application of epidemiological findings in public health programmes, health service planning and clinical medicine. Despite successes in identifying causes, it is often claimed that there are missing additional causes for even reasonably well-understood conditions such as lung cancer and coronary heart disease. Several lines of evidence suggest that largely chance events, from the biographical down to the sub-cellular, contribute an important stochastic element to disease risk that is not epidemiologically tractable at the individual level. Epigenetic influences provide a fashionable contemporary explanation for such seemingly random processes. Chance events—such as a particular lifelong smoker living unharmed to 100 years—are averaged out at the group level. As a consequence population-level differences (for example, secular trends or differences between administrative areas) can be entirely explicable by causal factors that appear to account for only a small proportion of individual-level risk. In public health terms, a modifiable cause of the large majority of cases of a disease may have been identified, with a wild goose chase continuing in an attempt to discipline the random nature of the world with respect to which particular individuals will succumb. The quest for personalized medicine is a contemporary manifestation of this dream. An evolutionary explanation of why randomness exists in the development of organisms has long been articulated, in terms of offering a survival advantage in changing environments. Further, the basic notion that what is near-random at one level may be almost entirely predictable at a higher level is an emergent property of many systems, from particle physics to the social sciences. These considerations suggest that epidemiological approaches will remain fruitful as we enter the decade of the epigenome.
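The abstract's claim that a cause can account for little individual-level risk yet fully explain group-level differences can be checked with a small calculation. The sketch below is not from the paper: the risks (15% for the exposed, 1% for the unexposed) and the exposure prevalences of the two areas are hypothetical numbers chosen only for illustration.

```python
def disease_rate(prevalence, risk_exposed, risk_unexposed):
    """Expected disease rate in a population with the given exposure prevalence."""
    return prevalence * risk_exposed + (1 - prevalence) * risk_unexposed


def variance_explained(prevalence, risk_exposed, risk_unexposed):
    """Share of individual-level outcome variance explained by exposure.

    For a binary outcome Y and a binary exposure X this is
    Var(E[Y|X]) / Var(Y).
    """
    rate = disease_rate(prevalence, risk_exposed, risk_unexposed)
    between = prevalence * (1 - prevalence) * (risk_exposed - risk_unexposed) ** 2
    total = rate * (1 - rate)
    return between / total


# Hypothetical numbers, chosen only for illustration.
RISK_EXPOSED, RISK_UNEXPOSED = 0.15, 0.01
area_a = disease_rate(0.50, RISK_EXPOSED, RISK_UNEXPOSED)  # half the area is exposed
area_b = disease_rate(0.10, RISK_EXPOSED, RISK_UNEXPOSED)  # a tenth is exposed
r2_a = variance_explained(0.50, RISK_EXPOSED, RISK_UNEXPOSED)

print(f"disease rate, area A: {area_a:.3f}")
print(f"disease rate, area B: {area_b:.3f}")
print(f"individual-level variance explained in area A: {r2_a:.3f}")
```

With these made-up inputs the exposure explains under 7% of who gets the disease within area A, while the gap between the two areas (8.0% versus 2.4%) is due entirely to the difference in exposure prevalence.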

"We cannot imagine these diseases, they are called idiopathic, spontaneous in origin, but we know instinctively there must be something more, some invisible weakness they are exploiting. It is impossible to think they fall at random, it is unbearable to think it."
--James Salter, Light Years, 1975

Despite many successes, even with respect to the most celebrated—such as the identification of cigarette smoking as a major cause of lung cancer and other chronic diseases—it can appear that much remains to be done. Consider Winnie, lighting a cigarette from the candles on her centenary birthday cake, who, after 93 years of smoking, is not envisaging giving up the habit (Figure 1).
...loom large in the popular imagination 2 and are reflected in the low positive predictive values and C statistics in many formal epidemiological prediction models. In general, epidemiologists do a rather poor job of predicting who is and who is not going to develop disease.
This apparent failing of epidemiology has long been recognized. Writing about ischaemic heart disease (IHD) 40 years ago, Tom Meade and Ranjan Chakrabarti reported that ‘within any risk group, prediction is poor; it is not at present possible to express individual risk more precisely than as about a 1 in 6 chance of a hitherto healthy man developing clinical IHD in the next 5 years if he is at high risk’. 3

I have certainly promulgated such views in the (usually unsuccessful) pursuit of pounds or dollars, although the exact percentage of ‘explanation’ by established causes would fall and rise in relation to degree of desperation. The most feted contemporary candidate for better prediction is probably genetics. With the perception (in my view exaggerated) that genome-wide association studies (GWASs) have failed to deliver on initial expectations, 5 the next phase of enhanced risk prediction will certainly shift to ‘epigenetics’ 6,7 —the currently fashionable response to any question to which you do not know the

of this kind are, in the terminology popularized within behavioural genetics, shared (or common) environmental factors. It is therefore perhaps surprising that the groundbreaking 1987 paper by Robert Plomin and Denise Daniels, 16 ‘Why are children in the same family so different from one another?’, recently reprinted with commentaries in the IJE, 17–21 has apparently had little influence within epidemiology. The implication of the paper—which expanded upon an earlier analysis 22 —was that, genetics aside, siblings are little more similar than two randomly selected individuals of roughly the same age selected from the source population that the siblings originate from. This may be an intuitive observation for many people who have siblings themselves or have more than one child. Arising from the field of behavioural genetics, the paper focused on measures of child behaviour, personality, cognitive function and psychopathology, but, as Plomin points out, the same basic finding is observed for many physical health outcomes: obesity, cardiovascular disease, diabetes, peptic ulcers, many cancers, asthma, longevity and various biomarkers assayed in epidemiological studies. 18 These findings come from studies of twins, adoptees and extended pedigrees, in which the variance in an outcome is partitioned into a genetic component, the contribution of common environment (i.e. that shared between people brought up in the same home environment) and the non-shared environment (i.e. exposures that are not correlated between people brought up in the same family). The shared environment—which is the domain of many of the exposures of interest to lifecourse epidemiologists—is reported to make at best small contributions to the variance of most outcomes. The non-shared environment—exposures which (genetic influences apart) show no greater concordance between siblings than between non-related individuals of a similar age from the same population—constitute by far the dominant class of non-genetic influences on most health and health-related outcomes (Box 1). Table 1 presents data from a large collaborative twin study of 11 cancer sites, with universally large non-shared environmental influences (58–82%), heritabilities in the range 21–42% (excluding uterine cancer, for which a value of 0% is reported) and smaller shared environmental effects, zero for four sites and ranging from 5% to 20% for the remainder. 23 Many other diseases show a similar dominance of non-shared over shared environmental influences. 18 Indeed, a greater non-shared than shared environmental component appears to apply to some, 24–28 although not all, 29 childhood-acquired infections and the diseases they cause. This is such a counter-intuitive observation that one commentator on an earlier draft of this paper used childhood infectious disease epidemiology as an example of a situation in which the shared environment must be dominant.
...However, as Neven Sesardic points out, even within behavioural genetics the central, rather momentous, finding regarding the apparently small or non-existent contribution of family background to child outcomes went under-appreciated; it was ‘an explosion without a bang’. 19
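The partition of variance into genetic, shared-environment and non-shared-environment components described above is often approximated from twin correlations using Falconer's formulas. The sketch below is a textbook approximation, not the method used in the studies the excerpt cites. It assumes purely additive genetic effects, no assortative mating and equal environments for identical (MZ) and fraternal (DZ) pairs, and the correlations passed in are hypothetical.

```python
def falconer_ace(r_mz, r_dz):
    """Approximate ACE variance components from twin correlations.

    r_mz: phenotypic correlation between identical (MZ) twins
    r_dz: phenotypic correlation between fraternal (DZ) twins
    """
    a2 = 2 * (r_mz - r_dz)  # additive genetic share (heritability)
    c2 = 2 * r_dz - r_mz    # shared (common) environment share
    e2 = 1 - r_mz           # non-shared environment share, including measurement error
    return {"A": a2, "C": c2, "E": e2}


# Hypothetical correlations, for illustration only.
components = falconer_ace(r_mz=0.40, r_dz=0.22)
for name, share in components.items():
    print(f"{name}: {share:.2f}")
```

For these invented correlations the non-shared component comes out at 0.60, heritability at 0.36 and the shared environment at 0.04, which is the same qualitative pattern the excerpt reports for the cancer data.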

For epidemiologists, the fact that the generally small shared environmental influences on many outcomes appeared to get even smaller (or disappear completely) with age—as is seen, for example, with respect to body mass index and obesity 31 —increases the relevance of the message, since later life health outcomes are often what we study. Yet, within epidemiology, the impact of this work has been minimal; of the 607 citations of the Plomin and Daniels paper on ISI Web of Science (as of May 2011), only a handful fall directly within the domain of epidemiology or population health. In the recent book, Family Matters: Designing, Analysing and Understanding Family-based Studies in Lifecourse Epidemiology, 32 the issue is barely touched upon; the balanced one page it receives near the end of the 340-page book being perhaps too little, too late. 33 Between-sibling studies as a way of controlling for potential confounding have been widely discussed within epidemiology, both in the book in question 34 and elsewhere. 35,36 Certainly, this is a useful method for taking into account shared aspects of the childhood environment. But if shared environment has little impact on many outcomes then, on the face of it, the approach might be missing the issue of real concern—the more important non-shared environmental factors. Despite this, the use of sibling controls sometimes appears to uncover substantial confounding. For example, maternal smoking during pregnancy was found in a large Swedish study to be associated with lower offspring IQ, even after adjustment for many potential confounding factors. 37 In a between-sibs comparison, however, there was no association of maternal smoking with IQ of offspring, which the authors interpreted as indicating that the association seen for unrelated individuals was due to residual confounding. If shared environment is of such little importance, how can it generate meaningful confounding in epidemiological studies? We will return to this issue later.

An extensive research programme in the behavioural and social sciences consequent on the Plomin and Daniels review focused on the direct assessment of effects of the systematic aspect of the non-shared environment. Instruments were developed to collect detailed data on sibling-specific parenting practice, sib–sib interactions and the influence of schools and peer groups, and studies including more than one child per family were explicitly established to allow investigation of why siblings differ. However, a decade ago, a meta-analytical overview of such studies concluded that there was little direct evidence of important influences of specific non-shared environmental characteristics on behavioural and social outcomes mainly assessed during the first two decades of life. 38 At best, only small proportions of the phenotypic variance attributed to the non-shared environment related to directly measured influences. The effects were rarely statistically robust and the median value of the proportion of variation accounted for was 3%. In the behavioural genetic studies, estimates of the proportion of the overall phenotypic variance accounted for by the non-shared environment are almost always over 50%, and often substantially so; similar findings apply to cancers (Table 1). There are more optimistic assessments of the current status of studies directly assessing the effects of non-shared environment, 18,39 but in these the magnitude of the effects appears small. In an example presented in Plomin’s assessment of three decades of research on this issue 18 non-shared aspects of maternal negativity does have a statistically robust association with offspring depressive symptoms, but accounts for only around 1% of the variance. 40

Systematic aspects of the non-shared environments of adults that have large effects on disease outcomes may await identification. However, the inability to identify such effects using intensive assessments of exposure and outcomes in childhood is sobering. Furthermore, in longitudinal twin studies, in which twin pairs have repeat assessments, the general finding is that the non-shared environmental variance at one age overlaps little with that at a later age—i.e. there appear to be unique and largely uncorrelated factors acting at different ages. For example, with respect to body mass index, the non-shared environmental components at age 20, 48, 57 and 63 years are largely uncorrelated with each other. 52 This suggests that exposures contributing to non-shared environmental influences are often unsystematic and of a time- or context-dependent nature. Similar findings have emerged from studies of various other outcomes, with non-shared environmental influences contributing little, if anything, to tracking of phenotypes over time. 53 A distinction can be drawn between the stable and unstable aspects of the non-shared environment, with studies tending to point to the latter as being of more statistical importance in terms of explaining variance in the distribution of disease risk. This is a crucial issue, since some environmental exposures which are partly non-shared in adulthood (such as cigarette smoking and occupational exposures) tend to track over time—and thus be stable components of the non-shared environment.
Currently, there is largely an absence of evidence—rather than evidence of absence—of directly assessed systematic non-shared environmental influences on health, and little active research in the biomedical field. However, as the phenotypic decomposition of variance shows similar patterns in the medical, behavioural and social domains, it seems prudent to assume that similar causal structures exist, and equivalent conclusions should be drawn: a large component of variation in health-related traits cannot be accounted for by measureable systematic aspects of the non-shared environment.

Many features of twin study ana-
lysis can be problematic. For example, twin study
analysis often assumes that genetic contributions are
additive, and that genetic dominance (in the classic
Mendelian sense) or gene–gene interactions (epista-
sis) do not contribute to the genetic variance. Such
an assumption can lead to under-estimation of the
shared environmental component. 55–57 Conversely,
twin studies also assume no assortative mating (i.e.
parents are no more genetically similar than if ran-
domly sampled from the population) and no gene–
environment covariation, both of which can lead to
over-estimation of the shared environmental compo-
nent. 55 Different study designs for estimating compo-
nents of phenotypic variation make different
assumptions, however. Conventional twin studies,
studies of twins reared apart, extended twin-family
studies (in which other family members are
included), other extended pedigree studies and adop-
tion studies (including those in which there is quasi-
random assignment of particular adoptees) generally
come to the same basic conclusions about the relative
magnitude of these components. 58 All these designs
have been applied to the study of body mass index
and obesity, with the findings indicating roughly the
same magnitude of heritability. 55,59–64 This makes it
less likely that these are seriously biased, because dif-
ferent biases would all have to generate the same ef-
fects, which is not a plausible scenario.
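
The way these assumptions feed into the variance components can be illustrated with the classical Falconer approximation. This is a minimal sketch, not necessarily the model fitted in any of the cited studies, and the twin correlations below are hypothetical values chosen only for illustration.

```python
def falconer_ace(r_mz: float, r_dz: float) -> dict:
    """Classical Falconer decomposition from MZ and DZ twin correlations.

    Assumes purely additive genetic effects, no assortative mating and
    no gene-environment covariation (the assumptions discussed above).
    """
    a2 = 2 * (r_mz - r_dz)  # additive genetic share (heritability)
    c2 = 2 * r_dz - r_mz    # shared environmental share
    e2 = 1 - r_mz           # non-shared environment plus measurement error
    return {"A": a2, "C": c2, "E": e2}


# Hypothetical correlations consistent with a purely additive model.
est = falconer_ace(r_mz=0.70, r_dz=0.40)

# Hypothetical case with dominance: the DZ correlation drops below half
# the MZ correlation, which pushes the shared-environment estimate down
# (here below zero) while the heritability estimate goes up.
est_dom = falconer_ace(r_mz=0.70, r_dz=0.30)

print(est)
print(est_dom)
```

In the second call the shared environmental estimate falls relative to the first, which is the direction of bias the excerpt attributes to non-additive genetic effects.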

With respect to the ‘missing heritability’, to take the
example of height—referred to by both Plomin 18 and
Turkheimer 25 —the estimate of the proportion of her-
itability explained by identified variants they give, of
<5%, has already increased to >10%, 65 and directly
estimated heritability (relating phenotypic similarity
to stochastic variation in the proportion of the
genome shared between siblings) indicates similar
heritabilities to those seen in twin studies. 66
Genome-wide prediction using common genetic varia-
tion across the genome also points to the effects of
measured genetic variation moving towards the
expectation from conventional heritability estimates. 67
Such data suggest there are large numbers of variants
as yet not robustly characterized that are contributing
to the heritability of height, with rare variants not
identifiable through GWAS probably accounting for
much of the remainder...In summary, it
seems improbable that heritability has been substan-
tially over-estimated at the expense of shared envir-
onment. The basic message that a larger non-shared
than shared environmental component to phenotypic
variance is the norm is unlikely to be overturned.

Shared environmental effects, although generally
small, are more substantial for some outcomes,
including musical ability 69 and criminality in adoles-
cents and young adults; 70 respiratory syncytial virus
infection, 29 anti-social behaviour, 53,71 mouth ulcers 72
and physical activity 73 in children and lung function
in adults. 74 Furthermore, findings with respect to
shared environmental contributions have face validity.
For example, in a twin study applying behavioural
genetic variance decomposition to behaviours, dis-
positions and experiences, shared environmental
effects were found for only 9 of the 33 factors inves-
tigated. 75 However, they were identified for those as-
pects of life that would appear to depend on shared
family characteristics, for example, for a child being
read to by a parent, but not for the child reading
books on their own. Similarly, the number of years
a child had music lessons had a substantial shared
environmental component, as might be expected as
this will initially depend on the parents organizing
such lessons. Continuing to play an instrument into
adulthood, however, did not have an identified shared
environmental contribution.

Shared environ-
ment can be addressed through analysis of spousal
similarities in health outcomes, as environments are
shared to an extent by cohabiting couples, and these
also yield what on the face of it are rather small effect
estimates. For example, the cross-spousal correlation
for body mass index does not change from when cou-
ples initially come together (reflecting assortative
mating) over many years of them living together in
an at least partially shared environment. 61

Of most relevance to epidemiological approaches,
however, is that models generally fix the shared en-
vironmental component to zero if it is not ‘statistically
significantly’ different from zero. This is evident
in Table 1; with respect to pancreatic cancer, for ex-
ample, the shared environmental component is given
as 0, with a 95% confidence interval (CI) 0–0.35 (i.e.
the upper limit being 35% of phenotypic variance). In
many cases, it is simply stated that these studies find
no effect of shared environmental influences, even
though the findings are compatible with quite sub-
stantial contributions, but these cannot be reliably
estimated in the generally small samples available in
twin and adoption studies. Thus, a twin study of
aortic aneurysm reported that there was ‘no support
for a role of shared environmental influences’, 78 with
the 95% CI around the effect estimate being 0–27%. A
recent meta-analysis found that for various aspects of
child and adolescent psychopathology, shared envir-
onment makes a non-negligible contribution in ad-
equately powered analyses. 79 The claims of there
being ‘no shared environmental influence’, which
are often made (Box 2), might more realistically be
seen as an indication of inadequate sample size and
the fetishization of ‘statistical significance’. 80

The stochastic nature of phenotypic development is
something we should not be surprised to encounter
(Box 3). In his 1920 paper, ‘The relative importance of
heredity and environment in determining the piebald
pattern of guinea pigs’, Sewall Wright (Figure 2) pre-
sented a seminal path analysis (Figure 3), that has
frequently been cited as a source of this particular
statistical method. 87 Wright observed that ‘nearly all
tangible environmental conditions—feed, weather,
health of dam, etc., are identical for litter mates’; in
current terminology, they are part of the shared en-
vironment. Such factors were found to be of minor
importance; instead, most of the non-genetic variance
‘must be due to irregularities in development due to
the intangible sort of causes to which the word
chance is applied’. 87 Wright pointed out that meas-
urement error could not be separated from this intan-
gible variance, as is the case with non-shared
environment in current parlance. In a later paper, 88
Wright and his PhD student Herman Chase independ-
ently graded the guinea pig coat patterns, and demon-
strated that measurement error was only a minor
contributor (Figure 4). A summary table (Table 2)
included a shared environmental influence on litter-
mates—age of the mother—but the intangible vari-
ance dominated, with the estimate of the magnitude
of this being similar to estimates seen for the
contribution of the non-shared environment in rela-
tion to many human traits. 16 In humans, of course,
age of mother at conception could be a non-shared
environmental factor influencing differences between
siblings. In the inbred guinea pig strain, where gen-
etic differences were minor, heredity was not an issue,
and the intangible (‘non-shared environmental’) fac-
tors were even more dominant.

In genetically identical Caenorhabditis elegans reared in the same environments there are large differences in age-related functional declines, attributable to purely stochastic events. 89 In the case of genetically similar inbred laboratory rats, Klaus Gartner noted the failure to materially reduce variance for a wide variety of phenotypes, despite several decades of standardizing the environment. 90,91 Indeed, there was hardly any reduction in variance compared with that seen in wild living rats experiencing considerably more variable environments...Embryo splitting and transfer experiments in rodents and cattle demonstrated that the prenatal environment was also not a major source of phenotypic variation. 90,91 In genetically identical marbled crayfish raised in highly controlled environments considerable phenotypic differences emerge. 94 These and numerous other examples from over nearly a century 87,93–98 demonstrate the substantial contribution of what appear to be chance or stochastic events—which in the behavioural genetics field would fall into the category of non-shared environmental influences—on a wide range of outcomes.

If such a substantial role for chance exists in the
emergence of phenotypic (including pathological) pro-
files, why is this? One possible answer, with a long
pedigree, 121–123 is that it provides for evolutionary
bet-hedging. 124 Fixed phenotypes may be tuned to a
given environment, but in changing conditions a
phenotype optimized for propagation in one situation
may rapidly become suboptimal, 125 a proposition sup-
ported by experimental evidence. 126,12

Reflecting on their
demonstration of considerable phenotypic—including
epigenetic—differences between genetically identical
crayfish, they conclude that such variation may ‘act
as a general evolution factor by contributing to the
production of a broader range of phenotypes that
may occupy different micro-niches’. 94 The substantial
non-shared environmental contribution to many out-
comes could, therefore, include an element—perhaps
substantial—of random phenotypic noise, consequent
on stochastic epigenetic processes. At the molecular
level, the potential existence of such processes has
been observed within twin studies, with the formal
demonstration of non-shared environmental contribu-
tions to epigenetic profiles 130 and of substantial dif-
ferences in epigenetic markers between monozygotic
twins. 119
Other mechanisms can also contribute to pheno-
typic diversity, including meiotic recombination and
Mendelian assortment of genetic variants acting on
highly polygenic traits, with such genetic variants
having small individual effects. Mutation will also
increase phenotypic variation. Sibling contrast ef-
fects—siblings becoming less similar than their gen-
etic and shared environmental commonalities would
suppose—could also provide for such evolutionary
bet-hedging. 129 Although evidence supporting such a
process is sparse, it could lead to inflation of
non-shared environmental influences and deflation
of shared environment estimates from twin studies.

Most cases of lung cancer
are attributable to smoking, but many smokers do not
develop lung cancer. Thus, in the Whitehall Study of
male civil servants in London cigarette smoking ac-
counts for <10% of the variance (estimated as the
pseudo-R²) 141 in lung cancer mortality. 102 At the
population level, however, smoking accounts for vir-
tually all of the variance—over 90% with respect to
lung cancer mortality over time in the USA, 142 and
virtually all of the differences in rates between areas
in Pennsylvania. 143 It is in relation to this large
contribution of smoking to the population burden of
lung cancer that <10% of variance accounted for by
cigarette smoking among individuals observed in pro-
spective epidemiological studies, and the 12% shared
environmental variance reported in Table 1, should be
considered. The shared environmental component will
in part reflect shared environmental differences in
cigarette smoking initiation. 144 The non-shared
environmental component (62% of the variance in
Table 1) will include the non-shared environmental
influence on initiation, amount and persistence of
smoking. 144 However, as discussed earlier, stable
aspects of the non-shared environment—which smok-
ing would tend to be—are generally small contribu-
tors to the total non-shared environmental effect,
and thus much of this will also reflect the substantial
contribution of the kinds of chance events—

These reflections will be unexceptional to epidemi-
ologists, as they merely illustrate a key point made by
Geoffrey Rose in his contributions to the theoretical
basis of population health 148,149 —that the determin-
ants of the incidence rate experienced by a population
may explain little of the variation in risk between
individuals within the population. Accounting for
incidence differs from understanding particular inci-
dents. Consider obesity in this regard; 150 its preva-
lence has increased dramatically over the past few
decades, yet estimates of the shared environmental
contributions to obesity are small. Clearly germline
genetic variation in the population has not changed
dramatically to produce this increase in obesity.
However, as Table 3 demonstrates, the prevalence of
obesity has increased in both genders, all ages, all
ethnic and socio-economic groups, and in both smo-
kers and non-smokers. 151 The most likely reason for
this is that there has been an across the board shift
in the ratio of energy intake to energy expenditure.
Study designs utilized to estimate heritability cannot
pick this up—twins, for example, are perfectly
matched by birth cohort. 150 Thus, although energy
balance may underlie the burden of obesity in a popu-
lation—and behind this, the social organization of
food production, distribution and promotion, together
with policies influencing transportation, urban plan-
ning and leisure opportunities—the determinants of
who, against this background, is obese within a popu-
lation could be largely dependent on a combination of
genetic factors and chance...Rose illustrated this point with the thought experi-
ment of a population in which all the individuals
smoke 20 cigarettes a day, in which ‘clinical, case–
control and cohort studies alike would lead us to
conclude that lung cancer was a genetic disease;
and in one sense that would be true, since if everyone
is exposed to the necessary agent, then the distribu-
tion of cases is wholly determined by individual sus-
ceptibility’. 134
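
Rose's distinction between what drives incidence and what explains individual variation can be made concrete with a small calculation. The sketch below uses entirely hypothetical numbers (half the population exposed, a 10% risk in the exposed and a 1% risk in the unexposed) and a plain linear R² for a binary exposure and a binary outcome, rather than the pseudo-R² used in the Whitehall analysis.

```python
def variance_explained_and_paf(p_exposed, risk_exposed, risk_unexposed):
    """Return (R^2, population attributable fraction) for a binary exposure.

    R^2 is the squared correlation between exposure and outcome;
    the attributable fraction is the share of cases that would not
    occur if everyone had the unexposed risk.
    """
    overall = p_exposed * risk_exposed + (1 - p_exposed) * risk_unexposed
    cov = p_exposed * risk_exposed - p_exposed * overall
    var_x = p_exposed * (1 - p_exposed)
    var_y = overall * (1 - overall)
    r2 = cov ** 2 / (var_x * var_y)
    paf = (overall - risk_unexposed) / overall
    return r2, paf


# Hypothetical population: 50% exposed, risks of 10% versus 1%.
r2, paf = variance_explained_and_paf(0.5, 0.10, 0.01)
print(f"variance explained: {r2:.1%}, cases attributable: {paf:.1%}")
```

With these made-up inputs the exposure accounts for only a few percent of the variance between individuals while being responsible for most of the cases, which is the pattern the excerpt describes for smoking and lung cancer.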

Even Francis
Galton—the sometime bogeyman of the eugenics movement—wrote ‘Nature prevails enormously over nur-
ture when the differences of nature do not exceed what is commonly to be found among persons of the
same rank of society and in the same country’. 189 In other words, the contribution of genetic inheritance to
differences within a population is large when there is limited environmental variation between people within
a particular context. If the context were broadened, the contribution of such environmental factors would be
greater. Heritability is not a fixed characteristic, nor does high heritability within a particular situation
indicate that environmental change cannot lead to dramatic modification of outcomes. Height—the topic
of much of Galton’s own work—is both highly heritable and highly malleable, as changes over time in
height make clear. 190 Wilhelm Johannsen, the coiner of the term ‘gene’ recognized that in a genetically
highly homogeneous group ‘hereditary may be vanishingly small within the pure line’, 191 and that in this
situation ‘all the variations are consequently purely somatic and therefore non-heritable’. 191 Conversely, in a
highly standardized environment, the contribution of genetic factors will be increased. It is traditional in
epidemiological and related fields to hark back to such trusted thought experiments as how phenylketonuria
(PKU) would be expressed against the background of different levels of phenylalanine intake within popu-
lations, to demonstrate that the same outcome can be 100% heritable and 100% environmental in different
contexts. 5,192–197 The point is well made that the presence of a clear genetic predisposition does not mean
that environmental change cannot have major effects on disease risk. Perhaps reflecting the contested nature
of this area, however, public health academics are sometimes asymmetrical in their reasoning, and after
having presented the clear example of PKU they then claim that secular trends and migrant studies—with
their unambiguous demonstrations of environmental influences on disease—provide arguments against
strong genetic predisposition to common disease. 5 This is equivalent to saying that the clear demonstration
that genetic lesions underlie PKU in permissive environments argues against any major environmental
contribution to PKU.
A second popular thought experiment relates to the possession of two eyes or two legs. The reason
humans are almost always born with two of each is genetically determined. However, within a population
the trait would not be highly heritable—and certainly not 100% heritable—with loss of a leg or eye generally
reflecting accidental events. The distinction between explaining individual trajectories (genes are responsible
for the development of two eyes and two legs) and variation in a population is clear, and reflects the
distinction between ‘who?’ (why does one person have a disorder or problem rather than another?) and
‘how many’ (what proportion of the population are affected?) questions. 198

Within sociology, for example,
the perhaps under-appreciated role of chance has been emphasised, 206 illustrated with entertaining examples
from the sporting world. A striking example of what is known as Stein’s paradox in statistics is that within-
season prediction of the end of season batting averages for particular baseball players is generally better if
strongly weighted towards the average of all players at that stage in the season. 207 The best guess at what
will happen to an individual can often be made by largely discounting individual characteristics. The popular
recognition of the importance of chance in people’s lives 164 can also influence response to cultural artefacts.
Thus in films, novels or plays explanation of events is often near-deterministic, which in certain circum-
stances appears satisfying. Consider Alfred Hitchcock’s film Marnie. The behaviour of the eponymous
character—fear of thunderstorms, the colour red and men, together with her thieving and frigidity—is all
explained at the end of the film by a particular event occurring when Marnie was six. She discovered her
prostitute mother with a client during a thunderstorm and ended up killing him (in a cinematic shock of
bright red blood) with a poker. Everything seamlessly rolled on from this event. In crime stories this is often
what the reader wants. As Stephen Kern entertainingly demonstrates 208 the range of causal models in such
narratives has a similar range to epidemiology—from the long-arm of early life (or prenatal) events through
to primarily psychological and social causation. Outside of murder novels, however, the factitious nature of
such explanations can be entirely unsatisfactory. The apparent reality of the well-told narrative appears
unreal precisely because everything is tied up and explained—a notion that has resonance with David
Shields’ literary manifesto Reality Hunger. 209 To take one example, the clunking plots of the novels of Ian
McEwan—Saturday for example—revolve around such faux ‘explanations’. The work of McEwan—and simi-
lar purveyors of book club fare, such as Jonathan Franzen—appears, paradoxically, much less true than such
novels as Laurence Sterne’s Tristram Shandy, Machado de Assis’ Epitaph of a Small Winner, Blaise Cendrars’
Moravagine or Alasdair Gray’s Lanark, which are apparently not seeking such realism. In these works expla-
nations, when offered, become things to be explained, and the often random nature of the world as codified
in people’s experience is respected.

Rowe and Plomin
noted that after the birth of a second child parents are
often struck by how different their two children are,
despite upbringing being in common. In relation to
health, non-professional understanding of causes of
disease regularly identify the role of chance (or
fate) 164 and heritable factors 165 as being of consider-
able importance. Indeed I have to confess that when I
was involved in a cross-disciplinary project exploring
the construction of models of disease causation held
by the general public—which we referred to as ‘lay
epidemiology’ 2 —I was disappointed that, for the
public at large, there appeared to be a concentration
on such apparently individual factors as inheritance
and fate, rather than my preferred model of the
socio-political determinants of health. 166

One perhaps counter-intuitive
way is to embrace the findings of quantitative genet-
ics and realize they actually enhance the importance
of the insights that epidemiology brings. First, most
traits have a non-trivial genetic component. This is
good news: it means that genetic variants can be uti-
lized as instrumental variables for the near-alchemic
act of turning observational into experimental data,
and allow the strengthening of causal inference with
respect to environmentally modifiable exposures, in
the absence of randomized trials. 162,167 Indeed, we
might even enter the age of hypothesis-free causal-
ity. 163
- 163: Davey Smith G. "Random allocation in observational data: how small but robust effects could facilitate hypothesis-free causal inference". Epidemiology 2011;22: 460–63.
Abstract: Epidemiologists aim to identify modifiable causes of disease, this often being a prerequisite for the application of epidemiological findings in public health programmes, health service planning and clinical medicine. Despite successes in identifying causes, it is often claimed that there are missing additional causes for even reasonably well-understood conditions such as lung cancer and coronary heart disease. Several line...

gwern branwen

Shared publicly  - 
"The Economics of Reproducibility in Preclinical Research", Freedman et al 2015 (commentary: ); excerpts:

"Low reproducibility rates within life science research undermine cumulative knowledge production and contribute to both delays and costs of therapeutic drug development. An analysis of past studies indicates that the cumulative (total) prevalence of irreproducible preclinical research exceeds 50%, resulting in approximately US$28,000,000,000 (US$28B)/year spent on preclinical research that is not reproducible—in the United States alone. We outline a framework for solutions and a plan for long-term improvements in reproducibility rates that will help to accelerate the discovery of life-saving therapies and cures.

Clearly, perfect reproducibility across all preclinical research is neither possible nor desirable. Attempting to achieve total reproducibility would dramatically increase the cost of such studies and radically curb their volume. Our assumption that current irreproducibility rates exceed a theoretically (and perhaps indeterminable) optimal level is based on the tremendous gap between the conventional 5% false positive rate (i.e., statistical significance level of 0.05) and the estimates reported below and elsewhere (see S1 Text and Fig 1).

An illustrative example is the use and misuse of cancer cell lines. The history of cell lines used in biomedical research is riddled with misidentification and cross-contamination events [29], which have been estimated to range from 15% to 36% [30]. Yet despite the availability of the short tandem repeat (STR) analysis as an accepted standard to authenticate cell lines, and its relatively low cost (approximately US$200 per assay), only one-third of labs typically test their cell lines for identity [31]. For an NIH-funded academic researcher receiving an average US$450,000, four-year grant, purchasing cell lines from a reputable vendor (or validating their own stock) and then authenticating annually will only cost about US$1,000 or 0.2% of the award. A search of NIH Reporter for projects using “cell line” or “cell culture” suggests that NIH currently funds about US$3.7B annually on research using cell lines. Given that a quarter of these research projects apparently use misidentified or contaminated cell lines, reducing this to even 10% through a broader application of the STR standard—a very realistic goal—would ensure a more effective use of nearly three-quarters of a billion dollars and ultimately speed the progress of research and the development of new treatments for disease.

Four Categories of Irreproducibility
Study Design
    Lack of proper study methodology has been identified as an ongoing challenge to research fidelity for more than 50 years. Improper study design encompasses both studies that are underpowered to yield statistically significant results and those whose design lacks sufficiently rigorous statistical analysis [2]. For example, an analysis of 271 animal studies by Kilkenny et al. concluded that 13% used inappropriate statistical methods and almost 60% had problems with both the statistical analysis and the transparency of reporting [3]. In addition, the lack of rigor of the experimental design, and in particular the absence of properly blinded studies in confirmatory research, have been cited as key characteristics and contributors to studies that ultimately are not reproducible [4].
    For our analysis, we determined an estimate for this category by evaluating research irreproducibility that can be attributed to the routine use of widely accepted statistical testing procedures. Jager and Leek’s [5] analysis of p-values from more than 75,000 papers from the medical literature estimated the rate of false discoveries among reported results at 14%. Valen Johnson’s [6] mathematical calculation of the false results, based on the assumption that 50% of tested null hypotheses are actually true, calculates a false result between 17% and 25% of the time. Using the high and low figures from these studies, we estimate that between 14% and 25% of reported results may be false positives, and used a midpoint of 19.5% for this category.

Biological Reagents and Reference Materials
    Reference material flaws are associated with the unreliable identification of source materials used in the preclinical study, particularly contaminated, mishandled, or mislabeled biological reagents like antibodies [7] or cell lines [8]. A poster child for misidentified cell lines is the adriamycin-resistant breast adenocarcinoma cell lines, MCF-7/AdrR, used in over 300 studies before they were found to be derived from human ovarian carcinoma cells (now re-designated NCI/ADR-RES) [9]. For perspective, based on the cost of an average NIH-funded breast cancer grant (US$370k) in 2013, as much as US$100M of research funding may have been spent using this misidentified cell line alone. Similarly, a recent assessment of mycoplasma contamination found in the NCBI Sequence Read Archive (SRA) conservatively found that 11% of projects were contaminated [10], and estimated that hundreds of millions of dollars in NIH-funded research has been potentially affected by widespread mycoplasma contamination of continuous cell lines.
    Hughes et al. examined the prevalence of contaminated cancer cell line usage over a period of more than 20 years, and reported a wide range of contamination and mischaracterization, with only a small improvement in rates over time [11]. Excluding studies within the Hughes analysis that were outside of the US or had a sample size of <200 cell lines, the reported misidentification or contamination ranged from a low of 14.9% [12] to a high of 36% [13] (midpoint 25.5%), which serves as our estimated error rate for this category.

Laboratory Protocols
    Laboratory protocol issues encompass irreproducibility that arises during the preparation and execution of the experiment. Although no study estimating the prevalence of the error rate within preclinical laboratory protocol was identified, analysis within the clinical environment has shown laboratory error rates in the range of 0.3% to 0.5% [14,15]. The error rate within the preclinical environment—where there is less use of controls, blinding, and broadly accepted standards and best practices—was assumed to be significantly higher than in clinical trials [16]. To estimate the extent to which this assumption holds, we evaluated one study that compared the error rates of other factors in the preclinical to clinical environment, where estimates of the error rates in the preclinical laboratory were found to be as high as 19 times the clinical rate [17]. Applying this multiplier to the clinical laboratory error rates generates a preclinical estimate of 5.7% to 9.5%, with a midpoint of 7.6%.
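
The laboratory-protocol estimate is simple enough to check by hand. A minimal sketch, using only the figures quoted in the paragraph above (variable names are illustrative):

```python
# Clinical laboratory error rates quoted in the excerpt, in percent.
clinical_low, clinical_high = 0.3, 0.5

# Preclinical error rates were reported as up to 19 times the clinical rate.
multiplier = 19

preclinical_low = clinical_low * multiplier    # about 5.7
preclinical_high = clinical_high * multiplier  # about 9.5
preclinical_mid = (preclinical_low + preclinical_high) / 2  # about 7.6

print(f"{preclinical_low:.1f}% to {preclinical_high:.1f}%, "
      f"midpoint {preclinical_mid:.1f}%")
```

Note that the whole category rests on one multiplier taken from a single comparison study, so the range inherits that study's uncertainty.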

Data Analysis and Reporting
    The fourth category of contributing factors to preclinical irreproducibility is the analysis and reporting of data. Data sharing and reporting has been recognized by the NIH [18] as an essential part of the translational research process, and together with the rise of post-publication review [19], it is a key factor in facilitating the identification of irreproducible data or studies [20]. The same animal study referenced earlier (Kilkenny et al.) also looked at design issues and concluded that only 59% of the papers studied included a satisfactory level of detail on the methodology, sample size, and key characteristics of the animals used in the study [3]. And while less common, errors in analysis can have devastating impacts, as two researchers reported in 2007 when a simple calculation error (change in sign) undermined several years of work on multidrug resistance efflux transporters and led to the retraction of a widely cited paper [21].
    A number of studies have investigated the issue of inadequate reporting of research, with estimates of improper reporting reaching as high as 87% [22], although the impact of such data analysis and reporting errors on irreproducible research is inconclusive. For our analysis, we used the results of a study of 234 clinical trials, which found that 18% of data reported was deemed to be “inadequate” [23], which provides a conservative, lower-end estimate for the impact of this category on overall irreproducibility.

Cumulative Irreproducibility Rate
    In order to calculate the total rate of irreproducibility in preclinical research, the estimated prevalence values for all four categories were used as outlined in S1 and S2 Datasets. Given the limited number of studies in which we were able to identify reporting incidence rates for irreproducibility, and a lack of consistency as to how reproducibility/irreproducibility is defined, a rigorous meta-analysis or systematic review was not feasible. However, using both the range and midpoint estimates for each category of error, the combined impact was calculated using a highly conservative probability bounds approach [24] with the cumulative irreproducibility rate estimated to exceed 50% (see Fig. 2 and S1 Dataset).

Comparison to Prior Estimates of Irreproducibility
    Several prominent studies have examined the prevalence of irreproducibility within the confines of the research at a specific company or academic institution. One widely discussed effort was Amgen scientists’ ability to replicate only 6 (11%) of 53 key oncological studies [25]. A similarly low reproducibility rate was seen at Bayer, whose study concluded that a mere 20 to 25% of published data over a 4-year period could be corroborated internally [26]. Likewise, researchers at the Oregon Health & Science University found that 54% of 238 biomedical papers published in 84 journals failed to identify all of the resources necessary to reproduce results [27]. And finally, a review of 80 studies published in the journal Evidence-Based Medicine found that fewer than half (49%) included sufficient details of results to accurately attempt replication [28]. Notably, authors of the latter advocate for tracking replication as a means of post-publication evaluation to both assist researchers to identify reliable findings and to explicitly recognize and incentivize the publication of reproducible data and results. Our calculated estimate (53.3%) of the cumulative prevalence of irreproducible preclinical research falls well within the boundaries of the results published in these previous studies (Fig. 1).

However, it is reasonable to state that cumulative errors in the following broad categories—as well as underlying biases that could contribute to each problem area [14] or even result in entire studies never being published or reported [15]—are the primary causes of irreproducibility [16]: (1) study design, (2) biological reagents and reference materials, (3) laboratory protocols, and (4) data analysis and reporting. Fig 2, S1 Text, S1 and S2 Datasets show the results of our analysis, which estimates the prevalence (low, high, and midpoint estimates) of errors in each category and builds up to a cumulative (total) irreproducibility rate that exceeds 50%. Using a highly conservative probability bounds approach [17], we estimate that the cumulative rate of preclinical irreproducibility lies between 18% (the maximum of the low estimates, assuming maximum overlap between categories), and 88.5% (the sum of the high estimates, assuming minimal overlap). A natural point estimate of the cumulative irreproducibility rate is the midpoint of the upper and lower bounds, or 53.3%.
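
The bounds described above can be reproduced from the per-category figures quoted in the excerpt. This is a rough sketch rather than the authors' S1 Dataset. Treating the single 18% figure for data analysis and reporting as both its low and high estimate is an assumption, although it is the one that recovers the stated 18% and 88.5%.

```python
# (low, high) estimates in percent, taken from the excerpt.
# The last category has a single quoted figure, used here for both ends.
categories = {
    "study design": (14.0, 25.0),
    "biological reagents and reference materials": (14.9, 36.0),
    "laboratory protocols": (5.7, 9.5),
    "data analysis and reporting": (18.0, 18.0),
}

# Maximum overlap: every flawed study is flawed in the worst category,
# so the union is just the largest of the low estimates.
lower_bound = max(low for low, _ in categories.values())

# Minimal overlap: no study is flawed in two categories at once,
# so the union is the sum of the high estimates (capped at 100%).
upper_bound = min(100.0, sum(high for _, high in categories.values()))

midpoint = (lower_bound + upper_bound) / 2

print(lower_bound, upper_bound, midpoint)
```

The midpoint comes out at 53.25, which the excerpt reports as 53.3%. The width of the interval (18% to 88.5%) shows how much the headline figure depends on the choice of the midpoint as a point estimate.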

Extrapolating from 2012 data, an estimated US$114.8B in the United States [18] is spent annually on life sciences research, with the pharmaceutical industry being the largest funder at 61.8%, followed by the federal government (31.5%), nonprofits (3.8%), and academia (3.0%) [20]. Of this amount, an estimated US$56.4B (49%) is spent on preclinical research, with government sources providing the majority of funding (roughly US$38B) [19]. Using a conservative cumulative irreproducibility rate of 50% means that approximately US$28B/year is spent on research that cannot be replicated (see Fig 2 and S2 Dataset). Of course, uncertainty remains about the precise magnitude of the direct economic costs—the conservative probability bounds approach reported above suggest that these costs could plausibly be much smaller or much larger than US$28B...Irreproducibility also has downstream impacts in the drug development pipeline. Academic research studies with potential clinical applications are typically replicated within the pharmaceutical industry before clinical studies are begun, with each study replication requiring between 3 and 24 months and between US$500,000 to US$2,000,000 investment [23]. While industry will continue to replicate external studies for their own drug discovery process, a substantially improved preclinical reproducibility rate would derisk or result in an increased hit rate on such investments, both increasing the productivity of life science research and improving the speed and efficiency of the therapeutic drug development processes. The annual value added to the return on investment from taxpayer dollars would be in the billions in the US alone."

Category 4 (data analysis and reporting) is a bit weak, but if anything, the authors are too generous on the other categories, and their US$28B figure doesn't include any downstream effects like the funding used in attempts to replicate hot research. 50% is a pretty reasonable estimate. (What is deeply unreasonable is trying to argue it's some much lower number like 10%...)
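
The arithmetic in the excerpt is easy to check. Here is a minimal Python sketch of the bounds calculation as the excerpt describes it (lower bound = maximum of the per-category low estimates, upper bound = sum of the per-category high estimates, point estimate = midpoint). The function and variable names are my own, the cap at 100% is my addition, and since the per-category figures aren't quoted here, only the reported bounds (18% and 88.5%) and the US$56.4B preclinical spend are plugged in:

```python
from typing import Iterable, Tuple


def probability_bounds(categories: Iterable[Tuple[float, float]]) -> Tuple[float, float]:
    """Bounds on a cumulative error rate from per-category (low, high) rates.

    Lower bound: max of the lows (categories overlap as much as possible).
    Upper bound: sum of the highs (categories overlap as little as possible),
    capped at 1.0 -- the cap is an assumption added here, not from the paper.
    """
    cats = list(categories)
    lower = max(low for low, _ in cats)
    upper = min(1.0, sum(high for _, high in cats))
    return lower, upper


def midpoint(lower: float, upper: float) -> float:
    """Point estimate taken as the midpoint of the two bounds."""
    return (lower + upper) / 2


# Figures quoted in the excerpt above.
REPORTED_LOWER = 0.18         # maximum of the low estimates
REPORTED_UPPER = 0.885        # sum of the high estimates
PRECLINICAL_SPEND_BN = 56.4   # US$ billions per year on preclinical research


def irreproducible_spend_bn(rate: float) -> float:
    """Annual preclinical spend (US$ billions) at a given irreproducibility rate."""
    return PRECLINICAL_SPEND_BN * rate


point_estimate = midpoint(REPORTED_LOWER, REPORTED_UPPER)

if __name__ == "__main__":
    print(f"midpoint estimate: {point_estimate:.4f}")
    for rate in (REPORTED_LOWER, 0.50, REPORTED_UPPER):
        print(f"rate {rate:.1%}: ~US${irreproducible_spend_bn(rate):.1f}B/year")
```

The midpoint of 18% and 88.5% is 53.25%, which the paper rounds to 53.3%, and the rounder 50% rate applied to US$56.4B gives the ~US$28B headline figure. Running the same function over the two bounds shows how wide the "much smaller or much larger" range in the excerpt really is.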

gwern branwen

Shared publicly  - 
And people keep asking how one could possibly run an RCT on this or that...

gwern branwen

Shared publicly  - 
What's so great about this writeup is that it's the gift that keeps on giving as you keep reading:

- the weakness of arguments from authority
- how replication and not peer review uncovers fraud & mistakes
- the importance of informative priors in evaluating results
- basic numeracy ('how did he afford such an expensive survey?')
- the value of open data
- real data is messy and noisy, undermining clean stories and post hoc narratives
“What’s the book? There’s no book for this. What do we do?’”
I've just thoroughly enjoyed reading Broockman's paper over breakfast on how misleading the lib-cons 1D ideological scale can be (the average is the meanest information measure). Iiuc, it provides evidence that partisans and more knowledgeable voters are consistent across issues, whereas the majority support a mix of liberal and conservative policies, an arbitrary selection of which can be extreme.

I found it a convincing refutation of what is apparently a current conclusion in political statistics, drawn from ideological scale studies: that legislators are out of step with the electorate and that the electorate is less extreme than its representatives.

The paper is referenced in the link as one of Broockman's current research interests.
Basic Information