gwern branwen
2,435 followers|1,415,377 views


gwern branwen

Shared publicly  - 
> ...For example, weightlifting enhances brain function, reverses sarcopenia, and lowers the death rate in cancer survivors. Take this last item, lowering death rate in cancer survivors: garden-variety aerobic exercise had no effect on survival, while resistance training lowered death rates by one third... --

[paper in question: "The Effect of Resistance Exercise on All-Cause Mortality in Cancer Survivors", Hardee et al 2014]

This is a bad study, but sadly the problems are common to the field. Claiming that this study shows 'weight lifting lowered death rates and aerobic exercise did not change survival' is making at least 5 errors:

1. correlation!=causation; this is simply your usual correlation study (you know, of the sort which is always wrong in diet studies?), where you look at some health records and crank out some p-values. There should be no expectation that this will prove to be causally valid; in particular, reverse confounding is pretty obvious here and should remind people of the debate about weight and mortality. (Ah, but you say that the difference they found between aerobic and resistance shows that it's not confounding because health bias should operate equally? Well, read on...)
2. power: with only 121 total deaths (~4% of the sample), this is inadequate to detect anything but comically large correlates of health, as the estimated prediction of a third less mortality indicates
3. p-hacking/multiplicity, type S errors, exaggeration factor: take a look at that 95% confidence interval for resistance exercise (which is the only result they report in the abstract), which is an HR of 0.45-0.99. In other words, if the correlate were even the tiniest bit bigger, it would no longer have the magical 'statistical significance at p<0.05'.  There's at least 16 covariates and 3 full models tested. By the statistical significance filter, a HR of 0.67 will be a serious exaggeration (because only exaggerated estimates would - just barely - reach p=0.05 on this small dataset with only 121 deaths).
4. "The Difference Between 'Significant' and 'Not Significant' is Not Itself Statistically Significant": the difference between aerobic exercise and resistance exercise is not statistically-significant in this study. The HR in model 1 for aerobic exercise is (0.63-1.32), and for resistance exercise, (0.46-0.99). That is, the confidence intervals overlap. (Specifically, comparing the proportion of aerobic exercisers who died with the resistance exercisers who died, I get `prop.test(c(39,75), c(1251,1746))` = p=0.12; to compute a survival curve I would need more data, I think.) The study itself does not anywhere seem to directly compare aerobic with resistance but always works in a stratified setting; I don't know if they don't realize this point about the null hypotheses they're testing, or if they did do the logrank test and it came out non-significant and they quietly dropped it from the paper.
5. the fallacy of controlling for intermediate variables: in the models they fit, they include as covariates "body mass index, current smoking (yes or no), heavy drinking (yes or no), hypertension (present or not), diabetes (present or not), hypercholesterolemia (yes or no), and parental history of cancer (yes or no)." This makes no sense. Both resistance exercise and aerobic exercise will themselves influence BMI, smoking status, hypertension, diabetes, and hypercholesterolemia. What does it mean to estimate the correlation of exercise with health which excludes all impact it has on your health through BMI, blood pressure, etc? You might as well say, 'controlling for muscle percentage and body fat, we find weight lifting has no estimated benefits', or 'controlling for education, we find no benefits to IQ' or 'controlling for local infection rates, we find no mortality benefits to public vaccination'. This makes the results particularly nonsensical for the aerobic estimates if you want to interpret them as direct causal estimates - at most, the HR estimates here are an estimate of weird indirect effects ('the remaining effect of exercise after removing all effects mediated by the covariates'). Unfortunately, structural equation models and Bayesian networks are a lot harder to use and justify than just dumping a list of covariates into your survival analysis package, so expect to see a lot more controlling for intermediate variables in the future.
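The crude two-proportion check in point 4 can be reproduced outside R; a minimal sketch in Python, using scipy's chi-square test with Yates continuity correction (which is what R's `prop.test` computes for a 2×2 table):

```python
# Compare raw death proportions: 39/1,251 deaths among aerobic exercisers
# vs 75/1,746 among resistance exercisers (the counts quoted above).
from scipy.stats import chi2_contingency

table = [[39, 1251 - 39],    # [deaths, survivors] per group
         [75, 1746 - 75]]
# For a 2x2 table, chi2_contingency applies the Yates continuity
# correction by default, matching R's prop.test.
chi2, p, dof, expected = chi2_contingency(table)
print(f"p = {p:.3f}")
```

As the p ≈ 0.12 figure above indicates, the raw difference in death proportions between the two exercise groups is not statistically significant.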

Any of these is sufficient. This sort of problem is why one should put more weight on meta-analyses of RCTs - for example, "Progressive resistance strength training for improving physical function in older adults"

[above comment is still in moderation, so I'm putting it here as a copy.] #statistics #weightlifting  

gwern branwen

Shared publicly  - 
Cryonics preserves memories: "Persistence of Long-Term Memory in Vitrified and Revived C. elegans", Vita-More & Barranco 2015

I don't think I can exaggerate how important this is: cryonics has now shown psychological continuity in an organism with enough of a mind to have long-term memory and learning. Cryonics now works in principle! People are going to try to minimize this and argue that maybe human brains are special snowflakes or that the procedure doesn't scale up to human brains, but the former is the same kind of desperate pleading we saw with genetics & intelligence where nurturists fought every step of the way for a century until finally they were buried with GWASes (and still haven't admitted they were wrong about everything) and the latter is merely quibbling over details. (Important details but compared to the question of whether cryonics works at all, details.)


"Can memory be retained after cryopreservation? Our research has attempted to answer this long-standing question by using the nematode worm Caenorhabditis elegans (C. elegans), a well-known model organism for biological research that has generated revolutionary findings but has not been tested for memory retention after cryopreservation. Our study’s goal was to test C. elegans’ memory recall after vitrification and reviving. Using a method of sensory imprinting in the young C. elegans we established that learning acquired through olfactory cues shapes the animal’s behavior and the learning is retained at the adult stage after vitrification. Our research method included olfactory imprinting with the chemical benzaldehyde (C6H5CHO) for phase-sense olfactory imprinting at the L1 stage, the fast cooling SafeSpeed method for vitrification at the L2 stage, reviving, and a chemotaxis assay for testing memory retention of learning at the adult stage. Our results in testing memory retention after cryopreservation show that the mechanisms that regulate the odorant imprinting (a form of long-term memory) in C. elegans have not been modified by the process of vitrification or by slow freezing.

Our study addresses the specific interest that long or short-term memories of cryopreserved and revived animals have not been tested. The organism of choice to explore this question is the C. elegans because it is the only organism in which both the cryopreservation and revival have been demonstrated, and there is a well-defined assay of learning.

Attached to this project and as an extra goal, we also evaluated the persistence of memory using the traditional protocol of cryopreservation of C. elegans by slow freezing. 2

One method for training and testing learning of the worms is to expose them to a specific chemical compound over a restricted period of time with the presence or absence of food to create a pattern of behavior. When encountering the chemical compound at later times, the worms’ level of response (memory retention) is tested and evaluated by a chemotaxis migration assay. 9,12 In their findings of associative short-term memory, Kauffman, et al. 12 show that after C. elegans are exposed to the chemoattractant butanone, the worms’ memory response starts to decrease one hour after exposure. In their findings on long-term memory, Remy and Hobert 9 show that after C. elegans are exposed to attractant chemicals such as benzaldehyde at the early larval stage, the worms’ memory response in the adult stage continues to be retained after five days, or 120 hours, for phase-sense learning, referred to as imprinting.

The control group consisted of eight sets of 100 or more worms for each study

Cryopreservation by slow freezing was performed one day after olfactory imprinting. Based on the method of Brenner 2, the worms at the L2 and L3 stages were introduced into a cryovial (cryogenic tube) with the traditional mix of cryoprotectant for slow freezing at 15% v/v glycerol in M9 buffer (3 g KH2PO4, 6 g Na2HPO4, 5 g NaCl, 1 ml 1 M MgSO4, H2O to 1 liter). The cryovials were transferred to a −80°C freezer for two weeks. After two weeks, the worms were transferred to a petri dish with a lawn of E. coli OP50.

In the lid of the square dish, three drops of H2Od (4 μl each) were placed. On the other side of the square dish, in the area with value 6, the same drops (1 μl each) of 1 M sodium azide were placed; the only difference was that on the lid of the dish we put three drops of 1% benzaldehyde (diluted 1/100 in H2Od) (Supplementary Figure 1 of Remy and Hobert 9).
We used a platinum wire to select and pick up individual worms from a petri dish with E. coli OP50, transferred the worms to a petri dish without food, and held them for 15 minutes. Then, 20 worms were transferred along the centerline of the square dish. The worms were counted every 15 minutes. At one hour, we had accumulated 80 values (each value designating the area where the worms were at that moment), and we calculated a migration report, which Remy and Hobert 9 referred to as the Migration Index (MI).
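The scoring step just described can be sketched in a few lines. The exact Migration Index formula of Remy and Hobert is not given in this excerpt, so the MI below is an assumption (mean area code over all 80 observations), purely for illustration:

```python
# Hypothetical sketch of the migration scoring described above.
# Assumption: MI = mean of the recorded area codes (1..6), where higher
# areas are closer to the benzaldehyde drops. Not the authors' code.
def migration_index(observations):
    """observations: area codes (1..6), one per worm per 15-minute count."""
    assert all(1 <= a <= 6 for a in observations)
    return sum(observations) / len(observations)

# 20 worms counted at 4 time points = 80 values, as in the protocol above.
# These counts are invented to mirror the reported pattern.
trained   = [5] * 40 + [6] * 30 + [4] * 10   # cluster near benzaldehyde
untrained = [1] * 35 + [2] * 35 + [3] * 10   # cluster near the water side
print(migration_index(trained), migration_index(untrained))
```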

The trained worms (worms imprinted with benzaldehyde) preferred areas 5 and 6 on the square petri dish, close to the benzaldehyde drops, and the untrained worms preferred areas 1 and 2, demonstrating a native attraction to the benzaldehyde (Figure 1). The mean of the MI was higher and very similar in all the studies with trained worms. The studies with untrained worms were also similar (Figure 1). The highest value of the MI was 4.23 and was obtained with trained and unvitrified worms and the lowest value was 1.34 in the untrained and not vitrified worms (Table 2). In general, the response of the trained worms to the benzaldehyde was double that of the untrained worms, whether they were cryopreserved or not. The variance of MI did not show homogeneity (Levene = 2.920, df1=9, df2=60, P=0.006) and the comparison of the means through one-factor ANOVA showed differences between groups (F=26.061, P<0.001). The Tamhane test, for comparisons in pairs, showed differences between the two groups, trained and untrained worms, and no differences between the studies inside of each group (Supplementary Table 1). There were no differences between trained and vitrified worms and trained and not vitrified worms (Tamhane, i-j=0.72, P=0.305). Also, there were no differences between untrained and slow freezing and trained and slow freezing (Tamhane, i-j=0.24, P=0.138). Both methods of cryopreservation did not show differences in the MI average of trained worms and were also similar to the migration index of trained worms that were not cryopreserved (Figure 1 and Supplementary Table 1).
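The analysis pipeline quoted above (variance-homogeneity check, then one-way ANOVA across groups) can be replayed in Python with made-up MI values; the real per-worm data are not in this excerpt, and scipy provides Levene's test and one-way ANOVA but not the Tamhane post-hoc test:

```python
# Synthetic illustration of the group comparisons described above.
# The numbers are invented to mirror the reported pattern: trained
# groups near MI ~ 4, untrained groups near MI ~ 1.5.
import random
from scipy.stats import levene, f_oneway

random.seed(0)
groups = {
    "trained_unvitrified":   [random.gauss(4.2, 0.4) for _ in range(8)],
    "trained_vitrified":     [random.gauss(3.9, 0.6) for _ in range(8)],
    "untrained_unvitrified": [random.gauss(1.3, 0.2) for _ in range(8)],
    "untrained_vitrified":   [random.gauss(1.6, 0.3) for _ in range(8)],
}
# Levene's test for homogeneity of variances, then one-way ANOVA.
_, p_levene = levene(*groups.values())
_, p_anova = f_oneway(*groups.values())
print(f"Levene p = {p_levene:.3f}, ANOVA p = {p_anova:.2g}")
```

With group means this far apart, the ANOVA unsurprisingly rejects equality of means; a pairwise post-hoc test (Tamhane in the paper) would then localize which groups differ.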

We demonstrated that cryoprotectants used in both the slow freezing and vitrification processes do not affect, alter, or change the mechanism that regulates the olfactory imprinting and long-term memory (Figure 1). Secondly, we determined that the cryopreservation processes of slow freezing and vitrification do not affect, alter, or change this mechanism (Figure 1). We also demonstrated that the results of the MI average obtained for the trained and untrained worms are similar to the results of the original authors of the odorant imprinting protocol. 9

The study shows the first results related to the persistence of memory after vitrification. Prior to this study, no other study or methodology had existed to carry out this project or a similar project."

#cryonics #transhumanism  

gwern branwen

Shared publicly  - 
Everything is (half) heritable (and half non-shared environment): "Meta-analysis of the heritability of human traits based on fifty years of twin studies", Polderman et al 2015

Pretty astonishing work in its ambition - summarizing heritability from all twin studies. And the GCTAs say twin studies are right, so it can be interpreted pretty literally as genetics. Excerpts:

"Despite a century of research on complex traits in humans, the relative importance and specific nature of the influences of genes and environment on human traits remain controversial. We report a meta-analysis of twin correlations and reported variance components for 17,804 traits from 2,748 publications including 14,558,903 partly dependent twin pairs, virtually all published twin studies of complex traits. Estimates of heritability cluster strongly within functional domains, and across all traits the reported heritability is 49%. For a majority (69%) of traits, the observed twin correlations are consistent with a simple and parsimonious model where twin resemblance is solely due to additive genetic variation. The data are inconsistent with substantial influences from shared environment or non-additive genetic variation. This study provides the most comprehensive analysis of the causes of individual differences in human traits thus far and will guide future gene-mapping efforts. All the results can be visualized using the MaTCH webtool.

Specifically, the partitioning of observed variability into underlying genetic and environmental sources and the relative importance of additive and non-additive genetic variation are continually debated 1–5 . Recent results from large-scale genome-wide association studies (GWAS) show that many genetic variants contribute to the variation in complex traits and that effect sizes are typically small 6,7 . However, the sum of the variance explained by the detected variants is much smaller than the reported heritability of the trait 4,6–10 . This ‘missing heritability’ has led some investigators to conclude that non-additive variation must be important 4,11 . Although the presence of gene-gene interaction has been demonstrated empirically 5,12–17 , little is known about its relative contribution to observed variation 18 . In this study, our aim is twofold. First, we analyze empirical estimates of the relative contributions of genes and environment for virtually all human traits investigated in the past 50 years. Second, we assess empirical evidence for the presence and relative importance of non-additive genetic influences on all human traits studied. We rely on classical twin studies, as the twin design has been used widely to disentangle the relative contributions of genes and environment, across a variety of human traits.

Half of these were published after 2004, with sample sizes per study in 2012 of around 1,000 twin pairs (Supplementary Table 2). Each study could report on multiple traits measured in one or several samples. These 2,748 studies reported on 17,804 traits. Twin subjects came from 39 different countries, with a large proportion of studies (34%) based on US twin samples. The continents of South America (0.5%), Africa (0.2%) and Asia (5%) were heavily underrepresented (Fig. 1a,b and Supplementary Table 3).

The majority of studies (59%) were based on the adult population (aged 18–64 years), although the sample sizes available for studies of the elderly population (aged 65 years or older) were the largest (Supplementary Table 4). Authorship network analyses showed that 61 communities of authors wrote the 2,748 published studies. The 11 largest authorship communities contained >65 authors and could be mapped back to the main international twin registries, such as the Vietnam Era Twin Registry, the Finnish Twin Cohort and the Swedish Twin Registry (Supplementary Fig. 1).
The investigated traits fell into 28 general trait domains. The distribution of the traits evaluated in twin studies was highly skewed, with 51% of studies focusing on traits classified under the psychiatric, metabolic and cognitive domains, whereas traits classified under the developmental, connective tissue and infection domains together accounted for less than 1% of all investigated traits (Fig. 1c and Supplementary Tables 5–7). The ten most investigated traits were temperament and personality functions, weight maintenance functions, general metabolic functions, depressive episode, higher-level cognitive functions, conduct disorders, mental and behavioral disorders due to use of alcohol, anxiety disorders, height and mental and behavioral disorders due to use of tobacco. Collectively, these traits accounted for 59% of all investigated traits.

We did not find evidence of systematic publication bias as a function of sample size (for example, where studies based on relatively small samples were only published when larger effects were reported) (Fig. 1d, Supplementary Figs. 2–6 and Supplementary Tables 8–11). We calculated the weighted averages of correlations for monozygotic (rMZ) and dizygotic (rDZ) twins and of the reported estimates of the relative contributions of genetic and environmental influences to the investigated traits using a random-effects meta-analytic model to allow for heterogeneity across different studies (Supplementary Tables 12–15). The meta-analyses of all traits yielded an average rMZ of 0.636 (s.e.m. = 0.002) and an average rDZ of 0.339 (s.e.m. = 0.003). The reported heritability (h^2) across all traits was 0.488 (s.e.m. = 0.004), and the reported estimate of shared environmental effects (c^2) was 0.174 (s.e.m. = 0.004) (Fig. 2a,b, Table 1 and Supplementary Fig. 7).

All weighted averages of h^2 across >500 distinct traits had a mean greater than zero (Supplementary Tables 17–24). The lowest reported heritability for a specific trait was for gene expression, with an estimated h^2 = 0.055 (s.e.m. = 0.026) and an estimated c^2 of 0.736 (s.e.m. = 0.033) (but note that these trait averages are based on reported estimates of variance components derived from only 20 data points reporting on the expression levels of 20 genes; Supplementary Table 21).

For the vast majority of traits (84%), we found that monozygotic twin correlations were larger than dizygotic twin correlations. Using the weighted estimates of rMZ and rDZ across all traits, we showed that, on average, 2rDZ − rMZ = 0.042 (s.e.m. = 0.007) (Table 1), which is very close to a twofold difference in the correlation of monozygotic twins relative to dizygotic twins (Supplementary Figs. 11 and 12). The proportion of single studies in which the pattern of twin correlations was consistent with the null hypothesis that 2rDZ = rMZ was 69%. This observed pattern of twin correlations is consistent with a simple and parsimonious underlying model of the absence of environmental effects shared by twin pairs and the presence of genetic effects that are entirely due to additive genetic variation (Table 2). This remarkable fit of the data with a simple model of family resemblance is inconsistent with the hypothesis that a substantial part of variation in human traits is due to shared environmental variation or to substantial non-additive genetic variation.
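The classical-twin-design arithmetic behind these quantities is simple; a sketch applying Falconer's standard ACE formulas to the pooled correlations quoted above:

```python
# Falconer decomposition from the pooled twin correlations above.
# Under the ACE model: h^2 = 2*(rMZ - rDZ), c^2 = 2*rDZ - rMZ, e^2 = 1 - rMZ.
r_mz, r_dz = 0.636, 0.339   # pooled correlations reported in the excerpt

h2 = 2 * (r_mz - r_dz)   # additive genetic variance
c2 = 2 * r_dz - r_mz     # shared-environment variance
e2 = 1 - r_mz            # non-shared environment (plus measurement error)

print(f"h^2 = {h2:.3f}, c^2 = {c2:.3f}, e^2 = {e2:.3f}")
```

The c^2 term recovers the 0.042 figure above; note that this Falconer h^2 (0.594) is higher than the meta-analytic 0.488, which pools the variance components reported by the individual studies rather than applying the formula to pooled correlations.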

In only 3 of 28 general trait domains were most studies inconsistent with this model. These domains were activities (35%), reproduction (44%) and dermatological (45%) (Table 2 and Supplementary Table 27). Of the 59 specific traits (ICD-10 or ICF subchapter classifications) for which we had sufficient information to calculate the proportion of studies consistent with 2rDZ = rMZ, 21 traits showed a proportion less than 0.50, whereas for the remaining 38 traits the majority of individual studies were consistent with 2rDZ = rMZ (Supplementary Table 29). Of the top 20 most investigated specific traits, we found that for 12 traits the majority of individual studies were consistent with a model where variance was solely due to additive genetic variance and non-shared environmental variance, whereas the pattern of monozygotic and dizygotic twin correlations was inconsistent with this model for 8 traits, suggesting that, apart from additive genetic influences and non-shared environmental influences, either or both non-additive genetic influences and shared environmental influences are needed to explain the observed pattern of twin correlations (Table 2). These eight traits were conduct disorders, height, higher-level cognitive functions, hyperkinetic disorders, mental and behavioral disorders due to the use of alcohol, mental and behavioral disorders due to the use of tobacco, other anxiety disorders and weight maintenance functions. For all eight traits, meta-analyses on reported variance components resulted in a weighted estimate of reported shared environmental influences that was statistically different from zero (Supplementary Table 21). Comparison of weighted twin correlations for these specific traits resulted in positive estimates of 2rDZ − rMZ, except for hyperkinetic disorders, where 2rDZ − rMZ was −0.130 (s.e.m. = 0.034) on the basis of 144 individual reports and 207,589 twin pairs, which suggests the influence of non-additive genetic variation for this trait or any other source of variation that leads to a disproportionate similarity among monozygotic twin pairs."

gwern branwen

Shared publicly  - 
Everything is heritable; complex behavioral traits can be modified by small shifts in many genes:

"Cagan and colleagues examined DNA in Norway rats (Rattus norvegicus) that had been bred for 70 generations to be either tame or aggressive toward humans. Docility was associated with genetic changes in 1,880 genes in the rats. American minks (Neovison vison) bred for tameness over 15 generations had tameness-associated variants in 525 genes, including 82 that were also changed in the rats."

This rats & minks example is consistent with the foxes, cats, and rabbits I've previously linked.

gwern branwen

Shared publicly  - 
Millions of Americans get tests, drugs, and operations that won't make them better, may cause harm, and cost billions.

gwern branwen

Shared publicly  - 
"Maladaptive daydreaming"

+Darcey Riley 
Should elaborate fantasies be considered a psychiatric disorder?

gwern branwen

Shared publicly  - 
"Misperceiving Inequality", Gimpelson & Treisman 2015

"Since Aristotle, a vast literature has suggested that economic inequality has important political consequences. Higher inequality is thought to increase demand for government income redistribution in democracies and to discourage democratization and promote class conflict and revolution in dictatorships. Most such arguments crucially assume that ordinary people know how high inequality is, how it has been changing, and where they fit in the income distribution. Using a variety of large, cross-national surveys, we show that, in recent years, ordinary people have had little idea about such things. What they think they know is often wrong. Widespread ignorance and misperceptions of inequality emerge robustly, regardless of the data source, operationalization, and method of measurement. Moreover, we show that the perceived level of inequality—and not the actual level—correlates strongly with demand for redistribution and reported conflict between rich and poor. We suggest that most theories about political effects of inequality need to be either abandoned or reframed as theories about the effects of perceived inequality."

gwern branwen

Shared publicly  - 
"End-to-End Training of Deep Visuomotor Policies", Levine et al 2015 (media: ; demo video: ; talk: ; HN comments: ); excerpts:

"Policy search methods based on reinforcement learning and optimal control can allow robots to automatically learn
a wide range of tasks. However, practical applications of policy
search tend to require the policy to be supported by hand-engineered components for perception, state estimation, and low-level control. We propose a method for learning policies that map
raw, low-level observations, consisting of joint angles and camera
images, directly to the torques at the robot’s joints. The policies
are represented as deep convolutional neural networks (CNNs)
with 92,000 parameters. The high dimensionality of such policies
poses a tremendous challenge for policy search. To address
this challenge, we develop a sensorimotor guided policy search
method that can handle high-dimensional policies and partially
observed tasks. We use BADMM to decompose policy search into
an optimal control phase and supervised learning phase, allowing
CNN policies to be trained with standard supervised learning
techniques. This method can learn a number of manipulation
tasks that require close coordination between vision and control,
including inserting a block into a shape sorting cube, screwing
on a bottle cap, fitting the claw of a toy hammer under a nail
with various grasps, and placing a coat hanger on a clothes rack.

Reinforcement learning and policy search methods hold the
promise of allowing robots to acquire new behaviors through
experience. They have been applied to a range of robotic tasks,
including manipulation [2, 13] and locomotion [5, 7, 15, 39].
However, policies learned using such methods often rely on
a number of hand-engineered components for perception and
low-level control. The policy might specify a trajectory in task space, relying on hand-designed PD controllers to execute the
desired motion, and a policy for manipulating objects might
rely on an existing vision system to localize these objects [29].
The vision system in particular can be complex and prone to
errors, and its performance is typically not improved during
policy training, nor adapted to the goal of the task.
We propose a method for learning policies that directly
map raw observations, including joint angles and camera
images, to motor torques. The policies are trained end-to-end using real-world experience, optimizing both the control
and perception components on the same measure of task
performance. This allows the policy to learn goal-driven perception, which avoids the mistakes that are most costly for task
performance. Learning perception and control in a general and
flexible way requires a large, expressive model. Our policies
are represented with convolutional neural networks (CNNs),
which have 92,000 parameters and 7 layers.

To address these challenges, we extend the framework of
guided policy search to sensorimotor deep learning. Guided
policy search decomposes the policy learning problem into
two phases: a trajectory optimization phase that determines
how to solve the task in a few specific conditions, and a
supervised learning phase that trains the policy from these
successful executions with supervised learning [22]. Since the
CNN policy is trained with supervised learning, we can use the
tools developed in the deep learning community to make this
phase simple and efficient. We handle the partial observability
of visuomotor control by optimizing the trajectories with full
state information, while providing only partial observations
(consisting of images and robot configurations) to the policy.
The trajectories are optimized under unknown dynamics, using
real-world experience and minimal prior knowledge.

We evaluate our method by learning policies for inserting a
block into a shape sorting cube, screwing a cap onto a bottle,
fitting the claw of a toy hammer under a nail with various
grasps, and placing a coat hanger on a rack (see Figure 1).
Our results demonstrate clear improvements in consistency and
generalization from training visuomotor policies end-to-end,
when compared to using the poses or features produced by a
CNN trained for 3D object localization.

Unlike with visual recognition, applications of deep networks to robotic control have been comparatively limited.
Backpropagation through the dynamics and the image formation process is impractical, since they are often nondifferentiable, and such long-range backpropagation leads to
extreme numerical instability. The high dimensionality of the
network also makes reinforcement learning very difficult [3].
Pioneering early work on neural network control used small,
simple networks [10, 33], and has largely been supplanted
by methods that use carefully designed policies that can be
learned efficiently with reinforcement learning [14]. More
recent work on sensorimotor deep learning has tackled simple
task-space motion [17] and used unsupervised learning to
obtain low-dimensional state spaces from images [34], but
such methods are limited to tasks with a low-dimensional

Learning visuomotor policies end-to-end introduces two key
challenges: partial observability and the high dimensionality
of the policy. We tackle these challenges using guided policy search. In guided policy search, the policy is optimized
using supervised learning, which scales gracefully with the
dimensionality of the function approximator. The training set
for this supervised learning procedure can be constructed
from example demonstrations [20], trajectory optimization
under known dynamics [21, 22, 28], and trajectory-centric
reinforcement learning methods that operate under unknown
dynamics [19, 23], which is the approach taken in this work.
We propose a new, partially observed guided policy search
method based on the Bregman alternating directions method of
multipliers (BADMM) that makes it practical to train complex,
generalizable policies under partial observation.
The goal of our approach is also similar to visual servoing,
which performs feedback control on feature points in a camera
image [6, 27, 42]. However, our visuomotor policies are
entirely learned from real-world data, and do not require
feature points or feedback controllers to be specified by hand.
This gives our method considerable flexibility in choosing how
to use the visual signal. Furthermore, our approach does not
require any sort of camera calibration, in contrast to many
visual servoing methods (though not all – see e.g. [11, 43]).

Our visuomotor policy runs at 20 Hz on the robot, mapping
monocular RGB images and the robot configurations to joint
torques on a 7 DoF arm. The configuration includes the angles
of the joints and the pose of the end-effector (defined by 3
points), as well as their velocities, but does not include the
position of the target object or goal, which must be determined
from the image. CNNs often use pooling to discard the
locational information that is necessary to determine positions,
since it is an irrelevant distractor for tasks such as object classification [18]. Because locational information is important for
control, our policy does not use pooling. Additionally, CNNs
built for spatial tasks such as human pose estimation often
also rely on the availability of location labels in image-space,
such as hand-labeled keypoints [40]. We propose a novel CNN
architecture capable of estimating spatial information from an
image without direct supervision in image space. Our pose
estimation experiments, discussed in Section V-B, show that
this network can learn useful visual features using only 3D
positional information provided by the robot and no camera
calibration. Furthermore, by training our network with guided
policy search, it can acquire task-specific visual features that
improve policy performance.

Our network architecture is shown in Figure 2. The visual
processing layers of the network consist of three convolutional
layers, each of which learns a bank of filters that are applied
to patches centered on every pixel of its input. These filters
form a hierarchy of local image features. Each convolutional
layer is followed by a rectifying nonlinearity of the form
a_cij = max(0, z_cij) for each channel c and each pixel
coordinate (i, j). The third convolutional layer contains 32
response maps with resolution 109 × 109. These response
maps are passed through a spatial softmax function of the form
s_cij = exp(a_cij) / Σ_{i′,j′} exp(a_ci′j′). Each output channel of the softmax
is a probability distribution over the location of a feature
in the image. To convert from this distribution to a spatial
representation, the network calculates the expected image
position of each feature, yielding a 2D coordinate for each
channel. These feature points are concatenated with the robot’s
configuration and fed through two fully connected layers, each
with 40 rectified units, followed by linear connections to the
torques. The full visuomotor policy contains about 92,000
parameters, of which 86,000 are in the convolutional layers.
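The spatial softmax and expected-position readout described above can be sketched in numpy. This is an illustrative reimplementation, not the paper's code; the function name and the pixel-coordinate convention (x = column, y = row) are our own choices.

```python
import numpy as np

def spatial_softmax_points(response_maps):
    """response_maps: (C, H, W) activations -> (C, 2) feature points.

    Each channel is converted to a probability distribution over pixel
    locations via a softmax, and the expected (x, y) position under that
    distribution is returned as the channel's 2-D feature point.
    """
    C, H, W = response_maps.shape
    flat = response_maps.reshape(C, H * W)
    flat = flat - flat.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(flat) / np.exp(flat).sum(axis=1, keepdims=True)
    probs = probs.reshape(C, H, W)
    ys, xs = np.mgrid[0:H, 0:W]                    # pixel coordinate grids
    x_mean = (probs * xs).sum(axis=(1, 2))         # expected column
    y_mean = (probs * ys).sum(axis=(1, 2))         # expected row
    return np.stack([x_mean, y_mean], axis=1)

# A map with one sharp peak yields a point at (approximately) the peak.
maps = np.zeros((1, 109, 109))
maps[0, 40, 70] = 50.0   # strong activation at row 40, column 70
print(spatial_softmax_points(maps)[0])  # ≈ [70, 40]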

We evaluated our method by training policies for hanging a
coat hanger on a clothes rack, inserting a block into a shape
sorting cube, fitting the claw of a toy hammer under a nail
with various grasps, and screwing on a bottle cap. The cost
function for these tasks encourages low distance between three
points on the end-effector and corresponding target points,
low torques, and, for the bottle task, spinning the wrist. The
equations for these cost functions follow prior work [23]. The
tasks are illustrated in Figure 3. Each task involved variation
of about 10-20 cm in each direction in the position of the
target object (the rack, shape sorting cube, nail, and bottle).
In addition, the coat hanger and hammer tasks were trained
with two and three grasps, respectively. All tasks used the
same policy architecture and model parameters.
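A cost in the spirit of the one described above can be written down directly. The paper's exact cost follows prior work [23] and uses a more elaborate distance penalty; the function name, weights, and the quadratic torque term here are illustrative assumptions.

```python
import numpy as np

def task_cost(ee_points, target_points, torques, w_dist=1.0, w_torque=1e-3):
    """Illustrative task cost.

    ee_points, target_points: (3, 3) arrays of three 3-D points on the
    end-effector and their targets; torques: (7,) joint torques.
    Penalizes total point-to-target distance plus torque magnitude.
    """
    dist = np.linalg.norm(ee_points - target_points, axis=1).sum()
    effort = np.sum(torques ** 2)
    return w_dist * dist + w_torque * effort

# At the target with zero torques, the cost is zero.
tgt = np.array([[0.5, 0.0, 0.3], [0.5, 0.1, 0.3], [0.5, 0.0, 0.4]])
print(task_cost(tgt, tgt, np.zeros(7)))  # → 0.0
```

Defining the target by three end-effector points rather than a single pose avoids specifying orientation separately: matching three non-collinear points pins down both position and orientation.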

The success rates for each test are shown in Table I. We
compared to two baselines, both of which train the vision
layers in advance for pose prediction, instead of training the
entire policy end-to-end. The features baseline discards the last
layer of the pose predictor and uses the feature points, resulting
in the same architecture as our policy, while the prediction
baseline feeds the predicted pose into the control layers.
The pose prediction baseline is analogous to a standard
modular approach to policy learning, where the vision system
is first trained to localize the target, and the policy is trained
on top of it. This variant achieves poor performance, because
although the pose is accurate to about 1 cm, this is insufficient
for such precise tasks. As shown in the video, the shape sorting
cube and bottle cap insertions have tolerances of just a few
millimeters. Such accuracy is difficult to achieve even with
calibrated cameras and checkerboards. Indeed, prior work has
reported that the PR2 can maintain a camera to end effector
accuracy of about 2 cm during open loop motion [25]. This
suggests that the failure of this baseline is not atypical, and
that our visuomotor policies are learning visual features and
control strategies that improve the robot’s accuracy.
When provided with pose estimation features, the policy
has more freedom in how it uses the visual information, and
achieves somewhat higher success rates. However, full end-to-end training performs significantly better, achieving high
accuracy even on the challenging bottle task, and successfully
adapting to the variety of grasps on the hammer task. This
suggests that, although the vision layer pre-training is clearly
beneficial for reducing computation time, it is not sufficient by
itself for discovering good features for visuomotor policies.

CNN training was implemented using the Caffe [12] deep
learning library. Each visuomotor policy required 3-4 hours
of training time: 20-30 minutes for the pose prediction data
collection on the robot, 40-60 minutes for the fully observed
trajectory pre-training on the robot and offline pose pre-training (which can be done in parallel), and between 1.5
and 2.5 hours for end-to-end training with guided policy
search. The coat hanger task required two iterations of guided
policy search, the shape sorting cube and the hammer required
three, and the bottle task required four. Training time was
dominated by computation rather than robot interaction time,
and we expect significant speedup from a more efficient implementation.

gwern branwen

Shared publicly  - 
The writing and rhetoric in this is interesting. If you replaced every instance of 'rat' with 'jew' and didn't tell someone, I wonder how long before they realized the switch?
Last May, a member of Alberta’s rat patrol paid a visit to a farm on the outskirts of Sibbald, a small town near the Saskatchewan border. He found holes bored into the foundation of a grain silo...
The feces are obviously because they're hiding from the jew patrol; they are normally a fastidious people. Have you called the hotline, 310-JEWS, to report your vermin?

gwern branwen

Shared publicly  - 
Ever wonder how many people have had legal problems related to Silk Road et al? I've finished my compilation.
A listing of all known arrests and prosecutions connected to the Tor-Bitcoin drug black-markets
Hm. What I need more is just sorted counts, for the most part equations are overkill. You can see the summary counts I've added to the page right now.

gwern branwen

Shared publicly  - 
"An example of this approach is a study [7] in Australia, which measured reported pain levels, swelling and other symptoms associated with osteoarthritis and chronic pain in 132 people taking different drugs over three years. For each person, measurements were taken every 2 weeks for 12-week periods, when the patient was either off or on a particular drug. By comparing the data collected before and after the different treatments, the researchers showed that, although initially costly, the formalized N-of-1 trials resulted in more-effective prescriptions."

N-of-1 trials test treatment effectiveness within an individual patient. To assess (i) the impact of three different N-of-1 trials on both clinical and economic outcomes over 12 months and (ii) whether the use of N-of-1 trials to target patients’ ...