Carlos Santos
125 followers|25,506 views

Stream

Carlos Santos

Shared publicly  - 
 
Mark Palko quotes Justin Fox: On Monday, software engineer Rob Rhinehart published an account of his new life without alternating electrical current — which he has undertaken because generating that current “produces 32 percent of all greenhouse gases, more than any other economic sector.” Connection to the power grid isn’t all Rhinehart has given up. …

Carlos Santos

Shared publicly  - 
 
 
Nice piece about the Makerspace that Ben Shapiro helped create to draw Haitian girls into STEM.  Ben’s quote is great — it’s about changing attitudes and identities, not just teaching tech. The tools, computers and gadgets in Nedlam’s Workshop might imply…

Carlos Santos

Shared publicly  - 
 
 
A few rough working notes for a discussion group I convened to discuss Engelbart's famous 1962 paper on "Augmenting Human Intellect".
Engelbart: "Augmenting Human Intellect". “By "augmenting human intellect" we mean increasing the capability of a man to approach a complex problem situation, to gain comprehension to suit his particular needs, and to derive solutions to problems… whether the problem situation exists for twenty ...

Carlos Santos

Shared publicly  - 
 
 
Frank Wilczek just posted a draft book proposal for a "Princeton Companion to Physics" to Twitter.  I hope it eventuates!

Click through to see the proposal.
“Sneak preview: draft proposal / outline for "Princeton Companion to Physics" http://t.co/iB0CNSLpv5”

Carlos Santos

Shared publicly  - 
 
 
Introducing Distributed Code Jam

After 12 years running, Code Jam (g.co/codejam) is making an exciting change to the competition this year with the addition of the new Distributed Code Jam (DCJ) track, which requires coding for a distributed environment.

We sat down with the main creator behind Distributed Code Jam, Google software engineer Onufry Wojtaszczyk, to learn more about him and the new track.

Research at Google: How long have you been involved with Code Jam?
Onufry Wojtaszczyk: Longer than I’ve been at Google. It was my favorite programming contest before I joined the company; I really liked the problem quality there, and I still do. Candy Splitting (http://goo.gl/Kj8XV6), in the Qualification Round of Code Jam 2011, was the first problem I contributed.

R@G: What has your personal experience been with competitive coding competitions?
OW: I started pretty late; I was more focused on math than on computer science during university. I started with Potyczki Algorytmiczne, a once-a-year contest in Poland, and moved on to the Topcoder Open (http://goo.gl/NC1m4u). I’m less active as a competitor now, and have been preparing problems instead. Google Code Jam and the ACM ICPC World Finals are the two highest-profile competitions that have featured my problems. And now, of course, the Distributed Code Jam.

R@G: How is the DCJ track different from the regular Code Jam track?
OW: I’d like to begin by saying how similar it is. One thing that I was really concerned about was keeping it an algorithmic competition; I didn’t want people to dig into the details of machine architecture, or setting up Unix sockets, or whatever.

In terms of what’s different, well, the biggest difference is it’s distributed. That means that when you submit your solution, it will run on a hundred machines instead of one. Instead of downloading the input and running their code individually, in DCJ contestants will upload their code, which will be compiled and run on multiple machines for them.

R@G: How did you get the idea for DCJ?
OW: The one thing that programming contests didn’t prepare me for was the distributed computation that is common at Google. Many of the problems Google solves - delivering the best Search results, directions in Maps, making sure our data centers operate efficiently - are at a scale that requires spreading the work across multiple machines. So I started playing around with ideas for what a competition that included distributed computing could look like, and what the format [would] be. Then I got a few other Googlers excited about the idea, and that’s how Distributed Code Jam was born.

R@G: Can you describe an example of a DCJ problem?
OW: One example would be the “maximum sum substring” problem, where you have a sequence of integers, some positive, some negative, and you have to find a substring of this sequence that has the highest sum available. On a single machine, the way you solve this is by going linearly along the sequence, remembering at each point the largest sum of a substring ending at this point. 
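The single-machine scan described here is commonly known as Kadane's algorithm. A minimal Python sketch follows; the function name is illustrative, and it assumes the input is non-empty and that the chosen substring must contain at least one element (the interview does not say whether an empty substring is allowed).

```python
def max_substring_sum(values):
    """Largest sum of a non-empty contiguous run of `values`."""
    # best_ending_here: largest sum of a substring ending at the current point
    best_ending_here = best = values[0]
    for v in values[1:]:
        # Either extend the best substring ending just before v, or restart at v.
        best_ending_here = max(v, best_ending_here + v)
        best = max(best, best_ending_here)
    return best


if __name__ == "__main__":
    print(max_substring_sum([2, -5, 3, 4, -1, 2, -8, 1]))
```

The scan does one pass and keeps only two numbers, which is why it is linear in the length of the sequence.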

Now what if you have a string of 10¹⁰ integers, and you want your code to run within a 1 second time limit? You’ll need to make use of multiple machines! For example, you can split the input between the machines - each taking a substring of length 10⁸ to process. If the best substring is contained in one of the parts, the problem is easy. The trick, however, is to notice that if it’s not contained in one of the parts, then it will consist of a suffix of one part, then some number of parts (possibly zero) taken whole, and then a prefix of another part.
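One way to act on that observation is to have each machine reduce its part to four numbers (total, best prefix, best suffix, best internal substring) and then combine those summaries from left to right. The sketch below simulates this on one machine in Python. It is not the DCJ API: the names `Summary`, `summarize`, `merge`, and `distributed_max_substring_sum` are hypothetical, the "machines" are just list slices, and it assumes a non-empty input and a non-empty answer.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass
class Summary:
    total: int   # sum of the whole part
    prefix: int  # best sum of a non-empty prefix of the part
    suffix: int  # best sum of a non-empty suffix of the part
    best: int    # best sum of a non-empty substring inside the part


def summarize(part):
    """What one machine would compute for its own part of the input."""
    total = 0
    prefix = best = ending_here = float("-inf")
    for v in part:
        total += v
        prefix = max(prefix, total)
        ending_here = max(v, ending_here + v)
        best = max(best, ending_here)
    suffix, running = float("-inf"), 0
    for v in reversed(part):
        running += v
        suffix = max(suffix, running)
    return Summary(total, prefix, suffix, best)


def merge(left, right):
    """Combine the summaries of two adjacent parts into one summary."""
    return Summary(
        total=left.total + right.total,
        prefix=max(left.prefix, left.total + right.prefix),
        suffix=max(right.suffix, right.total + left.suffix),
        # A substring crossing the boundary is a suffix of left plus a prefix of right.
        best=max(left.best, right.best, left.suffix + right.prefix),
    )


def distributed_max_substring_sum(values, machines):
    size = -(-len(values) // machines)  # ceiling division
    parts = [values[i:i + size] for i in range(0, len(values), size)]
    summaries = [summarize(p) for p in parts]  # in a real run, one part per machine
    return reduce(merge, summaries).best
```

Because `merge` carries the running suffix across whole parts, the case of "some number of parts taken whole" in the middle is handled without special treatment.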

R@G: Would you say DCJ is harder than Code Jam?
OW: Distributed computing is a new field for programming contests. In the regular programming competitions, there’s a number of tricks of the trade that people have learned, and they take for granted; and the body of knowledge can be a bit intimidating for a newcomer. In the distributed competitions, we are discovering a lot of ideas as we go, so I expect the distributed part might actually be simpler in the sense that there’s no established “everybody knows this” body of knowledge that you have to master.

R@G: Is DCJ something you would recommend for a novice programmer or a more experienced programmer?
OW: I’d say that you have to have some skill as a programmer to participate; in general you will need the ability to solve a single-machine version of any problem to solve the multi-machine version. This is one of the reasons we won’t begin Distributed Code Jam with a “Qualification Round”, instead only beginning with people who already passed round 3 in the regular Google Code Jam. I hope the problems should be solvable using good thinking and common sense, and not some advanced programming knowledge.

To register and learn more about Code Jam and Distributed Code Jam, visit g.co/codejam. Good luck!

Carlos Santos

Shared publicly  - 
 
I'm not sure it will be all that useful, but it's worth knowing.
 
Did you know you could get bibtex directly from a doi? It's called DOI content negotiation and it can do a lot of other really cool tricks.

I don't know how to get BibTeX from the browser, but this works on the command line:

curl -LH "Accept: application/x-bibtex" http://dx.doi.org/10.1007/s11083-012-9252-6

Here is the magic output:

@article{Dorais_2012,
doi = {10.1007/s11083-012-9252-6},
url = {http://dx.doi.org/10.1007/s11083-012-9252-6},
year = 2012,
month = {mar},
publisher = {Springer Science $\mathplus$ Business Media},
volume = {30},
number = {2},
pages = {415--426},
author = {Fran{\c{c}}ois Gilbert Dorais and Steven Gubkin and Daniel McDonald and Manuel Rivera},
title = {Automorphism Groups of Countably Categorical Linear Orders are Extremely Amenable},
journal = {Order}
}
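For use from a script, the same content-negotiation request can be sketched with Python's standard library. The function names `bibtex_request` and `fetch_bibtex` are illustrative, and the resolver URL is simply the one used in the curl command above; `fetch_bibtex` needs network access and its output depends on what the resolver returns.

```python
import urllib.request


def bibtex_request(doi, resolver="http://dx.doi.org/"):
    """Build a request that asks the DOI resolver for BibTeX instead of HTML."""
    return urllib.request.Request(
        resolver + doi,
        headers={"Accept": "application/x-bibtex"},
    )


def fetch_bibtex(doi, timeout=10):
    """Send the request (redirects are followed automatically) and return the text."""
    with urllib.request.urlopen(bibtex_request(doi), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    print(fetch_bibtex("10.1007/s11083-012-9252-6"))
```

The only thing that differs from an ordinary page fetch is the `Accept` header, which is what the `-H` flag sets in the curl version.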

http://www.crosscite.org/cn/
DOIs provide a persistent link to content. They identify many types of work, from journal articles to research data sets. Typically, someone interacting with DOIs will be a researcher, who will resolve DOIs found in scholarly references to content using a DOI resolver. Such researchers may not ...

Carlos Santos

Shared publicly  - 
 
 
A heartwarming story of how Prof. Mason's life has been impacted by teaching a MOOC.
Prof. Peggy Mason Shares Heartwarming Experience as a MOOC Instructor

Carlos Santos

Shared publicly  - 
 
About two months ago I printed a paper by Cynthia Dwork on this, but I haven't had time to study it. The math is a bit complicated (or I'm getting old), but the result is worth understanding.
 
Preserving validity in adaptive data analysis

With all data analysis, there is a danger that findings observed in a particular sample do not generalize to the underlying population from which the data were drawn. Adaptive analysis of a data set - where the analyst is informed by data exploration, as well as the results of previous analyses of the data set - can lead to an increased risk of spurious discoveries that are neither prevented nor detected by standard approaches. 

In order to increase the reliability of data driven insights, researchers from Google, Microsoft, IBM, the University of Toronto, Samsung and the University of Pennsylvania have introduced the reusable holdout mechanism, a new methodology for navigating the challenges of adaptivity which allows the analyst to safely validate the results of many adaptively chosen analyses without the need to collect costly fresh data each time. 

Learn more on the Google Research blog, linked below.

Carlos Santos

Shared publicly  - 
 
 
"Unless humans slow the destruction of Earth's declining supply of plant life, civilization like it is now may become completely unsustainable, according to a new article".

(Posted by +rasha kamel​)

Carlos Santos

Shared publicly  - 
 
 
"Rapid advances in sequencing technology are expanding our understanding of biodiversity and evolution in complex plant groups, but access to samples remains a problem. Herbarium material provides a readily accessible solution, but to date has had limited use. Researchers have developed a genomic data set for Solidago using only herbarium material. Called 'next-generation sampling,' this innovative sampling strategy could transform how scientists obtain data sets for species-rich plant groups".

(Posted by +rasha kamel​)

Carlos Santos

Shared publicly  - 
 
 
A nice list with interesting history — I didn’t know most of these (Thanks to Guy Haas who sent it to me): Although “Amazing Grace” Hopper is sometimes mentioned, Lovelace often serves as a token when talking about women in technology. However, her…
Basic Information
Gender
Male