Introducing Distributed Code JamAfter 12 years running Code Jam (g.co/codejam) is making an exciting change to the competition this year with their addition of the new Distributed Code Jam track (DCJ) that requires coding for a distributed environment. We sat down with the main creator behind Distributed Code Jam, Google software engineer Onufry Wojtaszczyk, to learn more about him and the new track.Research at Google:
How long have you been involved with Code Jam?Onufry Wojtaszczyk:
Longer than I’ve been at Google, it’s been my favorite programming contest before I joined the company; I really liked the problem quality there, and I still do. Candy Splitting (http://goo.gl/Kj8XV6
) in the Qualification Round of 2011 Code Jam was the first problem I contributed.R@G:
What has your personal experience been with competitive coding competitions?OW:
I started pretty late; I was more focused on math than on computer science during university. I started with Potyczki Algorytmiczne, a once-a-year contest in Poland, and moved on to the Topcoder Open (http://goo.gl/NC1m4u
). I’m less active as a competitor now, and have been preparing problems instead. Google Code Jam and ACM ICPC world finals are the two highest-profile competitions that featured my problems. And now, of course, the Distributed Code Jam.R@G:
How is the DCJ track different than the regular Code Jam track?OW:
I’d like to begin by saying how similar it is. One thing that I was really concerned about was keeping it an algorithmic competition; I didn’t want people to dig into the details of machine architecture, or setting up unix sockets, or whatever.
In terms of what’s different, well, the biggest difference is it’s distributed. That means that when you submit your solution, it will run on a hundred machines,instead of one. Instead of the contestant downloading the input and running their code individually, in DCJ they will upload their code, which will compile and run on multiple machines for them.R@G:
How did you get the idea for DCJ?OW:
The one thing that programming contests didn’t prepare me for were the distributed computations that are common at Google. Many of the problems Google solves - delivering the best Search results, directions in Maps, making sure our data centers operate efficiently - are at a scale that requires spreading the work across multiple machines. So I started playing around with ideas for what a competition that included distributed computing could look like, and what the format [would] be. Then I got a few other Googlers excited about the idea, and that’s how Distributed Code Jam was born.R@G:
Can you describe an example of a DCJ problem?OW:
One example would be the “maximum sum substring” problem, where you have a sequence of integers, some positive, some negative, and you have to find a substring of this sequence that has the highest sum available. On a single machine, the way you solve this is by going linearly along the sequence, remembering at each point the largest sum of a substring ending at this point.
Now what if you have a string of 10¹⁰ integers, and you want your code to run within a 1 second time limit? You’ll need to make use of multiple machines! For example, you can split the input between the machines - each taking a substring of length 10⁸ to process. If the best substring was contained in one of the parts, the problem is easy. The trick, however, is to notice that if it’s not contained in one of the parts, then it will consist of a suffix of one part, then some number of parts (possibly zero) taken whole, and then a prefix of another part. R@G:
Would you say DCJ is harder than Code Jam?OW:
Distributed computing is a new field for programming contests. In the regular programming competitions, there’s a number of tricks of the trade that people have learned, and they take for granted; and the body of knowledge can be a bit intimidating for a newcomer. In the distributed competitions, we are discovering a lot of ideas as we go, so I expect the distributed part might actually be simpler
in the sense that there’s no established “everybody knows this” body of knowledge that you have to master.R@G:
Is DCJ something you would recommend for a novice programmer or a more experienced programmer?OW:
I’d say that you have to have some skill as a programmer to participate; in general you will need the ability to solve a single-machine version of any problem to solve the multi-machine version. This is one of the reasons we won’t begin Distributed Code Jam with a “Qualification Round”, instead only beginning with people who already passed round 3 in the regular Google Code Jam. I hope the problems should be solvable using good thinking and common sense, and not some advanced programming knowledge.To register and learn more about Code Jam and Distributed Code Jam, visit g.co/codejam. Good luck!