Andrej Karpathy
Worked at Google
Attends Stanford University
Lives in Stanford
5,547 followers|227,423 views


Andrej Karpathy

Shared publicly  - 
We've started an "AI Salon" at Stanford where a few of us grad students interested in AI topics get together and chat about recent AI trends, past reflections and future directions. We're kicking things off tomorrow talking about IBM's Watson: "Has IBM’s Watson driven forward research or was it primarily an engineering accomplishment?"

I am one of two people moderating the discussion and I will be arguing (irrespective of my own opinion :p) that Watson has contributed to driving research forward. I'm preparing and finding links/arguments to support these claims in order to seed the followup discussion, and would be curious to hear what others think about the topic as well.

For instance, it can be argued that the main paper published on Watson is too high-level (many details of the involved modules are left out) and doesn't actually offer any contributions that can be easily followed up on by the community. On the other hand, Watson's performance on Jeopardy was very impressive, so the system is still an example of how far current technology can take us if we glue it together. Certainly that must have some value. In addition, the paper has 330 citations so far, which I'm going through and trying to categorize by citation context.

Another question is whether complicated systems like Watson, with many moving parts and engineered modules, are necessary to achieve that level of performance. Are we missing a more sophisticated and yet undiscovered algorithm/approach that is much cleaner, more generalizable and less "systemy"? Or have we pretty much discovered all the necessary algorithms (lego pieces), and all that's left is for someone to build the full lego castle and address all the edge cases that come up with hacks?

A few links so far:
DeepQA research team:

Neat video explaining the technology at a high level:
Good point. And unfortunately they're not very specific about what drove most of that performance increase over time. It could be improvements in parsing / etc, or it could be sheer brute-force amount of code for handling all kinds of edge cases that come up in the specifics of Jeopardy, such as handling the funny ways "this" is used in clues, or other specific syntactic constructs.

Andrej Karpathy

Shared publicly  - 
Today I attended a talk by Douglas Hofstadter, the author of GEB, which I have fond memories of reading twice as an undergraduate student (although I felt it necessary to skip around a bit). This post is a summary of the main themes of the talk.

The topic of the talk was Categories, but Douglas used the term in a more general sense than it is usually used. His talk mainly consisted of an enumeration of examples for Categories:
- Simple things first: chair, phone, cooking, etc. Nouns and verbs are categories: each expresses a certain thing/action/event and refers to things in the world.
But more importantly, he relaxed the notion to higher level ideas/events/situations. For example,
- Conjunctions are categories, in the sense that using “and”, “but”, “however”, “on the other hand” etc are all a way of expressing or modifying an idea that is being communicated, all with slightly different meanings. He seemed interested in trying to more accurately define when each is used, but ultimately argued that categories have fuzzy boundaries.
- The expressions “Killing two birds with one stone”, “Tail wagging the dog”, “Left hand doesn’t know what the right hand is doing”, etc. are all categories that describe a high level situation, or a particular arrangement of events or interactions. He listed a few from other languages as well.
- Interestingly, he argued that in Chinese there are multiple words for “fall”, all to be used in different situations and therefore all describing a different category.
- In another example, he recalled a story with his then 1-year-old son at the Grand Canyon who, when put at the edge of the canyon, ignored the majestic vista and instead observed ants on the rocks nearby. He came up with a name for that category “Danny at the Grand Canyon”, which refers to a situation where someone is paying little attention to something others would deem exciting.
In general, he made the point that we acquire new categories over time, that we reason about categories primarily through analogy, and identified this process as the cornerstone of intelligence.

I found the exposition a little imprecise, so re-phrasing and summarizing my own understanding in my own vocabulary, it would look something like the following:
- Our ideas/events/concepts occupy a complex and high-level “discourse space” in which the metric is analogy and this ability is unique to humans and central to intelligence. Regions in this space are categories, and some of them are flagged with expressions that we can use to communicate the concepts to each other. Moreover, these are acquired over time automatically by observing each other speak. 
- What was not clear to me was how his notion of a Category related to language. On one hand, he uses the term “discourse space” (which has strong linguistic connotations) and argues that multiple words for “fall” delineate multiple distinct categories (i.e. language informs categories), but then he also had an example where he himself made up a name for a category (“Danny at the Grand Canyon”) which seems to imply Categories as a more general concept which can at some point acquire a linguistic label. Do they inform each other? I wish I had asked the question.
- I'm also not comfortable with a flat 1-of-k approach to separating out a semantic space with "categories", some of them broad and some fine-grained. The entire notion of category rubs me the wrong way with arbitrariness, not only in this talk but also in say, Computer Vision/Machine Learning and our datasets.

Unfortunately, overall I was slightly disappointed because it seemed to me that while the talk gave a few interesting examples of “categories”, the ways we acquire them and how we communicate them, none of it felt very actionable. There were no insights that suggested an algorithm, only a set of observations that point out that humans are indeed capable of acquiring new (fuzzy) categories in a magical “discourse space” and map/compare between them with a magical process of analogy.

But then, GEB was also just a nice collection of thought-provoking examples, not a manual for AI. So in that sense the talk was a success and it's all fun to think about :)
Many of these ideas are discussed in his new book, "Surfaces and Essences: Analogy as the Fuel and Fire of Thinking", which I read over the holidays.  I don't agree with everything he says there, but I found it thought-provoking.  The book is a much faster read than GEB, so I recommend you check it out.
In this recent interview with DataSciNews I had a chance to talk a bit about ConvNetJS and some slightly crazy ideas on the future of Javascript + Machine Learning.
+ 100 -- I knew the projects were good, but that was really a great interview too. Very interesting. Thanks for sharing!

Andrej Karpathy

Shared publicly  - 
I've been powering through the lecture videos for this EdX class on Justice by Sandel. This is basically the MOOC dream class for me: a class on something I know little about, taught well by a very engaging and accomplished instructor. (Though he's best at 2x speed, of course!)

The class is about what's fair/just and includes great discussions of Utilitarianism (which have swayed my opinions measurably away from it) and Libertarianism, of several prominent moral philosophers (John Locke, Immanuel Kant, John Rawls, Aristotle), and of several tricky topics and cases ("forced" cannibalism, conscription, democracy, surrogate motherhood, consent, affirmative action, racial discrimination, etc.).

Just a thought provoking class that has managed to change and refine several of my views and long-held beliefs. Always nice to come across! Recommended.

EDIT: Doh, I linked to the class that begins in Spring 2014, not the archived, already-finished class. Here is the link:
An excellent instructor can indeed make a subject interesting that otherwise would be utterly uninteresting.  

But still...  I get angry every time I turn my attention to the absurdity of our medieval system of "justice"; for it really seems to be based on conceptions of reality forged in the dark-ages.  Large numbers of innocent people (victimless crimes are not crimes, but the unfortunates being punished for them are REAL victims) languish in unthinkable conditions.  I can now feel my blood pressure rising.   Guess I should avoid that class, no matter how fantastic the lecturer...   ;-)

Andrej Karpathy

Shared publicly  - 
I used to play quite a bit of chess in a previous life and my passion was recently reignited after watching the recent Chess World Championships.

As part of brushing up on my old skills, I randomly discovered a site with a very interesting webapp for Chess Tactics training (Training > Chess Tactics), which you can find here:

Every board is a puzzle: you must find the right sequence of moves that wins material in the next few turns. It's addictive, fun, exceptionally educational and nicely made! But what's most interesting is the general concept of bite-sized challenges, the gamification of improvement, and the various other features that come with the app. For example, once you make a guess you are presented with a small comments section of discussion surrounding that board and further analysis.

Application of this paradigm to tips/tricks in programming?
It occurred to me that something like this would also be wonderful in other domains, especially for teaching/improving programming: you are presented with a short snippet of code and asked to fill it in to achieve some desired behavior. You get points for the time taken to write the code, the execution time, brevity, etc. Once you submit, you get to see what other people wrote, and perhaps upvote and downvote solutions. The core focus would be on relatively short tricks you don't regularly get enough practice with, involving list comprehensions, decorators, clever use of various built-in data structures, etc.
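To make the idea concrete, here is a hypothetical example of the kind of micro-challenge I have in mind (the task, function name, and format are invented for illustration), in Python:

```python
# Hypothetical micro-challenge: "Given a list of words, return the most
# common word length." You write the body; the app times your attempt,
# scores it, and then reveals everyone else's solutions for voting.

from collections import Counter

def most_common_length(words):
    # Idiomatic one-liner: count word lengths with Counter, take the
    # most frequent one.
    return Counter(len(w) for w in words).most_common(1)[0][0]

print(most_common_length(["cat", "dog", "horse", "bee", "ant"]))  # prints 3
```

A solution like this would be short enough to compare at a glance against other submissions, which is exactly what makes the voting/discussion part valuable.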

I haven't seen something like this, and a brief search seems to indicate that it may not even exist. Codecademy has a concept of Challenges (BETA), but it's not very good. In particular, they tell you whether you got it right or wrong but don't show you other people's solutions, give you points, ladders, ratings, or anything else.

Similarly, Project Euler, TopCoder and so on are focused on larger chunks of code that are sprinkled with math/algorithms :(

</random thought>

Andrej Karpathy

Shared publicly  - 
I just attended the first portion of the Data-driven Education workshop at NIPS. A few interesting highlights:

Presentation on EdX:
- on the order of 20-100K students register per class
- ~1-12K finish or explore more than half
- most popular: CS, Health in Numbers, JusticeX
- 25% from US, ~15% India, then Spain, etc... 
- No China demographics since YouTube is blocked in China :)
- Spain/Greece have much higher completion rate than others. Why? (hypothesis: unemployment, people trying to pad resumes)
- Mongolia has much higher ratio of female enrollment. (hypothesis: cultural)
- US policy doesn't allow EdX to give out course certificates to Iran (and few countries)
- Demographics: Level of education: primarily Bachelors+, Masters+. Lifelong learners. No overly-eager, enthusiastic high-school students. (Peter Norvig in audience comment: these are just the early adopters, like with iPhones on first release day)
- "India effect": a robust and significant observation that people in India do not seem to watch the lecture videos. This effect is not observed elsewhere (the speaker mentioned he did not understand why).

Khan Academy presentation
- ~72mil registered users
- >1.6bil answered problems
- 800K hours/month watch time
- Khan Academy will release partially anonymized usage data! (exercise logs, etc. Nice!)
- the presentation was unfortunately primarily about describing the Khan Academy Mastery Learning approach / user interface, not about modeling or juicy statistics :(
- Users are modeled with an attribute vector; performance on exercises is predicted with logistic regression, and the time taken to complete an exercise is also predictive.
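A rough sketch of the kind of model described in the last bullet. To be clear, the feature names and weights below are made up for illustration; Khan Academy's actual features and coefficients were not presented:

```python
import math

def predict_correct(user_features, weights, bias):
    """Logistic regression: P(user answers the next exercise correctly).

    The probability is the sigmoid of a weighted sum of user attributes.
    """
    z = bias + sum(w * x for w, x in zip(weights, user_features))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical attribute vector: [recent accuracy, exercises completed
# (scaled), log time-per-exercise (scaled)] with invented weights.
p = predict_correct([0.9, 0.5, -0.2], weights=[2.0, 1.0, -0.5], bias=-1.0)
print(round(p, 3))  # ≈ 0.802
```

The appeal of this setup is that any new signal (e.g. time-to-complete, as mentioned in the talk) can be folded in as just another feature.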

All slides will eventually be available on the workshop website. I particularly recommend waiting for the EdX presentation slides as there were a lot of interesting statistics. Really liked that presentation and the speaker, Daniel Seaton.

Andrej Karpathy

Shared publicly  - 
This was a fun video, thanks.  But also see today's cold, wet fish:

Andrej Karpathy

Shared publicly  - 
Another Academia GIFs Tumblr. I'm thankful for people who have time to make these.
My new pet project I've been hacking on over Christmas break: ConvNetJS

(see demos that train Convolutional Neural Networks on CIFAR-10 and MNIST entirely in your browser)

ConvNetJS is a Javascript library for training Deep Learning models (mainly Neural Networks) entirely in your browser. Open a tab and you're training. No software requirements, no compilers, no installations, no GPUs, no sweat.
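For a sense of why this is feasible in a browser at all: at its core, training a network is just a plain forward/backward loop that any language can express. Here is a toy sketch of that loop (in Python for brevity; this is an illustration, not ConvNetJS's API, which implements the equivalent in JavaScript with convolutional layers and an SGD trainer):

```python
import math
import random

random.seed(0)

# Toy sketch: a single sigmoid neuron learns the OR function by
# stochastic gradient descent on cross-entropy loss.
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
w = [random.uniform(-1, 1) for _ in range(2)]
b, lr = 0.0, 0.5

def predict(x):
    # Forward pass: sigmoid of the weighted sum plus bias.
    z = b + sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

for epoch in range(2000):
    for x, y in data:
        # Backward pass: for sigmoid + cross-entropy, the gradient of
        # the loss w.r.t. the pre-activation z is simply (p - y).
        g = predict(x) - y
        w = [wi - lr * g * xi for wi, xi in zip(w, x)]
        b -= lr * g

print([round(predict(x), 2) for x, _ in data])
```

A real library wraps many such layers and vectorizes the arithmetic, but nothing in this loop requires compilers, GPUs, or installations, which is the point of doing it in Javascript.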
That's awesome!

Andrej Karpathy

Shared publicly  - 
Training object detectors has never been this easy: from your webcam, and in your browser.
For those of you curious about what I've been up to, here's a short description and the Kickstarter link to our project!

We've built a webapp for real-time training of visual object detectors, an API for building vision-aware apps, and an ecosystem for doing all of this inside the browser.  You won’t need to be a C++ guru or know anything about statistical machine learning algorithms to start using laboratory-grade computer vision tools for your own creative uses.

We've been working on this full-time since the Summer, but to deliver something truly awesome at a great price for everyday developers, we've gone to Kickstarter!  There are some really awesome rewards for early backers, so please share the link with friends and feel free to ask me anything related to the project!
PhD Student
  • Google
    Research Intern, 2011 - 2011
Map of the places this user has lived
Mountain View - Kosice - Toronto - Vancouver
Computer Science PhD student at Stanford. I love technology, robots, and artificial intelligence
Computer Science PhD student at Stanford, working on Machine Learning and Vision. On a quest to solve intelligence.
  • Stanford University
    PhD Computer Science, 2011 - present
  • University of British Columbia
    MSc Computer Science, 2009 - 2011
  • University of Toronto
    BSc Computer Science and Physics, 2005 - 2009
Basic Information