Posts

TensorFlow is a framework, recently open-sourced by Google, for creating artificial neural networks and other machine learning models. It approaches the problem from a much lower level of abstraction than the biological metaphor that one typically uses when talking about neural networks, so in this post I explain how this approach matches up with the more traditional description.
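
To make the contrast concrete, here is a minimal sketch in plain Python (not TensorFlow code, and not taken from the post) of one layer written two ways: once in the "biological" style, where each neuron sums its weighted inputs and fires, and once as a single matrix-vector product followed by an element-wise function. The function names, weights, and inputs are all made up for illustration.

```python
import math

def sigmoid(z):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer_neuron_view(inputs, weights, biases):
    """Biological metaphor: each neuron adds up its weighted inputs and fires."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = bias
        for x, w in zip(inputs, neuron_weights):
            total += x * w
        outputs.append(sigmoid(total))
    return outputs

def layer_tensor_view(inputs, weights, biases):
    """Lower-level view: sigmoid(W x + b), a matrix-vector product plus a bias."""
    product = [sum(w * x for w, x in zip(row, inputs)) for row in weights]
    return [sigmoid(p + b) for p, b in zip(product, biases)]

# Hypothetical layer with two inputs and two neurons.
inputs = [1.0, 2.0]
weights = [[0.5, -0.5],
           [1.0, 1.0]]
biases = [0.0, -1.0]

neuron_out = layer_neuron_view(inputs, weights, biases)
tensor_out = layer_tensor_view(inputs, weights, biases)
```

Both functions compute the same numbers (up to floating-point rounding); only the description of what is happening changes.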

There are a number of different ways to interpret the inputs, outputs and weights involved in an artificial neural network. In this Shape of Data post, I explain how one can think of the weights between layers/rows of neurons as linear transformations. I also explain (roughly) how the word2vec algorithm uses this approach to embed words in a high-dimensional space in a way that reflects their relative semantics.
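
As a small illustration of the linear-transformation view, the sketch below treats a weight matrix as a function on vectors and checks that it respects sums and scalar multiples. It also shows that a one-hot input (a vector with a single 1) simply picks out one column of the matrix, which is roughly how an embedding lookup behaves. The matrix, the vectors, and the choice of which index stands for which word are hypothetical, and this is not the word2vec training procedure.

```python
def apply_weights(weights, vector):
    """Multiply a weight matrix (a list of rows) by an input vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]

# Hypothetical 2x3 weight matrix: maps 3 input neurons to 2 output neurons.
W = [[1.0, 0.0, 2.0],
     [0.0, 3.0, -1.0]]

x = [1.0, 2.0, 3.0]
y = [0.0, 1.0, -1.0]

# Linearity: W(2x + y) is the same as 2*W(x) + W(y).
combined = apply_weights(W, [2 * a + b for a, b in zip(x, y)])
separate = [2 * a + b for a, b in zip(apply_weights(W, x), apply_weights(W, y))]

# A one-hot input selects a column of W. If index 1 stood for some word,
# that column would be the word's 2-dimensional embedding.
one_hot = [0, 1, 0]
embedding = apply_weights(W, one_hot)  # [0.0, 3.0]
```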

After a long break, I finally had some time to write a new Shape of Data post.

Convolutional neural networks have become very popular for image processing. (Some of these networks are better at distinguishing cats from dogs than humans are.) In this post, I discuss how the mechanics of neural networks are very natural for image processing, and how convolutional neural networks overcome some of the issues that arise when using standard neural networks for image recognition.
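
One ingredient of a convolutional layer is a small grid of weights (a kernel) that is slid across the whole image, so the same few weights are reused at every position. The sketch below is a minimal plain-Python version of that sliding operation, with no padding and a stride of 1. Strictly speaking it computes a cross-correlation, which is what most neural network libraries call convolution. The toy image and the edge-detecting kernel are made up for illustration.

```python
def convolve2d(image, kernel):
    """Slide a small kernel over an image (no padding, stride 1)."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            # Weighted sum of the image patch under the kernel.
            for di in range(kernel_h):
                for dj in range(kernel_w):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output

# Toy 3x4 image: dark on the left, bright on the right.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]

# Hypothetical kernel that responds to a dark-to-bright vertical edge.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = convolve2d(image, kernel)  # [[0, 2, 0], [0, 2, 0]]
```

The output is nonzero only where the kernel straddles the edge, and the same four weights detect that edge wherever it appears in the image.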

Some thoughts on why I left academia...

Precision/recall and the ROC curve (or its AUC score) are ways of measuring how accurate the results of a classification algorithm are. There's some very simple, but nonetheless interesting, geometry behind how they work.
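
For readers who want the definitions spelled out, here is a minimal sketch of these scores for a binary classifier that outputs a score per example. Each threshold gives one precision/recall pair and one point on the ROC curve, and the AUC is computed here as the fraction of positive/negative pairs that the scores rank correctly (ties count as half). The labels and scores are invented toy data, and the function names are my own.

```python
def confusion_counts(labels, scores, threshold):
    """Count true/false positives and negatives at a given score threshold."""
    tp = fp = fn = tn = 0
    for label, score in zip(labels, scores):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def precision_recall(labels, scores, threshold):
    tp, fp, fn, _ = confusion_counts(labels, scores, threshold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def roc_point(labels, scores, threshold):
    """One point on the ROC curve: (false positive rate, true positive rate)."""
    tp, fp, fn, tn = confusion_counts(labels, scores, threshold)
    fpr = fp / (fp + tn) if fp + tn else 0.0
    tpr = tp / (tp + fn) if tp + fn else 0.0
    return fpr, tpr

def auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked in the right order."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

# Toy data: 1 = positive class, 0 = negative class.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]

precision, recall = precision_recall(labels, scores, threshold=0.5)
fpr, tpr = roc_point(labels, scores, threshold=0.5)
area = auc(labels, scores)
```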

In my new Shape of Data post, I explain how many data analysis techniques involve using gradient descent to find a point in a configuration space that maximizes or minimizes some function.
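
As a minimal sketch of the idea, the code below runs gradient descent on a simple bowl-shaped function of two variables. The function, starting point, step size, and number of steps are all arbitrary choices for illustration; real applications use much more complicated functions and usually tune these settings.

```python
def gradient_descent(gradient, start, step_size=0.1, steps=100):
    """Repeatedly step against the gradient to approach a minimum."""
    point = list(start)
    for _ in range(steps):
        grad = gradient(point)
        point = [p - step_size * g for p, g in zip(point, grad)]
    return point

def f(point):
    """Toy function with its minimum at (3, -1)."""
    x, y = point
    return (x - 3) ** 2 + (y + 1) ** 2

def grad_f(point):
    """Gradient of f, worked out by hand."""
    x, y = point
    return [2 * (x - 3), 2 * (y + 1)]

minimum = gradient_descent(grad_f, start=[0.0, 0.0])
```

After 100 steps, `minimum` is very close to `[3, -1]`. To maximize a function instead, step in the direction of the gradient rather than against it.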

After a long break from blogging, things have finally settled down enough that I think I can get back to writing regularly. This post doesn't have much geometry in it, but future posts will!

Many of the interesting problems in data analysis these days have to do with running relatively simple algorithms on massive data sets that are too big to be handled by a single computer. (This is, essentially, what the buzzword "Big Data" refers to.) Initially, it was common to rework individual algorithms to work in a distributed setting one at a time. But as patterns began to emerge in this process, a number of groups developed general frameworks to make the translation process easier and automate it as much as possible. One of the earliest and most popular frameworks was MapReduce, which I describe in this Shape of Data post.
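
To give a feel for the pattern, here is a single-machine sketch of the classic word-count example in the MapReduce style. It is not a distributed implementation and does not use any real MapReduce framework; the function names and documents are made up. The point is that the map and reduce steps each work on independent pieces of data, which is what lets a framework spread them across many computers.

```python
from collections import defaultdict

def map_phase(document):
    """Map: turn one document into (word, 1) pairs."""
    return [(word, 1) for word in document.lower().split()]

def shuffle(pairs):
    """Shuffle: group all values that share the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: combine all the values for one key into a single result."""
    return key, sum(values)

# Toy "data set"; in practice each document could live on a different machine.
documents = ["the cat sat", "the dog sat", "the cat ran"]

mapped = [pair for doc in documents for pair in map_phase(doc)]
word_counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
```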
