### Dan Piponi

Shared publicly -For the flashcard app I wrote recently I wanted a nice way to extend the colour of a flashcard to fill the background of the UI. You may have noticed iTunes does something like this with album covers. This problem has been informally called "getting the dominant colour" [1]. I thought I'd make up my own approach, but being lazy I didn't want to write code making histograms or clusters or anything like that.

I think (can't find the link) I mentioned on G+ a while back that there is an iterative algorithm for computing the median of a set of numbers using majorise-minimization [2]. The median of a set of colours doesn't make sense, but the algorithm has an obvious extension to 3D so I tried that. It works quite nicely. For example, if you look at the examples, I used the mean colour to generate the border on the left and I used this "median" to generate the colour on the right. I much prefer what happened on the right. It's picked out a colour from the sky instead of just making a dark mush that blends sky and ground.

Anyway, as might be expected, a few minutes web searching revealed that I didn't invent anything new. The quantity I compute is called the Geometric Median [3]. It has a centuries old history and enough literature to generate at least one survey paper. The algorithm I came up with is standard too.

Still, before writing this code I hadn't really appreciated that there is an N-dimensional generalisation of the median. It has some nice properties that make it a decent candidate for the "dominant colour". For example, if you're limited to modifying less than half of the pixels, no matter how much you change their values, the change in the geometric median remains bounded. That means the geometric median is fairly robust against things like a music distributor slapping a bright yellow SALE sticker on the album cover. If one colour really is "dominant", the geometric median will usually pick out something close to it.

I think an "N-median" variation of the N-means clustering algorithm algorithm could work nicely too but that would require actual work.

[1] http://stackoverflow.com/questions/5050250/fast-way-of-getting-the-dominant-color-of-an-image

[2] http://sites.stat.psu.edu/~dhunter/papers/mmtutorial.pdf

[3] https://en.wikipedia.org/wiki/Geometric_median

[4] https://en.wikipedia.org/wiki/Robust_statistics#Breakdown_point

I think (can't find the link) I mentioned on G+ a while back that there is an iterative algorithm for computing the median of a set of numbers using majorise-minimization [2]. The median of a set of colours doesn't make sense, but the algorithm has an obvious extension to 3D so I tried that. It works quite nicely. For example, if you look at the examples, I used the mean colour to generate the border on the left and I used this "median" to generate the colour on the right. I much prefer what happened on the right. It's picked out a colour from the sky instead of just making a dark mush that blends sky and ground.

Anyway, as might be expected, a few minutes web searching revealed that I didn't invent anything new. The quantity I compute is called the Geometric Median [3]. It has a centuries old history and enough literature to generate at least one survey paper. The algorithm I came up with is standard too.

Still, before writing this code I hadn't really appreciated that there is an N-dimensional generalisation of the median. It has some nice properties that make it a decent candidate for the "dominant colour". For example, if you're limited to modifying less than half of the pixels, no matter how much you change their values, the change in the geometric median remains bounded. That means the geometric median is fairly robust against things like a music distributor slapping a bright yellow SALE sticker on the album cover. If one colour really is "dominant", the geometric median will usually pick out something close to it.

I think an "N-median" variation of the N-means clustering algorithm algorithm could work nicely too but that would require actual work.

[1] http://stackoverflow.com/questions/5050250/fast-way-of-getting-the-dominant-color-of-an-image

[2] http://sites.stat.psu.edu/~dhunter/papers/mmtutorial.pdf

[3] https://en.wikipedia.org/wiki/Geometric_median

[4] https://en.wikipedia.org/wiki/Robust_statistics#Breakdown_point

7

1

2 comments

One issue, which could be significant, is that the geometric median doesn't necessarily have to be one of your input data points. There are other definitions of median-like quantities in general spaces that do have that property (sacrificing other things to get it).

Add a comment...