### Suresh Venkatasubramanian

Shared publicly - 21st century living, in a nutshell.

A philosophy webcomic about the inevitable anguish of living a brief life in an absurd world. Also Jokes


Suresh Venkatasubramanian

Works at U. Utah

Attended Stanford

Lives in Salt Lake City

1,209,733 views



What's also nice about his post is the distinction between information obtained from you directly and information obtained about you.

DP doesn't protect the latter. And his complaint is in how people expect it should.

While this is true, I think it's a distinction that has taken some time to emerge (because privacy is such a slippery notion), and in some ways DP has helped emphasize this distinction. So a possibly better line of "attack", as it were, might be to spell this out more clearly (if it hasn't already been done) instead of playing a valiant but losing game of whack-a-paper :)


Another great post by Frank McSherry resolving some misconceptions about the guarantees of differential privacy under correlations between data entries. My favorite passage:

"By way of analogy, we use cryptography to protect our financial information, even though there is probably lots of correlation among our bank accounts. If I see you moving into your new house and call to ask about your mortgage, because often people with new houses have mortgages, have we violated the security guarantees of the cryptosystem your bank uses? Is it now time to write a paper: "cryptography vulnerable to correlated data", announcing that to provide its guarantees, modern cryptography assumes all messages are independent? Please don't."

https://github.com/frankmcsherry/blog/blob/master/posts/2016-08-29.md


I hadn't seen this response to the McSherry et al. criticisms of the paper trashing differential privacy. +Moritz Hardt, +Aaron Roth, +Anand Sarwate any thoughts?

http://blogs.harvard.edu/infolaw/2016/05/17/diffensive-privacy/


A Response to the Criticisms of Fool's Gold: An Illustrated Critique of Differential Privacy. By Jane Bambauer and Krish Muralidhar. Two years ago, we coauthored an article that challenged the popular enthusiasm for Differential Privacy. Differential Privacy is a technique that permits ...


Aaron Roth


Frank's response sums it up pretty well: https://github.com/frankmcsherry/blog/blob/master/posts/2016-05-19.md


Do you want to know when your child is born whether they will commit a crime at age 18? This person thinks he can do it.


Just so the internets know: that was tongue firmly in cheek.


This is a short and sweet illustration of the challenges of mathematical modeling

If you're trying to solve a real-world problem, I think you have to tread very carefully, at the early stages, in the way you formulate it mathematically. Once you've formulated it, it's easy to imagine that this is now the problem you need to solve and forget about the original problem. You can then spend a long time making progress on solving the formulation, but fail to realise that you're actually stuck in a rut and that there's another formulation of the original problem that works much better.

I had the problem of trying to find optimal (in some sense) paths for moving a large number of vehicles from various A's to various B's. It seemed natural to pick one vehicle and then optimise for that. The problem of finding the optimal path could then be written as something like a calculus of variations problem. At this point I was trapped in a particular formulation. Trying to solve the calculus of variations problem led to an ODE, but instead of initial conditions it had both initial and terminal conditions. You can discretise time and use a standard ODE solving method, but you can't simply step forward through time from the start as you don't know, for any choice of starting path, whether it will end the right way. It turns into a messy search problem. And then I had the prospect of solving this problem reliably for a large number of paths.
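That search over initial conditions is essentially the shooting method. A minimal sketch, with entirely made-up toy dynamics (a simple oscillator, not the original vehicle problem): step the ODE forward from a guessed initial slope, then bisect on that slope until the terminal condition is satisfied.

```python
# Shooting-method sketch for a two-point boundary value problem.
# The dynamics, horizon, and target are hypothetical, for illustration only.

def integrate(v0, T=1.0, n=1000):
    """Forward-Euler step of x'' = -x from x(0) = 0 with slope v0; return x(T)."""
    dt = T / n
    x, v = 0.0, v0
    for _ in range(n):
        a = -x           # toy dynamics: simple harmonic oscillator
        x += v * dt
        v += a * dt
    return x

def shoot(target, lo=-10.0, hi=10.0, iters=60):
    """Bisect on the unknown initial slope until x(T) hits `target`."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if integrate(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

v0 = shoot(0.5)
print(abs(integrate(v0) - 0.5) < 1e-6)   # prints True: terminal condition met
```

Even this one-dimensional toy shows the "messy search" flavour: each candidate initial condition costs a full forward integration, and bisection only works here because the terminal position happens to be monotone in the initial slope.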

Eventually I realised I had to break out of my original formulation. Instead of solving for one entire path at a time, it works better to solve for many paths incrementally and simultaneously. If you can solve and tabulate the solutions to the problem of getting from each A to each B over the time period s to t, and you can solve for the short time period from s-ds to s, then you can combine these to solve for the entire table of solutions from s-ds to t. Now it's just recursion. This is classic dynamic programming, which was in fact originally invented to solve this very problem in the case where the vehicles were missiles. It can be incredibly easy compared to solving millions of ODEs with initial and terminal conditions.

I think I basically recapitulated the history of control theory, which is unsurprising as I was getting my problem solving clues from web pages on control theory and colleagues who had some experience with related problems. The former approach is the Pontryagin principle approach [1] and the latter is the Hamilton-Jacobi-Bellman approach [2] (which is similar to reinforcement learning). The former approach is probably a good choice if there's a chance you can solve the equations exactly and get a simple formula. The latter approach seems to work better when the quantity you're optimising is determined by big tables of numbers.

[1] https://en.wikipedia.org/wiki/Pontryagin%27s_maximum_principle

[2] https://en.wikipedia.org/wiki/Hamilton%E2%80%93Jacobi%E2%80%93Bellman_equation
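The tabulation described above can be sketched as a backward Bellman sweep. Everything concrete here is a made-up toy (a one-dimensional grid of locations and a quadratic step cost), purely to show the recursion, not the original vehicle problem:

```python
# Backward dynamic-programming sketch: tabulate the cost-to-go from every
# location at every time step, sweeping backward from the deadline.
import math

locations = range(5)     # assumed discrete set of positions (toy example)
T = 10                   # number of time steps
goal = 4

def step_cost(a, b):
    """Assumed one-step cost of moving from location a to location b."""
    return abs(a - b) ** 2

# cost_to_go[t][x] = cheapest cost of reaching `goal` from x with T - t steps left
cost_to_go = [[math.inf] * len(locations) for _ in range(T + 1)]
cost_to_go[T][goal] = 0.0

for t in range(T - 1, -1, -1):           # sweep backward in time
    for x in locations:
        cost_to_go[t][x] = min(
            step_cost(x, y) + cost_to_go[t + 1][y] for y in locations
        )

# One table now answers the query for *every* start, not just one path.
print(cost_to_go[0][0])   # → 4.0 (four unit moves, then wait at the goal)
```

This is the Hamilton-Jacobi-Bellman approach in miniature: one backward pass over a table replaces a separate boundary-value solve per path.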



My colleague John Hughes has a bunch of great examples of mistaking the model for the domain. One needs a certain level of maturity to appreciate the Modelers' Hippocratic Oath:

https://www.wilmott.com/archives/707 [CC +Dan Piponi]



This explains why my students' drafts have so much process writing in them.


Moritz tackles a very important issue in this post: defining goals for fairness in Machine Learning. It's trickier than it seems, but a very important topic.

Many have given up on the very concept, often arguing that machine learning isn't biased, but merely mirrors our own biases by fitting models to 'naturally occurring' biased data. That's simply not true: machine learning applied without care is inherently biased toward where there is a lot of data in the first place: think groups of people who consume more, or ethnicities that show up more often in medical records. It is, however, something we can control and characterize, as long as we have the proper definitions in place of what it means for a probabilistic model to be 'fair'.


As machine learning increasingly affects domains protected by anti-discrimination law, there is much interest in the problem of algorithmically measuring and...


And he did it!

+John Moeller


My student +John Moeller (moeller.fyi) just defended his Ph.D. thesis today! and yes, there was a (rubber) snake-fighting element to the defense. John's dissertation work is in machine learning, but his publications span a wider range. He started off with a rather hard problem: attempting to ...


Yes, you read that correctly. The whole of Dagstuhl is now a Pokégym and there are Pokémon wandering the streets of Wadern (conveniently close to the ice cream shop that has excellent ice cream!) Given this latest advancement, I was reminded of Lance Fortnow's post about Dagstuhl from back in ...


On what it means to do data-driven science: some thoughts on +Carl Zimmer's excellent series on exploring his genome.

If you haven't yet read Carl Zimmer's series of articles (one, two, three), you should go out and read it now! Because after all, it's Carl Zimmer, one of the best science writers around, especially when it comes to biology. But even more so because when you read the story of his personal quest ...


The first ruling on algorithmic decision-making!


I was invited last week to an event co-sponsored by the White House, Microsoft, and NYU called AI Now: The social and economic implications of artificial intelligence technologies in the near term. …


Suresh's Collections

Communities

8 communities

Story

Tagline

I like algorithms. And I hope they're fair.

Introduction

CS prof, interested in algorithms, geometry, data mining, clustering

Education

- Stanford
- IIT Kanpur

Basic Information

Gender

Male

Other names

geomblog

Collections Suresh is following

Work

Occupation

Researcher

Employment

- U. Utah, Associate professor (present)

Places

Currently

Salt Lake City

Previously

Aarhus, DK - Stanford, CA - Philadelphia, PA - Morristown, NJ - New Delhi, India - Berkeley, CA

Links

Public - a year ago

reviewed a year ago

That this restaurant has an overall rating less than 4 is a travesty. Skip Finca's overrated food and preserve your ear drums: Cafe Madrid is a much more intimate (read: quiet and charming) Spanish fine dining experience, with possibly the best service I've ever had in Salt Lake City. Call ahead if you want the paella: it's worth it.
