Interesting throughout, a bit concerned by:
QUANTA MAGAZINE: You think the goal of your field should be developing artificial intelligence that is “provably aligned” with human values. What does that mean?
STUART RUSSELL: It’s a deliberately provocative statement, because it’s putting together two things — “provably” and “human values” — that seem incompatible. It might be that human values will forever remain somewhat mysterious. But to the extent that our values are revealed in our behavior, you would hope to be able to prove that the machine will be able to “get” most of it. There might be some bits and pieces left in the corners that the machine doesn’t understand or that we disagree on among ourselves. But as long as the machine has got the basics right, you should be able to show that it cannot be very harmful.
Yikes, using human values as the basis of an ethical system worries me. In general we are horrendous at this, but I guess it's the only example we have of a (mostly) non-self-nullifying value system.
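Tangent, but the "values are revealed in our behavior" part is roughly the inverse reinforcement learning framing Russell works on: watch what an agent chooses and infer the rewards that would explain those choices. Here's a toy sketch of the idea (my own illustration, not Russell's actual method; it assumes a Boltzmann-rational model of choice, and all the names and numbers are made up):

```python
import numpy as np

# Toy inverse reinforcement learning: infer hidden "values" (rewards)
# from observed choices, assuming the chooser is Boltzmann-rational,
# i.e. P(choice) is proportional to exp(reward(choice) / temperature).

rng = np.random.default_rng(0)

# Hypothetical setup: 3 options with true (hidden) rewards.
true_rewards = np.array([1.0, 0.2, -0.5])
temperature = 0.5

def choice_probs(rewards, temp):
    logits = rewards / temp
    logits -= logits.max()          # numerical stability
    p = np.exp(logits)
    return p / p.sum()

# Simulate observed behavior generated by the true rewards.
n_obs = 2000
observed = rng.choice(3, size=n_obs, p=choice_probs(true_rewards, temperature))

def neg_log_likelihood(rewards):
    p = choice_probs(rewards, temperature)
    counts = np.bincount(observed, minlength=3)
    return -np.sum(counts * np.log(p))

# Crude random search over candidate reward vectors (a real IRL method
# would use gradients and a much richer model of behavior).
best, best_nll = None, np.inf
for _ in range(5000):
    candidate = rng.uniform(-2, 2, size=3)
    candidate -= candidate.mean()   # rewards only identified up to a shift
    nll = neg_log_likelihood(candidate)
    if nll < best_nll:
        best, best_nll = candidate, nll

print("true (centered):", true_rewards - true_rewards.mean())
print("recovered:      ", np.round(best, 2))
```

Note that the recovered rewards only match up to an additive constant, and only insofar as the assumed behavior model is right; that gap is roughly where the "bits and pieces left in the corners" live.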