In 1936 Tarski proved a fundamental theorem of logic: the undefinability of truth.   Roughly speaking, this says there's no consistent way to extend arithmetic so that it talks about 'truth' for statements about arithmetic.  Why not?  Because if we could, we could cook up a statement that says "I am not true."  This would lead to a contradiction, the Liar Paradox: if this sentence is true then it's not, and if it's not then it is.

This is why the concept of 'truth' plays a limited role in most modern work on logic... surprising as that might seem to novices!

However, suppose we relax a bit and allow probability theory into our study of arithmetic.  Could there be a consistent way to say, within arithmetic, that a statement about arithmetic has a certain probability of being true? 

We can't let ourselves say a statement has a 100% probability of being true, or a 0% probability of being true, or we'll get in trouble with the undefinability of truth.  But suppose we only let ourselves say that a statement has some probability greater than X and less than Y of being true, where 0 < X < Y < 1.  Is that okay?

Yes it is, according to this draft of a paper by Paul Christiano, Eliezer Yudkowsky, Marcello Herreshoff and Mihaly Barasz!

But there's a catch, or two.  First, there are many self-consistent ways to assess the probability of truth of arithmetic statements.  This suggests that the probability is somewhat 'subjective'.  But that's fine if you think probabilities are inherently subjective.

A bit more problematic is this: their proof that there exists a self-consistent way to assess probabilities is not constructive.  In other words, you can't use it to actually get your hands on a consistent assessment.

Fans of game theory will be amused to hear why: the proof uses Kakutani's fixed point theorem!  This is the result that John Nash used to prove games have equilibrium solutions, where nobody can improve their expected payoff by changing their strategy.  And this result is not constructive.

If you don't know Tarski's original result, try this:

http://en.wikipedia.org/wiki/Tarski%27s_undefinability_theorem
 
It seems to me that this is in the same "area" as Chaitin's Ω. Also what of Fuzzy Logic? (There is a Fuzzy Definition of Probability.)
 
Do you think the result is solid and will hold up?
 
Very interesting, thanks for sharing this, recently I have been thinking about Dirac's beauty test for mathematical "truth" for various reasons and this gives me even more to think about :) 
 
Many years ago I took a course in Mathematical Logic from Don Monk, a student of Tarski. Although this was long ago, almost another life, I still like thinking on these things. Thanks for the post.
 
Epic Rap Battles of History: Tarski vs Gödel
 
I don't get it. Can you elaborate on why truth can't be defined just because there are statements which are inconsistent with themselves? 
 
So, could a statement about probability of truth be 100% true itself?
If not that could lead to a prob of prob and so on, which, probably is the best resolution of "the undefinability of truth."
 
+Elias Mårtenson, For a good layman’s, but not easy, introduction to matters such as that, check out Douglas Hofstadter's Gödel, Escher, Bach.
 
Because if we could, we could cook up a statement that says "I am not true."  Is that somehow supposed to be obvious? I don't get it.
 What if that system did not let me make up paradoxical statements?
 
+Sivaramakrishnan Swaminathan "What if that system did not let me make up paradoxical statements?" Then it would be so incomplete that it would border on being useless for the work of mathematics. These theorems are proved within specific systems that do allow self-reference, through the use of certain tricks, and any system that does not allow such tricks also does not allow mathematics as we know it.
 
+Sivaramakrishnan Swaminathan You are asking a fair question, and the answer in +Bill Reed 's comment is correct, but maybe a little short. A slightly longer answer is that it is hard to define "does not let me make up paradoxical statements". What people like Gödel and Tarski showed is that even very modest systems of logic (roughly, any system that allows some simple deductions and is rich enough to capture integer arithmetic) can be used to construct paradoxical statements. 

I'm sure other commenters will correct me if I've misstated anything. For a popular introduction to this stuff, I also recommend Hofstadter's Godel, Escher, Bach. 
 
+Adam Smith & +Sivaramakrishnan Swaminathan I was thinking that anyone who has worked through these systems knows what was being hinted at, but that any time you start trying to fill in the details, with caveats and all, it becomes not so easy to explain in a simple way. What +John Baez was doing was giving an analogy and should be treated only as such. Hofstadter's Godel, Escher, Bach is the answer ... that and a lot of work.
 
If we attach a probability of being true to a statement, that reflects our lack of knowledge about the truth. A consistent theory in terms of probability of truth is a consistent theory that doesn't say much... Here is a consistent theory in terms of probabilities: "All statements are 50% likely to be true, including this one" :)
 
+Cristian Georgescu Inconsistent! Assume two independent propositions A and B. By your theory pr(A) = pr(B) = .5. Now by probability theory pr(A&B), where A and B are independent, is pr(A) × pr(B), or in this case .25. But by your theory pr(A&B) = .5. Contradiction, as .25 ≠ .5 ... Your theory has a 0% probability of being true.
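Bill's arithmetic can be spelled out in a couple of lines (a sketch only, using exact rationals so nothing hides in rounding; the variable names are my own):

```python
from fractions import Fraction

# Hypothetical uniform assessment: every statement gets probability 1/2.
p = Fraction(1, 2)

# For independent A and B, probability theory demands
# pr(A & B) = pr(A) * pr(B).
product_rule = p * p          # 1/4

# ... but the uniform theory assigns pr(A & B) = 1/2 directly.
uniform_claim = p             # 1/2

assert product_rule != uniform_claim  # 1/4 != 1/2: contradiction
```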
 
+gwern branwen - I didn't go through the proof in detail yet, but it's short and I understand the mathematical tools used, so I should do it now.   Obviously I wouldn't have blogged about it this way if I thought it were fishy.
 
I am more concerned about what happens with the concept of truth if we attach a probability to it rather than the mathematics of working out a Tarski result in a multi-valued logic. What would be the meaning of something like: "Goldbach's conjecture is true with a probability of 64.2%"?
 
+Bill Reed is correct that we can't assign a probability 50% to every proposition; if we could, the result in this paper - the existence of a well-behaved way to assign probabilities - would be extremely easy!   But the equation

pr(A&B) = pr(A) pr(B)

doesn't play a role in the formalism of this paper.  Instead - see Theorem 1 - the authors assume that for every pair of propositions
A and B,

pr(A) = pr(A&B) + pr(A & not(B))

This also rules out assigning a probability 50% to each proposition, since then we'd have 50% = 50% + 50%.
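John's check is easy to run mechanically (a sketch; the function name and the string "statements" are my own stand-ins, not the paper's formalism):

```python
# Toy check of the coherence axiom  P(A) = P(A & B) + P(A & not B)
# against the inconsistent rule "every statement has probability 1/2".

def uniform_p(statement):
    """Hypothetical assessment assigning 1/2 to every statement."""
    return 0.5

lhs = uniform_p("A")
rhs = uniform_p("A and B") + uniform_p("A and not B")
print(lhs, rhs)  # 0.5 1.0 -- the axiom fails: 50% = 50% + 50% is false
assert lhs != rhs
```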
 
+Cristian Georgescu I can offer a possible interpretation of what it might mean, but I'm way out of my depth. One of the consequences of Gödel's and Tarski's theorems is the "existence" of multiple versions of mathematics. I would suggest that the probability then could be a measure of the versions for which that proposition is true ... i.e. a probability of 50% means that the theorem is true in 50% of these multiple versions. (This is very imprecise, so I hope you understand it.)
 
+John Baez I used two independent propositions for simplicity. I have not read the paper.
 
When we assign probability to every statement A, that means that even statements like "P(A) = .5" have a probability of .5, and also "P("P("P(...) = .5") = .5") = .5" and so on. That's why I added "including this statement" after "all statements have a probability of 50% to be true". In which case, the statement "P(A) = .5" can be true in certain cases, or false in some other cases, as pointed out by John Baez and Bill Reed. Again, I am not trying to be critical to the article, but rather have concerns that after assigning a probability to the truth we have problems understanding what we are talking about.
 
+Cristian Georgescu Did you see my suggestion for what it could mean? Your way seems to me to lead to nothing being true. ( .5, .25, .125, ... on to 0)
 
+Cristian Georgescu - There are different interpretations of probability theory, but perhaps a statement like "Goldbach's conjecture is true with a probability of 64.2%" makes the most sense if you're a subjective Bayesian.  Then this could say which bets you'd be willing to take about the truth of Goldbach's conjecture.  As the word "subjective" indicates, we don't expect, in this interpretation, any probability to be the "objectively correct" probability:

http://en.wikipedia.org/wiki/Bayesian_probability
 
+John Baez how does your (earlier) objection fare if we replace "P(A) = 0.5" by "P(A) >= 0.5" for every statement A?
 
+John Baez that does make more sense, but what training in math I have was in the more pure aspects and years ago, so it is not natural for me to think as a Bayesian!
 
In a certain way people already consider Goldbach's conjecture to be "true" in a Bayesian way since it was proven to be true up to very, very large numbers, and it's more and more true with every new verification. But such a concept of truth would make mathematics look more like physics.
 
Ah, Kosko, me too... What did you make of the thing about information as a fluid? Irrotational and compressible...
 
+Cristian Georgescu This is also true of the Riemann Hypothesis, and perhaps more so, because number theorists routinely use it to prove other theorems.
 
Maybe it would be a good thing, maybe not, for mathematics to be more like physics, especially now that computers are used to prove results that soon will be impractical to verify. But in the case of physics we come up with theories that are true to the degree that they approximate a reality. What would be the reality that is approximated by mathematics?
 
+Stephen Paul King I've not run into that, but it would not surprise me because the same sort of thing happens with the flow of probability determined from the Schrodinger Equation. Sadly, my interest in these things always reaches further than my knowledge. (I've always seen math as this incredibly fun one person game.)
 
+Alok Tiwari wrote: "+John Baez how does your (earlier) objection fare if we replace "P(A) = 0.5" by "P(A) >= 0.5" for every statement A?"

I forget what my earlier objection was, but it's impossible to have P(A) >= 0.5 for every statement A in the setup discussed in this paper, again because

P(A) = P(A&B) + P(A & not(B))

This equation would imply P(A) >= 1 for every statement, and using this equation again we'd get P(A) >= 2 for every statement!
 
Thanks, +John Baez . Damn. I shouldn't be so lazy. Wow, that was rather embarrassing. (Or maybe there's a sensible question along those lines... anyhow, I should just try and read the article.)
 
I see that there is an implicit assumption that statements like "P(A) = P(A&B) + P(A & not(B))", or "P(A) = 0.5" are either true or not. Why would we assume that? We should give those statements also a probability to be true.
 
Well, the bottom line is we have to base our 'truth' on some 'base', but then how do we know that the 'base' itself is 'true'?
 
+Pranav Deshpande All we have is assumptions and if those assumptions become inconvenient we will have other assumptions. It is something very surprising that the assumptions we call math and logic have been so, well, convenient. It is a mystery why this should be so, but it is.
 
Ah,... the unreasonable effectiveness of mathematics. That's the real mystery, isn't it?
 
+Bill Reed Well said. Its interesting to see that we humans 'have' to make sense of everything, and our drive to understand and rationalize everything based on assumptions! :-) 
 
In the paper they offer a useful interpretation of what their probabilities mean. One way of looking at the result is that they construct (or rather, prove the existence of) a probability measure on the set of all models of their theory, and then the probability of a statement is just the probability that it holds in a randomly chosen model. The existence of such a measure turns out to be equivalent to three very simple axioms that the probabilities should satisfy. In addition to the property that John mentioned above, we want the probability of every tautology to be 1 and the probability of every contradiction to be 0.
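This interpretation can be made concrete in miniature (my own toy illustration, not the paper's construction): fix a probability measure on a finite set of "models", here truth assignments to two atomic statements, and define the probability of a sentence as the measure of the models satisfying it. The three properties from the thread then hold automatically.

```python
from fractions import Fraction

# A probability measure on four toy "models": truth values for atoms a, b.
models = {
    (True,  True):  Fraction(1, 2),
    (True,  False): Fraction(1, 4),
    (False, True):  Fraction(1, 8),
    (False, False): Fraction(1, 8),
}

def prob(sentence):
    """P(sentence) = probability it holds in a randomly chosen model."""
    return sum(w for m, w in models.items() if sentence(m))

A = lambda m: m[0]
B = lambda m: m[1]

assert prob(lambda m: True) == 1                    # tautologies get 1
assert prob(lambda m: False) == 0                   # contradictions get 0
assert prob(A) == prob(lambda m: A(m) and B(m)) \
              + prob(lambda m: A(m) and not B(m))   # John's additivity axiom
print(prob(A))  # 3/4
```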
 
It seems that self-reference is an interesting topic in Maths and has generated a lot of important theorems. How about in Physics? Could there be some important principles that is related to self-measurement? 
What I have in mind is for a universal theory of universe, a description of the measurement process in this language is necessarily self-measurement...
 
Thanks, +Timothy Gowers, for this cogent explanation!  "the probability of a statement is just the probability that it holds in a randomly chosen model" was what I was struggling to verbalize in an earlier comment, but this was just a guess on my part.
 
What if we wanted to map statements to complex-valued amplitudes instead of probabilities? Does the idea translate to workable axioms?
 
Fuzzy logics (including probability logics) are not immune to paradoxes.
 
+Cristian Georgescu wrote: "I see that there is an implicit assumption that statements like "P(A) = P(A&B) + P(A & not(B))", or "P(A) = 0.5" are true.  Why would we assume that?"

First, if we don't make any assumptions about how the probabilities P(A) behave, we can't prove anything about them and there's no point writing a math paper about them.  The statement

P(A) = P(A&B) + P(A & not(B))

is a universally accepted law in probability theory, so it makes plenty of sense for the authors to assume it's true and see what they can do with it.  As +Timothy Gowers points out, they show this rule, together with

P(A) = 1

for every tautology A, guarantee that P comes from a probability distribution on the set of 'models' of the axioms of arithmetic.  Conversely, any such probability distribution gives you a choice of P.  Then they go on to study P's obeying certain additional rules.

Second, we have to be very careful about two ways they're using the letter P in their paper: as part of the extended language of arithmetic, and as part of the metalanguage.  (I think the paper would be clearer if they used different symbols for these different uses, though as they point out, no ambiguity arises if you pay careful attention.)  When I'm saying they assume

P(A) = P(A&B) + P(A & not(B))

is true, this is part of the metalanguage, and this does not imply that

P(P(A) = P(A&B) + P(A & not(B))) = 1

in the extended language of arithmetic.  This is a subtle point, so don't feel bad if it makes no sense!
 
+Richard Botting - I don't know Spivey's work, and the link you gave doesn't seem to provide access to his paper.  So, I don't know the answer to your question!  I bet people have tried to sidestep Tarski's theorem in many ways.
 
Posted to the "Foundations of Math" Community.  If for any reason you might not want it there, just let me know!  :)
 
+John Baez. Spivey.... It is a book. I think it was his Ph. D. Thesis from Cambridge U., UK. I may have to check out the copy I ordered 20 years ago for my university library. 
 
Tarski was a subject of study in my degree for semesters. Love it. I'll share this post with your permission. Kudos.
 
Thank-you!  :)  :)
 
+Cristian Georgescu

The probabilities in question are not like ones that arise from physical processes, such as the probability of a fair die rolling a 6 (as a result of bouncing and symmetry).

They're proving that there's a set of betting weights which is internally consistent. (There would be more than one).
 
A repost from the blog where I first saw this result:

The use of the Kakutani fixed-point theorem demonstrates that there is a fixed point – an enumeration of probabilities for sentences in L’. Do we know that such an enumeration is unique? How do we know this?

It seems to me that much of the value of having probabilities for statements analogous to the use of the predicate True relies upon uniqueness, and not just consistency.
 
+Tim Wesson

Yes. Also, correct me if I am wrong but fixed point theorem would not demonstrate that an enumeration exists which is consistent with rest of the axioms (e.g. from which you can't derive arbitrarily high probability for contradictions).
 
+Dmytry Lavrov Members of T have probability 1 in A, and the axioms for the coherence of ℙ eliminate contradictions (for any contradiction ç, ¬ç is a member of T).
 
+Tim Wesson

I was speaking of statements which are not in T but whose "probability" can get arbitrarily close to 1, not exact contradictions, sorry. In simple terms, it's not clear to me that an arbitrary value assigned to a statement being arbitrarily close to 1 implies that you can trust the statement.
 
+Dmytry Lavrov It is possible that the allocation of 'probabilities' is meaningless - perhaps some of the models for T that are equally weighted in the probability calculus are simply implausible, meaning that the 'probability' is skew.

Still, it is worth accounting for axiom 1.  If you suspect that something is wrong with good reason, that reason should factor into the probability for your statement.

Harder are the things that are intuitively dodgy, but you cannot pin down.  The formalist would have no problem with this case - intuition is mere inspiration to the formalist, and is no indication of truth.
 
+Tim Wesson

But what about possible collisions between those statements? E.g. if you assume two statements are true because they have "probability" > 0.9999999999999, which is quite true enough, is it truly unlikely that you'll hit some sort of contradiction? I don't see that proven truly unlikely, though I may be missing something. Edit: this is also why I am not sure those numbers can be called probabilities yet.
 
+Dmytry Lavrov Starting with your second statement:  why should these things be called 'probabilities' - well, if they aren't unique, then that is a very good question.  One that I came here hoping for some insight on.  Unfortunately, none of the big guns have turned up, it seems :o(

Okay, to your first question:  it's in the construction, specifically the question of coherence (section 2.1).  That there is coherence implies that a near-contradiction arises exceedingly rarely.  For example, a one-in-a-million falsehood in L will turn up typically one time in a million attempts to make language L determinate.  What is more, any one such falsehood could easily be constructed not to occur by fixing it as true ahead of its normal position in the queue, e.g. right after the base axioms, at T1, instead of waiting until making the determination directly or indirectly at Ti for some i>1.  It is worth noting that making L determinate still does not fix L'\L, that is:  statements involving the probability predicate.

Those statements of high probability in L are those which come out true in almost all models in which statements of L are bivalent - that is, deterministically true or false.  However, your high probability statement is no longer operating in such a scenario, but rather one where only the direct deductions and contradictions from the axioms are bivalent.  Thus your scenario of high probability falsehoods does not apply, since the only truths and falsehoods are already predetermined, and no more will arise.  Instead, statements deductively 'near' high probability statements will themselves have high probability, and this is enforced by axiom 1 of Theorem 1.

Statements in L'\L are determined not by this process, but rather by the use of the fixed point theorem in the proof of the consistency of the reflection principle.  This process also fixes the probabilities of each choice in the initial determination, but is non-constructive, so you cannot know the probabilities, although you can reason with and about them:  the probability relations remain.

My original question is how do we know that the fixed point (that fixes all the probabilities) is unique?  This result might still be useful without uniqueness, but 'probability' would be a misnomer.
 
Fascinated but beyond me at the moment