Shared publicly  - 
In 1977, Martin Gardner wrote that "in an unpublished proof, Graham has recently established … a bound so vast that it holds the record for the largest number ever used in a serious mathematical proof." 

The video here gives a fun explanation.  But it seems nobody in the video can quite remember the problem where Graham's number shows up.  It's a bit tricky:

Consider an n-dimensional hypercube.   Connect each pair of vertices with an edge.   Color each of the edges either red or blue. What is the smallest value of n for which every such coloring contains 4 vertices lying in a plane that are all connected to each other with edges of the same color?

For a while the best upper bound on this n was Graham’s number, namely g(64), where we define

g(1) = 3↑↑↑↑3


g(k+1) = 3 ↑↑↑↑↑↑↑↑↑↑↑↑g(k) of these↑↑↑↑↑↑↑↑↑↑↑↑ 3

Remember, Knuth’s up-arrow notation is a way of talking about exponentiation, tetration, and so on:

2↑4 = 2 × (2 × (2 × 2))
2↑↑4 = 2 ↑ (2 ↑ (2 ↑ 2))
2↑↑↑4 = 2 ↑↑  (2 ↑↑ (2 ↑↑ 2))

etc.  But Graham used a devilish trick to let the number of up-arrows grow very rapidly.  Each number in his sequence g(k) is the number of up-arrows in the next one!   And so, his number is vastly bigger than any numbers I've mentioned so far in my tale... except, of course, for all the infinite ones.

Graham's number was an upper bound on the answer to a problem.  One way to get a huge upper bound is to be stupid.   For example, I could say "a million is an upper bound on the smallest prime, because a million is not prime, so it must have a prime factor smaller than this." 

Graham is certainly not stupid, and he improved his bound considerably by the time he published a paper on this stuff.   But it's worth noting that the logician Harvey Friedman has found math problems where the lower bound on the right answer is much bigger than Graham's number!   We'll talk about those later.

For more on Graham's number, try this:

Cody Roux's profile photoJohn Baez's profile photoRoyce Peng's profile photoAndrés Eduardo Caicedo's profile photo
What's amazing is that even g_2 is incomprehensible, but Kruskal's TREE(3) makes g_64 look tiny.
Indeed!   And what's amazing is not that there are really big numbers, but that there are really simple problems whose answers are really big numbers.  (By the way, I think Harvey Friedman is the main one to blame for TREE(3), though it's based on work of Kruskal.)
All of a sudden, the ending of the proof in Rudin's “Real and Complex Analysis” which goes something like “[…]<10000ε, which is small when ε is small” seems rather mundane. And I thought it so amusing when I first encountered it back in my student days.
From number theory we also have Littlewood's famous result that π(x)-li(x) changes sign infinitely often. As it turns out, the first sign change occurs around 10^136 (look up Skewes' number on Wikipedia). Again, this seems rather ho-hum now, though it is still impressivly large for such a natural question (depending on what you consider natural, of course). On the other hand, it seems that many of the enormous numbers in this series arise from combinatorics and logic, especially the former of which is already infested with large numbers. So I wonder what are the biggest numbers occurring in analysis proofs?

(For the nonmathematicians: π(x) is the number of primes less than or equal to x, while li(x) is known as the integral logarithm: It is the integral of 1/ln(x) from 0 to x. (You need to take a principal value in order to handle the singularity of the integrand at x=1, though.))
When I took graduate combinatorics, it always blew my mind how fast things blew up. I like this funny Erdos quote on Ramsey numbers:
"Erdős asks us to imagine an alien force, vastly more powerful than us, landing on Earth and demanding the value of R(5, 5) or they will destroy our planet. In that case, he claims, we should marshal all our computers and all our mathematicians and attempt to find the value. But suppose, instead, that they ask for R(6, 6). In that case, he believes, we should attempt to destroy the aliens."

In computer science, there are the Busy Beaver machines, a six state machine having on the order of 10^36534 steps before halting. There's some relation between these as Graham's number as someone constructed a class of turing machines whose number of steps is on the order of g1 (the first step in Graham's number)
Last night I went back and reread from omega^2 to get up to speed. I also told my 12 YO daughter about tetration and she thought it was awesome. I can't watch the video now, but your own explanations are always excellent. Thanks for this! 
I like this narrowing from 6--G to 11--G !
Maybe it's mentioned in the video, but my favourite thing about Graham's number is that (if I understand correctly) the actual solution of the problem is thought to be around 13.
The main nice thing about writing a short expression for Graham's number in the lambda calculus is that, like a universal Turing machine, the lambda calculus is a way of modelling computation - and an even nicer one if you like functions and composition more than tape with marks on it.  So, you can define a variant of the busy beaver function BB(n) to be the length of the longest lambda-term you can get by reducing a lambda-term with n symbols - not counting the ones that give arbitrarily long lambda-terms as you reduce them.  And then this expression

graham = (λc2 c3.(λc4. c3 c4 (λn. n (λf b. b f (λx. x)) (λb f. c3 (b f)) c3) c4) (c2 c2)) (λf x. f (f x)) (λf x. f (f (f x)))

shows that BB(86) is at least on the order of Graham's number... if  I counted the number of symbols right.

+Mike Stay - do you know if Cris Calude ever tried to work out lower bounds on a lambda calculus version of the busy beaver function?  It seems more fun than working with Turing machines.
+Eduard Baumann and +Artie Prendergast-Smith - now you're getting me interested in the current best bounds on the solution to this problem.  According to Wikipedia it's somewhere between 13 and a huge number that's tiny compared to Graham's number:

F^7(12) = F(F(F(F(F(F(F(12)))))))


F(n) = 2 ↑↑↑↑↑↑↑↑↑↑↑↑n of these↑↑↑↑↑↑↑↑↑↑↑↑ 3

This improved upper bound is the one Graham actually published in his original paper with Rothschild... Martin Gardner seems to have dug up an earlier version of Graham's proof somehow.   The lower bound of 13 is due to Jerome Barkley in 2008.  Do people really think this is the right answer?
+Ray Cromwell - did you actually get some intuition for Ramsey theory in that course?  I find it very gnarly and dry - so far the only thing I like about it is that it involves big numbers!  I'm trying to read Ramsey Theory by Graham, Rothschild and Spencer, but so far I'm mainly bouncing off this book.
+John Baez Oh, I didn't understand that Eduard's comment was referring to the same thing. I guess my assertion that the true value is around 13 isn't actually supported by what's in the Wikipedia article, but I do remember getting the impression that it's a small number somewhere. Maybe +Timothy Gowers might have something to say about this...
+John Baez Unfortunately, no, I was so in love with the beauty of Balanced Incomplete Block Designs / STS / Finite Projective Planes in that course that I could almost think of nothing else for a while, I was mostly muddling through the Ramsey Theory stuff. To this day, I am still in love with the Shannon-Fano Plane I learned for that course. Something deep within my brain just clicked and it's one of the most beautiful structures to me. 

This is one of the things I've always struggled to communicate with people from the liberal arts, that mathematicians can get a sense of awe and beauty from areas of math that are similar to the way others can appreciate a beautiful sculpture or poem.

Some amateur writing on the Fano Plane from me  a few years ago ( based on puzzle it solves. 

BTW John, I've been reading your columns for years, ever since the first "This week..." on USENET and you've continued to inspire me over the years to stay invested in math. Your original articles on the Sporadic Groups and Monstrous Moonshine were unnerving and amazing, as well as the stuff on Category Theory.
When Gardner reprinted his article on Ramsey theory in The Colossal Book of Mathematics in 2001, he included an appendix to show how things had changed since 1977, but he still leaves the reader with the impression that the least known upper bound for the hypercube problem is Graham's number.  That could have merited a correction!
Graham's number is possibly the worst upper bound estimate in history. :) 
I'm glad you've enjoyed my stuff ever since the Lower Paleolithic Internet, +Ray Cromwell.  Like you, projective planes seem a lot more pretty to me than Ramsey theory.  Perhaps their well-organized structure is why they don't give rise to so many huge numbers... though the search for a projective plane of order 10 was quite gnarly and computer-intensive:

These days I'm trying to focus on more practical math like electrical circuits, Markov processes and the like, but I find this frees me up to explore utterly recreational areas in my spare time without feeling guilt.   So, I've been reading about notation systems for large countable ordinals, which is a very spooky subject. 

Any notation system only covers countably many ordinals, so there are uncountably many countable ordinals bigger than all of those, and a unique smallest one with this property... so presto, we have managed to name an ordinal that's not named by our notation system!  

It seems hopeless to get around this, but very smart people have tried harder and harder to fight their way deeper and deeper into this problem, and the results are pretty scary.    I want to try to explain this stuff in a simple way.
+John Baez When you put it that way, it just seems like another variation of the Berry Paradox. It seems like it could also be related to Algorithmic Information Theory/Chaitin Incompleteness, and by implication, the Halting Theorem or Goedelian Incompleteness.  That is, any finite axiom system, or let's say, notational system with a fixed set of symbols and operators, can only "reach" a subset statements with property P, even though there may be many many more statements with property P, they can't be represented by your notation. In the case you describe, the property P is being a countable ordinal with some kind of relation (e.g. whether one is larger than another).  Thus, your notation system has a kind of maximum entropy, and you can't really "compress" the set of all countable ordinals to a more compact encoding.

The idea of theorems as a kind of data compression is another appealing idea to me.
Oh Friedman, what ever where you trying to compensate for?
Harvey Friedman is one of the most far-out logicians in the world.  Maybe he was beaten up in the playground and wants to get revenge by proving that he knows bigger numbers than anyone else.  But I don't care: he's done a lot of amazing things:
I've always been fascinated by reverse mathematics, and the tree theorem. It's quite useful in rewriting, e.g. it gives a bound on the termination length of so-called "path orderings". These bounds are basically useless of course (in practical terms at least).
I'm not a man of fixed mathematical tastes, +Andres Caicedo - I enjoy everything from nonlinear PDE to octonions to n-categories.  I probably just need to get some intuition for Ramsey theory.  The questions are pleasant enough, and I like the idea that the answers tend to be big.  Unfortunately the proofs so far look like just a mass of subscripts to me.  I haven't penetrated the surface.
+Artie Prendergast-Smith I believe you're thinking of the original Martin Gardner article, in which he said something like "experts in the field believe that the actual number is 6!"  Once the lower bound was pushed to 11 and then 13, I don't think anyone had the intuition anymore that the actual value would really be the lower bound;  in fact, Exoo (who found the 11 lower bound) argues that the value is probably considerably higher, based on the fact that, in his construction of a 10-dimensional counterexample, he made a lot of arbitrary choices in his construction and still came up with a counterexample.  His argument was that, since the counterexample came so easily, the actual maximum is probably considerably higher.  Of course, "considerably higher" in this case could mean something like 42, and not what we would consider a terribly large number!
+John Baez Concerning the current best bounds, I was able to derive an upper bound of 2^^^(2^^n), where n is a reasonably small number - still a monstrously large number, but much lower than either of Graham's numbers.  I basically just found explicit bounds for Spencer's proof of the theorem, and used Shelah's primitive recursive bound for the Hales-Jewett theorem, which was not available to Graham and Rothschild when they wrote their paper. (In fact, the non-primitive recursive bound on the Hales-Jewett theorem is the main reason why Graham's number is so big!)
+Royce Peng - have you written up and/or published your proof?  (Unfortunately I'm too ignorant about these matters to understand in detail what you're saying, but you're making it sound interesting, and other people, who know more, would be even more interested.)
+John Baez Unfortunately, I doubt it is publishable, since I am just finding explicit bounds for already published proofs.  I suppose I will write a blog post about it.  Stay tuned!
Great!  Blog posts are a good way to get something out of results that aren't quite publishable.  But if you look at all the papers that have been published, you'll see the bar is not very high, either.
Add a comment...