Terence Tao

31,183 followers

31,183 followers

Posts

Post has attachment

Public

The Sydney Mathematical Research Institute (SMRI), an institute for international research visitors (modeled in part on the Max Planck Institute in Bonn) headed by Geordie Williamson at the University of Sydney, has now officially launched. (I am on an advisory board for this institute.)

Add a comment...

Post has attachment

Public

The Australian Mathematical Sciences Institute (AMSI) is seeking a new Director to replace Geoff Prince, who plans to retire in the near future.

Add a comment...

Public

*[I felt it was an appropriate time to repost this old G+ post of mine from April 2012. -T.]*

Anonymity on the internet is a very fragile thing; every anonymous online identity on this planet is only about 31 bits of information away from being completely exposed. This is because the total number of internet users on this planet is about 2 billion, or approximately 2^{31}. Initially, all one knows about an anonymous internet user is that he or she is a member of this large population, which has a Shannon entropy of about 31 bits. But each piece of new information about this identity will reduce this entropy. For instance, knowing the gender of the user will cut down the size of the population of possible candidates for the user's identity by a factor of approximately two, thus stripping away one bit of entropy. (Actually, one loses a little less than a whole bit here, because the gender distribution of internet users is not perfectly balanced.) Similarly, any tidbit of information about the nationality, profession, marital status, location (e.g. timezone or IP address), hobbies, age, ethnicity, education level, socio-economic status, languages known, birthplace, appearance, political leaning, etc. of the user will reduce the entropy further. (Note though that entropy loss is not always additive; if knowing X removes 2 bits of entropy and knowing Y removes 3 bits, then knowing both X and Y does not necessarily remove 5 bits of entropy, because X and Y may be correlated instead of independent, and so much of the information gained from Y may already have been present in X).

One can reveal quite a few bits of information about oneself without any serious loss to one's anonymity; for instance, if one has revealed a net of 20 independent bits of information over the lifetime of one's online identity, this still leaves one in a crowd of about 2^11 ~ 2000 other people, enough to still enjoy some reasonable level of anonymity. But as one approaches the threshold of 31 bits, the level of anonymity drops exponentially fast. Once one has revealed more than 31 bits, it becomes theoretically possible to deduce one's identity, given a sufficiently comprehensive set of databases about the population of internet users and their characteristics. Of course, such an ideal set of databases does not actually exist; but one can imagine that government intelligence agencies may have enough of these databases to deduce one's identity from, say, 50 or 60 bits of information, and even publicly available databases (such as what one can access from popular search engines) are probably enough to do the job given, say, 100 bits of information, assuming sufficient patience and determination. Thus, in today's online world, a crowd of billions of other people is considerably less protection for one's anonymity than one may initially think, and just because the first 20 or 30 bits of information you reveal about yourself leads to no apparent loss of anonymity, this does not mean that the next 20 or 30 bits revealed will do so also.

Restricting access to online databases may recover a handful of bits of anonymity, but one will not return to anything close to pre-internet levels of anonymity without extremely draconian information controls. Completely discarding a previous online identity and starting afresh can reset one's level of anonymity to near-maximum levels, but one has to be careful never to link the new identity to the old one, or else the protection gained by switching will be lost, and the information revealed by the two online identities, when combined together, may cumulatively be enough to destroy the anonymity of both.

But one additional way to gain more anonymity is through deliberate disinformation. For instance, suppose that one reveals 100 independent bits of information about oneself. Ordinarily, this would cost 100 bits of anonymity (assuming that each bit was a priori equally likely to be true or false), by cutting the number of possibilities down by a factor of 2^100; but if 5 of these 100 bits (chosen randomly and not revealed in advance) are deliberately falsified, then the number of possibilities increases again by a factor of (100 choose 5) ~ 2^26, recovering about 26 bits of anonymity. In practice one gains even more anonymity than this, because to dispel the disinformation one needs to solve a satisfiability problem, which can be notoriously intractible computationally, although this additional protection may dissipate with time as algorithms improve (e.g. by incorporating ideas from compressed sensing).

EDIT: It is perhaps worth pointing out that disinformation is only a partial defence at best, and to protect anonymity it is better not to emit any information in the first place. For instance, in the above example, even with disinformation, one has still given away about 74 bits of information, which already is more than enough (in principle, at least) to identify the identity.

Add a comment...

Post has attachment

Public

The University of Sydney is starting up a mathematical institute, with a visitor program to support international visitors in the mathematical sciences to the university and to other Australian institutions.

Add a comment...

Post has attachment

Public

Submissions for the Breakthrough Junior Challenge are being accepted until July 1.

Add a comment...

Post has shared content

Public

A group of us are setting up a new combinatorics journal. The aim is to provide an ethical alternative for papers at the level of the top specialist journals (such as Combinatorica), where "ethical" means that there are no charges for authors or readers.

For more details about the journal and the reasons for starting it, see the blog post below. And if you are combinatorialist reading this and would like to help us off to a strong start, then we'd be delighted. Our part of the bargain is that it will be the journal's policy to accept papers if and only if they are of a high standard: if we don't receive many good papers then we won't publish many papers. That way, you can submit a good paper to us and be confident that it won't be selling it short.

For more details about the journal and the reasons for starting it, see the blog post below. And if you are combinatorialist reading this and would like to help us off to a strong start, then we'd be delighted. Our part of the bargain is that it will be the journal's policy to accept papers if and only if they are of a high standard: if we don't receive many good papers then we won't publish many papers. That way, you can submit a good paper to us and be confident that it won't be selling it short.

Add a comment...

Post has attachment

Public

Here at IPAM (where I serve on the Scientific Advisory Board), we are launching some experimental minicourses in industrial mathematics. Registration for the first such course - in deep learning and AI - is now open. (Note that the course requires active participation and there is a non-trivial registration fee.)

Add a comment...

Post has attachment

Public

Progress on the Hadwiger-Nelson problem! Thanks to a computer-assisted search, a finite collection of points in the plane has been located such that it is not possible to 4-color the points so that pairs of points a unit distance apart always have distinct colors. Hence the possible values of the chromatic number of the plane has now been narrowed down to 5, 6, or 7. (EDIT: the graph was found by hand, but the verification of non-4-colorability was computer assisted.)

The author (Aubrey de Grey, who incidentally was an amateur mathematician that had contributed to some previous Polymath projects) is interested in starting a project to reduce the amount of computer assistance required (right now one has to check that certain graphs with a couple thousand vertices are not 4-colorable, which is beyond the capability of human verification). For those interested in this, he can be reached on twitter at https://twitter.com/aubreydegrey

UPDATE: There is now a polymath proposal on this at polymathprojects.org/2018/04/10/polymath-proposal-finding-simpler-unit-distance-graphs-of-chromatic-number-5/

The author (Aubrey de Grey, who incidentally was an amateur mathematician that had contributed to some previous Polymath projects) is interested in starting a project to reduce the amount of computer assistance required (right now one has to check that certain graphs with a couple thousand vertices are not 4-colorable, which is beyond the capability of human verification). For those interested in this, he can be reached on twitter at https://twitter.com/aubreydegrey

UPDATE: There is now a polymath proposal on this at polymathprojects.org/2018/04/10/polymath-proposal-finding-simpler-unit-distance-graphs-of-chromatic-number-5/

Add a comment...

Post has attachment

Public

The Breakthrough Junior Challenge for 2018 (open to applicants between the age of 13 and 18 who have made a video explaining a scientific or mathematical fact) is now officially open for submissions. (The winner receives a $250K scholarship, with an additional $50K for the winner's teacher and $100K for a lab at the winner's school.)

Add a comment...

Post has attachment

Public

Goodhart's law can be formulated as "When a measure becomes a target, it ceases to be a good measure." It initially arose in economics and is mostly applied to situations involving human agents, but as the article below illustrates with several anecdotes, the same law applies in AI research. My favorite is the AI that learned to win at a generalised form of tic-tac-toe by sending their moves to the AI opponent in a highly convoluted fashion that caused them to crash due to exceeding memory limitations.

In mathematics, the analogous phenomenon is that the argmax (or argmin) function - that takes a function F ranging over some parameter space and locates its maximum (or minimum) - can be very unstable. An approximation G to F that agrees well with F for "typical" cases may have a vastly different location for its global maximum (or minimum), due to "edge" case discrepancies. More generally, it can be dangerous to extrapolate average case behaviour of a function to draw any conclusions about worst case (or best case) behaviour.

In mathematics, the analogous phenomenon is that the argmax (or argmin) function - that takes a function F ranging over some parameter space and locates its maximum (or minimum) - can be very unstable. An approximation G to F that agrees well with F for "typical" cases may have a vastly different location for its global maximum (or minimum), due to "edge" case discrepancies. More generally, it can be dangerous to extrapolate average case behaviour of a function to draw any conclusions about worst case (or best case) behaviour.

Add a comment...

Wait while more posts are being loaded