Today is Firefox 6 Release Day. Every engineer loves release day - a concrete moment in a mostly digital world. Thanks to the still-new Firefox Rapid Release ( http://blog.mozilla.com/futurereleases/2011/07/19/every-six-weeks/ ) we get this high every six weeks. Thank goodness for our excellent operations team.

I have a number of features that ended up in FF6. The initial ietf-websockets implementation is finally there (it is ietf-07; FF7 will have ietf-10) and it will get sufficient attention in the release notes. I wanted to highlight here a couple of smaller things of mine that are being released today too.

FF6 contains 'syn-retry', which is one of my favorite changes I have made to the HTTP stack. This basically means that if we can't complete the TCP handshake for a new HTTP connection within 250ms, we start a second one in parallel. Your OS probably won't resend that first connection attempt (the SYN) for 3000ms. The loss of that particular packet is so detrimental to performance that we step in at the application layer to see if it was just an ephemeral problem that can be cured by trying again.
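To make the idea concrete, here is a minimal sketch in Python (illustrative only - the real implementation is C++ inside the Firefox HTTP stack and the details differ): start a connect, and if the handshake hasn't finished after 250ms, race a second attempt and keep whichever socket wins.

# Minimal sketch of the syn-retry idea, not Firefox's actual code: kick off a
# TCP connect and, if the handshake has not completed within 250 ms, race a
# second parallel attempt and keep whichever socket finishes first.
import socket
import selectors
import time

RETRY_AFTER = 0.250   # seconds to wait before launching the backup connect

def connect_with_syn_retry(host, port, timeout=10.0):
    sel = selectors.DefaultSelector()
    attempts = []

    def start_attempt():
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.setblocking(False)
        try:
            s.connect((host, port))
        except BlockingIOError:
            pass                          # handshake now in progress
        sel.register(s, selectors.EVENT_WRITE)
        attempts.append(s)

    start = time.monotonic()
    deadline = start + timeout
    start_attempt()
    backup_started = False

    while attempts and time.monotonic() < deadline:
        if backup_started:
            wait = deadline - time.monotonic()
        else:
            wait = max(0.0, start + RETRY_AFTER - time.monotonic())
        for key, _ in sel.select(timeout=wait):
            s = key.fileobj
            sel.unregister(s)
            attempts.remove(s)
            if s.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR) == 0:
                for other in attempts:    # a handshake won; drop the loser(s)
                    sel.unregister(other)
                    other.close()
                return s
            s.close()                     # this attempt failed; keep waiting
        if not backup_started and time.monotonic() - start >= RETRY_AFTER:
            backup_started = True         # 250 ms with no handshake: try again
            start_attempt()

    for s in attempts:
        sel.unregister(s)
        s.close()
    raise TimeoutError("no connection attempt completed")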

There are some subtle bonuses that go along with this implementation. Chiefly, because we might have to choose between a couple of different sockets, the process of making the TCP connection was separated from the HTTP transaction we intend to put on it - we place the transaction on the first available connection instead of tying the two objects together at handshake time. The fun part is that the first available connection might not be one of the newly opened ones, it might be a reused persistent connection that frees up while the handshake is taking place!

For example - in FF5, if we needed to fetch an image and there were 2 existing, but busy, connections to the correct host we would open a 3rd connection in parallel and, when it was opened, simply fetch the image on it. In FF6 we still open that 3rd connection, but if connection-1 or connection-2 finishes its current transaction and becomes available before connection-3 is ready, then we immediately dispatch onto it instead and save a bunch of latency even when there is no packet loss at all.
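Here is a toy sketch of that dispatch model (illustrative Python again, not the actual connection manager code): transactions queue per host, and whichever connection reports ready first - a freshly handshaked socket or a persistent connection that just finished its previous transaction - picks up the next pending transaction.

# Toy model of dispatch-to-first-available-connection; names are hypothetical.
from collections import deque

class HostEntry:
    def __init__(self, open_new_connection):
        self.pending = deque()           # transactions waiting for a socket
        self.idle = []                   # warm persistent connections, no work
        self.open_new_connection = open_new_connection

    def submit(self, transaction):
        if self.idle:
            self.idle.pop().send(transaction)    # reuse a warm connection now
        else:
            self.pending.append(transaction)
            self.open_new_connection(self)       # handshake runs in background

    def on_connection_ready(self, conn):
        # Called both when a new handshake completes and when a persistent
        # connection finishes its current transaction.
        if self.pending:
            conn.send(self.pending.popleft())    # first available socket wins
        else:
            self.idle.append(conn)               # nothing pending: keep it warm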

This actually happens a lot. A new connection takes 1 RTT to complete, or 3 RTTs if it is over SSL. Any existing transaction that is served faster than that will free up its connection before the new handshake can finish - and that describes a big fraction of web resources. The 250ms retry timer does not even need to go off; the two concepts are really orthogonal.

In cases where we have initiated a connection (or two) that doesn't get used right away, we just add it to the idle persistent connection pool so that a later transaction can use it without having to wait for a handshake. This is basically an accidental implementation of limited connection pre-warming. We'll be doing more of that (in a more structured way) in the future, I hope.
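Continuing the toy sketch above, the accidental pre-warming falls out for free: a handshake that finishes after its transaction was already served elsewhere just parks the connection in the idle list, and the next request for that host skips the handshake wait (the class and request below are hypothetical).

# Continuing the toy HostEntry sketch; PrintConn stands in for a real socket.
class PrintConn:
    def send(self, transaction):
        print("dispatching:", transaction)

entry = HostEntry(open_new_connection=lambda e: None)   # stub out real connects
entry.on_connection_ready(PrintConn())    # a late handshake completes -> idle pool
entry.submit("GET /image.png")            # dispatched immediately, no handshake wait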

In the forthcoming FF7, biesi built on this feature to do the retry with IPv6 disabled (i.e. fall back to v4), which really helps with servers that have broken v6 stacks.
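Roughly the idea, as a hedged Python sketch (my simplification, not biesi's actual patch, and the numbers are illustrative): give the IPv6 handshake only a short grace period and retry over IPv4 if it stalls or fails.

# Rough sketch only: the real FF7 change hooks the retry described above; here
# the fallback is expressed as a simple sequential try-v6-then-v4.
import socket

def connect_prefer_v6(host, port, v6_grace=0.250, timeout=10.0):
    s6 = None
    try:
        s6 = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
        s6.settimeout(v6_grace)           # only wait briefly for the v6 path
        s6.connect((host, port))
        s6.settimeout(timeout)
        return s6
    except OSError:
        if s6 is not None:
            s6.close()                    # broken or slow v6: fall back to v4
    s4 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s4.settimeout(timeout)
    s4.connect((host, port))
    return s4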

The last interesting bit of my code in FF6 is a simple piece of tuning (well, it actually came with a bunch of code to support legacy OSes, but that's a different story). FF <=5 limits you to 6 parallel connections per hostname (all browsers do this), and 30 aggregate connections across all hosts (just a FF thing). It turns out that 30 is a real choke point - I was watching packet traces of www.nytimes.com, which spreads its requests across so many different hostnames that the 30 limit was reached. The result was that we were closing idle connections to hostname A to make room under the 30-connection quota and connect to hostname B, only to see more requests for A flow in that now had to reopen the connection - all just to load one page. Ugly! The workaround is to raise the limit from 30 to 256, because an idle TCP connection is a pretty cheap thing to store.
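To see why the cap matters, here is a back-of-the-envelope simulation (the hostnames and the LRU eviction policy are simplifications of mine, not Firefox's exact logic): when a sharded page touches more hosts than the aggregate cap allows, nearly every reuse opportunity turns into a fresh handshake.

# Toy model of the aggregate connection cap; hostnames are made up.
from collections import OrderedDict

def count_handshakes(requests, max_total):
    """Count TCP handshakes for a stream of requests, keeping at most
    max_total connections open (one per host here, least-recently-used
    connection evicted when the cap is hit)."""
    open_conns = OrderedDict()               # host -> connection, LRU ordered
    handshakes = 0
    for host in requests:
        if host in open_conns:
            open_conns.move_to_end(host)     # warm connection: reuse it
            continue
        if len(open_conns) >= max_total:
            open_conns.popitem(last=False)   # evict the least recently used
        open_conns[host] = object()          # new connection = new handshake
        handshakes += 1
    return handshakes

# 40 sharded image hosts requested round-robin, two resources per host:
page = [f"img{i}.shard.example.com" for i in range(40)] * 2
print(count_handshakes(page, 30))    # 80 - every reuse misses under the old cap
print(count_handshakes(page, 256))   # 40 - the second pass rides warm connections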

But the whole situation makes me mad. Bumping into the limit means connections get torn down and recreated - adding latency for no gain. If they were all under the same hostname then many of them would get queued and served on a persistent connection instead - saving the handshake latency at the cost of parallelism. But server developers want the parallelism, so they shard the hostnames, and this is one of the unforeseen complications. We need to enable infinite parallelism with sane congestion control semantics - something like HTTP over SCTP, or shared TCP congestion control blocks, or more likely SPDY, because SPDY brings those concepts into the application layer, which makes it inherently more deployable. And I like to release stuff :)