Today is Firefox 6 Release Day. Every engineer loves release day - a concrete moment in a mostly digital world. Thanks to the still-new Firefox Rapid Release ( http://blog.mozilla.com/futurereleases/2011/07/19/every-six-weeks/ ) we get this high every six weeks. Thank goodness for our excellent operations team.

I have a number of features that ended up in FF6. The initial ietf-websockets implementation is finally there (it is ietf-07; FF7 will have ietf-10) and it will get sufficient attention in the release notes. I wanted to highlight here a couple of smaller things of mine that are being released today too.

FF6 contains 'syn-retry', which is one of my favorite changes I have made to the HTTP stack. This basically means that if we can't complete the TCP handshake for a new HTTP connection within 250ms we start a second one in parallel. Your OS probably won't resend that first connection attempt (the SYN) for 3000ms. The loss of that particular packet is so detrimental to performance that we step in at the application layer to see if it is just an ephemeral problem that can be cured by trying again.
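The idea can be sketched in Python with non-blocking sockets. This is a minimal illustrative model, not the actual Necko C++; the `connect_with_backup` helper and the constants are assumptions made up for the sketch, and real browser code also has to handle DNS, proxies, and cancellation:

```python
import selectors
import socket
import time

SYN_RETRY_MS = 250  # mirrors the 250ms timer described above

def connect_with_backup(addr, retry_ms=SYN_RETRY_MS, total_timeout=5.0):
    """Start a non-blocking TCP connect; if the handshake hasn't finished
    within retry_ms, race a second attempt in parallel and return whichever
    socket completes first (or None on timeout/failure)."""
    sel = selectors.DefaultSelector()
    attempts = []

    def start_attempt():
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.setblocking(False)
        s.connect_ex(addr)  # EINPROGRESS is expected for non-blocking sockets
        sel.register(s, selectors.EVENT_WRITE)
        attempts.append(s)

    start_attempt()
    backup_at = time.monotonic() + retry_ms / 1000.0
    deadline = time.monotonic() + total_timeout
    winner = None
    while winner is None and attempts and time.monotonic() < deadline:
        if len(attempts) < 2:
            wait = min(deadline, backup_at) - time.monotonic()
        else:
            wait = deadline - time.monotonic()
        for key, _ in sel.select(max(wait, 0.0)):
            s = key.fileobj
            if s.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR) == 0:
                winner = s  # first completed handshake wins
                break
            sel.unregister(s)  # this attempt failed; drop it
            s.close()
            attempts.remove(s)
        if winner is None and len(attempts) < 2 and time.monotonic() >= backup_at:
            start_attempt()  # timer fired with no handshake: second SYN in flight
    for s in attempts:
        if s is not winner:
            sel.unregister(s)
            s.close()
    sel.close()
    return winner
```

If the first SYN was lost, the second attempt usually wins the race long before the OS-level 3000ms retransmit; if the first attempt was merely slow, it can still win.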

There are some subtle bonuses that go along with this implementation. Chiefly, because we might have to choose between a couple of different sockets, the process of making the TCP connection was separated from the HTTP transaction we intend to put on it - we place the transaction on the first available connection instead of tying the two objects together at handshake time. The fun part is that the first available connection might not be one of the newly opened ones; it might be a reused persistent connection that frees up while the handshake is taking place!

For example - in FF5, if we needed to fetch an image and there were 2 existing, but busy, connections to the correct host, we would open a 3rd connection in parallel and, when it was opened, simply fetch the image on it. In FF6 we still open that 3rd connection, but if connection-1 or connection-2 becomes available before connection-3 is ready, we immediately dispatch onto that instead and save a bunch of latency even when there is no packet loss at all.

This actually happens a lot. The connection takes 1 RTT to complete, or 3 RTTs if it is over SSL. Any existing transactions that are served faster than that will complete before the new handshake can - and that's a big fraction of web resources. The 250ms retry timer does not even need to go off; the concepts are really orthogonal.

In the cases where we have initiated a connection (or two) that doesn't get used right away, we just add it to the idle persistent connection pool so that a new transaction can use it without having to wait for a handshake. This is basically an accidental implementation of limited connection pre-warming. We'll be doing more of that (in a more structured way) in the future, I hope.
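The dispatch change above can be modeled as a small queue structure. This is a toy sketch, not Necko's actual classes; `ConnectionPool` and its method names are invented for illustration. Transactions are queued independently of the connections opened on their behalf, and whichever connection becomes ready first - a finished handshake or a freed-up persistent connection - takes the next transaction, with unused handshakes landing in the idle pool:

```python
from collections import deque

class ConnectionPool:
    """Toy model of the FF6 behavior: transactions and connections are
    decoupled, and any connection that becomes ready takes the next
    waiting transaction. Names are illustrative, not Necko's."""

    def __init__(self):
        self.pending = deque()   # transactions waiting for a connection
        self.idle = deque()      # warm persistent connections, ready to reuse

    def submit(self, transaction):
        """Queue a transaction; reuse an idle connection if one exists."""
        if self.idle:
            return self.dispatch(self.idle.popleft(), transaction)
        self.pending.append(transaction)
        return None  # caller opens a new connection in parallel

    def connection_ready(self, conn):
        """Called when a handshake completes OR a busy connection frees up."""
        if self.pending:
            return self.dispatch(conn, self.pending.popleft())
        self.idle.append(conn)   # unused handshake -> pre-warmed idle pool
        return None

    def dispatch(self, conn, transaction):
        return (conn, transaction)
```

Walking through the image example: `submit("img.png")` queues the fetch while connection-3's handshake starts; if connection-1 frees up first, `connection_ready("conn1")` dispatches the image onto it immediately, and when connection-3's handshake later completes it simply joins the idle pool for the next request.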

In the forthcoming FF7, biesi built on this feature to do the retry with IPv6 disabled (falling back to v4), which really helps in cases of servers with broken v6 stacks.
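A stripped-down sketch of that fallback, in Python rather than the actual Firefox code (the helper name and the 250ms stall threshold are assumptions for illustration): try the handshake over IPv6 first, and if it fails or stalls, retry on an IPv4-only socket so a broken AAAA record can't strand the load:

```python
import socket

def connect_v6_then_v4(host, port, v6_timeout=0.25):
    """Illustrative model of the retry-with-IPv6-disabled idea: attempt
    the connection over IPv6, and on failure or stall retry using only
    IPv4 addresses."""
    s6 = None
    try:
        s6 = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
        s6.settimeout(v6_timeout)
        s6.connect((host, port))  # resolves AAAA records only
        s6.settimeout(None)
        return s6
    except OSError:  # no v6 address, refused, or handshake stalled
        if s6 is not None:
            s6.close()
        s4 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s4.connect((host, port))  # A records only: the v4-only retry
        return s4
```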

The last interesting bit of my code in FF6 is a simple piece of tuning (well, it actually came with a bunch of code to support legacy OSes, but that's a different story). FF <=5 limits you to 6 parallel connections per hostname (all browsers do this) and 30 aggregate connections across all hosts (just a FF thing). It turns out that 30 is a real choke point - I was watching packet traces of www.nytimes.com, which spreads its requests across so many different hostnames that the 30-connection limit was reached. The result was that we were closing idle connections to hostname A to make room under the 30-connection quota and connect to B, only to see more requests for A flow in, which now had to reopen the connection just to load one page. Ugly! FF6 works around it by raising the limit from 30 to 256, because an idle TCP connection is a pretty cheap thing to store.
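The choke point is easy to see in a toy model of the two quotas (the function and return values below are made up for illustration; the 6 and 30 match the limits described above). Once the aggregate cap is full, every request to a fresh shard forces an idle connection to some other host to be torn down, and reconnecting to that host later costs a full handshake:

```python
PER_HOST = 6     # per-hostname limit (all browsers)
MAX_TOTAL = 30   # FF<=5 aggregate limit; FF6 raises this to 256

def plan_connection(conns, host, per_host=PER_HOST, max_total=MAX_TOTAL):
    """conns maps hostname -> number of open connections (idle or busy).
    Returns what the browser must do to reach `host`."""
    if conns.get(host, 0) >= per_host:
        return "queue"               # per-host cap: wait for a free one
    if sum(conns.values()) < max_total:
        return "open"                # room under the aggregate cap
    # Aggregate cap hit: an idle connection to some other host must be
    # closed first - and reopening it later is pure handshake latency.
    return "evict-idle-then-open"
```

With a heavily sharded page (say, five image hostnames holding 6 connections each), any new hostname at `max_total=30` lands in the eviction case; at 256 it simply opens.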

But the whole situation makes me mad. Bumping into the limit means connections get torn down and recreated - adding latency for no gain. If they were all under the same hostname then many of them would get queued and served on a persistent connection instead - saving the handshake latency at the cost of parallelism. But server developers want the parallelism, so they shard the hostnames, and this is one of the unforeseen complications. We need to enable infinite parallelism with sane congestion control semantics - something like HTTP over SCTP, or shared TCP congestion control blocks, or more likely SPDY, because SPDY brings those concepts into the application layer, which makes it inherently more deployable. And I like to release stuff :)
10 comments
 
Congrats on the release! I admire the fact that things like gain-less latency from tearing down and recreating connections make you mad. (seriously) It's that kind of passion that makes innovation successful (IMHO)
 
Cool stuff, Patrick.

What happens for us folks who are > 250ms from most of the Web sites in the world?
 
:: What happens for us folks who are > 250ms from most of the Web sites in the world?

you get lots of pre-warmed connections! Which is great, because your connections are painful to create from scratch.
 
Awesome stuff!

Is SPDY getting deployed? I hadn't heard anything about it for a few months... HTTP over SCTP would be interesting, is someone working on that? And shared TCP congestion control is fascinating, but a little mind-bending.

Would pipelining the requests, but allowing responses out of order (identified by some token), help at all?

(I'm bummed I didn't manage to see you and the wee one while you were in our neck of the woods)
 
Hey Ben - lots of good stuff.

Google has had a lot of promising results deploying SPDY through Chrome and the Google websites. When using most (all?) of *.google.com over TLS, Chrome will generally use SPDY today. The Google team (esp. +Mike Belshe, +Roberto Peon, and +William Chan) deserves a ton of extra credit for publishing some results in a bunch of forums (e.g. http://www.slideshare.net/ido-cotendo/from-fast-to-spdy-velocity-2011 ) and being totally open to questions about their experiences. Open standardization is necessary and will be a bear, but all signs say it is feasible.

I have just recently started playing with what it would take to integrate it into Firefox and then generate some benchmarks and do some experiments. I'm not going to oversell that project - I'm just getting started. But if it proves good for the web, Mozilla wants to contribute both an implementation and our experiences back into the technology.

As far as something like reordered pipelines goes - in my opinion you've broken the existing protocol so badly at that point that you need a major version number update, and you've only addressed one aspect of head-of-line blocking. If you're going to do that, you might as well tackle the whole problem set. That being said, HTTP/1.1 pipelines definitely have a role to play in making stuff better during the ~decade it will take to transition to a new protocol.
 
A good feature to improve performance when the web server is load balanced and the network is inconsistent. The IP implementation at the OS level still decides how the packet should be sent, so, assuming the network is consistent, if the first SYN packet takes 250ms then the second SYN packet will also take the same time... right?
 
Not if it gets lost... and mobile IP is making lossy networks much more common.