RE: FACETED SEARCH IA -- LONG TECHNICAL QUESTION: Google finally published http://googlewebmastercentral.blogspot.com/2014/02/faceted-navigation-best-and-5-of-worst.html, which is the most comprehensive writeup they've ever published on the subject. I'll add (toot toot) that many of the things I've suggested in the past are in there. OK, so the author basically cites 2 solutions, both of which I've investigated:
#1. Nofollow the link and canonicalize it to a superset.
#2. Exclude it via robots.txt
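To make #1 concrete, here is a rough Python sketch of what the markup side could look like. The facet parameter names (color, size, sort) and the example.com URLs are my own assumptions for illustration, not anything from Google's post, and a real site would need its own rule for what counts as the "superset" page.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical facet parameters that only narrow or reorder a listing.
FACET_PARAMS = {"color", "size", "sort"}


def canonical_superset(url):
    """Drop facet parameters so the URL points at the broader superset page."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in FACET_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def facet_link(url, label):
    """Render the on-page filter link with rel="nofollow"."""
    return f'<a href="{escape(url, quote=True)}" rel="nofollow">{escape(label)}</a>'


def canonical_tag(url):
    """Render the <link rel="canonical"> for the <head> of the filtered page."""
    target = canonical_superset(url)
    return f'<link rel="canonical" href="{escape(target, quote=True)}">'


filtered = "http://example.com/shoes?color=red&page=2"
print(facet_link(filtered, "Red"))
print(canonical_tag(filtered))
```

The link goes in the listing page that offers the filter, and the canonical tag goes in the filtered page itself, pointing back at the version without the facet.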
Aside from the fact that she mentions nofollow+canonicalize first, she states no obvious preference. She does seem to imply #1 is better because "you can consolidate indexing signals from the unnecessary URLs with a searcher-valuable URL by adding rel=canonical." OK, but then she says it only "minimizes the crawler’s discovery of unnecessary URLs ... rel=nofollow doesn’t prevent the unnecessary URLs from being crawled (only a robots.txt disallow prevents crawling)."
I've never completely understood this overloaded use of nofollow to indicate distrust (and, historically, to sculpt). The only thing I can think of is that in the PageRank model, "nofollow" means don't flow probability/PR (the wandering visitor) here. +Alistair Lattimore +Traian Neacsu +Eric Wu? Allegedly, it doesn't work for sculpting anymore, though this would be a perfect example of when sculpting would be appropriate -- tens of thousands of URLs that are superfluous/duplicative but not quite duplicate content, and that still have to exist. Sculpting doesn't work, though, so that advantage is out.
On the flip side, she notes that nofollow+canonicalize "doesn’t prevent the unnecessary URLs from being crawled (only a robots.txt disallow prevents crawling)." What does she mean here? Does she mean that if the URL is linked externally it will still get crawled? Doesn't that mean all it takes to get you into trouble is some hacky mirror site that leaves off the nofollow?
HOWEVER, nofollow+canonicalize will preserve link equity if someone does link to some odd filtered URL, whereas, theoretically, a URL blocked in robots.txt would not. I just don't think those odds are high.
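For what it's worth, the robots.txt side is easy to sanity-check, because a disallow is evaluated against the URL itself rather than against whichever link led to it. Below is a minimal Python sketch using the standard library's urllib.robotparser. The /filter/ path and the example.com URLs are assumptions I made up for illustration, and Python's parser only does plain prefix matching, so it will not necessarily behave the same as Googlebot for wildcard patterns.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: assume every filtered listing lives under /filter/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /filter/
"""


def is_crawlable(url, user_agent="Googlebot"):
    """Return True if the sample robots.txt above allows fetching this URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# The answer depends only on the URL, not on how the crawler found it,
# so an external link without nofollow does not change the outcome.
print(is_crawlable("http://example.com/filter/color-red"))
print(is_crawlable("http://example.com/shoes"))
```

This only shows the crawl decision. It says nothing about link equity, which is the trade-off in question.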
For the record, I've been a proponent of #2 for a long time because I didn't trust nofollow+canonicalizing.
More questions than answers in some ways :)