Weird Search Results Demystified

An entertaining, thoughtful, and illuminating slideshow from +Dan Shure exploring why a particular Facebook page for a band ranks so well on a Google search for the term [fave].

Dan also does a great job of showing off some of the tools that he uses to explore search results rankings, and because of that, it turns into a nice example of some of the things you can do when you see a search result that doesn't look like it belongs where it is.

Understanding why anomalies like the top result on the [fave] search happen is one of the best ways to learn about SEO. :)
 
Thanks +Bill Slawski I was wondering what your thoughts might be on it! +Tom Critchlow and I still think there might be more going on with Google trying to predict intent (in that it somehow knows Facebook is the desired domain) and you of all people might have some thoughts about that?
 
You're welcome, +Dan Shure. I'm going to have to go over it again, and spend more time with the result to come up with some kind of hypothesis. It doesn't have the feel of an algorithmically determined navigational result, though there might be some kind of entity association thing going on. Just based upon pure term frequency (tf/idf), the page seems to do well, but that shouldn't be enough by itself. The page does have a toolbar PageRank of 4, some of which might be partially attributable to internal Facebook links, and that's fairly decent relative to some of the other PageRanks for other pages in those results.

Will dig deeper...
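
For anyone unfamiliar with the tf/idf idea mentioned above, here is a minimal sketch in Python of the textbook formula. The tiny corpus is invented purely for illustration, and nothing here is meant to reflect how Google actually scores pages.

```python
import math
from collections import Counter

def tf_idf(term, doc, corpus):
    """Textbook tf-idf: share of the doc's words that are `term`,
    multiplied by log(number of docs / docs containing `term`)."""
    words = doc.lower().split()
    tf = Counter(words)[term] / len(words) if words else 0.0
    df = sum(1 for d in corpus if term in d.lower().split())
    idf = math.log(len(corpus) / df) if df else 0.0
    return tf * idf

# Hypothetical documents (assumptions, not real page content).
corpus = [
    "fave hoboken band fave music fave shows",
    "my fave recipes for dinner",
    "best restaurants in hoboken",
    "weekend dinner ideas and recipes",
]

scores = [tf_idf("fave", d, corpus) for d in corpus]
for doc, score in zip(corpus, scores):
    print(f"{score:.3f}  {doc}")
```

A page that repeats a rare term scores higher than one that mentions it once, which is why term frequency alone can make a page look relevant without explaining a top ranking.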
 
+Bill Slawski wow thanks Bill!! I will likely put some more investigation into the other odd searches (yuit for youtube etc) ... but would love to hear anything you come up with :)
 
+Dan Shure So what may be going on here could be an algorithmic collision between a Google Suggested Query (spelling) Revision and an association between the (suggested/corrected) query and the entity Facebook.

Google may partially have transformed [fave] into [facebook fave] by correcting and completing [fave] into [facebook] and keeping [fave] when it should have discarded it.

Google did introduce Google Suggest spelling corrections (and local results within those suggested queries as well) in this Google Blog post from May of 2010:

http://googleblog.blogspot.com/2010/04/search-with-fewer-keystrokes-and-better.html

A couple of months before that, the patent application "Information Search System with Real-Time Feedback" (US patent application 20110191364) was published by Google:

http://appft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PG01&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.html&r=1&f=G&l=50&s1=%2220110191364%22.PGNR.&OS=DN/20110191364&RS=DN/20110191364

A snippet from the patent application:

[0080] The query reviser 256 is a system that takes input query terms and prepares suggested query revisions, refinements, reformulations, spelling corrections, prefix searches, and other functions that can modify a user's search query to possibly increase the probability that the search engine 250 will find what the user intended to look for. For example, a misspelled search term may be less likely to occur in an index 264 than the correct spelling, and therefore may reduce the chances of finding the correct information. The query reviser 256 may detect the misspelling, and offer a corrected spelling as at least part of a suggested query term.

Google also has a patent where they might identify a named entity in a query, and assume that the query is an implied site search. The patent is:

Query rewriting with entity detection
http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.htm&r=1&p=1&f=G&l=50&d=PTXT&S1=7,536,382.PN.&OS=pn/7,536,382&RS=PN/7,536,382

I wrote about it in:

Boosting Brands, Businesses, and Other Entities: How a Search Engine Might Assume a Query Implies a Site Search
http://www.seobythesea.com/2009/05/boosting-brands-businesses-and-other-entities-how-a-search-engine-might-assume-a-query-implies-a-site-search/

If we do a search for [fave facebook], the top result is the same page at the top of the search results on your search for just [fave]. It's then followed by three more results from Facebook (the implied site search).

So my hypothesis is that Google may be making the spelling "correction" and associating the revised query with facebook.com, then treating it as a site search of facebook.com while showing only the first result.

If I can find some other examples of this happening....

[yiut youtube] gives us the entity association, and the additional results from the implied site search.

[yuit] returns the same top result, but not the additional results, much like the search for [fave].

Is this an algorithmic flaw in how the Query Suggest auto-correct intersects with a query (named entity) association with YouTube in one case, and Facebook in the other?

It's possible...

Would the Facebook fave page (www.facebook.com/thefavehoboken) rank where it does if not for the entity association between the query [facebook] and the site facebook.com? Would the YouTube yiut video (tossnie yiut) rank as highly as it does if not for the entity association between the query [youtube] and youtube.com?

An entity association between a query term and a webpage could potentially move a result into a top ranking.

What's odd here is that Google would do the entity association for a corrected version of a query term, and still do a site search of the associated site for the uncorrected version of the initial query.
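
The hypothesized collision can be sketched as a toy query rewriter in Python. Everything in it is an assumption made for illustration: the lookup tables, the `rewrite_query` function, and the result counts are hypothetical stand-ins, not Google's actual data or logic.

```python
# Hypothetical lookup tables; the real signals are not public.
SPELL_REVISIONS = {"fave": "facebook", "yuit": "youtube"}
ENTITY_SITES = {"facebook": "facebook.com", "youtube": "youtube.com"}

def rewrite_query(query):
    """Return the implied site search (if any) for a query."""
    words = query.lower().split()

    # Case 1: the entity is named in the query, e.g. [fave facebook].
    # The remaining terms are searched on the entity's site, and
    # several results from that site are shown.
    for word in words:
        if word in ENTITY_SITES:
            rest = " ".join(w for w in words if w != word)
            return {"site": ENTITY_SITES[word], "terms": rest, "site_results": 4}

    # Case 2 (the suspected flaw): a lone term is "corrected" to an
    # entity name, the entity's site is picked up from the corrected
    # query, but the UNCORRECTED term is still what gets searched,
    # and only the first site result is surfaced.
    if len(words) == 1 and SPELL_REVISIONS.get(words[0]) in ENTITY_SITES:
        site = ENTITY_SITES[SPELL_REVISIONS[words[0]]]
        return {"site": site, "terms": words[0], "site_results": 1}

    # Otherwise no entity association applies.
    return {"site": None, "terms": query, "site_results": 0}

print(rewrite_query("fave"))
print(rewrite_query("fave facebook"))
```

In this sketch [fave] and [fave facebook] end up searching the same site for the same term, which would explain the shared top result, while only the explicit query gets the extra Facebook results.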
 
+Bill Slawski WOW! As I just mentioned on Twitter, this makes COMPLETE sense... it is definitely an intuitive sense that I had. The whole "collision" of two algorithms makes perfect sense, at my first impression.

As +Tom Critchlow and I suspected, there was a flaw going on in this query, and I think you've at least revealed the premise of what's happening here.
 
+Dan Shure It does look like some kind of flaw in Google's algorithm, and I'm picking out known potential reasons. Again, a hypothesis, but it seems to fit. As I said on Twitter, Google has lots of PhDs who didn't catch the problem, but you did. Nice job.
 
Well, +Dan Shure and +Tom Critchlow, you both did a great job in recognizing that something was wrong. Someone at Google should have caught it. The exploration of why the page shouldn't rank where it does, the methodology used to do that, and the identification of at least one similar result behaving in the same way were both very smart and presented in a very engaging way. :)
 
Love this analysis +Bill Slawski - I need more time to loop back around on all this and dig deeper! Thanks for chiming in tho!
 
Thanks, +Tom Critchlow. I'll look forward to hearing your thoughts once you've had the chance to look into it more. I was thinking too, that maybe someone from Google should be pointed at those particular search results. Not sure if they would acknowledge them as being a problem, but it might not hurt to have them fixed if they aren't supposed to work the way that they are.
 
+Bill Slawski Just don't point it out to them just yet (check your DMs) but I would like this to be broken for a few more days... although... what am I talking about, they'd probably take a LONG time to fix it, lol.
AJ Kohn
 
I'm a bit late to this discussion and sent something to +Bill Slawski yesterday but ... I'm not entirely convinced that entities are being used that much.

A lot of what I see could be the application of big data against machine learning. At last year's Inside Search event Mike Cohen mentioned that the advances (in voice search in particular) were not algorithmic in nature but simply because they were able to expand the machine learning data set.

I'm a fan of entities and have admired Freebase/Metaweb as far back as 2007 when a smart engineer put me on to it. I'm just not sure we're not seeing the continuing efforts to optimize based on MapReduce.

I'd like it to be entities, but I'm skeptical, though willing to be convinced. cc: +Dan Shure +Tom Critchlow