Was joking about the use of terms like Ninja and Guru earlier today. :)
But seriously, the praise and comments we've had for the ISOOSI Tuesday chats, aka #isoosichat, have been phenomenal, and very much appreciated. We all absolutely love making the shows, and it is more than gratifying to know that our audience enjoys them almost as much as we do.
Thanks to our regulars, our frequent helpers, and every single guest we've had the pleasure to have join us.
Imagine that pages in the Web might be given 2 sets of scores. These sets of scores are for "broad topic" queries, such as "I wish to learn about motorcycles".
The first score might be an authority score, based upon how well it answers that broad topic. The second score might be a hub score, where compilations of links have been collected that can be used to find authoritative pages for that broad topic.
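The two scores can be sketched with the iterative approach from the Kleinberg paper mentioned below: a page's authority score is the sum of the hub scores of pages linking to it, and a page's hub score is the sum of the authority scores of the pages it links to. This is only a minimal illustration; the page names, the tiny link graph, and the iteration count are made up for the example.

```python
def hits(links, iterations=50):
    """Compute hub and authority scores for a small link graph.

    links: dict mapping each page to the list of pages it links to.
    Returns (hub, auth), two dicts of normalized scores.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # Authority: sum of the hub scores of pages that link to this page.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for target in targets:
                new_auth[target] += hub[src]

        # Hub: sum of the authority scores of the pages this page links to.
        new_hub = {p: sum(new_auth[t] for t in links.get(p, [])) for p in pages}

        # Normalize so the scores do not grow without bound.
        a_norm = sum(v * v for v in new_auth.values()) ** 0.5 or 1.0
        h_norm = sum(v * v for v in new_hub.values()) ** 0.5 or 1.0
        auth = {p: v / a_norm for p, v in new_auth.items()}
        hub = {p: v / h_norm for p, v in new_hub.items()}

    return hub, auth


# Hypothetical pages for a "motorcycles" topic.
example_links = {
    "motorcycle_links_a": ["honda", "yamaha"],
    "motorcycle_links_b": ["honda"],
    "honda": [],
    "yamaha": [],
}
hub_scores, auth_scores = hits(example_links)
print(max(auth_scores, key=auth_scores.get))  # page with the highest authority score
print(max(hub_scores, key=hub_scores.get))    # page with the highest hub score
```

In this toy graph the page linked to by both link collections ends up as the strongest authority, and the collection that links to more authorities ends up as the strongest hub.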
The patent I've linked to was written when the inventors were at AltaVista, and it came into Yahoo's ownership when Overture purchased AltaVista, and then Yahoo purchased Overture. One of the inventors, Krishna Bharat, was also the inventor of the patent in the first post of this series, and the other inventor is Monika Henzinger. Reading anything you can from either is recommended.
A paper cited in the patent is one that should be read by anyone studying SEO and studying how pages may be ranked in search results, by Jon Kleinberg:
Authoritative Sources in a Hyperlinked Environment (pdf)
Definitely try to read through as much of the paper as you can before you move on to the patent - it becomes much easier to read and understand if you do so.
The point behind the patent is to improve upon the Hubs and Authorities Algorithm in the Kleinberg paper, to prevent topic drift when the focus is upon terms that may have more than one meaning (for example, Jaguar may refer to a car brand, a type of animal, or an NFL football player from Jacksonville).
The patent and the paper aren't from Google, even though Bharat and Henzinger both ended up working at Google, and you will see elements of Hubs and Authorities scores in their work.
There definitely are a number of algorithms and paths that Google and other search engines are pursuing, and I'm going to try to include many of those in this series to keep it from being myopic. :)
You order a pizza on your phone, and pay for it, and drive to pick it up. It's waiting for you as you arrive, since the cooks can track your location and get traffic estimates regarding your arrival time.
Last week, news spread of Google adding menus from restaurants to search results, which may have surprised a lot of people. More quietly, a patent application was published at the USPTO that describes how Google might work with merchants to allow people to make mobile orders and payments at places such as restaurants and pharmacies, enabling the merchant to track the location of orderers and synching that with preparation time.
There are those who see the word "semantic" and ask where the semantic markup or schema.org markup is. This post isn't for you.
There are those who see heading elements, and wonder whether or not the fact that they are often bigger and bolder on a page than other text makes those pages rank higher for the words within the heading. This post isn't for you.
There are those of you who see a list on a page (and it doesn't even technically have to use an HTML list element) and recognize that the items within the list could be ordered differently, such as alphabetically, by word length, or even randomly, and each of those list items would be just as valuable a list item as any of the others. And each would be just as close to the words in the heading of the list as any of the other list items.
And closeness is magical to search engines and SEO. Do a search for "ice cream" and the page that includes the phrase "ice cream" should be more relevant and rank higher than the page that includes the phrase, "I went to the store to buy cream, and slipped on the ice."
Not only are list items equal distances away from the heading of that list, but heading elements on a page are equal distance from every word in the substance that they head. I know this, because it's covered in Google's definition of "semantic closeness."
And each word on a page is an equal distance to the words in the title of that page. That's what the semantic meaning of a page title is, and that's included in Google's definition of "semantic closeness" as well.
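The difference between plain word distance and this kind of structural closeness can be shown with a toy sketch. It is not Google's definition: the function names, the tokenizer, and the fixed `HEADING_GAP` value are all assumptions made for illustration.

```python
HEADING_GAP = 1  # assumed constant distance between a heading and any of its list items


def tokenize(text):
    """Lowercase the text and strip simple punctuation from each word."""
    return [w.strip('.,"').lower() for w in text.split()]


def plain_distance(text, a, b):
    """Smallest gap in word positions between terms a and b in running text."""
    words = tokenize(text)
    pos_a = [i for i, w in enumerate(words) if w == a]
    pos_b = [i for i, w in enumerate(words) if w == b]
    if not pos_a or not pos_b:
        return None
    return min(abs(i - j) for i in pos_a for j in pos_b)


def heading_distance(heading, items, heading_term, item_term):
    """Distance between a heading term and a list-item term.

    Hypothetical rule: every item is the same distance from the heading,
    no matter where it appears in the list.
    """
    if heading_term not in tokenize(heading):
        return None
    if any(item_term in tokenize(item) for item in items):
        return HEADING_GAP
    return None


print(plain_distance("I love ice cream", "ice", "cream"))
print(plain_distance(
    "I went to the store to buy cream, and slipped on the ice.", "ice", "cream"))

flavors = ["vanilla", "chocolate", "strawberry"]
print(heading_distance("Ice cream flavors", flavors, "flavors", "strawberry"))
print(heading_distance("Ice cream flavors", flavors[::-1], "flavors", "strawberry"))
```

Reordering the list changes nothing in the second function, which is the point: position in the list does not move an item closer to or further from its heading.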
As I noted above, no schema.org markup was required to have semantic closeness. Meaning happens, and some HTML elements have meaning baked right into them, which goes beyond just how they present things on an HTML page.
So the next time that you see someone state that there is no correlation between the use of a heading element and ranking at Google, ask them if they accounted for semantic closeness and leave them scratching their head. If they don't get it, they probably never will.
Jon Kleinberg noticed one day that at certain times of the year his email was filled with specific topics. Around the time that midterms and final exams were going to happen, for example, his emails would focus upon test taking and extra office hours.
He noticed that this kind of behavior happened on the Web as well, where certain topics would be triggered by different events, in blogs, in news, in search queries, and so on. He looked at archives of things like presidential messages, for terms that would recur, and the events that triggered them, and started thinking of information in streams.
When traffic goes through a network, it isn't in a steady stream, but rather travels in bursts. Sometimes there are patterns to the bursts. Having a sense of topics that are hot, topics that have cooled off, others that may be seasonal or influenced by time of day or day of week, could be useful.
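As a rough illustration of spotting bursts in a stream, here is a simple stand-in: flag any period whose count is well above the average of the periods just before it. This is a plain trailing-average threshold, far simpler than the model in Kleinberg's work, and the window size, the factor, and the sample counts are assumptions for the example.

```python
def find_bursts(counts, window=3, factor=2.0):
    """Return the indexes of periods whose count exceeds
    `factor` times the average of the previous `window` periods."""
    bursts = []
    for i in range(window, len(counts)):
        baseline = sum(counts[i - window:i]) / window
        if baseline > 0 and counts[i] > factor * baseline:
            bursts.append(i)
    return bursts


# Hypothetical daily mention counts for one topic.
daily_mentions = [5, 6, 5, 4, 20, 22, 6, 5]
burst_days = find_bursts(daily_mentions)
print(burst_days)
```

A seasonal or day-of-week pattern would need a baseline taken from the same point in earlier cycles rather than from the days immediately before.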
A number of lists of ranking signals used by the search engines mention things like "freshness", and it's likely that algorithms for things like news or blog search do use "freshness" as an important signal, but when a search engine acts as a reference source, like a library, sometimes more mature results are what is being called for.
When Monika Henzinger published patents for Google on document inception dates for documents found on the web, those documents were dated based upon when they were first published, or first crawled by a search engine. Sometimes the rankings of those documents might be influenced by that date, and how they are influenced could depend upon the relative age of a set of search results. So, if a search for "declaration of independence" turned up documents that were more mature, there might be a preference to show older documents, and they might be boosted in search results. On a search for "Windows 8.1" the set of search results might tend to be a lot younger, and so newer documents might be boosted in search results.
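A loose sketch of that idea: look at the typical age of the documents in a result set, and boost the documents on the side of that age that the set leans towards. None of the details come from the patents; the median, the one-year threshold, and the boost size are hypothetical choices for illustration only.

```python
from statistics import median


def age_adjusted(results, mature_threshold_days=365, boost=0.1):
    """Re-rank results based on the relative age of the result set.

    results: list of (doc_id, base_score, age_days).
    If the set is mostly mature, older documents get the boost;
    otherwise younger documents do.
    """
    median_age = median(age for _, _, age in results)
    prefer_older = median_age >= mature_threshold_days

    adjusted = []
    for doc_id, score, age in results:
        favored = age >= median_age if prefer_older else age <= median_age
        adjusted.append((doc_id, score + boost if favored else score))
    return sorted(adjusted, key=lambda r: r[1], reverse=True)


# A mostly mature result set: older documents are favored.
mature_set = [("a", 0.80, 4000), ("b", 0.85, 30), ("c", 0.70, 5000)]
mature_order = [doc for doc, _ in age_adjusted(mature_set)]
print(mature_order)

# A mostly young result set: newer documents are favored.
young_set = [("x", 0.80, 400), ("y", 0.78, 10), ("z", 0.50, 20)]
young_order = [doc for doc, _ in age_adjusted(young_set)]
print(young_order)
```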
If there is a sudden increase in searches for "justin bieber canada", the bursty nature of the Web might cause fresher documents to rank higher, and we might see a "query deserves freshness" algorithm kick in where news articles and newer pages move up in search results.
Don't call it freshness, because sometimes mature pages are the ones that move up.
I couldn't help myself but publish this one - at a rate of one a day, it could take some time to get up to 100-200 ranking signals, and I don't know if I'm patient enough for that. :)
With the first couple of signals that I've written about, the idea of Google wanting to identify authority and hubs plays a strong role, and the idea that some pages are great resources that should rank highly comes out of that.
This patent focuses upon looking at user behavior signals for pages linked to from other pages to determine a reachability score for those pages. To a degree, it's similar to scoring pages that act as good Hubs, which are pages that tend to lead to authoritative pages.
I wrote a post about this patent that describes how it works titled:
Does Google Use Reachability Scores in Ranking Resources?
So pages that are great resource pages, based upon some measure of the quality of the links from those pages to other resources, could potentially have their rankings boosted.
In the book about Google by Steven Levy, we are told that Google values "Long Clicks" as a signal of quality. The patent does describe how Google might determine what a long click is, but doesn't use it as a direct ranking signal. Instead, it uses Long Clicks to determine the quality of pages that link to a number of other pages that result in long clicks. Those pages would likely be good Hubs pages. :)
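To make the idea concrete, here is a minimal sketch that scores a linking page by the share of clicks on its links that turn into long clicks. It is not the patent's formula: the 60-second threshold, the data layout, and the page names are assumptions for the example.

```python
LONG_CLICK_SECONDS = 60  # assumed threshold for what counts as a long click


def hub_quality(visits, long_click_seconds=LONG_CLICK_SECONDS):
    """Score each linking page by the fraction of its outbound clicks
    that were long clicks.

    visits: list of (linking_page, dwell_seconds), one entry per click
    on a link found on linking_page.
    """
    totals = {}
    longs = {}
    for page, dwell in visits:
        totals[page] = totals.get(page, 0) + 1
        if dwell >= long_click_seconds:
            longs[page] = longs.get(page, 0) + 1
    return {page: longs.get(page, 0) / totals[page] for page in totals}


# Hypothetical click log: (page the click started from, seconds on the target).
click_log = [
    ("hub_a", 120), ("hub_a", 90), ("hub_a", 5),
    ("hub_b", 3), ("hub_b", 70),
]
scores = hub_quality(click_log)
print(scores)
```

The long click is never used to score the page that was clicked on here, only the page that sent the visitor there, which matches the indirect use described above.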
This ongoing series will look at some of the different ranking signals that Google has likely used in the past to rank search results in response to a query.
In 2001, Krishna Bharat filed for a patent with the USPTO that was granted in 2003. What it did was take the top search results (top 100, top 1,000, etc.) and boost some of those results based upon how often they "cited" or linked to each other within that "local" setting.
According to the patent, search results are ordered the way that they would normally be based upon things such as relevance and importance (PageRank), and then they are examined again and a local relevance score is added into the mix to use to change the order of those results:
Further, the method ranks the generated set of documents to obtain a relevance score for each document and calculates a local score value for the documents in the generated set, the local score value quantifying an amount that the documents are referenced by other documents in the generated set of documents
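A small sketch of that re-ranking step: count how often each document in the result set is linked to by the other documents in the same set, then mix that local score into the original relevance score. How the patent actually combines the two scores is not shown here; the weighted sum, the weight, and the document names are assumptions made for illustration.

```python
def rerank_with_local_score(results, links, weight=0.5):
    """Re-rank a result set using links between its own documents.

    results: list of (doc_id, relevance_score) in their original order.
    links: dict mapping doc_id to the set of doc_ids it links to.
    """
    in_set = {doc for doc, _ in results}

    # Local score: number of links received from other documents in the set.
    local = {doc: 0 for doc in in_set}
    for src in in_set:
        for target in links.get(src, ()):
            if target in in_set and target != src:
                local[target] += 1

    # Assumed combination: relevance plus a weighted local score.
    combined = [(doc, rel + weight * local[doc]) for doc, rel in results]
    return sorted(combined, key=lambda r: r[1], reverse=True)


# Hypothetical top results and the links among them ("z" is outside the set).
top_results = [("a", 1.0), ("b", 0.9), ("c", 0.8)]
link_graph = {"a": {"c"}, "b": {"c", "z"}, "c": {"a"}}
new_order = [doc for doc, _ in rerank_with_local_score(top_results, link_graph)]
print(new_order)
```

Links to or from pages outside the result set are ignored, which is what keeps the score "local".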
You may have heard or read that Krishna Bharat rewrote how Google worked in the early 2000s by applying the Hilltop Algorithm to how it works. The "other references" section of this patent refers to a paper by Bharat from before he joined Google that describes what Hilltop is and how it works:
Hilltop: A Search Engine based on Expert Documents
Google has since published a number of patents and papers describing how some local results may be boosted or demoted based upon other signals, and I'll be including some of those in this series.
The online encyclopedia faces a number of challenges, including the hiring of a new boss, and a reduction in the numbers of people making contributions to it.
Will efforts to make it more friendly to mobile devices, to broaden its readership to new places, and to adapt to wearable devices such as Google Glass succeed?
How well does the world's information scale? Will Wikipedia continue to be a favorite source of Google knowledge panels, or will its role there diminish as Google builds up other sources such as Freebase? If Wikipedia decides to start showing advertisements (a possibility mentioned in the article), will the commercialism of that approach alienate contributors to Wikipedia?
I usually recommend that most people interested in learning SEO spend some time as a Wikipedia editor, learning how it's updated, how its notability policy shapes what is included within it, and why Google might often see it as an authoritative source of information.
- Go Fish Digital, Director of Search Marketing, 2013 - present
- SEO by the Sea, President and Internet Marketing Consultant, 2005 - present
I presently live in the Virginia Piedmont, about 50 miles west of Washington, DC in a county filled with horse pastures and farm fields.
I enjoy reading fiction and science fiction, listening to most types of music, delving into the history behind small towns, outdoor photography, and exploring nature.
I am the Founder and President of SEO by the Sea, and I like working with people on their web sites, to help make them easier to find, and easier to use.
Some posts I've written in the past that focus upon analyzing patents from the search engines:
- Google's Reasonable Surfer: How the Value of a Link May Differ Based upon Link and Document Features and User Data
- The Google Rank-Modifying Spammers Patent
- Google’s Agent Rank / Author Rank Patent Application
- The Google Hummingbird Patent?
I am the Director of Search Marketing at Go Fish Digital.
I am often called a patent analyst or patent expert or patent guru by bloggers and media writers, but my job is not to analyze or interpret patents. I do that for fun, and to learn things about search engines and search that I otherwise couldn't. I consider it performing due diligence and feeding my curiosity. The information behind many business models and algorithms that search engines use is being made public, and it's worth taking the time and making the effort to read through the patents and try to understand what they say and why they say it.
- Widener University School of Law, Law
- University of Delaware, English
- Hillsborough High School