Locale-aware crawling by Googlebot
Today we're taking our first small steps to support crawling, indexing, and ranking of locale-adaptive pages.
What are these? A locale-adaptive page is one that changes its response based on the perceived geographic location or language preference of the visitor. For example, a page may dynamically serve different content to users in the USA and Canada on the same URL, may block access from some countries, or may respond differently based on the visitor's Accept-Language header.
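To make the Accept-Language case concrete, here is a minimal Python sketch of how a single URL might pick its response from that header. The GREETINGS table and both function names are hypothetical, and the parsing is deliberately simplified (it reads only the q parameter and matches on the primary language subtag), so treat it as an illustration rather than how any particular site does it.

```python
def parse_accept_language(header):
    """Return the language tags in an Accept-Language header, best first."""
    weighted = []
    for position, part in enumerate(header.split(",")):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        quality = 1.0  # in HTTP, a missing q parameter means q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed weight as "not acceptable"
        if quality > 0:
            # Sort by descending quality, then by original position.
            weighted.append((-quality, position, tag.strip().lower()))
    return [tag for _, _, tag in sorted(weighted)]


# Hypothetical content served on one URL, keyed by primary language subtag.
GREETINGS = {"en": "Hello", "fr": "Bonjour", "de": "Hallo"}


def choose_greeting(accept_language, default="en"):
    """Pick the response body for the visitor's preferred supported language."""
    for tag in parse_accept_language(accept_language or ""):
        primary = tag.split("-")[0]
        if primary in GREETINGS:
            return GREETINGS[primary]
    return GREETINGS[default]
```

A request with "fr-CA,fr;q=0.8,en;q=0.5" would get the French response here, while a request with no header, or with only unsupported languages, falls back to the default.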
Today Googlebot gets two new capabilities:
1. Geo-distributed crawling, where you may start seeing real (verifiable!) Googlebot user-agents crawling from IP addresses that appear to be coming from outside the USA.
2. Language-dependent crawling, where Googlebot may set different Accept-Language headers in the HTTP request.
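Since the geo-distributed crawlers are described as verifiable, one way to check a visitor that claims to be Googlebot is a reverse DNS lookup followed by a forward lookup, whatever country the IP address appears to be in. The Python sketch below assumes the crawler hostnames end in googlebot.com or google.com; the function names are hypothetical, and the suffix list should be checked against Google's current documentation before relying on it.

```python
import socket

# Assumption: hostname suffixes used by Google's crawlers.
GOOGLE_CRAWLER_DOMAINS = (".googlebot.com", ".google.com")


def _reverse_lookup(ip):
    """Return the hostname that the IP address resolves back to."""
    return socket.gethostbyaddr(ip)[0]


def _forward_lookup(hostname):
    """Return the set of IP addresses the hostname resolves to."""
    return {info[4][0] for info in socket.getaddrinfo(hostname, None)}


def is_verified_googlebot(ip, reverse_lookup=_reverse_lookup,
                          forward_lookup=_forward_lookup):
    """Reverse-then-forward DNS check; the user-agent string is not trusted."""
    try:
        hostname = reverse_lookup(ip).rstrip(".").lower()
        if not hostname.endswith(GOOGLE_CRAWLER_DOMAINS):
            return False
        # The hostname must resolve back to the same IP address.
        return ip in forward_lookup(hostname)
    except OSError:
        # Covers socket.herror and socket.gaierror (failed lookups).
        return False
```

The two lookup functions are passed in as parameters so the check can be exercised without network access; in production the defaults would be used, ideally with the results cached.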
A few important points:
1. We still (very strongly) recommend having separate URLs for different locales and using rel-alternate-hreflang annotations for them. Separate URLs are better for users, and that's what really counts. Locale-aware crawling is for the few edge cases where it's not possible for you to have separate URLs.
2. It's early days: the countries that Googlebot will appear to come from, and the Accept-Language headers it may send, do not cover all combinations of countries and languages around the world. We will also continue to tweak things as we build out this feature. This is another reason to have separate URLs.
3. Locale-aware crawling gets enabled algorithmically if we detect your site may benefit from it. You don't need to do anything :)
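For the separate-URL setup recommended in point 1, the rel-alternate-hreflang annotations are link elements that every locale version of a page carries, listing all versions including itself. Here is a small Python sketch that renders them; the hreflang_links helper and the example.com URLs are hypothetical.

```python
from html import escape


def hreflang_links(alternates):
    """Render one <link> element per locale from a {hreflang: url} mapping."""
    return "\n".join(
        '<link rel="alternate" hreflang="{}" href="{}" />'.format(
            escape(code, quote=True), escape(url, quote=True)
        )
        for code, url in alternates.items()
    )


# Hypothetical locale versions of one page.
print(hreflang_links({
    "en-us": "https://example.com/us/",
    "en-ca": "https://example.com/ca/",
    "fr-ca": "https://example.com/ca/fr/",
}))
```

The same block of link elements would go in the head of each of the three pages, so that the versions all point at one another.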
Dig in more:
1. rel-alternate-hreflang: https://support.google.com/webmasters/answer/189077
2. Blog post: http://googlewebmastercentral.blogspot.com/2015/01/crawling-and-indexing-of-locale.html
3. Help Center about locale-aware crawling: https://support.google.com/webmasters/answer/6144055