Is http://schema.org preferred over https://schema.org? Is https://schema.org wrong?
My message to the schema.org mailing list is copied and pasted verbatim below.
My apologies to those who have, as a result, seen this twice, but I received no response there and wanted to reach out to others for input on this important issue.
Well, an "important issue" if https://schema.org is wrong yet webmasters think it's preferred - rather less important if either protocol works equally well, but still nice to have clarity.
I'll just add for any non-SEOs reading this that SEO best practices for URL and website canonicalization are straightforward, well-accepted, and well understood:
- Pick the www or non-www form of a domain you want to use, and 301 redirect the non-preferred form to the preferred form
- Pick the http:// or https:// as the protocol you want to use, and 301 redirect the non-preferred form to the preferred form
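As a rough illustration of those two rules, here is a minimal Python sketch of the decision a server would make before issuing a 301. The site `www.example.com`, the choice of https, and the function name `canonical_redirect` are all hypothetical, picked only for the example.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for illustration only: a hypothetical site that has picked
# https:// and the www form as its preferred (canonical) variant.
PREFERRED_SCHEME = "https"
PREFERRED_HOST = "www.example.com"
KNOWN_HOSTS = {"example.com", "www.example.com"}


def canonical_redirect(url):
    """Return (301, target) when url is a non-preferred form, else None."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in KNOWN_HOSTS:
        return None  # not one of this site's hostnames, nothing to do
    if parts.scheme == PREFERRED_SCHEME and host == PREFERRED_HOST:
        return None  # already the preferred form, serve it normally
    # Keep path, query and fragment; swap in the preferred scheme and host.
    # Any explicit port in the original URL is dropped in this sketch.
    target = urlunsplit(
        (PREFERRED_SCHEME, PREFERRED_HOST, parts.path, parts.query, parts.fragment)
    )
    return 301, target


print(canonical_redirect("http://example.com/page?a=1"))
print(canonical_redirect("https://www.example.com/page"))
```

In practice this logic usually lives in the web server or CDN configuration, not in application code; the sketch only shows the mapping.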
To use Thing as an example, both http://schema.org/Thing and https://schema.org/Thing return a 200 response.
Is http:// preferred though, and is https:// actually incorrect?
The Meusel and Paulheim paper on fixing schema.org errors buckets use of "the https protocol" under common errors. And a cogent Stack Exchange answer says that one should use http, noting that "Typically, user agents wouldn’t dereference these URIs."
So, sponsors/ontologists, what's the official story? :)
This keeps coming up because for many months now Google has been encouraging webmasters to use https:// for their sites. Because Google has tied this explicitly to improved search engine rankings, the audience most likely to consume and act on this information - search marketers - is the same group most likely to be driving schema.org implementation on their sites. And though it conflates web page consumption with dereferencing of URIs, webmasters have nonetheless been observed using https://schema.org and justifying doing so because of this Google initiative.
If https is incorrect, then there are things that can be done to mitigate its use:
- State the preference or requirement for http:// in the documentation.
- Add a rel="canonical" statement to each schema.org page where the href value uses the http:// form of the URL. Not only would that send a clear message to any human examining the canonical, but it would also send a message ("a strong hint" in the words of Google) to the search engines not to index the https:// form, so those URLs wouldn't be as likely to surface in search results (there are currently 1,890 https://schema.org URLs in Google, 31,000 in Bing).
- Tangentially, use of a canonical would also stop the propagation of www.schema.org URLs (currently just one www page indexed in Google, but 31,800 in Bing).
- 301 redirect https://schema.org/* to http://schema.org/* - essentially resolving all technical issues with one stroke. Note that an open GitHub issue proposes redirecting www.schema.org/*
but doesn't wrap a secure to non-secure redirect in this, and would actually redirect "https://www.schema.org/Person
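
To make the canonical and redirect options above concrete, here is a minimal Python sketch. It assumes, purely for illustration, that http://schema.org turns out to be the preferred form, which is exactly the open question of this post. The helper names `to_http_form` and `canonical_link` are hypothetical and do not reflect how schema.org is actually served.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

# Assumption for illustration: the non-www http:// form is the preferred one.
SCHEMA_HOSTS = {"schema.org", "www.schema.org"}


def to_http_form(url):
    """Map any https:// or www variant of a schema.org URL to http://schema.org/..."""
    parts = urlsplit(url)
    if (parts.hostname or "").lower() not in SCHEMA_HOSTS:
        raise ValueError("not a schema.org URL: %r" % url)
    return urlunsplit(
        ("http", "schema.org", parts.path or "/", parts.query, parts.fragment)
    )


def canonical_link(url):
    """Build the rel="canonical" element a page at `url` could carry."""
    return '<link rel="canonical" href="%s">' % escape(to_http_form(url), quote=True)


print(to_http_form("https://www.schema.org/Person"))
print(canonical_link("https://schema.org/Thing"))
```

The same mapping covers both mitigations: `to_http_form` gives the target of a would-be 301, and `canonical_link` gives the softer hint for pages that keep returning 200.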
Robert Meusel and Heiko Paulheim, Heuristics for Fixing Common Errors in Deployed schema.org
https - Secure and non-secure Schema.org Markup? http://bit.ly/1HE4ZwH
CODE: redirect http://www.schema.org/Person · Issue #4 · schemaorg/schemaorg https://github.com/schemaorg/schemaorg/issues/4
Official Google Webmaster Central Blog: HTTPS as a ranking signal http://googlewebmastercentral.blogspot.ca/2014/08/https-as-ranking-signal.html