Microdata, yea or nay?

In the latest post on The Beautiful, Tormented Machine (always loved that blog title), +Manu Sporny starts tolling the bell for microdata.

Are reports of microdata's impending death exaggerated, or should we be heading up to the attic to find the box with black crepe paper?

Ongoing support for and development of microdata - or lack thereof - is of no small significance for search marketers, as we have to use something to mark up our web pages with different schemas and vocabularies (chiefly schema.org - more on that later), and obviously there's a cost to investing in the wrong technology.

An informal poll for the SEOs here concerning those markup choices...

1.  Do you use microdata, RDFa or RDFa Lite to mark up schema.org?

2.  Do you find microdata easier to use than one or the other of the RDFa alternatives?

3.  Do you have confidence that Google, Bing, Yahoo and Yandex consider HTML marked up with RDFa or RDFa Lite every bit as valid as HTML marked up with microdata, and do you have confidence that Google - specifically - is as likely to generate rich snippets from documents encoded with RDFa/RDFa Lite as it is from microdata?

Note:  At time of writing, Bing's table at http://binged.it/RZy4Jm still has "No" in the column under "RDFa support" for the "Scenario" (their column heading) "Products and offers".  If true, and if the search engines otherwise treat microdata- and RDFa-encoded HTML equivalently, this is a fairly compelling reason to favor microdata over RDFa.  That is, of course, if you care about Bing (it's hard to say that US-facing sites shouldn't), and if that table is up-to-date (all of the microdata examples, as far as I know, use data-vocabulary.org, despite long-standing stated Bing support for schema.org).

I tested RDFa and RDFa Lite markup that included schema.org/Product and schema.org/Offer in Bing's Markup Validator:
http://www.seoskeptic.com/examples/product-rdfa.html
http://www.seoskeptic.com/examples/product-rdfa-lite.html

For RDFa I got this message:
We are not seeing any markup on this page.

For RDFa Lite the validation was incomplete - that is, Bing listed only the first property for any declared type.  But this is also the case with schema.org types other than Product or Offer, suggesting that it's the tool that's buggy.

I've posed some questions concerning this to a Bing-ite (Binger?).  I'll let y'all know if I get a response.

Neither Google's Structured Data Testing Tool nor the RDFa Play site had any problems with the markup, by the way.

4.  Do you really know how to use itemref properly? :)

5.  Do you feel compelled to use microdata because it seems to be the de facto structured markup favored by Google?  Or even de jure - for example, only microdata and JSON-LD are listed as supported formats for schemas in Gmail (http://bit.ly/16NPwpo).

6.  Do you feel compelled - or at least inclined - to use microdata because of the lack of RDFa (or RDFa Lite) examples on schema.org?  Would you feel better about using RDFa if it had a presence in the official documentation?

A note to semantic web technologists that may be rolling their eyes at this muttering "I am so over the RDFa/microdata debate!"  This is actually of significant consequence to SEOs, who tend to do what Google recommends they do and rely on code examples.  If SEOs think Google wants microdata, they'll produce microdata.  If there's any question that RDFa might not be as beneficial for search visibility as microdata, they'll favor microdata.

My answers will, I promise, follow shortly. :)

#rdfa #microdata #schemaorg  
Full disclosure: I'm the chair of the RDFa Working Group and have been heavily involved during the RDFa and Microdata standardization initiatives. I am biased, but also understand all of the nuanced decisions that were made during the creation of both specifications.
 
Appreciate your thorough answer +Jarno van Driel ... more commentary to come (I really should do some, like, work today:).
 
+Jarno van Driel did you read the article that was linked to? (I only ask because your response seems strange considering the facts on the ground) http://manu.sporny.org/2013/microdata-downward-spiral/

That said, I do agree with you. SEO folks will implement whatever Google says to implement. With respect to Microdata vs. RDFa simplicity, what do you think of RDFa Lite? Did you know that you can just search/replace most Microdata pages w/ RDFa Lite attributes and you get the same data? That is, the only difference is typically the name of the attributes.
 
My quick note here on relative difficulty was that I have typically used microdata because I found it easier than RDFa, but I still found I had a long and steep learning curve when it came to microdata.  And RDFa Lite is definitely easier to use than RDFa - though as I work with either of them I find them easier and easier.

+Jarno van Driel while I know that +Manu Sporny is an RDFa advocate (a fact, Manu, that I appreciate you're always upfront to disclose) I find it very difficult not to agree with him when it comes to the utility of RDFa vs microdata (see http://bit.ly/14CW1YV and http://bit.ly/LnCYzh).  It is simply more extensible.

itemref is a terribly useful feature of microdata.  As the three of us have discussed it and the equivalent proposed RDFa utility here before I won't go through it again (see http://bit.ly/16Ss2kS) except to say again that those that have been dismissive of its utility clearly aren't accustomed to marking up real-life code in the wild, which is indeed (above reference) as often as not "long, invalid, messy and unstructured HTML...."
 
This article hit me at an interesting time. I just finished reading this interview with +Jeff Dean at Google: 

http://www.bizjournals.com/seattle/blog/techflash/2013/08/google-scientist-jeff-dean-on-how.html

Here's an interesting quote:

"We have the start of being able to do a kind of mixing of supervised and unsupervised learning, and if we can get that working well, that will be pretty important. In almost all cases you don't have as much labeled data as you'd really like. And being able to take advantage of the unlabeled data would probably improve our performance by an order of magnitude on the metrics we care about. You're always going to have 100x, 1000x as much unlabeled data as labeled data, so being able to use that is going to be really important."

So if you treat 'unlabeled data' as a proxy for 'unstructured data', you can get a sense of how big the challenge is for Google. The ability for them to build adoption of structured data is clearly a priority. 

My wild guess is that, in support of that effort, they may play it both ways: Continue to unofficially support RDFa/Lite (why throw out perfectly good structured data?) while officially pushing the simplest solution for the rest of the unstructured data on the web. In that scenario, search marketers could choose their own route.

Now, whether or not they support any use cases for the additional extensibility of RDFa? That seems unlikely. Why move the herd back towards RDFa with a tempting rich snippet only available through that markup?
 
Excellent points +Matthew Brown.

Certainly the "there's oodles more unstructured than structured data out there, and we need to understand it" is something of a search engine theme, and understanding this essential truth is, I think, really useful in deciphering how the search engines make use of structured data, and why they might encourage its use.

For example, clearly schema.org +/or rel="canonical" +/or rel="prev" & rel="next" +/or Google authorship isn't necessary for inclusion in Google "in-depth articles," but when you see the prominence and (so far) stickiness of those snippets, the suggestion by Google that a site can improve their chance of appearing there by using these technologies provides a pretty big carrot to publishers - with the benefit to Google that they now have better data from whatever.com, regardless of whether or not whatever.com is ever rewarded with an in-depth article.

Regarding the extensibility of RDFa, this is one of those areas where semantic web advocates and search players (both search engines and SEOs) have different agendas.  That RDFa is so versatile and so extensible makes it a great choice for building out the semantic web, but that it doesn't do a significantly better job of encoding schema.org types and properties than microdata makes it a basically equivalent tool for search marketers.  And the search engines doubtlessly well understand that adoption will be higher if they reduce (or downplay) the choices available to webmasters (all internet marketers are familiar with the concept of removing competing calls-to-action in order to improve the overall conversion rate), so the lack of RDFa examples on schema.org is not surprising.
 
+Jarno van Driel your criticisms seem to fall into two areas: The first is based on your belief that using Microdata is easier (or seems to be easier) than RDFa Lite, the second is with the amount of effort you believe it takes to participate in the W3C.

To address the criticisms that fall into the first category:

"the syntax of RDFa (and its light variant) is too messy"

Changing schema.org Microdata to schema.org RDFa Lite, in most cases, is a simple search and replace of properties like itemprop, which is Microdata, with property, which is RDFa. Quite literally, the only thing that changes are the HTML attribute names you use. So, I need you to explain in a bit more detail why you think RDFa is "messier".
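
A minimal sketch of that search-and-replace, assuming the common case where `itemscope` sits directly before an `itemtype` that points at schema.org. The function name and the sample markup are made up for illustration, and `itemref` and other edge cases are not handled:

```python
import re

def microdata_to_rdfa_lite(html: str) -> str:
    """Rename Microdata attributes to their RDFa Lite counterparts (sketch only)."""
    # itemscope itemtype="http://schema.org/X" -> vocab="http://schema.org/" typeof="X"
    html = re.sub(
        r'\bitemscope(?:="[^"]*")?\s+itemtype="(https?://schema\.org/)([^"]+)"',
        r'vocab="\1" typeof="\2"',
        html,
    )
    # itemprop="name" -> property="name"
    html = re.sub(r'\bitemprop=', 'property=', html)
    return html

# Hypothetical sample markup, not taken from any real page.
sample = (
    '<div itemscope itemtype="http://schema.org/Product">'
    '<span itemprop="name">Widget</span></div>'
)
converted = microdata_to_rdfa_lite(sample)
print(converted)
```

A regex pass like this is only good enough for a quick trial on simple templates; real pages should be checked afterwards in a testing tool such as the RDFa Play site.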

"developers I know simply don't have the time to really study 'RDFa' and the vocabularies it supports"

Schema.org is such a vocabulary, why not just use that with RDFa Lite? RDFa is used for doing things like medical data and scientific publishing markup, but if developers aren't marking that information up, they have no need to learn those vocabularies. There are literally hundreds of vocabularies that RDFa supports, but since your group is interested in SEO, schema.org and OGP are the only ones that matter to you. I think it's a false impression that you need to learn all those other vocabularies... just stick with RDFa Lite and schema.org and you'll be fine.

"sales persons find it nearly impossible to get clients to pay for RDFa, heck it's already difficult to get clients to invest in schema.org in combination with Microdata"

Your sales people shouldn't be trying to "sell" RDFa. They should be trying to sell better representation in Google search results. Part of that is schema.org markup, which can be done in either RDFa or Microdata - Google supports both.

To address the criticisms that fall into the second category: 

"W3 has the prejudice of being too 'elite' and that keeps a lot of folks away from it"

I understand this criticism, as I thought the same thing before getting involved at W3C. It's unfortunate that people have this view of W3C because it's completely unfounded. Sure, there are fairly academic groups at W3C, but most of them (like the HTML WG, CSS WG, RDFa WG, JSON-LD, etc.) are focused on developers and trying to make their lives easier. The majority of the RDFa WG, for instance, was made up of people who do Web development in some capacity for a living. Many of them participated in the group at no financial cost; they were simply some of the best people in their field and they brought something unique to the table (like the perspective of being a Drupal developer, or someone publishing data for the UK government, or someone publishing legal data for Creative Commons, or someone publishing financial/product data in a decentralized way, etc.).

If you want to get involved at W3C, look down this list of the active groups:

http://www.w3.org/community/groups/

Join the mailing list, introduce yourself, ask how you can help.

"The folks I know who have tried to participate have moved away from participating because of the investment it takes. It simply gets in the way of work that has to be done to be able to put food on the table."

My company is in the exact same position. Small start-up, lots of customer work, not enough time, not enough money. We participate in community groups, mostly, which don't cost a dime to participate in. Even the Working Groups will invite you to participate if you can contribute some unique perspective (as an Invited Expert). Many Working Groups are 50% invited experts and 50% member companies.

As far as time, yeah, we're all crunched for time. However, we find that knowing what technology is coming down the pipe 3-5 years in advance of our competitors gives us a serious competitive advantage. Especially when we're the ones building the technology at W3C. This sales line always works: "Do you want to hire company XYZ, or do you want to hire the people that built the standard (us)?" It works really well when trying to land a potential client.

"I have to admit I didn't even know there was a Microdata community one could participate in"

There isn't one, that was sort of the point of the blog post. The closest you'll get is the WHATWG IRC channel on freenode, and they don't really like talking that much about Microdata on there.

"Which brings me back to my question to you: if I were to get involved in Microdata and even get others to participate, what type of commitment (in hours) are we talking about and how should one go about throwing this into the mix with day-to-day work?"

The amount of commitment is up to you. I probably put in close to 10 hours a week on standards-related activities (this includes chairing 2 different groups, which is very time intensive). For just regular participation, most people put in just 2 hours a week. Granted, if you're trying to get a new community off of the ground, that's more like 20 hours a week for the first 3 months, then tapering off to 10 hours a week thereafter. You get out what you put into it. Sometimes it just helps to track the mailing list and see what people are talking about.

One of the things that would be helpful is for you to use examples and be specific when you criticize RDFa. My job is to go out here and try to figure out pain points and problems people are having with the spec so that we can fix them in RDFa 2.0. In many cases, I find that the people criticizing the technology don't quite understand it (which is partly a documentation and evangelization problem, which we also need to fix). If you want this stuff to get better, solid examples of pain points go a long way.
 
Fascinating article from +Manu Sporny and even more so the discussion here, but also frustrating. This feels like a VHS versus Betamax debate. As an average site owner and developer, I don't want to get embroiled in a whose-technology-is-best argument; I just want to know which standard the masses will adopt. Or in reality, I'll do whatever the site that sends me 80% of my traffic wants me to do.

I read all of +Jarno van Driel's comments and just kept nodding my head. There's a balance to be achieved between staying ahead of the curve/bleeding edge and investing time and resources, especially for smaller companies. If that time is spent backing the wrong horse, then it can have expensive consequences.

After years of using microformats and hreviews, and bolting on bits of structured data, it was time to clean up and standardise, and that was microdata. Having invested the time and energy to deploy it, my heart sinks at the prospect of replacing it in 12/24 months if it turns out that this format has run its course.

This is a great community, and the place to learn. I'm not academic or a thought leader, so will let the first wave of innovators pass through. But I want to be there heading the second wave as an early adopter, enthusiast, and hopefully commercially capitalizing on the benefits. 

When it comes to the fate of microdata, I suspect that we average folk will simply wait for Google to decide, and as Manu says, SEOs will simply do what Google tells them to do...
 
0. RDFa is very versatile and makes it possible to use many dictionaries for semantic markup. But this kind of semantic markup is, firstly, very sophisticated for users and, secondly, not useful for SEO, because search engines don't need this bunch of dictionaries to work out what a site's content is about. So RDFa can be used for semantic markup, but this kind of semantics isn't SEO semantics; it is semantics as l'art pour l'art.

If we speak about Semantic SEO, we must speak about using the Schema.org dictionary (and the things recently included in it, like GoodRelations etc.). In this case using RDFa is redundant - RDFa is, imho, for other purposes.

1. I use microdata. For me, producing semantic website markup has one and only one purpose: to let search engines better understand my content. And Schema.org provides Microdata first and foremost for marking up, so the decision wasn't hard. The search engines behind Schema.org could have chosen RDFa, but they didn't. W3C doesn't produce any search results, so I will move my code to RDFa only after Schema.org (and the search engines behind it) do so officially. Until that happens I see no reason to think about RDFa from an SEO standpoint.

2. Microdata isn't easier to implement than RDFa - if we are talking about using the same dictionary (Schema.org), the markup is already practically the same.

3. Google, Bing, Yahoo and Yandex will support RDFa until they realize that the number of sites using RDFa has become statistically so small that it can be ignored without loss of information. Or, if the number of sites doesn't become that small, they will include RDFa in the official Schema.org documentation.

4. Yea, itemref is a mighty instrument, especially for SEO purposes.

5. It seems to me that it is both the de jure and the de facto standard: most examples are in Microdata, and RDFa isn't present in the documentation. It looks as if RDFa were a product near the end of its life, shortly before support stops, with Google giving RDFa users a grace period to move old markup to the new standard. The current trend at W3C gives a different impression, but I'll stick to my opinion until I read otherwise on the Schema.org blog.

6. No, I feel neither compelled nor inclined to use Microdata - I just use whatever is (mostly) recommended by the party that sets the standard, and in our case that is Microdata, which is most recommended by Google. If RDFa were present in the official documentation, I would look at Google's help pages to see which format has the most examples: if it were Microdata, I would keep using Microdata; if it were RDFa, I would conclude that Google likes RDFa more than Microdata and move my markup to RDFa.
 
+Jarno van Driel  We unavoidably come to the "which is best" discussion, because Google/Schema.org preferred Microdata but W3C preferred RDFa (if I rightly understand the "downward spiral" thread by +Manu Sporny ). But we must discuss it from the standpoint of "which is better for which purpose", and with this approach I tend to say "for general web semantics RDFa is better, cause it is versatile and makes using many dictionaries possible. But for semantic SEO is Microdata better, cause more unified and favoured by search engines".
 
+Jarno van Driel ok, so we have a lot of information in this thread. Lots of good points on all sides. I'd like to start focusing on what we can do about these problems:

"I find myself having a conundrum surrounding Microdata/schema.org/RDFa/Vocabularies and it is unclear to me where I should/could go to post my problems"

Post your problems here, we will respond: http://lists.w3.org/Archives/Public/public-rdfa/

That mailing list is set up to talk about RDFa problems that people are having. We have all of the experts that created RDFa on there to help you. Everyone there is pretty knowledgeable about Microdata as well, so we could help you there too, but don't be surprised if we try to convince you to use RDFa instead. :P

"I would have to reply that many a time when I made a technical post in the form of a question, idea or problem (through websites, forums and/or by email) I got almost no reply or only ended up with more questions"

When all else fails, ask the people that created RDFa. You'll find that we're a very friendly, helpful bunch. More importantly, when we start seeing people have problems, we are motivated to try and fix them by changing RDFa (case in point: RDFa Lite).

"itemref doesn't exist for RDFa"

It does, it's called Property Copying and the feature is described here:

http://www.w3.org/TR/html-rdfa/#property-copying

"They are different ways to accomplish more or less the same."

Well, except that Facebook supports only RDFa w/ their Open Graph Protocol. So if you want good SEO for Google, and good pages for Facebook, RDFa is the only language that supports both. Look at the bigger picture.

I have requested that the schema.org folks allow us to add RDFa Lite examples to schema.org on multiple occasions. We even converted all of the schema.org examples over to RDFa 1.0 over 2 years ago: 

https://github.com/mhausenblas/schema-org-rdf/tree/master/examples

For example, here's the Person example:

https://github.com/mhausenblas/schema-org-rdf/blob/master/examples/Thing/Person/Person.rdfa

We did that hoping that schema.org would integrate the examples into the site. We had a group of people that had volunteered to update all the documentation as well. Unfortunately, nothing came of that initiative (they didn't integrate any of the examples). When RDFa Lite came out, we offered to update all of the examples to RDFa Lite, or just even provide RDFa Lite examples for schema.org... no response.

The most frustrating thing with schema.org for the RDFa Community is that we have the people that could help update the documentation and make it much better, like integrating live testing tools like this (which was initially +Dan Brickley's idea, by the way):

http://rdfa.info/play/

into schema.org. Since schema.org is being developed in a closed environment, we can't make the improvements that we know developers desperately need. Sure, we can run a parallel site w/ examples and demos, but the site that most developers look at is schema.org, not some other site. Even if we were to put up another site, there would still be the question of legitimacy.

So, given this information, what could we do to help this community w/ RDFa Lite markup for schema.org?
 
+Evgeniy Orlov said: But for semantic SEO is Microdata better, cause more unified and favoured by search engines

What about if you want your pages to appear nicely in both Google's search engine and Facebook? RDFa is the only language that works for both Facebook OGP and schema.org.
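
To illustrate the point, here is a rough sketch of one page carrying both Open Graph and schema.org properties in RDFa-style `property` attributes. The page content and the `PropertyCollector` class are invented for this example, and the script only lists attribute values with Python's standard `html.parser`; it is not an RDFa processor and says nothing about how Google or Facebook actually parse the page:

```python
from html.parser import HTMLParser

# Hypothetical page: OGP terms use the "og:" prefix, schema.org terms use vocab.
PAGE = """
<html prefix="og: http://ogp.me/ns#">
<head>
  <meta property="og:title" content="Blue Widget">
  <meta property="og:type" content="website">
</head>
<body vocab="http://schema.org/" typeof="Product">
  <h1 property="name">Blue Widget</h1>
  <div property="offers" typeof="Offer">
    <span property="price">19.99</span>
  </div>
</body>
</html>
"""

class PropertyCollector(HTMLParser):
    """Collect the value of every property attribute, in document order."""

    def __init__(self):
        super().__init__()
        self.properties = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if "property" in attr_map:
            self.properties.append(attr_map["property"])

collector = PropertyCollector()
collector.feed(PAGE)

# Prefixed terms belong to OGP here; unprefixed ones resolve against vocab.
og_properties = [p for p in collector.properties if p.startswith("og:")]
schema_properties = [p for p in collector.properties if ":" not in p]
print(og_properties)
print(schema_properties)
```

Both vocabularies share one attribute syntax, which is the practical meaning of "works for both" in this argument.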
 
+Jarno van Driel said: I have had very confused developers at my side whom got completely lost when they read online examples or when they had to work on code that was already in place on a site. The fact that RDFa offers something (wonderful) as this doesn't seem to get through to many junior and medior level developers

Yes, that's a big problem that we have both in the RDFa and Microdata communities. We don't have very good documentation right now because we've been so focused on creating the technology.  In the short term, if you have problems, we have support channels:

Website: http://rdfa.info/
Live testing tools: http://rdfa.info/play/
Mailing list: http://lists.w3.org/Archives/Public/public-rdfa/
IRC: #rdfa  on irc.w3.org:6665
and I'm on G+ and Twitter constantly looking for people that are having problems w/ RDFa (and JSON-LD).

Hopefully now that standardizing RDFa 1.1 is off of our plate, we can focus on creating new tutorials for developers, like this one: What is Linked Data?

and this one: What is JSON-LD?
 
I work on this stuff at Google, including for schema.org

There is a lot that could be said here; I will put numbers next to some paragraphs to make my thoughts seem more organized.

1. Schema.org is a collaboration between competitors; the companies do not talk in any detail about their specific product plans.  So often schema.org endorsements are closer to "we think this is good for the Web" than "all our products understand this today". Historically RDFa 1.1 has been rather in the former category. Some schema topics have been like this too. RDFa 1.1 / Lite is a good thing for the Web, that is clear.

2. In general schema.org has been supportive of the development of RDFa Lite (in blog posts, at conferences etc.), but has not yet very visibly advocated for it on the site's example pages. This is partly an hours-in-the-day issue, and partly to wait for implementations to catch up i.e. not wanting to confuse non-standards-geek publishers by advocating something that will cause frustration for 1000s of developers if adopted. 

3. itemref.

Several practically minded voices have been saying "we need something like itemref in RDFa 1.1+". The RDFa group did make an attempt at that, but it was handled as a piece of vocabulary rather than as core syntax. I haven't found anyone outside the WG yet who thinks that this design works. It isn't clear to me yet how critical an issue this is, or what can be done about it at this time.

4. Nobody here benefits from "Microdata is dying!" rhetoric. 

I don't care whether your first love is RDFa, Microdata, Microformats v1 or v2 or whatever. For publishers who don't live and breathe this kind of standards work we will just look like an in-crowd of squabbling geeks.

The fact is that all these technical approaches grow closer by the day - they're all more or less describing items/things in terms of their named properties/relationships to other such items or to atomic values. By talking about our various approaches in the language of right and wrong, battle and conflict, we mask that underlying commonality. We would do much better working on ways of communicating those common characteristics of Microdata and RDFa (and microformats and json-ld and linked data), than on pronouncing our preferred approach to be the victor in some fictional war. In practical terms this probably means collaborating on APIs. The kind of collaboration that works better without all this righteous "our standard is the best, the other ones should go away now!" talk.

5. Google has a pretty solid RDFa 1.1 parser. Please let me know if you find any cases in which Google Rich Snippets (or the Structured Data Testing Tool) gives a rich snippet only in Microdata not RDFa,  or vice-versa. 
 
So +Dan Brickley, if I understand correctly, you see the future as potentially merging or blended formats, or running in parallel, rather than one standard being adopted over another. Which is reassuring for people like me who are comfortable with one format and have limited time resources.
 
+Dan Brickley said: RDFa 1.1 / Lite is a good thing for the Web, that is clear.

Thanks for saying that Dan. The unfortunate truth is that most Web developers will never see this, all they see is what's on schema.org and the message on that site is very different than the one you're telling here. I'm not saying it's intentional. I'm merely pointing out the fact that most Web developers will never know that this is the position of schema.org.

"This is partly an hours-in-the-day issue"

Then let the RDFa community help. We'd be happy to add examples, update documentation, write helpful tools, as long as they're integrated into schema.org. That's why we would like to see schema.org up on github, so that we can contribute in a more meaningful way. That will help Web developers understand schema.org's position that you outline above and help them write better markup.

itemref - "I haven't found anyone outside the WG yet who thinks that this design works"

This is completely surprising to me. Did I miss a comment that you made to the RDFa WG to fix this during the review period for the spec? We know of no Microdata itemref example that isn't covered with this feature. We were very careful in designing it to be compatible w/ Microdata. Do you have an example where it doesn't work?

"Nobody here benefits from 'Microdata is dying!' rhetoric."

At no point did I say "Microdata is dying!" Sure, there are a number of data points that imply a particular trend, but those are all facts - you can figure out the trend line for yourself. I wrote the post because a number of people in the Web community were confused about the Microdata API being ripped out of Safari and Blink: 

https://twitter.com/stilkov/status/367237073117671424

I have gone back into the blog post and added the following sentence: "To be clear, I’m not saying Microdata is dying, just that not having these basic things in place will be very problematic for the future of Microdata." Hopefully, that helps clear up that particular interpretation of the blog post.

"Google has a pretty solid RDFa 1.1 parser."

Any chance that you could submit a conformance report so that we could post it to the rdfa.info website? This would help people understand what parts of RDFa Google supports.

I think +Martin Hepp has a number of examples of Google's processor not doing the right thing when it comes to certain RDFa markup.

"The kind of collaboration that works better without all this righteous 'our standard is the best, the other ones should go away now!' talk."

Could you point to where I say that in the blog post?

I think you're reading a bit too much of the historical baggage around this conversation into that blog post. That blog post was a response to a group of people that were confused about the current state of Microdata and browsers. I elaborated on the issue using facts (with references to the source material). If you think these facts are wrong, please point them out and I'll be happy to correct them in the blog post.

I do agree with you - many of these solutions have converged, and that's a good thing. However, when things converge like this, it's a good idea to do some housekeeping. Get rid of the cruft. In order to do that housekeeping, we need to start looking at things like the facts I cover in my blog post. More information, as long as it's grounded in reality, will help us make that decision. In the end, we all want what is good for the Web and Web developers. We want to help Web developers have the tools they need to be more effective at building great content for the Web. We're all going in the same direction, we just differ in the paths we happen to be walking on today.
 
Just on this point quickly:
"I haven't found anyone outside the WG yet who thinks that this design works."
"This is completely surprising to me."

When you were thinking of someone outside the WG who thinks the design works, who did you have in mind?

I agree that detailed review comments from Google staff on the draft design could be very useful, but having at least a few non-WG people say "yeah that works great!" would also be a fine sanity check. 
 
+Aaron Bradley Late in answering your original questions, but:

1) I initially (early 2012) used microdata for adding schema.org structured data, because that was what the examples at schema.org used. As RDFa Lite matured, however, I have started using RDFa Lite instead.

2) I don't find any significant difference in implementing schema.org via microdata vs. RDFa Lite. I am concerned about some of the schema.org-specific proposals that aren't part of either microdata or RDFa Lite (specifically: roll-your-own type extensions via "BaseType/CustomType" property values, and the recently proposed "SetOf/Type" for table-based markup).

Also, with respect to your initial RDFa example that you tried with Bing's webmaster tools, I suspect Bing does not support the xmlns-prefixed attributes that were deprecated in RDFa 1.1 in favour of "prefix" (per http://www.w3.org/TR/rdfa-in-html/#backwards-compatibility). Given that you're only using the schema.org vocab, you could just use the "vocab" attribute (and then your example would be identical to RDFa Lite markup). But only a Microsoft engineer could definitively answer how Bing processes RDFa.
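
As a rough illustration of the xmlns-to-prefix change, the sketch below rewrites a single `xmlns:` declaration into the RDFa 1.1 `prefix` attribute. The helper name and the sample markup are hypothetical (this is not the actual content of the test pages above), and an element with several `xmlns:` declarations would need them merged into one `prefix` value, which this does not do:

```python
import re

def xmlns_to_prefix(html: str) -> str:
    """Rewrite xmlns:foo="iri" as prefix="foo: iri" (one declaration per element)."""
    def replace_declaration(match: re.Match) -> str:
        name, iri = match.group(1), match.group(2)
        return f'prefix="{name}: {iri}"'

    return re.sub(r'\bxmlns:([A-Za-z][\w.-]*)="([^"]*)"', replace_declaration, html)

# Hypothetical RDFa 1.0-style markup using an xmlns declaration.
legacy = (
    '<div xmlns:schema="http://schema.org/" typeof="schema:Product">'
    '<span property="schema:name">Widget</span></div>'
)
updated = xmlns_to_prefix(legacy)
print(updated)
```

With only one vocabulary in play, dropping the prefix altogether in favour of `vocab="http://schema.org/"` and unprefixed terms such as `typeof="Product"` gives the RDFa Lite form mentioned above.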
 
When you were thinking of someone outside the WG who thinks the design works, who did you have in mind?

Web developers that we've asked about the feature, people in various companies that we've polled about the feature, folks that reviewed the spec. We asked around and Web developers seemed fine with it.

You didn't answer my question, however. Do you have any examples of it not working?
 
Also, for those just joining this thread. I want to make this very clear - +Dan Brickley has been absolutely vital in making sure that schema.org has been as successful as it has. He doesn't get enough thanks for the usually thankless job of having to go out and convince people that this whole Linked Data thing is something that's worth doing. 

I truly, honestly, deeply appreciate all of the hard work he's done across all of these Linked Data initiatives. I've always been thankful that he has been the public face of schema.org as he always listens and tries to do the right thing. That does not come across often enough because it's usually when things go wrong that I write blog posts like the one I did, which forces him to take time out of his busy day and explain schema.org's position.

So, be sure to thank him if you get the opportunity, he doesn't get nearly enough of that. :)
 
+Manu Sporny  I personally use both namespaces, for Schema and for OGP, because I keep them strictly separate: the first for Google etc., the second for Facebook. If I read this correctly, RDFa is good for all of them, so I must ask: why is RDFa not already named as THE markup language by all the companies that push the semantic web?
 
Thanks everyone for your contribution to this thread.  While, +Dan Brickley, it may have had its genesis in the (admittedly tiresome) standards debate, much of what has been discussed is useful and even actionable.  More on that later.

As promised, my quick responses to the questions I initially posed (repeated questions below are paraphrased for brevity):

1.  Do you use microdata, RDFa or RDFa Lite to markup schema.org?

In sites I work on I typically use microdata, though in my dubious role as "the search guy that's supposed to know semantic web stuff" I've certainly tried to school myself in RDFa, and have coded material in RDFa and RDFa Lite.

2.  Is microdata easier...?

I find it so, but it's quite possible that the reason for that is that I learned microdata before RDFa.  I had fiddled with RDFa after I had discovered GoodRelations, but didn't get proficient at it because I couldn't get buy-in to deploy it on any site.  With the weight of the search engines behind it, getting buy-in for schema.org was considerably easier, and I learned microdata pretty thoroughly then, both because it was initially the explicit approach-of-choice for schema.org, and because I was able to lean on the schema.org examples.

One way or another, I'm happy I pivoted back to RDFa later, once a mature RDFa Lite was available for my use.  I don't think anyone would disagree that - especially for the relatively straightforward task of marking up HTML with schema.org types and properties - RDFa Lite is considerably easier than RDFa. :)

3. Do you have confidence that the search engines treat RDFa (Lite) the same as microdata.

Yes ... but preying on the back of my mind is a suspicion that Google, specifically, isn't going to provide the same benefits in terms of search engine visibility for RDFa-encoded data as for microdata.

Which I know is crazy, because for that to be true Google would either have to have problems parsing RDFa, or choose to ignore (or otherwise treat specially) web data it was able to successfully structure by parsing RDFa; that Google can parse RDFa is demonstrably true because of Structured Data Testing Tool, and the "we've parsed it, but we'll ignore what we parsed" scenario just doesn't seem plausible.

So I guess the perception boils down to the lack of RDFa examples on schema.org, and the "outdated" RDFa examples on google.com (Google typically shows microdata examples with schema.org, but RDFa examples with data-vocabulary.org; those latter examples are valid, of course, but if you've visited the data-vocabulary.org home page of late, you'll see that it's basically a billboard for schema.org - which absolutely makes sense, as schema.org obviously builds on the work of data-vocabulary.org).

There have been times in the past where I've been troubled by RDFa-encoded sites seemingly not garnering Google rich snippets with the same frequency as similar resources using microdata, but I'll freely confess my observations were anecdotal rather than empirical (I can't begin to tell you - if indeed it was necessary, which with this crowd it probably isn't - how difficult it is to even attempt mapping the use of, well, anything to the appearance of rich snippets in search).

As previously discussed at length, I have no confidence that Bing treats RDFa (any flavor) encodings of schema.org types Product and Offer the same as microdata because they say they don't.  (I haven't heard back from Bing on this.)

4.  Do you really know how to use itemref properly?

Hell no!  Which is unfortunate because situations where I need it keep coming up. :)

But +Jarno van Driel is helping me understand itemref better, and we're working together on putting together some permalinked examples for the benefit of the world.  And we'll certainly be reaching out to some of the parties here for their input on that, and property copying in RDFa (I find it distressing that it's a feature at risk).  Excited about this.

5.  Do you feel compelled to use microdata because it seems to be the de facto structured markup favored by Google?

Not really.  Anymore, that is - that was the case when schema.org first came out.

6.  Does the lack of RDFa examples on schema.org make you less inclined to use it?

Absolutely.  Not, anymore, because of suspicion that it may not be treated equitably by Google, but because, as I'm not the brightest coder on the block, it's considerably more work for me to get the markup right without reference examples.

Some responses to a couple of the many great comments here forthcoming.
 
Thanks, Manu. I know it's frustrating for the RDFa community. In many ways, Microdata took (some but not enough of) the best ideas from RDFa and ran with them. Personally I believe RDFa at W3C is the better bet for the future; perhaps an eventual RDFa 2 will be able to slim down further and reflect a few more lessons learned. It is clear that for the foreseeable future there will be multiple ways of exchanging structured data graphs. Even forgetting Microdata, JSON-LD, RDF Turtle and RDFa raise that issue, so the API discussion seems inescapable. 

Regarding itemref, I regard the absence of non-insiders saying on the record "oh, this is great, it'll make my life easier!" or just "yes, I understand that" as an example of it not working, or at least not being seen to be working. It might be that it just needs more examples.

I would be interested to know what +Aaron Bradley and +Martin Hepp think of the property copying mechanism, i.e. http://www.w3.org/TR/rdfa-in-html/#property-copying ... does it  address their itemref-related use cases within RDFa Lite?

BTW I know the usability tests within Google that were conducted around the Microdata design a few years back were not universally appreciated (or trusted), but I am currently looking into the possibility of conducting some more work in that style. 
 
Regarding the outdated RDFa examples in the Google Rich Snippets help centre, note that https://support.google.com/webmasters/answer/146861?hl=en has more modern RDFa - as of a few weeks ago. These are time-consuming to update since any change requires careful consideration of downstream consumers, but updates are in progress. 

BTW the little RDFa example in the Custom Search Engine docs was also updated recently, https://developers.google.com/custom-search/docs/structured_data#rdfa ... 
 
+Aaron Bradley said: property copying in RDFa (I find it distressing that it's a feature at risk)

It's not at risk any more! It'll be baked into the standard as of next Thursday.
 
First of all, I only gave one +1 to the note of thanks that +Manu Sporny gave to +Dan Brickley because you can only give one +1. I enthusiastically second all of Manu's sentiments:  you're doing a hell of a job Dan.

Regarding "approach wars" I understand what you're saying, Dan, and absolutely agree that squabbling isn't helpful.  Onward ho!

But continuing to discuss the relative merits of the different approaches is useful and necessary.

It's somewhat frightening to ponder, but currently in most organizations - including those with enterprise level websites - it's guys like me that make all the calls regarding pretty much everything concerning the implementation of publicly-consumable structured data for that organization's web site(s).

Developer - So the CTO sez we're gonna start addin' schema.org to our site?

Me - You bet!

Developer - Cool.  So what do I use?

In the majority of situations, I can't say "well, you can use microdata or RDFa" for a number of reasons.  First - regardless of the level of seniority or experience of said developer - they're almost bound to respond with "what are those?"  So I can't have a nuanced conversation with technical staff about the best approach to use, because they'd have to be familiar with the protocols under discussion.  And they simply aren't.

Second, it's me that's going to be providing instructional and reference links, participating in technical discussions about the markup protocol and its deployment, and testing work in development.  It's just not feasible to embark on an exploration of multiple alternatives.

Third, making that sort of call is one of the things I'm paid for.  Ack!

And if you want to ponder something even more frightening, I go to conferences and am asked about the best approach to employ by search marketers who, generally, know next to nothing about microdata or RDFa so that they can go back to their offices and make recommendations to their IT people who, generally, know next to nothing about microdata or RDFa.  O.M.F.G. :)

Is "talking about our various approaches in the language of right and wrong, battle and conflict" a useful approach?  Absolutely not!  And so much more is accomplished (to paraphrase what you said, Dan) by focusing on the common benefits of the different approaches than the differences between them.

But while I don't advocate pronouncing my "preferred approach to be the victor in some fictional war," I absolutely must recommend a preferred approach in very much non-fictional deployments of schema.org.  To that end the warfare isn't helpful, but (measured, reasonable, fact-based, polite, informed) discussion of the relative merits of the different approaches is very helpful indeed in making informed decisions about which approach to take.

A quick observation that in the current environment where developers still don't by-and-large have much knowledge about structured data markup, microdata does have the advantage of having fewer choices than RDFa.  That is, there are many ways to approach a chunk o' code you want to mark up with RDFa, but not a lot of options when it comes to microdata.  RDFa Lite, though, is far more akin to microdata in that regard: to-may-to, to-mah-to.

At the end of the day I'm pleased as punch that these sorts of conversations are taking place between the developers of the technologies and those that are deploying them.  Thanks to the too-numerous-to-mention-by-name SEOs and internet marketers for rolling up their sleeves and diving in, and to those in the semantic web community for their receptiveness and helpfulness - including, but not limited to, +Dan Brickley, +Manu Sporny, +Gregg Kellogg, +Kingsley Idehen and +Martin Hepp.
 
It's sad that browsers are pulling out support for microdata. I didn't know about that. In what I've seen, a microdata parser is much easier to implement than an RDFa parser. If browsers are removing microdata support, it doesn't bode well for getting RDFa support in. But I hope I'm wrong. I'd be happy with either/both honestly.
 
+Dan Brickley Really nice to see Google's RDFa example for Organization rich snippets updated to schema.org from data-vocabulary.org, and the markup for the non-visible latitude and longitude properties of Place a nice addition to that.  Thanks for pointing that out.

And I appreciate that such updates are time-consuming, and - done right - require a lot of due diligence.  I'd certainly rather wait for examples than try and cope with ones that had been rushed through.

I'll get back to you (hopefully with input from +Jarno van Driel) about whether or not the property copying mechanisms address common use cases once I've time to work with it a bit more (and successfully test any equivalent microdata itemref models).

What a great example +Evgeniy Orlov:  it didn't occur to me before but, wow, the structure of a classic HTML <table> is just a poster child for itemref waiting to happen, isn't it?

I added a type declaration and modified properties to fashion MusicEvent code that validated:
http://bit.ly/14QxbvT

If you nix the structured data for the second band, the SDTT even generates an event rich snippet - sweet!
http://bit.ly/18ANkCM
 
+Manu Sporny ok, i read the 2 paragraphs :-) I thought the microdata api was about as simple as it gets. JSON-LD is a serialization for RDF, it's not an API.
 
Here is the RDFa equivalent of (+Aaron Bradley's version of) +Evgeniy Orlov's itemref example:

https://gist.github.com/niklasl/6255163

This pattern (supported since RDFa 1.0) illustrates that, on the contrary to various claims, RDFa has always recognized the sometimes messy, dispersed nature of data in HTML, by allowing you to repeat the same subject in different places of the page.
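
As a rough illustration of that "same subject in different places" pattern, the Python sketch below groups property values by the `resource` they are attached to, so two separate blocks about `#band` end up describing one thing. It is a drastically simplified toy and not a conforming RDFa processor: it ignores `vocab`, `typeof`, chaining and void elements, and the sample page and the names `SubjectGrouper` and `group_by_subject` are hypothetical.

```python
from collections import defaultdict
from html.parser import HTMLParser


class SubjectGrouper(HTMLParser):
    """Toy extractor: collects property text under the subject in effect."""

    def __init__(self):
        super().__init__()
        self.subjects = [None]   # subject in effect for each open element
        self.pending = []        # property being captured per open element
        self.data = defaultdict(dict)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Assumption: `resource` without `property` sets a new subject.
        new_subject = None if "property" in attrs else attrs.get("resource")
        self.subjects.append(new_subject or self.subjects[-1])
        self.pending.append(attrs.get("property"))

    def handle_data(self, text):
        prop = self.pending[-1] if self.pending else None
        subject = self.subjects[-1]
        if prop and subject and text.strip():
            self.data[subject][prop] = text.strip()

    def handle_endtag(self, tag):
        if self.pending:
            self.subjects.pop()
            self.pending.pop()


def group_by_subject(markup):
    parser = SubjectGrouper()
    parser.feed(markup)
    parser.close()
    return dict(parser.data)


# Hypothetical page: the band is described in two separate places.
page = """
<body vocab="http://schema.org/">
  <div resource="#band" typeof="MusicGroup"><h1 property="name">The Examples</h1></div>
  <p>Unrelated text in between.</p>
  <div resource="#band"><span property="genre">Jazz</span></div>
</body>
"""

print(group_by_subject(page))
```

Both properties land on the single `#band` entry even though the markup is dispersed across the page.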
 
+Ed Summers actually, JSON-LD has an API, and it is also an RDF serialization. With framing, it's also a query processor.
 
+Ed Summers What +Gregg Kellogg said. Once JSON-LD 1.0 and the JSON-LD 1.0 API hit REC, we'll shift our focus to JSON-LD Framing (query by example, re-layout of data for your application):

http://json-ld.org/spec/latest/json-ld-framing/

and RDF Graph Normalization (digital signatures, verifiable claims, detecting differences in sets of claims, comparing sets of claims, hashing graphs, etc.): 

http://json-ld.org/spec/latest/rdf-graph-normalization/

You can see the full stack of specs that we're working on here:

http://json-ld.org/spec/

We've found that there are a number of very effective ways to use Linked Data without having to use the graph database/SPARQL sledgehammer. Note that many people get lots done using just JSON in the browser w/o needing to fall back to a database to do processing and querying. The same thought process applies to the JSON-LD API. We can give people the tools to transform RDFa, Microdata, and Microformats to a common JSON-LD format. At that point, it doesn't matter where you got your data, you can merge all of them into JSON-LD and work with them using a simple API from there.

It's also true that you can put all of that data in a graph database and use SPARQL to work with the data, but that's typically a solution that's only fit for back-end solutions and very advanced programmers.

The JSON-LD APIs are supposed to work without the need for heavyweight systems (graph stores, SPARQL, etc.) to store and process the Linked Data.
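
The "merge everything into one JSON shape and work from there" idea can be sketched in plain Python. This is not the JSON-LD API, just dictionary merging keyed on `@id`; the node shapes, the example URL and the function name `merge_nodes` are assumptions for illustration, and values are assumed to be simple scalars.

```python
import json


def merge_nodes(*node_lists):
    """Merge JSON-LD-like nodes that share an @id (illustrative only)."""
    merged = {}
    for nodes in node_lists:
        for node in nodes:
            target = merged.setdefault(node["@id"], {"@id": node["@id"]})
            for key, value in node.items():
                if key == "@id":
                    continue
                if key not in target:
                    target[key] = value
                elif target[key] != value:
                    # Conflicting values are kept side by side in a list.
                    existing = target[key]
                    if not isinstance(existing, list):
                        existing = [existing]
                    if value not in existing:
                        existing.append(value)
                    target[key] = existing
    return list(merged.values())


# Hypothetical output of an RDFa extractor and a microdata extractor.
from_rdfa = [
    {"@id": "http://example.com/#band", "@type": "MusicGroup", "name": "The Examples"}
]
from_microdata = [
    {"@id": "http://example.com/#band", "name": "The Examples", "genre": "Jazz"}
]

print(json.dumps(merge_nodes(from_rdfa, from_microdata), indent=2))
```

Once the data is in this shape, it no longer matters which syntax it was extracted from.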
 
+Niklas Lindström   Wow, rdfa.info/play shows your markup very clearly! But the problem, which was already addressed here, is that Google's testing tool gives errors when checking RDFa markup. That happens with your markup too: http://goo.gl/7ZNx6O. Could you perhaps make your RDFa markup conform to the output of Google's testing tool? If this markup can be made to conform with Google's tool, that would blow away my single objection against using RDFa for SEO!

+Aaron Bradley  For me the best thing about using itemref is this: if we go the concrete SEO way and are sure that the most important information must be placed as near as possible to the opening <body> tag, then we design the markup in the following way: http://goo.gl/2YKO8N . Thus we don't need to worry about the HTML structure, which we cannot always change, but can be sure that search engines will get, immediately and first, the information we consider important.
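
A minimal Python sketch of how such an itemref layout resolves: the item sits near the top of the page and lists ids in `itemref`, and properties found on the referenced elements further down are attached to it. This is a toy under stated assumptions, not the microdata algorithm from the spec (no nested items, no void elements, text content only); the sample markup and the names `ItemrefSketch` and `resolve` are hypothetical.

```python
from html.parser import HTMLParser


class ItemrefSketch(HTMLParser):
    """Toy collector for itemscope items and id-referenced properties."""

    def __init__(self):
        super().__init__()
        self.stack = []   # one frame per open element
        self.items = []   # one entry per itemscope element
        self.by_id = {}   # id -> {property: text} found outside any item

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        frame = {"id": attrs.get("id"), "prop": attrs.get("itemprop"), "item": None}
        if "itemscope" in attrs:
            refs = (attrs.get("itemref") or "").split()
            frame["item"] = {"refs": refs, "props": {}}
            self.items.append(frame["item"])
        self.stack.append(frame)

    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()

    def handle_data(self, text):
        text = text.strip()
        if not text or not self.stack or not self.stack[-1]["prop"]:
            return
        prop = self.stack[-1]["prop"]
        owner = next(
            (f["item"] for f in reversed(self.stack) if f["item"] is not None), None
        )
        if owner is not None:
            owner["props"][prop] = text
        else:
            # Not inside an item: remember it under every enclosing id.
            for frame in self.stack:
                if frame["id"]:
                    self.by_id.setdefault(frame["id"], {})[prop] = text


def resolve(markup):
    parser = ItemrefSketch()
    parser.feed(markup)
    parser.close()
    resolved = []
    for item in parser.items:
        props = dict(item["props"])
        for ref in item["refs"]:
            props.update(parser.by_id.get(ref, {}))
        resolved.append(props)
    return resolved


# Hypothetical page: item near the top, details in a table further down.
page = """
<div itemscope itemtype="http://schema.org/MusicEvent" itemref="when where">
  <span itemprop="name">Jazz Night</span>
</div>
<table>
  <tr><td id="when" itemprop="startDate">2013-09-01</td>
      <td id="where" itemprop="location">Town Hall</td></tr>
</table>
"""

print(resolve(page))
```

The table cells stay where the existing HTML puts them, while the item that references them can live right after the opening <body> tag.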
 
I have a very limited, bystander's view into the whole debate, so my position may be rooted in me just being ill-informed. My initial puzzlement over the decision to drop Microdata JS API support from Blink and Webkit was because I had accepted that the whole debate around how to do semantic stuff right had settled on Microdata through the sheer power of Google. If it's now RDFa Lite instead, I'm absolutely fine with that, too – as +Manu Sporny rightly points out, the differences seem to be negligible. But whatever it is, I very strongly believe nobody benefits from having two things that do the same, especially if one is backed by a standard and the other is backed by Google. 

I spend a lot of time advocating the use of Web technologies, and I'd love to be able to explain in 5 minutes how a developer can use semantic enrichment to make stuff even more usable. I'm absolutely not looking forward to wasting that time on a truly unproductive discussion of one variant vs. the other.
 
Google doesn't implement property copying. The rest of rdfa 1.1 should be ok...
 
Hi all,
thanks for the various mentions :-) I am currently not able to review the full conversation, but here are a few quick comments on issues raised:

1. Property copying in RDFa 1.1 would help a lot, but as far as I understand, it does not have the full power of itemref, since while you can attach the same properties to multiple entities, you cannot point to the same entity from multiple other entities with varying properties, and that is often needed.

2. Having worked on a bigger project recently and done that in parallel in RDFa and Microdata, I must say that my productivity in Microdata is much higher than in RDFa, despite the fact that I have worked with RDFa markup since its debut in 2008. 

That is not mainly because of the differences at the level of the specifications, but because the Google validators and other tools do, from my experience, not fully support RDFa as they should. The main source of trouble seems to be the fact that the RDFa processors at Google do not link entities based on subject or object identifiers (URIs), but on the position in the DOM tree of the HTML. 

In general, I strongly disagree with the assumption that Microdata is dead. Should anybody from the RDFa camps try to sink the Microdata ship by spreading such rumors, it will backfire on the adoption of RDFa, because it will impede the adoption of structured markup as a whole. It is really not going to help to try to force mindsets, specifications, and other relics from the Semantic Web vision into the "structured data in Web content" movement. Structured data at Web scale is a much more complicated socio-technical thing than reflected in the Semantic Web technology stack.

If it is true that Microdata support in browsers is reduced, that would be bad news.

Martin
 
+Martin Hepp said:
you cannot point to the same entity from multiple other entities with varying properties

Unless I'm misunderstanding what you're saying, this is not correct. We specifically designed it so that you could copy one pattern to multiple objects (each of which has multiple different properties). I think you're misinformed, but we should make sure that we haven't missed something.

+Martin Hepp said:
I strongly disagree with the assumption that Microdata is dead

I did not say that Microdata is dead. From the article:

To be clear, I’m not saying Microdata is dying (4 million out of 329 million domains use it), just that not having these basic things in place will be very problematic for the future of Microdata.

+Martin Hepp said:
If it is true that Microdata support in browsers is reduced

It is true, see the links in the article. I include references to conversations and bugs filed within the browser teams for all of the claims made in the article.
 
One more thing, Manu: You say

"Implementers should be aware that a simplistic implementation of the pattern-copy rule may lead to an infinite loop when handling circular dependencies. A processor should cease the pattern-copy rule when no unique triples are generated."

Did you analyze what this actually means for an RDFa processor? I imagine that in e.g. a large, distributed RDF store, it is costly to check whether this condition has been reached.
 
+Martin Hepp +Manu Sporny  What you can do is create a pattern that references another object. For example see RDFa test 0325: http://rdfa.info/test-suite/test-cases/rdfa1.1/html5/0325.html

<body vocab="http://schema.org/">
  <div resource="#foo" typeof=""><link property="rdfa:copy" resource="_:a"/></div>
  <div resource="#bar" typeof=""><link property="rdfa:copy" resource="_:a"/></div>
  <div resource="_:a" typeof="rdfa:Pattern">
    <div property="schema:refers-to" typeof="">
      <span property="schema:name">Amanda</span>
    </div>
  </div>
</body>
 
One more: I just re-read your blog post

    http://manu.sporny.org/2013/microdata-downward-spiral/

and I must say that it leaves me unconvinced, to say the least. First of all, while I appreciate the work being done by hundreds of hard-working volunteers at the W3C, one should not overestimate the relevance of the W3C on technology adoption on the Web. How many sites use XHTML 1.0 Strict? XHTML 1.1? Does Facebook's OGP standard really comply with RDFa 1.1 or does it rather reuse the parts and ideas it needs?

As Dan nicely pointed out above, we should not waste our energies on blaming the other camps, but I sometimes wonder why RDFa, as it stands (1.0 and 1.1), is defended so vigorously against criticism and competing approaches, instead of taking the best from the history and building a new, common RDFa 2 standard that takes the best lessons learned and breaks with the past where necessary.

RDFa still carries a lot of ballast from the original RDF / Semantic Web work done in the past ten years, part of which is unnecessary for the basic challenge of exposing structured data with globally unique entity IDs on the Web.

Imagine the OIL folks fighting equally hard against OWL 1 or OWL 2 ...

Martin
 
+Martin Hepp said:
I thought what you cannot do is reuse an entity node (e.g. a http://schema.org/ContactPoint) as the object of multiple triples.

Gregg gave you an example of one approach. If that doesn't work for you, perhaps you can give us a basic Microdata example that demonstrates the problem you're referring to and we can take a look at it to see if it is actually an issue. 

+Martin Hepp said:

Did you analyze what this actually means for an RDFa processor? I imagine that in e.g. a large, distributed RDF store, it is costly to check whether this condition has been reached.

Yes, we did analyze what that would mean for an RDFa processor. That specific spec text was placed in there after we did the analysis. The conclusion of the group, after several implementations, was that the pattern copying rule was implementable in a variety of ways that didn't lead to unacceptable complexity for the implementers.

The problem doesn't have to do with large distributed RDF stores since the pattern copying rule is something that is done by the RDFa processor while processing (before the triples hit a graph store).
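
The termination condition quoted above ("cease the pattern-copy rule when no unique triples are generated") amounts to a fixpoint loop over an in-memory set of triples. The Python sketch below is a simplified illustration of that idea, not the spec's normative rules: it only performs the copy step, it skips the later clean-up of `rdfa:copy` and pattern triples, it does not copy the `rdfa:Pattern` type itself, and the function name `apply_pattern_copy` and the sample triples are hypothetical.

```python
COPY = "rdfa:copy"
TYPE = "rdf:type"
PATTERN = "rdfa:Pattern"


def apply_pattern_copy(triples):
    """Copy pattern properties onto referrers until nothing new appears."""
    graph = set(triples)
    patterns = {s for (s, p, o) in graph if p == TYPE and o == PATTERN}
    while True:
        new = set()
        for (subject, pred, target) in graph:
            if pred != COPY or target not in patterns:
                continue
            for (s, p, o) in graph:
                # Simplification: the rdfa:Pattern type is not copied over.
                if s == target and not (p == TYPE and o == PATTERN):
                    new.add((subject, p, o))
        if new <= graph:
            # No unique triples were generated in this pass: stop.
            return graph
        graph |= new


# Hypothetical input with a circular dependency: _:a copies _:b and vice versa.
triples = [
    ("#foo", COPY, "_:a"),
    ("_:a", TYPE, PATTERN),
    ("_:a", "schema:name", "Amanda"),
    ("_:a", COPY, "_:b"),
    ("_:b", TYPE, PATTERN),
    ("_:b", "schema:telephone", "000-123"),
    ("_:b", COPY, "_:a"),
]

result = apply_pattern_copy(triples)
print(sorted(t for t in result if t[0] == "#foo" and t[1].startswith("schema:")))
```

Because the set of possible triples is finite and the loop only ever adds to it, the circular reference cannot cause an infinite loop, and all of this happens on the triples of a single document before anything reaches a graph store.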

+Martin Hepp said:
Does Facebook's OGP standard really comply with RDFa 1.1 or does it rather reuse the parts and ideas it needs?

It uses a subset of RDFa 1.1 and it is consistent with how it uses it and the authoring guidelines they give their authors. There is nothing wrong with this approach. Companies use the bits of a standard that they find useful for their specific use case, and as long as they don't do anything contrary to the spec, it's fine.

+Martin Hepp said:
How many sites use XHTML 1.0 Strict? XHTML 1.1?

Sites using XHTML (55.3%) vs. HTML (45.5%):
http://w3techs.com/technologies/history_overview/markup_language/ms/q

Sites using XHTML Transitional (78.7%) vs. XHTML Strict (21.1%):
http://w3techs.com/technologies/details/ml-xhtml/all/all

one should not overestimate the relevance of the W3C on technology adoption on the Web

One shouldn't underestimate it either. Of the XHTML vs. HTML5 debate, you'll note that 100% of those specs are now under the stewardship of the W3C. I'm definitely not saying that you need the W3C for a successful Web technology, but it doesn't hurt to go through the process because despite all of its warts and annoyances, it's the best approach we have today to gain consensus among hundreds of technology companies and thousands of Web developers that are interested in building the future of the Web.

I sometimes wonder why RDFa, as it stands (1.0 and 1.1), is defended so vigorously against criticism and competing approaches

There are many reasons. The first and foremost is probably because the community believes in the technical merit of the approach, thinks it's worth defending, and wants to see adoption decisions made on solid facts and an understanding of the state of these technologies. The second is probably because a large number, but not all, of the criticisms against RDFa are technically unsound.

taking the best from the history and building a new, common RDFa 2 standard that takes the best lessons learned and breaks with the past where necessary.

That's more or less what RDFa Lite was.

The plan for RDFa 2 is to try and be a bit more aggressive with culling unused features. However, we can't break backwards compatibility with features that are widely used. In order to figure out which features are widely used, we need data. In order to get data, we need data over a number of years and a good set of crawling metrics (like Common Crawl).

The rough plan in my head for RDFa 2 is to wait for the dust to settle around HTML+RDFa 1.1 and get some deployment numbers so that we can be more data-driven with RDFa 2. It's very difficult to be data-driven when you are inventing something new. However, once that's out there, you can start to put statistical significance behind each feature. RDFa 2 is going to have a mode that allows backwards compatibility, so people that are deploying RDFa 1.1 today won't have to worry about RDFa 2.0 mucking with their current deployments. There may be a feature in RDFa 2.0 that allows processors to use a simpler set of processing rules, making processing implementations easier and markup simpler. For example, to avoid markup fragility, we could condense RDFa 2.0's features into element-level constructs making it much less susceptible to markup errors. So, doing something like this:

<head context="http://schema.org/">
...
<body>
<span property="martin.name">Martin Hepp</span>
...

Would generate this triple:

<#martin> <http://schema.org/name> "Martin Hepp" .

Again, that's just an idea of the sort of update that we could do to address the copying fragility of Microdata, Microformats, and RDFa. That said, we're a number of years off from that because we need to gather proper data on usage so that we don't accidentally "fix" something that isn't broken.

All of these things take time and energy, which is why having a solid community (like the one that RDFa has) is so important to the future of any standard. You need the people that are going to be involved in the project long term, who will evangelize the technology, educate people to its uses, keep the implementations up to date, rev the specification when necessary, and just do the muck work of keeping the technology relevant. The point of the article I wrote was to point out that Microdata does not have this community and part of keeping a technology relevant is getting things done, which is highly dependent on having such a community.
 
+Martin Hepp regarding "The main source of trouble seems to be the fact that the RDFa processors at Google do not link entities based on subject or object identifiers (URIs), but on the position in the DOM tree of the HTML. " - this is a known bug, and under investigation. 
 
+Martin Hepp said: "As for the property copying: I thought what you cannot do is reuse an entity node (e.g. a http://schema.org/ContactPoint) as the object of multiple triples". Using the same thing as the object of multiple triples is the basic 101 of RDF, since it is a graph model and not a tree model. Please elaborate if this is not what you mean.

+Gregg Kellogg: thus I think your example is misleading here, since you can accomplish this without using property copying, by just giving the ContactPoint an identifier and linking to that directly, like:

<body vocab="http://schema.org/">
  <div resource="#foo" typeof="Thing"><link property="address" resource="#contact"/></div>
  <div resource="#bar" typeof="Thing"><link property="address" resource="#contact"/></div>
  <div resource="#contact" typeof="ContactPoint">
    <span property="telephone">000-123</span>
  </div>
</body>

+Dan Brickley fixing that bug should solve the issue that +Evgeniy Orlov stumbled upon in checking my (fully valid) RDFa version of his case (which illustrates that RDFa has never needed anything like @itemref for the very common case of describing the same thing in different places of a page).
 
+Niklas Lindström +Martin Hepp As +Manu Sporny suggested, an example of exactly what the desired behavior should be would be useful. It may be that what you want to do can be done without property copying.

When developing RDFa property copying, we looked at all microdata test cases to be sure we had feature equivalence. The key was to depend on entailment rules rather than DOM manipulation.
 
Hi, first: as said before, the main source of additional effort in translating my Microdata examples into RDFa 1.1 was finding patterns that work with the Google Structured Data Testing Tool. I think I already wrote to +Manu Sporny in a private conversation that IF both the @resource mechanism and property copying were officially and reliably supported by both the Google Structured Data Testing Tool AND the Google operational systems (!), I would be much more productive with RDFa 1.1. As long as this is lacking, using RDFa is much more complicated for me, because I have to figure out - often with many iterations - which syntactical variant of expressing the same data is properly understood by Google as the main target for schema.org markup.

Second: (+Niklas Lindström ;-) - quite clearly, I understand the 101 of RDF and RDFa. But that is not the point. First, ever since Google has supported RDFa, the proper processing of RDFa has been constrained to a subset in which the linkage based on entity URIs was not used, i.e. where the nesting of the DOM tree matched the data structure. In the (understandable) absence of a clear explanation from Google, we can only speculate on the exact processing of RDFa and the motivation for this limitation, but I assume that it is not a mere technical limitation but also has to do with the reliability of the extracted data (e.g. that proximity of lexical values in the HTML document indicates that Google can trust the data). The issue is similar to the question of when "hidden" markup (e.g. via @content or meta) is actually accepted or not - you need a lot of expertise to craft markup that is fully understood and honored by Google.

Third: I see the big pile of RDF work behind the will to establish RDFa for structured markup at Web scale, but I am unconvinced that the link between RDF and structured data at Web scale is that clear. While it is nice that schema.org can be used in RDF and that Microdata can be translated into RDF, this is not a mission-critical paradigm for the broad adoption of structured data by site-owners. I am pretty sure that none of the big search engines manages site data  in RDF ;-)

So all we need from a mechanism for exposing data is the notion of typed entities (ideally, yet not necessarily with globally unique IDs), properties and maybe datatypes. RDF is an example of a data model that meets this requirement, but it is not the only one.

Many of the techniques proposed by advocates of the Semantic Web and Linked Data technologies aim at the direct consumption of RAW data from the Web, as it is published. However, it is a very open question whether structured data at Web scale will ever be published in a degree of quality and consistency that meaningful operations can be done without intense heuristics for cleansing, augmenting, and consolidating the data. Then, most of the mantras of the RDF community like "URIs for everything" lose their relevance.

Now, what does this have to do with the "RDFa over Microdata"-battle? Well, many of the intellectual challenges of using RDFa arise from the fact that you have to weave (1) a particular graph structure into (2) an HTML tree following (3) the structures given by a certain vocabulary. I personally think that a mechanism like itemref that explicitly relies on the DOM tree for references between content elements in the HTML document is intellectually much simpler to grasp than linking the elements by means of a unique ID. In Microdata, you hardly need the notion of a global entity identifier (hence the limited relevance of @itemid). I think that is a plus.

As for my missing examples: I cannot copy-and-paste the respective code, since it is part of a development project and I do not have the time to document all the experiments I carried out until I met all requirements. What I have to admit, frankly, is that I find the property copying mechanism not very appealing, from the pure look-and-feel, despite its potentially valid mechanics.

Martin