Why should you use JSON-LD?

This is a question that we've been getting more and more often now that Gmail has adopted JSON-LD in a big way. A recent thread on Twitter asked the question again, so I thought I'd take a bit of time to lay out the basic rationale more directly. Here's the thread that started it:

robinberjon: @bertails I’m still struggling to figure out why I’d add the LD part of JSON-LD, other than for brownie points. What do I get?

The fundamental questions that you should ask yourself when thinking about using JSON-LD are these:

1. Do I want this data to be open and published in a non-proprietary format?
2. Is interoperability important to me? Will the data be useful to people outside of the community it was intended for?
3. Is it important that the data format be extensible? Should anyone be able to add their own sort of data to each entry?

If the answer to the first two questions is no, then don't use JSON-LD - it'll unnecessarily complicate your application. JSON-LD is primarily for systems that are publishing data that are intended to be interoperable with other systems and communities. It's also designed to be used if decentralized extensibility (the ability to extend and remix data) is important to you.

manusporny: @robinberjon @bertails @tobie If you're just building a data silo that doesn't need to interoperate w/ others - don't bother w/ Linked Data

manusporny: @robinberjon @bertails @tobie Interoperability with an ecosystem. If you don't need interoperability, you don't need JSON-LD.

manusporny: @robinberjon @bertails @tobie For example, good use cases for JSON-LD: Web Payments, Web Keys, Activity Streams, preference storage, etc

robinberjon: @manusporny So, say github.com/tobie/specref switched to JSON-LD. What would happen that isn’t happening already? @bertails @tobie

The website that Robin mentions is a repository for all of the references to the specifications that constitute the Internet and the Web.

We typically refer to these documents (and link to them) when we write new technology for the Web, so there is an argument there that this is information that a number of people distributed across the Web would find useful. Saying that, though, might be a bit of a stretch. They don't /have/ to use JSON-LD for the Web Service to accomplish its primary goal.

It's important to note that the specref data belongs to a particular class of data that basically "lives" on the Web, a class that could be construed as Linked Data. You can typically tell this sort of data apart from other data because it uses URLs extensively. So, it would be natural to publish it as JSON-LD, but it may not be necessary to do so to achieve the primary goal.

Now the question is, what does using JSON-LD do to this data?

One of the first things that it could do is to re-use the hard work done by the library communities over the last decade. There is already a vocabulary that exists for expressing bibliographic indexes called bibo. It contains terms to identify things like the title of the work, date it was published, authors, etc.

Currently, instead of re-using a standard way of publishing this sort of data, the specref site has decided to create their own terminology. Again, that's not a terrible thing in and of itself, it just means that nobody else will be able to map the specref data to bibliography entries without some work.

That said, with JSON-LD, it's easy enough to do that mapping by applying a JSON-LD context to the specref JSON. Unfortunately, the core data is structured such that the top-level indexing is going to be an issue when it comes to JSON-LD. So, even if someone wanted to translate the specref data, they'd have to take the data and shove it into another object to translate it to JSON-LD. Not a big deal, but it would be easier if they could just use the specref data as JSON-LD.
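To make the "shove it into another object" step concrete, here is a minimal sketch in Python. It is not the specref code. The entry shape (`title`, `href`, `authors`), the `to_jsonld` helper, the `identifier` key, and the example.org URL are all assumptions made for illustration, and the mapping onto Dublin Core terms is just one plausible choice (a real mapping would probably lean on bibo as well).

```python
import json

# Hypothetical specref-style data: the top-level keys are reference IDs,
# which is the "top-level indexing" that plain JSON-LD can't describe.
specref = {
    "EXAMPLE-SPEC": {
        "title": "An Example Specification",
        "href": "http://example.org/specs/example-spec",
        "authors": ["A. Author"],
    },
}

# Illustrative context: maps the assumed keys to Dublin Core terms and
# aliases "href" to "@id" so each entry is identified by its URL.
CONTEXT = {
    "title": "http://purl.org/dc/terms/title",
    "authors": "http://purl.org/dc/terms/creator",
    "identifier": "http://purl.org/dc/terms/identifier",
    "href": "@id",
}


def to_jsonld(index):
    """Wrap an ID-keyed index in an object with @context and @graph."""
    nodes = []
    for ref_id, entry in sorted(index.items()):
        node = dict(entry)            # copy so the input is left untouched
        node["identifier"] = ref_id   # keep the old top-level key as data
        nodes.append(node)
    return {"@context": CONTEXT, "@graph": nodes}


print(json.dumps(to_jsonld(specref), indent=2))
```

The entries themselves are untouched apart from the added `identifier`; all of the Linked Data meaning lives in the context, which is why the translation is cheap once the top-level index is out of the way.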

So, the specref JSON basically re-invents the wheel that was created by the library community over the past decade or so (you could argue centuries, but that would be approaching hyperbole :P). It is also not compatible w/ JSON-LD because it uses indexes at the top level, which is a minor annoyance.

Not using JSON-LD also means that anyone that wants to use the data for something else is going to have to write a specific application to convert the data into another format. Even if they convert it, they don't know if what specref means by 'author' is what their application means by 'author'. Now, it may be the case that the specref data isn't really that useful to anyone but the people that created the site. In that case, no need to use JSON-LD.

However, if other folks want to create applications on top of the specref site and tie that data in with something else, then publishing in JSON-LD would allow them to use a number of RDF toolchains available in a variety of languages to import the data, process it, and output something that is useful to another community.

It would also allow them to annotate the information since the subject of the bibliography entry is identified clearly using a URL. This is the decentralized extensibility aspect of JSON-LD. Decentralized extensibility is an open world assumption where, when you publish data, you expect other people to build upon it. The specref site makes no such affordance currently, so in order to extend the data, you have to write something specific to the specref site.
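As a rough sketch of what that kind of extension could look like: if each entry is identified by a URL, somebody else can publish statements about the same URL and a consumer can fold the two sources together. The code below is a naive merge keyed on `@id`, not a real JSON-LD or RDF processing algorithm, and the `merge_by_id` helper, the example.org URLs, and the `reviewedBy` property are hypothetical.

```python
def merge_by_id(*node_lists):
    """Naively combine nodes that share an @id, collecting values per property."""
    merged = {}
    for nodes in node_lists:
        for node in nodes:
            target = merged.setdefault(node["@id"], {"@id": node["@id"]})
            for key, value in node.items():
                if key == "@id":
                    continue
                values = target.setdefault(key, [])
                # Treat single values and lists alike, skipping duplicates.
                for item in value if isinstance(value, list) else [value]:
                    if item not in values:
                        values.append(item)
    return merged


SPEC = "http://example.org/specs/example-spec"

# What the original publisher says about the entry.
published = [{"@id": SPEC, "http://purl.org/dc/terms/title": "An Example Specification"}]

# What an unrelated third party adds, using its own (hypothetical) vocabulary.
annotations = [{"@id": SPEC, "http://example.org/vocab#reviewedBy": "http://example.org/people/alice"}]

combined = merge_by_id(published, annotations)
print(combined[SPEC])
```

Neither party had to coordinate with the other, and nothing site-specific was written. The shared URL and the full property names are what let the two sources line up.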

So, here's what going to JSON-LD gets the specref site:

1. It re-uses existing vocabularies, instead of creating yet another bibliography database vocabulary.
2. It allows people to re-use the data in a standards-compliant way.
3. It enables decentralized extensibility, which will make the data more valuable over time.

Then again, I understand that it's a pain in the ass to think about "the bigger picture" when you're just trying to solve a specific problem, and it's not clear whether or not there is a bigger picture. JSON-LD isn't for everyone... it's only for a subset of people that are building distributed systems using JSON.
