Based on that last comment, Gary, it seems to me that you have two different situations at level 3, the actual object storage: supporting sites and non-supporting sites.
For example, non-supporting sites include (at the moment, until and unless they add support) Shapeways, physible.com, Thingiverse, etc. Supporting sites include GitHub and private hosting that serve files following the format.
So if someone wanted to host their data on Thingiverse, keep it in only one place, and still have it found by trackers before Thingiverse implements support, then they need to be able to host a file (JSON, as we've been discussing) that contains in itself all the relevant metadata for their Thingiverse entries. Based on Mark's example, this would be one file containing data on all the models they are tracking on Thingiverse.
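To make the single-file case concrete, here is a minimal sketch of what such a tracker document might look like. Mark's actual schema isn't reproduced in this thread, so the field names ("things", "title", "url", "description", "license") are illustrative assumptions only, not his format:

```python
import json

# Hypothetical single-file tracker for a Thingiverse user: ALL metadata
# for every model lives inline in this one JSON document, so no further
# crawling of Thingiverse itself is needed.  Field names are assumed.
tracker = {
    "things": [
        {
            "title": "Parametric Gear",
            "url": "https://www.thingiverse.com/thing:1",
            "description": "A fully parametric involute gear.",
            "license": "CC-BY-SA",
        },
        {
            "title": "Phone Stand",
            "url": "https://www.thingiverse.com/thing:2",
            "description": "Snap-together phone stand.",
            "license": "CC-BY",
        },
    ]
}

print(json.dumps(tracker, indent=2))
```

The point is simply that one document carries everything a tracker needs, at the cost of the owner keeping it in sync with their Thingiverse entries by hand.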
Now, if someone else has a GitHub repository, it would make a lot more sense for them to have one tracker file with URLs, maybe names, and very, very little else; and then to have a second file for each item (perhaps the one pointed at by the URL, perhaps one linked from that HTML page in a canonical or recognizable way, perhaps one tagged in some secondary way in the initial JSON file), stored with that item, so that they can comfortably edit it when they edit the item and not have to do additional work.
Given a reasonable standard for that file (such as Mark's JSON format, but with only one entry per file), a script could easily be written to create and update the primary file that is submitted to, or linked to by, other trackers.
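A sketch of such a script, under the same assumptions as before (per-item files each carry hypothetical "title" and "url" fields; the primary file wraps a list under a "things" key; none of these names come from Mark's actual schema):

```python
import glob
import json
import os

def build_primary_tracker(item_dir, out_path):
    """Collect every per-item JSON file under item_dir into one
    lightweight primary tracker of names and URLs.

    Only "title" and "url" are copied into the primary file; the heavy
    metadata stays in the per-item file stored alongside the item.
    """
    entries = []
    for path in sorted(glob.glob(os.path.join(item_dir, "*.json"))):
        with open(path) as f:
            item = json.load(f)
        entries.append({"title": item.get("title"), "url": item.get("url")})
    with open(out_path, "w") as f:
        json.dump({"things": entries}, f, indent=2)
    return entries
```

Run from a commit hook or a cron job, this keeps the primary file current without the owner editing it by hand, which is the whole appeal of the GitHub case.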
So based on that, it seems to me that we want both: an utterly simple link or a complete set of metadata inline, but also a sort of import by reference.
All this, it seems to me, is pretty closely handled by Mark's format already, assuming (not being familiar with JSON) that the attributes in the schema preceded by an asterisk are optional? There is only one major exception I see, which is probably important to get in early: some indicator at the initial link level as to whether the link should be followed as part of building an index, searching, etc.
That is, if we are pointing at someone's site that prohibits crawlers, or merely one where the linked file is not in a useful format, we need to be able to say whether or not the URL provided should be pulled and processed for further metadata.
In particular, in the GitHub case it would be nice to be able to tell a search engine or other crawler that it really must process this URL to have a complete entry, forcing it to pull in the linked per-item JSON files; and likewise, in the Thingiverse case, to be able to say "I've included all the necessary metadata right here, please don't crawl further and annoy my hosting provider." Or, in the case of privately hosted files, if someone provides a service that lets you upload tracker files publicly, it would be nice to be able to say "I've included everything in this tracker; no need to crawl the linked URLs and push me over my bandwidth cap. Let's save that for the interested parties actually looking for my particular item."
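One way that indicator could look, as a sketch: a per-entry boolean flag (here called "crawl", a name I'm making up, not part of Mark's schema) that a crawler checks before following any link.

```python
import json

def urls_to_fetch(tracker_json):
    """Return only those entry URLs a crawler is asked to follow.

    A hypothetical "crawl" flag on each entry: true means the crawler
    must fetch the linked URL to complete the entry (the GitHub case);
    false or absent means everything is inline, don't follow the link
    (the Thingiverse / bandwidth-capped case).
    """
    tracker = json.loads(tracker_json)
    return [t["url"] for t in tracker["things"] if t.get("crawl", False)]

example = json.dumps({
    "things": [
        # GitHub-style stub: metadata lives in the linked per-item file.
        {"title": "Gear", "url": "https://example.com/gear.json",
         "crawl": True},
        # Thingiverse-style entry: everything inline, leave the link alone.
        {"title": "Stand", "url": "https://www.thingiverse.com/thing:2",
         "description": "All metadata inline.", "crawl": False},
    ]
})

print(urls_to_fetch(example))  # → ['https://example.com/gear.json']
```

Defaulting the flag to "don't crawl" when absent is the conservative choice for sites that prohibit crawlers, though the default could reasonably go either way.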