With pilot program, Google now allows publishers to describe CSV and other tabular datasets

Google is now making use of the schema.org/Dataset class as a means by which publishers can describe datasets to the search engine.

Google's focus here is on the provision of structured data markup to describe what a dataset is about - with the aim of improving "data discovery, leading scientists to the information they need for their work."

To this end, a download link to the actual dataset is not required, but is supported through the schema.org/distribution property (with expected type schema.org/DataDownload) when a download is available.

Similarly, if the dataset being described is part of a larger dataset repository, this information can be provided using schema.org/includedInDataCatalog (with expected type DataCatalog).
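As a rough illustration of how these two optional properties fit together, here is a minimal sketch in Python that builds JSON-LD for a Dataset. The helper name, the dataset name and description, and the example.com URLs are all made up for illustration, and the use of contentUrl to carry the download link is my assumption about how a DataDownload would typically be filled in, not something taken from Google's documentation.

```python
import json


def build_dataset_markup(name, description, download_url=None, catalog_name=None):
    """Build a schema.org/Dataset description as a plain dict (hypothetical helper)."""
    markup = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
    }
    # A download link is optional: only add distribution when one exists.
    if download_url is not None:
        markup["distribution"] = {
            "@type": "DataDownload",
            # Assumption: contentUrl carries the link to the file itself.
            "contentUrl": download_url,
        }
    # Likewise, only mention a catalog when the dataset belongs to one.
    if catalog_name is not None:
        markup["includedInDataCatalog"] = {
            "@type": "DataCatalog",
            "name": catalog_name,
        }
    return markup


# Illustrative values only; none of these come from a real dataset.
example = build_dataset_markup(
    name="Example rainfall measurements",
    description="Hypothetical monthly rainfall totals, published as CSV.",
    download_url="https://example.com/data/rainfall.csv",
    catalog_name="Example Data Repository",
)

# The printed JSON would go inside a <script type="application/ld+json"> element.
print(json.dumps(example, indent=2))
```

Calling the helper without a download URL or catalog name simply leaves those properties out, which matches the point that neither is required.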

Currently "Dataset markup is available for you to experiment with before it's released to general availability", with previews appearing in the Structured Data Testing Tool. Here's what the Testing Tool currently returns for this markup:

As you can see, it does indeed validate in the Testing Tool - with the exception of the schema.org/variablesMeasured property, which hasn't yet been published (it is currently in pending, at http://pending.schema.org/variablesMeasured).

I find it a little odd that for Dataset the sameAs property, described in this context as "Other URLs that can be used to access the dataset page", is required (and I find that use of sameAs a little odd too). The "Location of a page describing the dataset" is already declared using the url property: what if that's the only page describing the dataset?
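To make that concern concrete, here is a small sketch of the single-page case. The function name and the example.com URL are hypothetical, and whether Google would accept a sameAs that simply repeats url is not something I know; the sketch only shows that, with one page, there is nothing new for sameAs to say.

```python
def redundant_same_as(markup):
    """Return the sameAs values that merely repeat the url value (hypothetical check)."""
    same_as = markup.get("sameAs", [])
    # sameAs may be a single string or a list of strings; normalise to a list.
    if isinstance(same_as, str):
        same_as = [same_as]
    return [value for value in same_as if value == markup.get("url")]


# A dataset described by exactly one page: the publisher has no other URL
# to offer, so the required sameAs can only duplicate url.
single_page = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Example rainfall measurements",
    "url": "https://example.com/datasets/rainfall",
    "sameAs": "https://example.com/datasets/rainfall",
}

print(redundant_same_as(single_page))
```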

See also:
Improving Dataset descriptions · Issue #1083

A final note regarding that issue and the variablesMeasured property now in pending: Google accepts either text or a URL as a value here, whereas the property in pending currently has an expected type of only text.