The OECD (Organisation for Economic Co-operation and Development) has released a white paper with the refreshingly jargon-free title “We need publishing standards for datasets and data tables” (.pdf): http://dx.doi.org/10.1787/603233448430.
Part of the paper covers publishing issues, especially the options for OECD participation in CrossRef’s DOI system and why that is harder for datasets than for e-journal articles (primarily because many datasets are dynamic, whereas e-journal articles are static).
For all you metadata geeks out there, there is a section with a proposed metadata set for data. It is pretty bare-bones, kind of on a par with simple Dublin Core for online and digital information.
It accommodates parent-child relationships (for example, a table linked to its parent dataset, which is linked in turn to its parent collection of datasets). That seemed to be important to the OECD.
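To make the parent-child idea concrete, here is a minimal sketch in Python. The field names (`identifier`, `kind`, `parent_id`) and the sample titles are my own invention for illustration, not taken from the OECD paper; the point is only that each record stores a link to its parent, so a table can be traced back up to its dataset and collection.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Record:
    """One metadata record. All field names here are hypothetical."""
    identifier: str
    title: str
    kind: str                        # "collection", "dataset", or "table"
    parent_id: Optional[str] = None  # None for a top-level collection


def ancestry(record_id: str, records: Dict[str, Record]) -> List[str]:
    """Follow parent links upward; return titles from the record to its root."""
    chain: List[str] = []
    current = records.get(record_id)
    while current is not None:
        chain.append(current.title)
        current = records.get(current.parent_id) if current.parent_id else None
    return chain


# Made-up sample records: a table inside a dataset inside a collection.
records = {
    r.identifier: r
    for r in [
        Record("c1", "Example statistics collection", "collection"),
        Record("d1", "Example unemployment dataset", "dataset", parent_id="c1"),
        Record("t1", "Unemployment by country", "table", parent_id="d1"),
    ]
}

print(ancestry("t1", records))
```

Storing only the parent link keeps each record small, and the full table-to-collection chain can still be rebuilt on demand.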
There was a field called “variable index.” I was wondering whether that would be something like the metadata in Lexis-Nexis Statistical, which allows you to search or browse by data breakdowns such as geographic region (e.g. “by country,” “by city”), demographics (e.g. “by age,” “by ethnicity,” “by educational attainment”) or economic categories (e.g. “by industry,” “by occupation”). That is really, really useful.
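If “variable index” does work the way I am guessing, searching by breakdown amounts to an inverted index from breakdown labels to tables. The sketch below is only an illustration of that guess: the table titles, the `build_index` and `find` helpers, and the tagging scheme are all hypothetical, and none of it reflects how the OECD proposal or Lexis-Nexis Statistical is actually implemented.

```python
from collections import defaultdict
from typing import Dict, List, Set

# Hypothetical tables, each tagged with the breakdowns it offers.
tables: Dict[str, List[str]] = {
    "Unemployment rates": ["by country", "by age"],
    "Average wages": ["by industry", "by occupation"],
    "Population estimates": ["by country", "by city", "by age"],
}


def build_index(tagged: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Invert table -> breakdowns into breakdown -> set of table titles."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for title, breakdowns in tagged.items():
        for breakdown in breakdowns:
            index[breakdown].add(title)
    return index


def find(index: Dict[str, Set[str]], *breakdowns: str) -> List[str]:
    """Return the titles that offer every requested breakdown, sorted."""
    if not breakdowns:
        return []
    matches = set.intersection(*(index.get(b, set()) for b in breakdowns))
    return sorted(matches)


index = build_index(tables)
print(find(index, "by country", "by age"))
```

Asking for several breakdowns at once narrows the result to tables that carry all of them, which is the browse behavior I find so useful.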
There is also a suggestion to use a thesaurus of controlled key terms to describe datasets. The one suggested is called “JEL,” but it’s not spelled out anywhere. Is that the “Journal of Economic Literature” thesaurus? Is that a common one to use if you’re an economist?
The OECD paper proposes that the metadata include a field called “periodicity,” which is mandatory in some cases and optional in others. I wasn’t sure what that meant. Does it mean that the data is available on a yearly basis, possibly in different files or datasets, or that it is presented in the described dataset with yearly rows or columns? It seems to be the former, because this metadata field is considered “not appropriate” in the case of static tables.
In Lexis-Nexis Statistical, the latter type of metadata is supported. It means you can search or browse for a chart or table of information, say unemployment statistics, presented in a single view “by year,” “by quarter” or “by month.” Invaluable feature!