JSON structure for metadata import/export

file version: 10 Feb 2023

current software versions: slk version 3.3.83; slk_helpers version 1.7.4

Introduction

A few commands of `slk` and slk_helpers use JSON to represent metadata. These commands are:

  • slk tag -display

  • slk_helpers hsm2json

  • slk_helpers json2hsm

The first two commands extract the metadata of one or more files from StrongLink and print it as JSON. The last command takes a JSON file and writes the metadata it contains into the metadata schemata of the selected files.

This help page describes the JSON structure. For help on using slk tag -display, slk_helpers hsm2json and slk_helpers json2hsm, please see the other documentation pages.

The JSON structure might change over time. Therefore, we version it. The current version is 2.1.1.
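
Because the format is versioned, a script that reads these JSON files may want to check the version before using a record. The following is a minimal sketch, not part of slk or slk_helpers. It assumes the version is read from the provenance.format_version field of each record (described in the format section on this page). The names SUPPORTED_MAJOR, parse_version and is_supported are hypothetical, and treating all versions with the same major number as compatible is an assumption, not a documented guarantee.

```python
# Hypothetical helper names; same-major compatibility is an assumption.
SUPPORTED_MAJOR = 2


def parse_version(text):
    """Turn a version string such as "2.1.1" into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def is_supported(record):
    """Return True if the record states a format version with a supported major number."""
    version = record.get("provenance", {}).get("format_version")
    if version is None:
        # provenance fields are optional, so the version may be missing
        return False
    return parse_version(version)[0] == SUPPORTED_MAJOR


example = {"provenance": {"format_version": "2.1.1"}}
print(is_supported(example))
```

Records without a format_version are rejected here; a script could just as well accept them and assume the current version.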

Format version 2.1.1

[
  {
    "location": "hsm",
    "path": "/arch/ab1234/c567890/test.txt",
    "id": 12345678,
    "protocol": "slk",
    "mime_type": "text/plain",
    "tags": {
      "document.Characters": 42,
      "netcdf_header.Sdate": "2000-01-03"
    },
    "provenance": {
      "format_version": "2.1.1",
      "software": "slk_helpers",
      "softwareVersion": "1.7.4",
      "fingerPrint": 2871504,
      "programCall": ""
    }
  },
  { ... NEXT RECORD ... },
  { ... NEXT RECORD ... }
]
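
A file in this layout can be produced with any JSON library. The sketch below uses Python's standard json module to build one record with the fields shown above and serialize a list of records. The function make_record and the software name "example_script" are hypothetical and only serve as an illustration; the values are copied from the example above.

```python
import json


def make_record(path, resource_id, mime_type, tags):
    """Build one metadata record in the layout shown above (hypothetical helper)."""
    return {
        "location": "hsm",
        "path": path,
        "id": resource_id,
        "protocol": "slk",
        "mime_type": mime_type,
        # keys in "tags" follow the pattern "SCHEMA_NAME.FIELD_NAME"
        "tags": dict(tags),
        "provenance": {
            "format_version": "2.1.1",
            "software": "example_script",  # assumption: name of the generating tool
        },
    }


records = [
    make_record(
        path="/arch/ab1234/c567890/test.txt",
        resource_id=12345678,
        mime_type="text/plain",
        tags={"document.Characters": 42, "netcdf_header.Sdate": "2000-01-03"},
    )
]

# The top-level JSON element is a list of records.
json_text = json.dumps(records, indent=2)
print(json_text)
```

To write the result to a file instead of printing it, json.dump(records, file_object, indent=2) can be used with an opened file.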

Note

NetCDF files have different MIME types depending on the underlying format. NetCDF classic files are application/x-netcdf and netCDF4 files are application/x-hdf. If you observe different MIME types, please contact support@dkrz.de.
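
As a small illustration of this note, the sketch below selects the records whose mime_type matches one of the two values named above. The function netcdf_records and the sample paths are hypothetical. Keep in mind that application/x-hdf may also match HDF files that are not netCDF, so this filter is only a first approximation.

```python
# MIME types taken from the note above.
NETCDF_MIME_TYPES = {"application/x-netcdf", "application/x-hdf"}


def netcdf_records(records):
    """Return the records whose mime_type suggests a NetCDF file (hypothetical helper)."""
    return [r for r in records if r.get("mime_type") in NETCDF_MIME_TYPES]


# Hypothetical sample records, reduced to the fields needed here.
sample = [
    {"path": "/arch/ab1234/c567890/classic.nc", "mime_type": "application/x-netcdf"},
    {"path": "/arch/ab1234/c567890/modern.nc", "mime_type": "application/x-hdf"},
    {"path": "/arch/ab1234/c567890/test.txt", "mime_type": "text/plain"},
]

selected = netcdf_records(sample)
print([r["path"] for r in selected])
```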

Details:

  • id: resource ID of the file in StrongLink/HSM; either id or path has to be set for the metadata import via slk_helpers json2hsm

  • path: absolute path of the resource including filename

  • location: indicates on which storage the file is stored; intended for use in workflows, which is not implemented yet

  • protocol: needed when intake catalogues are created; should be slk if files are stored in the HSM

  • mime_type: MIME type of the file

  • tags: contains all metadata fields that should be set; each field has to be provided as "SCHEMA_NAME.FIELD_NAME": VALUE

  • provenance: provenance information of this JSON record (not of the data file or of the metadata). All fields in provenance are optional. However, we strongly recommend setting at least format_version,