Reference: metadata schemata

file version: 08 Dec 2023

Introduction

Files can be searched, found and retrieved based on their metadata. The metadata are stored in metadata fields. Each metadata field is part of one metadata schema. A search is defined by a JSON-formatted search query (JSON = JavaScript Object Notation) that is submitted to StrongLink via slk search. A search operation is a key-value pair surrounded by {} (e.g. '{"key": "value"}'). The key can be a metadata field (e.g. resources.name or document.Author) or an operator (e.g. $gt (>) or $and (logical and)).
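As a minimal sketch of how such a query string could be assembled, the following Python snippet builds a search query from key-value pairs and serializes it to JSON. The helper functions are hypothetical and exist only for illustration. The field name resources.size and the way the operators $gt and $and are nested are assumptions based on common JSON query conventions; the authoritative syntax is described on the page Reference: StrongLink query language.

```python
import json


def field_equals(field, value):
    """Search operation matching a metadata field against a value."""
    return {field: value}


def field_greater_than(field, value):
    """Search operation using the $gt (>) operator (assumed nesting)."""
    return {field: {"$gt": value}}


def all_of(*operations):
    """Combine several search operations with $and (assumed nesting)."""
    return {"$and": list(operations)}


# Hypothetical query: files by a given author that are larger than 1 MiB.
query = all_of(
    field_equals("document.Author", "Max Mustermann"),
    field_greater_than("resources.size", 1048576),
)

# Serialize to the JSON string that would be handed to the search command.
query_string = json.dumps(query)
print(query_string)
```

The printed string would then be passed to slk search, enclosed in single quotes so that the shell does not interpret the double quotes or the $ characters.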

See also

Details on search queries and operators are provided on the page Reference: StrongLink query language. Several example search queries are provided in slk Usage Examples. Please see the page Metadata for a general overview.

Basic file metadata, e.g. owner and size, are automatically extracted from any archived file and stored in the schema resources. Depending on the file type, additional file-type-specific metadata are automatically extracted, too. At the moment, this feature is enabled for files of type netCDF, with the corresponding schemata netcdf and netcdf_header detailed below. All metadata stored in the netcdf and netcdf_header metadata schemata can be manually modified by the user via slk tag (see Set metadata). All global attributes and their values of a netCDF file are stored read-only in the field netcdf.Data.

A resource might be associated with more than one metadata schema, e.g. document, example_schema_abc and example_schema_xyz. A metadata field of the same name might exist in several metadata schemata, e.g. document.Author, example_schema_abc.Author and example_schema_xyz.Author. These fields are independent of each other: a file associated with these three exemplary metadata schemata might have three different values for *.Author, e.g. document.Author: "Max Mustermann", example_schema_abc.Author: "Maxima Musterfrau" and example_schema_xyz.Author: "Mr. and Mrs. Muster". Changing a field in one schema does not change the field of the same name in another schema. Thus, if netcdf.Title is modified manually, netcdf_header.Title should be modified manually as well.
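The behaviour of same-named fields in different schemata can be pictured with a small Python sketch. The nested dictionary and the helper set_field are purely illustrative assumptions; they do not reflect how StrongLink stores metadata internally or how slk tag works.

```python
# Metadata of one hypothetical resource: schema name -> field name -> value.
metadata = {
    "document": {"Author": "Max Mustermann"},
    "example_schema_abc": {"Author": "Maxima Musterfrau"},
    "example_schema_xyz": {"Author": "Mr. and Mrs. Muster"},
    "netcdf": {"Title": "old title"},
    "netcdf_header": {"Title": "old title"},
}


def set_field(metadata, qualified_name, value):
    """Set one field given as 'schema.Field'; other schemata are untouched."""
    schema, field = qualified_name.split(".", 1)
    metadata.setdefault(schema, {})[field] = value


# Changing netcdf.Title alone leaves netcdf_header.Title unchanged ...
set_field(metadata, "netcdf.Title", "new title")
titles_after_first_update = (
    metadata["netcdf"]["Title"],
    metadata["netcdf_header"]["Title"],
)
print(titles_after_first_update)  # ('new title', 'old title')

# ... so the second field has to be updated explicitly as well.
set_field(metadata, "netcdf_header.Title", "new title")
print(metadata["netcdf_header"]["Title"])  # new title
```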

Warning

The names of metadata fields are case sensitive.

schema: netcdf

This schema contains core metadata of each archived netCDF file and summarizes the most relevant information of the file for human viewers. Please see the table below for available metadata fields. The column Source indicates where the information comes from (see Details below). Except for the field netcdf.Errata, all fields are filled automatically if the corresponding attributes are used in the netCDF file. This will not update the archived netCDF file. The content of all metadata fields can be edited manually with slk tag and slk_helpers json2hsm.

Details:

  • Most metadata fields are filled with the content of global attributes. These are indicated by : as the first character in the column Source of the table below. Some metadata fields are mapped to more than one global attribute. If (concatenate) is written at the end of the column Source, the contents of these attributes (when available) are concatenated into a comma-separated list. If (concatenate) is not present, the leftmost listed global attribute that is available in the netCDF file is used as the source for the respective metadata field.

  • netcdf.Time_Min and netcdf.Time_Max are extracted from the time variable. Currently, only a variable with the name time is recognized as time variable.

  • netcdf.Var_Long_Name and netcdf.Var_Std_Name contain a comma-separated list of the values of the variable attributes long_name and standard_name, respectively. Coordinate and auxiliary coordinate variables are also considered.

  • netcdf.Var_Name is a comma-separated list of all variables in the netCDF file.
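The two mapping rules for global attributes described above (use the leftmost available attribute, or concatenate all available attributes) can be sketched in Python as follows. This is an illustration of the rule only, not the actual import code. The function name, the example attribute values, the exact-match lookup of attribute names and the separator ", " are assumptions.

```python
def resolve_field(global_attrs, sources, concatenate=False):
    """Return the value of a metadata field from a list of source attributes.

    sources is ordered as in the column Source (leftmost first).
    Returns None if none of the source attributes is present.
    """
    available = [str(global_attrs[name]) for name in sources if name in global_attrs]
    if not available:
        return None
    if concatenate:
        # (concatenate): join all available attributes into one list.
        return ", ".join(available)
    # Otherwise: the leftmost listed attribute that is available wins.
    return available[0]


# Hypothetical global attributes of a netCDF file.
attrs = {
    "institute_id": "EXAMPLE-INST",
    "centre": "EXAMPLE-CENTRE",
    "contact": "Jane Doe",
    "contact_email": "jane.doe@example.org",
}

institution_id = resolve_field(
    attrs, ["institution_id", "institute_id", "centre", "center"]
)
contact = resolve_field(attrs, ["contact", "contact_email"], concatenate=True)
errata = resolve_field(attrs, [])

print(institution_id)  # EXAMPLE-INST
print(contact)         # Jane Doe, jane.doe@example.org
print(errata)          # None
```

In this sketch, :institution_id is missing from the file, so the next attribute in the list, :institute_id, provides the value of netcdf.Institution_Id, while :centre is ignored.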

Files with these MIME-types are considered as “netcdf”:

  • x-netcdf

  • x-hdf (if netCDF-4 format)

schema and metadata import as it should work

netcdf schema, global attribute mapping as it will be implemented in future

| Name | Type | Description | Source |
|---|---|---|---|
| netcdf.Creation_Date | string | Creation Date | :creation_date, :date_created |
| netcdf.Pid | string | ID of Data Set | :doi, :id, :tracking_id |
| netcdf.License | string | License | :license, :licence |
| netcdf.Creator | string | Creator | :creator, :originator, :creator_name, :creator_email, :creator_url, :creator_type, :creator_institution (concatenate) |
| netcdf.Project | string | Project Identifier | :project, :mip_era, :project_id |
| netcdf.Institution | string | Institution | :institution, :institute |
| netcdf.Institution_Id | string | Institution Identifier | :institution_id, :institute_id, :centre, :center |
| netcdf.Source | string | Source | :source, :model |
| netcdf.Realm | string | Realm | :realm, :model_realm |
| netcdf.Experiment_Id | string | Experiment Identifier | :experiment_id, :expid, :exp_id |
| netcdf.External_Description | string | External_Description | :metadata_link, :further_info_url |
| netcdf.Contact | string | Contact | :contact, :contact_email (concatenate) |
| netcdf.Errata | string | Errata | |
| netcdf.Model_Git_Hash | string | Model Git Hash | :git_hash, :model_version, :hash |
| netcdf.Title | string | Title of Data Set | :title, :titel |
| netcdf.Var_Long_Name | string | Var Long Name | concatenation of the long_name attribute of all variables (comma separated) |
| netcdf.Var_Name | string | Var Name | concatenation of all variable names (comma separated) |
| netcdf.Var_Std_Name | string | Var Std Name | concatenation of the standard_name attribute of all variables (comma separated) |
| netcdf.Time_Min | date | Time Min | first value of time variable converted based on the units attribute |
| netcdf.Time_Max | date | Time Max | last value of time variable converted based on the units attribute |
| netcdf.Type | string | Type | :type, :dataType |
| netcdf.Class | string | Class | :class |
| netcdf.Data | special | all global attributes | :* |
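To illustrate what "converted based on the units attribute" means for netcdf.Time_Min and netcdf.Time_Max, the following Python sketch decodes the first and last value of a time variable whose units attribute has the form "UNIT since DATE". It is a simplified illustration under stated assumptions: the function name and the example values are hypothetical, only the units seconds, minutes, hours and days are handled, the reference date must be in ISO format, and the standard (Gregorian) calendar is assumed. How the archive performs this conversion, and how it treats other calendars, is not documented here.

```python
from datetime import datetime, timedelta

SECONDS_PER_UNIT = {"seconds": 1, "minutes": 60, "hours": 3600, "days": 86400}


def decode_time(value, units):
    """Convert a numeric time value to a datetime using 'UNIT since DATE'."""
    unit, since, reference = units.split(maxsplit=2)
    if since != "since" or unit not in SECONDS_PER_UNIT:
        raise ValueError(f"unsupported units attribute: {units!r}")
    origin = datetime.fromisoformat(reference)
    return origin + timedelta(seconds=value * SECONDS_PER_UNIT[unit])


# Hypothetical content of a variable named time and its units attribute.
time_values = [0.0, 0.5, 31.0]
units = "days since 2000-01-01 00:00:00"

time_min = decode_time(time_values[0], units)   # first value -> Time_Min
time_max = decode_time(time_values[-1], units)  # last value -> Time_Max

print(time_min.isoformat())  # 2000-01-01T00:00:00
print(time_max.isoformat())  # 2000-02-01T00:00:00
```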

schema: netcdf_header

This schema contains more than 100 metadata fields. It is intended to cover the most common netCDF metadata and to allow for automated evaluation. All global attributes that have the same name as a metadata field are ingested automatically. The mapping is case-insensitive. Please see the table below for available metadata fields. All of these metadata fields can also be filled manually with slk tag and slk_helpers json2hsm. This will not update the archived netCDF file. In the background, all global attributes are stored but cannot be searched by users.
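The case-insensitive mapping of global attributes to netcdf_header fields can be sketched as follows. This is a minimal illustration, not the actual ingestion code: the function name and the example attributes are hypothetical, and only a handful of the field names from the table below are included.

```python
# A small subset of the netcdf_header field names (without schema prefix).
HEADER_FIELDS = ["Experiment_Id", "Title", "Tracking_Id", "Mip_Era", "Var-List"]

# Lookup from lower-cased name to the field name as spelled in the schema.
LOOKUP = {name.lower(): name for name in HEADER_FIELDS}


def map_global_attributes(global_attrs):
    """Map global attributes to netcdf_header fields, ignoring case."""
    mapped = {}
    for attr_name, value in global_attrs.items():
        field = LOOKUP.get(attr_name.lower())
        if field is not None:
            mapped[f"netcdf_header.{field}"] = value
    return mapped


# Hypothetical global attributes of a netCDF file.
attrs = {
    "experiment_id": "historical",
    "TITLE": "Example data set",
    "my_custom_note": "not part of the schema",
}

mapped = map_global_attributes(attrs)
print(mapped)
```

Note that the matching is case-insensitive only on the side of the netCDF attribute names. The resulting metadata field names, such as netcdf_header.Experiment_Id, remain case sensitive when used in search queries.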

If metadata fields that are important for your use case are missing, please contact us. We will collect these proposals for the next revision of the metadata schema.

Files with these MIME-types are considered as “netcdf_header”:

  • x-netcdf

  • x-hdf (if netCDF-4 format)

schema and metadata import as it should work

netcdf_header schema, global attribute mapping as it will be implemented in future

| Name | Type | Description |
|---|---|---|
| netcdf_header.Sub_Experiment_Id | string | Sub Experiment Id |
| netcdf_header.Summary | string | Summary |
| netcdf_header.Table_Id | string | Table Id |
| netcdf_header.Target_Mip | string | Target Mip |
| netcdf_header.Time_Coverage_Duration | string | Time Coverage Duration |
| netcdf_header.Time_Coverage_End | string | Time Coverage End |
| netcdf_header.Time_Coverage_Resolution | string | Time Coverage Resolution |
| netcdf_header.Time_Coverage_Start | string | Time Coverage Start |
| netcdf_header.Time_Min | date | Time Min |
| netcdf_header.Time_Max | date | Time Max |
| netcdf_header.Data_Specs_Version | string | Data Specs Version |
| netcdf_header.Title | string | Title |
| netcdf_header.Dataset_Category | string | Dataset Category |
| netcdf_header.Tracking_Id | string | Tracking Id |
| netcdf_header.Dataset_Version_Number | string | Dataset Version Number |
| netcdf_header.Date_Created | string | Date Created |
| netcdf_header.Date_Issued | string | Date Issued |
| netcdf_header.Date_Metadata_Modified | string | Date Metadata Modified |
| netcdf_header.Date_Modified | string | Date Modified |
| netcdf_header.Doi | string | Doi |
| netcdf_header.Experiment | string | Experiment |
| netcdf_header.Experiment_Id | string | Experiment Id |
| netcdf_header.Activity_Id | string | Activity Id |
| netcdf_header.Cdm_Data_Type | string | Cdm Data Type |
| netcdf_header.Channel_File_Type | string | Channel File Type |
| netcdf_header.Comment | string | Comment |
| netcdf_header.Contributor_Name | string | Contributor Name |
| netcdf_header.Conventions | string | Conventions |
| netcdf_header.Creation_Date | string | Creation Date |
| netcdf_header.Creator_Institution | string | Creator Institution |
| netcdf_header.Creator_Name | string | Creator Name |
| netcdf_header.Featuretype | string | Featuretype |
| netcdf_header.Forcing_Index | string | Forcing Index |
| netcdf_header.Frequency | string | Frequency |
| netcdf_header.Further_Info_Url | string | Further Info Url |
| netcdf_header.Gcm | string | Gcm |
| netcdf_header.Gcm_Horizontal_Mode | string | Gcm Horizontal Mode |
| netcdf_header.Gcm_Start_Date_Time | string | Gcm Start Date Time |
| netcdf_header.Gcm_Timestep | string | Gcm Timestep |
| netcdf_header.Gcm_Vertical_Mode | string | Gcm Vertical Mode |
| netcdf_header.Gdnam | string | Gdnam |
| netcdf_header.Geospatial_Bounds | string | Geospatial Bounds |
| netcdf_header.Geospatial_Lat_Max | string | Geospatial Lat Max |
| netcdf_header.Geospatial_Lat_Min | string | Geospatial Lat Min |
| netcdf_header.Geospatial_Lat_Resolution | string | Geospatial Lat Resolution |
| netcdf_header.Geospatial_Lat_Units | string | Geospatial Lat Units |
| netcdf_header.Geospatial_Lon_Max | string | Geospatial Lon Max |
| netcdf_header.Geospatial_Lon_Min | string | Geospatial Lon Min |
| netcdf_header.Geospatial_Lon_Resolution | string | Geospatial Lon Resolution |
| netcdf_header.Geospatial_Lon_Units | string | Geospatial Lon Units |
| netcdf_header.Geospatial_Vertical_Max | string | Geospatial Vertical Max |
| netcdf_header.Geospatial_Vertical_Min | string | Geospatial Vertical Min |
| netcdf_header.Geospatial_Vertical_Positive | string | Geospatial Vertical Positive |
| netcdf_header.Geospatial_Vertical_Resolution | string | Geospatial Vertical Resolution |
| netcdf_header.Geospatial_Vertical_Units | string | Geospatial Vertical Units |
| netcdf_header.Grid | string | Grid |
| netcdf_header.Grid_Label | string | Grid Label |
| netcdf_header.Grid_Resolution | string | Grid Resolution |
| netcdf_header.Id | string | Id |
| netcdf_header.Initialization_Index | string | Initialization Index |
| netcdf_header.Institution | string | Institution |
| netcdf_header.Institution_Id | string | Institution Id |
| netcdf_header.Instrument | string | Instrument |
| netcdf_header.Keywords | string | Keywords |
| netcdf_header.Lat_Min | string | Lat Min |
| netcdf_header.Lat_Max | string | Lat Max |
| netcdf_header.Level_Min | string | Level Min |
| netcdf_header.Level_Max | string | Level Max |
| netcdf_header.License | string | License |
| netcdf_header.Lon_Min | string | Lon Min |
| netcdf_header.Lon_Max | string | Lon Max |
| netcdf_header.Metadata_Link | string | Metadata Link |
| netcdf_header.Mip_Era | string | Mip Era |
| netcdf_header.Naming_Authority | string | Naming Authority |
| netcdf_header.Nominal_Resolution | string | Nominal Resolution |
| netcdf_header.Number_Of_Grid_Used | string | Number Of Grid Used |
| netcdf_header.Parent_Activity_Id | string | Parent Activity Id |
| netcdf_header.Parent_Experiment_Id | string | Parent Experiment Id |
| netcdf_header.Parent_Mip_Era | string | Parent Mip Era |
| netcdf_header.Parent_Source_Id | string | Parent Source Id |
| netcdf_header.Parent_Time_Units | string | Parent Time Units |
| netcdf_header.Parent_Variant_Label | string | Parent Variant Label |
| netcdf_header.Physics_Index | string | Physics Index |
| netcdf_header.Pid | string | Pid |
| netcdf_header.Platform | string | Platform |
| netcdf_header.Processing_Level | string | Processing Level |
| netcdf_header.Product | string | Product |
| netcdf_header.Product_Version | string | Product Version |
| netcdf_header.Program | string | Program |
| netcdf_header.Project | string | Project |
| netcdf_header.Project_Id | string | Project Id |
| netcdf_header.Tstep | string | Tstep |
| netcdf_header.User_Name | string | User Name |
| netcdf_header.Var-List | string | Var-List |
| netcdf_header.Var_Long_Name | string | Var Long Name |
| netcdf_header.Var_Name | string | Var Name |
| netcdf_header.Var_Std_Name | string | Var Std Name |
| netcdf_header.Variable_Id | string | Variable Id |
| netcdf_header.Realization_Index | string | Realization Index |
| netcdf_header.Variant_Label | string | Variant Label |
| netcdf_header.Realm | string | Realm |
| netcdf_header.Version | string | Version |
| netcdf_header.References | string | References |
| netcdf_header.Sdate | string | Sdate |
| netcdf_header.Source | string | Source |
| netcdf_header.Source_Id | string | Source Id |
| netcdf_header.Source_Type | string | Source Type |
| netcdf_header.Sub_Experiment | string | Sub Experiment |

schema: document

Files with these MIME-types are considered as “document”:

  • msword

  • pdf

  • vnd.ms-excel

  • vnd.ms-office

  • vnd.ms-powerpoint

  • vnd.openxmlformats-officedocument.presentationml.presentation

  • vnd.openxmlformats-officedocument.spreadsheetml.sheet

  • vnd.openxmlformats-officedocument.wordprocessingml.document

document schema

| Name | Type | Description |
|---|---|---|
| document.Author | String | An entity primarily responsible for creating the content of the resource |
| document.Title | String | Name or other identifier (such as email address) of person who created the document |
| document.Content creator | String | document’s creator: this could be the name of the application (e.g. OpenOffice) that created the original document |
| document.Version | String | Free-form version |
| document.Language | String | Language the document is written in |
| document.Last modified by | String | Name or other identifier (such as email address) of person who last modified the document |
| document.Revision | Int | Document revision number |
| document.Pages | Int | Number of pages |
| document.Paragraphs | Int | Number of paragraphs |
| document.Words | Int | Number of words |
| document.Characters | Int | Number of characters |
| document.Keywords | String | Keywords |
| document.Subject | String | Subject |
| document.Creation Date | String | Creation Date |

schema: image

Files with these MIME-types are considered as “image”:

  • application/gif

  • application/jpeg

  • application/png

  • application/tiff

  • application/x-ms-bmp

  • application/x-pcx

  • application/vnd.adobe.photoshop

image schema

| Name | Type | Description |
|---|---|---|
| image.Width | int | Image’s width in pixels |
| image.Height | int | Image’s height in pixels |
| image.Orientation | String | Orientation of the image (e.g. Landscape) |
| image.Compression | String | Compression format of the image |
| image.Bits/pixel | int | Number of bits per pixel |
| image.Pixel format | String | Color pixel format of the image |
| image.Format version | String | Image format version |
| image.Producer | String | Image producer |
| image.Thumbnail size | int | Thumbnail size |
| image.Compress bits/pixel | Decimal | Compressed bits per pixel |
| image.Depth | Int | Image’s depth in pixels |

schema: video

Files with these MIME-types are considered as “video”:

  • mp4

  • quicktime

  • x-flv

  • x-matroska

  • x-ms-asf

  • x-msvideo

video schema

| Name | Type | Description |
|---|---|---|
| video.Width | int | Video’s width in pixels |
| video.Height | Int | Video’s height in pixels |
| video.Duration | String | Video’s length in hours, minutes and seconds |
| video.Producer | String | Video’s content producer |
| video.Compression | String | Compression format of the video |

schema: audio

Files with these MIME-types are considered as “audio”:

  • basic

  • flac

  • mid

  • ogg

  • x-aiff

  • x-pn-realaudio

  • x-wav

audio schema

| Name | Type | Description |
|---|---|---|
| audio.Duration | String | Audio’s length in hours, minutes and seconds |
| audio.Language | String | Language the audio is in |
| audio.Channels | Int | Number of channels |
| audio.Sample rate | Int | Sample rate |
| audio.Compression | String | Compression format of the audio |
| audio.Format version | String | Format version |
| audio.Bit rate | Int | Bit rate (bits per second) |
| audio.Bits/sample | Int | Bits per sample |
| audio.Compression rate | Decimal | Compression rate |

schema: camera

Files with these MIME-types are considered as “camera”:

  • jpeg

camera schema

| Name | Type | Description |
|---|---|---|
| camera.Camera aperture | Decimal | Camera aperture in decimals |
| camera.Camera focal | Decimal | Camera aperture in decimals |
| camera.Camera exposure | Decimal | Camera exposure in decimals |
| camera.Model | String | Camera model |
| camera.Manufacturer | String | Camera manufacturer |
| camera.Shutter speed | Decimal | Length of time when the film or camera sensor is exposed to light |
| camera.Aperture | Decimal | Length of aperture |
| camera.Exposure bias | Decimal | Camera exposure bias |
| camera.Focal length | Decimal | Camera focal length |
| camera.Camera brightness | Decimal | Camera brightness |
| camera.ISO speed | Int | Camera ISO speed |
| camera.Binning | Int | Camera Binning |