Do you need Data?

Note

If you need specific large-volume data sets for your Earth System Science research project, contact DKRZ Data Management (data@dkrz.de) and we will provide you with effective and powerful data services to support your research!

Here, we provide a brief overview of DKRZ’s data provisioning service and explain how to request that additional data be made available on the DKRZ HPC system. A summary of the motivation for this service and its concrete benefits is presented on DKRZ’s main pages.

image-needdata

1) Data request

  • large-volume data sets required by researchers in the Earth System Sciences are normally not available in local, relatively small IT environments

  • DKRZ hosts a centralised collection of selected large-volume data sets that are frequently needed in the Earth System Sciences - researchers are encouraged to contact DKRZ’s Data Management department (DM) via data@dkrz.de with their data needs

1.1) Requested data are available at DKRZ

  • if the requested data are already available at DKRZ (see also 5) below), DM advises on how to access them. The data are available through a suite of services:
    • ESGF

    • WDCC

    • Data Pool

  • the data can be efficiently processed using the compute resources at DKRZ

2) DM contacts external data producer if data are not yet available at DKRZ

  • in case of high and/or foreseeable demand for specific data sets which are not yet hosted at DKRZ, DM contacts the corresponding external data provider to negotiate data acquisition and local provision

  • in some cases, external data providers impose use constraints on their data - DM negotiates the conditions for providing the data in a centralised location and ensures compliance with data usage restrictions

3) DM staff acquires data from external data producer

  • once the negotiations with the data producer regarding data acquisition have succeeded, experienced DKRZ DM staff begin downloading the corresponding data sets

  • total volumes can be in the range of several PB per data set, which is why the expertise built up at DKRZ is crucial for a successful download, e.g. the possibility of privileged access to ECMWF servers for bulk downloads of ERA5 data without long queuing times

  • externally provided data are obtained in their original data format. If the usability of the data needs to be improved, e.g. through file format conversions, restructuring of the data or storage in an intuitive directory structure, DM staff also provide this service, as long as the effort is manageable (transition from “white” to “grey” data in the above schematic)
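
To give a feeling for why downloads of this size need dedicated expertise, the following minimal sketch estimates how long a bulk transfer would take. The volume and the sustained throughput values are purely illustrative assumptions, not figures published by DKRZ or ECMWF, and the function name `transfer_days` is hypothetical.

```python
def transfer_days(volume_pb: float, throughput_gbit_s: float) -> float:
    """Estimate the days needed to transfer a data set.

    volume_pb: data volume in decimal petabytes (1 PB = 1e15 bytes).
    throughput_gbit_s: assumed sustained throughput in gigabits per second.
    """
    if volume_pb < 0 or throughput_gbit_s <= 0:
        raise ValueError("volume must be >= 0 and throughput must be > 0")
    volume_bits = volume_pb * 1e15 * 8            # bytes -> bits
    seconds = volume_bits / (throughput_gbit_s * 1e9)
    return seconds / 86400                        # seconds -> days


if __name__ == "__main__":
    # Illustrative values only: a 2 PB data set at three assumed rates.
    for rate in (1, 10, 100):
        print(f"{rate:>3} Gbit/s: {transfer_days(2, rate):.1f} days")
```

Under these assumptions, a 2 PB data set would need roughly 185 days at a sustained 1 Gbit/s. Queuing time on the provider side is not included in this estimate and comes on top of it.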

4) DM staff transfers data to user-accessible location

  • once data processing is complete and the data are ready for use by the community, DM staff make the data available within the DKRZ infrastructure

5) Data are available to the research community

  • access to the data and compute resources at DKRZ is available to the Earth System Science community

Note

Contact the DKRZ DM department at data@dkrz.de if you have any further questions!