All Posts

Basemap on Levante

Basemap has reached end-of-life, and we won’t install it system-wide (in the python3 module), but you can easily install it into a personal conda environment like this:

On Levante:

Read more ...


API for Accounting Data

As a project admin or even as a normal project user you may look from time to time into accounting data for resources we provide (Levante, archive, etc.).

For each project you participate in, you can find a table on https://luv.dkrz.de like the one shown below. Usually you only see your own entries. Other users’ names are hidden.

utilization table

Read more ...


How to build ICON on Levante

The current ICON release 2.6.4 has no setup for building and running on the new HPC system Levante.

There are multiple options to get an ICON binary and run it:

Read more ...


Using Spyder on Levante

The Python IDE Spyder is hidden in the python3 module and needs additional libraries from spack to start.

Load the missing libraries via spack and load python3 (which contains spyder):

Read more ...


libGL.so.1 missing

When trying to start a GUI application like gvim, you get an error message:

Your system does not support OpenGL forwarding over X11, and Levante does not have the backup OpenGL library in its library path.

Read more ...


Data Migration Mistral to Levante

So that not every user or every project has to copy its data from Mistral to Levante, DKRZ will take over this in a coordinated manner. In this way, the bandwidth between the systems can be better utilised.

We will only copy data located in /work - and only for projects that have resources granted on the Levante system.

Read more ...


OpenSSH Certificate Authentication

You have probably seen the message OpenSSH throws at you when you try to log into a new and unknown host.

Or even this dreaded error that seemingly comes out of nowhere.

Read more ...


Bus error in jobs

Update: The problem was solved by an update of the Lustre client by our storage vendor. The workaround described below should no longer be necessary. If one of your jobs runs into a bus error, please let us know.

When running jobs on Levante, these sometimes fail with a Bus error, similar to the example below:

Read more ...


DKRZ CDP Updates Feb 22

We proudly 🥳 announce that the CDP is extended by new sets of CMIP6 data primarily published at DKRZ.

The ensemble set of simulations from the ESM MPI-ESM1-2-LR is now complete with an additional 130 simulations. For each of the following experiments, 30 simulations form an ensemble of different realizations with varying initial conditions:

Read more ...


About data on curvilinear or rotated regional grids

2D climate data can be sampled using different grid types and topologies, which might make a difference when it comes to data analysis and visualization. As the grid lines of regular or rectilinear grids are aligned with the axes of the geographical lat-lon coordinate system, these model grids are relatively easy to deal with. A common, but more complex case is that of a curvilinear or a rotated (regional) grid. In this blog article we want to illuminate this case a bit; we describe how to identify a curvilinear grid, and we demonstrate how to visualize the data using the “normal” cylindrical equidistant map projection.

Data can not only be stored in different file formats (e.g. netCDF, GRIB), but also in different data structures. Besides its spatial dimension (e.g. 1D, 2D, 3D), we need to have a closer look at the grid and the topology used. As the time dependency of the data is encoded as the time dimension, a variable might be called a 3D variable although the spatial grid is only 2D.
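One possible heuristic for telling the two grid types apart is sketched below. It is an illustration only and not necessarily the method used in the article: it assumes the latitude and longitude coordinates are given as 2D arrays with dimension order (y, x), and the function name `is_rectilinear` is made up for this example. If latitude never changes along a row and longitude never changes down a column, the grid collapses to two 1D axes and is rectilinear; otherwise it is curvilinear (or rotated).

```python
from typing import Sequence


def is_rectilinear(lat2d: Sequence[Sequence[float]],
                   lon2d: Sequence[Sequence[float]],
                   tol: float = 1e-9) -> bool:
    """Return True if 2D lat/lon arrays collapse to 1D axes (rows = y, columns = x)."""
    # Latitude must not change along a row (the x direction).
    lat_ok = all(abs(v - row[0]) <= tol for row in lat2d for v in row)
    # Longitude must not change down a column (the y direction).
    first = lon2d[0]
    lon_ok = all(abs(v - first[j]) <= tol for row in lon2d for j, v in enumerate(row))
    return lat_ok and lon_ok


if __name__ == "__main__":
    # Tiny hand-made grids, not real model output.
    print(is_rectilinear([[0.0, 0.0], [1.0, 1.0]], [[10.0, 20.0], [10.0, 20.0]]))
    print(is_rectilinear([[0.0, 0.5], [1.0, 1.5]], [[10.0, 20.0], [10.5, 20.5]]))
```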

Curvilinear de-rotated grid

Read more ...


DKRZ CDP Updates Nov 21

including the new ICON-ESM-LR model primarily published at DKRZ.

A first ensemble set of simulations from the ESM ICON-ESM-LR for the DECK experiments is available including the experiments

Read more ...


How to re-enable the deprecated python kernels?

As you probably know, we will rename/remove some unused/outdated Python modules; please see the details here. Since the jupyterhub kernels are based on modules, the deprecated kernels will no longer be available as default kernels in jupyter notebooks/labs.

Don’t panic: if you have been working with those deprecated kernels and want to continue using them in your notebooks, please follow the steps below.

Read more ...


Deprecated Python environments

Original post.

For several years, we have been offering Python environments on Mistral. Many of them are not updated any more and should not be used for new development. However, older scripts may rely on these environments and the versions of their installed packages.

Read more ...


How to install R packages in different locations?

The default location for R packages is not writable, so you cannot install new packages there. On demand we install new packages system-wide and for all users. However, it is possible to install packages in locations other than the default one, and here are the steps:

create a directory in $HOME e.g. ~/R/libs

Read more ...


DKRZ CDP Updates July 21

We proudly 🥳 announce that the CDP is extended by new sets of CMIP6 data primarily published at DKRZ. We also published new versions of corrected variables for the MPI-ESM1-2 Earth System Models.

The ensemble set of simulations from the ESM MPI-ESM1-2-HR for the dcppA-hindcast experiment is completed by another 5 realizations (8.5TB). In total, this set consists of about 10 realizations for 60 initialization years in the interval from 1960-2019 resulting in 595 realizations and 31 TB. For each realization, about 100 variables are available for a simulation time of about 10 years.

Read more ...


DKRZ CMIP Data Pool

We proudly announce new publications of model simulations when we publish them at our DKRZ ESGF node. We also keep you updated about the status and the services around the CMIP Data Pool. Find extensive documentation under this link.

Read more ...


How to install jupyter kernel for Matlab

In this tutorial, I will describe the steps to create a kernel for Matlab on Levante, i.e. to get the matlab_kernel working in Jupyterhub on Levante.

conda environment with python 3.9

Read more ...


Requested MovieWriter (ffmpeg) not available

Do you want to create videos / animations with ffmpeg from your jupyter notebook? You need ffmpeg-python (conda), which requires the ffmpeg software on Mistral (module).

conda env with ffmpeg-python and ipykernel

Read more ...


How to containerize your jupyter kernel?

We have seen in this blog post how to encapsulate a jupyter notebook (server) in a singularity container. In this tutorial, I am going to describe how you can run a jupyter kernel in a container and make it available in the jupyter*.

A possible use case for this is to install a supported PyTorch version and work with jupyter notebooks (see GLIBC and the container-based workaround).

Read more ...


Webpack and Django

I recently started to modernize the JavaScript part of a medium sized Django site we run at DKRZ to manage our projects. We have used a version of this site since 2002 and the current Django implementation was initially developed in 2011.

Back then JavaScript was in the form of small scripts embedded into the Django templates. jQuery was used abundantly. All in all, JavaScript was handled very haphazardly because we wanted to get back to working with Python as soon as possible.

Read more ...


Create a kernel from your own Julia installation

We already provide a kernel for Julia based on the module julia/1.7.0.

In order to use it, you only need to install IJulia:

Read more ...


Connect Spyder IDE to a remote kernel on Mistral

I am just describing what worked for me to connect my local Spyder instance to a remote node on Mistral. Note that this must be a node that you can connect to via ssh from your local machine.

This is just a draft tutorial that will be updated/optimized afterwards.

Read more ...


Python environment locations

Kernels are based on python environments created with conda, virtualenv or another package manager. In some cases, the size of the environment can grow tremendously depending on the installed packages. The default location for python files is the $HOME directory. In this case, it will quickly fill your quota. In order to avoid this, we suggest that you create/store python files in other directories of the filesystem on Mistral.

The following are two alternative locations where you can create your Python environment:

Read more ...


Transition from Mistral to Levante for projects

In this post we want to answer a few questions which may arise for project administrators and principal investigators at DKRZ. Some of the dates for requesting new resource allocations will be different in 2021. From 2022 on we will return to the usual schedule.

Your project will be automatically extended with the same resources as for the current allocation period. After July 1, 2021, you can continue working on Mistral as you did before.

Read more ...


How to quickly create a test kernel

This is a follow-up on Kernels. In some cases, the process of publishing new Python modules can take a long time. In the meantime, you can create a test kernel and use it in Jupyterhub. Creating new conda environments and using them as kernels has already been described here. In this example, we are not going to create a new conda env but only the kernel configuration files.

In this tutorial, I will take the module python3/2021-01. as an example.

Read more ...


CF Python package added to the software tree

According to this link:

The Python cf package is an Earth Science data analysis library that is built on a complete implementation of the CF data model. The cf package implements the CF data model for its internal data structures and so is able to process any CF-compliant dataset. It is not strict about CF-compliance, however, so that partially conformant datasets may be ingested from existing datasets and written to new datasets. This is so that datasets that are partially conformant may nonetheless be modified in memory.

Read more ...


SLURM update / Memory use

Slurm config on Mistral has been updated to fix an issue related to memory use.

Prior to the update, some Slurm jobs continued consuming the available memory (and even swap) of the allocated node and exceeded the allocated memory (set in sbatch or srun). When this occurred, it also affected other jobs/users.

error message

Read more ...


Dask jobqueue on Mistral

According to the official Web site, Dask jobqueue can be used to deploy Dask on job queuing systems like PBS, Slurm, MOAB, SGE, LSF, and HTCondor. Since the queuing system on Mistral is Slurm, we are going to show how to start a Dask cluster there. The idea is simple as described here. The difference is that the workers can be distributed across multiple nodes from the same partition. Using Dask jobqueue will start the Dask cluster as Slurm jobs.

In this case, Jupyterhub will often play an interface role, and Dask can use more resources than those allocated to your jupyterhub session (profiles).

Dask jobqueue

Read more ...


Jupyter notebook/lab extensions

Extensions bring additional interesting features to Jupyter*. Depending on the workflow in the notebook, users can install/enable extensions when required. Although it is easy to add extensions to both Jupyter notebook and lab, the process can sometimes be annoying depending on where jupyter is served from.

In general, installing and enabling extensions on your laptop or using the start-jupyter script is straightforward, especially when the developers describe their extensions well. There should be no restrictions or permission issues; just follow the instructions.

Extensions configurator

Read more ...


Enable NCL Kernel in Jupyterhub

can’t use NCL (Python) as kernel in Jupyter

This tutorial won’t work

Read more ...


Single jupyter notebooks in containers

you are using singularity containers

you need jupyter notebooks

Read more ...


Spawner options now savable

We introduced a new feature to the preset and advanced options form. This is a nice feature especially for the advanced options form, which contains many fields. You can also reset the options to their initial values by clicking on reset. The form options are saved in the client’s browser every 10 seconds and are not lost if:

the browser crashes

Read more ...


New Singularity module deployed

Recently, we deployed a new version of Singularity: 3.6.1. The old version is not available anymore due to many bugs reported by some users.

Errors like these are now fixed:

Read more ...


VS Code Remote on Mistral

VS Code is your favorite IDE

interested in using the remote extension

Read more ...


Jupyterhub log file

Each Jupyter notebook runs as a SLURM job on Mistral. By default, stdout and stderr of the SLURM batch job that is spawned by Jupyterhub are written to your HOME directory on the HPC system. In order to make it simple to locate the log file:

if you use the preset options form: the log file is named jupyterhub_slurmspawner_preset_<id>.log.
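A minimal sketch of locating the most recent preset log is shown here. It assumes the log files sit directly in your HOME directory and follow the naming pattern quoted above; the helper name `latest_preset_log` is made up for illustration.

```python
from pathlib import Path
from typing import Optional


def latest_preset_log(home: Path) -> Optional[Path]:
    """Return the most recently modified preset spawner log in `home`, or None."""
    # Pattern taken from the naming scheme quoted above; '*' stands in for <id>.
    logs = list(home.glob("jupyterhub_slurmspawner_preset_*.log"))
    if not logs:
        return None
    # The newest file by modification time belongs to the latest session.
    return max(logs, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    print(latest_preset_log(Path.home()))
```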

Read more ...


GLIBC and the container-based workaround

Have you ever tried to install/use software on Mistral and seen a message like this?

This is for example one of the reasons why PyTorch is not available in our python3 module. Those software packages require a newer version of glibc. Unfortunately, most of the Mistral nodes are based on a CentOS 6 kernel. To check the version of glibc:

Read more ...


Simple Dask clusters in Jupyterhub

There are multiple ways to create a Dask cluster; the following is only an example. Please consult the official documentation. The Dask library is installed and can be found in any of the python3 kernels in jupyterhub. Of course, you can use your own python environment.

The simplest way to create a Dask cluster is to use the distributed module:

Dask Labextension

Read more ...


DKRZ Tech Talks

It is our great pleasure to introduce the DKRZ Tech Talks. In this series of virtual talks we will present services of DKRZ and provide a forum for questions and answers. They will cover technical aspects of the use of our compute systems as well as procedures such as compute time applications and different teams relevant to DKRZ such as our machine learning specialists. The talks will be recorded and uploaded afterwards for further reference.

Go here for more information.

Read more ...


New Jupyterhub server at DKRZ

  • 03 September 2020
  • news

On August 20th, 2020 we deployed a new Jupyterhub server at the DKRZ. The new release has various new features that enhance the user experience.

Link to Jupyterhub server

Read more ...


How to prevent interruptions of ssh connections to Mistral?

If your ssh connections to mistral are interrupted after short periods without keyboard activity and you get an error message containing the string ‘broken pipe’, try to set the ServerAliveInterval parameter appropriately. This parameter can be set as a command-line option to ssh:

In the example above, ssh will send a message with a response request to the server if no packets have been received from the server in the past 60 seconds.
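For scripted logins, the same option can be assembled programmatically. The sketch below is an illustration under stated assumptions: the helper name `ssh_keepalive_command` and the login `user@mistral.dkrz.de` are placeholders. It relies only on the standard ssh flag `-o`, which passes a configuration option on the command line.

```python
import shlex
from typing import List


def ssh_keepalive_command(host: str, interval: int = 60) -> List[str]:
    """Build an ssh argument list that sets ServerAliveInterval on the command line."""
    if interval <= 0:
        raise ValueError("interval must be a positive number of seconds")
    # -o passes a configuration option exactly as it would appear in ssh_config.
    return ["ssh", "-o", f"ServerAliveInterval={interval}", host]


if __name__ == "__main__":
    # 'user@mistral.dkrz.de' is a placeholder; replace 'user' with your account.
    print(shlex.join(ssh_keepalive_command("user@mistral.dkrz.de")))
```

The list form can be handed to `subprocess.run` directly, which avoids shell quoting issues.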

Read more ...


Which MPI library and compiler should I use?

For model simulations in production mode, the recommended combination is to

Choose an Intel compiler version that has been validated to work with your model. Lacking a verified version, just use the most recent version (module intel/18.0.4 at the time of this writing) and validate that yourself.

Read more ...


How do I share files with members of another project?

You can use ACLs to achieve this. As a member of project group ax0001, you would have to create a directory in your project’s work for example

It could be any other place on Lustre file systems where you have write access. Then you grant project bx0002 permissions to this directory

Read more ...


Why do I receive .Xauthority file error messages?

When you open a new terminal session on mistral with X forwarding turned on (ssh -X ...), the .Xauthority file in your home directory gets updated by the xauth program. This file is used to keep X authentication keys in order to prevent unauthorized connections to your local display.

Sometimes, the .Xauthority file cannot be updated due to issues with the lustre01 file system on mistral, where your home directory is located, and you might experience an error message like:

Read more ...


I want to add my own packages to Python or R but they won’t compile

Python and R, among other scripting languages, allow users to create customized environments including their own set of packages.

For Python you use virtualenv or conda; R can also add locally installed packages.

Read more ...


How do I log into the same login or pp node I used before

mistral.dkrz.de or mistralpp maps to a whole group of nodes to distribute the load. They all share the same file system so most of the time you do not have to care which node you are on. However, there are reasons why you may want to connect to a specific node. You first have to find out on which node you are. This may be indicated in your prompt or you can also use hostname for this purpose.

In this case you are on login node 3. Connect to this node with

Read more ...


Python Matplotlib fails with “QXcbConnection: Could not connect to display”

Matplotlib is useful for interactive 2D plotting and also for batch production of plots inside a job. The default behavior is to do interactive plotting which requires the package to open a window on your display. For this purpose you have to log into mistral with X11 forwarding enabled.

If you run matplotlib in a jobscript where you just want to create files of your plots, you have to tell matplotlib to use a non-interactive backend. See matplotlib’s documentation for how to do that and which backends are available. Here is how to select the Agg backend (raster graphics png) inside your script. Add to the top of your imports

Read more ...


How can I avoid core files if my program crashes

Core files can be very helpful when debugging a problem but they also take a long time to get written for large parallel programs. This will limit the core size to zero, i.e. no core files are written:
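In bash, this limit is typically set with `ulimit -c 0`. As a Python-side analogue (an assumption, intended for jobs whose driver is a Python script rather than a shell script), the standard `resource` module can apply the same limit to the current process and to any child processes it starts afterwards:

```python
import resource

# Set both the soft and the hard core-file size limit to zero for this
# process; child processes started afterwards inherit the limit.
resource.setrlimit(resource.RLIMIT_CORE, (0, 0))

soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
print(f"core file size limit: soft={soft}, hard={hard}")
```

Lowering the hard limit cannot be undone within the same process, so this is best placed at the very start of the script.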

Note that due to a bug in our current installation of the slurm scheduler, the option

Read more ...


Why does my job wait so long before being executed? or: Why is my job being overtaken by other jobs in the queue?

There are several possible reasons for a job to be queued for a long time and/or to be overtaken …

… later submitted jobs with a higher priority (usually these have used less of their share than your job).

Read more ...


When will my SLURM job start?

The SLURM squeue command with the options --start and -j provides an estimate for the job start time:

Read more ...


How to use modules in batch scripts

The module environment is only available if the corresponding module command was defined for the current shell. If you use different shells for login and for batch job scripts (e.g. tcsh as login shell and your job scripts start with #!/bin/bash), you need to source one of the following files in your script before any invocation of the module command:

Read more ...


How to use SSHFS to mount remote lustre filesystem over SSH

In order to interact with directories and files located on the lustre filesystem, users can mount the remote filesystem via SSHFS (SSH Filesystem) over a normal ssh connection.

SSHFS is Linux-based software that needs to be installed on your local computer. On Ubuntu and Debian-based systems it can be installed through apt-get. On Mac OSX you can install SSHFS - you will need to download FUSE and SSHFS from the osxfuse site. On Windows you will need to grab the latest win-sshfs package from the google code repository or use an alternative approach like WinSCP.

Read more ...


How to improve interactive performance of MATLAB

When using ssh X11 forwarding (options -X or -Y), matlab can be slow to start and slow to respond to interactive use. This is because X11 sends many small packets over the network, often awaiting a response before continuing. This interacts unfavorably with medium or even higher latency connections, e.g. WiFi. When starting matlab on mistralpp nodes, another disturbing factor is overloading of these nodes.

But mistral has means to eliminate both of these issues: GPU nodes provide exclusive resources and allow for starting a remote desktop session that does not suffer from network latencies. Furthermore, the program execution can benefit from the hardware acceleration for 3D-plots or other graphics-intensive matlab sessions.

Read more ...


How to Write a shell alias or function for quick login to a node managed by SLURM

For tasks better run in a dedicated but interactive fashion, it might be advantageous to save the repeating pattern of reserving resources and starting a new associated shell in an alias or function, as explained below.

If you use bash as default shell you can place the following alias definition in your ~/.bashrc file and source this file in the ~/.bash_profile or in the ~/.profile file:

Read more ...


How to View detailed job information when the job is already running

Once your batch job has started execution (i.e. is in RUNNING state), your job script is copied to the slurm admin nodes and kept until the job finishes - this prevents problems that might occur if the job script gets modified while the job is running. As a side effect, you can delete the job script without interfering with the execution of the job.

If you accidentally removed or modified the job script of a running job, you can use the following command to query for the script that is actually used for executing the job:

Read more ...


How to Set the default SLURM project account

On Mistral, specification of the project account (via option -A or --account) is necessary to submit a job or make a job allocation, otherwise your request will be rejected. To set the default project account you can use the following SLURM input environment variables

SLURM_ACCOUNT   - interpreted by srun command
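When srun is launched from a Python wrapper rather than from a shell, the variable can be set on a copy of the environment. This is only a sketch: the helper name `env_with_default_account` is hypothetical and `ax0001` is a placeholder project id.

```python
import os
from typing import Dict, Optional


def env_with_default_account(account: str,
                             base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with SLURM_ACCOUNT set for srun."""
    # Copy first so the caller's environment is left untouched.
    env = dict(os.environ if base is None else base)
    env["SLURM_ACCOUNT"] = account
    return env


if __name__ == "__main__":
    # 'ax0001' is a placeholder project id.
    env = env_with_default_account("ax0001")
    # The dict can be passed as env=env to subprocess.run when calling srun.
    print(env["SLURM_ACCOUNT"])
```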

Read more ...


How can I see on which nodes my job was running?

You can use the SLURM sacct command with the following options:

Read more ...


How can I run a short MPI job using up to 4 nodes?

You can use SLURM Quality of Service (QOS) express by inserting the following line into your job script:

or using the option --qos with the sbatch command:

Read more ...


How can I login to mistral, change my password and login shell?

Login to the system via:

Change your password and/or login shell via DKRZ online

Read more ...


How can I get a stack trace if my program crashes?

The classical approach to find the location where your program crashed is to run it in a debugger or inspect a core file with the debugger. A quick way to get the stack trace without the need for a debugger is to compile your program with the following options:

In case of a segmentation violation during execution of the program, detailed information on the location of the problem (call stack trace with routine names and line numbers) will be provided:

Read more ...


How can I choose which account to use, if I am subscribed to more than one project?

Just insert the following line into your job script:

There is no default project account on mistral.

Read more ...


How can I check my disk space usage?

Your individual disk space usage in the HOME and SCRATCH areas as well as the project quota in the WORK data space can be checked in the DKRZ online portal. The numbers are updated daily.

Read more ...


How can I access my Lustre data from outside DKRZ/ZMAW?

For data transfer you can use either sftp:

or the rsync command:

Read more ...


Can I run cron jobs on Mistral?

For system administration reasons users are not allowed to schedule and execute periodic jobs on Mistral using the cron utility. Our recommendation is to use the functionality provided by the workload manager SLURM for this purpose. With the option --begin of the sbatch command you can postpone the execution of your jobs until the specified time. For example, to run a job every day after 12 pm you can use the following job script re-submitting itself at the beginning of the execution:

A variety of different date and time specifications is possible with the --begin option, for example: now+1hour, midnight, noon, teatime, YYYY-MM-DD[Thh:mm:ss], 7AM, 6PM etc. For more details see manual pages of the sbatch command:
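If the re-submission is driven from Python instead of from the job script itself, the next start time can be computed explicitly. The sketch below is an illustration only: the helper names `next_begin` and `resubmit_command` and the script name `job.sh` are assumptions, and it uses the YYYY-MM-DDThh:mm:ss format from the list above.

```python
from datetime import datetime, time, timedelta
from typing import List


def next_begin(now: datetime, at: time = time(12, 0)) -> str:
    """Return the next occurrence of `at` strictly after `now`, formatted for --begin."""
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        # Today's slot has already passed, so move to tomorrow.
        candidate += timedelta(days=1)
    return candidate.strftime("%Y-%m-%dT%H:%M:%S")


def resubmit_command(script: str, now: datetime) -> List[str]:
    """Build the sbatch argument list that postpones `script` until the next slot."""
    return ["sbatch", f"--begin={next_begin(now)}", script]


if __name__ == "__main__":
    # 'job.sh' is a placeholder job script name.
    print(resubmit_command("job.sh", datetime.now()))
```

Computing an absolute timestamp keeps the schedule from drifting when a job starts late, because the next slot is always derived from the wall clock rather than from the previous start time.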

Read more ...


Is an FTP client available on mistral?

LFTP is installed on mistral for download and upload of files from/to an external server via File Transfer Protocol (FTP):

The user name for authentication can be provided via option ‘-u’ or ‘--user’, for example:

Read more ...