Use new apptainer singularity install to avoid transient errors #63

Open
jo-basevi wants to merge 1 commit into main from 26-Use-Apptainer-Module
Collaborator

@jo-basevi jo-basevi commented Mar 17, 2026

NCI have recently installed an Apptainer-based container engine on Gadi, which uses a different driver for mounting container images. In most cases, switching will only require swapping `module load singularity` for `module load apptainer`. This will hopefully fix the transient errors (see #26); so far no errors have surfaced in my local testing. It would be good to get this change into at least payu/dev so it can be tested further.

Ben Menadue also picked up that the short-circuit in the launcher scripts, which detects whether we are already running inside a container, doesn't correctly detect Apptainer containers. They suggested that a more reliable way would be to inspect the process's status directly:

$ cat /proc/self/status | grep NoNewPrivs | awk '{print $2}'
0

This will return 0 if running outside a container or 1 if inside. (Or more precisely, 0 if launching a container will work and 1 if it won't.)
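As a rough sketch of how the launcher scripts could use this check, the snippet below wraps it in two small shell functions. The names `no_new_privs` and `in_container` are hypothetical and not taken from the existing scripts. The optional file argument is there only to make the functions easy to test. It reads the field with a single `awk` call, which should give the same result as the pipeline above.

```shell
# Hypothetical helper: print the NoNewPrivs value from a /proc status file.
# Defaults to the calling process; the optional path argument is for testing only.
no_new_privs() {
    status_file="${1:-/proc/self/status}"
    awk '/^NoNewPrivs:/ { print $2 }' "$status_file"
}

# Hypothetical helper: succeed (exit status 0) when NoNewPrivs is 1,
# i.e. when launching a container from here is not expected to work.
in_container() {
    [ "$(no_new_privs "$@")" = "1" ]
}

# Possible use in a launcher script (assumed structure, not the actual script):
#   if in_container; then
#       exec "$@"    # already inside, run the command directly
#   fi
```

If the `NoNewPrivs` field is missing from the status file, `no_new_privs` prints nothing and `in_container` treats the process as being outside a container. That case may need separate handling.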

So far in my tests, there hasn't been any "FATAL: container creation failed" using apptainer, so we could maybe also remove that retry logic when launching the container?
