The goal of this fork is to run MLE-Bench agents on a Slurm-based cluster using Apptainer instead of Docker, working around the root/user separation.

The agent must not have access to the private test answers. This is why the root/user separation exists in the original MLE-Bench project, where the grading server and the agent run inside the same container. To work around it, we:
- Run the grading server in an Apptainer container with access to the private data
- Run the agent in a different Apptainer container without private data mounted
- The agent validates submissions via HTTP (`http://<grading-server>:5000/validate`)
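In practice the validation call is a plain HTTP request from the agent container; a sketch from the agent side (the multipart field name `file` is an assumption — check `environment/run_grading_server.py` for the actual request format):

```shell
# Hypothetical validation request; the server host/port come from the grading job's output.
GRADING_SERVER_URL="http://node123:5000"
curl -s -X POST \
  -F "file=@/home/submission/submission.csv" \
  "${GRADING_SERVER_URL}/validate" || true  # non-fatal if the server is unreachable
```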
We assume familiarity with MLE-Bench; for setup instructions, see the MLE-Bench README.
Note: If you are on an arm64 machine, you need to add `--platform=linux/amd64` when building locally.
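For example, the environment image build from the next section would become:

```shell
# Same build as below, pinned to amd64 for arm64 hosts (e.g. Apple Silicon)
docker build --platform=linux/amd64 -t mlebench-env -f environment/Dockerfile .
```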
On a machine with Docker access:
```
# Build Docker images
docker build -t mlebench-env -f environment/Dockerfile .
docker build -t aide agents/aide/ \
  --build-arg SUBMISSION_DIR=/home/submission \
  --build-arg LOGS_DIR=/home/logs \
  --build-arg CODE_DIR=/home/code \
  --build-arg AGENT_DIR=/home/agent
```

Then save the images as .tar files, transfer them to the HPC system, and convert:
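The save step might look like this (run on the machine with Docker; the archive names match the conversion commands below):

```shell
# Export the built images as .tar archives for transfer to the cluster
docker save -o mlebench-env.tar mlebench-env
docker save -o aide.tar aide
```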
```
apptainer build mlebench-env.sif docker-archive://mlebench-env.tar
apptainer build aide.sif docker-archive://aide.tar
```
For Princeton University users:

```
scp aide.tar netid@della.princeton.edu:/home/netid/path/to/save/
```

Note: If using the heterogeneous job script (`scripts_hpc/slurm_hetjob.sh`), skip to Step 4. The script handles Steps 2-3 automatically.
```
# Edit paths in the script first, then:
sbatch scripts_hpc/slurm_grading_server.sh spaceship-titanic

# Check the output for the grading server URL
cat slurm_output/mlebench/grading-<jobid>.out
```

Alternatively, run the grading server manually on a node that has access to the private test data:
```
COMPETITION="spaceship-titanic"
DATA_DIR="/path/to/mlebench/data"
SIF_IMAGE="/path/to/mlebench-env.sif"

apptainer exec \
  --contain \
  --cleanenv \
  --no-home \
  --bind ${DATA_DIR}:/data:ro \
  ${SIF_IMAGE} \
  /opt/conda/bin/conda run -n mleb python /mlebench/environment/run_grading_server.py \
    --competition-id ${COMPETITION} \
    --data-dir /data \
    --host 0.0.0.0 \
    --port 5000
```

```
# With explicit grading server URL:
sbatch scripts_hpc/slurm_agent.sh spaceship-titanic http://node123:5000

# Or auto-discover from the grading job ID:
sbatch scripts_hpc/slurm_agent.sh spaceship-titanic auto:<grading-job-id>
```

Add the `--nv` flag for GPU support.
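The `auto:` form presumably resolves the grading job's node through Slurm; a minimal sketch of that lookup (the job ID is illustrative, and it falls back to `localhost` when the job cannot be found):

```shell
# Ask Slurm for the node of the grading job, then build the server URL.
GRADING_JOB_ID="12345"                                    # illustrative job ID
NODE="$(squeue -j "${GRADING_JOB_ID}" -h -o %N 2>/dev/null || true)"
GRADING_URL="http://${NODE:-localhost}:5000"
echo "${GRADING_URL}"
```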
After the agent finishes:

```
mlebench grade \
  --submission ${OUTPUT_DIR}/submission/submission.csv \
  --competition ${COMPETITION}
```

Use a heterogeneous job to schedule the grading server on CPU and the agent on GPUs together:
```
sbatch scripts_hpc/slurm_hetjob.sh spaceship-titanic
```

Make sure to edit `scripts_hpc/slurm_hetjob.sh` to set your paths:

- `MLEBENCH_DIR`: path to the mle-bench repo
- `DATA_DIR`: path to the data
- `SIF_IMAGE`: path to the Apptainer image
- `OUTPUT_BASE`: base output directory
- Update and test heterogeneous scripts on della