
Running Interactive and Batch Jobs at the Research Computing Center

Abstract

Slurm is a widely used job scheduler on high-performance computing systems. Efficient resource management is crucial for productivity in the complex environment of the RCC computing clusters. This workshop aims to give users a clear understanding of the compute partitions available at the RCC and to show how to identify the right resource for a specific job, configure and submit a Slurm job, use Slurm commands, and avoid common mistakes that can cause a job to wait in the queue for a long time or fail to run altogether.

Objectives:

By the end of the workshop, attendees will:

  • learn about the Midway resources and partitions available for running jobs
  • learn Slurm commands, how to create a Slurm batch script, and how to submit batch jobs
  • acquire a good understanding of the RCC software module system and runtime environments
  • learn how to submit serial (single-processor) and parallel (OpenMP and MPI) multi-processor jobs
  • learn how to submit GPU jobs
  • learn how to submit a job that runs the same code several times, with each run differing only in the initial value of some high-level parameter (Slurm job array)
  • learn how to pack jobs and schedule independent processes inside a Slurm job allocation
  • learn how to submit message-passing (MPI), multi-threaded (OpenMP), and hybrid parallel jobs
  • learn how to request a Slurm interactive session
  • learn best practices and how to debug Slurm scripts

Duration: 2 hours

Level: Introductory

Prerequisites: Knowledge of Slurm is helpful. An RCC account is required.


Thursday, February 15, 2024 - 14:00 to 16:00