
Data Management for Performance Improvement & Compute Workflow Optimization

Efficient data management and sharing are essential for achieving optimal application performance and overall workflow efficiency. This hands-on workshop focuses on how data movement, both within a compute node (from in-core memory to file systems) and across multiple compute nodes, affects computational performance, and on how managing that movement well can improve it.

 

Participants will gain a comprehensive understanding of the Midway HPC system’s compute and storage infrastructure, how it is optimized for data-intensive workflows, the flow of data from in-core memory to the file system, and tools available on the Midway clusters for improving the overall performance of their workflows.

 

Key topics include data movement automation using scripting and scheduling techniques (Python, Bash, SLURM, cron), data sharing across CPUs, GPUs, and compute nodes, and distributed training frameworks (Horovod, DeepSpeed, TensorFlow, PyTorch). The workshop will also cover best practices for data partitioning and locality, I/O optimization, database integration, and seamless data sharing across on-premises and cloud platforms (e.g., Skyway). Participants will also learn to work with efficient data formats (NetCDF, HDF5) and apply techniques such as compression and parallel transfers to reduce data transfer time.

 

Objectives:
By the end of this workshop, participants will be able to:

  • Understand Midway’s compute and storage infrastructure and how it supports data-intensive workflows.
  • Manage data movement across CPUs and GPUs.
  • Automate and share data across compute nodes and systems.
  • Use techniques and tools to optimize I/O operations and data movement for improved HPC performance.
  • Prepare and share scientific data in standardized formats for collaboration.

 

Prerequisites:

Please bring your laptop. Familiarity with HPC environments at the RCC is optional.

 

Duration:
2 hours


Wednesday, May 7, 2025 - 13:00 to 15:00