HPC late-afternoon session
- Let’s continue with any questions that we did not finish earlier.
- Are there any questions on the materials you just watched?
- In the questions below we will talk about:
  - scheduling
  - submitting serial, parallel, and array jobs
  - submitting interactive jobs, and switching between interactive and batch jobs for the same task
  - how to estimate memory requirements from a completed Slurm job
  - permissions and file sharing
  - best practices in cluster computing
Question 17
Submit a serial job that runs the hostname command.
Try playing with the sq, squeue, and scancel commands.
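A minimal sketch of such a job script, written to a file with a here-document. The file name serial.sh and the time and memory limits are assumptions; adjust them to the cluster you are on.

```shell
# Write a minimal serial job script (serial.sh is a hypothetical file name)
cat > serial.sh << 'EOF'
#!/bin/bash
#SBATCH --time=00:05:00       # wall-time limit (assumed value)
#SBATCH --mem-per-cpu=1000    # memory per core in MB (assumed value)
hostname                      # prints the name of the compute node running the job
EOF
```

Submit it with sbatch serial.sh, list your jobs with squeue -u $USER (on Alliance clusters sq is typically a shortcut for this), and cancel a job with scancel followed by its job ID.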
Question 18
Submit a serial job based on pi.c.
Try sstat on a currently running job. Try seff and sacct on a completed job.
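The monitoring commands above can be wrapped in two small shell functions, sketched here. The function names job_live and job_summary are hypothetical, and the chosen output fields are only one reasonable selection; MaxRSS is the field to look at when estimating how much memory to request next time.

```shell
# Hypothetical helpers; each takes a Slurm job ID as its only argument.

# Live usage of a running job (some setups need the step name, e.g. 12345.batch)
job_live() {
    sstat -j "$1" --format=JobID,MaxRSS,AveCPU
}

# Accounting record of a finished job: peak memory, run time, final state
job_summary() {
    sacct -j "$1" --format=JobID,JobName,MaxRSS,Elapsed,State
}
```

For a completed job, seff followed by the job ID prints a short report of CPU and memory efficiency where that tool is installed.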
Question 19
Using a serial job, time optimized (-O2) vs. unoptimized code. Type your findings into the chat.
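One possible way to set up the comparison is to build both variants inside the same job and time each run. This is only a sketch: it assumes pi.c is in the submission directory, builds with gcc, and needs no arguments; the script and binary names are hypothetical.

```shell
# Write a job script that times unoptimized vs. optimized builds of pi.c
cat > timing.sh << 'EOF'
#!/bin/bash
#SBATCH --time=00:10:00       # assumed limit
#SBATCH --mem-per-cpu=1000    # assumed limit, in MB
gcc -O0 pi.c -o pi_O0 -lm     # no optimization (gcc default level)
gcc -O2 pi.c -o pi_O2 -lm     # optimized build
time ./pi_O0
time ./pi_O2
EOF
```

By default both standard output and standard error land in the slurm-<jobID>.out file, so the timings appear there next to the program output.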
Question 20
Using a serial job, time pi.c vs. pi.py for the same number of terms (cannot be too large or too small – why?).
Python pros – can you speed up pi.py?
Question 21
Submit an array job for different values of n (number of terms) with pi.c. How can you have a different executable for each job inside the array?
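A sketch of an array job script follows. It assumes a binary ./pi built from pi.c that reads the number of terms from its first command-line argument, which may not match the actual code; the array range and the mapping from task ID to n are likewise illustrative.

```shell
# Write an array job script; each task derives its own n from its task ID
cat > array.sh << 'EOF'
#!/bin/bash
#SBATCH --array=1-4           # four tasks with IDs 1..4 (assumed range)
#SBATCH --time=00:10:00       # assumed limit
#SBATCH --mem-per-cpu=1000    # assumed limit, in MB
# Assumption: ./pi takes the number of terms as its first argument
n=$((SLURM_ARRAY_TASK_ID * 1000000))
./pi "$n"
# For a different executable per task, the ID can select the program instead,
# e.g. ./pi_$SLURM_ARRAY_TASK_ID (hypothetical names pi_1 ... pi_4)
EOF
```

Slurm sets SLURM_ARRAY_TASK_ID separately for every task in the array, so the same script can pick different inputs or different programs.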
Question 22
Submit a shared-memory job based on sharedPi.c. Did you get any speedup? Type your answer into the chat.
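A possible job script for the shared-memory case is sketched below. It assumes sharedPi.c is an OpenMP code that builds with gcc -fopenmp, which should be checked against the source; the core count and limits are placeholders.

```shell
# Write a shared-memory job script: one task with several cores
cat > shared.sh << 'EOF'
#!/bin/bash
#SBATCH --cpus-per-task=4     # cores for the single multi-threaded process
#SBATCH --time=00:10:00       # assumed limit
#SBATCH --mem-per-cpu=1000    # assumed limit, in MB
# Assumption: sharedPi.c uses OpenMP
gcc -fopenmp -O2 sharedPi.c -o sharedPi
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK   # one thread per allocated core
time ./sharedPi
EOF
```

Running the same script with a different --cpus-per-task value gives the timings needed to judge the speedup.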
Question 23
Submit an MPI job based on distributedPi.c.
Try scaling 1 → 2 → 4 → 8 cores. Did you get any speedup? Type your answer into the chat.
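For the MPI case, a job script might look like the sketch below. It assumes an MPI compiler wrapper named mpicc is available in the environment; the task count and limits are placeholders.

```shell
# Write an MPI job script: several single-core tasks launched by srun
cat > mpi.sh << 'EOF'
#!/bin/bash
#SBATCH --ntasks=4            # number of MPI ranks
#SBATCH --time=00:10:00       # assumed limit
#SBATCH --mem-per-cpu=1000    # assumed limit, in MB
# Assumption: mpicc is on the PATH
mpicc -O2 distributedPi.c -o distributedPi
time srun ./distributedPi     # srun starts one process per task
EOF
```

Options given on the sbatch command line override the #SBATCH lines, so the scaling test can reuse one script, e.g. sbatch --ntasks=8 mpi.sh.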
Question 24
Test the serial code inside an interactive job. Please quit the job when done, as we have very few compute cores on the training cluster.
Note: we have seen the training cluster become unstable when too many interactive resources are in use. Strictly speaking, this should not happen; however, there is a small chance it might. We do have a backup.
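An interactive allocation is requested with salloc. The helper below is a hypothetical convenience wrapper (the name ijob and the default limits are assumptions) that takes the number of tasks and the number of cores per task as optional arguments.

```shell
# ijob [ntasks] [cpus-per-task]: hypothetical wrapper around salloc
# Both arguments default to 1, i.e. a single-core serial session.
ijob() {
    salloc --time=0:30:0 --mem-per-cpu=1000 \
           --ntasks="${1:-1}" --cpus-per-task="${2:-1}"
}
```

With this sketch, ijob starts a serial session, ijob 1 4 gives one task with four cores for a shared-memory code, and ijob 4 gives four tasks for an MPI code (launched with srun inside the session). Type exit to end the session and release the cores.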
Question 25
Test the shared-memory code inside an interactive job. Please quit when done, as we have very few compute cores on the training cluster.
Question 26
Test the MPI code inside an interactive job. Please quit when done, as we have very few compute cores on the training cluster.
Question 27
Let’s talk about debugging, profiling and code optimization.
Question 28
Let’s talk about file permissions and file sharing.
Share a file in your ~/projects directory (make it readable) with all other users in the def-sponsor00 group.
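The basic permission change can be tried on any file. The sketch below uses a stand-in file in the current directory (shared_demo.txt is a hypothetical name); on the cluster the same chmod commands would be applied to the file under ~/projects, whose exact layout is site-specific.

```shell
# Stand-in file so the commands can be tried anywhere
FILE=shared_demo.txt
echo "hello" > "$FILE"

chmod g+r "$FILE"      # let members of the file's group read it
chmod o-rwx "$FILE"    # keep it closed to users outside the group
ls -l "$FILE"          # check the resulting mode and the owning group
# chgrp def-sponsor00 "$FILE"   # only needed if ls shows a different group
```

Group members can reach the file only if every directory on the path also grants the group execute (x) permission.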