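TACC launcher is a simple, practical tool that lets you submit many single-threaded or multi-threaded tasks from a single batch script.
For a detailed introduction, see the official website: link.
Download address: link.
How do you use TACC launcher? I strongly recommend reading the usage instructions on the official website, which are very detailed, so I won't repeat them here. If English is a struggle, a web page translation tool will help.
In short, the steps are: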
First prepare a job list file, myjoblist, containing the tasks to run, one command per line. To keep things simple, start with 12 hello-world lines as a test:
echo "hello, world"
echo "hello, world"
echo "hello, world"
echo "hello, world"
echo "hello, world"
echo "hello, world"
echo "hello, world"
echo "hello, world"
echo "hello, world"
echo "hello, world"
echo "hello, world"
echo "hello, world"
Next, write a submission script, sub.sh, containing the launcher commands:
#!/bin/bash
export LAUNCHER_JOB_FILE=/path/to/myjoblist
export LAUNCHER_DIR=$HOME/launcher/launcher-3.1.1
export PATH=$LAUNCHER_DIR:$PATH
export LAUNCHER_PLUGIN_DIR=$LAUNCHER_DIR/plugins
export LAUNCHER_RMI=SLURM
export LAUNCHER_SCHED=interleaved
export LAUNCHER_WORKDIR=`pwd`
$LAUNCHER_DIR/paramrun
Notes:
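1. LAUNCHER_JOB_FILE is the path to myjoblist; change it to the actual path.
2. LAUNCHER_DIR is the launcher installation path; change it to the actual path.
3. The other variables do not need to be changed for now.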
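For larger tests, typing the job list by hand gets tedious. The following is a minimal sketch of generating it with a loop; the function name make_joblist and the output path ./myjoblist are assumptions for illustration, not part of launcher.

```shell
#!/bin/bash
# Hypothetical helper (not part of launcher): write COUNT copies of a
# test command to JOBFILE, one command per line.
make_joblist() {
    jobfile=$1
    count=$2
    : > "$jobfile"                 # create or truncate the file
    i=1
    while [ "$i" -le "$count" ]; do
        echo 'echo "hello, world"' >> "$jobfile"
        i=$((i + 1))
    done
}

# Assumed location; point LAUNCHER_JOB_FILE at whatever path you use.
make_joblist ./myjoblist 12
wc -l < ./myjoblist                # line count should be 12
```

In real use, each line would be a different command instead of the same echo.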
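Notes on the options used when submitting the job:
1. -N 2 requests 2 nodes.
2. -n 6 requests 6 CPU cores (6 in total, not 6 per node; note also that n must be divisible by N, otherwise an error is reported).
3. -p debug selects the debug partition.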
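The divisibility rule in item 2 can be checked before submitting. This is a small sketch under the assumption that the total count is split evenly across nodes; tasks_per_node is a hypothetical helper name, not a Slurm or launcher command.

```shell
#!/bin/bash
# Hypothetical pre-submission check: given the node count (-N) and the
# total count (-n), print the per-node share, or fail when n is not a
# multiple of N.
tasks_per_node() {
    nodes=$1
    tasks=$2
    if [ $((tasks % nodes)) -ne 0 ]; then
        echo "error: -n $tasks is not divisible by -N $nodes" >&2
        return 1
    fi
    echo $((tasks / nodes))
}

tasks_per_node 2 6    # prints 3
```

With -N 2 and -n 6 the result is 3 per node.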
Jobs submitted through the Slurm scheduler write to a default output file, slurm-jobid.out (where jobid is the job ID). Its contents look like this:
Launcher: Setup complete.
------------- SUMMARY ---------------
   Number of hosts:    2
   Working directory:  $HOME/workdir/test
   Processes per host: 3
   Total processes:    6
   Total jobs:         12
   Scheduling method:  interleaved
-------------------------------------
Launcher: Starting parallel tasks...
Launcher: Task 1 running job 2 on cn95 (echo "hello, world")
Launcher: Task 0 running job 1 on cn95 (echo "hello, world")
hello, world
hello, world
Launcher: Task 2 running job 3 on cn95 (echo "hello, world")
hello, world
Launcher: Job 1 completed in 0 seconds.
Launcher: Task 5 running job 6 on cn96 (echo "hello, world")
Launcher: Task 4 running job 5 on cn96 (echo "hello, world")
hello, world
hello, world
Launcher: Task 3 running job 4 on cn96 (echo "hello, world")
Launcher: Job 3 completed in 0 seconds.
hello, world
Launcher: Job 2 completed in 0 seconds.
Launcher: Job 6 completed in 0 seconds.
Launcher: Job 5 completed in 0 seconds.
Launcher: Job 4 completed in 0 seconds.
Launcher: Task 0 running job 7 on cn95 (echo "hello, world")
hello, world
Launcher: Task 2 running job 9 on cn95 (echo "hello, world")
hello, world
Launcher: Task 1 running job 8 on cn95 (echo "hello, world")
hello, world
Launcher: Task 5 running job 12 on cn96 (echo "hello, world")
hello, world
Launcher: Task 3 running job 10 on cn96 (echo "hello, world")
hello, world
Launcher: Task 4 running job 11 on cn96 (echo "hello, world")
hello, world
Launcher: Job 7 completed in 0 seconds.
Launcher: Job 9 completed in 0 seconds.
Launcher: Job 8 completed in 0 seconds.
Launcher: Job 12 completed in 0 seconds.
Launcher: Job 10 completed in 0 seconds.
Launcher: Job 11 completed in 0 seconds.
Launcher: Task 0 done. Exiting.
Launcher: Task 2 done. Exiting.
Launcher: Task 1 done. Exiting.
Launcher: Task 5 done. Exiting.
Launcher: Task 3 done. Exiting.
Launcher: Task 4 done. Exiting.
Launcher: Done. Job exited without errors
Notes:
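Reading the log, each task seems to take every sixth job: Task 0 runs jobs 1 and 7, Task 1 runs jobs 2 and 8, and so on up to Task 5 with jobs 6 and 12. The sketch below reproduces that round-robin pattern. It is an illustration inferred from this one log, not launcher's actual scheduling code, and jobs_for_task is a hypothetical name.

```shell
#!/bin/bash
# Illustrative only: list the jobs a task would run if jobs are dealt
# out round-robin, i.e. task t gets jobs t+1, t+1+P, t+1+2P, ...
# where P is the total number of tasks.
jobs_for_task() {
    task=$1
    total_tasks=$2
    total_jobs=$3
    job=$((task + 1))
    assigned=""
    while [ "$job" -le "$total_jobs" ]; do
        assigned="${assigned:+$assigned }$job"   # append, space-separated
        job=$((job + total_tasks))
    done
    echo "$assigned"
}

# 6 tasks and 12 jobs, as in the SUMMARY section of the log.
for t in 0 1 2 3 4 5; do
    echo "Task $t: $(jobs_for_task "$t" 6 12)"
done
```

For Task 0 this prints "Task 0: 1 7", which matches the two "Task 0 running job" lines in the log.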