...
Run autoscaling for each group id in background mode with nohup
(no hangup), with a max of 140 nodes in total across all group ids:
esi_sar@tpfe2:~/github/hysds-hec-utils> nohup pbs_auto_scale_up.sh s2037 140 > pbs_auto_scale_up-s2037.log 2>&1 &
esi_sar@tpfe2:~/github/hysds-hec-utils> nohup pbs_auto_scale_up.sh s2310 140 > pbs_auto_scale_up-s2310.log 2>&1 &
esi_sar@tpfe2:~/github/hysds-hec-utils> nohup pbs_auto_scale_up.sh s2252 140 > pbs_auto_scale_up-s2252.log 2>&1 &
Note: these commands are wrapped in the following shell script:
esi_sar@tpfe2:~/github/hysds-hec-utils> ./all_pbs_auto_scale_up.sh <num_workers>
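For orientation, a wrapper of this kind could look roughly like the sketch below. This is not the actual contents of all_pbs_auto_scale_up.sh; the function name start_autoscalers and the dry-run default are assumptions made for illustration. Only the group ids, the pbs_auto_scale_up.sh arguments, and the log file names are taken from the commands above.

```shell
# Hypothetical sketch only; NOT the real all_pbs_auto_scale_up.sh.
# Group ids are the ones used in the commands above.
GROUP_IDS="s2037 s2310 s2252"

# Print (default) or launch one autoscaler per group id.
#   $1 = max node count passed to pbs_auto_scale_up.sh
#   $2 = "run" to actually launch; anything else only prints the commands
start_autoscalers() {
    sa_max_nodes="$1"
    sa_mode="${2:-dry}"
    for sa_gid in $GROUP_IDS; do
        if [ "$sa_mode" = "run" ]; then
            # Same form as the manual commands: background, immune to hangup, one log per group id
            nohup pbs_auto_scale_up.sh "$sa_gid" "$sa_max_nodes" > "pbs_auto_scale_up-$sa_gid.log" 2>&1 &
        else
            echo "nohup pbs_auto_scale_up.sh $sa_gid $sa_max_nodes > pbs_auto_scale_up-$sa_gid.log 2>&1 &"
        fi
    done
}
```

Calling `start_autoscalers 140` prints the three commands for review; `start_autoscalers 140 run` would launch them.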
Daily purge of older job work dirs
...
stop auto-scaling scripts
revoke jobs of type job-request-s1gunw-topsapp-local-singularity:ARIA-446_singularity in mozart-figaro that are in running/queued states.
qdel all jobs
https://github.com/hysds/hysds-hec-utils/blob/master/qdel_all.sh
qstat -u esi_sar | awk '{ if ($8 == "R" || $8 == "Q") print "qdel "$1; }' | sh
then nuke all of the work dirs for the three group ids:
/nobackupp12/esi_sar/s2037/worker/2020/11/**
/nobackupp12/esi_sar/s2252/worker/2020/11/**
/nobackupp12/esi_sar/s2310/worker/2020/11/**
retry all failed topsapp jobs / on-demand submit from runconfig-topsapp
start up auto scaling scripts
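Since the qdel and work-dir steps above are destructive, it may help to preview them first. The sketch below is an illustration, not part of hysds-hec-utils: the function names list_qdel_cmds and list_work_dirs are hypothetical, the state column ($8) is taken as-is from the one-liner above, and the worker/YYYY/MM layout is inferred from the paths listed above.

```shell
# Hypothetical helpers for previewing the purge; not part of hysds-hec-utils.

# Read qstat-style output on stdin and print the qdel commands that the
# one-liner above would run. Assumes, as the original does, that the job
# state is in column 8.
list_qdel_cmds() {
    awk '$8 == "R" || $8 == "Q" { print "qdel " $1 }'
}

# List the entries directly under <base>/<group id>/worker/<year>/<month>
# for the three group ids, without deleting anything.
#   $1 = year, $2 = month, $3 = base dir (default: /nobackupp12/esi_sar)
list_work_dirs() {
    wd_base="${3:-/nobackupp12/esi_sar}"
    for wd_gid in s2037 s2252 s2310; do
        wd_month="$wd_base/$wd_gid/worker/$1/$2"
        # Skip group ids that have no directory for this month
        [ -d "$wd_month" ] && find "$wd_month" -mindepth 1 -maxdepth 1
    done
    return 0
}
```

Review first with `qstat -u esi_sar | list_qdel_cmds` and `list_work_dirs 2020 11`. Once the output looks right, append `| sh` to the first, and pipe the second through `while IFS= read -r d; do rm -rf "$d"; done`.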