...

From mamba-factotum, run the screen command; then, inside the screen session, ssh with tunnel to the tpfe2 head node.

Basic use of screen

https://tomlee.co/2011/10/gnu-screen-splitting/
screen -ls
screen -U -R -D <screen_id>
screen -x <screen_id> # shared terminal
to split screen: ctrl-a and then shift-s
to detach screen: ctrl-a and then d

Troubleshooting

Problem(s):

SSH tunnel is down

Sign(s):

topsapp queues are stuck; jobs are not being unacked, but the queues are full
nothing reported in mozart/figaro – job,topsapp,job-started
Port checker shell script (in hysds-hec-utils repo) indicates that ports are not forwarded (all should give pass):

esi_sar@tpfe2:~/github/hysds-hec-utils> ./hysds_pcm_check_port_forwarded_tunnel_services.sh

[pass] mozart rabbitmq AMQP
[pass] mozart rabbitmq REST
[pass] mozart elasticsearch for figaro
[pass] mozart redis for ES figaro
[pass] mozart rest api
[pass] grq elasticsearch for tosca
[pass] grq http api
[pass] metrics redis for ES metrics
connect_to 100.67.33.56 port 25: failed.
# [fail] factotum smtp http://tpfe2.nas.nasa.gov:10025
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current 
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (52) Empty reply from server

Note that the mail server failed to respond. That is nominal because the mail service is no longer used, so the output above indicates nominal status.
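For readers without access to the hysds-hec-utils repo, the following is a minimal sketch of how a pass/fail port check of this kind could be written. It is not the actual hysds_pcm_check_port_forwarded_tunnel_services.sh; the function names `report_status` and `check_port` are hypothetical, and the host and port in the example are placeholders rather than the real tunnel endpoints. It assumes bash with /dev/tcp support and the GNU coreutils `timeout` command.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; not the real port checker from hysds-hec-utils.

# Print "[pass] <label>" when the status is 0, otherwise "# [fail] <label>".
report_status() {
    local label="$1" status="$2"
    if [ "$status" -eq 0 ]; then
        echo "[pass] ${label}"
    else
        echo "# [fail] ${label}"
    fi
}

# Try to open a TCP connection through bash's /dev/tcp redirection,
# giving up after 3 seconds, then report the result.
check_port() {
    local label="$1" host="$2" port="$3"
    timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
    report_status "$label" "$?"
}

# Example (placeholder host and port, not the real forwarded ports):
# check_port "example service" localhost 8080
```

A TCP connect only shows that something is listening on the forwarded port; it does not confirm that the service behind it is healthy.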

Remedy:

1. From mamba-factotum, open screen.
   - Optionally, attach the existing screen (32207, tagged pleiades).
2. ssh with tunnel to the tpfe2 head node: ssh tpfe2-tunnel
   - Note that the command is aliased: alias pleiades='ssh tpfe2-tunnel'
3. Run sudo -u esi_sar /bin/bash
4. Detach the screen (ctrl-a + d).
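The "attach existing screen, otherwise open a new one" step can be sketched as a small helper. This is an illustration only: the function names `find_screen_session` and `attach_or_create` are hypothetical, and it assumes `screen -ls` lists sessions in the usual `<pid>.<name>` form. The ssh and sudo steps are still run by hand inside the session.

```shell
#!/usr/bin/env bash
# Hypothetical helper; assumes `screen -ls` lines look like "<pid>.<name> (Detached)".

# Read `screen -ls` output on stdin and print the first session id
# (pid.name) whose name equals the given tag. Prints nothing if none match.
find_screen_session() {
    local tag="$1"
    awk -v tag="$tag" '{
        i = index($1, ".")
        if (i > 0 && substr($1, i + 1) == tag) { print $1; exit }
    }'
}

# Reattach to the tagged session if it exists, otherwise create it.
attach_or_create() {
    local tag="$1" session
    session="$(screen -ls | find_screen_session "$tag")"
    if [ -n "$session" ]; then
        screen -U -R -D "$session"
    else
        screen -U -S "$tag"
    fi
}

# Example: attach_or_create pleiades
# Then, inside the session: ssh tpfe2-tunnel, followed by sudo -u esi_sar /bin/bash
```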

Auto-scaling job-workers singularity via PBS scripts

...

Pleiades has allocated us a quota of 200 TB and 5000000 files. This script finds and deletes all files older than 2.1 days under each of the group id worker directories.
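The cleanup script itself is not reproduced here. As a rough sketch of the age-based purge it describes, the following uses `find` with a 2.1-day cutoff expressed in minutes (2.1 × 24 × 60 = 3024). The function name `purge_old_files`, the variable `MAX_AGE_MINUTES`, and the directory in the example are assumptions for illustration, not names from the real script.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of an age-based purge; not the actual cleanup script.

# 2.1 days expressed in minutes (2.1 * 24 * 60 = 3024).
MAX_AGE_MINUTES=3024

# Delete regular files under the given directory that were last
# modified more than MAX_AGE_MINUTES minutes ago. Directories are kept.
purge_old_files() {
    local dir="$1"
    find "$dir" -type f -mmin +"$MAX_AGE_MINUTES" -delete
}

# Example (placeholder path, to be repeated for each group id worker directory):
# purge_old_files /path/to/worker_dir
```

Using `-mmin` rather than `-mtime` allows a fractional-day threshold, since `-mtime` counts whole 24-hour periods.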

...
