...
...
...
...
...
...
...
...
...
...
...
Related GitHub Repos and Tickets
Tickets:
Jira: ARIA-47
Job Runtime
Depends on how many SLCs are being processed: 1+ hours for 8 SLCs.
GNU Parallel
Stack size | c5d.9xlarge (36 vCPU, 72 GiB) | c5.24xlarge (96 vCPU, 192 GiB)
---|---|---
1 year (~30 SLCs, 4 bursts) | 7 hrs, 24 mins, 46 secs | 4 hrs, 38 mins, 33 secs
2 years (~60 SLCs, 4 bursts) | 13 hrs, 37 mins, 39 secs | 8 hrs, 16 mins, 46 secs
Multiprocessing
Stack size | c5d.9xlarge (36 vCPU, 72 GiB) | c5.24xlarge (96 vCPU, 192 GiB)
---|---|---
1 year (~30 SLCs, 4 bursts) | 6 hrs, 43 mins, 27 secs | 4 hrs, 19 mins, 30 secs
2 years (~60 SLCs, 4 bursts) | 12 hrs, 58 mins, 32 secs | 8 hrs, 30 mins, 43 secs
3 years (~90 SLCs, 4 bursts) | 18 hrs, 14 mins, 44 secs | 10 hrs, 56 mins, 5 secs
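Both tables time the same end-to-end stack run; the difference is only how each step's command list (a run_files text file with one command per line, see Job Outputs below) is executed. A minimal sketch of the two modes, using step 1 as an example:
Code Block
# GNU Parallel: run each line of the step's command file as its own job, 8 at a time
parallel -j 8 < ./run_files/run_1_unpack_slc_topo_master

# Multiprocessing: let run.py fan the same command list out over 8 worker processes
run.py -i ./run_files/run_1_unpack_slc_topo_master -p 8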
Objective
Creating a stack of SLCs
Only within the same track (over a time period)
Prerequisite to StaMPS processing
How to set up the inputs
Facets to get SLC inputs:
region (e.g. Hawaii) (optional)
track_number or trackNumber (which field name depends on the dataset)
dataset type: SLC
If the SLC track numbers do not match, the job throws this error:
Code Block
raise Exception('Could not determine a suitable burst offset')
There must be only one track in your SLC inputs.
(Screenshots: correct facet SLC inputs vs. incorrect facet SLC inputs)
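To double-check the track before submitting, one option is to derive each SLC's relative orbit (track) from the absolute orbit encoded in its filename; the standard Sentinel-1 formula is (absolute_orbit - 73) mod 175 + 1 for S1A and (absolute_orbit - 27) mod 175 + 1 for S1B. A sketch, assuming the SLC zips are on local disk (e.g. in the zip/ sub-directory that run_stack.sh creates):
Code Block
# Print the track of every S1A SLC zip; all printed values must be identical
for f in zip/S1A_*.zip; do
  abs=$((10#$(basename "$f" | cut -d_ -f8)))   # 8th "_" field = absolute orbit
  echo "$f  track=$(( (abs - 73) % 175 + 1 ))"
done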
...
Job Inputs:
Bbox (*required)
min_lat
max_lat
min_lon
max_lon
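For illustration only, approximate bbox values covering the Big Island of Hawaii (hypothetical numbers, not taken from a real job):
Code Block
# Illustrative bbox for a Hawaii stack (lat/lon in decimal degrees)
MIN_LAT=18.9; MAX_LAT=20.3; MIN_LON=-156.1; MAX_LON=-154.8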
...
CI Integration (Jenkins)
Link: http://b-ci.grfn.hysds.io:8080/job/ops-bcluster_container-builder_aria-jpl_topsstack_master/
WARNING: If rebuilding on the same branch (master), make sure to remove the existing Docker image so that it is rebuilt when the job restarts:
docker rmi <topsStack docker image id>
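A one-liner variant, assuming the image's repository name contains "topsstack":
Code Block
# Remove every image whose repository name matches *topsstack*
docker rmi $(docker images --filter=reference='*topsstack*' -q)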
HySDS-io and Jobspec-io
hysds-io.json.topsstack
Code Block
{
  "label": "topsStack Processor",
  "submission_type": "individual",
  "allowed_accounts": ["ops"],
  "action-type": "both",
  "params": [
    {
      "name": "min_lat",
      "from": "submitter",
      "type": "number",
      "optional": false
    },
    {
      "name": "max_lat",
      "from": "submitter",
      "type": "number",
      "optional": false
    },
    {
      "name": "min_lon",
      "from": "submitter",
      "type": "number",
      "optional": false
    },
    {
      "name": "max_lon",
      "from": "submitter",
      "type": "number",
      "optional": false
    },
    {
      "name": "localize_products",
      "from": "dataset_jpath:",
      "type": "text",
      "lambda": "lambda met: get_partial_products(met['_id'], get_best_url(met['_source']['urls']), [met['_id']+'.zip'])"
    }
  ]
}
job-spec.json.topsstack
Code Block
{
  "recommended_queues": ["jjacob_stack"],
  "command": "/home/ops/verdi/ops/topsstack/run_stack.sh",
  "imported_worker_files": {
    "/home/ops/.netrc": "/home/ops/.netrc",
    "/home/ops/.aws": "/home/ops/.aws"
  },
  "soft_time_limit": 10800,
  "time_limit": 18000,
  "disk_usage": "100GB",
  "params": [
    {
      "name": "min_lat",
      "destination": "context"
    },
    {
      "name": "max_lat",
      "destination": "context"
    },
    {
      "name": "min_lon",
      "destination": "context"
    },
    {
      "name": "max_lon",
      "destination": "context"
    },
    {
      "name": "localize_products",
      "destination": "localize"
    }
  ]
}
Job Outputs
The main file that gets executed is run_stack.sh. It:
Copies all SLC .zip files to the zip/ sub-directory
Runs get_bbox.py and exports 8 coordinates as inputs for the science code:
Code Block
read MINLAT MAXLAT MINLON MAXLON MINLAT_LO MAXLAT_HI MINLON_LO MAXLON_HI <<< $TOKENS
Runs 10 steps to complete the stack processor (a sequential driver sketch follows this list):
run.py -i ./run_files/run_1_unpack_slc_topo_master -p 8
run.py -i ./run_files/run_2_average_baseline -p 8
run.py -i ./run_files/run_3_extract_burst_overlaps -p 8
run.py -i ./run_files/run_4_overlap_geo2rdr_resample -p 8
run.py -i ./run_files/run_5_pairs_misreg -p 8
run.py -i ./run_files/run_6_timeseries_misreg -p 8
run.py -i ./run_files/run_7_geo2rdr_resample -p 8
run.py -i ./run_files/run_8_extract_stack_valid_region -p 8
run.py -i ./run_files/run_9_merge -p 8
run.py -i ./run_files/run_10_grid_baseline -p 8
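A minimal sketch of driving the 10 steps in order (note that a plain run_files/run_* glob would sort run_10 before run_2, so iterate by step number; each step must finish before the next begins):
Code Block
# Run the 10 topsStack steps sequentially, 8-way parallel within each step
for i in $(seq 1 10); do
  run.py -i ./run_files/run_${i}_* -p 8 || { echo "step $i failed" >&2; exit 1; }
done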
Output directory structure
...
Output structure of merged/
Code Block
merged/
  baselines/
    20190506/
    20190518/
    20190530/
      20190530
      20190530.full.vrt
      20190530.vrt
      20190530.xml
  geom_master/
    *.rdr.aux.xml
    *.rdr.full
    *.rdr.full.aux.xml
    *.rdr.full.vrt
    *.rdr.full.xml
  SLC/
    20190506/
    20190518/
    20190530/
      20190530.slc.full
      20190530.slc.full.aux.xml
      20190530.slc.full.vrt
      20190530.slc.full.xml
      20190530.slc.hdr
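The .vrt files under merged/ are GDAL-readable, so any of them can be inspected directly, e.g.:
Code Block
# Inspect raster metadata of one coregistered SLC (requires GDAL)
gdalinfo merged/SLC/20190530/20190530.slc.full.vrt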
Datasets.json entry
File located in .sds/files/datasets.json
Code Block
{
  "ipath": "ariamh::data/STACK",
  "match_pattern": "/(?P<id>coregistered_slcs-(?P<year>\\d{4})(?P<month>\\d{2})(?P<day>\\d{2})(?P<time>\\d{6}).+)$",
  "alt_match_pattern": null,
  "extractor": null,
  "level": "NA",
  "type": "stack",
  "publish": {
    "s3-profile-name": "default",
    "location": "s3://s3-us-west-2.amazonaws.com:80/##BUCKET##/datasets/{type}/{version}/{year}/{month}/{day}/{id}",
    "urls": [
      "http://##WEBDAV_URL##/datasets/{type}/{version}/{year}/{month}/{day}/{id}",
      "s3://##S3_URL##:80/##BUCKET##/datasets/{type}/{version}/{year}/{month}/{day}/{id}"
    ]
  },
  "browse": {
    "location": "davs://##WEBDAV_USER##@##WEBDAV_URL##/browse/{type}/{version}/{year}/{month}/{day}/{id}",
    "urls": [
      "https://##WEBDAV##/browse/{type}/{version}/{year}/{month}/{day}/{id}"
    ]
  }
}
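To sanity-check the match_pattern, you can test it against an assumed example dataset id (the "-v1.0" suffix below is illustrative):
Code Block
# The pattern requires coregistered_slcs-YYYYMMDDhhmmss plus a trailing suffix
echo "/datasets/coregistered_slcs-20190530123456-v1.0" | \
  grep -P '/coregistered_slcs-\d{4}\d{2}\d{2}\d{6}.+$' && echo "matches"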
Running on ASG (Auto Scaling Group)
Currently using c5d.9xlarge
Not enough CPU: takes ~11.5 hrs to run 30 scenes
May need to upgrade to c5d.18xlarge or i-family instances
...
STILL TODO:
Integrate Sang-Ho/Jungkyo's GNU Parallel work