Acquisition Validator for an AOI

 

Related GitHub Repos and tickets

Job Runtime

Objective

Validates that all acquisitions for a given AOI exist in SDS. Any acquisitions found to be missing are reported and ingested.
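The check described above amounts to a set difference between the acquisitions expected for the AOI and those already in SDS. The sketch below is only an illustration of that idea, not the logic of the actual script: the function name find_missing and the acquisition IDs are hypothetical.

```python
def find_missing(expected_ids, sds_ids):
    """Return the acquisition IDs expected for the AOI that are absent from SDS."""
    return sorted(set(expected_ids) - set(sds_ids))


# Hypothetical IDs, made up for illustration only.
expected = ["acq-001", "acq-002", "acq-003"]
in_sds = ["acq-002"]

missing = find_missing(expected, in_sds)
for acq_id in missing:
    # In the real job, this is the point where a missing acquisition
    # would be reported and then ingested.
    print("missing:", acq_id)
```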

How to set up the inputs

 



Job Inputs:



CI Integration (Jenkins)


scihub_acquisition_scraper

  • WARNING: If you rebuild on the same branch (master), remove the existing docker image so that the rebuilt image is reloaded when the job restarts on an already-running worker: docker rmi <acquisition scraper docker image id>. If your job will run on a worker that has yet to be scaled up (i.e. is not already running), you do not need to worry about this.

  • If you need to port this container over to another cluster, e.g. from the B cluster to the C cluster

HySDS-io and Jobspec-io

hysds-io.json.aoi_validate_acquisitions

{
  "label": "AOI Validate Acquisitions",
  "submission_type": "iteration",
  "params": [
    {
      "name": "ds_cfg",
      "from": "value",
      "value": "datasets.json"
    },
    {
      "name": "starttime",
      "from": "dataset_jpath:_source.starttime"
    },
    {
      "name": "endtime",
      "from": "dataset_jpath:_source.endtime"
    },
    {
      "name": "polygon_flag",
      "from": "value",
      "value": "--polygon"
    },
    {
      "name": "polygon",
      "from": "dataset_jpath:_source.location",
      "lambda": "lambda x: __import__('json').dumps(x).replace(' ','')"
    },
    {
      "name": "ingest_flag",
      "from": "value",
      "value": "--ingest"
    },
    {
      "name": "purpose_flag",
      "from": "value",
      "value": "--purpose"
    },
    {
      "name": "purpose",
      "from": "value",
      "value": "validate"
    }
  ]
}
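To make the lambda on the polygon parameter concrete, here is a small sketch of what it does to the AOI's location value: it serializes the GeoJSON to a JSON string and strips every space, so the result can be passed as a single command-line argument. The location value below is made up for illustration; real values come from _source.location of the AOI dataset.

```python
import json

# The same lambda that appears in the "polygon" param above.
compact = lambda x: __import__('json').dumps(x).replace(' ', '')

# Hypothetical AOI location (GeoJSON polygon); the coordinates are made up.
location = {
    "type": "Polygon",
    "coordinates": [
        [[-118.5, 33.5], [-117.5, 33.5], [-117.5, 34.5], [-118.5, 34.5], [-118.5, 33.5]]
    ],
}

polygon_arg = compact(location)
print(polygon_arg)
```

Note that replace(' ', '') removes every space, including any that occur inside string values. That is harmless for a plain GeoJSON geometry but would alter string values that contain spaces.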

job-spec.json.aoi_validate_acquisitions

{
  "command": "/home/ops/verdi/ops/scihub_acquisition_scraper/acquisition_ingest/scrape_apihub_opensearch.py",
  "imported_worker_files": {
    "/home/ops/.netrc": "/home/ops/.netrc"
  },
  "required-queues": [
    "factotum-job_worker-apihub_scraper_throttled"
  ],
  "disk_usage": "10GB",
  "soft_time_limit": 3300,
  "time_limit": 3600,
  "params": [
    {
      "name": "ds_cfg",
      "destination": "positional"
    },
    {
      "name": "starttime",
      "destination": "positional"
    },
    {
      "name": "endtime",
      "destination": "positional"
    },
    {
      "name": "polygon_flag",
      "destination": "positional"
    },
    {
      "name": "polygon",
      "destination": "positional"
    },
    {
      "name": "ingest_flag",
      "destination": "positional"
    },
    {
      "name": "purpose_flag",
      "destination": "positional"
    },
    {
      "name": "purpose",
      "destination": "positional"
    }
  ]
}
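Because every parameter in the job-spec has destination "positional", the flags (--polygon, --ingest, --purpose) are passed as ordinary positional values in the order listed. The sketch below shows the command line this would produce, assuming positional params are appended to the command in job-spec order. The helper build_command and the starttime, endtime and polygon values are hypothetical; the fixed values are taken from the hysds-io file above.

```python
import shlex

COMMAND = (
    "/home/ops/verdi/ops/scihub_acquisition_scraper/"
    "acquisition_ingest/scrape_apihub_opensearch.py"
)

# Parameter order as listed in the job-spec.
PARAM_ORDER = [
    "ds_cfg", "starttime", "endtime", "polygon_flag",
    "polygon", "ingest_flag", "purpose_flag", "purpose",
]


def build_command(values):
    """Assemble argv by appending each positional param in job-spec order."""
    return [COMMAND] + [str(values[name]) for name in PARAM_ORDER]


values = {
    "ds_cfg": "datasets.json",            # fixed value from hysds-io
    "starttime": "2019-01-01T00:00:00Z",  # made-up AOI start time
    "endtime": "2019-01-31T23:59:59Z",    # made-up AOI end time
    "polygon_flag": "--polygon",          # fixed value from hysds-io
    "polygon": '{"type":"Polygon","coordinates":[[[0,0],[1,0],[1,1],[0,0]]]}',  # made up
    "ingest_flag": "--ingest",            # fixed value from hysds-io
    "purpose_flag": "--purpose",          # fixed value from hysds-io
    "purpose": "validate",                # fixed value from hysds-io
}

argv = build_command(values)
# Quote each argument so the printed line could be pasted into a shell.
print(" ".join(shlex.quote(arg) for arg in argv))
```

Pairing each flag with its own positional parameter keeps the flag and its value adjacent on the command line, which is why polygon_flag precedes polygon and purpose_flag precedes purpose.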



Job Outputs

The main file that gets executed is scrape_apihub_opensearch.py.

  • <steps>

Output directory structure

<screen shot>



Output structure of merged/



STILL TODO: