E2E Standard Product Pipeline Walkthrough

This is a step-by-step procedure for bulk processing a new area of interest (AOI) to obtain all of its interferograms (represented by S1-GUNW datasets).

An ISCE super-user could obtain all GUNWs over a single AOI within 1 week on a JPL server. It is the job of the operator to consistently beat this time.

The standard product pipeline performs radar interferometry and related processing to measure centimeter-scale deformation of the ground using pairs of single-look complex images (SLCs). The final products, S1-GUNWs, are used by scientists for this analysis. For operators curious about what radar interferometry is and how it relates to ground deformation, here is a brief introductory video from a leading expert in radar as well as a longer, more technical one from an expert at JPL.

This document currently collects the lessons learned about the standard product pipeline on AWS. There are generally two ways to run this pipeline:

Running all the steps as on-demand jobs
Setting up trigger rules based on datasets downstream of the enumerator

The two methods are elaborated simultaneously.

Walkthrough

Create an AOI

walk through this procedure

Run AOI based scraper jobs

Running these jobs ensures that GRQ has the most recent metadata about the acquisitions (acqs) that cover the AOI.

once the AOI has been created, facet on the AOI in Tosca and run the following jobs:
- acq scraper jobs
  - Action: AOI based submission of acq scraper jobs [develop]
  - Queue: factotum-job_worker-small
  - Result: this job will submit individual acq scraper jobs over the AOI
- IPF scraper jobs
  - Action: AOI based submission of IPF scraper jobs [develop]
  - Queue: factotum-job_worker-small
  - Result: this job will submit individual IPF scraper jobs over the AOI
these update the acquisition-S1-IW_SLC dataset
Notes:
- AOI and track are “overloaded” terms. Here are the definitions:
  - An AOI is the area which we want to cover with GUNWs. The SLCs to be downloaded and paired will be according to the enumeration strategy determined by the enumerator submitter below.
  - A track is the path that the satellite follows and repeats during its orbits around Earth. Here is an image for Sentinel-1’s tracks. These track numbers are called path numbers in ASF Search (see the Filters menu; the example in the linked search shows Track 48 in January).
  - How are they related? We purposefully create AOIs that align with a particular track so that all the SLCs come from a given track. This is ensured in the enumeration because we have a minimum coverage threshold for SLCs and only those within the track will satisfy that threshold. Note that every SLC is collected on a given track so every SLC has a track number.
  - We can facet on tracks within Tosca using:
    - metadata.track_number:<track_number> or metadata.trackNumber:<track_number> (depending on the ES dataset)
- After creating AOIs, the spatial extent of the AOI dataset in Tosca will be the single most important way to query the subsequent downstream datasets related to an AOI. Here is the rough process to do so:
  - Select the recently created AOI in Tosca
  - Click Query Region within said dataset
  - Facet on other datasets or query strings for further filtering.

Clean up Upstream Datasets (optional)

Context: Important datasets for monitoring are:

Acquisition Lists (created by the enumerator) : Ifg-cfgs (created by acq-list evaluator) : GUNWs (created by topsApp)

If the upstream datasets, namely acq-lists and ifg-cfgs, were generated before the enumeration job (next step), it will be difficult to track the processing. Ultimately, the GUNWs are what we deliver, and the acquisition lists and ifg-cfgs are internal. Thus, it is safe and recommended to delete the upstream datasets that intersect the extent of an AOI prior to running the enumeration. Do not delete GUNWs. More specifically, for the ops report discussed in this link, we start at the upstream dataset (acquisition lists) and then move to the downstream datasets to track where processing is failing, using the one-to-one correspondence above. Therefore, if you facet on:

AOI extent
Track number
Upstream dataset: acq-lists or ifg-cfg

And you find acquisition lists and ifg-cfgs within your AOI extent, then it will be helpful to purge these datasets. Such datasets could be due to previous AOI processing or migration from previous AWS clusters.

Facet on (a) AOI extent, (b) Track number, and (c) acq-list or ifg-cfg
- Action: Purge Dataset [develop-es]
- Queue: systems-jobs-queue

Run AOI based enumeration job

once the scraper jobs have completed, facet on the AOI in Tosca and run the following job:
- Action: AOI Enumerator Submitter [develop]
- Queue: factotum-job_worker-small
- track_number: the track number of the AOI you are processing (provided by scientist/customer)
- enumeration_job_version: develop
- enumerator_queue: aria-standard_product-enumerator
  - note the default queue is stale
- min_match: number of nearest neighbors (provided by scientist/customer)
- acquisition_version: v2.0
- skipDays: number of days to skip before pairing (provided by scientist/customer)
- Result: this job will iterate over the S1-AUX_POEORBs that cover the AOI and submit individual enumeration jobs (aoi_track_acquisition_enumerator). The individual enumeration jobs produce products in the following datasets:
  - S1-GUNW-acq-list: I think of these as a shopping cart that carries the IDs of the SLCs needed to produce an S1-GUNW
    - each of these corresponds to a unique ifg-cfg and an S1-GUNW
  - S1-GUNW-acqlist-audit_trail: these are evaluation assessments of each viable pair of SLCs evaluated by the enumerator
    - they are later used by data accountability tools
Note: If you plan to use trigger rules to improve operational efficiency, prior to running enumeration, ensure that you have:
1. a trigger rule for acq-list-evaluator (faceted on SLC) - typically this is turned on
2. a trigger rule for slc-localizer (faceted on acq-lists) - typically this is turned on
3. a trigger rule for topsApp (faceted on ifg-cfg, track number, and AOI extent) - we generally have to create this one because we increase the number of facets to ensure topsApp is not accidentally run, given its high cost.

Run AOI based enumeration job with periods (an alternative enumeration scheme)

(Alternative enumeration strategy to the one above, for when we want SLCs to be within a month range)

once the scraper jobs have completed, facet on the starttime of S1-AUX_POEORB. Here is a sample query to get January 1 to April 1: (starttime: {2014-01-01T00:00:00 TO 2014-04-01T00:00:00}) OR (starttime: {2015-01-01T00:00:00 TO 2015-04-01T00:00:00}) OR … OR (starttime: {2020-01-01T00:00:00 TO 2020-04-01T00:00:00}). You have to fill in those …! Here is an example.
- Action: Standard Product S1-GUNW - aoi_track_acquisition_enumerator [develop]
- Queue: TBD
- track_number: the track number of the AOI you are processing (provided by scientist/customer)
- AOI Name: determined via customer
- enumeration_job_version: develop
- enumerator_queue: aria-standard_product-enumerator
  - note the default queue is stale
- min_match: number of nearest neighbors (provided by scientist/customer)
- acquisition_version: v2.0
- skipDays: number of days to skip before pairing (provided by scientist/customer)
- Result: this job will iterate over the S1-AUX_POEORBs that we faceted over AND cover the AOI. The individual enumeration jobs produce products in the following datasets:
  - S1-GUNW-acq-list: I think of these as a shopping cart that carries the IDs of the SLCs needed to produce an S1-GUNW
    - each of these corresponds to a unique ifg-cfg and an S1-GUNW
  - S1-GUNW-acqlist-audit_trail: these are evaluation assessments of each viable pair of SLCs evaluated by the enumerator
    - they are later used by data accountability tools
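The multi-year starttime query described in this section is tedious to type by hand. The following is a minimal sketch of how it could be generated; the function name `period_query` and its parameters are hypothetical helpers for illustration, not part of Tosca or the pipeline. It only reproduces the query pattern shown above (one range clause per year, joined with OR).

```python
def period_query(first_year, last_year, start_mmdd="01-01", end_mmdd="04-01"):
    """Build one starttime range clause per year and join the clauses with OR.

    The clause layout copies the sample query in this walkthrough; the
    helper itself is an illustrative assumption.
    """
    clauses = []
    for year in range(first_year, last_year + 1):
        start = f"{year}-{start_mmdd}T00:00:00"
        end = f"{year}-{end_mmdd}T00:00:00"
        clauses.append("(starttime: {" + start + " TO " + end + "})")
    return " OR ".join(clauses)


if __name__ == "__main__":
    # January 1 to April 1 for every year from 2014 through 2020.
    print(period_query(2014, 2020))
```

The printed string can be pasted into the query box, so none of the "…" years need to be filled in manually.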

Download SLCs

once all acq-lists have been generated, facet on such acq-lists in Tosca
- you can query by the AOI id and then facet on the S1-GUNW-acq-list dataset
then submit the localizer jobs:
- Action: Standard Product S1-GUNW slc_localizer [develop]
- Queue: aria-standard_product-localizer (Perhaps create a dedicated queue on factotum like we did for the evaluator.)
- asf_ngap_download_queue: factotum-job_worker-slc_sling-asf
  - Note the other queue slc-sling-extract-asf is an ASG
- esa_download_queue: factotum-job_worker-slc_sling-scihub
  - Note the other queue slc-sling-extract-scihub is an ASG
- spyddder_sling_extract_version: develop
- Result: this job will iterate over the SLCs listed in the acq-list and submit a data sling job
  - these sling jobs take acquisition-S1-IW_SLCs as an input and will download the corresponding SLC from ASF (relatively old acquisition) or Scihub (acquisition is less than 2 weeks old) to s3 and register the SLC in the S1-IW_SLC dataset in GRQ
- Notes:
  - Acquisition lists are in one-to-one correspondence with ifg-cfgs
  - SLCs can be shared among acquisition lists and ifg-cfgs within an AOI. Therefore, #SLCs < #acq-lists = #ifg-cfgs within your AOI. As an example, within an AOI, there were ~700 SLCs for 2300 ifg-cfgs.
  - Say you run the localizer and you see that a bunch of ifg-cfgs haven’t been created even though most of the sling jobs have completed successfully. You may have only a few SLCs left to download (far fewer than the number of missing ifg-cfgs). Check the unique SLCs in the ops report.
  - If you have the proper trigger rules set up and activated, every time a new SLC is slinged and put into the system, then an ifg-cfg is created. This is a helpful trigger rule to have. Currently it is called acqlist_evaluator.
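The notes above say that SLCs are shared among acquisition lists, so a few missing SLCs can block many ifg-cfgs. The sketch below illustrates that bookkeeping on made-up data. The dictionary layout (acq-list ID mapped to the SLC IDs it needs) and all IDs are assumptions for illustration; the real information comes from the ops report.

```python
def slc_download_summary(acq_lists, staged_slcs):
    """Count acq-lists and unique SLCs, and list the SLCs still missing.

    acq_lists: dict mapping an acq-list ID to the SLC IDs it requires
               (hypothetical layout, not the GRQ schema).
    staged_slcs: iterable of SLC IDs already downloaded.
    """
    needed = set()
    for slc_ids in acq_lists.values():
        needed.update(slc_ids)
    missing = sorted(needed - set(staged_slcs))
    return {
        "acq_lists": len(acq_lists),
        "unique_slcs": len(needed),
        "missing_slcs": missing,
    }


if __name__ == "__main__":
    example = {
        "acq-list-A": ["slc-1", "slc-2"],
        "acq-list-B": ["slc-2", "slc-3"],
        "acq-list-C": ["slc-3", "slc-4"],
        "acq-list-D": ["slc-1", "slc-3"],
        "acq-list-E": ["slc-2", "slc-4"],
    }
    print(slc_download_summary(example, ["slc-1", "slc-2", "slc-3"]))
```

In this toy case five acq-lists need only four unique SLCs, and a single missing SLC holds back two of them.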
if you ever need to download a particular SLC, facet on the corresponding acquisition-S1-IW_SLC (the SLC is a substring of the acquisition id) and submit the following job
- Action: Data Sling and Extract for {asf, Scihub} [develop]
- Queue: factotum-job_worker-{large,small}
sling jobs have a tendency to fail since certain products are archived in the DAACs
- retrying/resubmitting the failed jobs a little later usually succeeds

Generate ifg-cfg products

facet on the SLCs in Tosca and submit the following job:
- Action: Standard Product S1-GUNW - acqlist_evaluator_ifgcfg [develop]
- Queue: factotum-job_worker-standard_product-slc_evaluator
- acqlist_version: v2.0.0
- acquisition_version: v2.0
- slc_version: v1.1
- Result: this job will look at an SLC and check the acq-lists to see if this SLC “completes” one (i.e., it was the last SLC that needed to be ingested in order to proceed). If all SLCs have been downloaded, an S1-GUNW-ifg-cfg product will be produced. These products are the input/configuration parameters for the topsapp PGE.
- Notes:
  - This is extremely fast.
  - Sometimes it’s helpful to facet on the AOI and run the SLCs that you have staged to see what new ifg-cfgs are ready to be processed.
note that these jobs are automatically submitted by the trigger rule acqlist_evaluator and you should not normally need to submit them on-demand
if you do need to facet on the SLCs over an AOI to submit jobs on-demand, do the following:
- query by the AOI region
- refine your facet by adding metadata.trackNumber: <track number> to the query box
  - note other data products have the field named metadata.track_number instead
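The “completes” check described in this section can be pictured with a small sketch. This is not the evaluator's actual code; the function name, the dictionary layout, and the IDs are all hypothetical. It only shows the idea: a newly ingested SLC completes an acq-list when every SLC on that list is now staged.

```python
def completed_by(new_slc, acq_lists, staged_slcs):
    """Return the acq-lists that the newly staged SLC completes.

    acq_lists: dict mapping an acq-list ID to the SLC IDs it requires
               (hypothetical layout).
    staged_slcs: SLC IDs that were already staged before new_slc arrived.
    """
    staged = set(staged_slcs) | {new_slc}
    return sorted(
        acq_id
        for acq_id, slc_ids in acq_lists.items()
        if new_slc in slc_ids and set(slc_ids) <= staged
    )


if __name__ == "__main__":
    example = {
        "acq-list-A": ["slc-1", "slc-2"],
        "acq-list-B": ["slc-2", "slc-3"],
        "acq-list-C": ["slc-3", "slc-4"],
        "acq-list-D": ["slc-1", "slc-3"],
    }
    # slc-3 arrives while slc-1 and slc-2 are already staged.
    print(completed_by("slc-3", example, ["slc-1", "slc-2"]))
```

Under this picture, acq-lists that still wait on another SLC (here the one needing slc-4) produce nothing until that SLC is ingested.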

Run topsapp

at this point, we have completed the necessary processing to now run topsapp and generate an ifg
facet on the ifg-cfgs and run the following job
- Action: TopsApp PGE in *Standard-Product* Pipeline for S1-GUNW Interferograms [develop]
- Queue: topsapp jobs take a while and run on expensive machines – therefore, this PGE significantly drives up costs for the pipeline! We have designated queues to tag the jobs with different accounts so customers can pay for these charges.
  - Current Recommended Queues (last updated 3/2021):
    - aria-standard_product-s1gunw-topsapp-NSLCT_Bekaert
    - aria-standard_product-s1gunw-topsapp-Access_Bekaert
    - aria-standard_product-s1gunw-topsapp-Volcano_Lundgren
    - aria-standard_product-s1gunw-topsapp-Rise_Limonadi
    - Note the last token in the above queues indicates the project name, but more tags can be seen in the Autoscaling group setup in AWS.
- dataset_tag: this is a comma-delimited list of tags that will be added to the produced S1-GUNW metadata.dataset_tags field and can be used to facet on the product in the future
  - for the standard product pipeline on AWS, standard_product,aws should always be included in this parameter
- Result: a S1-GUNW product will be produced
- Notes on Trigger Rules:
  - General trigger rules with topsApp must be created with care because making a trigger rule that is too lenient can really run up costs. Here are some general rules. For topsApp trigger rule use the following facets:
    - Spatial extent of the AOI
    - The track number associated with the AOI
    - TODO: temporal spans associated with the enumerator
  - Due to the creation of the coseismic pipeline, there are some shared datasets. It is important to use NOT "Coseismic" in the query box to ensure coseismic datasets are ignored. More specific pipelines must ignore the machine tag called s1-gunw-coseismic.
- Notes on Errors:
  - Some error types are worth mentioning because they can arise even when the pipeline has been run correctly. Make sure the errors match the examples below exactly: ISCE errors are very, very hard to diagnose, and a slight difference in the error output can point to a totally different cause (note both error examples below mention “burst”):
    - Burst overlap errors like this job - the SLCs (on two different dates) do not have an overlap. This occurs when the metadata used to enumerate the job and create the IFG-CFG was slightly off from what is on the ground and/or the overlap is just not sufficient for ISCE2 to do its processing. This means that the IFG-CFG is malformed and should be ignored.
    - DEM download errors like this job - this is likely a transient error and will go away on a re-run. Simply, the DEM was not downloaded successfully from our S3 bucket during processing. If problems persist, please reach out to Nicholas Arenas.
    - Clobber errors like this job - although “short circuits” exist within the topsApp PGE, the PGE checks the database of completed GUNWs, so it cannot see jobs that are still running. Therefore, if two identical topsApp jobs are called on the same ifg-cfg before either completes, we get these clobber errors. Note that the clobber errors will generally not all be identical because they depend on which file is uploaded first. An easy way to determine whether such an error was due to duplication in the operator’s faceting is to facet on a single input ifg-cfg and check the related topsApp jobs. Here is an example of such faceting in figaro.
  - If the errors are beyond the scope of those listed above, the relevant logs will be saved on Tosca using the HySDS triaging functionality, which is currently running for the topsApp PGE; here is an example of triaged job datasets. Facet on one of the failing ifg-cfgs and send it to the current topsApp maintainer (as of March 2021, this is charlie.z.marshak@jpl.nasa.gov).
- Trigger Rule:
  - Generally, you want to set a trigger rule related to topsApp prior to running the enumerator.
  - Trigger rules that are so narrowly faceted can be hard to create if no existing dataset exists. Thus, here is a template (to copy and paste):

Generate AOI-Tracks product

AOI tracks are often too large to be covered by 1 S1-GUNW for a given date-pair
once all the ifgs for a specific date-pair are generated an S1-GUNW-AOI_TRACK product is produced
to generate these products, facet on the S1-GUNW products and submit the following job:
- Action: Standard Product S1-GUNW - S1-GUNW Completeness Evaluator [develop]
- Queue: factotum-job_worker-large
- Result: this job will look at an S1-GUNW and check if it “completes” the track for a given date-pair. If so, an S1-GUNW-AOI_TRACK product is generated. Otherwise, the evaluator silently completes without producing anything.
note that these jobs are automatically submitted by the trigger rule s1gunw-aws-s1gunw-completeness-evaluator and you should not normally need to submit them on-demand
once an S1-GUNW-AOI_TRACK product is produced, the S1-GUNW products will be published to ASF and ARIA-products via the following pipeline: TODO: delivery pipeline
Checking the Delivery to ASF
- In addition to using the various ops reports, you can go directly to ASF: https://search.asf.alaska.edu/ and use their search by “list” feature. Copying the GUNW ids into this feature shows the delivery publicly! This is generally a good method of “delivering” the final AOI to science customers. Here is an example.
Notes on Errors:
- This is not an error, but a confusing behavior. If any GUNW from an entire date pair is absent, then none of the GUNWs for that date pair will be delivered. So even if you facet on GUNWs from a date pair with missing GUNWs, this process will complete without error but will not deliver the desired products to ASF.
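The all-or-nothing behavior described above can be sketched as a simple set comparison per date pair. This is only an illustration under stated assumptions: the completeness evaluator's real logic is not shown in this document, and the function name, the `expected_by_pair` layout, and the IDs are hypothetical.

```python
def delivery_status(expected_by_pair, produced_ids):
    """Split date pairs into deliverable ones and blocked ones.

    expected_by_pair: dict mapping a date pair to the GUNW IDs needed to
                      cover the track for that pair (hypothetical layout).
    produced_ids: iterable of GUNW IDs that exist so far.
    Returns (deliverable_pairs, blocked) where blocked maps each
    incomplete date pair to its missing GUNW IDs.
    """
    produced = set(produced_ids)
    deliverable, blocked = [], {}
    for date_pair, expected in expected_by_pair.items():
        missing = sorted(set(expected) - produced)
        if missing:
            blocked[date_pair] = missing
        else:
            deliverable.append(date_pair)
    return sorted(deliverable), blocked


if __name__ == "__main__":
    expected = {
        "20200101_20200113": ["gunw-a", "gunw-b"],
        "20200113_20200125": ["gunw-c", "gunw-d"],
    }
    print(delivery_status(expected, ["gunw-a", "gunw-b", "gunw-c"]))
```

A blocked date pair is the case to look for when the evaluator finishes without error but nothing shows up at ASF.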

Clean up

Purge localized SLCs as done here.
Delete trigger rules associated with TopsApp - clutters trigger rules.

Notes

Faceting in Tosca and Figaro

It is important to “debug” and view the progress of all the datasets in the pipeline. A few crucial tools make this much, much easier.
Spatial queries - you can use any dataset to query a region via Query Region. As mentioned in the section on creating an AOI above, this can help find the related datasets in Tosca.
- This type of query can be done with any dataset. Here are a few examples:
  - all the staged SLCs associated with an ifg-cfg, use the extent of the ifg-cfg
  - all the possible SLCs from an acquisition list, use the extent of the acquisition list
  - See the GUNWs that use a given SLC, use the extent of the SLC.
Timestamp queries - generally if you have just created some number of datasets, it’s helpful to use a timestamp to query based on when they were generated, not by location. Note these are all in UTC!
- Figaro example: @timestamp: {2021-04-02T00:00:00 TO *} (in UTC)
- Tosca example: creation_timestamp: {2021-04-05T00:00:00 TO *} (in UTC)
- If your query is not working with the said examples, the general formula is <ES KEY>: {<start> TO <end>}.
- Example use cases:
  - You want to facet, for topsApp jobs, on only the ifg-cfgs that you created in the last hour, to avoid calling topsApp jobs for ifg-cfgs that had already been created
  - You want to find the jobs of a particular job type that were called/triggered in a certain time frame to determine what type of failures occurred.
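Because the timestamps are in UTC, it is easy to get the start of a “last hour” window wrong when reading a local clock. The sketch below builds the query string from the general formula <ES KEY>: {<start> TO <end>} given above; the function name `since_query` is a hypothetical helper, and only Python's standard `datetime` module is used.

```python
from datetime import datetime, timedelta, timezone


def since_query(es_key, hours_back, now=None):
    """Build '<es_key>: {<start> TO *}' with <start> expressed in UTC.

    es_key: for example 'creation_timestamp' (Tosca) or '@timestamp' (Figaro).
    now: optional timezone-aware datetime, mainly useful for testing.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    start = (now - timedelta(hours=hours_back)).astimezone(timezone.utc)
    return es_key + ": {" + start.strftime("%Y-%m-%dT%H:%M:%S") + " TO *}"


if __name__ == "__main__":
    # Datasets created in the last hour, for pasting into the Tosca query box.
    print(since_query("creation_timestamp", 1))
```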
Metadata Queries - sometimes it’s helpful to select a particular value in Tosca. One that comes up a lot is selecting a given track number for datasets to ensure that a trigger rule is sufficiently constrained. The general format is <ES key with nesting indicated with .>: <ES value>.
- Examples:
  - metadata.track_number:<track_number> or metadata.trackNumber:<track_number> (depending on the ES dataset). This is useful for creating a trigger rule for TopsApp because an AOI, which is created to align with a track, will have this extra constraint and this ensures that no other ifg-cfgs created in the system will generate GUNWs.
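Since the track field is spelled differently across datasets, a small helper can reduce typos when composing these query strings. The following is a sketch only: the function name is hypothetical, and joining clauses with AND assumes the query box accepts standard Elasticsearch query-string boolean operators, as the OR and NOT usage elsewhere in this walkthrough suggests.

```python
def track_query(track_number, camel_case=False, exclude_coseismic=False):
    """Compose a track-number clause, optionally excluding coseismic datasets.

    camel_case: True for datasets that use metadata.trackNumber (the SLCs,
                per the ifg-cfg section), False for metadata.track_number.
    exclude_coseismic: append NOT "Coseismic", as recommended for topsApp
                       trigger rules. Combining with AND is an assumption.
    """
    field = "metadata.trackNumber" if camel_case else "metadata.track_number"
    query = f"{field}:{track_number}"
    if exclude_coseismic:
        query += ' AND NOT "Coseismic"'
    return query


if __name__ == "__main__":
    print(track_query(48))
    print(track_query(48, camel_case=True))
    print(track_query(48, exclude_coseismic=True))
```

The spatial extent of the AOI still has to be applied through Query Region; this helper only covers the text part of the facet.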

General Notes on Cost Management and Processing

The two most important sources of cost within the pipeline are:

s3 storage of the staged SLCs
topsApp jobs

Importantly, the cost from s3 storage is dependent on time. The longer staged SLCs stay on the system, the more they cost. The operator has to manage these costs in the following way:

Ensuring that SLCs for an AOI are downloaded en masse. That is, every acquisition list should have all of its SLCs. Of course, getting all the SLCs on the system is never attainable in practice. However, the more SLCs from an AOI that are downloaded, the more directly, and thus faster, the topsApp processing can be done. The purging can also be done more quickly.
Processing staged SLCs (post enumeration) into GUNWs using topsapp quickly - in other words, ensuring the topsApp jobs are run soon after the SLCs have been staged so that you are not waiting
- this is most efficiently done with trigger rules on ifg-cfgs (see the topsapp section above).
Purging SLCs that are no longer needed
- Removing the datasets also purges the SLCs from S3
- While it is beneficial to purge SLCs that are no longer needed, note that figuring out which are needed and which are not is complicated, which is why it’s best to get as many of the required SLCs as possible downloaded at once
- If you have a small number of GUNWs that are missing, it’s best to purge the existing SLCs and repeat the pipeline on the related acquisition lists/ifg-cfgs, i.e., those required to produce the missing GUNWs.
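To make the time dependence of the storage cost concrete, here is a back-of-the-envelope sketch. Every number in the example call is a placeholder chosen for easy arithmetic, not a measured SLC size or an AWS price, and the function name is hypothetical. The model simply assumes the cost scales linearly with the number of SLCs, their size, and the days they stay staged.

```python
def staged_slc_storage_cost(n_slcs, gb_per_slc, usd_per_gb_month, days_staged):
    """Rough storage cost of keeping SLCs staged for a number of days.

    Assumes a flat price per GB-month and a 30-day month; both are
    simplifications, and all inputs must be supplied by the operator.
    """
    total_gb = n_slcs * gb_per_slc
    return total_gb * usd_per_gb_month * (days_staged / 30.0)


if __name__ == "__main__":
    # Placeholder inputs: 100 SLCs of 5 GB each at 0.02 USD per GB-month.
    one_week = staged_slc_storage_cost(100, 5.0, 0.02, 7)
    one_month = staged_slc_storage_cost(100, 5.0, 0.02, 30)
    print(f"7 days: {one_week:.2f} USD, 30 days: {one_month:.2f} USD")
```

Whatever the real inputs are, the ratio between the two calls depends only on the staging time, which is why fast topsApp processing and prompt purging matter.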