...
The system applies ML to detect potential anomalies in the displacement time-series and publishes detected results back to the GRQ catalog.
The user logs into the system to browse potential anomalies.
Science story: Anomaly detection will inform scientists of possible volcanoes to monitor closely.
System Requirements
Dependencies for ARIA-tools:
Installation instructions for ARIA-tools are outlined here: https://github.com/aria-tools/ARIA-tools#installation. The requirements.txt and environment.yml files contain the full list of requirements and dependencies, but here are the requirements as laid out on the README page of the repository:
Packages:
```
Python >= 3.5 (3.6 preferred)
PROJ >= 6.0 (https://github.com/OSGeo/proj)
GDAL and its Python bindings >= 3.0 (https://gdal.org/)
```
Python dependencies:
```
SciPy (https://scipy.org/)
netcdf4 (http://unidata.github.io/netcdf4-python/netCDF4/index.html)
requests (https://2.python-requests.org/en/master/)
```
Additionally, you need to download the ARIA-tools documentation repository:
```
git clone https://github.com/aria-tools/ARIA-tools-docs.git
```
Dependencies and/or requirements for MintPy:
Installation instructions for various OS (as well as notes for Docker users) are encompassed here: https://github.com/insarlab/MintPy/blob/master/docs/installation.md
The following requirements are in the requirements.txt file in the MintPy github repository:
```
cvxopt
dask>=1.0
dask-jobqueue>=0.3
defusedxml
h5py
lxml
matplotlib
numpy
pyproj
pykml
pyresample
scikit-image
scikit-learn
scipy
```
MintPy and PyAPS repositories:
```
git clone https://github.com/insarlab/MintPy.git
git clone https://github.com/yunjunz/pyaps3.git
```
The configuration file necessary to run ARIA data through MintPy is attached to this page.
Example templates can also be found on the MintPy website: https://mintpy.readthedocs.io/en/latest/examples/input_files/
Processing Pipeline
- Create a new diagram of (a) the processing steps and (b) the data.
- What are the key steps needed from S1-GUNW -> ARIA-tools -> MintPy -> L3 displacement time series?
- For each step, find out what the inputs and outputs are, and what condition(s) trigger that step.
- What are the key datasets?
  - type
  - dataset naming convention
- Identify the source code of each step.
Draft diagram of Processing Pipeline - to be refined as more information is learned:
...
This diagram branches off from the larger system diagram on Standard Product S1-GUNW Processing Pipeline.
Running ARIA data through MintPy (this section is in progress)
This section gives an overview of the necessary steps to run S1-GUNW products from a complete AOI track through the MintPy displacement time-series calculations.
Nominally, for a user outside of HySDS, the first step would be to download the data products of interest using ariaDownload.py from ARIA-tools (https://nbviewer.jupyter.org/github/aria-tools/ARIA-tools-docs/blob/master/JupyterDocs/ariaDownload/ariaDownload_tutorial.ipynb). This command allows the user to bound the data query spatially (with a specified bounding box or a link to a shapefile) or temporally (with start/stop dates or a temporal baseline). For the purposes of automation, however, it is far more efficient to avoid downloading S1-GUNW files from S3 altogether.
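For a manual run outside the automated pipeline, composing the download call might look like the sketch below. The flags (-b for the SNWE bounding box, -s/-e for start/end dates, -o Download to fetch the products) follow the ariaDownload tutorial linked above, but the specific values here are illustrative placeholders, not a real AOI:

```python
import shlex

# Illustrative ariaDownload.py invocation; bounding box and date
# range are placeholder values, not a real AOI.
cmd = [
    "ariaDownload.py",
    "-b", "37.25 38.1 -122.6 -121.75",  # SNWE bounding box
    "-s", "20190101",                   # start date (YYYYMMDD)
    "-e", "20200101",                   # end date (YYYYMMDD)
    "-o", "Download",                   # download the products themselves
]
print(shlex.join(cmd))
# A pipeline step would execute this with subprocess.run(cmd, check=True)
```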
Step 1:
When an S1-GUNW product is created and within a specified AOI for the purposes of monitoring volcanic activity, it triggers the AOI track evaluator. The AOI track evaluator creates a JSON file containing, among other metadata, a list of GUNW products when the full track for the AOI has been completed.
Alternatively, S1-GUNW production for a specified AOI track could also be triggered upon the user first defining the AOI. The system will pull relevant SLCs and other data to create the S1-GUNWs for the AOI, if the GUNWs for that area do not already exist.
Step 2:
The output JSON file can be passed into the PGE for displacement time-series calculation. The first step within the PGE should be to reformat the data and metadata present in the S1-GUNW files for MintPy processing, using the ARIA-tools command ariaTSsetup.py.
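A minimal sketch of this handoff is shown below: read the evaluator's JSON output and write the plain URL list that ariaTSsetup.py consumes (see Step 3). The JSON field names ("aoi_id", "s1_gunw_urls") and the example URLs are assumptions for illustration only; the real schema of the AOI track evaluator output still needs to be confirmed:

```python
import json

# Hypothetical shape of the AOI track evaluator output; the field
# names and URLs below are illustrative assumptions.
evaluator_output = json.loads("""
{
  "aoi_id": "example_volcano_aoi",
  "s1_gunw_urls": [
    "https://example.com/S1-GUNW-A.nc",
    "https://example.com/S1-GUNW-B.nc"
  ]
}
""")

# ariaTSsetup.py expects a text file with one product URL per line
# and nothing else (no headers, footers, or labels).
with open("gunw_urls.txt", "w") as f:
    for url in evaluator_output["s1_gunw_urls"]:
        f.write(url + "\n")
```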
...
Need to clarify which products we want to do the time-series calculation on - just the most recent pair, or annual/seasonal pairs and nominal nearest 2-neighbors? --Paul and Grace have said nearest 2-neighbors is fine for all four identified volcanic areas (Domuyo, Taal, Galapagos Islands, Kilauea), with additional annual pairs for Domuyo from Jan 15 - April 15.
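The nearest 2-neighbors pairing can be sketched as follows: each acquisition date is paired with the next two acquisitions in time. This is a conceptual illustration of the pairing scheme, not the actual pair-selection code used in production, and the dates are made up:

```python
def nearest_neighbor_pairs(dates, n_neighbors=2):
    """Pair each acquisition date with its next n_neighbors acquisitions."""
    dates = sorted(dates)
    pairs = []
    for i, ref in enumerate(dates):
        # Secondary dates: the next n_neighbors acquisitions after ref.
        for sec in dates[i + 1 : i + 1 + n_neighbors]:
            pairs.append((ref, sec))
    return pairs

# Example 12-day acquisition cadence (placeholder dates).
acquisitions = ["20200101", "20200113", "20200125", "20200206"]
print(nearest_neighbor_pairs(acquisitions))
```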
Step 3:
The ariaTSsetup.py code can take a text file containing the URLs of each data product as input. The text file should contain one URL per line and no other information (no headers, footers, labels, etc.).
The bounding box of the AOI can be passed in with -b 'coordinates in SNWE'. To extract metadata layers from the input data, the user needs to download a DEM (--dem Download). There is also an option to download a mask (--mask Download) to remove any water bodies from the data.
Calling ariaTSsetup.py should look like:
```
ariaTSsetup.py -f 'nameOfTextFile.txt' -b '37.25 38.1 -122.6 -121.75' --mask Download --dem Download
```
There is also an option to specify a working directory (-w) in which the intermediate products and final outputs are saved. If not otherwise specified, the default working directory is the current directory.
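Within an automated PGE, the same call could be composed programmatically from the pipeline inputs, as in the sketch below. The file name, bounding box, and working directory are placeholders; the flags are those shown in the example above plus -w:

```python
import shlex

# Placeholder pipeline inputs.
url_file = "gunw_urls.txt"                  # text file of product URLs
bbox_snwe = "37.25 38.1 -122.6 -121.75"     # AOI bounding box (SNWE)
work_dir = "./ts_workdir"                   # hypothetical working directory

# Compose the ariaTSsetup.py invocation as an argument list.
cmd = [
    "ariaTSsetup.py",
    "-f", url_file,
    "-b", bbox_snwe,
    "--mask", "Download",
    "--dem", "Download",
    "-w", work_dir,
]
print(shlex.join(cmd))
# A pipeline step would execute this with subprocess.run(cmd, check=True)
```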
Questions to answer:
Do we want to use a bounding box in this step? If so, that information is also present in the JSON string from the AOI-track-evaluator, and so can be read in from there.
ariaTSsetup.py also takes minimum overlap (-mo) as an input, defined in units of km^2. Do we want to include this? If so, do we want a single defined value, or should this change depending on the AOI? Who should decide this value?
Step 4:
Once the data has been prepared, it can be run through the main MintPy application, smallbaselineApp.py. The input to this command is the custom configuration file (here named smallbaselineApp.cfg), which is attached to this page.
Further information about configuration templates is on the MintPy website: https://mintpy.readthedocs.io/en/latest/examples/input_files/. The figure below shows the nominal flow of smallbaselineApp.py. Optional steps are indicated by dotted-line boundaries; these optional steps are omitted unless otherwise directed.
...
(Figure from Yunjun et al., 2019)
An example call to smallbaselineApp is shown below:
```
smallbaselineApp.py smallbaselineApp.cfg
```
The output product is an HDF-EOS5 format file containing the displacement time-series with geometry information.
Questions to answer:
...
The functionality of smallbaselineApp allows for starting and stopping, so we could skip processing steps the scientists do not want. (from meeting on 7/10/2020, they indicated a preference for raw displacement time-series data, so we could skip atmospheric corrections, etc)
Need to clarify with science team which steps they want to include or skip - do they want only the raw phase time series? (blue portion of figure above) Or the displacement time series (green portion of figure above)?
From Emre - we can edit the template file to tell the program to skip tropospheric correction, DEM error correction, etc. That way we just edit the template (configuration file) rather than having to edit the call to smallbaselineApp.
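A minimal sketch of toggling those options in the template programmatically is shown below. The option names mintpy.troposphericDelay.method and mintpy.topographicResidual follow MintPy's template conventions, but they should be verified against the template documentation linked above before use; the stand-in template written here is for illustration only:

```python
# Option overrides to make smallbaselineApp.py skip corrections.
# NOTE: key names are assumed from MintPy template conventions and
# should be double-checked against the template docs.
overrides = {
    "mintpy.troposphericDelay.method": "no",  # skip tropospheric correction
    "mintpy.topographicResidual": "no",       # skip DEM error correction
}

template = "smallbaselineApp.cfg"

# Write a tiny stand-in template so this sketch is self-contained.
with open(template, "w") as f:
    f.write("mintpy.troposphericDelay.method = auto\n")
    f.write("mintpy.topographicResidual = auto\n")

# Rewrite any line whose key appears in the overrides dict.
lines = []
with open(template) as f:
    for line in f:
        key = line.split("=")[0].strip()
        if key in overrides:
            line = f"{key} = {overrides[key]}\n"
        lines.append(line)

with open(template, "w") as f:
    f.writelines(lines)
```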
After initial calculation of displacement time-series, we want to update the time-series at a defined cadence. This brings up a few decisions:
...
Do we want the time-series to update at a user-defined cadence (individual for each AOI), or have a single pre-designated cadence for all AOIs and design the system to update at that cadence?
How do we update the time-series? Re-process all old data? Or combine time-series HDF-EOS5 files after they’re created by MintPy?
...
If combining, how to accomplish this?
...
Implementation Notes (Alex Dunn)
...