ths_import
Console script for extracting NSHM hazard curves to parquet dataset format, either from a given General Task or from a single HDF5 file (as used in AWS batch jobs).
NSHM-specific prerequisites:
- hazard producer metadata is available from the NSHM toshi-api via the nshm-toshi-client library
- NSHM model characteristics are available in the nzshm-model library
Process outline:
- Given a General Task containing hazard calcs used in NSHM, we want to iterate over the sub-tasks and do the setup required for importing the hazard curves:
  - pull the configs and check we have a compatible producer config (or ...) cmd producers
  - optionally create new producer configs automatically, and record info about these
  - if new producer configs are created, then it is the user's responsibility to assign a CompatibleCalculation to each
- Hazard curves are extracted from the original HDF5 files stored in Toshi API
- Hazard curves are output as a parquet dataset.
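The setup step in the outline above can be pictured as a triage over sub-tasks. The following is a minimal sketch only, not the ths_import implementation: `SubTask`, `producer_digest` and `triage_subtasks` are hypothetical names, and it assumes a producer config can be identified by a single string key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubTask:
    """Hypothetical stand-in for one hazard sub-task of a General Task."""
    task_id: str
    producer_digest: str  # assumed: a key identifying the producer config


def triage_subtasks(subtasks, known_digests, create_missing=False):
    """Sort sub-tasks by whether a compatible producer config is known.

    Returns (ready, pending, blocked, new_digests):
      ready       - task ids whose producer config already exists
      pending     - task ids whose config was newly recorded; the user must
                    still assign a CompatibleCalculation to each new config
      blocked     - task ids with no config when create_missing is False
      new_digests - newly recorded producer config keys, in first-seen order
    """
    known = set(known_digests)
    new_digests = []
    ready, pending, blocked = [], [], []
    for task in subtasks:
        if task.producer_digest in known:
            ready.append(task.task_id)
        elif create_missing:
            if task.producer_digest not in new_digests:
                new_digests.append(task.producer_digest)
            pending.append(task.task_id)
        else:
            blocked.append(task.task_id)
    return ready, pending, blocked, new_digests


tasks = [SubTask("T1", "aaa"), SubTask("T2", "bbb"), SubTask("T3", "bbb")]
print(triage_subtasks(tasks, {"aaa"}, create_missing=True))
```

Keeping newly created configs in a separate `pending` bucket mirrors the note that a CompatibleCalculation has to be assigned by the user before those configs are usable.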
ths_import
Console script for extracting NSHM hazard curves to parquet dataset format.
- either for a given General Task, or
- a single HDF5 file (as used in runzi AWS batch jobs).
Usage:
ths_import [OPTIONS] COMMAND [ARGS]...
Options:
--help Show this message and exit.
extract
Extract OpenQuake hazard curves for the given GT_ID, writing to OUTPUT_FOLDER in parquet format.
Arguments:
GT_ID: an NSHM General Task ID containing HazardAutomation tasks.
COMPATIBLE_CALC_ID: FK of the compatible calculation.
WORK_FOLDER: used to cache and process the downloaded artefacts (set with -W, --work_folder).
OUTPUT_FOLDER: path to the output file OR S3 URI (set with -O, --output).
Notes:
- pull the configs and check we have a compatible producer config
- optionally, create any new producer configs
Usage:
ths_import extract [OPTIONS] GT_ID COMPATIBLE_CALC_ID
Options:
-W, --work_folder TEXT defaults to current directory
-O, --output TEXT local or S3 target
-v, --verbose
-d, --dry-run
-CID, --partition-by-calc-id
-f64, --use-64bit
-ff, --skip-until-id TEXT
--debug turn on debug logging
--help Show this message and exit.
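When the command is driven from a script, it can help to assemble the argument list in one place. The sketch below uses only the options and arguments listed above; the function name `build_extract_command` and the placeholder IDs and folders are illustrative assumptions, and the command is only printed, not executed.

```python
import shlex


def build_extract_command(gt_id, compatible_calc_id, work_folder=None,
                          output=None, dry_run=False, verbose=False):
    """Build an argv list for `ths_import extract` (hypothetical helper)."""
    args = ["ths_import", "extract"]
    if work_folder is not None:
        args += ["--work_folder", work_folder]
    if output is not None:
        args += ["--output", output]
    if dry_run:
        args.append("--dry-run")
    if verbose:
        args.append("--verbose")
    # Positional arguments come last, in the order shown in the usage line.
    args += [gt_id, compatible_calc_id]
    return args


# Placeholder values: substitute a real General Task ID and compatible calc ID.
cmd = build_extract_command("GT_ID_PLACEHOLDER", "CALC_ID_PLACEHOLDER",
                            work_folder="./WORKING", output="./OUTPUT",
                            dry_run=True)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` in an environment where ths_import is installed.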
producers
Prepare and validate Producer Configs for a given GT_ID.
Arguments:
GT_ID: an NSHM General Task ID containing HazardAutomation tasks.
COMPATIBLE_CALC_FK: the unique key of the compatible_calc.
Notes:
- pull the configs and check we have a compatible producer config
- optionally, create any new producer configs
Usage:
ths_import producers [OPTIONS] GT_ID COMPATIBLE_CALC_FK
Options:
-W, --work_folder TEXT defaults to current directory
-U, --update overwrite existing producer record.
-v, --verbose
--help Show this message and exit.
store-hazard
Extract OpenQuake hazard curves from HDF5_PATH, writing to OUTPUT in parquet format.
Compatibility metadata is extracted from the CONFIG_PATH.
Arguments:
HDF5_PATH: path to the hazard realization HDF5 file.
CONFIG_PATH: path to the oq_config.json file.
COMPATIBLE_CALC_ID: FK of the compatible calculation.
HAZARD_CALC_ID: FK of the hazard calculation.
ECR_DIGEST: AWS ECR SHA256 digest of the hazard Docker image,
e.g. sha256:db023d95e7ec6707fe3484c7b3c1f8fd4d1c134d5a6d7ec5e939700b625293d9
OUTPUT: path to the output file OR S3 URI.
Usage:
ths_import store-hazard [OPTIONS] HDF5_PATH CONFIG_PATH COMPATIBLE_CALC_ID
HAZARD_CALC_ID ECR_DIGEST OUTPUT
Options:
--help Show this message and exit.
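Because store-hazard takes six positional arguments, a small pre-flight check can catch a malformed ECR_DIGEST before a batch job is submitted. This is a sketch under assumptions: the helper names are hypothetical, the digest pattern is inferred from the example above (the `sha256:` prefix followed by 64 lowercase hex characters), and whether ths_import performs its own validation is not stated here.

```python
import re

# Assumed shape of ECR_DIGEST, inferred from the documented example.
_DIGEST_RE = re.compile(r"sha256:[0-9a-f]{64}")


def is_sha256_digest(value):
    """Return True if value looks like 'sha256:<64 lowercase hex chars>'."""
    return _DIGEST_RE.fullmatch(value) is not None


def build_store_hazard_command(hdf5_path, config_path, compatible_calc_id,
                               hazard_calc_id, ecr_digest, output):
    """Build an argv list for `ths_import store-hazard` (hypothetical helper)."""
    if not is_sha256_digest(ecr_digest):
        raise ValueError(f"unexpected ECR digest format: {ecr_digest!r}")
    # Positional order follows the usage line above.
    return ["ths_import", "store-hazard", hdf5_path, config_path,
            compatible_calc_id, hazard_calc_id, ecr_digest, output]


EXAMPLE_DIGEST = (
    "sha256:db023d95e7ec6707fe3484c7b3c1f8fd4d1c134d5a6d7ec5e939700b625293d9"
)
print(is_sha256_digest(EXAMPLE_DIGEST))
```

The paths and IDs passed to `build_store_hazard_command` are left to the caller; none are assumed here.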