ths_import

Console script for extracting NSHM hazard curves to parquet dataset format, either from a given General Task or from a single HDF5 file (as used in AWS batch jobs).

NSHM-specific prerequisites:
  • hazard producer metadata is available from the NSHM toshi-api via the nshm-toshi-client library
  • NSHM model characteristics are available in the nzshm-model library

Process outline:

- Given a general task containing hazard calcs used in NSHM, we want to iterate over the sub-tasks and do the setup required for importing the hazard curves:
  - pull the configs and check we have a compatible producer config (or ...) cmd producers
  - optionally create new producer configs automatically, and record info about these
  - if new producer configs are created, then it is the user's responsibility to assign a CompatibleCalculation to each

- Hazard curves are extracted from the original HDF5 files stored in Toshi API

- Hazard curves are output as a parquet dataset.
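
The producer-config check in the outline can be pictured with a small sketch. This is not the tool's implementation: `ProducerKey`, `ProducerRegistry`, and the idea that a producer config is identified by an image digest plus a config digest are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ProducerKey:
    """Hypothetical identity of a producer config (assumed: image + config digest)."""
    image_digest: str
    config_digest: str


@dataclass
class ProducerRegistry:
    """Hypothetical in-memory stand-in for the stored producer configs."""
    known: set = field(default_factory=set)

    def resolve(self, keys, create_missing=False):
        """Sort keys into (compatible, created, unresolved) lists."""
        compatible, created, unresolved = [], [], []
        for key in keys:
            if key in self.known:
                compatible.append(key)      # a matching producer config exists
            elif create_missing:
                self.known.add(key)         # create and record the new config
                created.append(key)
            else:
                unresolved.append(key)      # caller must deal with these
        return compatible, created, unresolved


registry = ProducerRegistry(known={ProducerKey("sha256:aaa", "cfg-1")})
subtasks = [ProducerKey("sha256:aaa", "cfg-1"), ProducerKey("sha256:aaa", "cfg-2")]
compatible, created, unresolved = registry.resolve(subtasks, create_missing=True)
for key in created:
    # Newly created configs still need a CompatibleCalculation assigned by the user.
    print(f"created {key}; assign a CompatibleCalculation before importing")
```

Returning the created keys separately mirrors the outline's point that new producer configs are recorded and then need manual follow-up.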

ths_import

Console script for extracting NSHM hazard curves to parquet dataset format.

  • either for a given General Task, or
  • a single HDF5 file (as used in runzi AWS batch jobs).

Usage:

ths_import [OPTIONS] COMMAND [ARGS]...

Options:

  --help  Show this message and exit.

extract

Extract OpenQuake hazard curves for the given GT_ID, writing to OUTPUT_FOLDER in parquet format.

Arguments:

GT_ID: an NSHM General Task ID containing HazardAutomation Tasks.

COMPATIBLE_CALC_ID: foreign key (FK) of the compatible calculation.

WORK_FOLDER: used to cache and process the downloaded artefacts (set with -W/--work_folder).

OUTPUT_FOLDER: path to the output file OR S3 URI (set with -O/--output).

Notes:

  • pull the configs and check we have a compatible producer config

  • optionally, create any new producer configs

Usage:

ths_import extract [OPTIONS] GT_ID COMPATIBLE_CALC_ID

Options:

  -W, --work_folder TEXT        defaults to current directory
  -O, --output TEXT             local or S3 target
  -v, --verbose
  -d, --dry-run
  -CID, --partition-by-calc-id
  -f64, --use-64bit
  -ff, --skip-until-id TEXT
  --debug                       turn on debug logging
  --help                        Show this message and exit.

producers

Prepare and validate Producer Configs for a given GT_ID.

Arguments:

GT_ID: an NSHM General Task ID containing HazardAutomation Tasks.

COMPATIBLE_CALC_FK: the unique key of the compatible_calc.

Notes:

  • pull the configs and check we have a compatible producer config

  • optionally, create any new producer configs

Usage:

ths_import producers [OPTIONS] GT_ID COMPATIBLE_CALC_FK

Options:

  -W, --work_folder TEXT  defaults to current directory
  -U, --update            overwrite existing producer record.
  -v, --verbose
  --help                  Show this message and exit.

store-hazard

Extract OpenQuake hazard curves from HDF5_PATH, writing to OUTPUT in parquet format.

Compatibility metadata is extracted from the CONFIG_PATH.

Arguments:

HDF5_PATH: path to the hazard realization HDF5 file.

CONFIG_PATH: path to the oq_config.json file.

COMPATIBLE_CALC_ID: FK of the compatible calculation.

HAZARD_CALC_ID: FK of the hazard calculation.

ECR_DIGEST: AWS ECR SHA256 digest of the hazard docker image.

e.g. sha256:db023d95e7ec6707fe3484c7b3c1f8fd4d1c134d5a6d7ec5e939700b625293d9

OUTPUT: path to the output file OR S3 URI.

Usage:

ths_import store-hazard [OPTIONS] HDF5_PATH CONFIG_PATH COMPATIBLE_CALC_ID
                        HAZARD_CALC_ID ECR_DIGEST OUTPUT

Options:

  --help  Show this message and exit.
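
Because `store-hazard` takes six positional arguments, a wrapper that checks the digest shape before building the command can catch ordering mistakes early. This is a sketch only: the helper names are hypothetical, and the `sha256:` plus 64 lowercase hex characters check is a caller-side assumption based on the example digest, not a statement about what the tool itself validates.

```python
import re

# Assumed shape: "sha256:" followed by 64 lowercase hex characters.
ECR_DIGEST_RE = re.compile(r"sha256:[0-9a-f]{64}")


def is_valid_ecr_digest(digest):
    """Return True if the string looks like a SHA256 image digest."""
    return ECR_DIGEST_RE.fullmatch(digest) is not None


def build_store_hazard_command(hdf5_path, config_path, compatible_calc_id,
                               hazard_calc_id, ecr_digest, output):
    """Build the argument list for `ths_import store-hazard` in documented order."""
    if not is_valid_ecr_digest(ecr_digest):
        raise ValueError(f"unexpected ECR digest format: {ecr_digest!r}")
    return ["ths_import", "store-hazard", hdf5_path, config_path,
            compatible_calc_id, hazard_calc_id, ecr_digest, output]


# Paths and IDs below are placeholders; the digest is the example given above.
cmd = build_store_hazard_command(
    "calc_1.hdf5",
    "oq_config.json",
    "COMPATIBLE_CALC_ID_PLACEHOLDER",
    "HAZARD_CALC_ID_PLACEHOLDER",
    "sha256:db023d95e7ec6707fe3484c7b3c1f8fd4d1c134d5a6d7ec5e939700b625293d9",
    "./output_dataset",
)
print(" ".join(cmd))
# To execute for real: subprocess.run(cmd, check=True)
```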