Hazard Metadata Models

These models manage metadata about hazard calculations and producers, stored as JSON files.

Compatible Hazard Calculation

Provides a unique identifier for compatible hazard calculations (calculations that produce comparable results).

Bases: BaseModel

Attributes:

Name        Type        Description
unique_id   str         A unique identifier for the Hazard Calculation.
notes       str | None  Optional additional information about the Hazard Calculation.
created_at  datetime    The date and time this record was created. Defaults to utcnow.
updated_at  datetime    The date and time this record was last updated. Defaults to utcnow.

Source code in toshi_hazard_store/model/hazard_models_pydantic.py
class CompatibleHazardCalculation(BaseModel):
    """
    Provides a unique identifier for compatible Hazard Calculations.

    Attributes:
        unique_id: A unique identifier for the Hazard Calculation.
        notes (optional): Additional information about the Hazard Calculation.
        created_at: The date and time this record was created. Defaults to utcnow.
        updated_at: The date and time this record was last updated. Defaults to utcnow.
    """

    unique_id: str  # required: this field has no default value
    notes: str | None = None
    created_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))
    updated_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))
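
Since these records are stored as JSON files, a round trip through JSON may help illustrate how the timezone-aware timestamps behave. The following is a minimal sketch only: `CompatibleCalcSketch`, `to_json`, and `from_json` are hypothetical stand-ins built on the standard library so the example runs without pydantic or toshi_hazard_store installed. They are not the package's own serialisation code.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


def _utc_now() -> datetime:
    """Return a timezone-aware UTC timestamp, as the model's default_factory does."""
    return datetime.now(timezone.utc)


@dataclass
class CompatibleCalcSketch:
    """Hypothetical stand-in mirroring the fields of CompatibleHazardCalculation."""

    unique_id: str  # required: no default value
    notes: Optional[str] = None
    created_at: datetime = field(default_factory=_utc_now)
    updated_at: datetime = field(default_factory=_utc_now)


def to_json(record: CompatibleCalcSketch) -> str:
    """Serialise a record, writing datetimes as ISO 8601 strings."""
    payload = asdict(record)
    payload["created_at"] = record.created_at.isoformat()
    payload["updated_at"] = record.updated_at.isoformat()
    return json.dumps(payload, indent=2)


def from_json(text: str) -> CompatibleCalcSketch:
    """Rebuild a record from the JSON produced by to_json."""
    raw = json.loads(text)
    raw["created_at"] = datetime.fromisoformat(raw["created_at"])
    raw["updated_at"] = datetime.fromisoformat(raw["updated_at"])
    return CompatibleCalcSketch(**raw)


# Illustrative values only; "A" is not a real identifier from the dataset.
record = CompatibleCalcSketch(unique_id="A", notes="first compatible calculation")
restored = from_json(to_json(record))
```

Because the timestamps carry an explicit UTC offset, they survive the round trip unchanged, so `restored` compares equal to `record`.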

Hazard Curve Producer Config

Records characteristics of hazard curve producers/engines for compatibility and reproducibility.

Bases: BaseModel

For hazard curve compatibility, we use both:
  • compatible_calc_fk: curves sharing this fk are compatible because the software/science is compatible.
  • config_digest: the config digest tells us the PSHA software configuration is compatible (see nzshm-model for details).

For hazard curve reproducibility, use both:
  • ecr_image_digest: a hexdigest from the ecr_image (this is stored in the dataset).
  • ecr_image: we can run the same inputs against this Docker image to reproduce the outputs.

Attributes:

Name                Type                Description
compatible_calc_fk  str                 Foreign key to a CompatibleHazardCalculation (must map to a valid unique_id).
ecr_image_digest    str                 Docker image digest (sha256:...).
config_digest       str                 Configuration digest.
created_at          datetime            The date and time this record was created. Defaults to utcnow.
updated_at          datetime            The date and time this record was last updated. Defaults to utcnow.
ecr_image           AwsEcrImage | None  Optional AwsEcrImage for reproducibility.
notes               str | None          Optional additional information.

POSSIBLE in future
  • if necessary we can extend this with a GithubRef / DockerImage alternative to AwsEcrImage.
Source code in toshi_hazard_store/model/hazard_models_pydantic.py
class HazardCurveProducerConfig(BaseModel):
    """Records characteristics of Hazard Curve producers/engines for compatibility and reproducibility.

    For hazard curve compatibility, we use both:
        - compatible_calc_fk: curves sharing this fk are `compatible` because the software/science is compatible.
        - config_digest: the config digest tells us the PSHA software configuration is compatible
          (see nzshm-model for details).

    For hazard curve reproducibility, use both:
        - ecr_image_digest: a hexdigest from the ecr_image (this is stored in the dataset).
        - ecr_image: we can run the same inputs against this Docker image to reproduce the outputs.

    Attributes:
        compatible_calc_fk: Foreign key to a CompatibleHazardCalculation (must map to a valid unique_id).
        ecr_image_digest: Docker image digest (sha256:...).
        config_digest: Configuration digest.
        created_at: The date and time this record was created. Defaults to utcnow.
        updated_at: The date and time this record was last updated. Defaults to utcnow.
        ecr_image: Optional AwsEcrImage for reproducibility.
        notes: Optional additional information.

    POSSIBLE in future:
        - if necessary we can extend this with a GithubRef / DockerImage alternative to AwsEcrImage.
    """

    compatible_calc_fk: str  # must map to a valid CompatibleHazardCalculation.unique_id
    ecr_image_digest: str
    config_digest: str

    created_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))
    updated_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))

    ecr_image: AwsEcrImage | None = None
    notes: str | None = None

    @property
    def unique_id(self) -> str:
        """The unique ID should not include characters that are unsafe across platforms (for filename compatibility)."""
        assert self.ecr_image_digest[:7] == "sha256:"
        return self.ecr_image_digest[7:]
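
The unique_id property strips the "sha256:" prefix so that the remainder can be used in a filename (a colon is not valid in filenames on some platforms). Note that Python skips `assert` statements when run with the `-O` flag, so the prefix check in the property is not guaranteed to execute. The sketch below shows the same derivation with an explicit check instead. `digest_to_unique_id` is a hypothetical helper, not part of toshi_hazard_store, and the digest value is made up for illustration.

```python
def digest_to_unique_id(ecr_image_digest: str) -> str:
    """Return the hex part of a 'sha256:<hex>' digest, suitable for a filename.

    Hypothetical helper mirroring HazardCurveProducerConfig.unique_id, but
    raising ValueError rather than relying on assert.
    """
    prefix = "sha256:"
    if not ecr_image_digest.startswith(prefix):
        raise ValueError(f"expected a digest starting with {prefix!r}")
    return ecr_image_digest[len(prefix):]


# Made-up digest, for illustration only.
example_digest = "sha256:" + "ab12" * 16
example_id = digest_to_unique_id(example_digest)
```

Here `example_id` is the 64-character hex string with no colon in it.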

Usage

These metadata models are used together to identify compatible hazard curves:

  1. CompatibleHazardCalculation - Created when setting up a new hazard calculation
  2. HazardCurveProducerConfig - Created when running a calculation, linked to a CompatibleHazardCalculation

Two hazard curves are considered "compatible" if their producer configs share the same compatible_calc_fk and config_digest.
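
The compatibility rule can be sketched as a comparison of two fields on the producer config, compatible_calc_fk and config_digest. This is an illustration under assumptions: `ProducerKey` and `are_compatible` are hypothetical names introduced here, and the field values are invented. The package itself may perform this check differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProducerKey:
    """Hypothetical stand-in holding only the two fields that decide compatibility."""

    compatible_calc_fk: str  # maps to CompatibleHazardCalculation.unique_id
    config_digest: str


def are_compatible(a: ProducerKey, b: ProducerKey) -> bool:
    """Return True when both the calculation key and the config digest match."""
    return (a.compatible_calc_fk, a.config_digest) == (
        b.compatible_calc_fk,
        b.config_digest,
    )


# Invented values, for illustration only.
first = ProducerKey(compatible_calc_fk="A", config_digest="digest-1")
same = ProducerKey(compatible_calc_fk="A", config_digest="digest-1")
other_config = ProducerKey(compatible_calc_fk="A", config_digest="digest-2")
```

Note that ecr_image_digest plays no part in this comparison: it serves reproducibility, so two curves produced by different Docker images can still be compatible.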

Manager classes handle create, read, update, and delete (CRUD) operations on these records, which are stored as JSON files in the resources/metadata/ directory.