
Validation Utilities

Pydantic validation models and utility functions for ensuring data integrity.

Validation Models

AllocationInputs

fair_shares.library.validation.models.AllocationInputs

Bases: BaseModel

Validates all input data before allocation runs.

This model encapsulates input validation that was previously scattered across decorators and function bodies. It validates:

  • Data structure (index, columns)
  • Value constraints (positive values, Gini in [0,1])
  • Year coverage (required years exist)
  • Data completeness (no nulls in required fields)

Examples:

Python Console Session
>>> inputs = AllocationInputs(
...     population_ts=pop_df,
...     gdp_ts=gdp_df,
...     first_allocation_year=2020,
... )
>>> # Validation runs automatically on model creation
>>> # Will raise ValidationError if data is invalid

validate_population classmethod
Python
validate_population(
    v: TimeseriesDataFrame,
) -> TimeseriesDataFrame

Validate population data structure and values.

validate_gdp classmethod
Python
validate_gdp(
    v: TimeseriesDataFrame | None,
) -> TimeseriesDataFrame | None

Validate GDP data structure and values (if provided).

validate_gini classmethod
Python
validate_gini(v: DataFrame | None) -> DataFrame | None

Validate Gini data structure and range (if provided).
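The range rule mentioned above (Gini in [0, 1]) can be illustrated with a small plain-Python sketch. This is not the library's implementation: the helper name `find_out_of_range_gini` and the dict-based input are hypothetical stand-ins for the DataFrame that `validate_gini` actually receives.

```python
def find_out_of_range_gini(gini_by_country: dict[str, float]) -> list[str]:
    """Return the countries whose Gini value is not inside [0, 1].

    Hypothetical helper for illustration only. NaN fails the chained
    comparison below, so in this sketch missing values are flagged too.
    """
    bad = []
    for country, value in gini_by_country.items():
        if not (0.0 <= value <= 1.0):
            bad.append(country)
    return bad


# Made-up sample values, keyed by placeholder country codes.
sample = {"AAA": 0.32, "BBB": 1.4, "CCC": float("nan")}
problems = find_out_of_range_gini(sample)
print(problems)
```

A real validator would presumably raise an error listing the offending rows rather than return them; returning a list keeps the sketch easy to inspect.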

validate_country_emissions classmethod
Python
validate_country_emissions(
    v: TimeseriesDataFrame | None,
) -> TimeseriesDataFrame | None

Validate country emissions data structure and columns (if provided).

validate_world_emissions classmethod
Python
validate_world_emissions(
    v: TimeseriesDataFrame | None,
) -> TimeseriesDataFrame | None

Validate that world scenario emissions data has a MultiIndex that includes 'unit' (if provided).

validate_year_coverage
Python
validate_year_coverage() -> AllocationInputs

Ensure required years exist in all provided data.

All datasets must cover the allocation period from first_allocation_year to last_allocation_year.
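As a rough illustration of the coverage rule, the sketch below reports which years in the allocation period are absent from each dataset. It assumes annual data in which every year from first_allocation_year to last_allocation_year must be present; the helper `missing_years` and the sample year lists are hypothetical, and the real validator works on the model's DataFrames.

```python
def missing_years(available_years, first_year, last_year):
    """Return the years in [first_year, last_year] that are not available.

    Hypothetical helper: assumes one entry per calendar year is required.
    """
    available = set(available_years)
    return [year for year in range(first_year, last_year + 1) if year not in available]


# Made-up year coverage for two datasets.
datasets = {
    "population": [2019, 2020, 2021, 2022],
    "gdp": [2020, 2022],
}
gaps = {name: missing_years(years, 2020, 2022) for name, years in datasets.items()}
print(gaps)
```

Collecting the gaps per dataset, rather than stopping at the first failure, makes it possible to report every coverage problem in one error message.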

validate_gini_year_in_bounds
Python
validate_gini_year_in_bounds() -> AllocationInputs

Validate that the Gini reference year is reasonable.

Gini data is typically only available for recent historical periods.

AllocationOutputs

fair_shares.library.validation.models.AllocationOutputs

Bases: BaseModel

Validates allocation output data after allocation runs.

This model ensures allocation outputs are mathematically valid and complete:

  • Shares sum to 1.0 for each year (within numerical tolerance)
  • No null values in shares (unless expected post-net-zero)
  • Data structure is consistent with input requirements

Examples:

Python Console Session
>>> outputs = AllocationOutputs(
...     shares=shares_df,
...     dataset_name="Equal Per Capita Shares",
... )
>>> # Validation runs automatically on model creation
>>> # Will raise ValidationError if shares are invalid

Python Console Session
>>> # Allow post-net-zero NaN values
>>> outputs = AllocationOutputs(
...     shares=shares_df,
...     dataset_name="Pathway Shares",
...     first_year=2020,
...     reference_data=world_pathway_df,  # NaN pattern template
... )

validate_shares_sum
Python
validate_shares_sum() -> AllocationOutputs

Validate that shares sum to 1.0 for each year.

Uses the tolerance parameter for floating point comparison. Years containing NaN values are skipped (e.g., post-net-zero periods).
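The check described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the model's actual code: the helper `years_with_bad_sums`, the dict-of-lists layout, and the default tolerance of 1e-6 are all hypothetical.

```python
import math


def years_with_bad_sums(shares_by_year, tolerance=1e-6):
    """Return the years whose shares do not sum to 1.0 within tolerance.

    Hypothetical helper. Years containing any NaN are skipped, mirroring
    the documented behaviour for post-net-zero periods.
    """
    bad = []
    for year, shares in shares_by_year.items():
        if any(math.isnan(share) for share in shares):
            continue  # skip years with NaN values
        if abs(math.fsum(shares) - 1.0) > tolerance:
            bad.append(year)
    return bad


# Made-up shares for three countries.
shares_by_year = {
    2020: [0.5, 0.3, 0.2],          # sums to 1.0
    2021: [0.5, 0.3, 0.3],          # sums to 1.1
    2022: [float("nan")] * 3,       # skipped
}
bad_years = years_with_bad_sums(shares_by_year)
print(bad_years)
```

Comparing against an absolute tolerance avoids false failures from floating-point rounding when many small shares are added together.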

validate_no_unexpected_nulls
Python
validate_no_unexpected_nulls() -> AllocationOutputs

Validate that there are no unexpected null values in shares.

If reference_data is provided, NaN values in shares are allowed if they match the NaN pattern in reference_data (e.g., for post-net-zero pathway years where global emissions are zero).
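One way to picture the NaN-pattern rule is the sketch below. It assumes the reference data holds one value per year and that a NaN there marks a year in which NaN shares are expected; how the library actually aligns shares with reference_data is not shown in this page, so the helper `unexpected_null_years` and the data layout are hypothetical.

```python
import math


def unexpected_null_years(shares_by_year, reference_by_year=None):
    """Return the years where shares contain NaN that the reference does not explain.

    Hypothetical helper: a NaN in shares is tolerated only when the
    reference value for the same year is also NaN.
    """
    unexpected = []
    for year, shares in shares_by_year.items():
        if not any(math.isnan(share) for share in shares):
            continue  # nothing to explain for this year
        reference = None if reference_by_year is None else reference_by_year.get(year)
        allowed = reference is not None and math.isnan(reference)
        if not allowed:
            unexpected.append(year)
    return unexpected


# Made-up data: 2060 stands in for a post-net-zero year.
shares = {
    2020: [0.6, 0.4],
    2021: [float("nan"), 0.4],
    2060: [float("nan"), float("nan")],
}
reference = {2020: 35.0, 2021: 34.0, 2060: float("nan")}

with_reference = unexpected_null_years(shares, reference)
without_reference = unexpected_null_years(shares)
print(with_reference, without_reference)
```

Without a reference, every NaN counts as unexpected; with one, only the NaN values that the reference does not mirror are reported.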

See Also