Validation Utilities
Pydantic validation models and utility functions for ensuring data integrity.
Validation Models
AllocationInputs
fair_shares.library.validation.models.AllocationInputs
Bases: BaseModel
Validates all input data before allocation runs.
This model encapsulates input validation that was previously scattered across decorators and function bodies. It validates:
- Data structure (index, columns)
- Value constraints (positive values, Gini in [0,1])
- Year coverage (required years exist)
- Data completeness (no nulls in required fields)
Examples:
>>> inputs = AllocationInputs(
... population_ts=pop_df,
... gdp_ts=gdp_df,
... first_allocation_year=2020,
... )
>>> # Validation runs automatically on model creation
>>> # Will raise ValidationError if data is invalid
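The "validation runs on model creation" behaviour can be illustrated with a minimal sketch. It assumes Pydantic v2 and pandas; `PopulationInputs` is a hypothetical stand-in, and the null and positivity rules are assumptions based on the list above, not the library's actual implementation.

```python
import pandas as pd
from pydantic import BaseModel, ConfigDict, ValidationError, field_validator


class PopulationInputs(BaseModel):
    """Hypothetical stand-in for AllocationInputs (not the library's code)."""

    # pandas objects are not native Pydantic types, so allow them explicitly.
    model_config = ConfigDict(arbitrary_types_allowed=True)

    population_ts: pd.DataFrame
    first_allocation_year: int

    @field_validator("population_ts")
    @classmethod
    def validate_population(cls, df: pd.DataFrame) -> pd.DataFrame:
        # Assumed rules: no nulls and strictly positive values.
        if df.isna().any().any():
            raise ValueError("population data contains null values")
        if (df <= 0).any().any():
            raise ValueError("population values must be positive")
        return df


good = pd.DataFrame({"2020": [10.0, 20.0]}, index=["AAA", "BBB"])
bad = pd.DataFrame({"2020": [10.0, -1.0]}, index=["AAA", "BBB"])

# Valid data: the model is created without error.
ok = PopulationInputs(population_ts=good, first_allocation_year=2020)

# Invalid data: Pydantic wraps the ValueError in a ValidationError.
try:
    PopulationInputs(population_ts=bad, first_allocation_year=2020)
    rejected = False
except ValidationError:
    rejected = True
```

Because the check lives on the model, callers cannot construct an instance that holds invalid data, which is the point of moving validation out of decorators and function bodies.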
validate_population (classmethod)
Validate population data structure and values.
validate_gdp (classmethod)
Validate GDP data structure and values (if provided).
validate_gini (classmethod)
Validate Gini data structure and range (if provided).
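A range check of this kind might look like the sketch below. The helper name `check_gini_range` and the `gini` column name are assumptions for illustration; the library's actual data layout is not shown on this page.

```python
import pandas as pd


def check_gini_range(gini: pd.DataFrame, column: str = "gini") -> None:
    """Raise ValueError if any Gini coefficient lies outside [0, 1]."""
    values = gini[column]
    out_of_range = values[(values < 0) | (values > 1)]
    if not out_of_range.empty:
        raise ValueError(
            f"Gini values outside [0, 1] for: {list(out_of_range.index)}"
        )


valid = pd.DataFrame({"gini": [0.25, 0.40]}, index=["AAA", "BBB"])
invalid = pd.DataFrame({"gini": [0.25, 40.0]}, index=["AAA", "BBB"])

check_gini_range(valid)  # passes silently

try:
    check_gini_range(invalid)
    gini_error = None
except ValueError as exc:
    gini_error = str(exc)
```

The value 40.0 mimics a common data problem: a Gini index supplied on a 0 to 100 scale instead of as a fraction.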
validate_country_emissions (classmethod)
Validate country emissions data structure and columns (if provided).
validate_world_emissions (classmethod)
Validate that world scenario emissions data has a MultiIndex that includes 'unit' (if provided).
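As a rough illustration, a structural check for a MultiIndex containing 'unit' could be written as follows. The helper name, the `source` level name, and the sample unit string are hypothetical; only the 'unit' requirement comes from the description above.

```python
import pandas as pd


def check_unit_in_multiindex(df: pd.DataFrame) -> None:
    """Raise ValueError unless df is indexed by a MultiIndex with 'unit'."""
    if not isinstance(df.index, pd.MultiIndex):
        raise ValueError("expected a MultiIndex")
    if "unit" not in df.index.names:
        raise ValueError("MultiIndex has no 'unit' level")


# A frame with the expected structure (level names partly assumed).
index = pd.MultiIndex.from_tuples(
    [("World", "Mt CO2/yr")], names=["source", "unit"]
)
world = pd.DataFrame({"2020": [100.0]}, index=index)
check_unit_in_multiindex(world)  # passes silently

# A flat index is rejected.
flat = pd.DataFrame({"2020": [100.0]}, index=["World"])
try:
    check_unit_in_multiindex(flat)
    flat_rejected = False
except ValueError:
    flat_rejected = True
```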
validate_year_coverage
validate_year_coverage() -> AllocationInputs
Ensure required years exist in all provided data.
All datasets must cover the allocation period from first_allocation_year to last_allocation_year.
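The coverage rule can be sketched as a small helper that reports which years in the allocation period are absent. It assumes years are stored as column labels convertible to `int`; the helper name `missing_years` is hypothetical.

```python
import pandas as pd


def missing_years(df: pd.DataFrame, first_year: int, last_year: int) -> list[int]:
    """Return the years in [first_year, last_year] that df has no column for."""
    available = {int(col) for col in df.columns}
    return [y for y in range(first_year, last_year + 1) if y not in available]


population = pd.DataFrame(
    {"2020": [10.0], "2021": [11.0], "2023": [13.0]}, index=["AAA"]
)

gaps = missing_years(population, first_year=2020, last_year=2023)
no_gaps = missing_years(population, first_year=2020, last_year=2021)
```

A validator built on this would raise when the returned list is non-empty, naming the missing years so the error message is actionable.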
validate_gini_year_in_bounds
validate_gini_year_in_bounds() -> AllocationInputs
Validate that the Gini reference year falls within plausible bounds.
Gini data is typically available only for recent historical periods.
AllocationOutputs
fair_shares.library.validation.models.AllocationOutputs
Bases: BaseModel
Validates allocation output data after allocation runs.
This model ensures allocation outputs are mathematically valid and complete:
- Shares sum to 1.0 for each year (within numerical tolerance)
- No null values in shares (unless expected post-net-zero)
- Data structure is consistent with input requirements
Examples:
>>> outputs = AllocationOutputs(
... shares=shares_df,
... dataset_name="Equal Per Capita Shares",
... )
>>> # Validation runs automatically on model creation
>>> # Will raise ValidationError if shares are invalid
>>> # Allow post-net-zero NaN values
>>> outputs = AllocationOutputs(
... shares=shares_df,
... dataset_name="Pathway Shares",
... first_year=2020,
... reference_data=world_pathway_df, # NaN pattern template
... )
validate_shares_sum
validate_shares_sum() -> AllocationOutputs
Validate that shares sum to 1.0 for each year.
Uses the tolerance parameter for floating point comparison. Years containing NaN values are skipped (e.g., post-net-zero periods).
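A minimal sketch of this check, assuming years are columns and countries are rows, is shown below. The helper name and the default tolerance of `1e-6` are assumptions; the model's actual default is not stated on this page.

```python
import numpy as np
import pandas as pd


def years_not_summing_to_one(
    shares: pd.DataFrame, tolerance: float = 1e-6
) -> list[str]:
    """Return year columns whose shares do not sum to 1.0 within tolerance."""
    failing = []
    for year in shares.columns:
        column = shares[year]
        # Years containing NaN (e.g. post-net-zero) are skipped entirely.
        if column.isna().any():
            continue
        # Absolute tolerance only, so the bound does not scale with the sum.
        if not np.isclose(column.sum(), 1.0, atol=tolerance, rtol=0.0):
            failing.append(year)
    return failing


shares = pd.DataFrame(
    {
        "2020": [0.25, 0.75],      # sums to 1.0
        "2021": [0.5, 0.6],        # sums to about 1.1
        "2022": [np.nan, np.nan],  # skipped
    },
    index=["AAA", "BBB"],
)

failing_years = years_not_summing_to_one(shares)
```

Comparing with a tolerance rather than `==` matters because shares computed by division rarely sum to exactly 1.0 in floating point.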
validate_no_unexpected_nulls
validate_no_unexpected_nulls() -> AllocationOutputs
Validate that there are no unexpected null values in shares.
If reference_data is provided, NaN values in shares are allowed if they match the NaN pattern in reference_data (e.g., for post-net-zero pathway years where global emissions are zero).
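One way to picture the NaN-pattern comparison is the sketch below. It assumes both frames use years as columns and that a NaN year in the reference permits NaN shares for that year; the helper name and this matching rule are illustrative assumptions, not the library's confirmed logic.

```python
from typing import Optional

import numpy as np
import pandas as pd


def unexpected_null_years(
    shares: pd.DataFrame, reference: Optional[pd.DataFrame] = None
) -> list[str]:
    """Return year columns with NaN shares that the reference does not excuse."""
    has_null = shares.isna().any(axis=0)
    null_years = [year for year, flag in has_null.items() if flag]
    if reference is None:
        # Without a template, every NaN is unexpected.
        return null_years
    ref_null = reference.isna().any(axis=0)
    allowed = {year for year, flag in ref_null.items() if flag}
    return [year for year in null_years if year not in allowed]


shares = pd.DataFrame(
    {"2020": [0.4, 0.6], "2021": [np.nan, 1.0], "2022": [np.nan, np.nan]},
    index=["AAA", "BBB"],
)
# Hypothetical world pathway: no value in 2022 (post-net-zero).
world_pathway = pd.DataFrame(
    {"2020": [100.0], "2021": [50.0], "2022": [np.nan]}, index=["World"]
)

with_reference = unexpected_null_years(shares, world_pathway)
without_reference = unexpected_null_years(shares)
```

Here the 2022 NaN values are excused by the reference, while the stray NaN in 2021 is still reported.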
See Also
- Budget Allocations: Functions that use validation
- Pathway Allocations: Functions that use validation