Output Schema¶

This page documents the structure of allocation outputs and how parquet files are transformed to CSV format.

Output Files¶

All allocation runs produce four output files:

File	Format	Description
`allocations_relative.parquet`	Parquet	Country fractions summing to 1.0 (all columns)
`allocations_absolute.parquet`	Parquet	Shares × global target in Mt CO2e (all columns)
`allocations_wide.csv`	CSV	Simplified wide format for spreadsheet use
`param_manifest.csv`	CSV	Parameter combinations used in the run
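
As a quick sanity check on the relative file, the year columns should sum to 1.0 across countries within a single allocation configuration. The following is a minimal sketch on a hand-made toy frame: the country codes and values are placeholders, not real results, and grouping by `approach` alone is an assumption that only holds when the file contains one parameter combination per approach.

```python
import pandas as pd

# Toy stand-in for allocations_relative.parquet (placeholder values only).
toy_relative = pd.DataFrame(
    {
        "iso3c": ["AAA", "BBB", "CCC"],
        "approach": ["equal-per-capita"] * 3,
        "2020": [0.5, 0.25, 0.25],
        "2021": [0.5, 0.3, 0.2],
    }
)

# Year columns are the ones whose (string) names are all digits.
year_cols = [c for c in toy_relative.columns if c.isdigit()]

# Sum shares per approach and compare against 1.0 with a small tolerance.
totals = toy_relative.groupby("approach")[year_cols].sum()
shares_ok = bool(((totals - 1.0).abs() < 1e-9).all().all())
```

For real outputs, the grouping key would likely need to include the parameter and source columns as well, so that each group corresponds to exactly one configuration.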

Parquet Format (Full Detail)¶

The parquet files contain the complete allocation results with all metadata and parameter columns.

Identifier Columns¶

Column	Type	Description
`iso3c`	string	Country code (e.g., 'USA', 'CHN')
`approach`	string	Allocation method (e.g., 'equal-per-capita-budget')
`climate-assessment`	string	Normalised temperature target from the scenario source (e.g., '1.5C', '2C' for AR6)
`quantile`	string	Normalised probability percentile from the scenario source (e.g., '0.5' for AR6 C1 median)
`emission-category`	string	Emission type (e.g., 'co2-ffi', 'all-ghg')

Data Source Columns¶

Column	Type	Description
`source-id`	string	Composite data source identifier
`emissions-source`	string	Historical emissions dataset
`gdp-source`	string	GDP dataset
`population-source`	string	Population dataset
`gini-source`	string	Gini coefficient dataset
`target-source`	string	Target type ('rcbs', 'ar6', etc.)

Parameter Columns¶

Individual parameter columns for each allocation approach:

Column	Type	Used By
`allocation-year`	int	Budget approaches
`first-allocation-year`	int	Pathway approaches
`preserve-allocation-year-shares`	bool	Budget approaches
`preserve-first-allocation-year-shares`	bool	Pathway approaches
`pre-allocation-responsibility-weight`	float	Adjusted approaches
`capability-weight`	float	Adjusted approaches
`pre-allocation-responsibility-year`	int	Adjusted approaches
`pre-allocation-responsibility-per-capita`	bool	Adjusted approaches
`capability-per-capita`	bool	Adjusted approaches
`pre-allocation-responsibility-exponent`	float	Adjusted approaches
`capability-exponent`	float	Adjusted approaches
`pre-allocation-responsibility-functional-form`	string	Adjusted approaches
`capability-functional-form`	string	Adjusted approaches
`max-deviation-sigma`	float	All approaches (optional)
`income-floor`	float	Gini-adjusted approaches
`max-gini-adjustment`	float	Gini-adjusted approaches
`convergence-year`	int	Convergence approaches
`convergence-speed`	float	Convergence approaches

Year Columns¶

Column Pattern	Type	Description
`"2020"` etc.	float	Allocation values for each year (values are floats; column names are strings)

Note: Year column names are stored as strings, not integers. Always select them with "2020", not 2020.
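
To illustrate the note above, here is a small sketch using a toy DataFrame (placeholder values) that shows selection by string label and one way to derive integer years when they are needed, for example for plotting.

```python
import pandas as pd

# Toy frame with string year labels, mirroring the parquet layout.
df = pd.DataFrame(
    {"iso3c": ["AAA", "BBB"], "2020": [0.6, 0.4], "2030": [0.55, 0.45]}
)

# Works: the column label is the string "2020".
values_2020 = df["2020"]

# The integer 2020 is not a column label, so df[2020] would raise KeyError.
has_int_label = 2020 in df.columns

# Derive integer years from the string labels when required.
year_cols = [c for c in df.columns if c.isdigit()]
years_as_int = [int(c) for c in year_cols]
```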

Quality Columns¶

Column	Type	Description
`warnings`	string	Quality warnings (if any)
`missing-net-negative-mtco2e`	float	Missing net-negative emissions

CSV Format (Simplified)¶

The allocations_wide.csv file simplifies the parquet structure for spreadsheet use by:

Combining relative and absolute into one file with a data-type column
Collapsing parameter columns into approach-short
Collapsing metadata columns into variable
Excluding redundant columns
Converting column names to kebab-case for consistency

Column Transformation¶

1. Data Type Column (Added)¶

Column	Type	Values
`data-type`	string	'relative', 'absolute'

Indicates whether values are fractions (relative) or Mt CO2e (absolute).

2. Approach Short (Collapsed)¶

The approach-short column collapses all parameter columns into a compact code:

Format: {approach-code}-{param1}{value1}-{param2}{value2}...

Example transformations:

Parquet Columns	CSV `approach-short`
`approach="equal-per-capita"`, `first-allocation-year=2020`	`EPC-y2020`
`approach="per-capita-adjusted"`, `pre-allocation-responsibility-weight=0.5`, `capability-weight=0.5`	`PC-Adj-rw0.5-cw0.5`
`approach="cumulative-per-capita-convergence"`, `convergence-speed=0.42`	`CPCC-cs0.42`

Approach codes:

Full Approach Name	Short Code
`equal-per-capita`	`EPC`
`per-capita-adjusted`	`PC-Adj`
`per-capita-adjusted-gini`	`PC-Adj-Gini`
`per-capita-convergence`	`PCC`
`cumulative-per-capita-convergence`	`CPCC`
`cumulative-per-capita-convergence-adjusted`	`CPCC-Adj`
`cumulative-per-capita-convergence-gini-adjusted`	`CPCC-Adj-Gini`
`equal-per-capita-budget`	`EPC-B`
`per-capita-adjusted-budget`	`PC-Adj-B`
`per-capita-adjusted-gini-budget`	`PC-Adj-Gini-B`

Parameter prefixes:

Parameter	Prefix
`first-allocation-year`	`y`
`allocation-year`	`ay`
`preserve-first-allocation-year-shares`	`pfa`
`preserve-allocation-year-shares`	`pa`
`convergence-year`	`c`
`convergence-speed`	`cs`
`pre-allocation-responsibility-weight`	`rw`
`capability-weight`	`cw`
`pre-allocation-responsibility-year`	`hr`
`pre-allocation-responsibility-per-capita`	`rpc`
`capability-per-capita`	`cpc`
`pre-allocation-responsibility-exponent`	`re`
`capability-exponent`	`ce`
`pre-allocation-responsibility-functional-form`	`rff`
`capability-functional-form`	`cff`
`max-deviation-sigma`	`s`
`income-floor`	`if`
`max-gini-adjustment`	`ga`
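
The format and the two lookup tables above can be combined as in the sketch below. It is not the library's implementation: `build_approach_short` is a hypothetical helper, the dictionaries contain only a subset of the documented codes and prefixes, and the parameter ordering and the use of plain `str()` formatting for values are assumptions.

```python
# Subset of the documented approach codes and parameter prefixes.
APPROACH_CODES = {
    "equal-per-capita": "EPC",
    "per-capita-adjusted": "PC-Adj",
    "cumulative-per-capita-convergence": "CPCC",
}
PARAM_PREFIXES = {
    "first-allocation-year": "y",
    "pre-allocation-responsibility-weight": "rw",
    "capability-weight": "cw",
    "convergence-speed": "cs",
}


def build_approach_short(approach: str, params: dict) -> str:
    """Hypothetical helper: join the approach code with prefix+value pairs."""
    parts = [APPROACH_CODES[approach]]
    for name, value in params.items():
        if value is None:  # skip parameters that do not apply
            continue
        parts.append(f"{PARAM_PREFIXES[name]}{value}")
    return "-".join(parts)


epc_code = build_approach_short("equal-per-capita", {"first-allocation-year": 2020})
adj_code = build_approach_short(
    "per-capita-adjusted",
    {"pre-allocation-responsibility-weight": 0.5, "capability-weight": 0.5},
)
```

With these inputs the sketch reproduces the documented examples `EPC-y2020` and `PC-Adj-rw0.5-cw0.5`. How boolean and string parameters are rendered is not specified here, so the sketch does not cover them.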

3. Variable Column (Collapsed)¶

The variable column encodes key metadata in pipe-delimited format:

Format: {emission-category}|{target-source}|{source}|{climate-assessment}|{quantile}|{approach-short}

Example:

Text Only

co2-ffi|rcbs|primap-hist-v2.5|1.5C|0.5|EPC-y2020

This enables filtering and pivoting in spreadsheet software.
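
Outside a spreadsheet, the same string can be split back into its fields. The sketch below assumes the field order given in the format line above; `parse_variable` is a hypothetical helper, not part of the library.

```python
# Field names in the order given by the documented format.
VARIABLE_FIELDS = [
    "emission-category",
    "target-source",
    "source",
    "climate-assessment",
    "quantile",
    "approach-short",
]


def parse_variable(variable: str) -> dict:
    """Hypothetical helper: split a pipe-delimited variable into named fields."""
    parts = variable.split("|")
    if len(parts) != len(VARIABLE_FIELDS):
        raise ValueError(
            f"expected {len(VARIABLE_FIELDS)} fields, got {len(parts)}"
        )
    return dict(zip(VARIABLE_FIELDS, parts))


parsed = parse_variable("co2-ffi|rcbs|primap-hist-v2.5|1.5C|0.5|EPC-y2020")
```

On a pandas DataFrame, `df["variable"].str.split("|", expand=True)` should give the same fields as separate columns. Note that all parsed values are strings, including the quantile.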

4. CSV Column Order¶

The CSV uses this column order:

Data Context: source-id, allocation-folder
Source Data: emissions-source, gdp-source, population-source, gini-source
Target: target-source
Approach: data-type, approach-short, variable
Identity: iso3c, unit
Quality: warnings, missing-net-negative-mtco2e
Years: 2020, 2021, ..., 2100 (in chronological order)

5. Excluded Columns¶

These columns from parquet are excluded from CSV:

Parameter columns (collapsed into approach-short):

All individual parameter columns listed under Parameter Columns in the Parquet Format section

Metadata columns (collapsed into variable or redundant):

approach (encoded in approach-short)
emission-category (encoded in variable)
climate-assessment (encoded in variable)
quantile (encoded in variable)
source (encoded in variable)

Customizing CSV Output¶

The CSV transformation can be customized when calling convert_parquet_to_wide_csv():

Custom Parameter Prefixes¶

Python

from fair_shares.library.utils.data.parquet_to_csv import convert_parquet_to_wide_csv

# Use custom prefixes for specific parameters
custom_prefixes = {
    "pre-allocation-responsibility-weight": "resp",
    "capability-weight": "cap",
}

convert_parquet_to_wide_csv(
    allocations_dir="output/example/allocations/run-001/",
    config_prefixes=custom_prefixes
)

Result: PC-Adj-resp0.5-cap0.5 instead of PC-Adj-rw0.5-cw0.5

Custom Approach Names¶

Python

# Use custom short codes for approaches
# (assumes convert_parquet_to_wide_csv is imported as in the previous example)
custom_names = {
    "equal-per-capita": "Equal",
    "per-capita-adjusted": "Adjusted",
}

convert_parquet_to_wide_csv(
    allocations_dir="output/example/allocations/run-001/",
    approach_names=custom_names
)

Result: Equal-y2020 instead of EPC-y2020

Working with Outputs¶

In Python (Parquet)¶

Python

import pandas as pd

# Load full detail
df_relative = pd.read_parquet("allocations_relative.parquet")
df_absolute = pd.read_parquet("allocations_absolute.parquet")

# Filter by approach
epc = df_relative[df_relative["approach"] == "equal-per-capita"]

# Filter by parameter value
high_responsibility = df_relative[df_relative["pre-allocation-responsibility-weight"] >= 0.5]

In Python (CSV)¶

Python

import pandas as pd

# Load simplified format
df = pd.read_csv("allocations_wide.csv")

# Separate relative and absolute
relative = df[df["data-type"] == "relative"]
absolute = df[df["data-type"] == "absolute"]

# Filter by approach-short (the "EPC" prefix also matches budget codes such as "EPC-B")
epc = df[df["approach-short"].str.startswith("EPC")]
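
For time-series analysis or plotting, it can help to reshape the wide CSV into long format with one row per country and year. The sketch below uses a toy frame with placeholder values and only a few of the CSV columns; the output column names `year` and `value` are arbitrary choices for this example.

```python
import pandas as pd

# Toy stand-in for a few rows of allocations_wide.csv (placeholder values).
wide = pd.DataFrame(
    {
        "data-type": ["relative", "relative"],
        "approach-short": ["EPC-y2020", "EPC-y2020"],
        "iso3c": ["AAA", "BBB"],
        "2020": [0.6, 0.4],
        "2021": [0.58, 0.42],
    }
)

# Split columns into year columns (all-digit names) and identifier columns.
year_cols = [c for c in wide.columns if c.isdigit()]
id_cols = [c for c in wide.columns if c not in year_cols]

# One row per (identifiers, year); convert the year label to an integer.
long = wide.melt(
    id_vars=id_cols, value_vars=year_cols, var_name="year", value_name="value"
)
long["year"] = long["year"].astype(int)
```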

In R (Parquet)¶

R

library(arrow)
library(dplyr)  # provides %>% and filter()

# Load full detail
df_relative <- read_parquet("allocations_relative.parquet")
df_absolute <- read_parquet("allocations_absolute.parquet")

# Filter by approach
epc <- df_relative %>%
  filter(approach == "equal-per-capita")

In Excel (CSV)¶

Open allocations_wide.csv directly
Filter by data-type column to see relative or absolute
Filter by approach-short to compare approaches
Use pivot tables with variable column for cross-cutting analysis
