Inference Config

The following is an example configuration for running inference. While you can use absolute paths in the config YAML files (we encourage it!), the example uses paths relative to the directory from which you run the command. The example assumes a directory structure like:

.
├── ckpt.tar
├── initial_condition
│   └── data.nc  # name must reflect the path in the config
└── forcing
    ├── data1.nc  # files can have any name, but must sort into time-sequential order
    ├── data2.nc  # can have any number of netCDF files
    └── ...

The .nc files correspond to data files like 2021010100.nc in the Zenodo repository, while ckpt.tar corresponds to a file like ace_ckpt.tar in that repository.
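
As a quick pre-flight check, the layout above can be verified before launching inference. The sketch below uses only the Python standard library. The helper name check_inference_layout is hypothetical and not part of fme, and it assumes that plain lexicographic filename order is what "sort into time-sequential order" refers to.

```python
from pathlib import Path


def check_inference_layout(root: str, file_pattern: str = "*.nc") -> list[str]:
    """Return forcing file names in sorted order, raising if the layout is incomplete.

    Hypothetical helper for illustration; not part of the fme package.
    """
    base = Path(root)
    required = [base / "ckpt.tar", base / "initial_condition" / "data.nc"]
    missing = [str(p) for p in required if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"missing required files: {missing}")
    # Assumption: forcing files are consumed in lexicographic name order.
    forcing = sorted(p.name for p in (base / "forcing").glob(file_pattern))
    if not forcing:
        raise FileNotFoundError(f"no files matching {file_pattern!r} in {base / 'forcing'}")
    return forcing
```

Zero-padded timestamps such as 2021010100.nc sort correctly under this assumption, whereas names like data2.nc and data10.nc would not, since "data10.nc" sorts before "data2.nc".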

The specified initial condition file should contain a time dimension of length at least 1, and may contain multiple times. If multiple times are present and start_indices is not specified in the configuration, inference runs an ensemble using every time in the initial condition file. To use only a subset of those times, set the start_indices parameter in the configuration.
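
The selection rule described above can be sketched in plain Python. The function name and the representation of start_indices as a plain list of integers are assumptions made for illustration. The actual parameter accepts InferenceInitialConditionIndices, ExplicitIndices, or TimestampList objects, whose fields are not shown here.

```python
from typing import Optional, Sequence


def select_initial_conditions(
    n_times: int, start_indices: Optional[Sequence[int]] = None
) -> list[int]:
    """Return the time indices that seed ensemble members.

    Hypothetical illustration; not the fme implementation.
    """
    if n_times < 1:
        raise ValueError("initial condition file needs a time dimension of length >= 1")
    if start_indices is None:
        # No selection given: every time in the file becomes an ensemble member.
        return list(range(n_times))
    for i in start_indices:
        if not 0 <= i < n_times:
            raise IndexError(f"start index {i} out of range for {n_times} times")
    return list(start_indices)
```

With a file holding three times and no selection, this yields a three-member ensemble; passing a list of indices restricts the ensemble to those times.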

While netCDF files are specified in the example, Zarr datasets are also compatible. For example, specify data.zarr as the path and set engine to zarr in the dataset configuration.

Example YAML Configuration

experiment_dir: inference_output
n_forward_steps: 400  # 100 days
forward_steps_in_memory: 50
checkpoint_path: ckpt.tar
logging:
  log_to_screen: true
  log_to_wandb: false
  log_to_file: true
  project: ace
  entity: your_wandb_entity
initial_condition:
  path: initial_condition/data.nc
forcing_loader:
  dataset:
    data_path: forcing
  num_data_workers: 8
data_writer:
  save_prediction_files: false
  save_monthly_files: false

We use the Builder pattern to load this configuration into a multi-level dataclass structure. The configuration is divided into several sub-configurations, each with its own dataclass. The top-level configuration is the fme.ace.InferenceConfig class.
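
To illustrate what loading into a multi-level dataclass structure means, here is a minimal standard-library sketch. The classes Logging and Inference and the helper from_dict are simplified, hypothetical stand-ins with only a few fields. They are not the fme classes, and fme's actual loading code may work differently.

```python
import dataclasses
from typing import Any


@dataclasses.dataclass
class Logging:  # hypothetical stand-in for a sub-configuration
    log_to_screen: bool = True
    log_to_wandb: bool = True


@dataclasses.dataclass
class Inference:  # hypothetical stand-in for the top-level configuration
    experiment_dir: str
    n_forward_steps: int
    logging: Logging
    forward_steps_in_memory: int = 10


def from_dict(cls: type, data: dict[str, Any]) -> Any:
    """Recursively build a dataclass from a nested dict, e.g. one parsed from YAML."""
    kwargs = {}
    for field in dataclasses.fields(cls):
        if field.name not in data:
            continue  # fall back to the dataclass default, if any
        value = data[field.name]
        # Nested mappings become instances of the nested dataclass type.
        if dataclasses.is_dataclass(field.type) and isinstance(value, dict):
            value = from_dict(field.type, value)
        kwargs[field.name] = value
    return cls(**kwargs)


config = from_dict(
    Inference,
    {
        "experiment_dir": "inference_output",
        "n_forward_steps": 400,
        "logging": {"log_to_wandb": False},
    },
)
```

Each nested mapping in the YAML corresponds to one sub-configuration dataclass, and keys that are omitted take the defaults shown in the class signatures.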

class fme.ace.InferenceConfig(experiment_dir: str, n_forward_steps: int, checkpoint_path: str, logging: ~fme.core.logging_utils.LoggingConfig, initial_condition: ~fme.ace.inference.inference.InitialConditionConfig, forcing_loader: ~fme.core.data_loading.inference.ForcingDataLoaderConfig, forward_steps_in_memory: int = 10, data_writer: ~fme.ace.inference.data_writer.main.DataWriterConfig = <factory>, aggregator: ~fme.core.aggregator.inference.main.InferenceAggregatorConfig = <factory>, ocean: ~fme.core.ocean.OceanConfig | None = None)[source]

Bases: object

Configuration for running inference.

experiment_dir

Directory to save results to.

Type:: str

n_forward_steps

Number of steps to run the model forward for.

Type:: int

checkpoint_path

Path to stepper checkpoint to load.

Type:: str

logging

Configuration for logging.

Type:: fme.core.logging_utils.LoggingConfig

initial_condition

Configuration for initial condition data.

Type:: fme.ace.inference.inference.InitialConditionConfig

forcing_loader

Configuration for forcing data.

Type:: fme.core.data_loading.inference.ForcingDataLoaderConfig

forward_steps_in_memory

Number of forward steps to complete in memory at a time.

Type:: int

data_writer

Configuration for data writers.

Type:: fme.ace.inference.data_writer.main.DataWriterConfig

aggregator

Configuration for inference aggregator.

Type:: fme.core.aggregator.inference.main.InferenceAggregatorConfig

ocean

Ocean configuration to use during inference when it should differ from the one used in training.

Type:: fme.core.ocean.OceanConfig | None
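
As a worked example of how n_forward_steps and forward_steps_in_memory relate, the arithmetic below uses the values from the example YAML. It assumes that the run is processed in consecutive segments of at most forward_steps_in_memory steps, and that the model time step is 6 hours, which is what the "400 steps = 100 days" comment in the example implies.

```python
import math
from datetime import timedelta

n_forward_steps = 400
forward_steps_in_memory = 50
timestep = timedelta(hours=6)  # assumption: implied by "400 steps = 100 days"

# Number of in-memory segments needed to cover the full run.
n_segments = math.ceil(n_forward_steps / forward_steps_in_memory)
# Total simulated time covered by the run.
simulated = n_forward_steps * timestep

print(n_segments, simulated.days)  # prints "8 100"
```

A smaller forward_steps_in_memory lowers peak memory use at the cost of more segments.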

The sub-configurations are:

class fme.ace.LoggingConfig(project: str = 'ace', entity: str = 'ai2cm', log_to_screen: bool = True, log_to_file: bool = True, log_to_wandb: bool = True, log_format: str = '%(asctime)s - %(name)s - %(levelname)s - %(message)s', level: str | int = 20)[source]

Bases: object

Configuration for logging.

project

Name of the project in Weights & Biases.

Type:: str

entity

Name of the entity in Weights & Biases.

Type:: str

log_to_screen

Whether to log to the screen.

Type:: bool

log_to_file

Whether to log to a file.

Type:: bool

log_to_wandb

Whether to log to Weights & Biases.

Type:: bool

log_format

Format of the log messages.

Type:: str
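
The defaults for log_format and level map directly onto Python's standard logging module, where level 20 is logging.INFO. The sketch below shows how such settings could be applied. The function make_logger and the logger name "example" are hypothetical, and this is not fme's own setup code.

```python
import io
import logging

LOG_FORMAT = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"


def make_logger(name: str, stream, level: int = 20) -> logging.Logger:
    """Create a logger that writes formatted records to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = make_logger("example", buffer)
log.debug("hidden: DEBUG (10) is below level 20")
log.info("shown: INFO equals level 20")
```

Only the second message reaches the buffer, because DEBUG records fall below the configured threshold.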

class fme.ace.InitialConditionConfig(path: str, engine: Literal['netcdf4', 'h5netcdf', 'zarr'] = 'netcdf4', start_indices: InferenceInitialConditionIndices | ExplicitIndices | TimestampList | None = None)[source]

Bases: object

Configuration for initial conditions.

Note

The data specified under path should contain a time dimension of at least length 1. If multiple times are present in the dataset specified by path, the inference will start an ensemble simulation using each initial condition along a leading sample dimension. Specific times can be selected from the dataset by using start_indices.

path

The path to the initial conditions dataset.

Type:: str

engine

The engine used to open the dataset.

Type:: Literal['netcdf4', 'h5netcdf', 'zarr']

start_indices

Optional specification of the subset of initial conditions to use.

Type:: fme.core.data_loading.inference.InferenceInitialConditionIndices | fme.core.data_loading.inference.ExplicitIndices | fme.core.data_loading.inference.TimestampList | None

class fme.ace.ForcingDataLoaderConfig(dataset: XarrayDataConfig, num_data_workers: int = 0)[source]

Bases: object

Configuration for the forcing data.

dataset

Configuration to define the dataset.

Type:: fme.core.data_loading.config.XarrayDataConfig

num_data_workers

Number of parallel workers to use for data loading.

Type:: int

class fme.ace.XarrayDataConfig(data_path: str, file_pattern: str = '*.nc', n_repeats: int = 1, engine: ~typing.Literal['netcdf4', 'h5netcdf', 'zarr'] | None = None, spatial_dimensions: ~typing.Literal['healpix', 'latlon'] = 'latlon', subset: ~fme.core.data_loading.config.Slice | ~fme.core.data_loading.config.TimeSlice = <factory>, infer_timestep: bool = True)[source]

Bases: object

data_path

Path to the data.

Type:: str

file_pattern

Glob pattern to match files in the data_path.

Type:: str

n_repeats

Number of times to repeat the dataset (in time). It is up to the user to ensure that the input dataset to repeat results in data that is reasonably continuous across repetitions.

Type:: int

engine

Backend for xarray.open_dataset. The type signature accepts "netcdf4", "h5netcdf", and "zarr"; "netcdf4" is the default. Only valid when using XarrayDataset.

Type:: Literal['netcdf4', 'h5netcdf', 'zarr'] | None

spatial_dimensions

Specifies the spatial dimensions for the grid, default is lat/lon.

Type:: Literal['healpix', 'latlon']

subset

Slice defining a subset of the XarrayDataset to load. This can either be a Slice of integer indices or a TimeSlice of timestamps.

Type:: fme.core.data_loading.config.Slice | fme.core.data_loading.config.TimeSlice

infer_timestep

Whether to infer the timestep from the provided data. This should be set to True (the default) for ACE training. It may be useful to toggle this to False for applications like downscaling, which do not depend on the timestep of the data and therefore lack the additional requirement that the data be ordered and evenly spaced in time. It must be set to True if n_repeats > 1 in order to be able to infer the full time coordinate.

Type:: bool

class fme.ace.DataWriterConfig(log_extended_video_netcdfs: bool = False, save_prediction_files: bool = True, save_monthly_files: bool = True, names: Sequence[str] | None = None, save_histogram_files: bool = False, time_coarsen: TimeCoarsenConfig | None = None)[source]

Bases: object

Configuration for inference data writers.

log_extended_video_netcdfs

Whether to enable writing of netCDF files containing video metrics.

Type:: bool

save_prediction_files

Whether to enable writing of netCDF files containing the predictions and target values.

Type:: bool

save_monthly_files

Whether to enable writing of netCDF files containing the monthly predictions and target values.

Type:: bool

names

Names of variables to save in the prediction, histogram, and monthly netCDF files.

Type:: Sequence[str] | None

save_histogram_files

Whether to enable writing of netCDF files containing histograms.

Type:: bool

time_coarsen

Configuration for time coarsening of written outputs.

Type:: fme.ace.inference.data_writer.time_coarsen.TimeCoarsenConfig | None

class fme.ace.InferenceAggregatorConfig(time_mean_reference_data: str | None = None)[source]

Bases: object

Configuration for inference aggregator.

time_mean_reference_data

Path to reference time means to compare against.

Type:: str | None

class fme.ace.OceanConfig(surface_temperature_name: str, ocean_fraction_name: str, interpolate: bool = False, slab: SlabOceanConfig | None = None)[source]

Bases: object

Configuration for determining sea surface temperature from an ocean model.

surface_temperature_name

Name of the sea surface temperature field.

Type:: str

ocean_fraction_name

Name of the ocean fraction field.

Type:: str

interpolate

If True, interpolate between ML-predicted surface temperature and ocean-predicted surface temperature according to ocean_fraction. If False, only use ocean-predicted surface temperature where ocean_fraction >= 0.5.

Type:: bool

slab

If provided, use a slab ocean model to predict surface temperature.

Type:: fme.core.ocean.SlabOceanConfig | None
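
The two behaviors of interpolate can be sketched per grid cell in plain Python. The function name is hypothetical, and the linear weighting used for interpolate=True is an assumption about what "according to ocean_fraction" means. It is not taken from the fme source.

```python
def blend_surface_temperature(
    ml_temp: list[float],
    ocean_temp: list[float],
    ocean_fraction: list[float],
    interpolate: bool = False,
) -> list[float]:
    """Combine ML-predicted and ocean-predicted surface temperature per grid cell.

    Hypothetical illustration; the linear weighting is an assumption.
    """
    result = []
    for ml, ocean, frac in zip(ml_temp, ocean_temp, ocean_fraction):
        if interpolate:
            # Assumed linear blend weighted by the ocean fraction.
            result.append(frac * ocean + (1.0 - frac) * ml)
        else:
            # Hard switch: ocean value only where the cell is at least half ocean.
            result.append(ocean if frac >= 0.5 else ml)
    return result
```

For a cell with ocean_fraction 0.25, the hard switch keeps the ML-predicted value, while the assumed blend moves it a quarter of the way toward the ocean-predicted value.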