API Reference

fme

class fme.Packer(names)[source]

Responsible for packing tensors into a single tensor.

Parameters:

names (list[str]) –

pack(tensors, axis=0)[source]

Packs tensors into a single tensor, concatenated along a new axis.

Parameters:
  • tensors (dict[str, Tensor]) – Dict from names to tensors.

  • axis (default: 0) – index for new concatenation axis.

Raises:

DataShapesNotUniform – when packed tensors do not all have the same shape.

Return type:

Tensor
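A minimal sketch of the packing idea, using plain Python lists in place of tensors. The helper name pack_lists is hypothetical, only axis=0 is illustrated, and ValueError stands in for the library's DataShapesNotUniform:

```python
def pack_lists(tensors, names):
    """Stack same-length 1-D lists along a new leading axis, ordered by `names`."""
    lengths = {len(tensors[name]) for name in names}
    if len(lengths) > 1:
        # The real Packer raises DataShapesNotUniform; ValueError is a stand-in.
        raise ValueError("packed tensors must all have the same shape")
    return [list(tensors[name]) for name in names]


packed = pack_lists({"a": [1, 2], "b": [3, 4]}, names=["b", "a"])
```

The order of the packed rows follows names, not the insertion order of the dict.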

class fme.StandardNormalizer(means, stds, fill_nans_on_normalize=False, fill_nans_on_denormalize=False)[source]

Responsible for normalizing tensors.

Parameters:
  • means (dict[str, torch.Tensor]) –

  • stds (dict[str, torch.Tensor]) –

  • fill_nans_on_normalize (bool) –

  • fill_nans_on_denormalize (bool) –

classmethod from_state(state)[source]

Loads state from a serializable data structure.

Return type:

StandardNormalizer

get_state()[source]

Returns state as a serializable data structure.

fme.get_device()[source]

If CUDA is available, return a CUDA device. Otherwise, return a CPU device unless FME_USE_MPS is set, in which case return an MPS device if available.

Return type:

device
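The selection order described above can be sketched without importing PyTorch. The function choose_device_name and its boolean arguments are hypothetical stand-ins for the availability checks, and treating any non-empty value of FME_USE_MPS as "set" is an assumption:

```python
def choose_device_name(cuda_available, mps_available, env):
    """Return "cuda", "mps" or "cpu" following the order described above."""
    if cuda_available:
        return "cuda"
    # MPS is only considered when FME_USE_MPS is set (assumption: any
    # non-empty value counts as "set").
    if env.get("FME_USE_MPS") and mps_available:
        return "mps"
    return "cpu"
```

In this sketch CUDA always takes precedence, so setting FME_USE_MPS has no effect on a machine where CUDA is available.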

fme.gradient_magnitude(tensor, dim=())[source]

Compute the magnitude of gradient across the specified dimensions.

Return type:

Tensor

fme.gradient_magnitude_percent_diff(truth, predicted, weights=None, dim=())[source]

Compute the percent difference of the weighted mean gradient magnitude across the specified dimensions.

Return type:

Tensor

Parameters:
  • truth (Tensor) –

  • predicted (Tensor) –

  • weights (Tensor | None) –

  • dim (int | Iterable[int]) –

fme.rmse_of_time_mean(truth, predicted, weights=None, time_dim=0, spatial_dims=(-2, -1))[source]

Compute the RMSE of the time-average given truth and predicted.

Parameters:
  • truth (Tensor) – truth tensor

  • predicted (Tensor) – predicted tensor

  • weights (Optional[Tensor], default: None) – weights to use for computing spatial RMSE

  • time_dim (int | Iterable[int], default: 0) – time dimension

  • spatial_dims (int | Iterable[int], default: (-2, -1)) – spatial dimensions over which RMSE is calculated

Return type:

Tensor

Returns:

The RMSE between the time-mean of the two input tensors. The time and spatial dims are reduced.
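The following sketch illustrates the order of operations (time mean first, then RMSE over space) with nested lists indexed as [time][space]. The name rmse_of_time_mean_2d is hypothetical and weights are omitted for brevity:

```python
import math


def rmse_of_time_mean_2d(truth, predicted):
    """RMSE over space of the time means; inputs are lists indexed [time][space]."""
    n_time = len(truth)
    n_space = len(truth[0])
    # 1. Average each series over the time dimension.
    truth_mean = [sum(row[i] for row in truth) / n_time for i in range(n_space)]
    pred_mean = [sum(row[i] for row in predicted) / n_time for i in range(n_space)]
    # 2. Unweighted RMSE between the two time means over the spatial dimension.
    squared = [(p - t) ** 2 for t, p in zip(truth_mean, pred_mean)]
    return math.sqrt(sum(squared) / n_space)


# Errors that cancel in time do not contribute to this metric.
cancelling = rmse_of_time_mean_2d([[0.0, 0.0], [2.0, 2.0]], [[2.0, 2.0], [0.0, 0.0]])
```

Because the time mean is taken before the error is computed, this metric measures bias in the climatology rather than step-by-step error.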

fme.root_mean_squared_error(truth, predicted, weights=None, dim=())[source]

Compute a weighted root mean square error between truth and predicted.

Namely:

sqrt((weights * ((predicted - truth) ** 2)).mean(dim))

Parameters:
  • truth (Tensor) – torch.Tensor whose last dimensions are to be weighted

  • predicted (Tensor) – torch.Tensor whose last dimensions are to be weighted

  • weights (Optional[Tensor], default: None) – torch.Tensor to apply to the squared bias.

  • dim (int | Iterable[int], default: ()) – Dimensions to average over.

Return type:

Tensor

Returns:

A tensor of weighted RMSEs.
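As a worked example of the formula above, here is a sketch over flat Python sequences. The name rmse_1d is hypothetical; it follows the stated formula literally, so the weights are not normalized by their sum:

```python
import math


def rmse_1d(truth, predicted, weights=None):
    """sqrt(mean(weights * (predicted - truth) ** 2)) over flat sequences."""
    if weights is None:
        weights = [1.0] * len(truth)
    squared = [w * (p - t) ** 2 for t, p, w in zip(truth, predicted, weights)]
    return math.sqrt(sum(squared) / len(squared))


unweighted = rmse_1d([0.0, 0.0], [3.0, 4.0])            # sqrt((9 + 16) / 2)
weighted = rmse_1d([0.0, 0.0], [3.0, 4.0], [2.0, 0.0])  # sqrt((2 * 9 + 0) / 2)
```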

fme.spherical_area_weights(lats, num_lon)[source]

Computes area weights given the latitudes of a regular lat-lon grid.

Parameters:
  • lats (ndarray | Tensor) – tensor of shape (…, num_lat,) with the latitudes of the cell centers.

  • num_lon (int) – Number of longitude points.

Return type:

Tensor

Returns:

a torch.Tensor of shape (num_lat, num_lon).

fme.time_and_global_mean_bias(truth, predicted, weights=None, time_dim=0, spatial_dims=(-2, -1))[source]

Compute the global- and time-mean bias given truth and predicted.

Parameters:
  • truth (Tensor) – truth tensor

  • predicted (Tensor) – predicted tensor

  • weights (Optional[Tensor], default: None) – weights to use for computing the global mean

  • time_dim (int | Iterable[int], default: 0) – time dimension

  • spatial_dims (int | Iterable[int], default: (-2, -1)) – spatial dimensions over which global mean is calculated

Return type:

Tensor

Returns:

The global- and time-mean bias between the predicted and truth tensors. The time and spatial dims are reduced.

fme.weighted_mean(tensor, weights=None, dim=(), keepdim=False)[source]

Computes the weighted mean across the specified list of dimensions.

Parameters:
  • tensor (Tensor) – torch.Tensor

  • weights (Optional[Tensor], default: None) – Weights to apply to the mean.

  • dim (int | Iterable[int], default: ()) – Dimensions to compute the mean over.

  • keepdim (bool, default: False) – Whether the output tensor has dim retained or not.

Return type:

Tensor

Returns:

a tensor of the weighted mean averaged over the specified dimensions dim.
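A minimal sketch of a weighted mean over a single dimension. The name weighted_mean_1d is hypothetical, and normalizing by the sum of the weights is the conventional definition, assumed here rather than taken from the library source:

```python
def weighted_mean_1d(values, weights=None):
    """Weighted mean of a flat sequence; plain mean when weights is None."""
    if weights is None:
        return sum(values) / len(values)
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


plain = weighted_mean_1d([1.0, 3.0])
weighted = weighted_mean_1d([1.0, 3.0], [3.0, 1.0])  # (3 * 1 + 1 * 3) / 4
```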

fme.weighted_mean_bias(truth, predicted, weights=None, dim=())[source]

Computes the mean bias across the specified list of dimensions assuming that the weights are applied to the last dimensions, e.g. the spatial dimensions.

Parameters:
  • truth (Tensor) – torch.Tensor

  • predicted (Tensor) – torch.Tensor

  • dim (int | Iterable[int], default: ()) – Dimensions to compute the mean over.

  • weights (Optional[Tensor], default: None) – Weights to apply to the mean.

Return type:

Tensor

Returns:

a tensor of the mean biases averaged over the specified dimensions dim.

fme.weighted_mean_gradient_magnitude(tensor, weights=None, dim=())[source]

Compute weighted mean of gradient magnitude across the specified dimensions.

Return type:

Tensor

Parameters:
  • tensor (Tensor) –

  • weights (Tensor | None) –

  • dim (int | Iterable[int]) –

fme.weighted_std(tensor, weights=None, dim=())[source]

Computes the weighted standard deviation across the specified list of dimensions.

Computed by first computing the weighted variance, then taking the square root.

weighted_std = weighted_mean((tensor - weighted_mean(tensor)) ** 2) ** 0.5

Parameters:
  • tensor (Tensor) – torch.Tensor

  • weights (Optional[Tensor], default: None) – Weights to apply to the variance.

  • dim (int | Iterable[int], default: ()) – Dimensions to compute the standard deviation over.

Return type:

Tensor

Returns:

a tensor of the weighted standard deviation over the specified dimensions dim.
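The two steps (weighted variance, then square root) can be sketched for a flat sequence as follows. The name weighted_std_1d is hypothetical, and the weights are assumed to be normalized by their sum:

```python
import math


def weighted_std_1d(values, weights=None):
    """Square root of the weighted mean of squared deviations from the weighted mean."""
    if weights is None:
        weights = [1.0] * len(values)
    total = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / total
    variance = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return math.sqrt(variance)


unweighted = weighted_std_1d([1.0, 3.0])
weighted = weighted_std_1d([1.0, 3.0], [3.0, 1.0])
```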

fme.ace

class fme.ace.AdditionalInferenceConfig(name, config)[source]

class fme.ace.AtmosphereCorrectorConfig(conserve_dry_air=False, zero_global_mean_moisture_advection=False, moisture_budget_correction=None, force_positive_names=<factory>, total_energy_budget_correction=None)[source]

Configuration for the post-step state corrector.

conserve_dry_air enforces the constraint that:

\[global\_dry\_air = global\_mean\left(ps - \sum_k \left((ak\_diff + bk\_diff \cdot ps) \cdot wat_k\right)\right)\]

in the generated data is equal to its value in the input data. This is done by adding a globally-constant correction to the surface pressure in each column. As per-mass values such as mixing ratios of water are unchanged, this can cause changes in total water or energy. Note all global means here are area-weighted.

zero_global_mean_moisture_advection enforces the constraint that:

\[global\_mean(tendency\_of\_total\_water\_path\_due\_to\_advection) = 0\]

in the generated data. This is done by adding a globally-constant correction to the moisture advection tendency in each column.

moisture_budget_correction enforces closure of the moisture budget equation:

\[\begin{split}tendency\_of\_total\_water\_path = (evaporation\_rate - precipitation\_rate \\ + tendency\_of\_total\_water\_path\_due\_to\_advection)\end{split}\]

in the generated data, where tendency_of_total_water_path is the difference between the total water path at the current timestep and the previous timestep divided by the time difference. This is done by modifying the precipitation, evaporation, and/or moisture advection tendency fields as described in the moisture_budget_correction attribute. When advection tendency is modified, this budget equation is enforced in each column, while when only precipitation or evaporation are modified, only the global mean of the budget equation is enforced.

When enforcing moisture budget closure, we assume the global mean moisture advection is zero. Therefore zero_global_mean_moisture_advection must be True if using a moisture_budget_correction option other than None.

Parameters:
  • conserve_dry_air (bool, default: False) – If True, force the generated data to conserve dry air by subtracting a constant offset from the surface pressure of each column. Because per-mass values such as water mixing ratios are left unchanged, this can change total water or energy.

  • zero_global_mean_moisture_advection (bool, default: False) – If True, force the generated data to have zero global mean moisture advection by subtracting a constant offset from the moisture advection tendency of each column.

  • moisture_budget_correction (Optional[Literal['precipitation', 'evaporation', 'advection_and_precipitation', 'advection_and_evaporation']], default: None) –

    If not “None”, force the generated data to conserve global or column-local moisture by modifying budget fields. Options are:

    • precipitation: multiply precipitation by a scale factor to close the global moisture budget.

    • evaporation: multiply evaporation by a scale factor to close the global moisture budget.

    • advection_and_precipitation: after applying the “precipitation” global-mean correction above, recompute the column-integrated advective tendency as the budget residual, ensuring column budget closure.

    • advection_and_evaporation: after applying the “evaporation” global-mean correction above, recompute the column-integrated advective tendency as the budget residual, ensuring column budget closure.

  • force_positive_names (list[str], default: <factory>) – Names of fields that should be forced to be greater than or equal to zero. This is useful for fields like precipitation.

  • total_energy_budget_correction (Optional[EnergyBudgetConfig], default: None) – If not None, force the generated data to conserve an idealized version of total energy using the provided configuration.
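The zero_global_mean_moisture_advection correction described above amounts to subtracting the area-weighted global mean from every column. A minimal sketch, with a hypothetical helper name and made-up values for the field and the area weights:

```python
def zero_global_mean(field, area_weights):
    """Subtract the area-weighted global mean from every column."""
    total = sum(area_weights)
    global_mean = sum(w * f for f, w in zip(field, area_weights)) / total
    return [f - global_mean for f in field]


advection = [1.0, -0.5, 2.5]  # one value per column (made-up numbers)
area = [1.0, 2.0, 1.0]        # made-up area weights
corrected = zero_global_mean(advection, area)
residual = sum(w * c for c, w in zip(corrected, area)) / sum(area)
```

After the correction the area-weighted global mean (residual) is zero, while differences between columns are unchanged.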

class fme.ace.AugmentationConfig(rotate_probability=0.0, additional_directional_names=<factory>)[source]

Configuration for data augmentation.

Parameters:
  • rotate_probability (float) –

  • additional_directional_names (list[str]) –

rotate_probability

The probability of rotating the sphere by 180 degrees, as a value between 0.0 and 1.0.

additional_directional_names

Names of variables whose sign is flipped when the poles are reversed. By default this includes known directional names as stored in RotateModifier.FLIP_NAMES.

class fme.ace.CappedGELUConfig(cap_value=10, enable_nhwc=False, enable_healpixpad=False)[source]

Configuration for the CappedGELU activation function.

Parameters:
  • cap_value (int, default: 10) – Cap value for the GELU function, default is 10.

  • enable_nhwc (bool, default: False) – Flag to enable NHWC data format, default is False.

  • enable_healpixpad (bool, default: False) – Flag to enable HEALPix padding, default is False.

build()[source]

Builds the CappedGELU activation function.

Return type:

Module

Returns:

CappedGELU activation function.

class fme.ace.CheckpointConfig(after_n_forward_steps=inf, kwargs=<factory>)[source]

Configuration for activation checkpointing.

Trades increased computation for lower memory consumption during training by recomputing activations in the backward pass.

Parameters:
  • after_n_forward_steps (float, default: inf) – Number of forward steps to generate before activation checkpointing is applied. Activation checkpointing is not used unless this number is less than the number of forward steps in the optimization.

  • kwargs (Mapping[str, Any], default: <factory>) – Keyword arguments to pass to torch.utils.checkpoint.checkpoint. Note that use_reentrant=False is always passed explicitly, as recommended by the PyTorch documentation.

build(step)[source]

Builds a checkpoint function.

Parameters:

step (int) – The current zero-indexed step number.

Return type:

Checkpoint | NoCheckpoint

Returns:

A checkpoint function.
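One reading of after_n_forward_steps is that checkpointing applies to every zero-indexed step at or beyond that number. The sketch below encodes that assumption; the function name is hypothetical and this is not the library's build logic:

```python
import math


def uses_checkpointing(step, after_n_forward_steps=math.inf):
    """True when the zero-indexed forward step would be checkpointed (assumed rule)."""
    return step >= after_n_forward_steps


# With 4 forward steps and after_n_forward_steps=2, only steps 2 and 3 qualify.
schedule = [uses_checkpointing(step, 2) for step in range(4)]
```

With the default of inf no step qualifies, which matches the statement that checkpointing is unused unless the value is less than the number of forward steps.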

class fme.ace.CheckpointStepperConfig(checkpoint_path)[source]

Defines a stepper by loading its configuration from a saved checkpoint.

Does not affect weight initialization, which is handled in a different configuration (likely parameter initialization under stepper_training).

Parameters:

checkpoint_path (str) – Path to a serialized checkpoint containing a stepper.

class fme.ace.ConcatDatasetConfig(concat, strict=True)[source]

Configuration for concatenating multiple datasets across time.

Parameters:
  • concat (Sequence[XarrayDataConfig]) – List of XarrayDataConfig objects to concatenate.

  • strict (bool, default: True) – Whether to enforce that the datasets to be concatenated have the same dimensions and spatial coordinates.

property available_labels: set[str] | None

Return the labels that are available in the dataset.

class fme.ace.ConstantConfig(amplitude=1.0)[source]

Configuration for a constant perturbation.

Parameters:

amplitude (float) –

class fme.ace.ConvBlockConfig(in_channels=3, out_channels=1, kernel_size=3, dilation=1, n_layers=1, stride=2, upscale_factor=4, latent_channels=None, upsampling=None, activation=None, enable_nhwc=False, enable_healpixpad=False, block_type='BasicConvBlock')[source]

Configuration for the convolutional block.

Parameters:
  • in_channels (int, default: 3) – Number of input channels, default is 3.

  • out_channels (int, default: 1) – Number of output channels, default is 1.

  • kernel_size (int, default: 3) – Size of the kernel, default is 3.

  • dilation (int, default: 1) – Dilation rate, default is 1.

  • n_layers (int, default: 1) – Number of layers, default is 1.

  • upsampling (Optional[UpsamplingBlockConfig], default: None) – Upsampling factor for TransposedConvUpsample, default is 2.

  • upscale_factor (int, default: 4) – Upscale factor for ConvNeXtBlock and SymmetricConvNeXtBlock, default is 4.

  • latent_channels (Optional[int], default: None) – Number of latent channels, default is None.

  • activation (Optional[CappedGELUConfig], default: None) – Activation configuration, default is None.

  • enable_nhwc (bool, default: False) – Flag to enable NHWC data format, default is False.

  • enable_healpixpad (bool, default: False) – Flag to enable HEALPix padding, default is False.

  • block_type (Literal['BasicConvBlock', 'ConvNeXtBlock', 'SymmetricConvNeXtBlock', 'ConvThenUpsample', 'TransposedConvUpsample'], default: 'BasicConvBlock') – Type of block, default is “BasicConvBlock”.

  • stride (int) –

build()[source]

Builds the convolutional block model.

Return type:

Module

Returns:

Convolutional block model.

class fme.ace.CopyWeightsConfig(include=<factory>, exclude=None)[source]

Configuration for copying weights from a base model to a target model.

Used during training to overwrite weights after every batch of data, to have the effect of “freezing” the overwritten weights. When the target parameters have longer dimensions than the base model, only the initial slice is overwritten.

This makes it possible to freeze only the subset of each weight that comes from a smaller base weight. It is less efficient than true parameter freezing, but true freezing is all-or-nothing for each parameter.

Parameters:
  • include (list[str], default: <factory>) – list of wildcard patterns to overwrite, if given then only these parameters are overwritten

  • exclude (Optional[list[str]], default: None) – list of wildcard patterns to exclude from overwriting, if given then all parameters except these are overwritten. Cannot be given together with include.

apply(weights, modules)[source]

Apply base weights to modules according to the include/exclude lists of this instance.

In order to “freeze” the weights during training, this must be called after each time the weights are updated in the training loop.

Parameters:
  • weights (list[Mapping[str, Any]]) – list of base weights to apply

  • modules (list[Module]) – list of modules to apply the weights to
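The "initial slice" behaviour and the need to re-apply after every update can be illustrated with plain lists. This is a sketch under the assumption of 1-D weights; overwrite_initial_slice is a hypothetical helper, not part of the library:

```python
def overwrite_initial_slice(target, base):
    """Copy `base` into the leading entries of `target` (1-D lists here)."""
    if len(base) > len(target):
        raise ValueError("base must not be longer than target")
    target[: len(base)] = base
    return target


target = [0.0, 0.0, 0.0, 0.0]
base = [1.0, 2.0]
for _ in range(3):
    target = [t + 0.1 for t in target]  # stand-in for an optimizer update
    overwrite_initial_slice(target, base)  # re-apply to keep the slice "frozen"
```

The leading entries stay equal to the base weights after every update, while the remaining entries continue to train.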

class fme.ace.CorrectorSelector(type, config)[source]

A dataclass containing all the information needed to build a CorrectorConfigABC, including the type of the CorrectorConfigABC and the data needed to build it.

This is helpful as CorrectorSelector can be serialized and deserialized without any additional information, whereas to load a CorrectorConfigABC you would need to know the type of the CorrectorConfigABC being loaded.

It is also convenient because CorrectorSelector is a single class that can be used to represent any CorrectorConfigABC, whereas CorrectorConfigABC is an ABC that can be implemented by many different classes.

Parameters:
  • type (str) – the type of the CorrectorConfigABC

  • config (Mapping[str, Any]) – data for a CorrectorConfigABC instance of the indicated type

classmethod get_available_types()[source]

This class method is used to expose all available types of Correctors.

Return type:

set[str]

classmethod register(type_name)[source]

Register a virtual subclass of an ABC.

Returns the subclass, to allow usage as a class decorator.

class fme.ace.DataLoaderConfig(dataset, batch_size, num_data_workers=0, prefetch_factor=None, augmentation=<factory>, sample_with_replacement=None, time_buffer=0)[source]

Configuration for a data loader for training/validation.

Parameters:
  • dataset (ConcatDatasetConfig | MergeDatasetConfig | XarrayDataConfig) – Could be a single dataset configuration, or a sequence of datasets to be concatenated using the keyword concat, or datasets from different sources to be merged using the keyword merge.

  • batch_size (int) – Number of samples per batch.

  • num_data_workers (int, default: 0) – Number of parallel workers to use for data loading.

  • prefetch_factor (Optional[int], default: None) – how many batches a single data worker will attempt to hold in host memory at a given time.

  • augmentation (AugmentationConfig, default: <factory>) – Configuration for data augmentation.

  • sample_with_replacement (Optional[int], default: None) – If provided, the dataset will be sampled randomly with replacement to the given size each period, instead of retrieving each sample once (either shuffled or not).

  • time_buffer (int, default: 0) – How many more continuous timesteps to load in memory than the required number of timesteps for a single batch. Setting this to greater than 0 should improve data loading performance, however, it also decreases the independence of subsequent batches if shuffled batches are desired.

Note

Setting time_buffer to a value greater than 0 results in pre-loading samples of length time_buffer + n_timesteps_required, where n_timesteps_required is the number of timesteps required for training the model (initial condition(s) plus forward step(s)). These pre-loaded samples become a window from which samples of the required length are drawn without replacement. The windows will overlap by an amount such that no samples are skipped, with exception of the last window, which is dropped if incomplete. This is useful for improving data loading throughput and reducing the number of reads. There must be enough pre-loaded samples in the dataset to produce at least one batch at the configured batch size. Independent data will be seen every time_buffer + 1 batches, i.e., this is the number of samples in each pre-loaded window.
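The arithmetic in the note above can be checked with a small sketch. The function name and the (start, stop) index representation are hypothetical; only the window length and the count of samples per window come from the note:

```python
def window_samples(time_buffer, n_timesteps_required):
    """Return the window length and the (start, stop) index pairs inside one window."""
    window_length = time_buffer + n_timesteps_required
    starts = range(time_buffer + 1)  # time_buffer + 1 samples per window
    return window_length, [(s, s + n_timesteps_required) for s in starts]


length, samples = window_samples(time_buffer=2, n_timesteps_required=3)
```

With time_buffer=2 and three required timesteps, each pre-loaded window spans five timesteps and yields three overlapping samples.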

property available_labels: set[str] | None

Return the labels that are available in the dataset.

property zarr_engine_used: bool

Whether any of the configured datasets are using the Zarr engine.

class fme.ace.DataWriterConfig(save_prediction_files=True, save_monthly_files=True, names=None, time_coarsen=None, files=None)[source]

Configuration for inference data writers.

Parameters:
  • save_prediction_files (bool, default: True) – Whether to enable writing of netCDF files containing the predictions and target values.

  • save_monthly_files (bool, default: True) – Whether to enable writing of netCDF files containing the monthly predictions and target values.

  • names (Optional[Sequence[str]], default: None) – Names of variables to save in the prediction and monthly netCDF files.

  • time_coarsen (Optional[TimeCoarsenConfig], default: None) – Configuration for time coarsening of written outputs to the raw data writer.

  • files (Optional[list[FileWriterConfig]], default: None) – Configuration for a sequence of individual data writers. Each data writer must have a unique label to avoid filename collisions.

class fme.ace.DerivedForcingsConfig(insolation=None)[source]

Configuration for computing derived forcings.

Parameters:

insolation (Optional[InsolationConfig], default: None) – Optional configuration for computing derived insolation.

build(dataset_info)[source]

Build a ForcingDeriver instance with the current configuration.

Parameters:

dataset_info (DatasetInfo) – Dataset information associated with the Stepper.

Return type:

ForcingDeriver

update_requirements(requirements)[source]

Add or remove names from the requirements associated with derived forcings.

Parameters:

requirements (DataRequirements) – The requirements to update.

Return type:

DataRequirements

validate_replacement(replacement)[source]

Check that a replacement configuration is compatible with the current configuration.

Parameters:

replacement (DerivedForcingsConfig) – The configuration replacing the current configuration.

Raises:

ValueError – If the insolation_name of the replacement configuration is incompatible with that of the current configuration.

Return type:

None

class fme.ace.DownsamplingBlockConfig(block_type, pooling=2, enable_nhwc=False, enable_healpixpad=False)[source]

Configuration for the downsampling block. Generally, either a pooling block or a striding conv block.

Parameters:
  • block_type (Literal['MaxPool', 'AvgPool']) – Type of downsampling block, either “MaxPool” or “AvgPool”.

  • pooling (int, default: 2) – Pooling size

  • enable_nhwc (bool, default: False) – Flag to enable NHWC data format, default is False.

  • enable_healpixpad (bool, default: False) – Flag to enable HEALPix padding, default is False.

build()[source]

Builds the downsampling block model.

Return type:

Module

Returns:

Downsampling block.

class fme.ace.EMAConfig(decay=0.9999, resume_ema_ckpt_path=None)[source]

Configuration for exponential moving average of model weights.

Parameters:
  • decay (float, default: 0.9999) – decay rate for the moving average

  • resume_ema_ckpt_path (Optional[str], default: None) – Optional path to a training checkpoint (ckpt.tar) whose EMA running state (averaged weights and update counter) should be loaded into the freshly-built EMATracker for fine-tuning. The current config’s decay is kept; only the running state is transferred. Intended for non-resuming jobs; preemption resume in the Trainer overrides this state via EMATracker.from_state.

class fme.ace.ExplicitIndices(list)[source]

Configure indices providing them explicitly.

Parameters:

list (Sequence[int]) – List of integer indices.

class fme.ace.FileWriterConfig(label, names=None, lat_extent=None, lon_extent=None, time_selection=None, save_reference=True, time_coarsen=None, format=<factory>, separate_ensemble_members=False)[source]

Configuration for writing output data.

Parameters:
  • label (str) – A label used for the filename output for this output dataset.

  • names (Optional[list[str]], default: None) – The names of the variables to save. If not specified, all available variables will be saved.

  • lat_extent (Optional[Sequence[float]], default: None) – The latitude extent of the region as (min_lat, max_lat). If not set, all latitudes are included.

  • lon_extent (Optional[Sequence[float]], default: None) – The longitude extent of the region as (min_lon, max_lon). If not set, all longitudes are included.

  • time_selection (UnionType[Slice, MonthSelector, TimeSlice, None], default: None) – Optional time selection criteria. Can be a Slice, MonthSelector, or TimeSlice. If None, all times are selected. A Slice selects an index range of steps in an inference run, a MonthSelector targets specific seasons or months for outputs, and a TimeSlice selects a datetime range.

  • save_reference (bool, default: True) – Whether to save the reference/target data alongside predictions. If true, “_target” will be appended to the label for the target data, and “_predictions” will be appended to the label for the predictions data. Ignored if building a single writer via the build method.

  • time_coarsen (UnionType[TimeCoarsenConfig, MonthlyCoarsenConfig, None], default: None) – Configuration for time averaging of outputs.

  • format (NetCDFWriterConfig | ZarrWriterConfig, default: <factory>) – Configuration for the output format (i.e. netCDF or zarr).

  • separate_ensemble_members (bool, default: False) – Option to write ensemble members to separate files. In this case, time is a datetime coordinate. Only supported when using zarr format. Filenames will have the suffix _ic{member_index} appended before the file extension.

build(experiment_dir, initial_condition_times, n_timesteps, timestep, variable_metadata, coords, dataset_metadata)[source]

Build a FileWriter object for saving data within the specified region.

Parameters:
  • experiment_dir (str) – The directory where experiment outputs are saved.

  • initial_condition_times (ndarray[tuple[Any, ...], dtype[datetime]]) – 1D array of initial condition times (start time for each inference run).

  • n_timesteps (int) – Total number of inference forward steps.

  • timestep (timedelta) – The time delta between each timestep.

  • variable_metadata (Mapping[str, VariableMetadata]) – Metadata for each variable.

  • coords (Mapping[str, ndarray]) – Coordinate arrays for the dataset. These should be the coordinates of the entire global domain, not the subset region coordinates.

  • dataset_metadata (DatasetMetadata) – Metadata for the entire dataset.

Return type:

Union[FileWriter, TimeCoarsen]

class fme.ace.FillNaNsConfig(method='constant', value=0.0)[source]

Configuration for filling NaNs, for example with a constant value.

Parameters:
  • method (Literal['constant'], default: 'constant') – Type of fill operation. Currently only ‘constant’ is supported.

  • value (float, default: 0.0) – Value to fill NaNs with.

class fme.ace.FloeNetBuilder(latent_dimension=256, activation='SiLU', meshes=6, M0=4, bias=True, radius_fraction=1.0, layernorm=True, processor_steps=4, residual=True, is_ocean=True)[source]

Configuration for the M2Lines FloeNet architecture.

Parameters:
  • latent_dimension (int) –

  • activation (str) –

  • meshes (int) –

  • M0 (int) –

  • bias (bool) –

  • radius_fraction (float) –

  • layernorm (bool) –

  • processor_steps (int) –

  • residual (bool) –

  • is_ocean (bool) –

build(n_in_channels, n_out_channels, dataset_info)[source]

Build a nn.Module given information about the input and output channels and the dataset.

Parameters:
  • n_in_channels (int) – number of input channels

  • n_out_channels (int) – number of output channels

  • dataset_info (DatasetInfo) – Information about the dataset, including img_shape, horizontal coordinates, vertical coordinate, etc.

Returns:

a nn.Module

class fme.ace.ForcingDataLoaderConfig(dataset, num_data_workers=0, perturbations=None, persistence_names=None)[source]

Configuration for the forcing data.

Parameters:
  • dataset (XarrayDataConfig | MergeNoConcatDatasetConfig) – Configuration to define the dataset.

  • num_data_workers (int, default: 0) – Number of parallel workers to use for data loading.

  • perturbations (Optional[SSTPerturbation], default: None) – Configuration for SST perturbations used in forcing data.

  • persistence_names (Optional[Sequence[str]], default: None) – Names of variables for which all returned values will be the same as the initial condition. When evaluating initial condition predictability, set this to forcing variables that should not be updated during inference (e.g. surface temperature).

class fme.ace.FrozenParameterConfig(include=<factory>, exclude=None)[source]

Configuration for freezing parameters in a model.

Parameter names are the names used in the module’s state_dict. Here they can include wildcards, e.g. “encoder.*” will select all parameters in the encoder, while “encoder.*.bias” will select all bias parameters in the encoder.

An exception is raised when this configuration is applied (e.g. at the start of training) if both lists are non-empty.

By default no parameters are frozen.

Parameters:
  • include (list[str], default: <factory>) – list of parameter names to freeze (set requires_grad = False), if given then all other parameters are left unfrozen

  • exclude (Optional[list[str]], default: None) – list of parameter names to ignore, if given then all other parameters are frozen. Cannot be given if include is given.
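The include/exclude semantics can be sketched with Python's fnmatchcase as a stand-in for the library's wildcard matching (an assumption; the actual matcher may differ). The function select_frozen and the parameter names are hypothetical, and treating any non-None exclude as "given" is also an assumption:

```python
from fnmatch import fnmatchcase


def select_frozen(names, include=(), exclude=None):
    """Return the parameter names that would be frozen."""
    if include and exclude:
        raise ValueError("include and exclude cannot both be given")
    if exclude is not None:
        # Everything except the excluded patterns is frozen.
        return [n for n in names if not any(fnmatchcase(n, p) for p in exclude)]
    # Only names matching an include pattern are frozen (none by default).
    return [n for n in names if any(fnmatchcase(n, p) for p in include)]


names = ["encoder.layer1.weight", "encoder.layer1.bias", "decoder.out.weight"]
biases = select_frozen(names, include=["encoder.*.bias"])
all_but_decoder = select_frozen(names, exclude=["decoder.*"])
```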

class fme.ace.GreensFunctionConfig(amplitude=1.0, lat_center=0.0, lon_center=0.0, lat_width=10.0, lon_width=10.0)[source]

Configuration for a single sinusoidal patch of a Green’s function perturbation. See equation 1 in Bloch‐Johnson, J., et al. (2024).

Parameters:
  • amplitude (float, default: 1.0) – The amplitude of the perturbation, maximum is reached at (lat_center, lon_center).

  • lat_center (float, default: 0.0) – The latitude at the center of the patch in degrees.

  • lon_center (float, default: 0.0) – The longitude at the center of the patch in degrees.

  • lat_width (float, default: 10.0) – latitudinal width of the patch in degrees.

  • lon_width (float, default: 10.0) – longitudinal width of the patch in degrees.

class fme.ace.GriddedOperations[source]

classmethod from_state(state)[source]

Given a dictionary with a “type” key and a “state” key, return the GriddedOperations it describes.

The “type” key should be the name of a subclass of GriddedOperations, and the “state” key should be a dictionary specific to that subclass.

Parameters:

state (dict[str, Any]) – A dictionary with a “type” key and a “state” key.

Return type:

GriddedOperations

Returns:

An instance of the subclass.

abstract get_initialization_kwargs()[source]

Get the keyword arguments needed to initialize the instance.

Return type:

dict[str, Any]

class fme.ace.HEALPixRecUNetBuilder(encoder, decoder, presteps=1, input_time_size=0, output_time_size=0, delta_time='6h', reset_cycle='24h', n_constants=2, decoder_input_channels=1, prognostic_variables=7, enable_nhwc=False, enable_healpixpad=False)[source]

Configuration for the HEALPixRecUNet architecture used in DLWP.

Parameters:
  • presteps (int, default: 1) – Number of pre-steps, by default 1.

  • input_time_size (int, default: 0) – Input time dimension, by default 0.

  • output_time_size (int, default: 0) – Output time dimension, by default 0.

  • delta_time (str, default: '6h') – Delta time interval, by default “6h”.

  • reset_cycle (str, default: '24h') – Reset cycle interval, by default “24h”.

  • input_channels – Number of input channels, by default 8.

  • output_channels – Number of output channels, by default 8.

  • n_constants (int, default: 2) – Number of constant input channels, by default 2.

  • decoder_input_channels (int, default: 1) – Number of input channels for the decoder, by default 1.

  • enable_nhwc (bool, default: False) – Flag to enable NHWC data format, by default False.

  • enable_healpixpad (bool, default: False) – Flag to enable HEALPix padding, by default False.

  • encoder (UNetEncoderConfig) –

  • decoder (UNetDecoderConfig) –

  • prognostic_variables (int) –

build(n_in_channels, n_out_channels, dataset_info)[source]

Builds the HEALPixRecUNet model.

Parameters:
  • n_in_channels (int) – Number of input channels.

  • n_out_channels (int) – Number of output channels.

  • dataset_info (DatasetInfo) – Information about the dataset.

Return type:

Module

Returns:

HEALPixRecUNet model.

class fme.ace.IceCorrectorConfig(budget_correction=None)[source]

Parameters:

budget_correction (IceBudgetCorrectionConfig | None) –

class fme.ace.InferenceAggregatorConfig(time_mean_reference_data=None, log_global_mean_time_series=True)[source]

Configuration for inference aggregator.

Parameters:
  • time_mean_reference_data (Optional[str], default: None) – Path to reference time means to compare against.

  • log_global_mean_time_series (bool, default: True) – Whether to log global mean time series metrics.

class fme.ace.InferenceConfig(experiment_dir, n_forward_steps, checkpoint_path, logging, initial_condition, forcing_loader, forward_steps_in_memory=10, data_writer=<factory>, aggregator=<factory>, stepper_override=None, allow_incompatible_dataset=False, labels=None, n_ensemble_per_ic=1)[source]

Configuration for running inference.

Parameters:
  • experiment_dir (str) –

    Directory to save results to. This can be a local directory, like /results, or a remote directory prefixed with a protocol recognized by fsspec, like gs://bucket/results.

    Note

    While most types of output can be written to a remote experiment_dir, there are some limitations:

    • To write raw or time-coarsened data, the zarr writer must be used. See the files parameter of the fme.ace.DataWriterConfig for more details on how this can be configured. Note that monthly coarsened data cannot currently be written to zarr, and hence a remote directory, since it uses a different code path than uniformly coarsened data.

    • Piping logging output to a file in the experiment_dir is not supported. To silence the warning related to this, set log_to_file to False in the fme.ace.LoggingConfig.

    There are no restrictions on the types of output that can be written to a local experiment_dir.

  • n_forward_steps (int) – Number of steps to run the model forward for.

  • checkpoint_path (str) – Path to stepper checkpoint to load.

  • logging (LoggingConfig) – Configuration for logging.

  • initial_condition (InitialConditionConfig) – Configuration for initial condition data.

  • forcing_loader (ForcingDataLoaderConfig) – Configuration for forcing data.

  • forward_steps_in_memory (int, default: 10) – Number of forward steps to complete in memory at a time.

  • data_writer (DataWriterConfig, default: <factory>) – Configuration for data writers.

  • aggregator (InferenceAggregatorConfig, default: <factory>) – Configuration for inference aggregator.

  • stepper_override (Optional[StepperOverrideConfig], default: None) – Configuration for overriding select stepper configuration options at inference time (optional).

  • allow_incompatible_dataset (bool, default: False) – If True, allow the dataset used for inference to be incompatible with the dataset used for stepper training. This should be used with caution, as it may allow the stepper to make scientifically invalid predictions, but it can allow running inference with incorrectly formatted or missing grid information.

  • labels (Optional[list[str]], default: None) – Dataset labels to use for inference. If provided, these labels will be provided to the stepper for every initial condition.

  • n_ensemble_per_ic (int, default: 1) – Number of ensemble members per initial condition. Useful for stochastic model weather inference. n_ensemble_per_ic = 1 is default inference behavior.
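
As a rough illustration of how n_forward_steps and forward_steps_in_memory relate, the sketch below splits a rollout into consecutive in-memory windows. The helper forward_step_windows is hypothetical and not part of fme; the library's actual batching may differ, for example in how the initial condition step is handled.

```python
def forward_step_windows(n_forward_steps: int, forward_steps_in_memory: int) -> list[range]:
    """Split a rollout into consecutive windows of forward steps.

    Hypothetical helper for illustration only; not part of fme.
    """
    if forward_steps_in_memory < 1:
        raise ValueError("forward_steps_in_memory must be positive")
    return [
        # The last window is shorter when the sizes do not divide evenly.
        range(start, min(start + forward_steps_in_memory, n_forward_steps))
        for start in range(0, n_forward_steps, forward_steps_in_memory)
    ]


windows = forward_step_windows(n_forward_steps=25, forward_steps_in_memory=10)
print([len(w) for w in windows])  # [10, 10, 5]
```

A smaller forward_steps_in_memory lowers peak memory use at the cost of more windows.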

class fme.ace.InferenceDataLoaderConfig(dataset, start_indices, num_data_workers=0, perturbations=None, persistence_names=None)[source]

Configuration for inference data.

This is like the DataLoaderConfig class, but with some additional constraints. During inference, we have only one batch, so the number of samples directly determines the size of that batch.

Parameters:
  • dataset (XarrayDataConfig | MergeNoConcatDatasetConfig) – Configuration to define the dataset.

  • start_indices (InferenceInitialConditionIndices | ExplicitIndices | TimestampList) – Configuration of the indices for initial conditions during inference. This can be a list of timestamps, a list of integer indices, or a slice configuration of the integer indices. Values following the initial condition will still come from the full dataset.

  • num_data_workers (int, default: 0) – Number of parallel workers to use for data loading.

  • perturbations (Optional[SSTPerturbation], default: None) – Configuration for SST perturbations.

  • persistence_names (Optional[Sequence[str]], default: None) – Names of variables for which all returned values will be the same as the initial condition. When evaluating initial condition predictability, set this to forcing variables that should not be updated during inference (e.g. surface temperature).

property zarr_engine_used: bool

Whether any of the configured datasets are using the Zarr engine.

class fme.ace.InferenceEvaluatorAggregatorConfig(log_histograms=False, log_video=False, log_extended_video=False, log_zonal_mean_images=4096, log_seasonal_means=False, log_global_mean_time_series=True, log_global_mean_norm_time_series=True, monthly_reference_data=None, time_mean_reference_data=None, log_nino34_index=True, log_step_means=<factory>)[source]

Configuration for inference evaluator aggregator.

Parameters:
  • log_histograms (bool, default: False) – Whether to log histograms of the targets and predictions.

  • log_video (bool, default: False) – Whether to log videos of the state evolution.

  • log_extended_video (bool, default: False) – Whether to log wandb videos of the predictions with statistical metrics, only done if log_video is True.

  • log_zonal_mean_images (bool | int, default: 4096) – Whether to log zonal-mean images (Hovmöller diagrams) with a time dimension. Zonal-mean images are logged if the value is greater than 0. The default is 4096 (2**12) and the maximum is 32768 (2**15), a limit imposed by matplotlib.

  • log_seasonal_means (bool, default: False) – Whether to log seasonal mean metrics and images.

  • log_global_mean_time_series (bool, default: True) – Whether to log global mean time series metrics.

  • log_global_mean_norm_time_series (bool, default: True) – Whether to log the normalized global mean time series metrics.

  • monthly_reference_data (Optional[str], default: None) – Path to monthly reference data to compare against.

  • time_mean_reference_data (Optional[str], default: None) – Path to reference time means to compare against.

  • log_step_means (list[StepMeanEntry], default: <factory>) – List of StepMeanEntry objects specifying steps at which to log mean metrics.

  • log_nino34_index (bool) –

class fme.ace.InferenceEvaluatorConfig(experiment_dir, n_forward_steps, checkpoint_path, logging, loader, forward_steps_in_memory, prediction_loader=None, data_writer=<factory>, aggregator=<factory>, stepper_override=None, allow_incompatible_dataset=False, validation=None, n_ensemble_per_ic=1)[source]

Configuration for running inference including comparison to reference data.

Parameters:
  • experiment_dir (str) –

    Directory to save results to. This can be a local directory, like /results, or a remote directory prefixed with a protocol recognized by fsspec, like gs://bucket/results.

    Note

    While most types of output can be written to a remote experiment_dir, there are some limitations:

    • To write raw or time-coarsened data, the zarr writer must be used. See the files parameter of the fme.ace.DataWriterConfig for more details on how this can be configured. Note that monthly coarsened data cannot currently be written to zarr, and hence a remote directory, since it uses a different code path than uniformly coarsened data.

    • Piping logging output to a file in the experiment_dir is not supported. To silence the warning related to this, set log_to_file to False in the fme.ace.LoggingConfig.

    There are no restrictions on the types of output that can be written to a local experiment_dir.

  • n_forward_steps (int) – Number of steps to run the model forward for.

  • checkpoint_path (str) – Path to stepper checkpoint to load.

  • logging (LoggingConfig) – Configuration for logging.

  • loader (InferenceDataLoaderConfig) – Configuration for data to be used as initial conditions, forcing, and target in inference.

  • prediction_loader (Optional[InferenceDataLoaderConfig], default: None) – Configuration for prediction data to evaluate. If given, model evaluation will not run, and instead predictions will be evaluated. Model checkpoint will still be used to determine inputs and outputs.

  • forward_steps_in_memory (int) – Number of forward steps to complete in memory at a time, will load one more step for initial condition.

  • data_writer (DataWriterConfig, default: <factory>) – Configuration for data writers.

  • aggregator (InferenceEvaluatorAggregatorConfig, default: <factory>) – Configuration for inference evaluator aggregator.

  • stepper_override (Optional[StepperOverrideConfig], default: None) – Configuration for overriding select stepper configuration options at inference time (optional).

  • allow_incompatible_dataset (bool, default: False) – If True, allow the forcing dataset used for inference to be incompatible with the dataset used for stepper training. This should be used with caution, as it may allow the stepper to make scientifically invalid predictions, but it can allow running inference with incorrectly formatted or missing grid information.

  • validation (Optional[ValidationConfig], default: None) – Optional configuration for running a one-step validation loop before inference. When provided, validation runs first and produces metrics prefixed with val/ (e.g. val/mean/weighted_rmse), mirroring the validation done at the end of each training epoch.

  • n_ensemble_per_ic (int, default: 1) – Number of ensemble members per initial condition. Useful for stochastic model weather inference. n_ensemble_per_ic = 1 is default inference behavior.

class fme.ace.InferenceInitialConditionIndices(n_initial_conditions, first=0, interval=1)[source]

Configuration of the indices for initial conditions during inference.

Parameters:
  • n_initial_conditions (int) – Number of initial conditions to use.

  • first (int, default: 0) – Index of the first initial condition.

  • interval (int, default: 1) – Interval between initial conditions.
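
The three parameters above suggest evenly spaced integer indices. The sketch below shows that reading; the function initial_condition_indices is a hypothetical stand-in written for this example, not the class's own method.

```python
def initial_condition_indices(
    n_initial_conditions: int, first: int = 0, interval: int = 1
) -> list[int]:
    """Return evenly spaced indices (assumed semantics; not part of fme)."""
    return [first + i * interval for i in range(n_initial_conditions)]


print(initial_condition_indices(4, first=2, interval=3))  # [2, 5, 8, 11]
```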

class fme.ace.InitialConditionConfig(path, engine='netcdf4', start_indices=None)[source]

Configuration for initial conditions.

Note

The data specified under path should contain a time dimension of at least length 1. If multiple times are present in the dataset specified by path, the inference will start an ensemble simulation using each IC along a leading sample dimension. Specific times can be selected from the dataset by using start_indices.

Parameters:
class fme.ace.InlineInferenceConfig(loader, n_forward_steps, forward_steps_in_memory, n_ensemble_per_ic=1, epochs=<factory>, aggregator=<factory>)[source]
Parameters:
  • loader (InferenceDataLoaderConfig) – configuration for the data loader used during inference

  • n_forward_steps (int) – number of forward steps to take

  • forward_steps_in_memory (int) – number of forward steps to take before re-reading data from disk

  • n_ensemble_per_ic (int, default: 1) – number of initial condition based ensembles

  • epochs (Slice, default: <factory>) – epochs on which to run inference. By default runs inference every epoch.

  • aggregator (InferenceEvaluatorAggregatorConfig, default: <factory>) – configuration of inline inference aggregator.

class fme.ace.InsolationConfig(insolation_name, solar_constant, obliquity=23.439, eccentricity=0.0167, longitude_of_perhelion=102.932)[source]

Configuration for computing insolation.

Currently only supports computing the insolation as in GFDL’s CM4 model.

Parameters:
  • insolation_name (str) – name to assign the computed insolation; must be present as an input to your model.

  • solar_constant (NameConfig | ValueConfig) – configuration for setting the solar constant to a scalar value or loading a time-varying value from disk. Configure as a value to use the same scalar value for all time. Configure as a name to load a potentially time-varying value from disk. The computed insolation will share the same dtype as the solar constant.

  • obliquity (float, default: 23.439) – angle of the axis of rotation of the Earth with the normal to the orbital plane in units of degrees.

  • eccentricity (float, default: 0.0167) – eccentricity of the orbit of the Earth.

  • longitude_of_perhelion (float, default: 102.932) – orbital angle of perihelion in units of degrees, measured relative to the orbital position of the autumnal equinox in the Northern Hemisphere.

Descriptions of the orbital parameters are paraphrased from a PostScript-format technical document in GFDL’s Flexible Modeling System repository. Definitions align with those in Held (1982), with the one minor difference that the longitude_of_perhelion in this case is defined with respect to the autumnal equinox rather than the vernal equinox.

build(timestep, horizontal_coordinates)[source]

Build an Insolation instance with the current configuration.

Parameters:
  • timestep (timedelta) – Timestep over which to average the insolation.

  • horizontal_coordinates (HorizontalCoordinates) – Horizontal grid over which to compute the insolation.

Return type:

Insolation

build_insolation_function()[source]

Build the insolation function for the current configuration.

Return type:

CM4Insolation

update_requirements(requirements)[source]

Add or remove names from the requirements associated with the insolation.

Parameters:

requirements (DataRequirements) – The requirements to update.

Return type:

DataRequirements

class fme.ace.LRTuningConfig(lr_factor=0.5, num_batches=200, epochs=<factory>, improvement_threshold=0.001)[source]

Configuration for periodic learning rate tuning trials.

At the start of epochs contained in epochs, the trainer forks the current model into a baseline and a candidate copy. Both are trained for num_batches on the first batches of the epoch; the candidate uses a learning rate of current_lr * lr_factor. Both are then validated. If the candidate’s validation loss is less than the baseline’s by at least improvement_threshold times the baseline’s validation loss, the trainer adopts the candidate’s learning rate.

Parameters:
  • epochs (Slice, default: <factory>) – A Slice selecting which epochs to run trials on. For example, Slice(start=1, step=2) runs at epochs 1, 3, 5, … (skipping epoch 0).

  • lr_factor (float, default: 0.5) – Multiply the current LR by this to get the candidate LR.

  • num_batches (int, default: 200) – Number of training batches for each fork in the trial.

  • improvement_threshold (float, default: 0.001) – The candidate must beat the baseline’s validation loss by at least this fraction of the baseline’s validation loss (e.g. 0.01 means the candidate must be lower by at least 1% of the baseline loss).
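
The adoption rule described above can be written as a single comparison. The sketch below is a minimal illustration, not the trainer's code; the function name adopt_candidate_lr is hypothetical. It also uses a plain Python slice to mimic the Slice(start=1, step=2) example, on the assumption that Slice selects epochs the way a Python slice does.

```python
def adopt_candidate_lr(
    baseline_loss: float, candidate_loss: float, improvement_threshold: float = 0.001
) -> bool:
    """Return True if the candidate beats the baseline by the required fraction.

    Hypothetical helper for illustration only; not part of fme.
    """
    return (baseline_loss - candidate_loss) >= improvement_threshold * baseline_loss


# The candidate is lower by 0.15% of the baseline loss, above the 0.1% default.
print(adopt_candidate_lr(baseline_loss=1.0, candidate_loss=0.9985))  # True
# The candidate is lower by only 0.05% of the baseline loss.
print(adopt_candidate_lr(baseline_loss=1.0, candidate_loss=0.9995))  # False

# Epochs selected by start=1, step=2 over a 10-epoch run.
print(list(range(10))[slice(1, None, 2)])  # [1, 3, 5, 7, 9]
```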

class fme.ace.LandNetBuilder(hidden_dims=<factory>, network_type='MLP', use_positional_embedding=False)[source]

Configuration for the LandNet architecture.

Parameters:
  • hidden_dims (list[int]) –

  • network_type (Literal['MLP']) –

  • use_positional_embedding (bool) –

build(n_in_channels, n_out_channels, dataset_info)[source]

Build a nn.Module given information about the input and output channels and the dataset.

Parameters:
  • n_in_channels (int) – number of input channels

  • n_out_channels (int) – number of output channels

  • dataset_info (DatasetInfo) – Information about the dataset, including img_shape, horizontal coordinates, vertical coordinate, etc.

Returns:

a nn.Module

class fme.ace.LoggingConfig(project='ace', entity='ai2cm', log_to_screen=True, log_to_file=True, log_to_wandb=True, metrics_log_dir=None, log_format='%(asctime)s - %(name)s - %(levelname)s - %(message)s', level=20, wandb_dir_in_experiment_dir=False)[source]

Configuration for logging.

Parameters:
  • project (str, default: 'ace') – Name of the project in Weights & Biases.

  • entity (str, default: 'ai2cm') – Name of the entity in Weights & Biases.

  • log_to_screen (bool, default: True) – Whether to log to the screen.

  • log_to_file (bool, default: True) – Whether to log to a file.

  • log_to_wandb (bool, default: True) – Whether to log to Weights & Biases.

  • metrics_log_dir (Optional[str], default: None) – Directory to write scalar metrics to disk as JSONL. If None, disk metric logging is disabled.

  • log_format (str, default: '%(asctime)s - %(name)s - %(levelname)s - %(message)s') – Format of the log messages.

  • level (str | int, default: 20) – Sets the logging level. The default of 20 corresponds to INFO in Python’s logging module.

  • wandb_dir_in_experiment_dir (bool, default: False) – Whether to create the wandb_dir in the experiment_dir or in local /tmp (default False).

configure_logging(experiment_dir, log_filename, config, resumable=True)[source]

Configure global logging settings, including WandB, and output initial logs of the runtime environment.

Parameters:
  • experiment_dir (str) – Directory to save logs to.

  • log_filename (str) – Name of the log file.

  • config (Mapping[str, Any]) – Configuration dictionary to log to WandB.

  • resumable (bool, default: True) – Whether this is a resumable run.
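
To see what the default log_format and level produce, the sketch below renders one record with Python's standard logging module. It does not call fme; the logger name and message are made up for the example, and format_example_record is a hypothetical helper.

```python
import logging

LOG_FORMAT = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
LEVEL = 20  # same numeric value as logging.INFO


def format_example_record(message: str) -> str:
    """Render one record with the default format string.

    Builds the record by hand so global logging state is left untouched.
    """
    record = logging.LogRecord(
        name="example",
        level=LEVEL,
        pathname="example.py",
        lineno=1,
        msg=message,
        args=(),
        exc_info=None,
    )
    return logging.Formatter(LOG_FORMAT).format(record)


print(logging.getLevelName(LEVEL))  # INFO
print(format_example_record("starting inference"))
```

The second line printed starts with a timestamp and ends with "- example - INFO - starting inference".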

class fme.ace.MergeDatasetConfig(merge)[source]

Configuration for merging multiple datasets. Merging means combining variables from multiple datasets, each of which must have the same time coordinate. If multiple datasets contain the same data variable, the version from the first source is loaded and other sources are ignored.

Parameters:

merge (Sequence[ConcatDatasetConfig | XarrayDataConfig]) – List of dataset configurations to merge.

property available_labels: set[str] | None

Return the labels that are available in the dataset.
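
The first-source-wins rule for duplicated variables can be pictured with plain dictionaries. This is only a sketch of the precedence rule described above; real datasets are not dictionaries, and merge_first_wins is a hypothetical name.

```python
def merge_first_wins(sources: list[dict[str, list[float]]]) -> dict[str, list[float]]:
    """Combine variables from several sources, keeping the first copy of each name.

    Hypothetical helper for illustration only; not part of fme.
    """
    merged: dict[str, list[float]] = {}
    for source in sources:
        for name, values in source.items():
            # setdefault ignores later sources that repeat an existing name.
            merged.setdefault(name, values)
    return merged


atmosphere = {"air_temperature": [280.0, 281.0], "co2": [400.0, 401.0]}
forcing = {"co2": [800.0, 802.0], "insolation": [340.0, 341.0]}
print(merge_first_wins([atmosphere, forcing]))
# {'air_temperature': [280.0, 281.0], 'co2': [400.0, 401.0], 'insolation': [340.0, 341.0]}
```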

class fme.ace.MergeNoConcatDatasetConfig(merge)[source]

Configuration for merging multiple datasets. Merging means combining variables from multiple datasets, each of which must have the same time coordinate. If multiple datasets contain the same data variable, the version from the first source is loaded and other sources are ignored. For MergeNoConcatDatasetConfig, the datasets being merged must not be concatenated datasets.

Parameters:

merge (Sequence[XarrayDataConfig]) – List of dataset configurations to merge.

property available_labels: set[str] | None

Return the labels that are available in the dataset.

class fme.ace.ModuleSelector(type, config, conditional=False)[source]

A dataclass containing all the information needed to build a ModuleConfig, including the type of the ModuleConfig and the data needed to build it.

This is helpful as ModuleSelector can be serialized and deserialized without any additional information, whereas to load a ModuleConfig you would need to know the type of the ModuleConfig being loaded.

It is also convenient because ModuleSelector is a single class that can be used to represent any ModuleConfig, whereas ModuleConfig is a protocol that can be implemented by many different classes.

Parameters:
  • type (str) – the type of the ModuleConfig

  • config (Mapping[str, Any]) – data for a ModuleConfig instance of the indicated type

  • conditional (bool, default: False) – whether to condition the predictions on batch labels.

build(n_in_channels, n_out_channels, dataset_info)[source]

Build a nn.Module given information about the input and output channels and the dataset.

Parameters:
  • n_in_channels (int) – number of input channels

  • n_out_channels (int) – number of output channels

  • dataset_info (DatasetInfo) – Information about the dataset, including img_shape (shape of last two dimensions of data, e.g. latitude and longitude), horizontal coordinates, vertical coordinate, etc.

Return type:

Module

Returns:

a Module object

classmethod get_available_types()[source]

This class method is used to expose all available types of Modules.

class fme.ace.MultiCallConfig(forcing_name, forcing_multipliers, output_names)[source]

Configuration for doing ‘multi-call’ predictions where an input variable (e.g. CO2) is varied by multiplying by floats and then certain output variables (e.g. radiative heating or fluxes) are predicted.

Parameters:
  • forcing_name (str) – name of the variable to perturb in the forcing data, e.g. “co2”.

  • forcing_multipliers (dict[str, float]) – mapping from a label suffix to a multiplier that is applied to the ‘forcing_name’ variable. For example, could be {“_quadrupled_co2”: 4, “_halved_co2”: 0.5}. The suffixes will be appended to the output_names below.

  • output_names (list[str]) – names of the variables to predict given perturbed forcing. For example, [“ULWRFtoa”, “USWRFsfc”].

property names: list[str]

Return the names of all multi-called output variables, often radiative fluxes.

E.g. [‘ULWRFtoa_quadrupled_co2’].
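
The naming scheme above, where each suffix is appended to each output name, can be sketched as follows. The function multi_call_names is hypothetical, and the ordering of the returned list is an assumption; the names property may order its entries differently.

```python
def multi_call_names(
    forcing_multipliers: dict[str, float], output_names: list[str]
) -> list[str]:
    """Append every label suffix to every output name (ordering is assumed)."""
    return [name + suffix for suffix in forcing_multipliers for name in output_names]


names = multi_call_names(
    forcing_multipliers={"_quadrupled_co2": 4, "_halved_co2": 0.5},
    output_names=["ULWRFtoa", "USWRFsfc"],
)
print(names)
# ['ULWRFtoa_quadrupled_co2', 'USWRFsfc_quadrupled_co2',
#  'ULWRFtoa_halved_co2', 'USWRFsfc_halved_co2']
```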

class fme.ace.MultiCallStepConfig(wrapped_step, config=None, include_multi_call_in_loss=True)[source]

Configuration for a multi-call step.

Parameters:
  • wrapped_step (StepSelector) – The step to wrap.

  • config (Optional[MultiCallConfig], default: None) – The multi-call configuration.

  • include_multi_call_in_loss (bool, default: True) – Whether to include multi-call diagnostics in the loss.

extend_normalizer_with_multi_call_outputs(normalizer)[source]

Extend the normalizer by setting multi-call output names to use the same normalization as their base counterparts.

Return type:

StandardNormalizer

Parameters:

normalizer (StandardNormalizer) –

get_loss_normalizer(extra_names=None, extra_residual_scaled_names=None)[source]

Get the loss normalizer for the multi-call step.

Normalizer will use statistics from multi-call variables in the stats dataset, meaning the normalization for multi-call output versions will be different from the normalization for the base variables.

Parameters:
  • extra_names (Optional[list[str]], default: None) – Names of additional variables to include in the loss normalizer.

  • extra_residual_scaled_names (Optional[list[str]], default: None) – extra_names which use residual scale factors, if enabled.

Return type:

StandardNormalizer

get_step(dataset_info, init_weights)[source]
Parameters:
  • dataset_info (DatasetInfo) – Information about the training dataset.

  • init_weights (Callable[[list[Module]], None]) – Function to initialize the weights of the step before wrapping in DistributedDataParallel. This is particularly useful when freezing parameters, as the DistributedDataParallel will otherwise expect frozen weights to have gradients, and will raise an exception.

Return type:

MultiCallStep

Returns:

The state of the stepper.

load()[source]

Update configuration in-place so it does not depend on external files.

property loss_names: list[str]

Names of variables to be included in the loss function.

property next_step_input_names: list[str]

Names of variables required in next_step_input_data for .step.

property output_names: list[str]

Names of variables output by the step.

replace_prescribed_prognostic_names(names)[source]

Replace prescribed prognostic names (e.g. when loading from checkpoint).

Return type:

None

Parameters:

names (list[str]) –

class fme.ace.NameConfig(name)[source]

Configuration for specifying a solar constant name.

Parameters:

name (str) – name of a solar constant variable to load from data on disk; useful in the case that a time-varying solar constant is desired. The computed insolation will share the same dtype as the loaded solar constant.

fme.ace.NoiseConditionedSFNO

alias of NoiseConditionedModel

class fme.ace.NormalizationConfig(global_means_path=None, global_stds_path=None, means=<factory>, stds=<factory>, fill_nans_on_normalize=False, fill_nans_on_denormalize=False)[source]

Configuration for normalizing data.

Either global_means_path and global_stds_path or explicit means and stds must be provided.

Parameters:
  • global_means_path (UnionType[str, Path, None], default: None) – Path to a netCDF file containing global means.

  • global_stds_path (UnionType[str, Path, None], default: None) – Path to a netCDF file containing global stds.

  • means (Mapping[str, float], default: <factory>) – Mapping from variable names to means.

  • stds (Mapping[str, float], default: <factory>) – Mapping from variable names to stds.

  • fill_nans_on_normalize (bool, default: False) – Whether to fill NaNs during normalization. If true, on normalization NaNs in the denormalized input become zeros in the normalized output.

  • fill_nans_on_denormalize (bool, default: False) – Whether to fill NaNs during denormalization. If true, on denormalization NaNs in the normalized input become global means in the denormalized output.

load()[source]

Load the normalization configuration from the netCDF files.

Updates the configuration so it no longer requires external files.
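
The effect of the two fill_nans flags can be shown on scalars. The sketch below assumes the usual (value - mean) / std scaling and uses plain floats; the library operates on tensors, and the function names here are hypothetical.

```python
import math


def normalize(value: float, mean: float, std: float, fill_nans: bool = False) -> float:
    """Scale a value to zero mean and unit variance, optionally filling NaN with 0."""
    if math.isnan(value):
        return 0.0 if fill_nans else value
    return (value - mean) / std


def denormalize(value: float, mean: float, std: float, fill_nans: bool = False) -> float:
    """Invert normalize, optionally filling NaN with the mean."""
    if math.isnan(value):
        return mean if fill_nans else value
    return value * std + mean


print(normalize(290.0, mean=280.0, std=10.0))  # 1.0
print(normalize(float("nan"), mean=280.0, std=10.0, fill_nans=True))  # 0.0
print(denormalize(float("nan"), mean=280.0, std=10.0, fill_nans=True))  # 280.0
```

The two fills are consistent with each other: a normalized value of zero denormalizes to the mean.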

class fme.ace.OceanConfig(surface_temperature_name, ocean_fraction_name, interpolate=False, slab=None)[source]

Configuration for determining sea surface temperature from an ocean model.

Parameters:
  • surface_temperature_name (str) – Name of the sea surface temperature field.

  • ocean_fraction_name (str) – Name of the ocean fraction field.

  • interpolate (bool, default: False) – If True, interpolate between ML-predicted surface temperature and ocean-predicted surface temperature according to ocean_fraction. If False, only use ocean-predicted surface temperature where ocean_fraction>=0.5.

  • slab (Optional[SlabOceanConfig], default: None) – If provided, use a slab ocean model to predict surface temperature.
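
The two modes of the interpolate flag can be sketched for a single grid cell. The linear weighting used for interpolate=True is an assumption about what "interpolate according to ocean_fraction" means, and blend_surface_temperature is a hypothetical name; the library applies its logic to whole fields.

```python
def blend_surface_temperature(
    ml_temperature: float,
    ocean_temperature: float,
    ocean_fraction: float,
    interpolate: bool = False,
) -> float:
    """Combine ML-predicted and ocean-predicted surface temperature for one cell."""
    if interpolate:
        # Assumed linear weighting by ocean fraction.
        return ocean_fraction * ocean_temperature + (1.0 - ocean_fraction) * ml_temperature
    # Without interpolation, the ocean value is used only where ocean_fraction >= 0.5.
    return ocean_temperature if ocean_fraction >= 0.5 else ml_temperature


print(blend_surface_temperature(300.0, 290.0, 0.25, interpolate=True))  # 297.5
print(blend_surface_temperature(300.0, 290.0, 0.5))  # 290.0
print(blend_surface_temperature(300.0, 290.0, 0.4))  # 300.0
```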

class fme.ace.OceanCorrectorConfig(force_positive_names=<factory>, sea_ice_fraction_correction=None, surface_energy_flux_correction=None, ocean_heat_content_correction=None)[source]
Parameters:
  • force_positive_names (list[str]) –

  • sea_ice_fraction_correction (SeaIceFractionConfig | None) –

  • surface_energy_flux_correction (SurfaceEnergyFluxCorrectionConfig | None) –

  • ocean_heat_content_correction (OceanHeatContentBudgetConfig | None) –

classmethod remove_deprecated_keys(state)[source]

This method is used to remove or transform any deprecated keys from the state dict before loading it into a CorrectorConfigABC instance. It is optional to implement this method on subclasses.

Return type:

dict[str, Any]

Parameters:

state (Mapping[str, Any]) –

class fme.ace.OneStepAggregatorConfig(log_snapshots=True, log_mean_maps=True)[source]

Configuration for the validation OneStepAggregator.

Parameters:
  • log_snapshots (bool, default: True) – Whether to log snapshot images.

  • log_mean_maps (bool, default: True) – Whether to log mean map images.

class fme.ace.OptimizationConfig(optimizer_type='Adam', lr=0.001, kwargs=<factory>, enable_automatic_mixed_precision=False, scheduler=<factory>, use_gradient_accumulation=False, checkpoint=<factory>, resume_optimizer_ckpt_path=None)[source]

Configuration for optimization.

Parameters:
  • optimizer_type (Literal['Adam', 'AdamW', 'FusedAdam'], default: 'Adam') – The type of optimizer to use.

  • lr (float, default: 0.001) – The learning rate.

  • kwargs (Mapping[str, Any], default: <factory>) – Additional keyword arguments to pass to the optimizer.

  • enable_automatic_mixed_precision (bool, default: False) – Whether to use automatic mixed precision.

  • scheduler (SchedulerConfig | SequentialSchedulerConfig, default: <factory>) – The type of scheduler to use. If none is given, no scheduler will be used.

  • use_gradient_accumulation (bool, default: False) – Whether to use gradient accumulation. This must be supported by the stepper being optimized, which may accumulate gradients from separate losses to reduce memory consumption. The stepper may choose to accumulate gradients differently when this is enabled, such as by detaching the computational graph between steps. See the documentation of your stepper (e.g. Stepper) for more details.

  • resume_optimizer_ckpt_path (Optional[str], default: None) – Optional path to a training checkpoint (ckpt.tar) whose per-parameter optimizer running state (e.g. Adam moment estimates) and grad scaler state should be loaded into the freshly-built Optimization for fine-tuning. The current config’s per-group hyperparameters (lr, weight_decay, betas, …) and scheduler are kept; only the running state is transferred. Intended for non-resuming jobs; preemption resume in the Trainer overrides this state via Optimization.load_state.

  • checkpoint (CheckpointConfig) –

property has_lr_schedule: bool

Whether a learning rate scheduler is configured.

class fme.ace.OverwriteConfig(constant=<factory>, multiply_scalar=<factory>)[source]

Configuration to overwrite field values in XarrayDataset.

Parameters:
  • constant (Mapping[str, float], default: <factory>) – Fill field with constant value.

  • multiply_scalar (Mapping[str, float], default: <factory>) – Multiply field by scalar value.

class fme.ace.ParameterClassification(exclude=<factory>, frozen=<factory>)[source]

Specifies whether parameters are excluded from initialization or frozen.

Parameters:
  • exclude (list[str], default: <factory>) – list of parameter names to exclude from the loaded weights. Used for example to keep the random initialization for final layer(s) of a model, and only overwrite the weights for earlier layers. Takes values like “decoder.2.weight”.

  • frozen (FrozenParameterConfig, default: <factory>) – configuration for freezing parameters in the built model

class fme.ace.ParameterInitializationConfig(weights_path=None, parameters=<factory>, alpha=0.0, beta=0.0, exclude_parameters=None, frozen_parameters=None)[source]

A class which applies custom initialization to module parameters.

Assumes the module weights have already been randomly initialized.

Supports overwriting the weights of the built model with weights from a pre-trained model. If the built model has larger weights than the pre-trained model, only the initial slice of the weights is overwritten.

Parameters:
  • weights_path (str | None, default: None) – path to a Stepper checkpoint containing weights to load

  • parameters (list[ParameterClassification], default: <factory>) – list of ParameterClassification objects, each specifying whether parameters are excluded from initialization or frozen. By default modules are unfrozen and all parameters are included. Must be provided in the same order as provided by the stepper’s .modules attribute.

  • alpha (float, default: 0.0) – L2 regularization coefficient keeping initialized weights close to their initial values

  • beta (float, default: 0.0) – L2 regularization coefficient keeping uninitialized weights close to zero

  • exclude_parameters (Optional[list[str]], default: None) – deprecated, kept for backwards compatibility

  • frozen_parameters (Optional[FrozenParameterConfig], default: None) – deprecated, kept for backwards compatibility

build(load_weights_and_history)[source]

Build a ParameterInitializer instance with the current configuration.

Parameters:

load_weights_and_history (Callable[[Optional[str]], tuple[Optional[list[Mapping[str, Any]]], TrainingHistory]]) – a function which loads weights and training history from a path, specifically the configured weights_path.

Return type:

ParameterInitializer
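
The "initial slice" behavior and the two regularization coefficients can be pictured on flat lists of floats. This is a sketch under assumptions: real weights are multi-dimensional tensors, the sum-of-squares penalty form is assumed rather than taken from the library, and both function names are hypothetical.

```python
def overwrite_initial_slice(built: list[float], pretrained: list[float]) -> list[float]:
    """Copy pretrained weights over the start of a larger built weight vector."""
    if len(pretrained) > len(built):
        raise ValueError("this sketch only covers built weights at least as large")
    # Entries beyond the pretrained length keep their random initialization.
    return pretrained + built[len(pretrained):]


def l2_penalty(weights: list[float], reference: list[float], coefficient: float) -> float:
    """Assumed penalty form: coefficient times the squared distance to reference."""
    return coefficient * sum((w - r) ** 2 for w, r in zip(weights, reference))


print(overwrite_initial_slice([0.1, 0.2, 0.3, 0.4], [1.0, 2.0]))  # [1.0, 2.0, 0.3, 0.4]
# alpha pulls loaded weights toward their loaded values.
print(l2_penalty([1.0, 2.0], reference=[1.0, 1.0], coefficient=0.5))  # 0.5
# beta pulls weights that were not loaded toward zero.
print(l2_penalty([2.0, 0.0], reference=[0.0, 0.0], coefficient=0.25))  # 1.0
```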

class fme.ace.PerturbationSelector(type, config)[source]
Parameters:
classmethod get_available_types()[source]

This class method is used to expose all available types of Perturbations.

class fme.ace.RecurrentBlockConfig(in_channels=3, kernel_size=1, enable_nhwc=False, enable_healpixpad=False, block_type='ConvGRUBlock')[source]

Configuration for the recurrent block.

Parameters:
  • in_channels (int, default: 3) – Number of input channels, default is 3.

  • kernel_size (int, default: 1) – Size of the kernel, default is 1.

  • enable_nhwc (bool, default: False) – Flag to enable NHWC data format, default is False.

  • enable_healpixpad (bool, default: False) – Flag to enable HEALPix padding, default is False.

  • block_type (Literal['ConvGRUBlock', 'ConvLSTMBlock'], default: 'ConvGRUBlock') – Type of recurrent block, either “ConvGRUBlock” or “ConvLSTMBlock”, default is “ConvGRUBlock”.

build()[source]

Builds the recurrent block model.

Return type:

Module

Returns:

Recurrent block.

class fme.ace.RepeatedInterval(interval_length, start, block_length)[source]

Configuration for a repeated interval within a block. This configuration is used to generate a boolean mask for a dataset that will return values within the interval and repeat that throughout the dataset.

Parameters:
  • interval_length (int | str) – Length of the interval to return values from

  • start (int | str) – Start position of the interval within the repeat block.

  • block_length (int | str) – Total length of the block to be repeated over the length of the dataset, including the interval length.

Note

The interval_length, start, and block_length can be provided as either all integers or all strings representing timedeltas of the block. If provided as strings, the timestep must be provided when calling get_boolean_mask.

Examples

To return values from the first 3 items of every 6 items, use:

>>> fme.ace.RepeatedInterval(interval_length=3, block_length=6, start=0)  

To return a day’s worth of values starting after 2 days from every 7-day block, use:

>>> fme.ace.RepeatedInterval(interval_length="1d", block_length="7d", start="2d")  
get_boolean_mask(length, timestep=None)[source]

Return a boolean mask for the repeated interval.

Parameters:
  • length (int) – Length of the dataset.

  • timestep (Optional[timedelta], default: None) – Timestep of the dataset.

Return type:

ndarray
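
The masking rule described above can be sketched in plain Python. This is a hypothetical re-implementation, not the library's code: the function name repeated_interval_mask is invented, it returns a list rather than an ndarray, and it accepts timedelta objects instead of timedelta strings such as "1d" so that no string-parsing behavior has to be assumed.

```python
from datetime import timedelta


def repeated_interval_mask(length, interval_length, start, block_length, timestep=None):
    """Hypothetical re-implementation of the repeated-interval mask.

    Integer arguments count dataset items; timedelta arguments are converted
    to item counts by dividing by `timestep`.
    """
    if isinstance(block_length, timedelta):
        if timestep is None:
            raise ValueError("timestep is required for timedelta arguments")
        interval_length = interval_length // timestep
        start = start // timestep
        block_length = block_length // timestep
    # An item is selected when its position inside the block falls in the interval.
    return [start <= (i % block_length) < start + interval_length for i in range(length)]


# First 3 items of every 6 items.
int_mask = repeated_interval_mask(12, interval_length=3, start=0, block_length=6)

# One day of values starting 2 days into every 7-day block, with 6-hourly data.
day = timedelta(days=1)
td_mask = repeated_interval_mask(
    28, interval_length=day, start=2 * day, block_length=7 * day,
    timestep=timedelta(hours=6),
)
```

In the second call a 7-day block spans 28 six-hourly items, so the selected positions are the four items that begin 2 days into the block.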

class fme.ace.ResumeResultsConfig(existing_dir, resume_wandb=False)[source]

Configuration for resuming a previously stopped or finished job.

Typically only useful for training jobs which have already finished (e.g., to train for a larger value of max_epochs than originally configured) or which were stopped (e.g., to resume training on different hardware or to change data loader settings such as number of data workers).

WARNING: We typically don’t guarantee backwards compatibility for training, so this may not work well when resuming old experiments.

Parameters:
  • existing_dir (str) – Directory with existing results to resume from.

  • resume_wandb (bool, default: False) – If true, log to the same WandB job as given in the wandb_job_id file in existing_dir, if any.

prepare_directory(experiment_dir)[source]

Recursively copies existing_dir to experiment_dir.

Parameters:

experiment_dir (str) – Directory to which existing_dir will be copied. Typically, this will be an empty directory which has been configured for saving a training job’s outputs, such as model checkpoints.

class fme.ace.SFNO_V0_1_0(spectral_transform='sht', filter_type='linear', operator_type='dhconv', scale_factor=16, embed_dim=256, num_layers=12, repeat_layers=1, hard_thresholding_fraction=1.0, normalization_layer='instance_norm', use_mlp=True, activation_function='gelu', encoder_layers=1, pos_embed='direct', big_skip=True, rank=1.0, factorization=None, separable=False, complex_activation='real', spectral_layers=1, checkpointing=0, data_grid='legendre-gauss')[source]

Configuration for the SFNO architecture in modulus-makani version 0.1.0.

Parameters:
  • spectral_transform (str) –

  • filter_type (Literal['linear']) –

  • operator_type (str) –

  • scale_factor (int) –

  • embed_dim (int) –

  • num_layers (int) –

  • repeat_layers (int) –

  • hard_thresholding_fraction (float) –

  • normalization_layer (str) –

  • use_mlp (bool) –

  • activation_function (str) –

  • encoder_layers (int) –

  • pos_embed (Literal['none', 'direct', 'frequency']) –

  • big_skip (bool) –

  • rank (float) –

  • factorization (str | None) –

  • separable (bool) –

  • complex_activation (str) –

  • spectral_layers (int) –

  • checkpointing (int) –

  • data_grid (Literal['legendre-gauss', 'equiangular', 'healpix']) –

build(n_in_channels, n_out_channels, dataset_info)[source]

Build a nn.Module given information about the input and output channels and the dataset.

Parameters:
  • n_in_channels (int) – number of input channels

  • n_out_channels (int) – number of output channels

  • dataset_info (DatasetInfo) – Information about the dataset, including img_shape, horizontal coordinates, vertical coordinate, etc.

Returns:

a nn.Module

class fme.ace.SSTPerturbation(sst)[source]

Configuration for sea surface temperature perturbations applied to initial condition and forcing data. Currently, this is strictly applied to both.

Parameters:

sst (list[PerturbationSelector]) – List of perturbation selectors for SST perturbations.

class fme.ace.SamudraBuilder(ch_width=<factory>, n_layers=<factory>, dilation=<factory>, pad='circular', norm='instance', norm_kwargs=<factory>, upscale_factor=4, checkpoint_strategy=None)[source]

Configuration for the M2Lines Samudra architecture.

Parameters:
build(n_in_channels, n_out_channels, dataset_info)[source]

Build a nn.Module given information about the input and output channels and the dataset.

Parameters:
  • n_in_channels (int) – number of input channels

  • n_out_channels (int) – number of output channels

  • dataset_info (DatasetInfo) – Information about the dataset, including img_shape, horizontal coordinates, vertical coordinate, etc.

Returns:

a nn.Module

class fme.ace.SchedulerConfig(type=None, kwargs=<factory>, step_each_iteration=False)[source]

Configuration for a scheduler to use during training.

Parameters:
  • type (Optional[str], default: None) – Name of scheduler class from torch.optim.lr_scheduler, no scheduler is used by default.

  • kwargs (Mapping[str, Any], default: <factory>) – Keyword arguments to pass to the scheduler constructor.

  • step_each_iteration (bool, default: False) – If true, step after each batch. Otherwise, just step at the end of each epoch. Schedulers that step with every iteration won’t be passed the validation loss.

build(optimizer, max_epochs)[source]

Build the scheduler.

Return type:

LRScheduler

class fme.ace.SeparateRadiationStepConfig(builder, radiation_builder, main_prognostic_names, shared_forcing_names, radiation_only_forcing_names, radiation_diagnostic_names, main_diagnostic_names, normalization, next_step_forcing_names=<factory>, ocean=None, corrector=<factory>, detach_radiation=False, residual_prediction=False)[source]

Configuration for a separate radiation stepper.

Parameters:
  • builder (ModuleSelector) – The module builder.

  • radiation_builder (ModuleSelector) – The radiation module builder.

  • main_prognostic_names (list[str]) – Names of prognostic variables. These are provided as input to both the main and radiation models, and output by the main model.

  • shared_forcing_names (list[str]) – Names of forcing variables.

  • radiation_only_forcing_names (list[str]) – Names of forcing variables for the radiation model, in addition to the ones specified in shared_forcing_names.

  • radiation_diagnostic_names (list[str]) – Names of diagnostic variables for the radiation model.

  • main_diagnostic_names (list[str]) – Names of diagnostic variables for the main model.

  • normalization (NetworkAndLossNormalizationConfig) – The normalization configuration.

  • next_step_forcing_names (list[str], default: <factory>) – Names of forcing variables which come from the output timestep.

  • ocean (Optional[OceanConfig], default: None) – The ocean configuration.

  • corrector (AtmosphereCorrectorConfig | CorrectorSelector, default: <factory>) – The corrector configuration.

  • detach_radiation (bool, default: False) – Whether to detach the output of the radiation model before passing it to the main model. The radiation outputs returned by .step() will not be detached.

  • residual_prediction (bool, default: False) – Whether to use residual prediction.

get_loss_normalizer(extra_names=None, extra_residual_scaled_names=None)[source]
Parameters:
  • extra_names (Optional[list[str]], default: None) – Names of additional variables to include in the loss normalizer.

  • extra_residual_scaled_names (Optional[list[str]], default: None) – extra_names which use residual scale factors, if enabled.

Return type:

StandardNormalizer

Returns:

The loss normalizer.

get_step(dataset_info, init_weights)[source]
Parameters:
  • dataset_info (DatasetInfo) – Information about the training dataset.

  • init_weights (Callable[[list[Module]], None]) – Function to initialize the weights of the step before wrapping in DistributedDataParallel. This is particularly useful when freezing parameters, as the DistributedDataParallel will otherwise expect frozen weights to have gradients, and will raise an exception.

Return type:

SeparateRadiationStep

Returns:

The state of the stepper.

load()[source]

Update configuration in-place so it does not depend on external files.

property loss_names: list[str]

Names of variables to be included in the loss function.

property next_step_input_names: list[str]

Names of variables provided in next_step_input_data.

property output_names: list[str]

Names of variables output by the step.

class fme.ace.SequentialSchedulerConfig(schedulers, milestones, last_epoch=-1)[source]

Configuration for using torch.optim.lr_scheduler.SequentialLR to build a sequence of LR schedulers that run one after the other.

Parameters:
  • schedulers (Sequence[SchedulerConfig]) – Ordered sequence of SchedulerConfigs to define the schedulers for the SequentialLR. Note that all schedulers in the sequence must have the same value for step_each_iteration.

  • milestones (Sequence[int]) – Sequence of integers that reflects milestone points, where milestones[i] corresponds to the last epoch or iteration where schedulers[i] is active before switching to schedulers[i+1]. For example, with two schedulers and milestones=[10] the first 10 epochs will use the first scheduler and then switch to the second scheduler for epoch 11.

  • last_epoch (int, default: -1) – The index of last epoch. Default: -1.

build(optimizer, max_epochs)[source]

Build the SequentialLR scheduler.

Return type:

LRScheduler
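
To make the milestones semantics concrete, the sketch below maps a zero-based epoch (or iteration) to the index of the active scheduler. The helper name active_scheduler_index is hypothetical, and the sketch assumes the switch behaves like a bisect_right lookup over milestones, which is consistent with the example given for milestones=[10].

```python
from bisect import bisect_right


def active_scheduler_index(milestones, epoch):
    """Index of the scheduler in use at a zero-based epoch (or iteration)."""
    return bisect_right(milestones, epoch)


# Two schedulers with milestones=[10]: zero-based epochs 0-9 use the first
# scheduler, and the 11th epoch (index 10) uses the second.
indices = [active_scheduler_index([10], epoch) for epoch in range(12)]
```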

class fme.ace.SingleModuleStepConfig(builder, in_names, out_names, normalization, secondary_decoder=None, ocean=None, corrector=<factory>, next_step_forcing_names=<factory>, prescribed_prognostic_names=<factory>, residual_prediction=False)[source]

Configuration for a single module stepper.

Parameters:
  • builder (ModuleSelector) – The module builder.

  • in_names (list[str]) – Names of input variables.

  • out_names (list[str]) – Names of output variables.

  • normalization (NetworkAndLossNormalizationConfig) – The normalization configuration.

  • secondary_decoder (Optional[SecondaryDecoderConfig], default: None) – Configuration for the secondary decoder that computes additional diagnostic variables from outputs.

  • ocean (Optional[OceanConfig], default: None) – The ocean configuration.

  • corrector (AtmosphereCorrectorConfig | CorrectorSelector, default: <factory>) – The corrector configuration.

  • next_step_forcing_names (list[str], default: <factory>) – Names of forcing variables for the next timestep.

  • prescribed_prognostic_names (list[str], default: <factory>) – Prognostic variable names to overwrite from forcing data at each step (e.g. for inference with observed values).

  • residual_prediction (bool, default: False) – Whether to use residual prediction.

property diagnostic_names: list[str]

Names of variables which are outputs only.

get_loss_normalizer(extra_names=None, extra_residual_scaled_names=None)[source]
Parameters:
  • extra_names (Optional[list[str]], default: None) – Names of additional variables to include in the loss normalizer.

  • extra_residual_scaled_names (Optional[list[str]], default: None) – extra_names which use residual scale factors, if enabled.

Return type:

StandardNormalizer

Returns:

The loss normalizer.

get_next_step_forcing_names()[source]

Names of input-only variables which come from the output timestep.

Return type:

list[str]

get_step(dataset_info, init_weights)[source]
Parameters:
  • dataset_info (DatasetInfo) – Information about the training dataset.

  • init_weights (Callable[[list[Module]], None]) – Function to initialize the weights of the step before wrapping in DistributedDataParallel. This is particularly useful when freezing parameters, as the DistributedDataParallel will otherwise expect frozen weights to have gradients, and will raise an exception.

Return type:

SingleModuleStep

Returns:

The state of the stepper.

property input_names: list[str]

Names of variables required as inputs to step, either in input or next_step_input_data.

load()[source]

Update configuration in-place so it does not depend on external files.

property loss_names: list[str]

Names of variables to be included in the loss function.

property next_step_input_names: list[str]

Names of variables provided in next_step_input_data.

property output_names: list[str]

Names of variables output by the step.

replace_ocean(ocean)[source]

Replace the ocean model with a new one.

Parameters:

ocean (Optional[OceanConfig]) – The new ocean model configuration or None.

replace_prescribed_prognostic_names(names)[source]

Replace prescribed prognostic names (e.g. when loading from checkpoint).

Return type:

None

Parameters:

names (list[str]) –

class fme.ace.SlabOceanConfig(mixed_layer_depth_name, q_flux_name)[source]

Configuration for a slab ocean model.

Parameters:
  • mixed_layer_depth_name (str) – Name of the mixed layer depth field.

  • q_flux_name (str) – Name of the heat flux field.

class fme.ace.Slice(start=None, stop=None, step=None)[source]

Configuration of a python slice built-in.

Required because slice cannot be initialized directly by dacite.

Parameters:
  • start (Optional[int], default: None) – Start index of the slice.

  • stop (Optional[int], default: None) – Stop index of the slice.

  • step (Optional[int], default: None) – Step of the slice.

classmethod shift_left(original, start_index)[source]

Shift the slice relative to the start index of a group of data to capture requested correct quantities while still respecting batches. E.g., If slice is (0, 10, 1) and start_index is 5, the new slice would be (None, 5, 1).

Raises:

ValueError – If trying to shift a negative-valued slice object, since that is not defined without knowing the total sequence length.

Return type:

Slice

Parameters:
  • original (Slice) –

  • start_index (int) –
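
The following is a minimal sketch of one shifting rule that reproduces the documented example, (0, 10, 1) with start_index 5 giving (None, 5, 1). It is a guess at the behavior rather than the library's implementation: the function works on plain (start, stop, step) values instead of Slice objects, and it ignores how a step larger than 1 should stay aligned after the shift.

```python
def shift_left(start, stop, step, start_index):
    """Hypothetical sketch: re-express a slice relative to `start_index`."""
    for bound in (start, stop):
        if bound is not None and bound < 0:
            # Negative bounds depend on the total sequence length, which is unknown here.
            raise ValueError("cannot shift a negative-valued slice")
    new_start = None
    if start is not None and start > start_index:
        new_start = start - start_index
    new_stop = None if stop is None else max(stop - start_index, 0)
    return (new_start, new_stop, step)


shifted = shift_left(0, 10, 1, start_index=5)
```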

class fme.ace.SphericalFourierNeuralOperatorBuilder(spectral_transform='sht', filter_type='linear', operator_type='diagonal', scale_factor=1, residual_filter_factor=1, embed_dim=256, num_layers=12, hard_thresholding_fraction=1.0, normalization_layer='instance_norm', use_mlp=True, activation_function='gelu', encoder_layers=1, pos_embed=True, big_skip=True, rank=1.0, factorization=None, separable=False, complex_network=True, complex_activation='real', spectral_layers=1, checkpointing=0, data_grid='legendre-gauss')[source]

Configuration for the SFNO architecture used in FourCastNet-SFNO.

Parameters:
  • spectral_transform (str) –

  • filter_type (str) –

  • operator_type (str) –

  • scale_factor (int) –

  • residual_filter_factor (int) –

  • embed_dim (int) –

  • num_layers (int) –

  • hard_thresholding_fraction (float) –

  • normalization_layer (str) –

  • use_mlp (bool) –

  • activation_function (str) –

  • encoder_layers (int) –

  • pos_embed (bool) –

  • big_skip (bool) –

  • rank (float) –

  • factorization (str | None) –

  • separable (bool) –

  • complex_network (bool) –

  • complex_activation (str) –

  • spectral_layers (int) –

  • checkpointing (int) –

  • data_grid (Literal['legendre-gauss', 'equiangular']) –

build(n_in_channels, n_out_channels, dataset_info)[source]

Build a nn.Module given information about the input and output channels and the dataset.

Parameters:
  • n_in_channels (int) – number of input channels

  • n_out_channels (int) – number of output channels

  • dataset_info (DatasetInfo) – Information about the dataset, including img_shape, horizontal coordinates, vertical coordinate, etc.

Returns:

a nn.Module

class fme.ace.StaticMaskingConfig(mask_value, fill_value=0.0, exclude_names_and_prefixes=None)[source]

Replace static masked regions with a fill value.

Parameters:
  • mask_value (int) – Value of the mask variable in masked regions. Either 0 or 1.

  • fill_value (Union[Literal['mean'], float], default: 0.0) – A float fill value to use outside of masked regions. Can also be “mean”, in which case the normalizer means are used as channel-specific fill values.

  • exclude_names_and_prefixes (Optional[list[str]], default: None) – Names (2D variables) and prefixes (3D variables) to exclude when applying the mask.

build(mask, means=None)[source]

Build StaticMasking.

Parameters:
  • mask (HasGetMaskTensorFor) –

  • means (Mapping[str, Tensor] | None) –

class fme.ace.StepLossConfig(type='MSE', kwargs=<factory>, global_mean_type=None, global_mean_kwargs=<factory>, global_mean_weight=1.0, sqrt_loss_step_decay_constant=0.0, weights=<factory>)[source]

Loss configuration class that has the same fields as LossConfig but also has additional weights field, and optional step loss decay.

The build method will apply the weights to the inputs of the loss function. The loss returned by build will be a MappingLoss, which takes Dict[str, tensor] as inputs instead of packed tensors.

Parameters:
  • type (Literal['LpLoss', 'MSE', 'AreaWeightedMSE', 'EnsembleLoss'], default: 'MSE') – the type of the loss function

  • kwargs (Mapping[str, Any], default: <factory>) – data for a loss function instance of the indicated type

  • global_mean_type (Optional[Literal['LpLoss']], default: None) – the type of the loss function to apply to the global mean of each sample, by default no loss is applied

  • global_mean_kwargs (Mapping[str, Any], default: <factory>) – data for a loss function instance of the indicated type to apply to the global mean of each sample

  • global_mean_weight (float, default: 1.0) – the weight to apply to the global mean loss relative to the main loss

  • sqrt_loss_step_decay_constant (float, default: 0.0) – the constant to use for the square root loss step decay, alpha in 1/sqrt(1.0 + alpha * step) where step is indexed from 0 for the first step.

  • weights (dict[str, float], default: <factory>) – A dictionary of variable names with individual weights to apply to their normalized losses
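
The step decay formula quoted for sqrt_loss_step_decay_constant can be evaluated directly. The helper name step_loss_weight is hypothetical; only the formula 1/sqrt(1.0 + alpha * step), with step indexed from 0, comes from the description above.

```python
import math


def step_loss_weight(step, sqrt_loss_step_decay_constant=0.0):
    """Weight 1 / sqrt(1 + alpha * step), with step counted from 0."""
    return 1.0 / math.sqrt(1.0 + sqrt_loss_step_decay_constant * step)


# With alpha = 3.0 the weight halves between the first and second steps.
weights = [step_loss_weight(step, 3.0) for step in range(3)]
```

With the default constant of 0.0 every step receives a weight of 1, so no decay is applied.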

class fme.ace.StepMeanEntry(step, name=None)[source]

Configuration for logging mean metrics at a particular step.

Parameters:
  • step (int) –

  • name (str | None) –

step

Number of forward steps after which to log mean metrics. For example, step=20 will log mean metrics at the 20th forward step (i.e. time index n_ic_steps + 19).

name

Name to use for the logged metrics. If None, will use “mean_step_{step}”.
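
The relationship between step, the time index, and the default name can be written out as a short sketch. Both helper names are hypothetical; the arithmetic and the default name pattern follow the attribute descriptions above.

```python
def step_mean_time_index(step, n_ic_steps):
    """Time index of the `step`-th forward step (forward steps counted from 1)."""
    return n_ic_steps + step - 1


def step_mean_name(step, name=None):
    """Logged metric name, falling back to the documented default."""
    return name if name is not None else f"mean_step_{step}"


index = step_mean_time_index(step=20, n_ic_steps=1)
label = step_mean_name(step=20)
```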

class fme.ace.StepSelector(type, config)[source]
Parameters:
classmethod get_available_types()[source]

This class method is used to expose all available types of Steps.

Return type:

set[str]

get_loss_normalizer(extra_names=None, extra_residual_scaled_names=None)[source]
Parameters:
  • extra_names (Optional[list[str]], default: None) – Names of additional variables to include in the loss normalizer.

  • extra_residual_scaled_names (Optional[list[str]], default: None) – extra_names which use residual scale factors, if enabled.

Return type:

StandardNormalizer

Returns:

The loss normalizer.

get_step(dataset_info, init_weights=<function StepSelector.<lambda>>)[source]
Parameters:
  • dataset_info (DatasetInfo) – Information about the training dataset.

  • init_weights (Callable[[list[Module]], None], default: <function StepSelector.<lambda>>) – Function to initialize the weights of the step before wrapping in DistributedDataParallel. This is particularly useful when freezing parameters, as the DistributedDataParallel will otherwise expect frozen weights to have gradients, and will raise an exception.

Return type:

StepABC

Returns:

The state of the stepper.

load()[source]

Update configuration in-place so it does not depend on external files.

property loss_names: list[str]

Names of variables to be included in the loss function.

property next_step_input_names: list[str]

Names of variables required in next_step_input_data for .step.

property output_names: list[str]

Names of variables output by the step.

classmethod register(name)[source]

Register a virtual subclass of an ABC.

Returns the subclass, to allow usage as a class decorator.

Parameters:

name (str) –

replace_prescribed_prognostic_names(names)[source]

Replace prescribed prognostic names (e.g. when loading from checkpoint).

Return type:

None

Parameters:

names (list[str]) –

class fme.ace.Stepper(config, step, dataset_info, input_process_func, output_process_func, derive_func, parameter_initializer, training_history=None)[source]

Stepper class for selectable step configurations.

Parameters:
build_loss(loss_config)[source]

Build a StepLoss from the given config using this stepper’s normalizer and dataset info.

Parameters:

loss_config (StepLossConfig) – The loss configuration to build from.

Return type:

StepLoss

Returns:

A StepLoss built using this stepper’s loss normalizer, gridded operations, loss variable names, and channel dimension.

classmethod from_state(state)[source]

Load the state of the stepper.

Parameters:

state – The state to load.

Return type:

Stepper

Returns:

The stepper.

get_base_weights()[source]

Get the base weights of the stepper.

Return type:

Optional[list[Mapping[str, Any]]]

Returns:

A list of weight dictionaries for each module in the stepper.

get_prediction_generator(initial_condition, forcing_data, n_forward_steps, optimizer)[source]

Predict multiple steps forward given initial condition and forcing data.

Uses low-level inputs and does not compute derived variables, to separate concerns from the predict method.

Parameters:
  • initial_condition (PrognosticState) – The initial condition, containing tensors of shape [n_batch, self.n_ic_timesteps, <horizontal_dims>].

  • forcing_data (BatchData) – The forcing data, containing tensors of shape [n_batch, n_forward_steps + self.n_ic_timesteps, <horizontal_dims>].

  • n_forward_steps (int) – The number of forward steps to predict, corresponding to the data shapes of forcing_data.

  • optimizer (OptimizationABC) – The optimizer to use for updating the module.

Return type:

Generator[dict[str, Tensor], None, None]

Returns:

Generator yielding the output data at each timestep.

get_state()[source]
Returns:

The state of the stepper.

load_state(state)[source]

Load the state of the stepper.

Parameters:

state (dict[str, Any]) – The state to load.

Return type:

None

property modules: ModuleList

Returns: A list of modules being trained.

predict(initial_condition, forcing, compute_derived_variables=False, compute_derived_forcings=True)[source]

Predict multiple steps forward given initial condition and reference data.

Parameters:
  • initial_condition (PrognosticState) – Prognostic state data with tensors of shape [n_batch, self.n_ic_timesteps, <horizontal_dims>]. This data is assumed to contain all prognostic variables and be denormalized.

  • forcing (BatchData) – Contains tensors of shape [n_batch, self.n_ic_timesteps + n_forward_steps, n_lat, n_lon]. This contains the forcing and ocean data for the initial condition and all subsequent timesteps.

  • compute_derived_variables (bool, default: False) – Whether to compute derived variables for the prediction.

  • compute_derived_forcings (bool, default: True) – Whether to compute derived forcing variables for the prediction. Only used to disable computing the derived forcings if they have been computed ahead of time.

Return type:

tuple[BatchData, PrognosticState]

Returns:

A batch data containing the prediction and the prediction’s final state which can be used as a new initial condition.

predict_paired(initial_condition, forcing, compute_derived_variables=False)[source]

Predict multiple steps forward given initial condition and reference data.

Parameters:
  • initial_condition (PrognosticState) – Prognostic state data with tensors of shape [n_batch, self.n_ic_timesteps, <horizontal_dims>]. This data is assumed to contain all prognostic variables and be denormalized.

  • forcing (BatchData) – Contains tensors of shape [n_batch, self.n_ic_timesteps + n_forward_steps, n_lat, n_lon]. This contains the forcing and ocean data for the initial condition and all subsequent timesteps.

  • compute_derived_variables (bool, default: False) – Whether to compute derived variables for the prediction.

Return type:

tuple[PairedData, PrognosticState]

Returns:

A tuple of 1) a paired data object, containing the prediction paired with all target/forcing data at the same timesteps, and 2) the prediction’s final state, which can be used as a new initial condition.

prescribe_sst(mask_data, gen_data, target_data)[source]

Prescribe sea surface temperature onto the generated surface temperature field.

Parameters:
  • mask_data (Mapping[str, Tensor]) – Source for the prescriber mask field.

  • gen_data (Mapping[str, Tensor]) – Contains the generated surface temperature field.

  • target_data (Mapping[str, Tensor]) – Contains the target surface temperature that will be prescribed onto the generated one according to the mask.

Return type:

dict[str, Tensor]

replace_derived_forcings(derived_forcings)[source]

Replace the derived forcings configuration with a new one.

Parameters:

derived_forcings (DerivedForcingsConfig) – The new derived forcings configuration or None.

replace_multi_call(multi_call)[source]

Replace the MultiCall object with a new one. Note this is only meant to be used at inference time and may result in the loss function being unusable.

Parameters:

multi_call (Optional[MultiCallConfig]) – The new multi_call configuration or None.

replace_ocean(ocean)[source]

Replace the ocean model with a new one.

Parameters:

ocean (Optional[OceanConfig]) – The new ocean model configuration or None.

replace_prescribed_prognostic_names(names)[source]

Replace prescribed prognostic names (e.g. when loading from checkpoint).

Parameters:

names (list[str]) – The new list of prescribed prognostic variable names.

Return type:

None

step(args, wrapper=<function Stepper.<lambda>>)[source]

Step the model forward one timestep given input data.

Parameters:
  • args (StepArgs) – The arguments to the step function.

  • wrapper (Callable[[Module], Module], default: <function Stepper.<lambda>>) – Wrapper to apply over each nn.Module before calling.

Return type:

dict[str, Tensor]

Returns:

The denormalized output data at the next time step.

update_training_history(training_job)[source]

Update the stepper’s history of training jobs.

Parameters:

training_job (TrainingJob) – The training job to add to the history.

Return type:

None

class fme.ace.StepperConfig(step, input_masking=None, derived_forcings=<factory>)[source]

Configuration for a stepper.

Parameters:
property all_names: list[str]

Names of all variables.

classmethod from_stepper_state(state)[source]

Initialize a StepperConfig from a stepper state.

This is required for backwards compatibility with older steppers, whose configuration did not provide normalization constants, but rather pointed to files on disk. Newer stepper configurations load these constants into the configuration before checkpoints are saved.

Parameters:

state – The state of the stepper.

Return type:

StepperConfig

Returns:

The stepper config.

get_stepper(dataset_info, parameter_initializer=None, training_history=None)[source]
Parameters:
  • dataset_info (DatasetInfo) – Information about the training dataset.

  • parameter_initializer (Optional[ParameterInitializer], default: None) – The parameter initializer to use for loading weights from an external source. If None, no parameter initialization is applied.

  • training_history (Optional[TrainingHistory], default: None) – History of the stepper’s training jobs.

property input_names: list[str]

Names of variables which are required as inputs.

property loss_names

Names of variables to include in loss.

property next_step_forcing_names: list[str]

Names of variables which are given as inputs but taken from the output timestep.

An example might be solar insolation taken during the output window period.

property output_names: list[str]

Names of variables which are outputs only.

property prognostic_names: list[str]

Names of variables which are both inputs and outputs.
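
One way to picture the input-only, input-and-output, and output-only roles is as a partition of the input and output name lists. The sketch below is an assumption about how such a partition could be computed, not the library's code; the function name classify_names and the variable names in the example are hypothetical.

```python
def classify_names(in_names, out_names):
    """Split variable names by role, preserving the order given."""
    forcing = [n for n in in_names if n not in out_names]      # inputs only
    prognostic = [n for n in in_names if n in out_names]       # inputs and outputs
    diagnostic = [n for n in out_names if n not in in_names]   # outputs only
    return forcing, prognostic, diagnostic


forcing, prognostic, diagnostic = classify_names(
    in_names=["insolation", "temperature"],
    out_names=["temperature", "precipitation"],
)
```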

replace_multi_call(multi_call, state)[source]

Replace the multi-call configuration of self.step and ensure the associated state can be loaded as a multi-call step.

A value of None for multi_call will remove the multi-call configuration.

If the selected type supports it, the multi-call configuration will be updated in place. Otherwise, it will be wrapped in the multi_call step configuration with the given multi_call config or None.

Note this updates self.step in place, but returns a new state dictionary.

Parameters:
  • multi_call (Optional[MultiCallConfig]) – MultiCallConfig for the resulting self.step.

  • state (dict[str, Any]) – state dictionary associated with the loaded step.

Return type:

dict[str, Any]

Returns:

The state dictionary updated to ensure consistency with that of a serialized multi-call step.

replace_prescribed_prognostic_names(names)[source]

Replace prescribed prognostic names (e.g. when loading from checkpoint).

Used for inference / evaluation where the trained checkpoint does not contain prescribed_prognostic_names and we need to overwrite prescribed_prognostic_names.

Return type:

None

Parameters:

names (list[str]) –

class fme.ace.StepperOverrideConfig(ocean='keep', multi_call='keep', derived_forcings='keep', prescribed_prognostic_names='keep')[source]

Configuration for overriding stepper configuration options.

The default value for each parameter is "keep", which denotes that the serialized stepper’s configuration will not be modified when loaded. Passing other values will override the configuration of the loaded stepper.

Parameters:
  • ocean (Union[Literal['keep'], OceanConfig, None], default: 'keep') – Ocean configuration to override that used in producing a serialized stepper.

  • multi_call (Union[Literal['keep'], MultiCallConfig, None], default: 'keep') – MultiCall configuration to override that used in producing a serialized stepper.

  • derived_forcings (Union[Literal['keep'], DerivedForcingsConfig], default: 'keep') – Derived forcings configuration to override that used in producing a serialized stepper.

  • prescribed_prognostic_names (Union[Literal['keep'], list[str]], default: 'keep') – List of prognostic variable names to overwrite from forcing at each step during inference.
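
The "keep" default acts as a sentinel that is distinct from None, so that None can itself be a meaningful override (for example, removing the ocean configuration). A minimal sketch of that pattern, using a hypothetical helper resolve_override and placeholder values rather than real configuration objects:

```python
def resolve_override(serialized_value, override="keep"):
    """Return the serialized value for "keep", otherwise the override (even None)."""
    if isinstance(override, str) and override == "keep":
        return serialized_value
    return override


kept = resolve_override({"ocean": "serialized"})
removed = resolve_override({"ocean": "serialized"}, None)
replaced = resolve_override(["a"], ["b", "c"])
```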

class fme.ace.TimeCoarsenConfig(coarsen_factor, method='block_mean')[source]

Config for inference data time coarsening.

Parameters:
  • coarsen_factor (int) – Factor by which to coarsen in time, an integer 1 or greater. The resulting time labels will be coarsened to the mean of the original labels.

  • method (Literal['block_mean'], default: 'block_mean') – Method to use for coarsening, currently only “block_mean” is supported.

n_coarsened_timesteps(n_timesteps)[source]

Assumes initial condition is NOT in n_timesteps.

Return type:

int

Parameters:

n_timesteps (int) –
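
As an illustration of “block_mean” coarsening, the sketch below averages consecutive blocks of values and applies the same averaging to the time labels. The function name block_mean is hypothetical, the data are plain lists of floats, and dropping trailing values that do not fill a whole block is an assumption rather than documented behavior.

```python
def block_mean(values, coarsen_factor):
    """Average consecutive blocks of `coarsen_factor` values.

    Assumption: trailing values that do not fill a whole block are dropped.
    """
    if coarsen_factor < 1:
        raise ValueError("coarsen_factor must be 1 or greater")
    n_blocks = len(values) // coarsen_factor
    return [
        sum(values[i * coarsen_factor:(i + 1) * coarsen_factor]) / coarsen_factor
        for i in range(n_blocks)
    ]


data = block_mean([1.0, 3.0, 5.0, 7.0], coarsen_factor=2)
# Time labels (here hours since the start) are averaged the same way.
times = block_mean([6.0, 12.0, 18.0, 24.0], coarsen_factor=2)
```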

class fme.ace.TimeLengthMilestone(epoch, value)[source]

A milestone for a time length schedule.

Parameters:
class fme.ace.TimeLengthProbabilities(outcomes)[source]
Parameters:

outcomes (list[fme.ace.stepper.time_length_probabilities.TimeLengthProbability]) –

initialize_rng()[source]

Set the rng at runtime. This helps guarantee that the distributed seed has already been set.

sample()[source]

Update the current number of timesteps to sample based on the probabilities of sampling each number of timesteps.

Return type:

int
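
The sampling behaviour can be sketched with the standard library. Here each outcome is a plain (steps, probability) tuple rather than a TimeLengthProbability, and sample_steps is a hypothetical helper, so this shows the idea rather than the library's implementation.

```python
import random


def sample_steps(outcomes, rng):
    """Draw one number of timesteps according to the given probabilities."""
    steps = [s for s, _ in outcomes]
    weights = [p for _, p in outcomes]
    return rng.choices(steps, weights=weights, k=1)[0]


outcomes = [(1, 0.5), (2, 0.3), (4, 0.2)]

# The generator is created separately from the outcomes, loosely mirroring
# the idea of initialize_rng(): build the RNG only once seeding is settled.
rng = random.Random(0)
draws = [sample_steps(outcomes, rng) for _ in range(1000)]
```

Every draw is one of the configured step counts; over many draws their frequencies approach the configured probabilities.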

class fme.ace.TimeLengthProbability(steps, probability)[source]
Parameters:
  • steps (int) –

  • probability (float) –

class fme.ace.TimeLengthSchedule(start_value, milestones)[source]

A schedule for a time length value.

Parameters:
classmethod from_constant(value)[source]

Create a TimeLengthSchedule that always returns the same value.

Parameters:

value (TimeLengthProbabilities | int) – The constant value.

Return type:

TimeLengthSchedule

Returns:

A TimeLengthSchedule instance.

property max_n_forward_steps: IntSchedule

Get a schedule of the maximum number of forward steps.

class fme.ace.TimeSlice(start_time=None, stop_time=None, step=None)[source]

Configuration of a slice of times. Step is an integer-valued index step.

Note: start_time and stop_time may be provided as partial time strings, and the stop_time will be included in the slice. See the Xarray docs for more details.

Parameters:
  • start_time (Optional[str], default: None) – Start time of the slice.

  • stop_time (Optional[str], default: None) – Stop time of the slice.

  • step (Optional[int], default: None) – Step of the slice.

as_raw_slice()[source]

Return the raw slice object without applying it to a time index, e.g. for use directly as a selector on an xarray object.

Return type:

slice

slice(time)[source]

Return a slice object with indexing based on the provided time index.

Return type:

slice

Parameters:

time (CFTimeIndex) –
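
To illustrate what an inclusive stop with partial time strings means, here is a small pure-Python sketch over sorted ISO-8601 strings. The function label_slice is hypothetical; in the library the conversion is done against a CFTimeIndex by xarray's indexing machinery, not by string comparison.

```python
from bisect import bisect_left, bisect_right


def label_slice(times, start=None, stop=None, step=None):
    """Convert label bounds into an integer slice over sorted ISO strings."""
    lo = 0 if start is None else bisect_left(times, start)
    # "\uffff" sorts after any character used in a timestamp, so every time
    # that begins with the partial `stop` string falls inside the slice.
    hi = len(times) if stop is None else bisect_right(times, stop + "\uffff")
    return slice(lo, hi, step)


times = [
    "2020-01-01T00:00:00",
    "2020-01-01T06:00:00",
    "2020-01-02T00:00:00",
    "2020-01-03T00:00:00",
]

# The partial stop "2020-01-02" includes the timestamp on that day.
sl = label_slice(times, start="2020-01-01T06", stop="2020-01-02")
selected = times[sl]
```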

class fme.ace.TimestampList(times, timestamp_format='%Y-%m-%dT%H:%M:%S')[source]

Configuration for a list of timestamps.

Parameters:
  • times (Sequence[str]) – List of timestamps.

  • timestamp_format (str, default: '%Y-%m-%dT%H:%M:%S') – Format of the timestamps.
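
The default timestamp_format is a strftime-style pattern. The sketch below parses strings with the standard library's datetime as an illustration of the format only; parse_timestamps is a hypothetical helper, and the library may produce a different time type (for example cftime objects).

```python
from datetime import datetime

DEFAULT_FORMAT = "%Y-%m-%dT%H:%M:%S"


def parse_timestamps(times, timestamp_format=DEFAULT_FORMAT):
    """Parse each string with the given strftime-style format."""
    return [datetime.strptime(t, timestamp_format) for t in times]


parsed = parse_timestamps(["2021-01-01T00:00:00", "2021-07-01T06:00:00"])

# A non-default layout needs a matching format string.
custom = parse_timestamps(["2021/01/01 12"], timestamp_format="%Y/%m/%d %H")
```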

class fme.ace.TrainAggregatorConfig(spherical_power_spectrum=True, weighted_rmse=True, per_channel_loss=True)[source]

Configuration for the train aggregator.

Parameters:
  • spherical_power_spectrum (bool) –

  • weighted_rmse (bool) –

  • per_channel_loss (bool) –

spherical_power_spectrum

Whether to compute the spherical power spectrum.

weighted_rmse

Whether to compute the weighted RMSE.

per_channel_loss

Whether to accumulate and report per-variable (per-channel) loss in get_logs (e.g. train/mean/loss/<var_name>).

class fme.ace.TrainConfig(train_loader, validation_loader, stepper, optimization, logging, max_epochs, save_checkpoint, experiment_dir, inference, stepper_training=<factory>, train_aggregator=<factory>, seed=None, copy_weights_after_batch=<factory>, ema=<factory>, additional_inference=<factory>, validate_using_ema=False, checkpoint_save_epochs=None, ema_checkpoint_save_epochs=None, log_train_every_n_batches=100, train_evaluation_samples=1000, checkpoint_every_n_batches=1000, segment_epochs=None, save_per_epoch_diagnostics=False, validation_aggregator=<factory>, evaluate_before_training=False, save_best_inference_epoch_checkpoints=False, lr_tuning=None, resume_results=None)[source]

Configuration for training a model.

Parameters:
  • train_loader (DataLoaderConfig) – Configuration for the training data loader.

  • validation_loader (DataLoaderConfig) – Configuration for the validation data loader.

  • stepper (StepperConfig | CheckpointStepperConfig) – Configuration for the stepper.

  • optimization (OptimizationConfig) – Configuration for the optimization.

  • logging (LoggingConfig) – Configuration for logging.

  • max_epochs (int) – Total number of epochs to train for.

  • save_checkpoint (bool) – Whether to save checkpoints. If false, no checkpoints are saved regardless of other checkpoint configuration settings. If true, checkpoints are saved at the end of the training loop, after evaluation, and on catching a termination signal.

  • experiment_dir (str) – Directory where checkpoints and logs are saved. For the time being, this must be a local directory.

  • inference (Optional[InlineInferenceConfig]) – Configuration for inline inference. If None, no inline inference is run, and no “best_inline_inference” checkpoint will be saved.

  • additional_inference (list[AdditionalInferenceConfig], default: <factory>) – Configurations for additional inference runs. Each entry has a name (used as wandb log prefix) and config. Not used to select checkpoints, but used to provide metrics.

  • stepper_training (TrainStepperConfig, default: <factory>) – Training-specific configuration including loss, ensemble settings, parameter initialization, and forward step scheduling.

  • train_aggregator (TrainAggregatorConfig, default: <factory>) – Configuration for the train aggregator.

  • seed (Optional[int], default: None) – Random seed for reproducibility. If set, it is used for all types of randomization, including data shuffling and model initialization. If unset, weight initialization is not reproducible but data shuffling is.

  • copy_weights_after_batch (list[CopyWeightsConfig], default: <factory>) – Configuration for copying weights from the base model to the training model after each batch.

  • ema (EMAConfig, default: <factory>) – Configuration for exponential moving average of model weights.

  • validate_using_ema (bool, default: False) – Whether to validate and perform inference using the EMA model.

  • checkpoint_save_epochs (Optional[Slice], default: None) – How often to save epoch-based checkpoints, if save_checkpoint is True. If None, checkpoints are only saved for the most recent epoch (and the best epochs if validate_using_ema == False).

  • ema_checkpoint_save_epochs (Optional[Slice], default: None) – How often to save epoch-based EMA checkpoints, if save_checkpoint is True. If None, EMA checkpoints are only saved for the most recent epoch (and the best epochs if validate_using_ema == True).

  • log_train_every_n_batches (int, default: 100) – How often to log batch_loss during training.

  • train_evaluation_samples (int, default: 1000) – Number of samples to evaluate on after training on each epoch. Any samples left over after dividing by the batch size are discarded.

  • checkpoint_every_n_batches (int, default: 1000) – How often to save latest checkpoint during training. If 0 is given, checkpoints will not be saved based on batch progress, only other factors like pre-emption or being at the end of an epoch.

  • segment_epochs (Optional[int], default: None) – Exit after training for at most this many epochs in current job, without exceeding max_epochs. Use this if training must be run in segments, e.g. due to wall clock limit.

  • save_per_epoch_diagnostics (bool, default: False) – Whether to save per-epoch diagnostics from training, validation and inline inference aggregators.

  • validation_aggregator (OneStepAggregatorConfig, default: <factory>) – Configuration for the validation aggregator.

  • evaluate_before_training (bool, default: False) – Whether to run validation and inline inference before any training is done.

  • save_best_inference_epoch_checkpoints (bool, default: False) – Whether to save a separate checkpoint for each epoch where best_inference_error achieves a new minimum. Checkpoints are saved as best_inference_ckpt_XXXX.tar.

  • resume_results (Optional[ResumeResultsConfig], default: None) – Configuration for resuming a previously stopped or finished training job. When provided and experiment_dir has no training_checkpoints subdirectory, then it is assumed that this is a new run to resume a previously completed run and resume_results.existing_dir is recursively copied to experiment_dir.

  • lr_tuning (LRTuningConfig | None) –
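
Two pieces of arithmetic in the parameter descriptions above can be made concrete: how segment_epochs interacts with max_epochs, and how train_evaluation_samples is truncated to whole batches. Both helpers are hypothetical and only restate the documented behaviour; the batch size of 16 is an arbitrary value chosen for the example.

```python
def segment_end_epoch(start_epoch, max_epochs, segment_epochs=None):
    """Epoch at which the current job stops, never exceeding max_epochs."""
    if segment_epochs is None:
        return max_epochs
    return min(start_epoch + segment_epochs, max_epochs)


def evaluation_batches(train_evaluation_samples, batch_size):
    """Number of whole batches evaluated; leftover samples are discarded."""
    return train_evaluation_samples // batch_size


# A 10-epoch run split into jobs of at most 4 epochs each.
first_job_end = segment_end_epoch(0, max_epochs=10, segment_epochs=4)
last_job_end = segment_end_epoch(8, max_epochs=10, segment_epochs=4)

# With the default of 1000 samples and a batch size of 16.
n_batches = evaluation_batches(1000, batch_size=16)
samples_used = n_batches * 16
```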

property checkpoint_dir: str

The directory where checkpoints are saved.

property output_dir: str

The directory where output files are saved.

class fme.ace.TrainStepperConfig(loss=<factory>, optimize_last_step_only=False, n_ensemble=-1, n_forward_steps=None, parameter_init=<factory>)[source]

Configuration for training-specific aspects of a stepper.

Parameters:
  • loss (StepLossConfig, default: <factory>) – The loss configuration.

  • optimize_last_step_only (bool, default: False) – Whether to optimize only the last step.

  • n_ensemble (int, default: -1) – The number of ensemble members evaluated for each training batch member. Default is 2 if the loss type is EnsembleLoss, otherwise the default is 1. Must be 2 for EnsembleLoss to be valid.

  • n_forward_steps (UnionType[TimeLengthProbabilities, int, TimeLengthSchedule, None], default: None) – The number of timesteps to train on and associated sampling probabilities. By default, the stepper will train on the full number of timesteps present in the training dataset samples. Values must be less than or equal to the number of timesteps present in the training dataset samples.

  • parameter_init (ParameterInitializationConfig, default: <factory>) – The parameter initialization configuration for fine-tuning.
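
The n_ensemble description implies that -1 acts as a placeholder that is resolved from the loss type. A minimal sketch of that resolution follows; resolve_n_ensemble is hypothetical, and representing the loss type as a plain string is an assumption made for this example, not the library's representation.

```python
def resolve_n_ensemble(n_ensemble, loss_type):
    """Resolve the -1 placeholder to the documented per-loss default."""
    is_ensemble_loss = loss_type == "EnsembleLoss"
    if n_ensemble == -1:
        return 2 if is_ensemble_loss else 1
    if is_ensemble_loss and n_ensemble != 2:
        raise ValueError("n_ensemble must be 2 when using EnsembleLoss")
    return n_ensemble


default_for_ensemble = resolve_n_ensemble(-1, "EnsembleLoss")
default_otherwise = resolve_n_ensemble(-1, "MSE")  # "MSE" is a placeholder name
explicit = resolve_n_ensemble(4, "MSE")
```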

get_train_stepper(stepper_config, dataset_info, load_weights_and_history_fn=<function load_weights_and_history>)[source]

Build a TrainStepper from this configuration and a StepperConfig.

Builds the ParameterInitializer from this config’s parameter_init, passes it to StepperConfig.get_stepper() for weight loading and freezing during model construction, then wraps the result in a TrainStepper.

Parameters:
  • stepper_config (StepperConfig) – The stepper configuration for building the model.

  • dataset_info (DatasetInfo) – Information about the training dataset.

  • load_weights_and_history_fn (Callable[[Optional[str]], tuple[Optional[list[Mapping[str, Any]]], TrainingHistory]], default: <function load_weights_and_history>) – Function for loading weights and history. Default implementation loads a Trainer checkpoint containing a Stepper.

Return type:

TrainStepper

Returns:

A TrainStepper wrapping the built stepper with training functionality.

class fme.ace.UNetDecoderConfig(conv_block, up_sampling_block, output_layer, recurrent_block=None, n_channels=<factory>, n_layers=<factory>, output_channels=1, dilations=None, enable_nhwc=False, enable_healpixpad=False)[source]

Configuration for the UNet Decoder.

Parameters:
  • conv_block (ConvBlockConfig) – Configuration for the convolutional block.

  • up_sampling_block (ConvBlockConfig) – Configuration for the up-sampling block.

  • output_layer (ConvBlockConfig) – Configuration for the output layer block.

  • recurrent_block (Optional[RecurrentBlockConfig], default: None) – Configuration for the recurrent block, by default None.

  • n_channels (List[int], default: <factory>) – Number of channels for each layer, by default (34, 68, 136).

  • n_layers (List[int], default: <factory>) – Number of layers in each block, by default (1, 2, 2).

  • output_channels (int, default: 1) – Number of output channels, by default 1.

  • dilations (Optional[list], default: None) – List of dilation rates for the layers, by default None.

  • enable_nhwc (bool, default: False) – Flag to enable NHWC data format, by default False.

  • enable_healpixpad (bool, default: False) – Flag to enable HEALPix padding, by default False.

build()[source]

Builds the UNet Decoder model.

Return type:

Module

Returns:

UNet Decoder model.

class fme.ace.UNetEncoderConfig(conv_block, down_sampling_block, input_channels=3, n_channels=<factory>, n_layers=<factory>, dilations=None, enable_nhwc=False, enable_healpixpad=False)[source]

Configuration for the UNet Encoder.

Parameters:
  • conv_block (ConvBlockConfig) – Configuration for the convolutional block.

  • down_sampling_block (DownsamplingBlockConfig) – Configuration for the down-sampling block.

  • input_channels (int, default: 3) – Number of input channels, by default 3.

  • n_channels (List[int], default: <factory>) – Number of channels for each layer, by default (136, 68, 34).

  • n_layers (List[int], default: <factory>) – Number of layers in each block, by default (2, 2, 1).

  • dilations (Optional[list], default: None) – List of dilation rates for the layers, by default None.

  • enable_nhwc (bool, default: False) – Flag to enable NHWC data format, by default False.

  • enable_healpixpad (bool, default: False) – Flag to enable HEALPix padding, by default False.

build()[source]

Builds the UNet Encoder model.

Return type:

Module

Returns:

UNet Encoder model.

class fme.ace.ValidationConfig(loader, aggregator=<factory>, stepper_training=<factory>)[source]

Configuration for running “validation” within an inference evaluator job.

This mirrors the validation loop performed at the end of each training epoch, producing metrics like val/mean/weighted_rmse and val/mean/loss. A possible use case is to configure loader so that it matches the validation data loader used during training, but other periods or datasets that are compatible with the checkpoint may also be used.

Parameters:
  • loader (DataLoaderConfig) – Data loader configuration for validation data. Uses the same DataLoaderConfig as training data loaders.

  • aggregator (OneStepAggregatorConfig, default: <factory>) – Configuration for the one-step validation aggregator.

  • stepper_training (TrainStepperConfig, default: <factory>) – Training-specific configuration including loss, ensemble settings, and forward step scheduling. Set this to match the training configuration if you want val/mean/loss to be directly comparable. The number of forward steps is derived from stepper_training.n_forward_steps (defaults to 1 if unset).

get_n_forward_steps()[source]

Resolve the effective number of forward steps for validation.

Derives the value from stepper_training.n_forward_steps. Defaults to 1 for standard single-step validation if unset.

Return type:

int

class fme.ace.ValueConfig(value, dtype='float32')[source]

Configuration for specifying a solar constant value.

Parameters:
  • value (float) – Scalar solar constant value to use at all times.

  • dtype (str, default: 'float32') – Data type for the solar constant and the resulting insolation.

class fme.ace.XarrayDataConfig(data_path, file_pattern='*.nc', n_repeats=1, engine='netcdf4', spatial_dimensions='latlon', subset=<factory>, infer_timestep=True, dtype='float32', overwrite=<factory>, fill_nans=None, isel=<factory>, labels=None)[source]
Parameters:
  • data_path (str) – Path to the data.

  • file_pattern (str, default: '*.nc') – Glob pattern to match files in the data_path.

  • n_repeats (int, default: 1) – Number of times to repeat the dataset (in time). It is up to the user to ensure that the input dataset to repeat results in data that is reasonably continuous across repetitions.

  • engine (Literal['netcdf4', 'h5netcdf', 'zarr'], default: 'netcdf4') – Backend used in xarray.open_dataset call.

  • spatial_dimensions (Literal['healpix', 'latlon'], default: 'latlon') – Specifies the spatial dimensions for the grid, default is lat/lon. If ‘latlon’, it is assumed that the last two dimensions are latitude and longitude, respectively. If ‘healpix’, it is assumed that the last three dimensions are face, height, and width, respectively.

  • subset (Slice | TimeSlice | RepeatedInterval, default: <factory>) – Slice defining a subset of the XarrayDataset to load. This can either be a Slice of integer indices or a TimeSlice of timestamps. This feature is applied directly to the dataset samples. For example, if the file(s) have the time coordinate (t0, t1, t2, t3) and requirements.n_timesteps=2, then subset=Slice(stop=2) will provide two samples: (t0, t1), (t1, t2).

  • infer_timestep (bool, default: True) – Whether to infer the timestep from the provided data. This should be set to True (the default) for ACE training. It may be useful to toggle this to False for applications like downscaling, which do not depend on the timestep of the data and therefore lack the additional requirement that the data be ordered and evenly spaced in time. It must be set to True if n_repeats > 1 in order to be able to infer the full time coordinate.

  • dtype (Optional[str], default: 'float32') – Data type to cast the data to. If None, no casting is done. It is required that ‘torch.{dtype}’ is a valid dtype.

  • overwrite (OverwriteConfig, default: <factory>) – Optional OverwriteConfig to overwrite loaded field values.

  • fill_nans (Optional[FillNaNsConfig], default: None) – Optional FillNaNsConfig to fill NaNs with a constant value.

  • isel (Mapping[str, Slice | int], default: <factory>) – Optional xarray isel arguments to be passed to the dataset. Will raise ValueError if time is included here, since the subset argument is used specifically for selecting times. Horizontal dimensions are also not currently supported.

  • labels (Optional[list[str]], default: None) – Optional list of labels to be returned with the data.
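
The subset example in the parameter list can be worked through in plain Python. The helper windows is hypothetical; it builds overlapping samples of n_timesteps and shows that the slice is applied to the samples, not to the raw time axis.

```python
def windows(times, n_timesteps):
    """All consecutive samples of length n_timesteps."""
    return [
        tuple(times[i:i + n_timesteps])
        for i in range(len(times) - n_timesteps + 1)
    ]


times = ["t0", "t1", "t2", "t3"]

all_samples = windows(times, 2)          # three overlapping samples
subset = all_samples[slice(None, 2)]     # Slice(stop=2) applied to samples

# For contrast: slicing the time axis first would leave a single sample.
time_sliced = windows(times[slice(None, 2)], 2)
```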

Examples

If data is stored in a directory with multiple netCDF files which can be concatenated along the time dimension, use:

>>> fme.ace.XarrayDataConfig(data_path="/some/directory", file_pattern="*.nc") 

If data is stored in a single zarr store at /some/directory/dataset.zarr, use:

>>> fme.ace.XarrayDataConfig(
...     data_path="/some/directory",
...     file_pattern="dataset.zarr",
...     engine="zarr"
... )

property available_labels: set[str] | None

Return the labels that are available in the dataset.

fme.ace.get_forcing_data(config, total_forward_steps, window_requirements, initial_condition, surface_temperature_name=None, ocean_fraction_name=None, label_override=None)[source]

Return a GriddedData loader for forcing data based on the initial condition. This function determines the start indices for the forcing data based on the initial time in the provided initial condition.

Parameters:
  • config (ForcingDataLoaderConfig) – Parameters for the forcing data loader.

  • total_forward_steps (int) – Total number of forward steps to take over the course of inference.

  • window_requirements (DataRequirements) – Data requirements for the forcing data.

  • initial_condition (PrognosticState) – Initial condition for the inference.

  • surface_temperature_name (Optional[str], default: None) – Name of the surface temperature variable. Can be set to None if no ocean temperature prescribing is being used.

  • ocean_fraction_name (Optional[str], default: None) – Name of the ocean fraction variable. Can be set to None if no ocean temperature prescribing is being used.

  • label_override (Optional[list[str]], default: None) – Labels for the forcing data. If provided, these labels will be provided on each sample instead of the labels in the dataset.

Return type:

InferenceGriddedData

Returns:

A data loader for forcing data with coordinates and metadata.
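
The idea of locating the forcing data from the initial time can be sketched as follows. The helper forcing_window is hypothetical, and its window length (the initial time plus total_forward_steps further times) is an assumption of this sketch; the library's actual windowing depends on window_requirements.

```python
def forcing_window(forcing_times, initial_time, total_forward_steps):
    """Slice of forcing times starting at the initial condition time."""
    try:
        start = forcing_times.index(initial_time)
    except ValueError:
        raise ValueError(
            f"initial time {initial_time!r} not found in forcing data"
        ) from None
    # Assumption: the window covers the initial time plus one time per step.
    stop = start + total_forward_steps + 1
    if stop > len(forcing_times):
        raise ValueError("forcing data ends before the requested rollout")
    return slice(start, stop)


forcing_times = [
    "2020-01-01T00",
    "2020-01-01T06",
    "2020-01-01T12",
    "2020-01-01T18",
    "2020-01-02T00",
]
window = forcing_window(forcing_times, "2020-01-01T06", total_forward_steps=2)
selected = forcing_times[window]
```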

fme.ace.get_initial_condition(ds, prognostic_names, labels=None, n_ensemble=1)[source]

Given a dataset, extract a mapping of variables to tensors and the time coordinate corresponding to the initial conditions.

Parameters:
  • ds (Dataset) – Dataset containing initial condition data. Must include prognostic_names as variables, and they must each have shape (n_samples, n_lat, n_lon). Dataset must also include a ‘time’ variable with length n_samples.

  • prognostic_names (Sequence[str]) – Names of prognostic variables to extract from the dataset.

  • labels (Optional[list[str]], default: None) – Labels for the initial conditions. If provided, these labels will be provided to the stepper for every initial condition.

  • n_ensemble (int, default: 1) – Number of ensemble members per initial state.

Return type:

PrognosticState

Returns:

The initial condition and the time coordinate.

fme.coupled

class fme.coupled.ComponentInitialConditionConfig(path, engine='netcdf4')[source]
Parameters:
  • path (str) – Path to the component initial condition dataset.

  • engine (Literal['netcdf4', 'h5netcdf', 'zarr'], default: 'netcdf4') – Backend used in xarray.open_dataset call.

class fme.coupled.CoupledDataWriterConfig(ocean=<factory>, atmosphere=<factory>)[source]

Configuration for coupled inference data writers.

Parameters:
  • ocean (DataWriterConfig, default: <factory>) – Configuration for ocean data writer.

  • atmosphere (DataWriterConfig, default: <factory>) – Configuration for atmosphere data writer.

class fme.coupled.CoupledForcingDataLoaderConfig(atmosphere, ocean=None, num_data_workers=0)[source]
Parameters:
class fme.coupled.CoupledInitialConditionConfig(ocean, atmosphere, start_indices=None)[source]

Configuration for initial conditions in coupled inference.

Parameters:
class fme.coupled.InferenceConfig(experiment_dir, n_coupled_steps, checkpoint_path, logging, initial_condition, forcing_loader, coupled_steps_in_memory=1, data_writer=<factory>, aggregator=<factory>, n_ensemble_per_ic=1)[source]

Configuration for running inference.

Parameters:
  • experiment_dir (str) – Directory to save results to.

  • n_coupled_steps (int) – Number of steps to run the model forward for.

  • checkpoint_path (str | StandaloneComponentCheckpointsConfig) – Path to a CoupledStepper training checkpoint to load, or a mapping to two separate Stepper training checkpoints.

  • logging (LoggingConfig) – Configuration for logging.

  • initial_condition (CoupledInitialConditionConfig) – Configuration for initial condition data.

  • forcing_loader (CoupledForcingDataLoaderConfig) – Configuration for forcing data.

  • coupled_steps_in_memory (int, default: 1) – Number of coupled steps to complete in memory at a time; one additional step is loaded for the initial condition.

  • data_writer (CoupledDataWriterConfig, default: <factory>) – Configuration for data writers.

  • aggregator (InferenceAggregatorConfig, default: <factory>) – Configuration for inference aggregator.

  • n_ensemble_per_ic (int, default: 1) – Number of ensemble members per initial condition.