API Reference

Glossary

Variable Definition
N # of Batches
Cin # of input channels (i.e. features)
D or Db Data or Batch depth (z)
H or Hb Data or Batch height (y)
W or Wb Data or Batch width (x)

Train/Evaluate


Train

CLASS

sapsan.lib.experiments.train.Train(model: Estimator, data_parameters: dict, backend = FakeBackend(), show_log = True, run_name = 'train')

Call Train to set up your run
Parameters
Name Type Description Default
model object model to use for training
data_parameters dict data parameters from the data loader, necessary for tracking
backend object backend to track the experiment FakeBackend()
show_log bool show the loss vs. epoch progress plot (it will be saved in MLflow in either case) True
run_name str 'run name' tag as recorded under MLflow train

sapsan.lib.experiments.train.Train.run()

Run the model
Return
Type Description
pytorch or sklearn or custom type trained model

Evaluate

CLASS

sapsan.lib.experiments.evaluate.Evaluate(model: Estimator, data_parameters: dict, backend = FakeBackend(), cmap: str = 'plasma', run_name: str = 'evaluate', **kwargs)

Call Evaluate to set up the testing of the trained model. Don't forget to update estimator.loaders with the new data for testing.
Parameters
Name Type Description Default
model object model to use for testing
data_parameters dict data parameters from the data loader, necessary for tracking
backend object backend to track the experiment FakeBackend()
cmap str matplotlib colormap to use for slice plots plasma
run_name str 'run name' tag as recorded under MLflow evaluate
pdf_xlim tuple x-axis limits for the PDF plot
pdf_ylim tuple y-axis limits for the PDF plot

sapsan.lib.experiments.evaluate.Evaluate.run()

Run the evaluation of the trained model
Return
Type Description
dict{'target' : np.ndarray, 'predict' : np.ndarray} target and predicted data

Estimators


CNN3d

CLASS

sapsan.lib.estimator.CNN3d(loaders, config, model)

A model based on Pytorch's 3D Convolutional Neural Network
Parameters
Name Type Description Default
loaders dict contains input and target data (loaders['train'], loaders['valid']). Datasets themselves have to be torch.tensor(s)
config class configuration to use for the model CNN3dConfig()
model class the model itself - should not be adjusted CNN3dModel()

sapsan.lib.estimator.CNN3d.save(path: str)

Saves model and optimizer states, as well as final epoch and loss
Parameters
Name Type Description Default
path str save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively

sapsan.lib.estimator.CNN3d.load(path: str, estimator, load_saved_config = False)

Loads model and optimizer states, as well as final epoch and loss
Parameters
Name Type Description Default
path str save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively
estimator estimator an initialized estimator into which the weights are loaded. It can carry a new config, e.g. a larger n_epochs, to keep training the model further.
load_saved_config bool updates config parameters from {path}/params.json. False

Return

Type Description
pytorch model loaded model
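
The effect of load_saved_config can be pictured with a small stand-alone sketch. It does not call sapsan: read_saved_params and merge_config are hypothetical helpers, and the keys written to params.json here are assumed for illustration only.

```python
import json
import os
import tempfile


def read_saved_params(path):
    """Read {path}/params.json and return its contents as a dict."""
    with open(os.path.join(path, "params.json")) as f:
        return json.load(f)


def merge_config(current, saved, load_saved_config=False):
    """Return the config to use: saved values win only when requested."""
    merged = dict(current)
    if load_saved_config:
        merged.update(saved)
    return merged


with tempfile.TemporaryDirectory() as path:
    # Stand-in for the params.json that save(path) is documented to write.
    with open(os.path.join(path, "params.json"), "w") as f:
        json.dump({"n_epochs": 5, "lr": 1e-3}, f)
    saved = read_saved_params(path)

new_config = {"n_epochs": 20, "lr": 1e-3}
kept = merge_config(new_config, saved, load_saved_config=False)
restored = merge_config(new_config, saved, load_saved_config=True)
```

With load_saved_config = False the new n_epochs is kept, which matches the documented use case of training a loaded model further.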

CLASS

sapsan.lib.estimator.CNN3dConfig(n_epochs, patience, min_delta, logdir, lr, min_lr, *args, **kwargs)

Configuration for the CNN3d - based on pytorch and catalyst libraries
Parameters
Name Type Description Default
n_epochs int number of epochs 1
patience int number of epochs with no improvement after which training will be stopped 10
min_delta float minimum change in the monitored metric to qualify as an improvement, i.e. an absolute change of less than min_delta will count as no improvement 1e-5
log_dir str path to store the logs ./logs/
lr float learning rate 1e-3
min_lr float a lower bound of the learning rate for ReduceLROnPlateau lr*1e-2
device str specify the device to run the model on cuda (or switch to cpu)
loader_key str the loader to use for early stopping: train or valid first loader provided, which is usually 'train'
metric_key str the metric to use for early stopping 'loss'
ddp bool turn on Distributed Data Parallel (DDP) to distribute the data and train the model across multiple GPUs. This is passed to Catalyst to activate the ddp flag in the runner (see the Distributed Training Tutorial; the runner is set up in pytorch_estimator.py). Note: Jupyter notebooks are not supported, so prepare a script. False
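
How patience and min_delta interact can be sketched in plain Python. This is only an illustration of the rule as described in the table; early_stop_epoch is a hypothetical helper, and the Catalyst callback used by sapsan may differ in detail.

```python
def early_stop_epoch(losses, patience=10, min_delta=1e-5):
    """Return the epoch index at which training would stop, or None.

    An epoch counts as an improvement only if the monitored loss drops
    by at least min_delta below the best value seen so far.
    """
    best = float("inf")
    stale = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(losses):
        if best - loss >= min_delta:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return None


losses = [1.0, 0.5, 0.4, 0.4, 0.4, 0.4]
stop = early_stop_epoch(losses, patience=3, min_delta=1e-5)
```

Here the loss stops improving after epoch 2, so with patience=3 the run ends at epoch 5.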

PIMLTurb

CLASS

sapsan.lib.estimator.PIMLTurb(activ, loss, loaders, ks_stop, ks_frac, ks_scale, l1_scale, l1_beta, sigma, config, model)

Physics-informed machine learning model to predict Reynolds-like stress tensor, \(Re\), for turbulence modeling. Learn more on the wiki: PIMLTurb
A custom loss function was developed for this model combining spatial (SmoothL1) and statistical (Kolmogorov-Smirnov) losses.
Parameters
Name Type Description Default
activ str activation function to use from PyTorch Tanhshrink
loss str loss function to use; accepts only custom SmoothL1_KSLoss
loaders dict contains input and target data (loaders['train'], loaders['valid']). Datasets themselves have to be torch.tensor(s)
ks_stop float early-stopping condition based on the KS loss value alone 0.1
ks_frac float fraction the KS loss contributes to the total loss 0.5
ks_scale float scale factor to prioritize KS loss over SmoothL1 (should not be altered) 1
l1_scale float scale factor to prioritize SmoothL1 loss over KS 1
l1_beta float \(\beta\) threshold for smoothing the L1 loss 1
sigma float \(\sigma\) for the last layer of the network that performs a filtering operation using a Gaussian kernel 1
config class configuration to use for the model PIMLTurbConfig()
model class the model itself - should not be adjusted PIMLTurbModel()
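
The roles of l1_beta, ks_frac, ks_scale and l1_scale can be illustrated with scalar values. smooth_l1 below follows the standard SmoothL1 definition; combined_loss is a hypothetical weighting chosen for illustration, not the formula used by SmoothL1_KSLoss.

```python
def smooth_l1(diff, beta=1.0):
    """SmoothL1 for a single residual: quadratic below beta, linear above."""
    a = abs(diff)
    if a < beta:
        return 0.5 * a * a / beta
    return a - 0.5 * beta


def combined_loss(l1_loss, ks_loss, ks_frac=0.5, ks_scale=1.0, l1_scale=1.0):
    """Assumed weighting: ks_frac sets the share of the KS term in the total."""
    return (1.0 - ks_frac) * l1_scale * l1_loss + ks_frac * ks_scale * ks_loss


small = smooth_l1(0.5, beta=1.0)  # quadratic branch
large = smooth_l1(2.0, beta=1.0)  # linear branch
total = combined_loss(l1_loss=0.2, ks_loss=0.4)
```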

sapsan.lib.estimator.PIMLTurb.save(path: str)

Saves model and optimizer states, as well as final epoch and loss
Parameters
Name Type Description Default
path str save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively

sapsan.lib.estimator.PIMLTurb.load(path: str, estimator, load_saved_config = False)

Loads model and optimizer states, as well as final epoch and loss
Parameters
Name Type Description Default
path str save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively
estimator estimator an initialized estimator into which the weights are loaded. It can carry a new config, e.g. a larger n_epochs, to keep training the model further.
load_saved_config bool updates config parameters from {path}/params.json. False

Return

Type Description
pytorch model loaded model

CLASS

sapsan.lib.estimator.PIMLTurbConfig(n_epochs, patience, min_delta, logdir, lr, min_lr, *args, **kwargs)

Configuration for the PIMLTurb - based on pytorch (catalyst is not used)
Parameters
Name Type Description Default
n_epochs int number of epochs 1
patience int number of epochs with no improvement after which training will be stopped (not used) 10
min_delta float minimum change in the monitored metric to qualify as an improvement, i.e. an absolute change of less than min_delta will count as no improvement (not used) 1e-5
log_dir str path to store the logs ./logs/
lr float learning rate 1e-3
min_lr float a lower bound of the learning rate for ReduceLROnPlateau lr*1e-2
device str specify the device to run the model on cuda (or switch to cpu)

PIMLTurb1D

CLASS

sapsan.lib.estimator.PIMLTurb1D(activ, loss, loaders, ks_stop, ks_frac, ks_scale, l1_scale, l1_beta, sigma, config, model)

Physics-informed machine learning model to predict Reynolds-like stress tensor, \(Re\), for turbulence modeling. Learn more on the wiki: PIMLTurb
A custom loss function was developed for this model combining spatial (SmoothL1) and statistical (Kolmogorov-Smirnov) losses.
Parameters
Name Type Description Default
activ str activation function to use from PyTorch Tanhshrink
loss str loss function to use; accepts only custom SmoothL1_KSLoss
loaders dict contains input and target data (loaders['train'], loaders['valid']). Datasets themselves have to be torch.tensor(s)
ks_stop float early-stopping condition based on the KS loss value alone 0.1
ks_frac float fraction the KS loss contributes to the total loss 0.5
ks_scale float scale factor to prioritize KS loss over SmoothL1 (should not be altered) 1
l1_scale float scale factor to prioritize SmoothL1 loss over KS 1
l1_beta float \(\beta\) threshold for smoothing the L1 loss 1
sigma float \(\sigma\) for the last layer of the network that performs a filtering operation using a Gaussian kernel 1
config class configuration to use for the model PIMLTurb1DConfig()
model class the model itself - should not be adjusted PIMLTurb1DModel()

sapsan.lib.estimator.PIMLTurb1D.save(path: str)

Saves model and optimizer states, as well as final epoch and loss
Parameters
Name Type Description Default
path str save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively

sapsan.lib.estimator.PIMLTurb1D.load(path: str, estimator, load_saved_config = False)

Loads model and optimizer states, as well as final epoch and loss
Parameters
Name Type Description Default
path str save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively
estimator estimator an initialized estimator into which the weights are loaded. It can carry a new config, e.g. a larger n_epochs, to keep training the model further.
load_saved_config bool updates config parameters from {path}/params.json. False

Return

Type Description
pytorch model loaded model

CLASS

sapsan.lib.estimator.PIMLTurb1DConfig(n_epochs, patience, min_delta, logdir, lr, min_lr, *args, **kwargs)

Configuration for the PIMLTurb1D - based on pytorch (catalyst is not used)
Parameters
Name Type Description Default
n_epochs int number of epochs 1
patience int number of epochs with no improvement after which training will be stopped (not used) 10
min_delta float minimum change in the monitored metric to qualify as an improvement, i.e. an absolute change of less than min_delta will count as no improvement (not used) 1e-5
log_dir str path to store the logs ./logs/
lr float learning rate 1e-3
min_lr float a lower bound of the learning rate for ReduceLROnPlateau lr*1e-2
device str specify the device to run the model on cuda (or switch to cpu)

PICAE

CLASS

sapsan.lib.estimator.PICAE(loaders, config, model)

Convolutional Auto Encoder with Divergence-Free Kernel and with periodic padding. Further details can be found on the PICAE page
Parameters
Name Type Description Default
loaders dict contains input and target data (loaders['train'], loaders['valid']). Datasets themselves have to be torch.tensor(s)
config class configuration to use for the model PICAEConfig()
model class the model itself - should not be adjusted PICAEModel()

sapsan.lib.estimator.PICAE.save(path: str)

Saves model and optimizer states, as well as final epoch and loss
Parameters
Name Type Description Default
path str save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively

sapsan.lib.estimator.PICAE.load(path: str, estimator, load_saved_config = False)

Loads model and optimizer states, as well as final epoch and loss
Parameters
Name Type Description Default
path str save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively
estimator estimator an initialized estimator into which the weights are loaded. It can carry a new config, e.g. a larger n_epochs, to keep training the model further.
load_saved_config bool updates config parameters from {path}/params.json False

Return

Type Description
pytorch model loaded model

CLASS

sapsan.lib.estimator.PICAEConfig(n_epochs, patience, min_delta, logdir, lr, min_lr, weight_decay, nfilters, kernel_size, enc_nlayers, dec_nlayers, *args, **kwargs)

Configuration for the PICAE - based on pytorch and catalyst libraries
Parameters
Name Type Description Default
n_epochs int number of epochs 1
batch_dim int dimension of a batch in each axis 64
patience int number of epochs with no improvement after which training will be stopped 10
min_delta float minimum change in the monitored metric to qualify as an improvement, i.e. an absolute change of less than min_delta will count as no improvement 1e-5
log_dir str path to store the logs ./logs/
lr float learning rate 1e-3
min_lr float a lower bound of the learning rate for ReduceLROnPlateau lr*1e-2
weight_decay float weight decay (L2 penalty) 1e-5
nfilters int the output dim for each convolutional layer, which is the number of "filters" learned by that layer 6
kernel_size tuple size of the convolutional kernel (3,3,3)
enc_layers int number of encoding layers 3
dec_layers int number of decoding layers 3
device str specify the device to run the model on cuda (or switch to cpu)
loader_key str the loader to use for early stopping: train or valid first loader provided, which is usually 'train'
metric_key str the metric to use for early stopping 'loss'
ddp bool turn on Distributed Data Parallel (DDP) to distribute the data and train the model across multiple GPUs. This is passed to Catalyst to activate the ddp flag in the runner (see the Distributed Training Tutorial; the runner is set up in pytorch_estimator.py). Note: Jupyter notebooks are not supported, so prepare a script. False

KRR

CLASS

sapsan.lib.estimator.KRR(loaders, config, model)

A model based on scikit-learn Kernel Ridge Regression
Parameters
Name Type Description Default
loaders list contains input and target data
config class configuration to use for the model KRRConfig()
model class the model itself - should not be adjusted KRRModel()

sapsan.lib.estimator.KRR.save(path: str)

Saves the model
Parameters
Name Type Description Default
path str save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively

sapsan.lib.estimator.KRR.load(path: str, estimator, load_saved_config = False)

Loads the model
Parameters
Name Type Description Default
path str save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively
estimator estimator an initialized estimator into which the model is loaded. It can carry a new config, e.g. a larger n_epochs, to keep training the model further.
load_saved_config bool updates config parameters from {path}/params.json False

Return

Type Description
sklearn model loaded model

CLASS

sapsan.lib.estimator.KRRConfig(alpha, gamma)

Configuration for the KRR model
Parameters
Name Type Description Default
alpha float regularization term, hyperparameter None
gamma float full-width at half-max for the RBF kernel, hyperparameter None

load_estimator

CLASS

sapsan.lib.estimator.load_estimator()

Dummy estimator to call load() to load the saved pytorch models

sapsan.lib.estimator.load_estimator.load(path: str, estimator, load_saved_config = False)

Loads model and optimizer states, as well as final epoch and loss
Parameters
Name Type Description Default
path str save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively
estimator estimator an initialized estimator into which the weights are loaded. It can carry a new config, e.g. a larger n_epochs, to keep training the model further
load_saved_config bool updates config parameters from {path}/params.json False

Return

Type Description
pytorch model loaded model

load_sklearn_estimator

CLASS

sapsan.lib.estimator.load_sklearn_estimator()

Dummy estimator to call load() to load the saved sklearn models

sapsan.lib.estimator.load_sklearn_estimator.load(path: str, estimator, load_saved_config = False)

Loads model
Parameters
Name Type Description Default
path str save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively
estimator estimator an initialized estimator into which the model is loaded. It can carry a new config to keep training the model further
load_saved_config bool updates config parameters from {path}/params.json False

Return

Type Description
sklearn model loaded model

Torch Modules


Gaussian

CLASS

sapsan.lib.estimator.torch_modules.Gaussian(sigma: int)

[1D, 3D] Applies a Gaussian filter as a torch layer through a series of 3 separable 1D convolutions, utilizing torch.nn.functional.conv3d. CUDA is supported.
Parameters
Name Type Description Default
sigma int standard deviation \(\sigma\) for a Gaussian kernel 2

sapsan.lib.estimator.torch_modules.Gaussian.forward(tensor: torch.tensor)

Parameters
Name Type Description Default
tensor torch.tensor input torch tensor of shape [N, Cin, Din, Hin, Win]

Return

Type Description
torch.tensor filtered 3D torch data
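
The idea of replacing one 3D Gaussian convolution with three 1D passes can be sketched in NumPy. This is not the torch layer itself: gaussian_kernel_1d and separable_gaussian_3d are hypothetical helpers, and the kernel radius (4 sigma) and zero-padded edges are assumptions.

```python
import numpy as np


def gaussian_kernel_1d(sigma, truncate=4.0):
    """Normalized 1D Gaussian kernel with radius truncate * sigma."""
    radius = int(truncate * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()


def separable_gaussian_3d(volume, sigma):
    """Filter a [D, H, W] array with three 1D passes, one per axis."""
    kernel = gaussian_kernel_1d(sigma)
    out = volume.astype(float)
    for axis in range(3):
        out = np.apply_along_axis(
            lambda line: np.convolve(line, kernel, mode="same"), axis, out
        )
    return out


volume = np.zeros((17, 17, 17))
volume[8, 8, 8] = 1.0  # unit impulse in the center
filtered = separable_gaussian_3d(volume, sigma=1)
```

Because the kernel is normalized and the impulse response fits inside the volume, the filtered field still sums to one and peaks at the original location.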

Interp1d

CLASS

sapsan.lib.estimator.torch_modules.Interp1d()

Linear 1D interpolation done in native PyTorch. CUDA is supported. Forked from @aliutkus

sapsan.lib.estimator.torch_modules.Interp1d.forward(x: torch.tensor, y: torch.tensor, xnew: torch.tensor, out: torch.tensor)

Parameters
Name Type Description Default
x torch.tensor 1D or 2D tensor
y torch.tensor 1D or 2D tensor; the length of y along its last dimension must be the same as that of x
xnew torch.tensor 1D or 2D tensor of real values. xnew can only be 1D if both x and y are 1D. Otherwise, its length along the first dimension must be the same as that of whichever x and y is 2D.
out torch.tensor Tensor for the output If None, allocated automatically

Return

Type Description
torch.tensor interpolated tensor
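
The shape rules for x, y and xnew can be illustrated with NumPy's np.interp, used here as a stand-in for the torch module. The 2D case is emulated row by row, which is an assumption about how batched inputs are treated.

```python
import numpy as np

# 1D case: x, y and xnew are all 1D.
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.0, 10.0, 20.0])
xnew = np.array([0.5, 1.5])
ynew = np.interp(xnew, x, y)

# 2D case: each row is interpolated independently, so the first
# dimension of xnew2 matches the first dimension of x2 and y2.
x2 = np.array([[0.0, 1.0, 2.0], [0.0, 2.0, 4.0]])
y2 = np.array([[0.0, 10.0, 20.0], [0.0, 1.0, 2.0]])
xnew2 = np.array([[0.5], [3.0]])
ynew2 = np.stack([np.interp(q, xr, yr) for q, xr, yr in zip(xnew2, x2, y2)])
```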

Data Loaders


HDF5Dataset

CLASS

sapsan.lib.data.hdf5_dataset.HDF5Dataset( path: str, features: List[str], target: List[str], checkpoints: List[int], batch_size: int = None, input_size: int = None, sampler: Optional[Sampling] = None, time_granularity: float = 1, features_label: Optional[List[str]] = None, target_label: Optional[List[str]] = None, flat: bool = False, shuffle: bool=False, train_fraction = None)

HDF5 data loader class
Parameters
Name Type Description Default
path str path to the data in the following format: "data/t_{checkpoint:1.0f}/{feature}_data.h5"
features List[str] list of train features to load ['not_specified_data']
target List[str] list of target features to load None
checkpoints List[int] list of checkpoints to load (they will be appended as batches)
input_size int dimension of the loaded data in each axis
batch_size int dimension of a batch in each axis. If batch_size != input_size, the datacube will be split evenly input_size (doesn't work with sampler)
batch_num int the number of batches to be loaded at a time 1
sampler object data sampler to use (ex: EquidistantSampling()) None
time_granularity float what is the time separation (dt) between checkpoints 1
features_label List[str] hdf5 data label for the train features list(file.keys())[-1], i.e. last one in hdf5 file
target_label List[str] hdf5 data label for the target features list(file.keys())[-1], i.e. last one in hdf5 file
flat bool flatten the data into [Cin, D*H*W]. Required for scikit-learn models False
shuffle bool shuffle the dataset False
train_fraction float or int a fraction of the dataset to be used for training (accessed through loaders['train']). The rest will be used for validation (accessed through loaders['valid']). If int is provided, then that number of batches will be used for training. If float is provided, then it will try to split the data either by batch or by actually slicing the data cube into smaller chunks None - training data will be used for validation, effectively skipping the latter
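
The layout produced by flat = True can be pictured with a plain NumPy reshape. This only illustrates the [Cin, D*H*W] shape named in the table; the array sizes are arbitrary, and the element ordering used by the loader is assumed to be NumPy's default row-major order.

```python
import numpy as np

cin, d, h, w = 2, 4, 4, 4
data = np.arange(cin * d * h * w, dtype=float).reshape(cin, d, h, w)

# Collapse the three spatial axes into one, keeping channels first.
flat = data.reshape(cin, -1)
```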

sapsan.lib.data.hdf5_dataset.HDF5Dataset.load_numpy()

HDF5 data loader method - call it to load the data as a numpy array. If targets are not specified, then only features will be loaded (hence you can load just 1 dataset at a time).
Return
Type Description
np.ndarray, np.ndarray loaded datasets as numpy arrays

sapsan.lib.data.hdf5_dataset.HDF5Dataset.convert_to_torch([x, y])

Splits numpy arrays into batches and converts to torch dataloader
Parameters
Name Type Description Default
[x, y] list or np.ndarray a list of input datasets to batch and convert to torch loaders

Return

Type Description
OrderedDict{'train' : DataLoader, 'valid' : DataLoader } Data in Torch Dataloader format ready for training

sapsan.lib.data.hdf5_dataset.HDF5Dataset.load()

Loads, splits into batches, and converts into torch dataloader. Effectively combines .load_numpy and .convert_to_torch
Return
Type Description
np.ndarray, np.ndarray loaded train and target features: x, y

get_loader_shape

sapsan.lib.data.data_functions.get_loader_shape()

Returns the shape of the loaded tensors - the loaded data that has been split into train and valid datasets.
Parameters
Name Type Description Default
loaders torch DataLoader the loader of tensors passed for training
name str name of the dataset in the loaders; usually either train or valid None - chooses the first entry in loaders

Return

Type Description
np.ndarray shape of the tensor

Data Manipulation


EquidistantSampling

CLASS

sapsan.lib.data.sampling.EquidistantSampling(target_dim)

Samples the data to a lower dimension, keeping separation between the data points equally distant
Parameters
Name Type Description Default
target_dim np.ndarray new shape of the input in the form [D, H, W]

sapsan.lib.data.sampling.EquidistantSampling.sample(data)

Performs sampling of the data
Parameters
Name Type Description Default
data np.ndarray input data to be sampled - has the shape of [axis, D, H, W]

Return

Type Description
np.ndarray Sampled data with the shape [axis, D, H, W]
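
Equidistant sampling can be sketched with strided slicing. equidistant_sample is a hypothetical helper that assumes each spatial size is an integer multiple of its target size; the points sapsan actually selects may differ.

```python
import numpy as np


def equidistant_sample(data, target_dim):
    """Keep every k-th point along D, H, W so the result has shape target_dim.

    Assumes each spatial size is an integer multiple of its target size.
    """
    steps = [size // target for size, target in zip(data.shape[1:], target_dim)]
    return data[:, ::steps[0], ::steps[1], ::steps[2]]


data = np.arange(3 * 8 * 8 * 8).reshape(3, 8, 8, 8)
sampled = equidistant_sample(data, target_dim=(4, 4, 4))
```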

split_data_by_batch

sapsan.utils.shapes.split_data_by_batch(data: np.ndarray, size: int, batch_size: int, n_features: int, axis: int)

[2D, 3D]: splits data into smaller cubes or squares of batches
Parameters
Name Type Description Default
data np.ndarray input 2D or 3D data, [Cin, D, H, W]
size int dimensionality of the data in each axis
batch_size int dimensionality of the batch in each axis
n_features int number of channels of the input data
axis int number of axes, 2 or 3

Return

Type Description
np.ndarray batched data: [N, Cin, Db, Hb, Wb]

combine_data

sapsan.utils.shapes.combine_data(data: np.ndarray, input_size: tuple, batch_size: tuple, axis: int)

[2D, 3D] - reverse of split_data_by_batch function
Parameters
Name Type Description Default
data np.ndarray input 2D or 3D data, [N, Cin, Db, Hb, Wb]
input_size tuple dimensionality of the original data in each axis
batch_size tuple dimensionality of the batch in each axis
axis int number of axes, 2 or 3

Return

Type Description
np.ndarray reassembled data: [Cin, D, H, W]
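
The shape bookkeeping of splitting a cube into batches and reassembling it can be sketched in NumPy for the 3D cubic case. split_cube and combine_cube are hypothetical stand-ins, not the sapsan functions, and the order in which blocks are numbered is an assumption.

```python
import numpy as np


def split_cube(data, batch_size):
    """[Cin, D, H, W] -> [N, Cin, Db, Hb, Wb] with Db = Hb = Wb = batch_size."""
    cin, d, h, w = data.shape
    nd, nh, nw = d // batch_size, h // batch_size, w // batch_size
    blocks = data.reshape(cin, nd, batch_size, nh, batch_size, nw, batch_size)
    # Move the three block indices to the front, then merge them into N.
    blocks = blocks.transpose(1, 3, 5, 0, 2, 4, 6)
    return blocks.reshape(nd * nh * nw, cin, batch_size, batch_size, batch_size)


def combine_cube(batches, input_size):
    """Inverse of split_cube for a cubic domain of side input_size."""
    n, cin, b, _, _ = batches.shape
    k = input_size // b  # blocks per axis
    blocks = batches.reshape(k, k, k, cin, b, b, b)
    blocks = blocks.transpose(3, 0, 4, 1, 5, 2, 6)
    return blocks.reshape(cin, input_size, input_size, input_size)


data = np.arange(2 * 8 * 8 * 8).reshape(2, 8, 8, 8)
batches = split_cube(data, batch_size=4)
restored = combine_cube(batches, input_size=8)
```

An 8-point cube with batch_size=4 yields N = 8 batches, and combining them returns the original array.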

slice_of_cube

sapsan.utils.shapes.slice_of_cube(data: np.ndarray, feature: Optional[int] = None, n_slice: Optional[int] = None)

Select a slice of a cube (to plot later)
Parameters
Name Type Description Default
data np.ndarray input 3D data, [Cin, D, H, W]
feature int feature to take the slice of, i.e. the value of Cin 1
n_slice int what slice to select, i.e. the value of D 1

Return

Type Description
np.ndarray data slice: [H, W]
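
In NumPy terms, selecting a slice amounts to indexing the channel and depth axes. The snippet below is an illustration of the [Cin, D, H, W] to [H, W] reduction with arbitrary sizes, not a call to slice_of_cube.

```python
import numpy as np

data = np.arange(2 * 4 * 5 * 6).reshape(2, 4, 5, 6)  # [Cin, D, H, W]
feature, n_slice = 1, 2

# Pick one channel and one depth index; what remains is an [H, W] plane.
plane = data[feature, n_slice]
```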

Filter


spectral

sapsan.utils.filter.spectral(im: np.ndarray, fm: int)

[2D, 3D] apply a spectral filter
Parameters
Name Type Description Default
im np.ndarray input dataset (ex: [Cin, D, H, W])
fm int number of Fourier modes to filter down to

Return

Type Description
np.ndarray filtered dataset
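
A spectral low-pass filter can be sketched with NumPy's FFT. spectral_lowpass is a hypothetical helper that keeps modes whose integer wavenumber is at most fm on every axis; the exact cutoff convention used by sapsan (for example a spherical cutoff) is not stated here and may differ.

```python
import numpy as np


def spectral_lowpass(field, fm):
    """Zero every Fourier mode whose integer wavenumber exceeds fm on any axis."""
    spectrum = np.fft.fftn(field)
    mask = np.ones(field.shape, dtype=bool)
    for axis, n in enumerate(field.shape):
        k = np.abs(np.fft.fftfreq(n, d=1.0 / n))  # integer wavenumbers
        shape = [1] * field.ndim
        shape[axis] = n
        mask &= (k <= fm).reshape(shape)
    return np.real(np.fft.ifftn(spectrum * mask))


x = np.arange(32) / 32
low = np.sin(2 * np.pi * 2 * x)    # wavenumber 2
high = np.sin(2 * np.pi * 10 * x)  # wavenumber 10
field = np.add.outer(low + high, np.zeros(32))  # 2D field varying along axis 0
filtered = spectral_lowpass(field, fm=4)
```

With fm=4 the wavenumber-10 component is removed and only the wavenumber-2 wave remains.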

box

sapsan.utils.filter.box(im: np.ndarray, ksize)

[2D] apply a box filter
Parameters
Name Type Description Default
im np.ndarray input dataset (ex: [Cin, H, W])
ksize tuple kernel size (ex: ksize = (2,2))

Return

Type Description
np.ndarray filtered dataset

gaussian

sapsan.utils.filter.gaussian(im: np.ndarray, sigma)

[2D, 3D] apply a Gaussian filter
Note: the Gaussian filter assumes dx=1 between the points. Adjust sigma accordingly.
Parameters
Name Type Description Default
im np.ndarray input dataset (ex: [H, W] or [D, H, W])
sigma float or tuple of floats standard deviation for Gaussian kernel. Sigma can be defined for each axis individually.

Return

Type Description
np.ndarray filtered dataset
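
Since the filter assumes dx=1, a width given in physical units has to be divided by the grid spacing before it is passed as sigma. The helper below is a hypothetical illustration of that conversion, including the per-axis case.

```python
def sigma_in_grid_units(sigma_physical, dx):
    """Convert a physical filter width to grid-point units (dx = 1)."""
    return sigma_physical / dx


# Isotropic grid: a width of 0.5 on a grid with dx = 0.25 spans 2 points.
sigma = sigma_in_grid_units(sigma_physical=0.5, dx=0.25)

# Anisotropic grid: one sigma per axis, each scaled by its own spacing.
anisotropic = tuple(sigma_in_grid_units(0.5, d) for d in (0.25, 0.5, 1.0))
```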

Backend (Tracking)


MLflowBackend

CLASS

sapsan.lib.backends.mlflow.MLflowBackend(name, host, port)

Initializes MLflow and starts up the MLflow UI at a given host:port
Parameters
Name Type Description Default
name str name under which to record the experiment "experiment"
host str host of mlflow ui "localhost"
port int port of mlflow ui 5000

sapsan.lib.backends.mlflow.MLflowBackend.start_ui()

Starts the MLflow UI at the specified host and port

sapsan.lib.backends.mlflow.MLflowBackend.start(run_name: str, nested = False, run_id = None)

Starts a tracking run
Parameters
Name Type Description Default
run_name str name of the run "train" for Train(), "evaluate" for Evaluate()
nested bool whether or not to nest the recorded run False for Train(), True for Evaluate()
run_id str run id None - a new one will be generated

Return

Type Description
str run_id

sapsan.lib.backends.mlflow.MLflowBackend.resume(run_id, nested = True)

Resumes a previous run, so you can record extra parameters
Parameters
Name Type Description Default
run_id str id of the run to resume
nested bool whether or not to nest the recorded run True, since it will usually be an Evaluate() run

sapsan.lib.backends.mlflow.MLflowBackend.log_metric()

Logs a metric

sapsan.lib.backends.mlflow.MLflowBackend.log_parameter()

Logs a parameter

sapsan.lib.backends.mlflow.MLflowBackend.log_artifact()

Logs an artifact (any saved file, e.g. .png, .txt)

sapsan.lib.backends.mlflow.MLflowBackend.log_model()

Log a PyTorch model as an MLflow artifact for the current run. Corresponds to mlflow.pytorch.log_model()

sapsan.lib.backends.mlflow.MLflowBackend.load_model()

Load a PyTorch model from a local file or a run. Corresponds to mlflow.pytorch.load_model()

sapsan.lib.backends.mlflow.MLflowBackend.close_active_run()

Closes all active MLflow runs

sapsan.lib.backends.mlflow.MLflowBackend.end()

Ends the most recent MLflow run

FakeBackend

CLASS

sapsan.lib.backends.fake.FakeBackend()

Pass to Train or Evaluate in order to disable the backend (tracking)

Plotting


plot_params

sapsan.utils.plot.plot_params()

Contains the matplotlib parameters that format all of the plots (font.size, axes.labelsize, etc.)
Return
Type Description
dict matplotlib parameters
Default Parameters
def plot_params():
    params = {'font.size': 14, 'legend.fontsize': 14, 
            'axes.labelsize': 20, 'axes.titlesize': 24,
            'xtick.labelsize': 17,'ytick.labelsize': 17,
            'axes.linewidth': 1, 'patch.linewidth': 3, 
            'lines.linewidth': 3,
            'xtick.major.width': 1.5,'ytick.major.width': 1.5,
            'xtick.minor.width': 1.25,'ytick.minor.width': 1.25,
            'xtick.major.size': 7,'ytick.major.size': 7,
            'xtick.minor.size': 4,'ytick.minor.size': 4,
            'xtick.direction': 'in','ytick.direction': 'in',              
            'axes.formatter.limits': [-7, 7],'axes.grid': True, 
            'grid.linestyle': ':','grid.color': '#999999',
            'text.usetex': False,}
    return params

pdf_plot

sapsan.utils.plot.pdf_plot(series: List[np.ndarray], bins: int = 100, label: Optional[List[str]] = None, figsize: tuple, dpi: int, ax: matplotlib.axes, style: str)

Plot a probability density function (PDF) of a single or multiple datasets
Parameters
Name Type Description Default
series List[np.ndarray] input datasets
bins int number of bins to use for the dataset to generate the pdf 100
label List[str] list of names to use as labels in the legend None
figsize tuple figure size as passed to matplotlib figure (6,6)
dpi int resolution of the figure 60
ax matplotlib.axes axes object to use for plotting (if you want to define your own figure and subplots) None - creates a separate figure
style str accepts matplotlib styles 'tableau-colorblind10'

Return

Type Description
matplotlib.axes object ax

cdf_plot

sapsan.utils.plot.cdf_plot(series: List[np.ndarray], bins: int = 100, label: Optional[List[str]] = None, figsize: tuple, dpi: int, ax: matplotlib.axes, ks: bool, style: str)

Plot a cumulative distribution function (CDF) of a single or multiple datasets
Parameters
Name Type Description Default
series List[np.ndarray] input datasets
bins int number of bins to use for the dataset to generate the pdf 100
label List[str] list of names to use as labels in the legend None
figsize tuple figure size as passed to matplotlib figure (6,6)
dpi int resolution of the figure 60
ax matplotlib.axes axes object to use for plotting (if you want to define your own figure and subplots) None - creates a separate figure
ks bool if True, prints the Kolmogorov-Smirnov statistic on the plot itself. It will also be returned along with the ax object False
style str accepts matplotlib styles 'tableau-colorblind10'

Return

Type Description
matplotlib.axes object, float (if ks==True) ax, ks (if ks==True)
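
The Kolmogorov-Smirnov statistic reported with ks = True is the largest vertical gap between two CDFs. A minimal NumPy sketch follows; ks_statistic is a hypothetical helper that works on unbinned empirical CDFs, whereas cdf_plot builds its curves from bins, so values may differ slightly.

```python
import numpy as np


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = np.sort(np.ravel(a)), np.sort(np.ravel(b))
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))


same = ks_statistic([1, 2, 3, 4], [1, 2, 3, 4])
disjoint = ks_statistic([1, 2, 3], [10, 11, 12])
```

Identical samples give 0, while samples with no overlap give the maximum value of 1.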

line_plot

sapsan.utils.plot.line_plot(series: List[np.ndarray], bins: int = 100, label: Optional[List[str]] = None, plot_type: str, figsize: tuple, dpi: int, ax: matplotlib.axes, style: str)

Plot linear data of x vs y - same matplotlib formatting will be used as the other plots
Parameters
Name Type Description Default
series List[np.ndarray] input datasets
bins int number of bins to use for the dataset to generate the pdf 100
label List[str] list of names to use as labels in the legend None
plot_type str axis type of the matplotlib plot; options = ['plot', 'semilogx', 'semilogy', 'loglog'] 'plot'
figsize tuple figure size as passed to matplotlib figure (6,6)
linestyle List[str] list of linestyles to use for each profile for the matplotlib figure ['-'] (solid line)
dpi int resolution of the figure 60
ax matplotlib.axes axes object to use for plotting (if you want to define your own figure and subplots) None - creates a separate figure
style str accepts matplotlib styles 'tableau-colorblind10'

Return

Type Description
matplotlib.axes object ax

slice_plot

sapsan.utils.plot.slice_plot(series: List[np.ndarray], label: Optional[List[str]] = None, cmap = 'plasma', figsize: tuple, dpi: int, ax: matplotlib.axes)

Plot 2D spatial distributions (slices) of your target and prediction datasets. Colorbar limits for both slices are set based on the minimum and maximum of the 2nd (target) provided dataset.
Parameters
Name Type Description Default
series List[np.ndarray] input datasets
label List[str] list of names to use as labels in the legend None
cmap str matplotlib colormap to use 'plasma'
figsize tuple figure size as passed to matplotlib figure (6,6)
dpi int resolution of the figure 60
ax matplotlib.axes axes object to use for plotting (if you want to define your own figure and subplots). WARNING: only works if a single image is supplied to slice_plot(), otherwise it will be ignored None - creates a separate figure

Return

Type Description
matplotlib.axes object ax

log_plot

sapsan.utils.plot.log_plot(show_log = True, log_path = 'logs/logs/train.csv', valid_log_path = 'logs/logs/valid.csv', delimiter=',', train_name = 'train_loss', valid_name = 'valid_loss', train_column = 1, valid_column = 1, epoch_column = 0)

Plots an interactive training log of train_loss vs. epoch with plotly
Parameters
Name Type Description Default
show_log bool show the loss vs. epoch progress plot (it will be saved in MLflow in either case) True
log_path str path to training log produced by the catalyst wrapper 'logs/logs/train.csv'
valid_log_path str path to validation log produced by the catalyst wrapper 'logs/logs/valid.csv'
delimiter str delimiter to use for numpy.genfromtxt data loading ','
train_name str name for the training label 'train_loss'
valid_name str name for the validation label 'valid_loss'
train_column int column to load for training data from log_path 1
valid_column int column to load for validation data from valid_log_path 1
epoch_column int column to load the epoch index from log_path. If None, the epoch index is generated from the number of entries 0

Return

Type Description
plotly.express object plot figure
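
The column parameters above can be illustrated with a minimal sketch of the loading step. This is not sapsan's implementation: `load_log`, the one-line header, and the zero-based generated epoch index are assumptions made for illustration. Only `numpy.genfromtxt` and the parameter names come from the table above.

```python
import io
import numpy as np

def load_log(source, delimiter=",", value_column=1, epoch_column=0):
    """Hypothetical helper: read one loss column and its epoch index."""
    # Assumption: the log has a single header line, which is skipped.
    data = np.atleast_2d(np.genfromtxt(source, delimiter=delimiter, skip_header=1))
    values = data[:, value_column]
    if epoch_column is None:
        # No epoch column: generate the index from the number of entries.
        epochs = np.arange(len(values))
    else:
        epochs = data[:, epoch_column]
    return epochs, values

# A small in-memory log standing in for 'logs/logs/train.csv'.
csv_text = "epoch,loss\n0,0.9\n1,0.5\n2,0.3\n"
epochs, loss = load_log(io.StringIO(csv_text))
gen_epochs, gen_loss = load_log(io.StringIO(csv_text), epoch_column=None)
```
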

model_graph

sapsan.utils.plot.model_graph(model, shape: np.array, transforms)

Creates a graph of the ML model (needs graphviz to be installed). A tutorial is available on the wiki: Model Graph
The method is based on hiddenlayer originally written by Waleed Abdulla.
Parameters
Name Type Description Default
model object initialized pytorch or tensorflow model
shape np.array shape of the input array in the form [N, Cin, Db, Hb, Wb], where Cin=1
transforms list[methods] a list of hiddenlayer transforms to be applied (Fold, FoldId, Prune, PruneBranch, FoldDuplicates, Rename), defined in transforms.py See below
Default Parameters
import sapsan.utils.hiddenlayer as hl
transforms = [
    hl.transforms.Fold("Conv > MaxPool > Relu", "ConvPoolRelu"),
    hl.transforms.Fold("Conv > MaxPool", "ConvPool"),
    hl.transforms.Prune("Shape"),
    hl.transforms.Prune("Constant"),
    hl.transforms.Prune("Gather"),
    hl.transforms.Prune("Unsqueeze"),
    hl.transforms.Prune("Concat"),
    hl.transforms.Rename("Cast", to="Input"),
    hl.transforms.FoldDuplicates()
]

Return

Type Description
graphviz.Digraph object SVG graph of a model

Physics


ReynoldsStress

sapsan.utils.physics.ReynoldsStress(u, filt, filt_size, only_x_components=False)

Calculates a stress tensor of the form
\[ \tau_{ij} = \widetilde{u_i u_j} - \tilde{u}_i\tilde{u}_j \]
where \(\tilde{u}\) is the filtered \(u\)
Parameters
Name Type Description Default
u np.ndarray input velocity in 3D - [axis, D, H, W]
filt sapsan.utils.filters the type of filter to use (spectral, box, gaussian). Pass the filter itself by loading the appropriate one from sapsan.utils.filters gaussian
filt_size int or float size of the filter to apply. For different filter types, the size is defined differently. Spectral - fourier mode to filter to, Box - k_size (box size), Gaussian - sigma 2 (sigma=2 for gaussian)
only_x_components bool calculates and outputs only the x components of the tensor in shape [row, D, H, W], since calculating all 9 components can be taxing on memory False

Return

Type Description
np.ndarray stress tensor of shape [column, row, D, H, W]
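
The stress tensor formula above can be sketched in plain NumPy. This is an illustration only, not the sapsan routine: `box_filter` and `reynolds_stress_sketch` are hypothetical names, the filter is a simple periodic box filter rather than one from sapsan.utils.filters, and the index order of the output is an assumption.

```python
import numpy as np

def box_filter(field, k_size):
    """Periodic box (top-hat) filter: mean over a k_size-wide window per axis."""
    out = field.astype(float)
    half = k_size // 2
    for axis in range(field.ndim):
        # Average the field with its shifted copies along this axis.
        shifted = [np.roll(out, s, axis=axis) for s in range(-half, half + 1)]
        out = np.mean(shifted, axis=0)
    return out

def reynolds_stress_sketch(u, k_size=3):
    """u has shape [axis, D, H, W]; returns tau with shape [axis, axis, D, H, W]."""
    n = u.shape[0]
    u_f = np.array([box_filter(c, k_size) for c in u])
    tau = np.empty((n, n) + u.shape[1:])
    for i in range(n):
        for j in range(n):
            # tau_ij = filtered(u_i u_j) - filtered(u_i) * filtered(u_j)
            tau[i, j] = box_filter(u[i] * u[j], k_size) - u_f[i] * u_f[j]
    return tau

rng = np.random.default_rng(0)
u = rng.normal(size=(3, 8, 8, 8))
tau = reynolds_stress_sketch(u)
```

The resulting tensor is symmetric in its first two indices, which is why computing only the x components can save memory when the full tensor is not needed.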

PowerSpectrum

CLASS

sapsan.utils.physics.PowerSpectrum(u: np.ndarray)

Sets up to produce a power spectrum
Parameters
Name Type Description Default
u np.ndarray input velocity (first dimension must be the axis=[1, 2, or 3], e.g. the shape for 3D velocity should be: [axis, D, H, W])

sapsan.utils.physics.PowerSpectrum.calculate()

Calculates the power spectrum
Return
Type Description
np.ndarray, np.ndarray k_bins (fourier modes), Ek_bins (E(k))

sapsan.utils.physics.PowerSpectrum.spectrum_plot(k_bins, Ek_bins, kolmogorov=True, kl_a)

Plots the calculated power spectrum
Parameters
Name Type Description Default
k_bins np.ndarray fourier mode values along x-axis
Ek_bins np.ndarray energy as a function of k: E(k)
kolmogorov bool plots scaled Kolmogorov's -5/3 spectrum alongside the calculated one True
kl_A float scaling factor of Kolmogorov's law np.amax(self.Ek_bins)*1e1

Return

Type Description
matplotlib.axes object spectrum plot
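
For orientation, a shell-summed energy spectrum can be sketched as follows. This is a minimal illustration under stated assumptions (periodic cubic grid, FFT normalized by the number of grid points, shells of unit width); `power_spectrum_sketch` is a hypothetical name, and the normalization and binning used by PowerSpectrum.calculate() may differ.

```python
import numpy as np

def power_spectrum_sketch(u):
    """u: [axis, D, H, W] on a periodic cubic grid -> (k_bins, Ek_bins)."""
    n = u.shape[1]
    # Kinetic energy per Fourier mode, summed over velocity components.
    u_hat = np.fft.fftn(u, axes=(1, 2, 3)) / n**3
    energy = 0.5 * np.sum(np.abs(u_hat) ** 2, axis=0)
    # Integer wavenumber magnitude of every mode.
    k1d = np.fft.fftfreq(n, d=1.0 / n)
    kz, ky, kx = np.meshgrid(k1d, k1d, k1d, indexing="ij")
    k_mag = np.sqrt(kx**2 + ky**2 + kz**2)
    # Sum the energy inside spherical shells of unit width.
    shell = np.rint(k_mag).astype(int)
    k_bins = np.arange(1, n // 2 + 1)
    Ek_bins = np.array([energy[shell == k].sum() for k in k_bins])
    return k_bins, Ek_bins

# Single sine mode with wavenumber 3 along the last axis.
n = 16
x = np.arange(n)
u = np.zeros((3, n, n, n))
u[0] = np.sin(2 * np.pi * 3 * x / n)
k_bins, Ek_bins = power_spectrum_sketch(u)
```

With this normalization the shell sums add up to the mean kinetic energy of the non-constant part of the field, so a single sine mode puts all of its energy in one bin.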

GradientModel

CLASS

sapsan.utils.physics.GradientModel(u: np.ndarray, filter_width, delta_u = 1)

Sets up to apply a gradient turbulence subgrid model:
\[ \tau_{ij} = \frac{1}{12} \Delta^2 \,\delta_k u^*_i \,\delta_k u^*_j \]
where \(\Delta\) is the filter width and \(u^*\) is the filtered \(u\)
Parameters
Name Type Description Default
u np.ndarray input filtered quantity in 3D - [axis, D, H, W]
filter_width float width of the filter which was applied onto u
delta_u float distance between the points on the grid to use for scaling 1

sapsan.utils.physics.GradientModel.gradient()

Calculates the gradient of the input data passed to GradientModel
Return
Type Description
np.ndarray gradient with shape [column, row, D, H, W]

sapsan.utils.physics.GradientModel.model()

Calculates the gradient model tensor with shape [column, row, D, H, W]
Return
Type Description
np.ndarray gradient model tensor
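
The gradient model formula above, with the implied sum over k, can be sketched with np.gradient. This is an illustration rather than the sapsan code: `gradient_model_sketch` is a hypothetical name, and the index order of the intermediate gradient array is an assumption.

```python
import numpy as np

def gradient_model_sketch(u, filter_width, delta_u=1.0):
    """u: filtered field of shape [axis, D, H, W] -> tau of shape [axis, axis, D, H, W]."""
    # grad[i, k] holds the derivative of component i along spatial axis k.
    grad = np.array([np.gradient(u[i], delta_u) for i in range(u.shape[0])])
    # tau_ij = (filter_width^2 / 12) * sum_k grad[i, k] * grad[j, k]
    return filter_width**2 / 12.0 * np.einsum("ik...,jk...->ij...", grad, grad)

# Linear profile: component 0 grows with slope 3 along the last axis.
x = np.arange(6.0)
u = np.zeros((3, 6, 6, 6))
u[0] = 3.0 * x
tau = gradient_model_sketch(u, filter_width=2.0)
```

For this linear profile the only non-zero entry is tau[0, 0], equal to (2^2 / 12) * 3^2 = 3 everywhere.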

DynamicSmagorinskyModel

CLASS

sapsan.utils.physics.DynamicSmagorinskyModel(u: np.ndarray, filt, original_filt_size, filt_ratio, du, delta_u)

Sets up to apply a Dynamic Smagorinsky (DS) turbulence subgrid model:
\[ \tau_{ij} = -2(C_s\Delta^*)^2|S^*|S^*_{ij} \]
where \(\Delta^*\) is the filter width and \(S^*_{ij}\) is the rate-of-strain tensor computed from the filtered \(u\)
Parameters
Name Type Description Default
u np.ndarray input filtered quantity either in 3D [axis, D, H, W] or 2D [axis, D, H]
du np.ndarray gradient of u None*: if du is not provided, then it will be calculated with np.gradient()
filt sapsan.utils.filters the type of filter to use (spectral, box, gaussian). Pass the filter itself by loading the appropriate one from sapsan.utils.filters spectral
original_filt_size int width of the filter which was applied onto u 15 (spectral, fourier modes = 15)
delta_u float distance between the points on the grid to use for scaling 1
filt_ratio float the ratio of additional filter that will be applied on the data to find the slope for Dynamic Smagorinsky extrapolation over original_filt_size 0.5

sapsan.utils.physics.DynamicSmagorinskyModel.model()

Calculates the DS model tensor with shape [column, row, D, H, W]
Return
Type Description
np.ndarray DS model tensor
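
The closed form of the DS tensor can be evaluated as in the sketch below, given a known coefficient. This is not the sapsan implementation: the dynamic procedure that determines \(C_s\) from the additional filter (filt_ratio) is not shown, `smagorinsky_sketch` and its `c_s` argument are hypothetical, the norm is assumed to follow the common convention \(|S| = \sqrt{2 S_{ij} S_{ij}}\), and component i of u is assumed to point along spatial axis i of the array.

```python
import numpy as np

def smagorinsky_sketch(u, filter_width, c_s, delta_u=1.0):
    """u: filtered field [axis, D, H, W] -> tau of shape [axis, axis, D, H, W]."""
    n = u.shape[0]
    # grad[i, j] is the derivative of component i along spatial axis j
    # (assumption: component order matches the array axis order).
    grad = np.array([np.gradient(u[i], delta_u) for i in range(n)])
    # Rate-of-strain tensor: symmetric part of the velocity gradient.
    S = 0.5 * (grad + np.swapaxes(grad, 0, 1))
    # |S| = sqrt(2 S_ij S_ij), evaluated at every grid point.
    S_norm = np.sqrt(2.0 * np.einsum("ij...,ij...->...", S, S))
    return -2.0 * (c_s * filter_width) ** 2 * S_norm * S

# Simple shear: component 0 varies linearly along spatial axis 1.
y = np.arange(6.0)
u = np.zeros((3, 6, 6, 6))
u[0] = 3.0 * y[None, :, None]
tau = smagorinsky_sketch(u, filter_width=2.0, c_s=0.5)
```

For this shear flow the strain rate has only the off-diagonal pair S[0, 1] = S[1, 0] = 1.5, so |S| = 3 and tau[0, 1] = -2 * (0.5 * 2)^2 * 3 * 1.5 = -9.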