API Reference
Glossary
Variable | Definition |
---|---|
N | # of Batches |
Cin | # of input channels (i.e. features) |
D or Db | Data or Batch depth (z) |
H or Hb | Data or Batch height (y) |
W or Wb | Data or Batch width (x) |
Train/Evaluate
Train
CLASS
sapsan.lib.experiments.train.Train
(model: Estimator, data_parameters: dict, backend = FakeBackend(), show_log = True, run_name = 'train')
- Call Train to set up your run
- Parameters
Name | Type | Description | Default |
---|---|---|---|
model | object | model to use for training | |
data_parameters | dict | data parameters from the data loader, necessary for tracking | |
backend | object | backend to track the experiment | FakeBackend() |
show_log | bool | show the loss vs. epoch progress plot (it will be saved in MLflow in either case) | True |
run_name | str | 'run name' tag as recorded under MLflow | train |
sapsan.lib.experiments.train.Train.run()
- Run the model
- Return
Type | Description |
---|---|
pytorch or sklearn or custom type | trained model |
Evaluate
CLASS
sapsan.lib.experiments.evaluate.Evaluate
(model: Estimator, data_parameters: dict, backend = FakeBackend(), cmap: str = 'plasma', run_name: str = 'evaluate', **kwargs)
- Call Evaluate to set up the testing of the trained model. Don't forget to update estimator.loaders with the new data for testing.
- Parameters
Name | Type | Description | Default |
---|---|---|---|
model | object | model to use for testing | |
data_parameters | dict | data parameters from the data loader, necessary for tracking | |
backend | object | backend to track the experiment | FakeBackend() |
cmap | str | matplotlib colormap to use for slice plots | plasma |
run_name | str | 'run name' tag as recorded under MLflow | evaluate |
pdf_xlim | tuple | x-axis limits for the PDF plot | |
pdf_ylim | tuple | y-axis limits for the PDF plot | |
sapsan.lib.experiments.evaluate.Evaluate.run()
- Run the evaluation of the trained model
- Return
Type | Description |
---|---|
dict{'target' : np.ndarray, 'predict' : np.ndarray} | target and predicted data |
Estimators
CNN3d
CLASS
sapsan.lib.estimator.CNN3d
(loaders, config, model)
- A model based on PyTorch's 3D Convolutional Neural Network
- Parameters
Name | Type | Description | Default |
---|---|---|---|
loaders | dict | contains input and target data (loaders['train'], loaders['valid']). Datasets themselves have to be torch.tensor(s) | |
config | class | configuration to use for the model | CNN3dConfig() |
model | class | the model itself - should not be adjusted | CNN3dModel() |
sapsan.lib.estimator.CNN3d.save
(path: str)
- Saves model and optimizer states, as well as final epoch and loss
- Parameters
Name | Type | Description | Default |
---|---|---|---|
path | str | save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively | |
sapsan.lib.estimator.CNN3d.load
(path: str, estimator, load_saved_config = False)
- Loads model and optimizer states, as well as final epoch and loss
- Parameters
Name | Type | Description | Default |
---|---|---|---|
path | str | save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively | |
estimator | estimator | provide an initialized model for which to load the weights. The estimator can include a new config setup, changing n_epochs to keep training the model further. | |
load_saved_config | bool | updates config parameters from {path}/params.json | False |
- Return
Type | Description |
---|---|
pytorch model | loaded model |
CLASS
sapsan.lib.estimator.CNN3dConfig
(n_epochs, patience, min_delta, logdir, lr, min_lr, *args, **kwargs)
- Configuration for the CNN3d - based on pytorch and catalyst libraries
- Parameters
Name | Type | Description | Default |
---|---|---|---|
n_epochs | int | number of epochs | 1 |
patience | int | number of epochs with no improvement after which training will be stopped | 10 |
min_delta | float | minimum change in the monitored metric to qualify as an improvement, i.e. an absolute change of less than min_delta will count as no improvement | 1e-5 |
log_dir | str | path to store the logs | ./logs/ |
lr | float | learning rate | 1e-3 |
min_lr | float | a lower bound of the learning rate for ReduceLROnPlateau | lr*1e-2 |
device | str | specify the device to run the model on | cuda (or switch to cpu) |
loader_key | str | the loader to use for early stop: train or valid | first loader provided*, which is usually 'train' |
metric_key | str | the metric to use for early stop | 'loss' |
ddp | bool | turn on Distributed Data Parallel (DDP) in order to distribute the data and train the model across multiple GPUs. This is passed to Catalyst to activate the ddp flag in runner (see more in the Distributed Training Tutorial; the runner is set up in pytorch_estimator.py). Note: doesn't support Jupyter notebooks - prepare a script! | False |
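The interplay of patience and min_delta can be illustrated with a small stand-alone sketch. It is not the Catalyst callback that CNN3d relies on; the function name early_stop_epoch and the exact comparison rule are assumptions made for illustration only.

```python
def early_stop_epoch(losses, patience=10, min_delta=1e-5):
    """Return the epoch index at which training would stop, or None.

    Hypothetical helper: a loss counts as an improvement only if it is
    lower than the best loss so far by more than min_delta.
    """
    best = float("inf")
    stale_epochs = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(losses):
        if best - loss > min_delta:
            best = loss
            stale_epochs = 0
        else:
            stale_epochs += 1
            if stale_epochs >= patience:
                return epoch
    return None


# The loss stops improving after epoch 1, so with patience=2
# the run is halted two epochs later.
stop_at = early_stop_epoch([1.0, 0.5, 0.5, 0.5, 0.5], patience=2)
```

With the default patience of 10, a plateau has to last ten epochs before the run is stopped, so short runs (for example n_epochs = 1) never trigger it.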
PIMLTurb
CLASS
sapsan.lib.estimator.PIMLTurb
(activ, loss, loaders, ks_stop, ks_frac, ks_scale, l1_scale, l1_beta, sigma, config, model)
- Physics-informed machine learning model to predict Reynolds-like stress tensor, \(Re\), for turbulence modeling. Learn more on the wiki: PIMLTurb
- A custom loss function was developed for this model combining spatial (SmoothL1) and statistical (Kolmogorov-Smirnov) losses.
- Parameters
Name | Type | Description | Default |
---|---|---|---|
activ | str | activation function to use from PyTorch | Tanhshrink |
loss | str | loss function to use; accepts only custom | SmoothL1_KSLoss |
loaders | dict | contains input and target data (loaders['train'], loaders['valid']). Datasets themselves have to be torch.tensor(s) | |
ks_stop | float | early-stopping condition based on the KS loss value alone | 0.1 |
ks_frac | float | fraction the KS loss contributes to the total loss | 0.5 |
ks_scale | float | scale factor to prioritize KS loss over SmoothL1 (should not be altered) | 1 |
l1_scale | float | scale factor to prioritize SmoothL1 loss over KS | 1 |
l1_beta | float | \(\beta\) threshold for smoothing the L1 loss | 1 |
sigma | float | \(\sigma\) for the last layer of the network that performs a filtering operation using a Gaussian kernel | 1 |
config | class | configuration to use for the model | PIMLTurbConfig() |
model | class | the model itself - should not be adjusted | PIMLTurbModel() |
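To give a feel for how a spatial and a statistical loss can be blended, here is a NumPy sketch. It is not the SmoothL1_KSLoss implementation: the weighting formula (1 - ks_frac) * L1 + ks_frac * KS, the function names, and the use of NumPy instead of differentiable PyTorch operations are all assumptions for illustration.

```python
import numpy as np


def smooth_l1(pred, target, beta=1.0):
    """Mean SmoothL1: quadratic below beta, linear above it."""
    diff = np.abs(pred - target)
    loss = np.where(diff < beta, 0.5 * diff ** 2 / beta, diff - 0.5 * beta)
    return loss.mean()


def ks_statistic(pred, target):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs."""
    a = np.sort(pred.ravel())
    b = np.sort(target.ravel())
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return np.max(np.abs(cdf_a - cdf_b))


def combined_loss(pred, target, ks_frac=0.5, ks_scale=1.0,
                  l1_scale=1.0, l1_beta=1.0):
    """Assumed blend of the two terms; the real weighting may differ."""
    l1 = l1_scale * smooth_l1(pred, target, l1_beta)
    ks = ks_scale * ks_statistic(pred, target)
    return (1.0 - ks_frac) * l1 + ks_frac * ks


pred = np.array([0.0, 1.0])
target = np.array([2.0, 3.0])
total = combined_loss(pred, target)
```

In this sketch ks_frac = 0 reduces the loss to pure SmoothL1 and ks_frac = 1 to the pure KS term, which is one plausible reading of "fraction the KS loss contributes to the total loss".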
sapsan.lib.estimator.PIMLTurb.save
(path: str)
- Saves model and optimizer states, as well as final epoch and loss
- Parameters
Name | Type | Description | Default |
---|---|---|---|
path | str | save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively | |
sapsan.lib.estimator.PIMLTurb.load
(path: str, estimator, load_saved_config = False)
- Loads model and optimizer states, as well as final epoch and loss
- Parameters
Name | Type | Description | Default |
---|---|---|---|
path | str | save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively | |
estimator | estimator | provide an initialized model for which to load the weights. The estimator can include a new config setup, changing n_epochs to keep training the model further. | |
load_saved_config | bool | updates config parameters from {path}/params.json | False |
- Return
Type | Description |
---|---|
pytorch model | loaded model |
CLASS
sapsan.lib.estimator.PIMLTurbConfig
(n_epochs, patience, min_delta, logdir, lr, min_lr, *args, **kwargs)
- Configuration for the PIMLTurb - based on pytorch (catalyst is not used)
- Parameters
Name | Type | Description | Default |
---|---|---|---|
n_epochs | int | number of epochs | 1 |
patience | int | number of epochs with no improvement after which training will be stopped (not used) | 10 |
min_delta | float | minimum change in the monitored metric to qualify as an improvement, i.e. an absolute change of less than min_delta will count as no improvement (not used) | 1e-5 |
log_dir | str | path to store the logs | ./logs/ |
lr | float | learning rate | 1e-3 |
min_lr | float | a lower bound of the learning rate for ReduceLROnPlateau | lr*1e-2 |
device | str | specify the device to run the model on | cuda (or switch to cpu) |
PIMLTurb1D
CLASS
sapsan.lib.estimator.PIMLTurb1D
(activ, loss, loaders, ks_stop, ks_frac, ks_scale, l1_scale, l1_beta, sigma, config, model)
- Physics-informed machine learning model to predict Reynolds-like stress tensor, \(Re\), for turbulence modeling. Learn more on the wiki: PIMLTurb
- A custom loss function was developed for this model combining spatial (SmoothL1) and statistical (Kolmogorov-Smirnov) losses.
- Parameters
Name | Type | Description | Default |
---|---|---|---|
activ | str | activation function to use from PyTorch | Tanhshrink |
loss | str | loss function to use; accepts only custom | SmoothL1_KSLoss |
loaders | dict | contains input and target data (loaders['train'], loaders['valid']). Datasets themselves have to be torch.tensor(s) | |
ks_stop | float | early-stopping condition based on the KS loss value alone | 0.1 |
ks_frac | float | fraction the KS loss contributes to the total loss | 0.5 |
ks_scale | float | scale factor to prioritize KS loss over SmoothL1 (should not be altered) | 1 |
l1_scale | float | scale factor to prioritize SmoothL1 loss over KS | 1 |
l1_beta | float | \(\beta\) threshold for smoothing the L1 loss | 1 |
sigma | float | \(\sigma\) for the last layer of the network that performs a filtering operation using a Gaussian kernel | 1 |
config | class | configuration to use for the model | PIMLTurb1DConfig() |
model | class | the model itself - should not be adjusted | PIMLTurb1DModel() |
sapsan.lib.estimator.PIMLTurb1D.save
(path: str)
- Saves model and optimizer states, as well as final epoch and loss
- Parameters
Name | Type | Description | Default |
---|---|---|---|
path | str | save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively | |
sapsan.lib.estimator.PIMLTurb1D.load
(path: str, estimator, load_saved_config = False)
- Loads model and optimizer states, as well as final epoch and loss
- Parameters
Name | Type | Description | Default |
---|---|---|---|
path | str | save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively | |
estimator | estimator | provide an initialized model for which to load the weights. The estimator can include a new config setup, changing n_epochs to keep training the model further. | |
load_saved_config | bool | updates config parameters from {path}/params.json | False |
- Return
Type | Description |
---|---|
pytorch model | loaded model |
CLASS
sapsan.lib.estimator.PIMLTurb1DConfig
(n_epochs, patience, min_delta, logdir, lr, min_lr, *args, **kwargs)
- Configuration for the PIMLTurb1D - based on pytorch (catalyst is not used)
- Parameters
Name | Type | Description | Default |
---|---|---|---|
n_epochs | int | number of epochs | 1 |
patience | int | number of epochs with no improvement after which training will be stopped (not used) | 10 |
min_delta | float | minimum change in the monitored metric to qualify as an improvement, i.e. an absolute change of less than min_delta will count as no improvement (not used) | 1e-5 |
log_dir | str | path to store the logs | ./logs/ |
lr | float | learning rate | 1e-3 |
min_lr | float | a lower bound of the learning rate for ReduceLROnPlateau | lr*1e-2 |
device | str | specify the device to run the model on | cuda (or switch to cpu) |
PICAE
CLASS
sapsan.lib.estimator.PICAE
(loaders, config, model)
- Convolutional Auto Encoder with Divergence-Free Kernel and with periodic padding. Further details can be found on the PICAE page
- Parameters
Name | Type | Description | Default |
---|---|---|---|
loaders | dict | contains input and target data (loaders['train'], loaders['valid']). Datasets themselves have to be torch.tensor(s) | |
config | class | configuration to use for the model | PICAEConfig() |
model | class | the model itself - should not be adjusted | PICAEModel() |
sapsan.lib.estimator.PICAE.save
(path: str)
- Saves model and optimizer states, as well as final epoch and loss
- Parameters
Name | Type | Description | Default |
---|---|---|---|
path | str | save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively | |
sapsan.lib.estimator.PICAE.load
(path: str, estimator, load_saved_config = False)
- Loads model and optimizer states, as well as final epoch and loss
- Parameters
Name | Type | Description | Default |
---|---|---|---|
path | str | save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively | |
estimator | estimator | provide an initialized model for which to load the weights. The estimator can include a new config setup, changing n_epochs to keep training the model further. | |
load_saved_config | bool | updates config parameters from {path}/params.json | False |
- Return
Type | Description |
---|---|
pytorch model | loaded model |
CLASS
sapsan.lib.estimator.PICAEConfig
(n_epochs, patience, min_delta, logdir, lr, min_lr, weight_decay, nfilters, kernel_size, enc_nlayers, dec_nlayers, *args, **kwargs)
- Configuration for the PICAE - based on pytorch and catalyst libraries
- Parameters
Name | Type | Description | Default |
---|---|---|---|
n_epochs | int | number of epochs | 1 |
batch_dim | int | dimension of a batch in each axis | 64 |
patience | int | number of epochs with no improvement after which training will be stopped | 10 |
min_delta | float | minimum change in the monitored metric to qualify as an improvement, i.e. an absolute change of less than min_delta will count as no improvement | 1e-5 |
log_dir | str | path to store the logs | ./logs/ |
lr | float | learning rate | 1e-3 |
min_lr | float | a lower bound of the learning rate for ReduceLROnPlateau | lr*1e-2 |
weight_decay | float | weight decay (L2 penalty) | 1e-5 |
nfilters | int | the output dim for each convolutional layer, which is the number of "filters" learned by that layer | 6 |
kernel_size | tuple | size of the convolutional kernel | (3,3,3) |
enc_layers | int | number of encoding layers | 3 |
dec_layers | int | number of decoding layers | 3 |
device | str | specify the device to run the model on | cuda (or switch to cpu) |
loader_key | str | the loader to use for early stop: train or valid | first loader provided*, which is usually 'train' |
metric_key | str | the metric to use for early stop | 'loss' |
ddp | bool | turn on Distributed Data Parallel (DDP) in order to distribute the data and train the model across multiple GPUs. This is passed to Catalyst to activate the ddp flag in runner (see more in the Distributed Training Tutorial; the runner is set up in pytorch_estimator.py). Note: doesn't support Jupyter notebooks - prepare a script! | False |
KRR
CLASS
sapsan.lib.estimator.KRR
(loaders, config, model)
- A model based on sk-learn Kernel Ridge Regression
- Parameters
Name | Type | Description | Default |
---|---|---|---|
loaders | list | contains input and target data | |
config | class | configuration to use for the model | KRRConfig() |
model | class | the model itself - should not be adjusted | KRRModel() |
sapsan.lib.estimator.KRR.save
(path: str)
- Saves the model
- Parameters
Name | Type | Description | Default |
---|---|---|---|
path | str | save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively | |
sapsan.lib.estimator.KRR.load
(path: str, estimator, load_saved_config = False)
- Loads the model
- Parameters
Name | Type | Description | Default |
---|---|---|---|
path | str | save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively | |
estimator | estimator | provide an initialized model for which to load the weights. The estimator can include a new config setup, changing n_epochs to keep training the model further. | |
load_saved_config | bool | updates config parameters from {path}/params.json | False |
- Return
Type | Description |
---|---|
sklearn model | loaded model |
CLASS
sapsan.lib.estimator.KRRConfig
(alpha, gamma)
- Configuration for the KRR model
- Parameters
Name | Type | Description | Default |
---|---|---|---|
alpha | float | regularization term, hyperparameter | None |
gamma | float | full-width at half-max for the RBF kernel, hyperparameter | None |
load_estimator
CLASS
sapsan.lib.estimator.load_estimator
()
- Dummy estimator to call load() to load the saved pytorch models
sapsan.lib.estimator.load_estimator.load
(path: str, estimator, load_saved_config = False)
- Loads model and optimizer states, as well as final epoch and loss
- Parameters
Name | Type | Description | Default |
---|---|---|---|
path | str | save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively | |
estimator | estimator | provide an initialized model for which to load the weights. The estimator can include a new config setup, changing n_epochs to keep training the model further. | |
load_saved_config | bool | updates config parameters from {path}/params.json | False |
- Return
Type | Description |
---|---|
pytorch model | loaded model |
load_sklearn_estimator
CLASS
sapsan.lib.estimator.load_sklearn_estimator
()
- Dummy estimator to call load() to load the saved sklearn models
sapsan.lib.estimator.load_sklearn_estimator.load
(path: str, estimator, load_saved_config = False)
- Loads model
- Parameters
Name | Type | Description | Default |
---|---|---|---|
path | str | save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively | |
estimator | estimator | provide an initialized model for which to load the weights. The estimator can include a new config setup to keep training the model further. | |
load_saved_config | bool | updates config parameters from {path}/params.json | False |
- Return
Type | Description |
---|---|
sklearn model | loaded model |
Torch Modules
Gaussian
CLASS
sapsan.lib.estimator.torch_modules.Gaussian
(sigma: int)
- [1D, 3D] Applies a Gaussian filter as a torch layer through a series of 3 separable 1D convolutions, utilizing torch.nn.functional.conv3d. CUDA is supported.
- Parameters
Name | Type | Description | Default |
---|---|---|---|
sigma | int | standard deviation \(\sigma\) for a Gaussian kernel | 2 |
sapsan.lib.estimator.torch_modules.Gaussian.forward
(tensor: torch.tensor)
- Parameters
Name | Type | Description | Default |
---|---|---|---|
tensor | torch.tensor | input torch tensor of shape [N, Cin, Din, Hin, Win] | |
- Return
Type | Description |
---|---|
torch.tensor | filtered 3D torch data |
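The idea of replacing one 3D Gaussian convolution with three 1D passes can be sketched in NumPy. This is an illustration of separability only, not the torch layer itself: the kernel truncation at 4 sigma, the edge padding, and the function names are assumptions.

```python
import numpy as np


def gaussian_kernel_1d(sigma, truncate=4.0):
    """Normalized 1D Gaussian kernel, cut off at truncate * sigma (assumed)."""
    radius = int(truncate * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()


def separable_gaussian_3d(data, sigma):
    """Filter a [D, H, W] array with one 1D convolution per axis."""
    kernel = gaussian_kernel_1d(sigma)
    radius = len(kernel) // 2
    out = data.astype(float)
    for axis in range(3):
        # Pad only the axis being filtered so the output keeps its shape
        pad = [(0, 0)] * 3
        pad[axis] = (radius, radius)
        padded = np.pad(out, pad, mode="edge")
        out = np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="valid"), axis, padded
        )
    return out


filtered = separable_gaussian_3d(np.ones((4, 5, 6)), sigma=1.0)
```

Because the Gaussian factorizes along the axes, three passes with a kernel of length k cost on the order of 3k operations per point instead of k cubed for a dense 3D kernel.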
Interp1d
CLASS
sapsan.lib.estimator.torch_modules.Interp1d
()
- Linear 1D interpolation done in native PyTorch. CUDA is supported. Forked from @aliutkus
sapsan.lib.estimator.torch_modules.Interp1d.forward
(x: torch.tensor, y: torch.tensor, xnew: torch.tensor, out: torch.tensor)
- Parameters
Name | Type | Description | Default |
---|---|---|---|
x | torch.tensor | 1D or 2D tensor | |
y | torch.tensor | 1D or 2D tensor; the length of y along its last dimension must be the same as that of x | |
xnew | torch.tensor | 1D or 2D tensor of real values. xnew can only be 1D if both x and y are 1D. Otherwise, its length along the first dimension must be the same as that of whichever of x and y is 2D. | |
out | torch.tensor | tensor for the output. If None, allocated automatically | |
- Return
Type | Description |
---|---|
torch.tensor | interpolated tensor |
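For the simplest case, where x, y, and xnew are all 1D, the expected result of linear interpolation can be checked against NumPy. The sketch below uses np.interp as a reference and does not call Interp1d itself; the sample values are made up for illustration.

```python
import numpy as np

# Known samples of a straight line y = 10 * x
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.0, 10.0, 20.0])

# Points at which to interpolate
xnew = np.array([0.5, 1.5])

# Piecewise-linear interpolation between neighbouring samples
ynew = np.interp(xnew, x, y)
```

Here ynew holds the midpoints of each segment, 5.0 and 15.0.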
Data Loaders
HDF5Dataset
CLASS
sapsan.lib.data.hdf5_dataset.HDF5Dataset
( path: str, features: List[str], target: List[str], checkpoints: List[int], batch_size: int = None, input_size: int = None, sampler: Optional[Sampling] = None, time_granularity: float = 1, features_label: Optional[List[str]] = None, target_label: Optional[List[str]] = None, flat: bool = False, shuffle: bool=False, train_fraction = None)
- HDF5 data loader class
- Parameters
Name | Type | Description | Default |
---|---|---|---|
path | str | path to the data in the following format: "data/t_{checkpoint:1.0f}/{feature}_data.h5" | |
features | List[str] | list of train features to load | ['not_specified_data'] |
target | List[str] | list of target features to load | None |
checkpoints | List[int] | list of checkpoints to load (they will be appended as batches) | |
input_size | int | dimension of the loaded data in each axis | |
batch_size | int | dimension of a batch in each axis. If batch_size != input_size, the datacube will be evenly split | input_size (doesn't work with sampler) |
batch_num | int | the number of batches to be loaded at a time | 1 |
sampler | object | data sampler to use (ex: EquidistantSampling()) | None |
time_granularity | float | time separation (dt) between checkpoints | 1 |
features_label | List[str] | hdf5 data label for the train features | list(file.keys())[-1], i.e. last one in hdf5 file |
target_label | List[str] | hdf5 data label for the target features | list(file.keys())[-1], i.e. last one in hdf5 file |
flat | bool | flatten the data into [Cin, D*H*W]. Required for sk-learn models | False |
shuffle | bool | shuffle the dataset | False |
train_fraction | float or int | a fraction of the dataset to be used for training (accessed through loaders['train']). The rest will be used for validation (accessed through loaders['valid']). If int is provided, then that number of batches will be used for training. If float is provided, then it will try to split the data either by batch or by actually slicing the data cube into smaller chunks | None - training data will be used for validation, effectively skipping the latter |
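The path argument is a Python format string with checkpoint and feature placeholders. The sketch below shows how such a template expands with str.format; the helper expand_paths and the feature names "u" and "b" are hypothetical, and how HDF5Dataset iterates over the combinations internally is an assumption.

```python
path_template = "data/t_{checkpoint:1.0f}/{feature}_data.h5"


def expand_paths(template, features, checkpoints):
    """Hypothetical helper: one file path per (checkpoint, feature) pair."""
    return [
        template.format(checkpoint=checkpoint, feature=feature)
        for checkpoint in checkpoints
        for feature in features
    ]


paths = expand_paths(path_template, features=["u", "b"], checkpoints=[0, 10])
# paths[0] is "data/t_0/u_data.h5"
```

The 1.0f format spec prints the checkpoint without decimals, so checkpoint 10 maps to the directory t_10.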
sapsan.lib.data.hdf5_dataset.HDF5Dataset.load_numpy()
- HDF5 data loader method - call it to load the data as a numpy array. If targets are not specified, then only features will be loaded (hence you can load just 1 dataset at a time).
- Return
Type | Description |
---|---|
np.ndarray, np.ndarray | the loaded dataset as numpy arrays |
sapsan.lib.data.hdf5_dataset.HDF5Dataset.convert_to_torch([x, y])
- Splits numpy arrays into batches and converts to torch dataloader
- Parameters
Name | Type | Description | Default |
---|---|---|---|
[x, y] | list or np.ndarray | a list of input datasets to batch and convert to torch loaders | |
- Return
Type | Description |
---|---|
OrderedDict{'train' : DataLoader, 'valid' : DataLoader } | data in Torch Dataloader format ready for training |
sapsan.lib.data.hdf5_dataset.HDF5Dataset.load()
- Loads, splits into batches, and converts into torch dataloader. Effectively combines .load_numpy and .convert_to_torch
- Return
Type | Description |
---|---|
np.ndarray, np.ndarray | loaded train and target features: x, y |
get_loader_shape
sapsan.lib.data.data_functions.get_loader_shape()
- Returns the shape of the loaded tensors - the loaded data that has been split into train and valid datasets.
- Parameters
Name | Type | Description | Default |
---|---|---|---|
loaders | torch DataLoader | the loader of tensors passed for training | |
name | str | name of the dataset in the loaders; usually either train or valid | None - chooses the first entry in loaders |
- Return
Type | Description |
---|---|
np.ndarray | shape of the tensor |
Data Manipulation
EquidistantSampling
CLASS
sapsan.lib.data.sampling.EquidistantSampling
(target_dim)
- Samples the data to a lower dimension, keeping the data points equally spaced
- Parameters
Name | Type | Description | Default |
---|---|---|---|
target_dim | np.ndarray | new shape of the input in the form [D, H, W] | |
sapsan.lib.data.sampling.EquidistantSampling.sample
(data)
- Performs sampling of the data
- Parameters
Name | Type | Description | Default |
---|---|---|---|
data | np.ndarray | input data to be sampled - has the shape of [axis, D, H, W] | |
- Return
Type | Description |
---|---|
np.ndarray | sampled data with the shape [axis, D, H, W] |
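Equidistant sampling can be pictured as strided slicing along each spatial axis. The sketch below is a NumPy stand-in, not the EquidistantSampling class: the stride rule (integer division of the input size by the target size) and the function name are assumptions.

```python
import numpy as np


def equidistant_sample(data, target_dim):
    """Downsample [axis, D, H, W] data to target_dim = [D, H, W] by striding.

    Assumes each input dimension is at least as large as its target.
    """
    slices = [slice(None)]  # keep every component along the first axis
    for size, target in zip(data.shape[1:], target_dim):
        step = size // target
        slices.append(slice(0, step * target, step))
    return data[tuple(slices)]


data = np.arange(2 * 8 * 8 * 8).reshape(2, 8, 8, 8)
sampled = equidistant_sample(data, [4, 4, 4])
```

With an 8-point axis and a target of 4, every second point is kept, so sampled[0, 1, 0, 0] equals data[0, 2, 0, 0].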
split_data_by_batch
sapsan.utils.shapes.split_data_by_batch
(data: np.ndarray, size: int, batch_size: int, n_features: int, axis: int)
- [2D, 3D]: splits data into batches of smaller cubes or squares
- Parameters
Name | Type | Description | Default |
---|---|---|---|
data | np.ndarray | input 2D or 3D data, [Cin, D, H, W] | |
size | int | dimensionality of the data in each axis | |
batch_size | int | dimensionality of the batch in each axis | |
n_features | int | number of channels of the input data | |
axis | int | number of axes, 2 or 3 | |
- Return
Type | Description |
---|---|
np.ndarray | batched data: [N, Cin, Db, Hb, Wb] |
combine_data
sapsan.utils.shapes.combine_data
(data: np.ndarray, input_size: tuple, batch_size: tuple, axis: int)
- [2D, 3D] - reverse of the split_data_by_batch function
- Parameters
Name | Type | Description | Default |
---|---|---|---|
data | np.ndarray | input 2D or 3D data, [N, Cin, Db, Hb, Wb] | |
input_size | tuple | dimensionality of the original data in each axis | |
batch_size | tuple | dimensionality of the batch in each axis | |
axis | int | number of axes, 2 or 3 | |
- Return
Type | Description |
---|---|
np.ndarray | reassembled data: [Cin, D, H, W] |
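The round trip between [Cin, D, H, W] and [N, Cin, Db, Hb, Wb] can be sketched with reshape and transpose. This is a 3D-only illustration with hypothetical function names; the batch ordering used by split_data_by_batch and combine_data is an assumption and may differ.

```python
import numpy as np


def split_cubes(data, batch_size):
    """[Cin, D, H, W] -> [N, Cin, b, b, b], with b dividing D, H, and W."""
    c, d, h, w = data.shape
    b = batch_size
    blocks = data.reshape(c, d // b, b, h // b, b, w // b, b)
    # Bring the three block indices to the front, channels next
    blocks = blocks.transpose(1, 3, 5, 0, 2, 4, 6)
    return blocks.reshape(-1, c, b, b, b)


def combine_cubes(batches, input_size):
    """[N, Cin, b, b, b] -> [Cin, D, H, W]; exact inverse of split_cubes."""
    _, c, b, _, _ = batches.shape
    d, h, w = input_size
    blocks = batches.reshape(d // b, h // b, w // b, c, b, b, b)
    blocks = blocks.transpose(3, 0, 4, 1, 5, 2, 6)
    return blocks.reshape(c, d, h, w)


data = np.arange(2 * 4 * 4 * 4).reshape(2, 4, 4, 4)
batches = split_cubes(data, batch_size=2)
restored = combine_cubes(batches, input_size=(4, 4, 4))
```

A 4x4x4 cube split with a batch size of 2 yields N = 8 batches, and combining them restores the original array exactly.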
slice_of_cube
sapsan.utils.shapes.slice_of_cube
(data: np.ndarray, feature: Optional[int] = None, n_slice: Optional[int] = None)
- Select a slice of a cube (to plot later)
- Parameters
Name | Type | Description | Default |
---|---|---|---|
data | np.ndarray | input 3D data, [Cin, D, H, W] | |
feature | int | feature to take the slice of, i.e. the value of Cin | 1 |
n_slice | int | what slice to select, i.e. the value of D | 1 |
- Return
Type | Description |
---|---|
np.ndarray | data slice: [H, W] |
Filter
spectral
sapsan.utils.filter.spectral
(im: np.ndarray, fm: int)
- [2D, 3D] apply a spectral filter
- Parameters
Name | Type | Description | Default |
---|---|---|---|
im | np.ndarray | input dataset (ex: [Cin, D, H, W]) | |
fm | int | number of Fourier modes to filter down to | |
- Return
Type | Description |
---|---|
np.ndarray | filtered dataset |
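Filtering down to a number of Fourier modes can be sketched as a sharp low-pass cutoff in wavenumber space. The example below is not the sapsan implementation: it handles a single channel only, and the radial cutoff on the wavenumber magnitude is an assumption about what "filter down to fm modes" means.

```python
import numpy as np


def spectral_lowpass(field, fm):
    """Zero out all Fourier modes with wavenumber magnitude above fm.

    field is a single-channel periodic array, e.g. [H, W] or [D, H, W].
    """
    # Integer wavenumbers along each axis
    freqs = [np.fft.fftfreq(n) * n for n in field.shape]
    grids = np.meshgrid(*freqs, indexing="ij")
    k_mag = np.sqrt(sum(g ** 2 for g in grids))
    field_hat = np.fft.fftn(field)
    field_hat[k_mag > fm] = 0.0
    return np.fft.ifftn(field_hat).real


n = 32
x = np.arange(n)
low = np.cos(2 * np.pi * 2 * x / n)    # mode k = 2, kept
high = np.cos(2 * np.pi * 10 * x / n)  # mode k = 10, removed
field = (low + high)[:, None] * np.ones((1, n))
filtered = spectral_lowpass(field, fm=4)
```

After filtering with fm = 4 only the k = 2 wave survives, so filtered matches the low-frequency component alone.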
box
sapsan.utils.filter.box
(im: np.ndarray, ksize)
- [2D] apply a box filter
- Parameters
Name | Type | Description | Default |
---|---|---|---|
im | np.ndarray | input dataset (ex: [Cin, H, W]) | |
ksize | tuple | kernel size (ex: ksize = (2,2)) | |
- Return
Type | Description |
---|---|
np.ndarray | filtered dataset |
gaussian
sapsan.utils.filter.gaussian
(im: np.ndarray, sigma)
- [2D, 3D] apply a Gaussian filter
- Note: Gaussian filter assumes dx=1 between the points. Adjust sigma accordingly.
- Parameters
Name | Type | Description | Default |
---|---|---|---|
im | np.ndarray | input dataset (ex: [H, W] or [D, H, W]) | |
sigma | float or tuple of floats | standard deviation for Gaussian kernel. Sigma can be defined for each axis individually. | |
- Return
Type | Description |
---|---|
np.ndarray | filtered dataset |
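Since the filter assumes unit spacing between points, a filter width given in physical units has to be converted to grid units first. A minimal sketch of that arithmetic follows; the helper name sigma_in_grid_units and the sample values are hypothetical.

```python
def sigma_in_grid_units(sigma_physical, dx):
    """Convert a physical filter width to grid points for a dx = 1 filter."""
    if isinstance(sigma_physical, (tuple, list)):
        # One sigma per axis, all sharing the same grid spacing
        return tuple(s / dx for s in sigma_physical)
    return sigma_physical / dx


# A physical width of 0.5 on a grid with spacing 0.25 spans 2 grid points
sigma = sigma_in_grid_units(0.5, dx=0.25)
```

The resulting value, 2.0 here, is what would be passed as sigma to the filter.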
Backend (Tracking)
MLflowBackend
CLASS
sapsan.lib.backends.mlflow.MLflowBackend
(name, host, port)
- Initializes MLflow and starts up the MLflow UI at a given host:port
- Parameters
Name | Type | Description | Default |
---|---|---|---|
name | str | name under which to record the experiment | "experiment" |
host | str | host of the MLflow UI | "localhost" |
port | int | port of the MLflow UI | 5000 |
sapsan.lib.backends.mlflow.MLflowBackend.start_ui
()
- starts MLflow ui at a specified host and port
sapsan.lib.backends.mlflow.MLflowBackend.start
(run_name: str, nested = False, run_id = None)
- Starts a tracking run
- Parameters
Name | Type | Description | Default |
---|---|---|---|
run_name | str | name of the run | "train" for Train(), "evaluate" for Evaluate() |
nested | bool | whether or not to nest the recorded run | False for Train(), True for Evaluate() |
run_id | str | run id | None - a new one will be generated |
- Return
Type | Description |
---|---|
str | run_id |
sapsan.lib.backends.mlflow.MLflowBackend.resume
(run_id, nested = True)
- Resumes a previous run, so you can record extra parameters
- Parameters
Name | Type | Description | Default |
---|---|---|---|
run_id | str | id of the run to resume | |
nested | bool | whether or not to nest the recorded run | True, since it will usually be an Evaluate() run |
sapsan.lib.backends.mlflow.MLflowBackend.log_metric
()
- Logs a metric
sapsan.lib.backends.mlflow.MLflowBackend.log_parameter
()
- Logs a parameter
sapsan.lib.backends.mlflow.MLflowBackend.log_artifact
()
- Logs an artifact (any saved file, e.g. .png, .txt)
sapsan.lib.backends.mlflow.MLflowBackend.log_model
()
- Log a PyTorch model as an MLflow artifact for the current run. Corresponds to mlflow.pytorch.log_model()
sapsan.lib.backends.mlflow.MLflowBackend.load_model
()
- Load a PyTorch model from a local file or a run. Corresponds to mlflow.pytorch.load_model()
sapsan.lib.backends.mlflow.MLflowBackend.close_active_run
()
- Closes all active MLflow runs
sapsan.lib.backends.mlflow.MLflowBackend.end
()
- Ends the most recent MLflow run
FakeBackend
CLASS
sapsan.lib.backends.fake.FakeBackend()
- Pass to train in order to disable backend (tracking)
Plotting
plot_params
sapsan.utils.plot.plot_params()
- Contains the matplotlib parameters that format all of the plots (font.size, axes.labelsize, etc.)
- Return
Type | Description |
---|---|
dict | matplotlib parameters |
- Default Parameters
pdf_plot
sapsan.utils.plot.pdf_plot
(series: List[np.ndarray], bins: int = 100, label: Optional[List[str]] = None, figsize: tuple, dpi: int, ax: matplotlib.axes, style: str)
- Plot a probability density function (PDF) of a single or multiple datasets
- Parameters
Name | Type | Description | Default |
---|---|---|---|
series | List[np.ndarray] | input datasets | |
bins | int | number of bins to use for the dataset to generate the pdf | 100 |
label | List[str] | list of names to use as labels in the legend | None |
figsize | tuple | figure size as passed to matplotlib figure | (6,6) |
dpi | int | resolution of the figure | 60 |
ax | matplotlib.axes | axes object to use for plotting (if you want to define your own figure and subplots) | None - creates a separate figure |
style | str | accepts matplotlib styles | 'tableau-colorblind10' |
- Return
Type | Description |
---|---|
matplotlib.axes object | ax |
cdf_plot
sapsan.utils.plot.cdf_plot
(series: List[np.ndarray], bins: int = 100, label: Optional[List[str]] = None, figsize: tuple, dpi: int, ax: matplotlib.axes, ks: bool, style: str)
- Plot a cumulative distribution function (CDF) of a single or multiple datasets
- Parameters
Name | Type | Description | Default |
---|---|---|---|
series | List[np.ndarray] | input datasets | |
bins | int | number of bins to use for the dataset to generate the pdf | 100 |
label | List[str] | list of names to use as labels in the legend | None |
figsize | tuple | figure size as passed to matplotlib figure | (6,6) |
dpi | int | resolution of the figure | 60 |
ax | matplotlib.axes | axes object to use for plotting (if you want to define your own figure and subplots) | None - creates a separate figure |
ks | bool | if True, prints the Kolmogorov-Smirnov statistic on the plot itself. It will also be returned along with the ax object | False |
style | str | accepts matplotlib styles | 'tableau-colorblind10' |
- Return
Type | Description |
---|---|
matplotlib.axes object, float (if ks==True) | ax, ks (if ks==True) |
line_plot
sapsan.utils.plot.line_plot
(series: List[np.ndarray], bins: int = 100, label: Optional[List[str]] = None, plot_type: str, figsize: tuple, dpi: int, ax: matplotlib.axes, style: str)
- Plot linear data of x vs y - the same matplotlib formatting will be used as in the other plots
- Parameters
Name | Type | Description | Default |
---|---|---|---|
series | List[np.ndarray] | input datasets | |
bins | int | number of bins to use for the dataset to generate the pdf | 100 |
label | List[str] | list of names to use as labels in the legend | None |
plot_type | str | axis type of the matplotlib plot; options = ['plot', 'semilogx', 'semilogy', 'loglog'] | 'plot' |
figsize | tuple | figure size as passed to matplotlib figure | (6,6) |
linestyle | List[str] | list of linestyles to use for each profile for the matplotlib figure | ['-'] (solid line) |
dpi | int | resolution of the figure | 60 |
ax | matplotlib.axes | axes object to use for plotting (if you want to define your own figure and subplots) | None - creates a separate figure |
style | str | accepts matplotlib styles | 'tableau-colorblind10' |
- Return
Type | Description |
---|---|
matplotlib.axes object | ax |
slice_plot
sapsan.utils.plot.slice_plot
(series: List[np.ndarray], label: Optional[List[str]] = None, cmap = 'plasma', figsize: tuple, dpi: int, ax: matplotlib.axes)
- Plot 2D spatial distributions (slices) of your target and prediction datasets. Colorbar limits for both slices are set based on the minimum and maximum of the 2nd (target) provided dataset.
- Parameters
Name | Type | Description | Default |
---|---|---|---|
series | List[np.ndarray] | input datasets | |
label | List[str] | list of names to use as labels in the legend | None |
cmap | str | matplotlib colormap to use | 'viridis' |
figsize | tuple | figure size as passed to matplotlib figure | (6,6) |
dpi | int | resolution of the figure | 60 |
ax | matplotlib.axes | axes object to use for plotting (if you want to define your own figure and subplots). WARNING: only works if a single image is supplied to slice_plot(), otherwise it will be ignored | None - creates a separate figure |
- Return
Type | Description |
---|---|
matplotlib.axes object | ax |
log_plot
sapsan.utils.plot.log_plot
(show_log = True, log_path = 'logs/logs/train.csv', valid_log_path = 'logs/logs/valid.csv', delimiter=',', train_name = 'train_loss', valid_name = 'valid_loss', train_column = 1, valid_column = 1, epoch_column = 0)
- Plots an interactive training log of train_loss vs. epoch with plotly
- Parameters
Name | Type | Description | Default |
---|---|---|---|
show_log | bool | show the loss vs. epoch progress plot (it will be saved in MLflow in either case) | True |
log_path | str | path to training log produced by the catalyst wrapper | 'logs/logs/train.csv' |
valid_log_path | str | path to validation log produced by the catalyst wrapper | 'logs/logs/valid.csv' |
delimiter | str | delimiter to use for numpy.genfromtxt data loading | ',' |
train_name | str | name for the training label | 'train_loss' |
valid_name | str | name for the validation label | 'valid_loss' |
train_column | int | column to load for training data from log_path | 1 |
valid_column | int | column to load for validation data from valid_log_path | 1 |
epoch_column | int | column to load the epoch index from log_path. If None, then epoch will be generated from the number of entries | 0 |
- Return
Type | Description |
---|---|
plotly.express object | plot figure |
model_graph
sapsan.utils.plot.model_graph
(model, shape: np.array, transforms)
- Creates a graph of the ML model (needs graphviz to be installed). A tutorial is available on the wiki: Model Graph
- The method is based on hiddenlayer originally written by Waleed Abdulla.
-
Parameters
Name Type Discription Default model
object initialized pytorch or tensorflow model shape
np.array shape of the input array in the form [N, Cin, Db, Hb, Wb], where Cin=1 transforms
list[methods] a list of hiddenlayer transforms to be applied (Fold, FoldId, Prune, PruneBranch, FoldDuplicates, Rename), defined in transforms.py See below -
Default Parameters
-
Return
Type | Description |
---|---|
graphviz.Digraph object | SVG graph of a model |
Physics
ReynoldsStress
sapsan.utils.physics.ReynoldsStress
(u, filt, filt_size, only_x_components=False)
- Calculates a stress tensor of the form
- where \(\tilde{u}\) is the filtered \(u\)
-
Parameters
Name | Type | Description | Default |
---|---|---|---|
u | np.ndarray | input velocity in 3D - [axis, D, H, W] | |
filt | sapsan.utils.filters | the type of filter to use (spectral, box, gaussian). Pass the filter itself by loading the appropriate one from sapsan.utils.filters | gaussian |
filt_size | int or float | size of the filter to apply. The size is defined differently for each filter type: Spectral - fourier mode to filter to; Box - k_size (box size); Gaussian - sigma | 2 (sigma=2 for gaussian) |
only_x_components | bool | calculates and outputs only the x components of the tensor in shape [row, D, H, W]; calculating all 9 can be taxing on memory | False |
-
Return
Type | Description |
---|---|
np.ndarray | stress tensor of shape [column, row, D, H, W] |
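For orientation, the conventional subgrid stress tensor built from a velocity and its filtered counterpart is \(\tau_{ij} = \widetilde{u_i u_j} - \tilde{u}_i \tilde{u}_j\). This is the standard textbook form and is assumed, not confirmed, to be what ReynoldsStress evaluates. The sketch below computes it with plain NumPy; `box_filter` is a simple periodic moving average used as a stand-in, not one of the filters from sapsan.utils.filters, and `stress_tensor` is a hypothetical name.

```python
import numpy as np

def box_filter(field, width=3):
    """Stand-in periodic box filter (odd width), applied along every axis."""
    out = field.astype(float)
    half = width // 2
    for axis in range(field.ndim):
        out = sum(np.roll(out, s, axis=axis) for s in range(-half, half + 1)) / width
    return out

def stress_tensor(u, filt=box_filter):
    """tau[i, j] = filt(u_i * u_j) - filt(u_i) * filt(u_j) for u of shape [axis, D, H, W]."""
    n = u.shape[0]
    u_f = np.array([filt(u[i]) for i in range(n)])
    tau = np.empty((n, n) + u.shape[1:])
    for i in range(n):
        for j in range(n):
            tau[i, j] = filt(u[i] * u[j]) - u_f[i] * u_f[j]
    return tau

rng = np.random.default_rng(0)
u = rng.standard_normal((3, 8, 8, 8))
tau = stress_tensor(u)  # shape (3, 3, 8, 8, 8)
```

The tensor is symmetric in its first two indices, so the [column, row] ordering does not change the values. Holding all nine components costs nine times the memory of one velocity component, which is the motivation for an x-components-only option.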
PowerSpectrum
CLASS
sapsan.utils.physics.PowerSpectrum
(u: np.ndarray)
- Sets up to produce a power spectrum
-
Parameters
Name | Type | Description | Default |
---|---|---|---|
u | np.ndarray | input velocity (first dimension must be the axis=[1, 2, or 3], e.g. the shape for 3D velocity should be: [axis, D, H, W]) | |
sapsan.utils.physics.PowerSpectrum.calculate()
- Calculates the power spectrum
-
Return
Type | Description |
---|---|
np.ndarray, np.ndarray | k_bins (fourier modes), Ek_bins (E(k)) |
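As a rough picture of what a shell-binned power spectrum involves, the sketch below Fourier-transforms each velocity component, forms the kinetic energy per mode, and sums it into shells of integer wavenumber. It is a generic NumPy illustration under assumed normalization and binning choices, not the sapsan implementation; `shell_power_spectrum` is a hypothetical name.

```python
import numpy as np

def shell_power_spectrum(u):
    """Return (k_bins, Ek_bins) for u of shape [axis, D, H, W] on a periodic grid."""
    spatial_shape = u.shape[1:]
    n_points = np.prod(spatial_shape)
    # FFT over the spatial axes only, normalized by the number of grid points
    u_hat = np.fft.fftn(u, axes=tuple(range(1, u.ndim))) / n_points
    energy = 0.5 * np.sum(np.abs(u_hat) ** 2, axis=0)
    # Integer wavenumber magnitude of every Fourier mode
    freqs = [np.fft.fftfreq(n, d=1.0 / n) for n in spatial_shape]
    grids = np.meshgrid(*freqs, indexing="ij")
    k_mag = np.sqrt(sum(g ** 2 for g in grids))
    # Sum the energy of all modes that round to the same shell
    shell = np.rint(k_mag).astype(int).ravel()
    Ek_bins = np.bincount(shell, weights=energy.ravel())
    k_bins = np.arange(len(Ek_bins))
    return k_bins, Ek_bins

# Single sine mode with wavenumber 3 along the last axis
n = 16
u = np.zeros((3, n, n, n))
u[0] = np.sin(2 * np.pi * 3 * np.arange(n) / n)
k_bins, Ek_bins = shell_power_spectrum(u)
```

With this normalization the bins sum to the mean kinetic energy of the field, and a single sine mode puts all of its energy into one shell.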
sapsan.utils.physics.PowerSpectrum.spectrum_plot
(k_bins, Ek_bins, kolmogorov=True, kl_A)
- Plots the calculated power spectrum
-
Parameters
Name | Type | Description | Default |
---|---|---|---|
k_bins | np.ndarray | fourier mode values along x-axis | |
Ek_bins | np.ndarray | energy as a function of k: E(k) | |
kolmogorov | bool | plots scaled Kolmogorov's -5/3 spectrum alongside the calculated one | True |
kl_A | float | scaling factor of Kolmogorov's law | np.amax(self.Ek_bins)*1e1 |
-
Return
Type | Description |
---|---|
matplotlib.axes object | spectrum plot |
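The scaled Kolmogorov reference can be sketched as \(E(k) = A\,k^{-5/3}\), with the scaling factor defaulting to ten times the spectrum maximum as listed in the table. That the plotted line has exactly this form is an assumption, and `kolmogorov_line` is a hypothetical helper rather than part of sapsan.

```python
import numpy as np

def kolmogorov_line(k_bins, Ek_bins, kl_A=None):
    """Return kl_A * k**(-5/3) for strictly positive k (assumed reference form)."""
    if kl_A is None:
        # Default taken from the parameter table: ten times the spectrum maximum
        kl_A = np.amax(Ek_bins) * 1e1
    k = np.asarray(k_bins, dtype=float)
    return kl_A * k ** (-5.0 / 3.0)

k = np.array([1.0, 8.0])
Ek = np.array([2.0, 1.0])
line = kolmogorov_line(k, Ek)
```

The k = 0 bin has to be excluded before calling the helper, since the power law diverges there.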
GradientModel
CLASS
sapsan.utils.physics.GradientModel
(u: np.ndarray, filter_width, delta_u = 1)
- sets up to apply a gradient turbulence subgrid model:
- where \(\Delta\) is the filter width and \(u^*\) is the filtered \(u\)
-
Parameters
Name | Type | Description | Default |
---|---|---|---|
u | np.ndarray | input filtered quantity in 3D - [axis, D, H, W] | |
filter_width | float | width of the filter which was applied onto u | |
delta_u | | distance between the points on the grid to use for scaling | 1 |
sapsan.utils.physics.GradientModel.gradient()
- calculates the gradient of the input data given to GradientModel
-
Return
Type | Description |
---|---|
np.ndarray | gradient with shape [column, row, D, H, W] |
sapsan.utils.physics.GradientModel.model()
- calculates the gradient model tensor with shape [column, row, D, H, W]
-
Return
Type | Description |
---|---|
np.ndarray | gradient model tensor |
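A common textbook form of the gradient subgrid model is \(\tau_{ij} = \frac{1}{12}\Delta^2\,\partial_k u^*_i\,\partial_k u^*_j\) (summed over \(k\)). The sketch below evaluates that form with np.gradient. Both the formula and the index ordering `grad[i, k]` are assumptions made for illustration, and the function names are hypothetical, so this may differ in detail from GradientModel.

```python
import numpy as np

def velocity_gradient(u, delta_u=1.0):
    """grad[i, k] = d u_i / d x_k, shape [3, 3, D, H, W] for u of shape [3, D, H, W]."""
    return np.array([np.gradient(u[i], delta_u) for i in range(u.shape[0])])

def gradient_model(u, filter_width, delta_u=1.0):
    """tau[i, j] = filter_width**2 / 12 * sum_k grad[i, k] * grad[j, k]."""
    grad = velocity_gradient(u, delta_u)
    return filter_width ** 2 / 12.0 * np.einsum("ik...,jk...->ij...", grad, grad)

# Linear profile: component 0 grows with slope 2 along the last axis
D = H = W = 6
u = np.zeros((3, D, H, W))
u[0] = 2.0 * np.arange(W)
tau = gradient_model(u, filter_width=4.0)
```

For this linear profile the only nonzero entry is `tau[0, 0]`, equal to \(\Delta^2/12\) times the squared slope everywhere on the grid.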
DynamicSmagorinskyModel
CLASS
sapsan.utils.physics.DynamicSmagorinskyModel
(u: np.ndarray, filt, original_filt_size, filt_ratio, du, delta_u)
- sets up to apply a Dynamic Smagorinsky (DS) turbulence subgrid model:
- where \(\Delta\) is the filter width and \(S^*\) is the filtered \(u\)
-
Parameters
Name | Type | Description | Default |
---|---|---|---|
u | np.ndarray | input filtered quantity either in 3D [axis, D, H, W] or 2D [axis, D, H] | |
du | np.ndarray | gradient of u | None: if du is not provided, it will be calculated with np.gradient() |
filt | sapsan.utils.filters | the type of filter to use (spectral, box, gaussian). Pass the filter itself by loading the appropriate one from sapsan.utils.filters | spectral |
original_filt_size | int | width of the filter which was applied onto u | 15 (spectral, fourier modes = 15) |
delta_u | float | distance between the points on the grid to use for scaling | 1 |
filt_ratio | float | the ratio of the additional filter, relative to original_filt_size, that will be applied to the data to find the slope for the Dynamic Smagorinsky extrapolation | 0.5 |
sapsan.utils.physics.DynamicSmagorinskyModel.model()
- calculates the DS model tensor with shape [column, row, D, H, W]
-
Return
Type | Description |
---|---|
np.ndarray | DS model tensor |
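For reference, the sketch below evaluates the simpler static Smagorinsky closure, \(\tau_{ij} = -2\,(C_s\Delta)^2\,|S|\,S_{ij}\) with \(|S| = \sqrt{2 S_{ij} S_{ij}}\), where \(S_{ij}\) is the strain-rate tensor of the filtered velocity. It is not the dynamic procedure: the coefficient `c_s` is fixed by hand here, whereas a dynamic model estimates it from the data using an additional filter. The formula, the value of `c_s`, and the function name are all assumptions for illustration.

```python
import numpy as np

def static_smagorinsky(u, filter_width, c_s=0.1, delta_u=1.0):
    """Static Smagorinsky tensor for u of shape [3, D, H, W] (fixed c_s, assumed form)."""
    # grad[i, k] = d u_i / d x_k, computed with np.gradient
    grad = np.array([np.gradient(u[i], delta_u) for i in range(u.shape[0])])
    # Symmetric strain-rate tensor S_ij = (grad[i, j] + grad[j, i]) / 2
    strain = 0.5 * (grad + grad.swapaxes(0, 1))
    # |S| = sqrt(2 * S_ij * S_ij), one value per grid point
    strain_norm = np.sqrt(2.0 * np.sum(strain ** 2, axis=(0, 1)))
    return -2.0 * (c_s * filter_width) ** 2 * strain_norm * strain

# Pure shear: component 0 grows with slope 3 along the second spatial axis
n = 6
u = np.zeros((3, n, n, n))
u[0] = 3.0 * np.arange(n)[None, :, None]
tau = static_smagorinsky(u, filter_width=2.0)
```

In this pure-shear case only the off-diagonal pair `tau[0, 1]` and `tau[1, 0]` is nonzero, and the result has the same [column, row, D, H, W] layout as the other tensors in this section.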