API Reference
Glossary
| Variable | Definition |
|---|---|
| N | # of Batches |
| Cin | # of input channels (i.e. features) |
| D or Db | Data or Batch depth (z) |
| H or Hb | Data or Batch height (y) |
| W or Wb | Data or Batch width (x) |
Train/Evaluate
Train
CLASS
sapsan.lib.experiments.train.Train(model: Estimator, data_parameters: dict, backend = FakeBackend(), show_log = True, run_name = 'train')
- Call `Train` to set up your run

Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| model | object | model to use for training | |
| data_parameters | dict | data parameters from the data loader, necessary for tracking | |
| backend | object | backend to track the experiment | FakeBackend() |
| show_log | bool | show the loss vs. epoch progress plot (it will be saved in MLflow in either case) | True |
| run_name | str | 'run name' tag as recorded under MLflow | train |
sapsan.lib.experiments.train.Train.run()
- Run the model
Return

| Type | Description |
|---|---|
| pytorch, sklearn, or custom type | trained model |
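A minimal end-to-end sketch, assuming the data comes from `HDF5Dataset` (see Data Loaders below). The path, features, and sizes are illustrative, and passing the data loader object as `data_parameters` is an assumption based on the description above:

```python
from sapsan.lib.data.hdf5_dataset import HDF5Dataset
from sapsan.lib.estimator import CNN3d, CNN3dConfig
from sapsan.lib.experiments.train import Train

# illustrative data setup - see the HDF5Dataset section for details
data_loader = HDF5Dataset(path='data/t_{checkpoint:1.0f}/{feature}_data.h5',
                          features=['u'], target=['u'],
                          checkpoints=[0], input_size=64, batch_size=32)
loaders = data_loader.load()

estimator = CNN3d(config=CNN3dConfig(n_epochs=5), loaders=loaders)

training = Train(model=estimator,
                 data_parameters=data_loader,  # tracking parameters from the data loader
                 run_name='train')
trained_model = training.run()
```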
Evaluate
CLASS
sapsan.lib.experiments.evaluate.Evaluate(model: Estimator, data_parameters: dict, backend = FakeBackend(), cmap: str = 'plasma', run_name: str = 'evaluate', **kwargs)
- Call `Evaluate` to set up the testing of the trained model. Don't forget to update `estimator.loaders` with the new data for testing.

Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| model | object | model to use for testing | |
| data_parameters | dict | data parameters from the data loader, necessary for tracking | |
| backend | object | backend to track the experiment | FakeBackend() |
| cmap | str | matplotlib colormap to use for slice plots | plasma |
| run_name | str | 'run name' tag as recorded under MLflow | evaluate |
| pdf_xlim | tuple | x-axis limits for the PDF plot | |
| pdf_ylim | tuple | y-axis limits for the PDF plot | |
sapsan.lib.experiments.evaluate.Evaluate.run()
- Run the evaluation of the trained model
Return

| Type | Description |
|---|---|
| dict{'target' : np.ndarray, 'predict' : np.ndarray} | target and predicted data |
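Continuing the Train sketch above: update the estimator's loaders with the testing data, then evaluate. The new loader setup is illustrative, and passing the loader object as `data_parameters` mirrors the Train sketch:

```python
from sapsan.lib.experiments.evaluate import Evaluate

# illustrative: load a new checkpoint for testing and update the estimator
test_loader = HDF5Dataset(path='data/t_{checkpoint:1.0f}/{feature}_data.h5',
                          features=['u'], target=['u'],
                          checkpoints=[1], input_size=64, batch_size=32)
estimator.loaders = test_loader.load()

evaluation = Evaluate(model=trained_model,
                      data_parameters=test_loader,
                      cmap='plasma',
                      run_name='evaluate')
results = evaluation.run()  # {'target': np.ndarray, 'predict': np.ndarray}
```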
Estimators
CNN3d
CLASS
sapsan.lib.estimator.CNN3d(loaders, config, model)
- A model based on PyTorch's 3D Convolutional Neural Network

Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| loaders | dict | contains input and target data (loaders['train'], loaders['valid']). Datasets themselves have to be torch.tensor(s) | |
| config | class | configuration to use for the model | CNN3dConfig() |
| model | class | the model itself - should not be adjusted | CNN3dModel() |
sapsan.lib.estimator.CNN3d.save(path: str)
- Saves model and optimizer states, as well as final epoch and loss
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| path | str | save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively | |
sapsan.lib.estimator.CNN3d.load(path: str, estimator, load_saved_config = False)
- Loads model and optimizer states, as well as final epoch and loss
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| path | str | save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively | |
| estimator | estimator | need to provide an initialized model for which to load the weights. The estimator can include a new config setup, changing n_epochs to keep training the model further. | |
| load_saved_config | bool | updates config parameters from {path}/params.json | False |

Return

| Type | Description |
|---|---|
| pytorch model | loaded model |
CLASS
sapsan.lib.estimator.CNN3dConfig(n_epochs, patience, min_delta, logdir, lr, min_lr, *args, **kwargs)
- Configuration for the CNN3d - based on the PyTorch and Catalyst libraries

Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| n_epochs | int | number of epochs | 1 |
| patience | int | number of epochs with no improvement after which training will be stopped | 10 |
| min_delta | float | minimum change in the monitored metric to qualify as an improvement, i.e. an absolute change of less than min_delta will count as no improvement | 1e-5 |
| logdir | str | path to store the logs | ./logs/ |
| lr | float | learning rate | 1e-3 |
| min_lr | float | a lower bound of the learning rate for ReduceLROnPlateau | lr*1e-2 |
| device | str | specify the device to run the model on | cuda (or switch to cpu) |
| loader_key | str | the loader to use for early stopping: train or valid | first loader provided, which is usually 'train' |
| metric_key | str | the metric to use for early stopping | 'loss' |
| ddp | bool | turn on Distributed Data Parallel (DDP) to distribute the data and train the model across multiple GPUs. This is passed to Catalyst to activate the ddp flag in runner (see the Distributed Training Tutorial; the runner is set up in pytorch_estimator.py). Note: doesn't support Jupyter notebooks - prepare a script! | False |
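A configuration and save/load sketch; the values and checkpoint path are illustrative, and `loaders` is assumed to come from a data loader:

```python
from sapsan.lib.estimator import CNN3d, CNN3dConfig

config = CNN3dConfig(n_epochs=10, patience=10, min_delta=1e-5,
                     lr=1e-3, min_lr=1e-5)
estimator = CNN3d(config=config, loaders=loaders)

# save/load round trip following the signatures above
estimator.save('checkpoints/cnn3d')
restored = CNN3d.load('checkpoints/cnn3d', estimator=estimator,
                      load_saved_config=True)
```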
PIMLTurb
CLASS
sapsan.lib.estimator.PIMLTurb(activ, loss, loaders, ks_stop, ks_frac, ks_scale, l1_scale, l1_beta, sigma, config, model)
- Physics-informed machine learning model to predict Reynolds-like stress tensor, \(Re\), for turbulence modeling. Learn more on the wiki: PIMLTurb
- A custom loss function was developed for this model combining spatial (SmoothL1) and statistical (Kolmogorov-Smirnov) losses.
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| activ | str | activation function to use from PyTorch | Tanhshrink |
| loss | str | loss function to use (accepts only the custom loss) | SmoothL1_KSLoss |
| loaders | dict | contains input and target data (loaders['train'], loaders['valid']). Datasets themselves have to be torch.tensor(s) | |
| ks_stop | float | early-stopping condition based on the KS loss value alone | 0.1 |
| ks_frac | float | fraction the KS loss contributes to the total loss | 0.5 |
| ks_scale | float | scale factor to prioritize KS loss over SmoothL1 (should not be altered) | 1 |
| l1_scale | float | scale factor to prioritize SmoothL1 loss over KS | 1 |
| l1_beta | float | \(\beta\) threshold for smoothing the L1 loss | 1 |
| sigma | float | \(\sigma\) for the last layer of the network that performs a filtering operation using a Gaussian kernel | 1 |
| config | class | configuration to use for the model | PIMLTurbConfig() |
| model | class | the model itself - should not be adjusted | PIMLTurbModel() |
sapsan.lib.estimator.PIMLTurb.save(path: str)
- Saves model and optimizer states, as well as final epoch and loss
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| path | str | save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively | |
sapsan.lib.estimator.PIMLTurb.load(path: str, estimator, load_saved_config = False)
- Loads model and optimizer states, as well as final epoch and loss
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| path | str | save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively | |
| estimator | estimator | need to provide an initialized model for which to load the weights. The estimator can include a new config setup, changing n_epochs to keep training the model further. | |
| load_saved_config | bool | updates config parameters from {path}/params.json | False |

Return

| Type | Description |
|---|---|
| pytorch model | loaded model |
CLASS
sapsan.lib.estimator.PIMLTurbConfig(n_epochs, patience, min_delta, logdir, lr, min_lr, *args, **kwargs)
- Configuration for the PIMLTurb - based on PyTorch (Catalyst is not used)

Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| n_epochs | int | number of epochs | 1 |
| patience | int | number of epochs with no improvement after which training will be stopped (not used) | 10 |
| min_delta | float | minimum change in the monitored metric to qualify as an improvement, i.e. an absolute change of less than min_delta will count as no improvement (not used) | 1e-5 |
| logdir | str | path to store the logs | ./logs/ |
| lr | float | learning rate | 1e-3 |
| min_lr | float | a lower bound of the learning rate for ReduceLROnPlateau | lr*1e-2 |
| device | str | specify the device to run the model on | cuda (or switch to cpu) |
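An initialization sketch with the documented defaults written out; `loaders` is assumed to come from a data loader, and the config values are illustrative:

```python
from sapsan.lib.estimator import PIMLTurb, PIMLTurbConfig

estimator = PIMLTurb(activ='Tanhshrink', loss='SmoothL1_KSLoss',
                     loaders=loaders,
                     ks_stop=0.1, ks_frac=0.5, ks_scale=1,
                     l1_scale=1, l1_beta=1, sigma=1,
                     config=PIMLTurbConfig(n_epochs=100, lr=1e-3))
```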
PIMLTurb1D
CLASS
sapsan.lib.estimator.PIMLTurb1D(activ, loss, loaders, ks_stop, ks_frac, ks_scale, l1_scale, l1_beta, sigma, config, model)
- Physics-informed machine learning model to predict Reynolds-like stress tensor, \(Re\), for turbulence modeling. Learn more on the wiki: PIMLTurb
- A custom loss function was developed for this model combining spatial (SmoothL1) and statistical (Kolmogorov-Smirnov) losses.
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| activ | str | activation function to use from PyTorch | Tanhshrink |
| loss | str | loss function to use (accepts only the custom loss) | SmoothL1_KSLoss |
| loaders | dict | contains input and target data (loaders['train'], loaders['valid']). Datasets themselves have to be torch.tensor(s) | |
| ks_stop | float | early-stopping condition based on the KS loss value alone | 0.1 |
| ks_frac | float | fraction the KS loss contributes to the total loss | 0.5 |
| ks_scale | float | scale factor to prioritize KS loss over SmoothL1 (should not be altered) | 1 |
| l1_scale | float | scale factor to prioritize SmoothL1 loss over KS | 1 |
| l1_beta | float | \(\beta\) threshold for smoothing the L1 loss | 1 |
| sigma | float | \(\sigma\) for the last layer of the network that performs a filtering operation using a Gaussian kernel | 1 |
| config | class | configuration to use for the model | PIMLTurb1DConfig() |
| model | class | the model itself - should not be adjusted | PIMLTurb1DModel() |
sapsan.lib.estimator.PIMLTurb1D.save(path: str)
- Saves model and optimizer states, as well as final epoch and loss
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| path | str | save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively | |
sapsan.lib.estimator.PIMLTurb1D.load(path: str, estimator, load_saved_config = False)
- Loads model and optimizer states, as well as final epoch and loss
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| path | str | save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively | |
| estimator | estimator | need to provide an initialized model for which to load the weights. The estimator can include a new config setup, changing n_epochs to keep training the model further. | |
| load_saved_config | bool | updates config parameters from {path}/params.json | False |

Return

| Type | Description |
|---|---|
| pytorch model | loaded model |
CLASS
sapsan.lib.estimator.PIMLTurb1DConfig(n_epochs, patience, min_delta, logdir, lr, min_lr, *args, **kwargs)
- Configuration for the PIMLTurb1D - based on PyTorch (Catalyst is not used)

Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| n_epochs | int | number of epochs | 1 |
| patience | int | number of epochs with no improvement after which training will be stopped (not used) | 10 |
| min_delta | float | minimum change in the monitored metric to qualify as an improvement, i.e. an absolute change of less than min_delta will count as no improvement (not used) | 1e-5 |
| logdir | str | path to store the logs | ./logs/ |
| lr | float | learning rate | 1e-3 |
| min_lr | float | a lower bound of the learning rate for ReduceLROnPlateau | lr*1e-2 |
| device | str | specify the device to run the model on | cuda (or switch to cpu) |
PICAE
CLASS
sapsan.lib.estimator.PICAE(loaders, config, model)
- Convolutional Auto Encoder with a Divergence-Free Kernel and periodic padding. Further details can be found on the PICAE page

Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| loaders | dict | contains input and target data (loaders['train'], loaders['valid']). Datasets themselves have to be torch.tensor(s) | |
| config | class | configuration to use for the model | PICAEConfig() |
| model | class | the model itself - should not be adjusted | PICAEModel() |
sapsan.lib.estimator.PICAE.save(path: str)
- Saves model and optimizer states, as well as final epoch and loss
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| path | str | save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively | |
sapsan.lib.estimator.PICAE.load(path: str, estimator, load_saved_config = False)
- Loads model and optimizer states, as well as final epoch and loss
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| path | str | save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively | |
| estimator | estimator | need to provide an initialized model for which to load the weights. The estimator can include a new config setup, changing n_epochs to keep training the model further. | |
| load_saved_config | bool | updates config parameters from {path}/params.json | False |

Return

| Type | Description |
|---|---|
| pytorch model | loaded model |
CLASS
sapsan.lib.estimator.PICAEConfig(n_epochs, patience, min_delta, logdir, lr, min_lr, weight_decay, nfilters, kernel_size, enc_nlayers, dec_nlayers, *args, **kwargs)
- Configuration for the PICAE - based on the PyTorch and Catalyst libraries

Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| n_epochs | int | number of epochs | 1 |
| batch_dim | int | dimension of a batch in each axis | 64 |
| patience | int | number of epochs with no improvement after which training will be stopped | 10 |
| min_delta | float | minimum change in the monitored metric to qualify as an improvement, i.e. an absolute change of less than min_delta will count as no improvement | 1e-5 |
| logdir | str | path to store the logs | ./logs/ |
| lr | float | learning rate | 1e-3 |
| min_lr | float | a lower bound of the learning rate for ReduceLROnPlateau | lr*1e-2 |
| weight_decay | float | weight decay (L2 penalty) | 1e-5 |
| nfilters | int | the output dim for each convolutional layer, which is the number of "filters" learned by that layer | 6 |
| kernel_size | tuple | size of the convolutional kernel | (3,3,3) |
| enc_nlayers | int | number of encoding layers | 3 |
| dec_nlayers | int | number of decoding layers | 3 |
| device | str | specify the device to run the model on | cuda (or switch to cpu) |
| loader_key | str | the loader to use for early stopping: train or valid | first loader provided, which is usually 'train' |
| metric_key | str | the metric to use for early stopping | 'loss' |
| ddp | bool | turn on Distributed Data Parallel (DDP) to distribute the data and train the model across multiple GPUs. This is passed to Catalyst to activate the ddp flag in runner (see the Distributed Training Tutorial; the runner is set up in pytorch_estimator.py). Note: doesn't support Jupyter notebooks - prepare a script! | False |
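A setup sketch with illustrative values; `loaders` is assumed to come from a data loader:

```python
from sapsan.lib.estimator import PICAE, PICAEConfig

config = PICAEConfig(n_epochs=5, nfilters=6, kernel_size=(3, 3, 3),
                     enc_nlayers=3, dec_nlayers=3, weight_decay=1e-5)
estimator = PICAE(config=config, loaders=loaders)
```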
KRR
CLASS
sapsan.lib.estimator.KRR(loaders, config, model)
- A model based on scikit-learn Kernel Ridge Regression

Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| loaders | list | contains input and target data | |
| config | class | configuration to use for the model | KRRConfig() |
| model | class | the model itself - should not be adjusted | KRRModel() |
sapsan.lib.estimator.KRR.save(path: str)
- Saves the model
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| path | str | save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively | |
sapsan.lib.estimator.KRR.load(path: str, estimator, load_saved_config = False)
- Loads the model
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| path | str | save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively | |
| estimator | estimator | need to provide an initialized model for which to load the weights. The estimator can include a new config setup. | |
| load_saved_config | bool | updates config parameters from {path}/params.json | False |

Return

| Type | Description |
|---|---|
| sklearn model | loaded model |
CLASS
sapsan.lib.estimator.KRRConfig(alpha, gamma)
- Configuration for the KRR model
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| alpha | float | regularization term, hyperparameter | None |
| gamma | float | full-width at half-max for the RBF kernel, hyperparameter | None |
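A setup sketch; the hyperparameter values are illustrative. Remember that sklearn models need flattened data (set `flat=True` in `HDF5Dataset`):

```python
from sapsan.lib.estimator import KRR, KRRConfig

estimator = KRR(config=KRRConfig(alpha=1.0, gamma=0.5),
                loaders=loaders)  # loaders built with flat=True
```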
load_estimator
CLASS
sapsan.lib.estimator.load_estimator()
- Dummy estimator on which to call `load()` to load saved PyTorch models
sapsan.lib.estimator.load_estimator.load(path: str, estimator, load_saved_config = False)
- Loads model and optimizer states, as well as final epoch and loss
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| path | str | save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively | |
| estimator | estimator | need to provide an initialized model for which to load the weights. The estimator can include a new config setup, changing n_epochs to keep training the model further. | |
| load_saved_config | bool | updates config parameters from {path}/params.json | False |

Return

| Type | Description |
|---|---|
| pytorch model | loaded model |
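A restore sketch; the checkpoint path is illustrative, and `loaders` is assumed from a data loader:

```python
from sapsan.lib.estimator import CNN3d, CNN3dConfig, load_estimator

# an initialized model must be provided to receive the weights
estimator = CNN3d(config=CNN3dConfig(), loaders=loaders)
loaded_model = load_estimator.load('checkpoints/cnn3d',
                                   estimator=estimator,
                                   load_saved_config=True)
```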
load_sklearn_estimator
CLASS
sapsan.lib.estimator.load_sklearn_estimator()
- Dummy estimator on which to call `load()` to load saved sklearn models
sapsan.lib.estimator.load_sklearn_estimator.load(path: str, estimator, load_saved_config = False)
- Loads the model

Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| path | str | save path of the model and its config parameters, {path}/model.pt and {path}/params.json respectively | |
| estimator | estimator | need to provide an initialized model for which to load the weights. The estimator can include a new config setup to keep training the model further | |
| load_saved_config | bool | updates config parameters from {path}/params.json | False |

Return

| Type | Description |
|---|---|
| sklearn model | loaded model |
Torch Modules
Gaussian
CLASS
sapsan.lib.estimator.torch_modules.Gaussian(sigma: int)
- [1D, 3D] Applies a Gaussian filter as a torch layer through a series of 3 separable 1D convolutions, utilizing torch.nn.functional.conv3d. CUDA is supported.
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| sigma | int | standard deviation \(\sigma\) for a Gaussian kernel | 2 |
sapsan.lib.estimator.torch_modules.Gaussian.forward(tensor: torch.tensor)
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| tensor | torch.tensor | input torch tensor of shape [N, Cin, Din, Hin, Win] | |

Return

| Type | Description |
|---|---|
| torch.tensor | filtered 3D torch data |
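A usage sketch of the layer on a random 5D tensor:

```python
import torch
from sapsan.lib.estimator.torch_modules import Gaussian

data = torch.rand(1, 1, 32, 32, 32)        # [N, Cin, Din, Hin, Win]
filtered = Gaussian(sigma=2).forward(data)  # same shape, smoothed
```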
Interp1d
CLASS
sapsan.lib.estimator.torch_modules.Interp1d()
- Linear 1D interpolation done in native PyTorch. CUDA is supported. Forked from @aliutkus
sapsan.lib.estimator.torch_modules.Interp1d.forward(x: torch.tensor, y: torch.tensor, xnew: torch.tensor, out: torch.tensor)
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| x | torch.tensor | 1D or 2D tensor | |
| y | torch.tensor | 1D or 2D tensor; the length of y along its last dimension must be the same as that of x | |
| xnew | torch.tensor | 1D or 2D tensor of real values. xnew can only be 1D if both x and y are 1D. Otherwise, its length along the first dimension must be the same as that of whichever of x and y is 2D. | |
| out | torch.tensor | tensor for the output. If None, allocated automatically | |

Return

| Type | Description |
|---|---|
| torch.tensor | interpolated tensor |
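A usage sketch; calling the instance is assumed to dispatch to `forward()`:

```python
import torch
from sapsan.lib.estimator.torch_modules import Interp1d

x = torch.linspace(0, 1, 10)
y = x ** 2                         # values to interpolate
xnew = torch.linspace(0, 1, 50)    # new sample points
ynew = Interp1d()(x, y, xnew)
```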
Data Loaders
HDF5Dataset
CLASS
sapsan.lib.data.hdf5_dataset.HDF5Dataset(path: str, features: List[str], target: List[str], checkpoints: List[int], batch_size: int = None, input_size: int = None, sampler: Optional[Sampling] = None, time_granularity: float = 1, features_label: Optional[List[str]] = None, target_label: Optional[List[str]] = None, flat: bool = False, shuffle: bool = False, train_fraction = None)
- HDF5 data loader class
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| path | str | path to the data in the following format: "data/t_{checkpoint:1.0f}/{feature}_data.h5" | |
| features | List[str] | list of train features to load | ['not_specified_data'] |
| target | List[str] | list of target features to load | None |
| checkpoints | List[int] | list of checkpoints to load (they will be appended as batches) | |
| input_size | int | dimension of the loaded data in each axis | |
| batch_size | int | dimension of a batch in each axis. If batch_size != input_size, the datacube will be evenly split (doesn't work with sampler) | input_size |
| batch_num | int | the number of batches to be loaded at a time | 1 |
| sampler | object | data sampler to use (ex: EquidistantSampling()) | None |
| time_granularity | float | the time separation (dt) between checkpoints | 1 |
| features_label | List[str] | hdf5 data label for the train features | list(file.keys())[-1], i.e. the last one in the hdf5 file |
| target_label | List[str] | hdf5 data label for the target features | list(file.keys())[-1], i.e. the last one in the hdf5 file |
| flat | bool | flatten the data into [Cin, D*H*W]; required for sklearn models | False |
| shuffle | bool | shuffle the dataset | False |
| train_fraction | float or int | fraction of the dataset to be used for training (accessed through loaders['train']); the rest will be used for validation (accessed through loaders['valid']). If an int is provided, that number of batches will be used for training. If a float is provided, the data will be split either by batch or by slicing the data cube into smaller chunks | None - training data will be used for validation, effectively skipping the latter |
sapsan.lib.data.hdf5_dataset.HDF5Dataset.load_numpy()
- HDF5 data loader method - call it to load the data as a numpy array. If targets are not specified, then only the features will be loaded (hence you can load just one dataset at a time).

Return

| Type | Description |
|---|---|
| np.ndarray, np.ndarray | the loaded dataset as numpy arrays |
sapsan.lib.data.hdf5_dataset.HDF5Dataset.convert_to_torch([x, y])
- Splits numpy arrays into batches and converts to torch dataloader
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| [x, y] | list or np.ndarray | a list of input datasets to batch and convert to torch loaders | |

Return

| Type | Description |
|---|---|
| OrderedDict{'train' : DataLoader, 'valid' : DataLoader} | data in torch DataLoader format, ready for training |
sapsan.lib.data.hdf5_dataset.HDF5Dataset.load()
- Loads, splits into batches, and converts into a torch dataloader. Effectively combines .load_numpy and .convert_to_torch

Return

| Type | Description |
|---|---|
| OrderedDict{'train' : DataLoader, 'valid' : DataLoader} | loaded train and target features, batched and wrapped in torch DataLoaders |
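A loading sketch; the path format follows the table above, and the file layout is illustrative:

```python
from sapsan.lib.data.hdf5_dataset import HDF5Dataset

data_loader = HDF5Dataset(path='data/t_{checkpoint:1.0f}/{feature}_data.h5',
                          features=['u'], target=['u'],
                          checkpoints=[0],
                          input_size=64, batch_size=32,
                          train_fraction=0.8)

x, y = data_loader.load_numpy()                 # raw numpy arrays
loaders = data_loader.convert_to_torch([x, y])  # torch DataLoaders
# or, equivalently, in one step:
loaders = data_loader.load()
```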
get_loader_shape
sapsan.lib.data.data_functions.get_loader_shape(loaders, name = None)
- Returns the shape of the loaded tensors - the loaded data that has been split into train and valid datasets.

Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| loaders | torch DataLoader | the loader of tensors passed for training | |
| name | str | name of the dataset in the loaders; usually either 'train' or 'valid' | None - chooses the first entry in loaders |

Return

| Type | Description |
|---|---|
| np.ndarray | shape of the tensor |
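A quick check of the loaded tensor shapes, assuming `loaders` from the HDF5Dataset sketch above:

```python
from sapsan.lib.data.data_functions import get_loader_shape

train_shape = get_loader_shape(loaders, name='train')
```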
Data Manipulation
EquidistantSampling
CLASS
sapsan.lib.data.sampling.EquidistantSampling(target_dim)
- Samples the data down to a lower dimension, keeping the separation between data points equal

Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| target_dim | np.ndarray | new shape of the input in the form [D, H, W] | |
sapsan.lib.data.sampling.EquidistantSampling.sample(data)
- Performs sampling of the data
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| data | np.ndarray | input data to be sampled; has the shape [axis, D, H, W] | |

Return

| Type | Description |
|---|---|
| np.ndarray | sampled data with the shape [axis, D, H, W] |
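A sampling sketch on random data:

```python
import numpy as np
from sapsan.lib.data.sampling import EquidistantSampling

data = np.random.rand(3, 64, 64, 64)                 # [axis, D, H, W]
sampler = EquidistantSampling(target_dim=np.array([32, 32, 32]))
sampled = sampler.sample(data)                       # -> [3, 32, 32, 32]
```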
split_data_by_batch
sapsan.utils.shapes.split_data_by_batch(data: np.ndarray, size: int, batch_size: int, n_features: int, axis: int)
- [2D, 3D]: splits data into smaller cubes or squares of batches
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| data | np.ndarray | input 2D or 3D data, [Cin, D, H, W] | |
| size | int | dimensionality of the data in each axis | |
| batch_size | int | dimensionality of the batch in each axis | |
| n_features | int | number of channels of the input data | |
| axis | int | number of axes, 2 or 3 | |

Return

| Type | Description |
|---|---|
| np.ndarray | batched data: [N, Cin, Db, Hb, Wb] |
combine_data
sapsan.utils.shapes.combine_data(data: np.ndarray, input_size: tuple, batch_size: tuple, axis: int)
- [2D, 3D] - reverse of the `split_data_by_batch` function

Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| data | np.ndarray | input 2D or 3D data, [N, Cin, Db, Hb, Wb] | |
| input_size | tuple | dimensionality of the original data in each axis | |
| batch_size | tuple | dimensionality of the batch in each axis | |
| axis | int | number of axes, 2 or 3 | |

Return

| Type | Description |
|---|---|
| np.ndarray | reassembled data: [Cin, D, H, W] |
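A round-trip sketch showing that `combine_data` reverses `split_data_by_batch`:

```python
import numpy as np
from sapsan.utils.shapes import split_data_by_batch, combine_data

data = np.random.rand(3, 64, 64, 64)                 # [Cin, D, H, W]
batched = split_data_by_batch(data, size=64, batch_size=32,
                              n_features=3, axis=3)  # [N, Cin, Db, Hb, Wb]
restored = combine_data(batched, input_size=(64, 64, 64),
                        batch_size=(32, 32, 32), axis=3)
assert restored.shape == data.shape
```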
slice_of_cube
sapsan.utils.shapes.slice_of_cube(data: np.ndarray, feature: Optional[int] = None, n_slice: Optional[int] = None)
- Select a slice of a cube (to plot later)
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| data | np.ndarray | input 3D data, [Cin, D, H, W] | |
| feature | int | feature to take the slice of, i.e. the value of Cin | 1 |
| n_slice | int | which slice to select, i.e. the value of D | 1 |

Return

| Type | Description |
|---|---|
| np.ndarray | data slice: [H, W] |
Filter
spectral
sapsan.utils.filter.spectral(im: np.ndarray, fm: int)
- [2D, 3D] apply a spectral filter
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| im | np.ndarray | input dataset (ex: [Cin, D, H, W]) | |
| fm | int | number of Fourier modes to filter down to | |

Return

| Type | Description |
|---|---|
| np.ndarray | filtered dataset |
box
sapsan.utils.filter.box(im: np.ndarray, ksize)
- [2D] apply a box filter
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| im | np.ndarray | input dataset (ex: [Cin, H, W]) | |
| ksize | tuple | kernel size (ex: ksize = (2,2)) | |

Return

| Type | Description |
|---|---|
| np.ndarray | filtered dataset |
gaussian
sapsan.utils.filter.gaussian(im: np.ndarray, sigma)
- [2D, 3D] apply a gaussian filter
- Note: the Gaussian filter assumes dx=1 between the points; adjust sigma accordingly.

Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| im | np.ndarray | input dataset (ex: [H, W] or [D, H, W]) | |
| sigma | float or tuple of floats | standard deviation for the Gaussian kernel; sigma can be defined for each axis individually | |

Return

| Type | Description |
|---|---|
| np.ndarray | filtered dataset |
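A usage sketch of the three filters on random data:

```python
import numpy as np
from sapsan.utils.filter import spectral, box, gaussian

im3d = np.random.rand(1, 64, 64, 64)   # [Cin, D, H, W]
low_pass = spectral(im3d, fm=15)       # keep the first 15 Fourier modes

im2d = np.random.rand(64, 64)          # [H, W]
boxed = box(im2d, ksize=(2, 2))        # box filter is 2D only
smoothed = gaussian(im2d, sigma=2)     # sigma assumes dx=1 between points
```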
Backend (Tracking)
MLflowBackend
CLASS
sapsan.lib.backends.mlflow.MLflowBackend(name, host, port)
- Initializes MLflow and starts up the MLflow UI at a given host:port

Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| name | str | name under which to record the experiment | "experiment" |
| host | str | host of the MLflow UI | "localhost" |
| port | int | port of the MLflow UI | 5000 |
sapsan.lib.backends.mlflow.MLflowBackend.start_ui()
- Starts the MLflow UI at the specified host and port
sapsan.lib.backends.mlflow.MLflowBackend.start(run_name: str, nested = False, run_id = None)
- Starts a tracking run
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| run_name | str | name of the run | "train" for Train(), "evaluate" for Evaluate() |
| nested | bool | whether or not to nest the recorded run | False for Train(), True for Evaluate() |
| run_id | str | run id | None - a new one will be generated |

Return

| Type | Description |
|---|---|
| str | run_id |
sapsan.lib.backends.mlflow.MLflowBackend.resume(run_id, nested = True)
- Resumes a previous run, so you can record extra parameters
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| run_id | str | id of the run to resume | |
| nested | bool | whether or not to nest the recorded run | True, since it will usually be an Evaluate() run |
sapsan.lib.backends.mlflow.MLflowBackend.log_metric()
- Logs a metric
sapsan.lib.backends.mlflow.MLflowBackend.log_parameter()
- Logs a parameter
sapsan.lib.backends.mlflow.MLflowBackend.log_artifact()
- Logs an artifact (any saved file, e.g. .png, .txt)
sapsan.lib.backends.mlflow.MLflowBackend.log_model()
- Log a PyTorch model as an MLflow artifact for the current run. Corresponds to mlflow.pytorch.log_model()
sapsan.lib.backends.mlflow.MLflowBackend.load_model()
- Load a PyTorch model from a local file or a run. Corresponds to mlflow.pytorch.load_model()
sapsan.lib.backends.mlflow.MLflowBackend.close_active_run()
- Closes all active MLflow runs
sapsan.lib.backends.mlflow.MLflowBackend.end()
- Ends the most recent MLflow run
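A tracking sketch: start the UI, then hand the backend to an experiment. The `estimator` and `data_loader` are assumed from the Train sketch above:

```python
from sapsan.lib.backends.mlflow import MLflowBackend
from sapsan.lib.experiments.train import Train

backend = MLflowBackend(name='experiment', host='localhost', port=5000)
backend.start_ui()

# everything logged by the experiment lands under localhost:5000
training = Train(model=estimator, data_parameters=data_loader,
                 backend=backend)
trained_model = training.run()
```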
FakeBackend
CLASS
sapsan.lib.backends.fake.FakeBackend()
- Pass to `Train` in order to disable the backend (tracking)
Plotting
plot_params
sapsan.utils.plot.plot_params()
- Contains the matplotlib parameters that format all of the plots (`font.size`, `axes.labelsize`, etc.)

Return

| Type | Description |
|---|---|
| dict | matplotlib parameters |
Default Parameters
pdf_plot
sapsan.utils.plot.pdf_plot(series: List[np.ndarray], bins: int = 100, label: Optional[List[str]] = None, figsize: tuple, dpi: int, ax: matplotlib.axes, style: str)
- Plot a probability density function (PDF) of a single or multiple datasets
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| series | List[np.ndarray] | input datasets | |
| bins | int | number of bins to use for the dataset to generate the PDF | 100 |
| label | List[str] | list of names to use as labels in the legend | None |
| figsize | tuple | figure size as passed to matplotlib figure | (6,6) |
| dpi | int | resolution of the figure | 60 |
| ax | matplotlib.axes | axes object to use for plotting (if you want to define your own figure and subplots) | None - creates a separate figure |
| style | str | accepts matplotlib styles | 'tableau-colorblind10' |

Return

| Type | Description |
|---|---|
| matplotlib.axes object | ax |
cdf_plot
sapsan.utils.plot.cdf_plot(series: List[np.ndarray], bins: int = 100, label: Optional[List[str]] = None, figsize: tuple, dpi: int, ax: matplotlib.axes, ks: bool, style: str)
- Plot a cumulative distribution function (CDF) of a single or multiple datasets
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| series | List[np.ndarray] | input datasets | |
| bins | int | number of bins to use for the dataset to generate the CDF | 100 |
| label | List[str] | list of names to use as labels in the legend | None |
| figsize | tuple | figure size as passed to matplotlib figure | (6,6) |
| dpi | int | resolution of the figure | 60 |
| ax | matplotlib.axes | axes object to use for plotting (if you want to define your own figure and subplots) | None - creates a separate figure |
| ks | bool | if True, prints the Kolmogorov-Smirnov statistic on the plot itself; it will also be returned along with the ax object | False |
| style | str | accepts matplotlib styles | 'tableau-colorblind10' |

Return

| Type | Description |
|---|---|
| matplotlib.axes object, float (if ks==True) | ax, ks (if ks==True) |
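A plotting sketch comparing two distributions:

```python
import numpy as np
from sapsan.utils.plot import pdf_plot, cdf_plot

target = np.random.normal(0, 1, 10000)
predict = np.random.normal(0.1, 1.1, 10000)

ax = pdf_plot([target, predict], bins=100, label=['target', 'predict'])
ax, ks_stat = cdf_plot([target, predict], label=['target', 'predict'], ks=True)
```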
line_plot
sapsan.utils.plot.line_plot(series: List[np.ndarray], bins: int = 100, label: Optional[List[str]] = None, plot_type: str, figsize: tuple, dpi: int, ax: matplotlib.axes, style: str)
- Plot linear data of x vs. y - the same matplotlib formatting will be used as in the other plots

Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| series | List[np.ndarray] | input datasets | |
| bins | int | number of bins to use for the dataset to generate the pdf | 100 |
| label | List[str] | list of names to use as labels in the legend | None |
| plot_type | str | axis type of the matplotlib plot; options = ['plot', 'semilogx', 'semilogy', 'loglog'] | 'plot' |
| figsize | tuple | figure size as passed to matplotlib figure | (6,6) |
| linestyle | List[str] | list of linestyles to use for each profile of the matplotlib figure | ['-'] (solid line) |
| dpi | int | resolution of the figure | 60 |
| ax | matplotlib.axes | axes object to use for plotting (if you want to define your own figure and subplots) | None - creates a separate figure |
| style | str | accepts matplotlib styles | 'tableau-colorblind10' |

Return

| Type | Description |
|---|---|
| matplotlib.axes object | ax |
slice_plot
sapsan.utils.plot.slice_plot(series: List[np.ndarray], label: Optional[List[str]] = None, cmap = 'plasma', figsize: tuple, dpi: int, ax: matplotlib.axes)
- Plot 2D spatial distributions (slices) of your target and prediction datasets. Colorbar limits for both slices are set based on the minimum and maximum of the 2nd (target) provided dataset.
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| series | List[np.ndarray] | input datasets | |
| label | List[str] | list of names to use as labels in the legend | None |
| cmap | str | matplotlib colormap to use | 'plasma' |
| figsize | tuple | figure size as passed to matplotlib figure | (6,6) |
| dpi | int | resolution of the figure | 60 |
| ax | matplotlib.axes | axes object to use for plotting (if you want to define your own figure and subplots). WARNING: only works if a single image is supplied to slice_plot(), otherwise it will be ignored | None - creates a separate figure |

Return

| Type | Description |
|---|---|
| matplotlib.axes object | ax |
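A slicing-and-plotting sketch; the random cubes stand in for Evaluate results reshaped to [Cin, D, H, W]:

```python
import numpy as np
from sapsan.utils.shapes import slice_of_cube
from sapsan.utils.plot import slice_plot

predict_cube = np.random.rand(1, 16, 16, 16)   # stand-in prediction
target_cube = np.random.rand(1, 16, 16, 16)    # stand-in target

predict_slice = slice_of_cube(predict_cube, feature=0, n_slice=0)
target_slice = slice_of_cube(target_cube, feature=0, n_slice=0)

# colorbar limits are set from the 2nd (target) dataset
ax = slice_plot([predict_slice, target_slice],
                label=['predict', 'target'], cmap='plasma')
```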
log_plot
sapsan.utils.plot.log_plot(show_log = True, log_path = 'logs/logs/train.csv', valid_log_path = 'logs/logs/valid.csv', delimiter=',', train_name = 'train_loss', valid_name = 'valid_loss', train_column = 1, valid_column = 1, epoch_column = 0)
- Plots an interactive training log of train_loss vs. epoch with plotly
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| show_log | bool | show the loss vs. epoch progress plot (it will be saved in MLflow in either case) | True |
| log_path | str | path to the training log produced by the catalyst wrapper | 'logs/logs/train.csv' |
| valid_log_path | str | path to the validation log produced by the catalyst wrapper | 'logs/logs/valid.csv' |
| delimiter | str | delimiter to use for numpy.genfromtxt data loading | ',' |
| train_name | str | name for the training label | 'train_loss' |
| valid_name | str | name for the validation label | 'valid_loss' |
| train_column | int | column to load for training data from log_path | 1 |
| valid_column | int | column to load for validation data from valid_log_path | 1 |
| epoch_column | int | column to load the epoch index from log_path. If None, the epoch will be generated from the number of entries | 0 |

Return

| Type | Description |
|---|---|
| plotly.express object | plot figure |
model_graph
sapsan.utils.plot.model_graph(model, shape: np.array, transforms)
- Creates a graph of the ML model (needs graphviz to be installed). A tutorial is available on the wiki: Model Graph
- The method is based on hiddenlayer originally written by Waleed Abdulla.
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| model | object | initialized pytorch or tensorflow model | |
| shape | np.array | shape of the input array in the form [N, Cin, Db, Hb, Wb], where Cin=1 | |
| transforms | list[methods] | a list of hiddenlayer transforms to be applied (Fold, FoldId, Prune, PruneBranch, FoldDuplicates, Rename), defined in transforms.py | see below |
Default Parameters
Return

| Type | Description |
|---|---|
| graphviz.Digraph object | SVG graph of the model |
Physics
ReynoldsStress
sapsan.utils.physics.ReynoldsStress(u, filt, filt_size, only_x_components=False)
- Calculates a stress tensor of the form

\[ \tau_{ij} = \widetilde{u_i u_j} - \tilde{u}_i\,\tilde{u}_j \]

- where \(\tilde{u}\) is the filtered \(u\)

Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| u | np.ndarray | input velocity in 3D - [axis, D, H, W] | |
| filt | sapsan.utils.filters | the type of filter to use (spectral, box, gaussian). Pass the filter itself by loading the appropriate one from sapsan.utils.filters | gaussian |
| filt_size | int or float | size of the filter to apply. The size is defined differently for each filter type: spectral - Fourier mode to filter down to, box - ksize (box size), gaussian - sigma | 2 (sigma=2 for gaussian) |
| only_x_components | bool | calculates and outputs only the x components of the tensor in shape [row, D, H, W] - calculating all 9 can be taxing on memory | False |

Return

| Type | Description |
|---|---|
| np.ndarray | stress tensor of shape [column, row, D, H, W] |
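A usage sketch; the filter import path follows the Filter section above:

```python
import numpy as np
from sapsan.utils.filter import gaussian
from sapsan.utils.physics import ReynoldsStress

u = np.random.rand(3, 64, 64, 64)                      # [axis, D, H, W]
stress = ReynoldsStress(u, filt=gaussian, filt_size=2)  # [column, row, D, H, W]
```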
PowerSpectrum
CLASS
sapsan.utils.physics.PowerSpectrum(u: np.ndarray)
- Sets up to produce a power spectrum
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| u | np.ndarray | input velocity (the first dimension must be the axis = [1, 2, or 3]; e.g. the shape for 3D velocity should be [axis, D, H, W]) | |
sapsan.utils.physics.PowerSpectrum.calculate()
- Calculates the power spectrum
Return

| Type | Description |
|---|---|
| np.ndarray, np.ndarray | k_bins (Fourier modes), Ek_bins (E(k)) |
sapsan.utils.physics.PowerSpectrum.spectrum_plot(k_bins, Ek_bins, kolmogorov=True, kl_a)
- Plots the calculated power spectrum
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| k_bins | np.ndarray | Fourier mode values along the x-axis | |
| Ek_bins | np.ndarray | energy as a function of k: E(k) | |
| kolmogorov | bool | plots the scaled Kolmogorov -5/3 spectrum alongside the calculated one | True |
| kl_a | float | scaling factor of Kolmogorov's law | np.amax(self.Ek_bins)*1e1 |

Return

| Type | Description |
|---|---|
| matplotlib.axes object | spectrum plot |
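A spectrum sketch on random velocity data:

```python
import numpy as np
from sapsan.utils.physics import PowerSpectrum

u = np.random.rand(3, 64, 64, 64)   # [axis, D, H, W]
ps = PowerSpectrum(u)
k_bins, Ek_bins = ps.calculate()
ax = ps.spectrum_plot(k_bins, Ek_bins, kolmogorov=True)
```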
GradientModel
CLASS
sapsan.utils.physics.GradientModel(u: np.ndarray, filter_width, delta_u = 1)
- Sets up to apply a gradient turbulence subgrid model:

\[ \tau_{ij} = \frac{\Delta^2}{12}\,\frac{\partial u_i^*}{\partial x_k}\frac{\partial u_j^*}{\partial x_k} \]

- where \(\Delta\) is the filter width and \(u^*\) is the filtered \(u\)

Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| u | np.ndarray | input filtered quantity in 3D - [axis, D, H, W] | |
| filter_width | float | width of the filter which was applied onto u | |
| delta_u | float | distance between the points on the grid to use for scaling | 1 |
sapsan.utils.physics.GradientModel.gradient()
- Calculates the gradient of the given input data from GradientModel

Return

| Type | Description |
|---|---|
| np.ndarray | gradient with shape [column, row, D, H, W] |
sapsan.utils.physics.GradientModel.model()
- Calculates the gradient model tensor with shape [column, row, D, H, W]

Return

| Type | Description |
|---|---|
| np.ndarray | gradient model tensor |
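A usage sketch; random data stands in for an already-filtered velocity field, and the filter width is illustrative:

```python
import numpy as np
from sapsan.utils.physics import GradientModel

u_filtered = np.random.rand(3, 64, 64, 64)   # [axis, D, H, W]
gm = GradientModel(u_filtered, filter_width=4, delta_u=1)
tensor = gm.model()                          # [column, row, D, H, W]
```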
DynamicSmagorinskyModel
CLASS
sapsan.utils.physics.DynamicSmagorinskyModel(u: np.ndarray, filt, original_filt_size, filt_ratio, du, delta_u)
- Sets up to apply a Dynamic Smagorinsky (DS) turbulence subgrid model:

\[ \tau_{ij} = -2\,(C_s\Delta)^2\,|S^*|\,S_{ij}^* \]

- where \(\Delta\) is the filter width and \(S^*\) is the strain rate of the filtered \(u\)

Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| u | np.ndarray | input filtered quantity either in 3D [axis, D, H, W] or 2D [axis, D, H] | |
| du | np.ndarray | gradient of u | None: if du is not provided, it will be calculated with np.gradient() |
| filt | sapsan.utils.filters | the type of filter to use (spectral, box, gaussian). Pass the filter itself by loading the appropriate one from sapsan.utils.filters | spectral |
| original_filt_size | int | width of the filter which was applied onto u | 15 (spectral, Fourier modes = 15) |
| delta_u | float | distance between the points on the grid to use for scaling | 1 |
| filt_ratio | float | the ratio of the additional filter that will be applied on the data to find the slope for the Dynamic Smagorinsky extrapolation, relative to original_filt_size | 0.5 |
sapsan.utils.physics.DynamicSmagorinskyModel.model()
- Calculates the DS model tensor with shape [column, row, D, H, W]

Return

| Type | Description |
|---|---|
| np.ndarray | DS model tensor |
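A usage sketch; random data stands in for a filtered field, and the gradient `du` is left to be computed internally with np.gradient():

```python
import numpy as np
from sapsan.utils.filter import spectral
from sapsan.utils.physics import DynamicSmagorinskyModel

u_filtered = np.random.rand(3, 64, 64, 64)   # [axis, D, H, W]
dsm = DynamicSmagorinskyModel(u_filtered, filt=spectral,
                              original_filt_size=15, filt_ratio=0.5)
tensor = dsm.model()                         # [column, row, D, H, W]
```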