# Custom Estimator
Sapsan makes it easy to get started on designing your own ML model layer-by-layer.
## Command-line Interface (CLI)
Here is the easiest way to get started; replace `{name}` with your custom project name.
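The exact invocation below is an assumption based on Sapsan's README (the `sapsan create` entry point with a `--name` flag); double-check it against `sapsan --help` in your installed version:

```shell
sapsan create --name {name}
```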
This will create the full structure for your project in template form. You will primarily focus on designing your ML model (the estimator). You will find the template for it in
The template is structured to utilize a custom backend, `sapsan.lib.estimator.torch_backend.py`, hence it revolves around PyTorch. In the template, you will define the layers you want to use, the order in which they should be executed, and a few custom model parameters (optimizer, loss function, scheduler). Since this is PyTorch, refer to its API to define your layers.
## Estimator Template
- `{name}Model`
    - define your ML layers
    - `forward` function (layer order)
- `{name}Config`
    - set parameters (e.g. number of epochs), usually set through a high-level interface (e.g. a Jupyter notebook)
    - add custom parameters to be tracked by MLflow
- `{name}`
    - set the optimizer
    - set the loss function
    - set the scheduler
    - set the model (based on `{name}Model` & `{name}Config`)
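The sketch below shows how these three pieces fit together in plain PyTorch. `Awesome` stands in for `{name}`, and all class internals (layer sizes, the `parameters` dict, scheduler choice) are illustrative assumptions, not Sapsan's actual generated template, which inherits from the torch backend; consult the generated file for the exact signatures.

```python
import torch
import torch.nn as nn

class AwesomeModel(nn.Module):
    """Stands in for {name}Model: define your layers and their order."""
    def __init__(self, input_size: int = 16, hidden_size: int = 32):
        super().__init__()
        # define your ML layers (illustrative choices, not Sapsan defaults)
        self.layers = nn.Sequential(
            nn.Linear(input_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # the forward function fixes the order in which layers are executed
        return self.layers(x)

class AwesomeConfig:
    """Stands in for {name}Config: parameters usually set from a notebook."""
    def __init__(self, n_epochs: int = 10, lr: float = 1e-3):
        self.n_epochs = n_epochs
        self.lr = lr
        # custom parameters gathered here could be handed to MLflow for tracking
        self.parameters = {"n_epochs": n_epochs, "lr": lr}

class Awesome:
    """Stands in for {name}: wires the model to optimizer, loss, and scheduler."""
    def __init__(self, config: AwesomeConfig):
        self.config = config
        self.model = AwesomeModel()
        self.optimizer = torch.optim.Adam(self.model.parameters(), lr=config.lr)
        self.loss_func = nn.MSELoss()
        self.scheduler = torch.optim.lr_scheduler.StepLR(self.optimizer, step_size=5)

# quick smoke test of the sketch
estimator = Awesome(AwesomeConfig(n_epochs=5))
output = estimator.model(torch.randn(4, 16))
```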
## Editing Catalyst Runner
For the majority of applications, you won't need to touch the Catalyst Runner settings, which are located in `torch_backend.py`. However, if you would like to dig further into more unique loss functions, optimizers, or data-distribution setups, you can copy `torch_backend.py` via the `--get_torch_backend` flag, or its shorthand `--gtb`, during the creation of the project:
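The flags come from the text above; the `sapsan create --name` invocation itself is the same assumption as earlier:

```shell
sapsan create --name {name} --get_torch_backend
# or, with the shorthand flag
sapsan create --name {name} --gtb
```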
For runner types and extensive options, please refer to the Catalyst Documentation.
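For orientation, a custom Catalyst runner typically overrides `handle_batch`. The sketch below assumes the Catalyst 21.x API and is not Sapsan's actual runner from `torch_backend.py`:

```python
import torch
from catalyst import dl

class CustomRunner(dl.Runner):
    def handle_batch(self, batch):
        # unpack whatever your DataLoader yields
        x, y = batch
        logits = self.model(x)
        # expose tensors for criterion/metric callbacks to consume
        self.batch = {"features": x, "targets": y, "logits": logits}
```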
As for runner adjustments to parallelize your training, Sapsan's Wiki includes a page on Parallel GPU Training.
### Loss
Catalyst includes a more extensive list of losses (i.e. criterions) than standard PyTorch. Some of their implementations require extra `Callback`s to be specified in the runner (Criterion Documentation). Please refer to the Catalyst examples to create your own loss functions.
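In the simplest case, a custom loss is just an `nn.Module` whose `forward` maps model output and target to a scalar. The class name and weighting below are illustrative, not part of Sapsan or Catalyst:

```python
import torch
import torch.nn as nn

class WeightedMSELoss(nn.Module):
    """A hypothetical criterion: mean squared error scaled by a user weight."""
    def __init__(self, weight: float = 1.0):
        super().__init__()
        self.weight = weight

    def forward(self, output: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # standard MSE, scaled by a user-defined weight
        return self.weight * torch.mean((output - target) ** 2)
```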
### Optimizer
The optimizer works the same way. A standard optimizer (e.g. Adam) can be specified within the Estimator Template, but for a more complex or custom setup you will need to adjust the runner (Optimizer Documentation).
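A minimal sketch of the simple case, using a stand-in model in place of the `{name}Model` instance from the template; multi-optimizer or per-parameter-group setups would instead go through the runner:

```python
import torch

# stand-in for the {name}Model instance defined in the estimator template
model = torch.nn.Linear(4, 1)

# a standard optimizer can be set directly in the template
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```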