SDK Core ML Core API

Modules

Reusable building blocks for neural networks.

class metavision_core_ml.core.modules.ConvLayer(in_channels, out_channels, kernel_size=3, stride=1, padding=1, dilation=1, bias=True, norm='BatchNorm2d', activation='ReLU', separable=False, **kwargs)

Building Block Convolution Layer

Parameters

in_channels (int) – number of input channels
out_channels (int) – number of output channels
kernel_size (int) – conv receptive field
stride (int) – conv stride
padding (int) – conv padding
dilation (int) – conv dilation
bias (bool) – whether or not to add a bias
norm (str) – type of the normalization
activation (str) – type of non-linear activation
separable (bool) – whether to use separable convolution
**kwargs – Additional keyword arguments passed to the convolution operator.

Initialize internal Module state, shared by both nn.Module and ScriptModule.
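As a rough illustration of what the parameters above control, the following plain-PyTorch sketch chains a convolution, a normalization layer, and an activation, with the last two looked up by name in torch.nn. It is not the library's implementation: the helper name conv_block and the conv, norm, activation ordering are assumptions, and the separable option is left out.

```python
import torch
import torch.nn as nn


def conv_block(in_channels, out_channels, kernel_size=3, stride=1, padding=1,
               dilation=1, bias=True, norm="BatchNorm2d", activation="ReLU"):
    """Hypothetical helper: conv -> norm -> activation (the ordering is an assumption)."""
    conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride=stride,
                     padding=padding, dilation=dilation, bias=bias)
    # Resolve the layer classes from their string names, e.g. nn.BatchNorm2d, nn.ReLU.
    norm_layer = getattr(nn, norm)(out_channels)
    act_layer = getattr(nn, activation)()
    return nn.Sequential(conv, norm_layer, act_layer)


block = conv_block(5, 16, stride=2)
x = torch.randn(2, 5, 32, 32)
y = block(x)  # stride 2 halves the spatial size: (2, 16, 16, 16)
```

Passing the layer types as strings keeps the block configurable from plain configuration values, which is presumably why norm and activation are typed as str above.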

class metavision_core_ml.core.modules.DepthWiseSeparableConv2d(in_channels, out_channels, kernel_size=1, stride=1, padding=0, dilation=1, bias=False, depth_multiplier=1, **kwargs)

Depthwise separable convolution: a depthwise convolution followed by a pointwise 1x1 convolution.

A convolution is called depthwise separable when the normal convolution is split into two convolutions: depthwise convolution and pointwise convolution.

Parameters

in_channels (int) – number of input channels
out_channels (int) – number of output channels
kernel_size (int) – separable conv receptive field
stride (int) – separable conv stride.
padding (int) – separable conv padding.
depth_multiplier (int) – Factor by which we multiply the in_channels to get the number of output_channels in the depthwise convolution.
**kwargs – Additional keyword arguments passed to the first convolution operator.

Initialize internal Module state, shared by both nn.Module and ScriptModule.
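The split into depthwise and pointwise convolutions described above can be sketched in plain PyTorch as follows. This is a minimal sketch rather than the library's code: the class name SeparableConvSketch is hypothetical, and a 3x3 kernel is used for illustration instead of the documented default of 1. The parameter count comparison shows why the factorization is attractive.

```python
import torch
import torch.nn as nn


class SeparableConvSketch(nn.Module):
    """Hypothetical sketch: depthwise conv followed by a 1x1 pointwise conv."""

    def __init__(self, in_channels, out_channels, kernel_size=3, stride=1,
                 padding=1, depth_multiplier=1, bias=False):
        super().__init__()
        mid_channels = in_channels * depth_multiplier
        # groups=in_channels gives every input channel its own spatial filters.
        self.depthwise = nn.Conv2d(in_channels, mid_channels, kernel_size,
                                   stride=stride, padding=padding,
                                   groups=in_channels, bias=bias)
        # The 1x1 convolution mixes information across channels.
        self.pointwise = nn.Conv2d(mid_channels, out_channels, kernel_size=1, bias=bias)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))


sep = SeparableConvSketch(8, 16)
full = nn.Conv2d(8, 16, kernel_size=3, padding=1, bias=False)
n_sep = sum(p.numel() for p in sep.parameters())    # 8*3*3 + 8*16 = 200
n_full = sum(p.numel() for p in full.parameters())  # 8*16*3*3 = 1152
y = sep(torch.randn(2, 8, 32, 32))
```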

class metavision_core_ml.core.modules.PreActBlock(in_channels, out_channels, stride=1)

Squeeze-Excite Block from: Squeeze-and-Excitation Networks (Hu et al.)

Parameters

in_channels (int) – number of input channels.
out_channels (int) – number of output channels.
stride (int) – convolution stride.

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(x)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class metavision_core_ml.core.modules.ResBlock(in_channels, out_channels, stride=1, norm='BatchNorm2d')

Residual Convolutional Block

Parameters

in_channels (int) – number of input channels
out_channels (int) – number of output channels
stride (int) – convolutional stride
norm (str) – type of normalization

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(x)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Temporal Modules

Layers involving recursion or temporal aspects.

class metavision_core_ml.core.temporal_modules.ConvGRUCell(in_channels, out_channels, kernel_size=3, padding=1, conv_func=<class 'torch.nn.modules.conv.Conv2d'>, hard=False, stride=1, dilation=1)

ConvGRUCell module, applies sequential part of the Gated Recurrent Unit.

GRU with matrix multiplication replaced by convolution. See Chung, Junyoung, et al., “Empirical evaluation of gated recurrent neural networks on sequence modeling.”

Parameters

in_channels (int) – number of input channels.
out_channels (int) – number of output_channels of hidden state.
kernel_size (int) – internal convolution receptive field.
padding (int) – padding parameter for the convolution
conv_func (fun) – functional that you can replace if you want to interact with your 2D state differently.
hard (bool) – applies hard gates.

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(xt): xt size: (T, B,C,H,W) return size: (T, B,C’,H,W)

reset(mask)

Sets the memory (or hidden state) to zero, normally at the beginning of a new sequence.

reset() needs to be called at the beginning of a new sequence. The mask is here to indicate which elements of the batch are indeed new sequences.
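To make the interplay between forward and reset concrete, here is a minimal plain-PyTorch sketch of a convolutional GRU that keeps its hidden state between calls. It is an illustration under stated assumptions, not the library's implementation: the class name TinyConvGRU and the attribute prev_h are hypothetical, and the mask convention (0 clears the state of a batch element, 1 keeps it) is an assumption.

```python
import torch
import torch.nn as nn


class TinyConvGRU(nn.Module):
    """Hypothetical sketch of a stateful GRU whose gates are convolutions."""

    def __init__(self, in_channels, out_channels, kernel_size=3, padding=1):
        super().__init__()
        self.out_channels = out_channels
        cat_channels = in_channels + out_channels
        # One convolution yields both the update (z) and reset (r) gates.
        self.conv_gates = nn.Conv2d(cat_channels, 2 * out_channels, kernel_size, padding=padding)
        self.conv_cand = nn.Conv2d(cat_channels, out_channels, kernel_size, padding=padding)
        self.prev_h = None  # hidden state carried across forward calls

    def forward(self, xt):
        # xt: (T, B, C, H, W) -> (T, B, C', H, W)
        outputs = []
        for x in xt.unbind(0):
            if self.prev_h is None:
                self.prev_h = x.new_zeros(x.shape[0], self.out_channels, *x.shape[2:])
            h = self.prev_h
            gates = torch.sigmoid(self.conv_gates(torch.cat([x, h], dim=1)))
            z, r = gates.chunk(2, dim=1)
            cand = torch.tanh(self.conv_cand(torch.cat([x, r * h], dim=1)))
            self.prev_h = (1 - z) * h + z * cand
            outputs.append(self.prev_h)
        return torch.stack(outputs)

    def reset(self, mask):
        # Assumed convention: mask[b] == 0 zeroes the state of batch element b.
        if self.prev_h is not None:
            self.prev_h = self.prev_h.detach() * mask.view(-1, 1, 1, 1)


cell = TinyConvGRU(3, 8)
y = cell(torch.randn(4, 2, 3, 16, 16))
cell.reset(torch.tensor([0.0, 1.0]))  # only the first sequence starts over
```

Detaching the state in reset also cuts the autograd graph, so gradients do not flow across sequence boundaries.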

class metavision_core_ml.core.temporal_modules.ConvLSTMCell(hidden_dim, kernel_size, conv_func=<class 'torch.nn.modules.conv.Conv2d'>, hard=False)

ConvLSTMCell module, applies sequential part of LSTM.

LSTM with matrix multiplication replaced by convolution. See Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting (Shi et al.).

Parameters

hidden_dim (int) – number of output_channels of hidden state.
kernel_size (int) – internal convolution receptive field.
conv_func (fun) – functional that you can replace if you want to interact with your 2D state differently.
hard (bool) – applies hard gates.

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(x)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

reset(mask)

Sets the memory (or hidden state) to zero, normally at the beginning of a new sequence.

reset() needs to be called at the beginning of a new sequence. The mask is here to indicate which elements of the batch are indeed new sequences.

reset_all(): Resets memory for all sequences in one batch.

class metavision_core_ml.core.temporal_modules.ConvRNN(in_channels, out_channels, kernel_size=3, stride=1, padding=1, dilation=1, cell='lstm', separable=False, separable_hidden=False, **kwargs)

ConvRNN module. Combines a recurrent convolutional cell (ConvLSTM or ConvGRU, selected with cell) with a feed-forward convolution layer.

Parameters

in_channels (int) – number of input channels
out_channels (int) – number of output channels
kernel_size (int) – separable conv receptive field
stride (int) – separable conv stride.
padding (int) – padding.
separable (bool) – if True, uses depthwise separable convolution for the forward convolutional layer.
separable_hidden (bool) – if True, uses depthwise separable convolution for the hidden convolutional layer.
cell (str) – RNN cell type; only 'gru' and 'lstm' are currently supported.
**kwargs – additional parameters for the feed forward convolutional layer.

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(x)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

reset(mask=tensor([0.]))

Resets memory of the network.

class metavision_core_ml.core.temporal_modules.RNNCell(hard)

Abstract class that has memory, serving as a base class for memory layers.

Parameters: hard (bool) – Applies hard gates to memory updates function.

Initialize internal Module state, shared by both nn.Module and ScriptModule.

class metavision_core_ml.core.temporal_modules.SequenceWise(module, ndims=5)

Wrapper Module that allows the wrapped Module to be applied to sequential 5-dimensional Tensors of shape (num_time_bins, batch_size, channel_count, height, width)

module

Module to wrap so that a non-sequential model can be applied to 5-dimensional tensors.

Type: torch.nn.Module

Parameters: module (torch.nn.Module) – Module to wrap so that a non-sequential model can be applied to 5-dimensional tensors.

forward(x)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class metavision_core_ml.core.temporal_modules.VideoSequential(*args)

Wrapper Module that allows a torch.nn.Sequential object to be called on 5-dimensional Tensors of shape (num_time_bins, batch_size, channel_count, height, width)

The difference with SequenceWise is that this handles a list of modules. You can build it like a Sequential object.

Example

>> video_net = VideoSequential(nn.Conv2d(3,16,3,1,1), nn.ReLU())
>> t,b,c,h,w = 3,2,3,128,128
>> x = torch.randn(t,b,c,h,w)
>> y = video_net(x)

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(x)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

metavision_core_ml.core.temporal_modules.batch_to_time(x: torch.Tensor, n: int) → torch.Tensor

Reverts a 5 dimensional Tensor that has been collapsed with time_to_batch to its original form

Parameters

x (torch.Tensor) – shape (num_time_bins * batch_size, channel_count, height, width)
n (int) – batch size, i.e. the number of separate sequences that are part of the Tensor.

Returns

Tensor of shape (num_time_bins, batch_size, channel_count, height, width)

Return type

torch.Tensor

metavision_core_ml.core.temporal_modules.seq_wise(function)

Decorator to apply 4-dimensional tensor functions to 5-dimensional temporal tensor inputs.
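One way such a decorator could work is sketched below in plain PyTorch: fold the time dimension into the batch dimension, call the wrapped function, and unfold the result. The name seq_wise_sketch and the pass-through behavior for 4-dimensional inputs are assumptions made for illustration; the library's decorator may differ.

```python
import functools

import torch
import torch.nn.functional as F


def seq_wise_sketch(function):
    """Hypothetical decorator: run a 4D-tensor function on a (T, B, C, H, W) input."""
    @functools.wraps(function)
    def wrapper(x, *args, **kwargs):
        if x.dim() == 4:
            # Already (N, C, H, W): call the function directly.
            return function(x, *args, **kwargs)
        t, b = x.shape[:2]
        y = function(x.reshape(t * b, *x.shape[2:]), *args, **kwargs)
        # Restore the leading (T, B) dimensions on the result.
        return y.reshape(t, b, *y.shape[1:])
    return wrapper


@seq_wise_sketch
def downsample(x):
    # F.avg_pool2d does not accept 5D input, hence the wrapper.
    return F.avg_pool2d(x, 2)


y = downsample(torch.randn(3, 2, 4, 16, 16))
```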

metavision_core_ml.core.temporal_modules.time_to_batch(x: torch.Tensor) → Tuple[torch.Tensor, int]

Collapses a five dimensional Tensor to four dimensional tensor by putting sequence samples in the batch dimension.

Parameters: x (torch.Tensor) – shape (num_time_bins, batch_size, channel_count, height, width)
Returns: x (torch.Tensor) – shape (num_time_bins * batch_size, channel_count, height, width); batch_size (int) – number of separate sequences that are part of the Tensor.
Return type: Tuple[torch.Tensor, int]
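The round trip between the two layouts can be sketched with plain reshapes. The functions below are stand-ins written for illustration (the _sketch names are hypothetical) and only mirror the documented shapes; they are not the library's code.

```python
import torch


def time_to_batch_sketch(x):
    """(T, B, C, H, W) -> ((T * B, C, H, W), B)."""
    t, b = x.shape[:2]
    return x.reshape(t * b, *x.shape[2:]), b


def batch_to_time_sketch(x, n):
    """(T * B, C, H, W) with batch size n -> (T, B, C, H, W)."""
    return x.reshape(x.shape[0] // n, n, *x.shape[1:])


x = torch.randn(3, 2, 5, 8, 8)
flat, batch_size = time_to_batch_sketch(x)          # flat: (6, 5, 8, 8)
restored = batch_to_time_sketch(flat, batch_size)   # back to (3, 2, 5, 8, 8)
```

Returning the batch size alongside the collapsed tensor is what makes the inverse well defined, since T and B cannot be recovered from their product alone.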

Unet

Base U-Net code, after U-Net: Convolutional Networks for Biomedical Image Segmentation (Olaf Ronneberger, Philipp Fischer, Thomas Brox).

Notes:

- The user is responsible for creating the layers; they should take in_channels and out_channels as arguments (these must be pre-filled).
- The user is responsible for making sure spatial sizes agree.

class metavision_core_ml.core.unet.Unet(encoders, decoders, down, up)

Ultra-Generic Unet

Parameters

encoders – list of encoder layers
decoders – list of decoder layers
down – layer to resize input
up – layer to resize + merge

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(x)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
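The following plain-PyTorch sketch shows how a generic U-Net of this kind could wire user-supplied layers together. It is not the library's implementation: the class name UnetSketch, the choice to treat the bottleneck as the last encoder, and the convention that up receives the current feature map and the skip tensor are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class UnetSketch(nn.Module):
    """Hypothetical generic U-Net: the caller supplies every layer."""

    def __init__(self, encoders, decoders, down, up):
        super().__init__()
        self.encoders = nn.ModuleList(encoders)
        self.decoders = nn.ModuleList(decoders)
        self.down = down
        self.up = up

    def forward(self, x):
        skips = [x]  # the input itself is the outermost skip connection
        for encoder in self.encoders:
            skips.append(encoder(self.down(skips[-1])))
        y = skips.pop()  # bottleneck output
        for decoder in self.decoders:
            y = decoder(self.up(y, skips.pop()))
        return y


def conv(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU())


def up(y, skip):
    # Resize to the skip's spatial size, then concatenate along channels.
    y = F.interpolate(y, size=skip.shape[-2:], mode="nearest")
    return torch.cat([skip, y], dim=1)


net = UnetSketch(
    encoders=[conv(5, 32), conv(32, 64), conv(64, 128)],
    decoders=[conv(64 + 128, 64), conv(32 + 64, 32), conv(5 + 32, 8)],
    down=nn.AvgPool2d(2),
    up=up,
)
out = net(torch.randn(2, 5, 32, 32))
```

Because the layers are created by the caller, each decoder's in_channels must already account for the concatenated skip channels, which is the bookkeeping the notes above leave to the user.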

metavision_core_ml.core.unet.unet_layers(down_block, middle_block, up_block, input_size=5, down_filter_sizes=[32, 64], middle_filter_size=128, up_filter_sizes=[64, 32, 8])

Helper that builds the layers of a U-Net. Its use is optional.

Here we make sure to connect the last upsampled feature-map to the input!

X - Y = Conv([X,Up(U2)])
 \   /
  D1 - U2 = Conv([D1,Up(U1)])
   \   /
    D2 - U1 = Conv([D2,Up(M)])
     \   /
      D3 - M

All block types are partial functions expecting in_channels, out_channels as first two parameters.

Parameters

down_block – encoder’s block type
middle_block – bottleneck’s block type
up_block – decoder’s block type
input_size – in_channels
down_filter_sizes – out_channels per encoder
middle_filter_size – bottleneck’s channels
up_filter_sizes – decoder’s channels
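The channel bookkeeping implied by the diagram and the default arguments can be worked through in a few lines of Python. This is a sketch under the assumption that each decoder concatenates a skip feature map with the upsampled previous output along the channel axis; the function name unet_channel_plan is hypothetical and the library's internal logic may differ.

```python
def unet_channel_plan(input_size=5, down_filter_sizes=(32, 64),
                      middle_filter_size=128, up_filter_sizes=(64, 32, 8)):
    """Hypothetical helper: (in_channels, out_channels) for every block."""
    # One extra decoder connects the last upsampled map back to the input.
    assert len(up_filter_sizes) == len(down_filter_sizes) + 1

    skips = [input_size]  # channels available as skip connections: X, D1, D2, ...
    encoders = []
    for cout in down_filter_sizes:
        encoders.append((skips[-1], cout))
        skips.append(cout)

    middle = (skips[-1], middle_filter_size)

    decoders = []
    prev = middle_filter_size
    for skip, cout in zip(reversed(skips), up_filter_sizes):
        # Decoder input = skip channels + channels of the upsampled previous output.
        decoders.append((skip + prev, cout))
        prev = cout
    return encoders, middle, decoders


encoders, middle, decoders = unet_channel_plan()
# encoders: [(5, 32), (32, 64)]
# middle:   (64, 128)
# decoders: [(192, 64), (96, 32), (37, 8)]
```

The last decoder takes 5 + 32 = 37 input channels, which reflects the connection of the last upsampled feature map to the input mentioned above.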