SDK Core ML Core API

Modules

Reusable building blocks for neural networks.

class metavision_core_ml.core.modules.ConvLayer(in_channels, out_channels, kernel_size=3, stride=1, padding=1, dilation=1, bias=True, norm='BatchNorm2d', activation='ReLU', separable=False, **kwargs)

Building Block Convolution Layer

Parameters
  • in_channels (int) – number of input channels

  • out_channels (int) – number of output channels

  • kernel_size (int) – conv receptive field

  • stride (int) – conv stride

  • padding (int) – conv padding

  • dilation (int) – conv dilation

  • bias (bool) – whether or not to add a bias

  • norm (str) – type of the normalization

  • activation (str) – type of non-linear activation

  • separable (bool) – whether to use separable convolution

  • **kwargs – Additional keyword arguments passed to the convolution operator.

Initializes internal Module state, shared by both nn.Module and ScriptModule.
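As a rough guide to how kernel_size, stride, padding and dilation interact, the sketch below computes the spatial output size with the standard formula documented for torch.nn.Conv2d. It assumes that ConvLayer forwards these arguments unchanged to a Conv2d-like operator and that the normalization and activation leave the spatial size untouched. conv_output_size is a hypothetical helper, not part of metavision_core_ml.

```python
def conv_output_size(size, kernel_size=3, stride=1, padding=1, dilation=1):
    """Spatial output size of a 2D convolution along one axis.

    Uses the formula from the torch.nn.Conv2d documentation:
    floor((size + 2*padding - dilation*(kernel_size - 1) - 1) / stride) + 1
    Defaults mirror the ConvLayer signature (assumed to be passed through as-is).
    """
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


same = conv_output_size(128)                 # defaults keep the size
halved = conv_output_size(128, stride=2)     # stride 2 halves it
dilated = conv_output_size(128, dilation=2)  # dilation 2 with padding 1 shrinks it
```

With the default arguments the spatial size is preserved. When dilation grows, padding has to grow with it to keep that property.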

class metavision_core_ml.core.modules.DepthWiseSeparableConv2d(in_channels, out_channels, kernel_size=1, stride=1, padding=0, dilation=1, bias=False, depth_multiplier=1, **kwargs)

Depthwise Separable Convolution followed by pointwise 1x1 Convolution.

A convolution is called depthwise separable when the normal convolution is split into two convolutions: depthwise convolution and pointwise convolution.

Parameters
  • in_channels (int) – number of input channels

  • out_channels (int) – number of output channels

  • kernel_size (int) – separable conv receptive field

  • stride (int) – separable conv stride.

  • padding (int) – separable conv padding.

  • depth_multiplier (int) – Factor by which we multiply the in_channels to get the number of output_channels in the depthwise convolution.

  • **kwargs – Additional keyword arguments passed to the first convolution operator.

Initializes internal Module state, shared by both nn.Module and ScriptModule.
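To see why the split into depthwise and pointwise convolutions is attractive, the sketch below compares weight counts. It assumes the usual construction: a depthwise convolution with groups equal to in_channels producing in_channels * depth_multiplier channels, followed by a 1x1 pointwise convolution. Biases are ignored and both helper names are hypothetical.

```python
def standard_conv_weights(in_channels, out_channels, kernel_size):
    """Weights of a dense k x k convolution (bias ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels


def separable_conv_weights(in_channels, out_channels, kernel_size, depth_multiplier=1):
    """Weights of a depthwise k x k convolution followed by a pointwise 1x1 convolution.

    Assumption: the depthwise stage has one k x k filter per output channel
    (groups == in_channels), giving in_channels * depth_multiplier channels.
    """
    mid_channels = in_channels * depth_multiplier
    depthwise = kernel_size * kernel_size * mid_channels
    pointwise = mid_channels * out_channels
    return depthwise + pointwise


dense = standard_conv_weights(32, 64, 3)
separable = separable_conv_weights(32, 64, 3)
```

For 32 input channels, 64 output channels and a 3x3 kernel, the dense layer has 18432 weights and the separable one has 2336 under these assumptions.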

class metavision_core_ml.core.modules.PreActBlock(in_channels, out_channels, stride=1)

Squeeze-Excite Block from: Squeeze-and-Excitation Networks (Hu et al.)

Parameters
  • in_channels (int) – number of input channels.

  • out_channels (int) – number of output channels.

  • stride (int) – convolution stride.

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class metavision_core_ml.core.modules.ResBlock(in_channels, out_channels, stride=1, norm='BatchNorm2d')

Residual Convolutional Block

Parameters
  • in_channels (int) – number of input channels

  • out_channels (int) – number of output channels

  • stride (int) – convolutional stride

  • norm (str) – type of normalization

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Temporal Modules

Layers involving recursion or temporal aspects.

class metavision_core_ml.core.temporal_modules.ConvGRUCell(in_channels, out_channels, kernel_size=3, padding=1, conv_func=<class 'torch.nn.modules.conv.Conv2d'>, hard=False, stride=1, dilation=1)

ConvGRUCell module, applies sequential part of the Gated Recurrent Unit.

GRU with matrix multiplication replaced by convolution. See Chung, Junyoung, et al., “Empirical evaluation of gated recurrent neural networks on sequence modeling.”

Parameters
  • in_channels (int) – number of input channels.

  • out_channels (int) – number of output_channels of hidden state.

  • kernel_size (int) – internal convolution receptive field.

  • padding (int) – padding parameter for the convolution

  • conv_func (fun) – functional that you can replace if you want to interact with your 2D state differently.

  • hard (bool) – applies hard gates.

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(xt)

xt size: (T, B, C, H, W); return size: (T, B, C’, H, W)

reset(mask)

Sets the memory (or hidden state) to zero, normally at the beginning of a new sequence.

reset() needs to be called at the beginning of a new sequence. The mask is here to indicate which elements of the batch are indeed new sequences.

class metavision_core_ml.core.temporal_modules.ConvLSTMCell(hidden_dim, kernel_size, conv_func=<class 'torch.nn.modules.conv.Conv2d'>, hard=False)

ConvLSTMCell module, applies sequential part of LSTM.

LSTM with matrix multiplication replaced by convolution. See Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting (Shi et al.)

Parameters
  • hidden_dim (int) – number of output_channels of hidden state.

  • kernel_size (int) – internal convolution receptive field.

  • conv_func (fun) – functional that you can replace if you want to interact with your 2D state differently.

  • hard (bool) – applies hard gates.

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

reset(mask)

Sets the memory (or hidden state) to zero, normally at the beginning of a new sequence.

reset() needs to be called at the beginning of a new sequence. The mask is here to indicate which elements of the batch are indeed new sequences.

reset_all()

Resets memory for all sequences in one batch.

class metavision_core_ml.core.temporal_modules.ConvRNN(in_channels, out_channels, kernel_size=3, stride=1, padding=1, dilation=1, cell='lstm', separable=False, separable_hidden=False, **kwargs)

ConvRNN module. ConvLSTM cell followed by a feed forward convolution layer.

Parameters
  • in_channels (int) – number of input channels

  • out_channels (int) – number of output channels

  • kernel_size (int) – separable conv receptive field

  • stride (int) – separable conv stride.

  • padding (int) – padding.

  • separable (boolean) – if True, uses depthwise separable convolution for the forward convolutional layer.

  • separable_hidden (boolean) – if True, uses depthwise separable convolution for the hidden convolutional layer.

  • cell (string) – RNN cell type; only 'gru' and 'lstm' are currently supported.

  • **kwargs – additional parameters for the feed forward convolutional layer.

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

reset(mask=tensor([0.]))

Resets memory of the network.
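The mask convention is not spelled out here. Since reset defaults to mask=tensor([0.]) and is described as resetting the memory, one plausible reading is a multiplicative mask in which 0 marks a sequence whose state is cleared and 1 marks a sequence whose state is kept. The pure-Python sketch below illustrates that reading only. masked_reset is a hypothetical helper, and plain floats stand in for per-sequence hidden states.

```python
def masked_reset(hidden, mask):
    """Multiply each per-sequence state by its mask entry.

    Assumed convention (not confirmed by the documentation):
    0.0 -> new sequence, state cleared; 1.0 -> ongoing sequence, state kept.
    A single-element mask is broadcast to the whole batch.
    """
    if len(mask) == 1:
        mask = mask * len(hidden)
    if len(mask) != len(hidden):
        raise ValueError("mask must have one entry per batch element")
    return [h * m for h, m in zip(hidden, mask)]


hidden = [0.5, 1.2, 3.0]                               # one state per batch element
partial = masked_reset(hidden, [1.0, 0.0, 1.0])        # only the second sequence restarts
everything = masked_reset(hidden, [0.0])               # mirrors the default mask
```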

class metavision_core_ml.core.temporal_modules.RNNCell(hard)

Abstract class with memory, serving as a base class for memory layers.

Parameters

hard (bool) – Applies hard gates to memory updates function.

Initializes internal Module state, shared by both nn.Module and ScriptModule.

class metavision_core_ml.core.temporal_modules.SequenceWise(module, ndims=5)

Wrapper Module that allows the wrapped Module to be applied to 5-dimensional sequential tensors of shape (num_time_bins, batch_size, channel_count, height, width).

module

Module to wrap so that a non-sequential model can be applied to a 5-dimensional tensor.

Type

torch.nn.Module

Parameters

module (torch.nn.Module) – Module to wrap so that a non-sequential model can be applied to a 5-dimensional tensor.

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class metavision_core_ml.core.temporal_modules.VideoSequential(*args)

Wrapper Module that allows a torch.nn.Sequential object to be called on 5-dimensional tensors of shape (num_time_bins, batch_size, channel_count, height, width).

Unlike SequenceWise, this handles a list of modules. You can build it like a Sequential object.

Example

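>>> import torch
>>> import torch.nn as nn
>>> from metavision_core_ml.core.temporal_modules import VideoSequential
>>> video_net = VideoSequential(nn.Conv2d(3, 16, 3, 1, 1), nn.ReLU())
>>> t, b, c, h, w = 3, 2, 3, 128, 128
>>> x = torch.randn(t, b, c, h, w)
>>> y = video_net(x)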

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

metavision_core_ml.core.temporal_modules.batch_to_time(x: torch.Tensor, n: int) → torch.Tensor

Reverts a 5 dimensional Tensor that has been collapsed with time_to_batch to its original form

Parameters
  • x (torch.Tensor) – shape (num_time_bins * batch_size, channel_count, height, width)

  • n (int) – batch size, i.e. the number of separate sequences in the Tensor.

Returns

shape (num_time_bins, batch_size, channel_count, height, width)

Return type

x (torch.Tensor)

metavision_core_ml.core.temporal_modules.seq_wise(function)

Decorator to apply 4 dimensional tensor functions on 5 dimensional temporal tensor input.
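The idea behind such a decorator can be sketched without any tensor library: fold the time dimension into the batch dimension, call the wrapped function once, then unfold the result. In the sketch below nested lists indexed as [time][batch] stand in for 5-dimensional tensors. seq_wise_sketch is a hypothetical stand-in and not the library's implementation.

```python
from functools import wraps


def seq_wise_sketch(function):
    """Let a function written for a flat batch accept [time][batch] nested input."""
    @wraps(function)
    def wrapper(x):
        num_time_bins = len(x)
        batch_size = len(x[0])
        # Collapse (T, B, ...) into (T * B, ...), time-major.
        flat = [sample for step in x for sample in step]
        out = function(flat)
        # Restore the (T, B, ...) layout.
        return [out[t * batch_size:(t + 1) * batch_size] for t in range(num_time_bins)]
    return wrapper


@seq_wise_sketch
def double(batch):
    """Toy per-sample operation that knows nothing about time."""
    return [2 * value for value in batch]


x = [[1, 2], [3, 4], [5, 6]]  # T = 3 time bins, B = 2 sequences
y = double(x)
```

The wrapped function is called once on all T * B samples rather than once per time bin, which is the main reason for this pattern.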

metavision_core_ml.core.temporal_modules.time_to_batch(x: torch.Tensor) → Tuple[torch.Tensor, int]

Collapses a five dimensional Tensor to four dimensional tensor by putting sequence samples in the batch dimension.

Parameters

x (torch.Tensor) – (num_time_bins, batch_size, channel_count, height, width)

Returns

  • x (torch.Tensor) – shape (num_time_bins * batch_size, channel_count, height, width)

  • batch_size (int) – number of separate sequences in the Tensor.

Return type

Tuple[torch.Tensor, int]
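The shape bookkeeping of time_to_batch and batch_to_time can be checked with a few lines of plain Python. The helpers below operate on shape tuples only and are hypothetical. The index mapping assumes the collapse is a plain row-major reshape, which the documentation does not state explicitly.

```python
def time_to_batch_shape(shape):
    """(T, B, C, H, W) -> ((T * B, C, H, W), B)."""
    t, b, c, h, w = shape
    return (t * b, c, h, w), b


def batch_to_time_shape(shape, batch_size):
    """(T * B, C, H, W) -> (T, B, C, H, W), given the batch size B."""
    n, c, h, w = shape
    if n % batch_size != 0:
        raise ValueError("first dimension must be a multiple of batch_size")
    return (n // batch_size, batch_size, c, h, w)


def flat_index(t, b, batch_size):
    """Row of sample (t, b) in the collapsed tensor, assuming a row-major reshape."""
    return t * batch_size + b


original = (3, 2, 3, 128, 128)
collapsed, batch_size = time_to_batch_shape(original)
restored = batch_to_time_shape(collapsed, batch_size)
```

Here collapsed is (6, 3, 128, 128) and restored equals original, so the two functions are inverses as long as the batch size returned by time_to_batch is passed back to batch_to_time.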

Unet

Base U-Net code, after “U-Net: Convolutional Networks for Biomedical Image Segmentation” by Olaf Ronneberger, Philipp Fischer, and Thomas Brox.

Notes
  • The user is responsible for creating the layers; they should take in_channels and out_channels as arguments (these must be pre-filled).

  • The user is responsible for making sure spatial sizes agree.

class metavision_core_ml.core.unet.Unet(encoders, decoders, down, up)

Ultra-Generic Unet

Parameters
  • encoders – list of encoder layers

  • decoders – list of decoder layers

  • down – layer to resize input

  • up – layer to resize + merge

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

metavision_core_ml.core.unet.unet_layers(down_block, middle_block, up_block, input_size=5, down_filter_sizes=[32, 64], middle_filter_size=128, up_filter_sizes=[64, 32, 8])

Builds U-Net layers. This helper can be used to build the layers passed to Unet, but you are not required to use it.

Here we make sure to connect the last upsampled feature-map to the input!

X  -  Y = Conv([X,Up(U2)])
 \   /
  D1 - U2 = Conv([D1,Up(U1)])
   \   /
    D2 - U1 = Conv([D2,Up(M)])
     \   /
      D3 - M

All block types are partial functions expecting in_channels, out_channels as first two parameters.

Parameters
  • down_block – encoder’s block type

  • middle_block – bottleneck’s block type

  • up_block – decoder’s block type

  • input_size – in_channels

  • down_filter_sizes – out_channels per encoder

  • middle_filter_size – bottleneck’s channels

  • up_filter_sizes – decoder’s channels
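To make the channel bookkeeping concrete, the sketch below derives the (in_channels, out_channels) pair each block would receive for the default arguments. It assumes that the merge written as Conv([A, Up(B)]) concatenates A and the upsampled B along the channel axis, and that the last decoder merges with the network input as described above. unet_channel_plan is a hypothetical helper and does not reproduce the library's code.

```python
def unet_channel_plan(input_size=5, down_filter_sizes=(32, 64),
                      middle_filter_size=128, up_filter_sizes=(64, 32, 8)):
    """Return (encoders, middle, decoders) as (in_channels, out_channels) pairs."""
    if len(up_filter_sizes) != len(down_filter_sizes) + 1:
        raise ValueError("expected one more decoder than encoders (input skip)")

    skips = [input_size]  # channels available for skip connections, input first
    encoders = []
    cin = input_size
    for cout in down_filter_sizes:
        encoders.append((cin, cout))
        skips.append(cout)
        cin = cout

    middle = (cin, middle_filter_size)

    decoders = []
    previous = middle_filter_size
    # Deepest skip is merged first; the network input is merged last.
    for skip, cout in zip(reversed(skips), up_filter_sizes):
        decoders.append((skip + previous, cout))  # assumed channel concatenation
        previous = cout
    return encoders, middle, decoders


encoders, middle, decoders = unet_channel_plan()
```

Under these assumptions the defaults give encoders 5→32 and 32→64, a bottleneck 64→128, and decoders 192→64, 96→32 and 37→8.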