SDK ML Core API

Pyramid of inputs

class metavision_ml.core.pyramid.Pyramid(num_levels, mode='bilinear', align_corners=True)

Holds a Pyramid of Resized Tensors

Parameters

num_levels – number of scales
mode – mode of interpolation
align_corners – see Pytorch Documentation of interpolate

compute_all_levels(inputs)

Computes all levels of Pyramid

Parameters: inputs (torch.Tensor) – batch of shape (num_tbins, batch_size, channels, height, width)

compute_level(inputs, level)

Computes One Level of Resize

Parameters

inputs (torch.Tensor) – batch of shape (num_tbins, batch_size, channels, height, width)
level (int) – level of pyramid
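To give a feel for what a multi-scale pyramid produces, the sketch below lists the spatial size at each level. It assumes that level 0 is the full resolution and that every level halves the previous one; this factor and the helper name pyramid_sizes are assumptions made for illustration and are not taken from the Pyramid implementation.

```python
def pyramid_sizes(height, width, num_levels):
    """Return the (height, width) of each level, assuming a factor-2 pyramid."""
    sizes = []
    for level in range(num_levels):
        scale = 2 ** level  # assumption: resolution is halved at every level
        sizes.append((max(1, height // scale), max(1, width // scale)))
    return sizes


sizes = pyramid_sizes(240, 320, num_levels=3)
```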

Module implementing functional Unet style networks.

class metavision_ml.core.unet.Unet(down_block, middle_block, up_block, n_input_channels=5, down_channel_counts=[32, 64], middle_channel_count=128, up_channel_counts=[64, 32, 8])

Generic Unet implementation.

D       ->       U
 \ D    ->    U /
  \ D   ->  U /
   \ MIDDLE /

Parameters

down_block (function) – function (input_channels, output_channels) -> torch.nn.Module used to create each layer of the encoder. Any spatial downsampling is expected to be part of the Module (i.e. the module needs pooling or a strided convolution if the Unet is to have an hourglass shape).
middle_block (function) – function (input_channels, output_channels) -> torch.nn.Module used to instantiate the middle part of the Unet architecture. For instance, this is a good place to put a recurrent layer. If downsampling is used in the encoder, this part has the lowest spatial resolution.
up_block (function) – function (input_channels, output_channels) -> torch.nn.Module used to create each layer of the decoder. Its module is run after any upsampling, so that its spatial resolution is the same as the corresponding decoding layer.
n_input_channels (int) – Number of channels in the input of the Unet model.
down_channel_counts (list of int) – Number of filters in the “down” layers of the encoder.
middle_channel_count (int) – Number of filters in the middle layer.
up_channel_counts (list of int) – Number of filters in the “up” layers of the decoder.
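The three block arguments are factories that take channel counts and return a module. A minimal sketch of such factories is given below. The layer choices (3x3 convolutions, ReLU, a stride of 2 for downsampling) are illustrative assumptions and not the blocks shipped with the SDK.

```python
import torch
from torch import nn


def down_block(in_channels, out_channels):
    # Strided convolution: the downsampling lives inside the block itself.
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=2, padding=1),
        nn.ReLU(),
    )


def middle_block(in_channels, out_channels):
    # Runs at the lowest resolution; a recurrent layer could be used here instead.
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
        nn.ReLU(),
    )


def up_block(in_channels, out_channels):
    # No resampling here: upsampling is described as happening before this block.
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
        nn.ReLU(),
    )


x = torch.zeros(1, 5, 16, 16)      # (B, C, H, W) with the default n_input_channels=5
features = down_block(5, 32)(x)    # 32 is the first default entry of down_channel_counts
```

Factories of this shape match the documented (input_channels, output_channels) -> torch.nn.Module contract and could be passed as the first three arguments of Unet.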

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

metavision_ml.core.unet.interpolate(tensor: torch.Tensor, size: List[int]) → torch.Tensor

Generic interpolation for TNCHW and NCHW Tensors.
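One common way to resize both layouts with a single function is to fold the time dimension into the batch dimension before calling torch.nn.functional.interpolate. The sketch below shows that idea. The helper name interpolate_any, the bilinear mode and align_corners=True are assumptions and may differ from what the SDK function does internally.

```python
from typing import List

import torch
import torch.nn.functional as F


def interpolate_any(tensor: torch.Tensor, size: List[int]) -> torch.Tensor:
    """Bilinearly resize an NCHW or TNCHW tensor to size = [height, width]."""
    if tensor.dim() == 4:  # NCHW: interpolate handles it directly
        return F.interpolate(tensor, size=size, mode="bilinear", align_corners=True)
    if tensor.dim() == 5:  # TNCHW: fold time into the batch dimension
        t, n, c, h, w = tensor.shape
        flat = tensor.reshape(t * n, c, h, w)
        out = F.interpolate(flat, size=size, mode="bilinear", align_corners=True)
        return out.reshape(t, n, c, size[0], size[1])
    raise ValueError("expected a 4D (NCHW) or 5D (TNCHW) tensor")


seq = torch.rand(3, 2, 5, 8, 8)          # (T, N, C, H, W)
resized = interpolate_any(seq, [16, 16])
```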

Module implementing functional Unet style networks with a regression map.

class metavision_ml.core.unet_variants.RegressorHead(block, in_channels, out_channels, n_output_channels, kernel=3, stride=1, padding=1)

Performs a dense regression after a feature computation.

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(inp: torch.Tensor) → Tuple[torch.Tensor, torch.Tensor]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class metavision_ml.core.unet_variants.UnetRegressor(down_block, middle_block, up_block, n_input_channels=5, n_output_channels=2, down_channel_counts=[32, 64], middle_channel_count=128, up_channel_counts=[64, 32, 8])

Generic Unet implementation computing a regression map at different scales.

Schematics:

ENCODER ->   DECODER
D       ->          U  ->
\ D     ->       U / ->
 \ D    ->     U / ->
  \ D   ->    U / ->
   \ MIDDLE   /

This model works with either (B, C, H, W) tensors or (T, B, C, H, W) tensors.

Parameters

down_block (function) – function (input_channels, output_channels) -> torch.nn.Module used to create each layer of the encoder. Any spatial downsampling is expected to be part of the Module (i.e. the module needs pooling or a strided convolution if the Unet is to have an hourglass shape).
middle_block (function) – function (input_channels, output_channels) -> torch.nn.Module used to instantiate the middle part of the Unet architecture. For instance, this is a good place to put a recurrent layer. If downsampling is used in the encoder, this part has the lowest spatial resolution.
up_block (function) – function (input_channels, output_channels) -> torch.nn.Module used to create each layer of the decoder. Its module is run after any upsampling, so that its spatial resolution is the same as the corresponding decoding layer.
n_input_channels (int) – Number of channels in the input of the Unet model.
n_output_channels (int) – Number of channels in the output feature map of the Unet model.
down_channel_counts (list of int) – Number of filters in the “down” layers of the encoder.
middle_channel_count (int) – Number of filters in the middle layer.
up_channel_counts (list of int) – Number of filters in the “up” layers of the decoder.

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x) → List[torch.Tensor]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Layers of Neural network involving a displacement field, or interpolation.

metavision_ml.core.warping.compute_iwe(events_np, flow, t, duration_flow=None, assert_target_time_strict=True)

Computes an image of warped events (IWE) given a numpy array of events

Parameters

events_np (np.array) – numpy array of EventCD
flow (torch.Tensor) – size (2, H, W) expressed in pixels
t (int) – timestamp to warp to
duration_flow (int) – duration of the chunk of events over which the flow is computed
assert_target_time_strict (boolean) – if True, the target timestamp t must be in the range covered by events_np

metavision_ml.core.warping.compute_iwe_torch(xt, yt, tt, pt, flow, t, duration_flow)

Compute an image of warped events (IWE) given torch tensors of events

Parameters

xt (torch.Tensor) – x (n,)
yt (torch.Tensor) – y (n,)
tt (torch.Tensor) – t (n,)
pt (torch.Tensor) – p (n,)
flow (torch.Tensor) – size (2, H, W) expressed in pixels
t (int) – timestamp to warp to
duration_flow (int) – duration of the chunk of events over which the flow is computed

metavision_ml.core.warping.event_image(xs, ys, ps, height, width, interpolation='bilinear')

Differentiable Image creation from events

Parameters

xs (torch.Tensor) – x values (n,)
ys (torch.Tensor) – y values (n,)
ps (torch.Tensor) – polarities (or weights) to apply
height (int) – height of the image
width (int) – width of the image
interpolation (string) – either ‘bilinear’ or ‘nearest’

metavision_ml.core.warping.interpolate_to_image(pxs, pys, dxs, dys, weights, img)

Accumulate x and y coords to an image using bilinear interpolation

Parameters

pxs (torch.Tensor) – pixel x (n,)
pys (torch.Tensor) – pixel y (n,)
dxs (torch.Tensor) – decimal part in x (n,)
dys (torch.Tensor) – decimal part in y (n,)
weights (torch.Tensor) – values to interpolate (n,)
img (torch.Tensor) – output image is updated (height, width)
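Bilinear accumulation spreads each value over the four pixels surrounding its sub-pixel position, with coefficients given by the decimal parts. The sketch below works through that arithmetic in plain Python on nested lists so it runs without PyTorch. The helper name accumulate_bilinear and the choice to drop out-of-image neighbours are assumptions; the SDK function operates on tensors.

```python
def accumulate_bilinear(pxs, pys, dxs, dys, weights, img):
    """Splat each weight onto the 4 pixels around (px + dx, py + dy), in place."""
    height, width = len(img), len(img[0])
    for px, py, dx, dy, w in zip(pxs, pys, dxs, dys, weights):
        # Neighbour offsets paired with their bilinear coefficients.
        corners = (
            (0, 0, (1 - dx) * (1 - dy)),
            (1, 0, dx * (1 - dy)),
            (0, 1, (1 - dx) * dy),
            (1, 1, dx * dy),
        )
        for ox, oy, coeff in corners:
            x, y = px + ox, py + oy
            if 0 <= x < width and 0 <= y < height:  # drop out-of-image samples
                img[y][x] += w * coeff
    return img


img = [[0.0] * 3 for _ in range(3)]
# One value of weight 2.0 located at x = 0.25, y = 1.5.
accumulate_bilinear([0], [1], [0.25], [0.5], [2.0], img)
```

Because the four coefficients sum to one, the total of the image equals the sum of the weights as long as no neighbour falls outside the image.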

metavision_ml.core.warping.warp_events(x, y, t, flow, t0)

Moves Events Directly

Parameters

x (torch.Tensor) – x (n,)
y (torch.Tensor) – y (n,)
t (torch.Tensor) – time (n,)
flow (torch.Tensor) – (2,height,width)
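Moving events along a flow field amounts to displacing each event by the flow at its pixel, scaled by the time remaining until the reference time. The plain-Python sketch below illustrates this under stated assumptions: flow[0] and flow[1] hold the x and y velocities in pixels per time unit, and the displacement is velocity * (t0 - t). The units and sign convention of the SDK function are not given in this page, so treat both as hypothetical.

```python
def warp_events_sketch(xs, ys, ts, flow, t0):
    """Move each event along the flow at its pixel to the reference time t0."""
    warped_x, warped_y = [], []
    for x, y, t in zip(xs, ys, ts):
        vx = flow[0][int(y)][int(x)]  # assumed: x velocity in pixels per time unit
        vy = flow[1][int(y)][int(x)]  # assumed: y velocity in pixels per time unit
        dt = t0 - t
        warped_x.append(x + vx * dt)
        warped_y.append(y + vy * dt)
    return warped_x, warped_y


# 2x2 flow field moving 0.5 pixel per time unit to the right.
flow = [[[0.5, 0.5], [0.5, 0.5]], [[0.0, 0.0], [0.0, 0.0]]]
wx, wy = warp_events_sketch([0, 1], [0, 1], [0, 2], flow, t0=2)
```

Both events end up in the same column once warped to t0, which is the alignment effect an image of warped events relies on.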

Layers of Neural network involving a displacement field, or interpolation.

class metavision_ml.core.warp_modules.Warping(mode='bilinear')

Differentiable warping module using bilinear sampling. This calls the internal functions above while storing a grid. When the resolution changes, the grid is reallocated. This module handles sequences (T,B,C,H,W tensors).

grid

grid used for interpolation, saved to avoid reallocations.

Type: torch.FloatTensor

Parameters: mode (string) – interpolation mode (can be bilinear or nearest).

sharpen_micro_tbin(img, flow, micro_tbin, is_on_off_volume=True, align_corners=True, flow_is_normalized=False)

Applies flow to an input tensor with on/off channels to warp all micro bins onto one.

This has the effect of reducing apparent motion blur (provided the flow is correct). Unlike the function sharpen, this is only applicable to input features that have micro time bins, that is, channels that are computed per slice of time within a delta_t (for instance event_cube).

Parameters

img (torch.Tensor) – size (num_time_bins, batch_size, channel_count, height, width)
flow – size (num_tbins x batch_size, 2, height, width) in pixels/bin
micro_tbin (int) – index of the time bin to warp to (from zero to num_tbins -1)
is_on_off_volume (bool) – whether input channels are organized into 2 groups of “off” and “on” channels. This happens when you set the split_polarity option to True in a preprocessing.
align_corners (bool) – See Pytorch Documentation for the grid_sample function
flow_is_normalized (bool) – flow is in [-2,2], no call to normalize_flow

warp_sequence_by_one_time_bin(img, flow, align_corners=True, flow_is_normalized=False)

Uses the displacement to warp the image.

If the displacement doesn’t match the existing grid attributes, the grid is regenerated.

Parameters

img (torch.Tensor) – size (num_tbins, batch_size, channel_count, height, width)
flow (torch.Tensor) – size (num_tbins x batch_size, 2, height, width), flow in pixels per time bin
align_corners (bool) – See Pytorch Documentation for the grid_sample function
flow_is_normalized (bool) – flow is in [-2,2], no call to normalize_flow

metavision_ml.core.warp_modules.make_grid2d(height, width, device='cpu:0')

Generates a 2d Grid

Parameters

height – image height
width – image width

metavision_ml.core.warp_modules.normalize_flow(flow)

Normalizes the flow in pixel units to a flow in [-2,2] for use with grid_sample.

Parameters: flow – tensor of shape (B,2,H,W)
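The [-2,2] range follows from the coordinate convention of grid_sample: positions are expressed in [-1,1], so the largest possible displacement between two positions has magnitude 2. The sketch below shows the per-axis conversion for a single displacement value. The helper name normalize_displacement is hypothetical, and whether normalize_flow divides by size - 1 or by size is an assumption tied to the align_corners setting.

```python
def normalize_displacement(d_pixels, size, align_corners=True):
    """Convert a displacement in pixels into grid_sample's normalized units."""
    # grid_sample maps [-1, 1] onto `size` pixels: the corner pixel centres are
    # the endpoints when align_corners=True, the outer pixel edges otherwise.
    span = (size - 1) if align_corners else size
    return 2.0 * d_pixels / span


full_width = normalize_displacement(319, 320)  # first to last column of a 320-wide image
one_pixel = normalize_displacement(1, 5)
```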

metavision_ml.core.warp_modules.warp_backward_using_forward_flow(img, flow, grid=None, align_corners=True, mode='bilinear', flow_is_normalized=False)

Generates previous images given a set of next images and forward flow in pixels or in [-2,2] if flow_is_normalized is set to True.

Parameters

img (torch.Tensor) – size (batch_size, channels, height, width)
flow (torch.Tensor) – size (batch_size, 2, height, width)
grid (torch.Tensor) – normalized meshgrid coordinates.
align_corners (bool) – See Pytorch Documentation for the grid_sample function
mode (string) – mode of interpolation used
flow_is_normalized (bool) – flow is in [-2,2], no call to normalize_flow
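Backward warping with a forward flow can be written as sampling the next image at each pixel position shifted by the flow. A minimal sketch with torch.nn.functional.grid_sample follows. The helper name backward_warp, the sign convention (previous(x) = next(x + flow(x))) and the align_corners=True normalization are assumptions; the SDK function may cache its grid and handle these details differently.

```python
import torch
import torch.nn.functional as F


def backward_warp(img, flow):
    """Sample img at (x + flow_x, y + flow_y); flow is (B, 2, H, W) in pixels."""
    b, _, h, w = img.shape
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij"
    )
    # grid_sample expects (B, H, W, 2) with the last axis ordered as (x, y).
    grid = torch.stack((xs, ys), dim=-1).unsqueeze(0).expand(b, h, w, 2)
    # Pixels -> normalized units (align_corners=True convention).
    flow_norm = torch.stack(
        (2 * flow[:, 0] / (w - 1), 2 * flow[:, 1] / (h - 1)), dim=-1
    )
    return F.grid_sample(img, grid + flow_norm, mode="bilinear", align_corners=True)


img = torch.arange(5.0).repeat(1, 1, 4, 1)  # (1, 1, 4, 5), every row is 0..4
flow = torch.zeros(1, 2, 4, 5)
flow[:, 0] = 1.0                            # forward flow: 1 pixel to the right
prev = backward_warp(img, flow)
```

With this constant flow, each output column holds the value of the column to its right in the input. The last column samples outside the image and receives the default zero padding of grid_sample.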

metavision_ml.core.warp_modules.warp_forward_using_backward_flow(img, flow, grid=None, align_corners=True, mode='bilinear', flow_is_normalized=False)

Generates next images given a set of previous images and backward flow in pixels or in [-2,2] if flow_is_normalized is set to True.

Note: forward warping using the backward flow is the same computation as backward warping using the forward flow

Parameters

img (torch.Tensor) – size (batch_size, channels, height, width)
flow (torch.Tensor) – size (batch_size, 2, height, width)
grid (torch.Tensor) – normalized meshgrid coordinates.
align_corners (bool) – See Pytorch Documentation for the grid_sample function
mode (string) – mode of interpolation used
flow_is_normalized (bool) – flow is in [-2,2], no call to normalize_flow

metavision_ml.core.warp_modules.warp_forward_using_forward_flow(img, flow, grid=None, align_corners=True, mode='bilinear', flow_is_normalized=False)

Generates next images given a set of images and forward flow in pixels or in [-2,2] if flow_is_normalized is set to True.

Parameters

img (torch.Tensor) – size (batch_size, channels, height, width)
flow (torch.Tensor) – size (batch_size, 2, height, width)
grid (torch.Tensor) – normalized meshgrid coordinates.
align_corners (bool) – See Pytorch Documentation for the grid_sample function
mode (string) – mode of interpolation used
flow_is_normalized (bool) – flow is in [-2,2], no call to normalize_flow

metavision_ml.core.warp_modules.warp_to_micro_tbin(inputs, flow, micro_tbin, grid=None, align_corners=True, mode='bilinear', flow_is_normalized=False)

Similar to “warp_to_tbin”, but here we consider a sequence of spatio-temporal volumes with the shape [T,B,C,D,H,W]. We assume a constant flow per time bin (each time bin includes multiple micro time bins).

Parameters

inputs (torch.Tensor) – size (num_time_bins, batch_size, channels, depth, height, width)
flow (torch.Tensor) – size (num_tbins, batch_size, 2, height, width) normalized in [-2,2]
micro_tbin (int) – index of the time bin to warp to (from zero to num_tbins -1)
grid (torch.Tensor) – normalized meshgrid coordinates.
align_corners (bool) – See Pytorch Documentation for the grid_sample function
mode (string) – mode of interpolation used.
flow_is_normalized (bool) – flow is in [-2,2], no call to normalize_flow

metavision_ml.core.warp_modules.warp_to_tbin(tensor, flow, tbin, grid=None, align_corners=True, mode='bilinear', flow_is_normalized=False)

Applies flow to an input tensor to warp all time bins onto one. This has the effect of reducing apparent motion blur (provided the flow is correct). We assume a constant flow during num_tbins steps per sample.

Parameters

tensor (torch.Tensor) – size (num_tbins, batch_size, channels, height, width).
flow (torch.Tensor) – size (batch_size, 2, height, width) normalized in [-2,2]
tbin (int) – index of the time bin to warp to (from zero to num_tbins -1).
grid (torch.Tensor) – normalized meshgrid coordinates.
align_corners (bool) – See Pytorch Documentation for the grid_sample function
mode (string) – mode of interpolation used.
flow_is_normalized (bool) – flow is in [-2,2], no call to normalize_flow