SDK ML Core API
Pyramid of inputs
- class metavision_ml.core.pyramid.Pyramid(num_levels, mode='bilinear', align_corners=True)
Holds a Pyramid of Resized Tensors
- Parameters
num_levels – number of scales
mode – mode of interpolation
align_corners – see the PyTorch documentation of torch.nn.functional.interpolate
- compute_all_levels(inputs)
Computes all levels of the pyramid
- Parameters
inputs (torch.Tensor) – batch of shape (num_tbins, batch_size, channels, height, width)
- compute_level(inputs, level)
Computes one level of the pyramid
- Parameters
inputs (torch.Tensor) – batch of shape (num_tbins, batch_size, channels, height, width)
level (int) – level of pyramid
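As a rough illustration of what a pyramid of resized tensors looks like, the sketch below resizes a (num_tbins, batch_size, channels, height, width) batch with torch.nn.functional.interpolate. The helper name resize_level and the choice of halving the resolution at each level are assumptions made for this example; they are not taken from the Pyramid implementation.

```python
import torch
import torch.nn.functional as F


def resize_level(inputs, level, mode="bilinear", align_corners=True):
    """Resize a (T, B, C, H, W) batch by an assumed factor of 1 / 2**level."""
    t, b, c, h, w = inputs.shape
    # F.interpolate expects (N, C, H, W), so time and batch are merged first.
    flat = inputs.reshape(t * b, c, h, w)
    size = (max(h >> level, 1), max(w >> level, 1))
    out = F.interpolate(flat, size=size, mode=mode, align_corners=align_corners)
    # Restore the leading (T, B) dimensions.
    return out.reshape(t, b, c, *size)


inputs = torch.rand(3, 2, 5, 16, 32)
levels = [resize_level(inputs, level) for level in range(3)]
shapes = [tuple(x.shape) for x in levels]
```

Here level 0 keeps the input resolution and each further level divides height and width by two.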
Module implementing functional Unet style networks.
- class metavision_ml.core.unet.Unet(down_block, middle_block, up_block, n_input_channels=5, down_channel_counts=[32, 64], middle_channel_count=128, up_channel_counts=[64, 32, 8])
Generic Unet implementation.
D -> U
 \ D -> U /
  \ D -> U /
   \ D -> U /
    \ MIDDLE /
- Parameters
down_block (function) – function (input_channels x output_channels) -> torch.nn.Module used to create each layer of the encoder. Any spatial downsampling must be part of the module itself (i.e. the module needs pooling or a strided convolution if the Unet is to have an hourglass shape).
middle_block (function) – function (input_channels x output_channels) -> torch.nn.Module used to instantiate the middle part of the Unet architecture. For instance, this is a good place to put a recurrent layer. If downsampling is used in the encoder, this part has the lowest spatial resolution.
up_block (function) – function (input_channels x output_channels) -> torch.nn.Module used to create each layer of the decoder. The module is run after any upsampling, so that its spatial resolution is the same as the corresponding decoding layer.
n_input_channels (int) – Number of channels in the input of the Unet model.
down_channel_counts (int List) – Number of filters in the “down” layers of the encoder.
middle_channel_count (int) – Number of filters in the middle layer.
up_channel_counts (int List) – Number of filters in the “up” layers of the decoder.
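To make the "function (input_channels x output_channels) -> torch.nn.Module" contract concrete, here is a minimal sketch of three block factories and a toy encoder/decoder that consumes them. TinyUnet, the factory bodies, the nearest-neighbour upsampling and the way skip connections are concatenated are all assumptions made for illustration; the library's Unet may wire its layers differently.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def down_block(cin, cout):
    # The strided convolution performs the spatial downsampling inside the block.
    return nn.Sequential(nn.Conv2d(cin, cout, 3, stride=2, padding=1), nn.ReLU())


def middle_block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU())


def up_block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU())


class TinyUnet(nn.Module):
    """Hypothetical toy model, not the metavision_ml implementation."""

    def __init__(self, n_input_channels=5, down_channel_counts=(32, 64),
                 middle_channel_count=128, up_channel_counts=(64, 32)):
        super().__init__()
        chans = [n_input_channels, *down_channel_counts]
        self.downs = nn.ModuleList(
            [down_block(a, b) for a, b in zip(chans[:-1], chans[1:])])
        self.middle = middle_block(chans[-1], middle_channel_count)
        ups, cin = [], middle_channel_count
        # Each decoder block sees the upsampled features plus one skip tensor.
        for skip, cout in zip(reversed(chans[:-1]), up_channel_counts):
            ups.append(up_block(cin + skip, cout))
            cin = cout
        self.ups = nn.ModuleList(ups)

    def forward(self, x):
        skips = []
        for down in self.downs:
            skips.append(x)
            x = down(x)
        x = self.middle(x)
        for up in self.ups:
            skip = skips.pop()
            # Upsample first, so the block runs at the skip's resolution.
            x = F.interpolate(x, size=skip.shape[-2:], mode="nearest")
            x = up(torch.cat([x, skip], dim=1))
        return x


out = TinyUnet()(torch.rand(2, 5, 16, 16))
```

Factories of this shape are what the down_block, middle_block and up_block parameters expect; the downsampling lives in down_block, which is what gives the network its hourglass shape.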
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)
Define the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- metavision_ml.core.unet.interpolate(tensor: torch.Tensor, size: List[int]) -> torch.Tensor
Generic interpolation for TNCHW and NCHW Tensors.
Module implementing functional Unet style networks with a regression map.
- class metavision_ml.core.unet_variants.RegressorHead(block, in_channels, out_channels, n_output_channels, kernel=3, stride=1, padding=1)
Performs a dense regression after a feature computation.
- forward(inp: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class metavision_ml.core.unet_variants.UnetRegressor(down_block, middle_block, up_block, n_input_channels=5, n_output_channels=2, down_channel_counts=[32, 64], middle_channel_count=128, up_channel_counts=[64, 32, 8])
Generic Unet implementation computing a regression map at different scales.
Schematics:
ENCODER -> DECODER
D -> U ->
 \ D -> U / ->
  \ D -> U / ->
   \ D -> U / ->
    \ MIDDLE /
This model works with either (B, C, H, W) tensors or (T, B, C, H, W) tensors.
- Parameters
down_block (function) – function (input_channels x output_channels) -> torch.nn.Module used to create each layer of the encoder. Any spatial downsampling must be part of the module itself (i.e. the module needs pooling or a strided convolution if the Unet is to have an hourglass shape).
middle_block (function) – function (input_channels x output_channels) -> torch.nn.Module used to instantiate the middle part of the Unet architecture. For instance, this is a good place to put a recurrent layer. If downsampling is used in the encoder, this part has the lowest spatial resolution.
up_block (function) – function (input_channels x output_channels) -> torch.nn.Module used to create each layer of the decoder. The module is run after any upsampling, so that its spatial resolution is the same as the corresponding decoding layer.
n_input_channels (int) – Number of channels in the input of the Unet model.
n_output_channels (int) – Number of channels in the output feature map of the Unet model.
down_channel_counts (int List) – Number of filters in the “down” layers of the encoder.
middle_channel_count (int) – Number of filters in the middle layer.
up_channel_counts (int List) – Number of filters in the “up” layers of the decoder.
- forward(x) -> List[torch.Tensor]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
Layers of Neural network involving a displacement field, or interpolation.
- metavision_ml.core.warping.compute_iwe(events_np, flow, t, duration_flow=None, assert_target_time_strict=True)
Computes an image of warped events (IWE) given a numpy array of events
- Parameters
events_np (np.array) – numpy array of EventCD
flow (torch.Tensor) – size (2, H, W) expressed in pixels
t (int) – timestamp to warp to
duration_flow (int) – duration of the chunk of events over which the flow is computed
assert_target_time_strict (boolean) – if True, the target timestamp t must be in the range covered by events_np
- metavision_ml.core.warping.compute_iwe_torch(xt, yt, tt, pt, flow, t, duration_flow)
Compute an image of warped events (IWE) given torch tensors of events
- Parameters
xt (torch.Tensor) – x (n,)
yt (torch.Tensor) – y (n,)
tt (torch.Tensor) – t (n,)
pt (torch.Tensor) – p (n,)
flow (torch.Tensor) – size (2, H, W) expressed in pixels
t (int) – timestamp to warp to
duration_flow (int) – duration of the chunk of events over which the flow is computed
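The idea behind an image of warped events can be sketched in plain Python: every event is moved along the flow to the target timestamp, then accumulated into an image. The function name, the nearest-pixel accumulation and, above all, the time scaling (flow read as the displacement in pixels over duration_flow) are assumptions made for this example; the library functions may use a different convention.

```python
def image_of_warped_events(xs, ys, ts, weights, flow, t, duration_flow, height, width):
    """Warp each event to time t along the flow and accumulate its weight."""
    flow_x, flow_y = flow  # two (height, width) nested lists, in pixels
    iwe = [[0.0] * width for _ in range(height)]
    for x, y, ti, w in zip(xs, ys, ts, weights):
        # Assumed convention: flow is the displacement over duration_flow.
        frac = (t - ti) / duration_flow
        xw = round(x + frac * flow_x[y][x])
        yw = round(y + frac * flow_y[y][x])
        # Events warped outside the image are dropped.
        if 0 <= xw < width and 0 <= yw < height:
            iwe[yw][xw] += w
    return iwe


# One point moving right by 1 pixel every 1000 us, observed for 4000 us.
xs, ys, ts = [0, 1, 2, 3], [0, 0, 0, 0], [0, 1000, 2000, 3000]
flow = ([[4.0] * 5], [[0.0] * 5])  # 4 pixels over the 4000 us chunk
iwe = image_of_warped_events(xs, ys, ts, [1.0] * 4, flow,
                             t=3000, duration_flow=4000, height=1, width=5)
# iwe == [[0.0, 0.0, 0.0, 4.0, 0.0]]
```

With a correct flow, all four events land on the same pixel, which is why a sharp image of warped events is commonly used as a sign that the flow estimate is good.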
- metavision_ml.core.warping.event_image(xs, ys, ps, height, width, interpolation='bilinear')
Differentiable Image creation from events
- Parameters
xs (torch.Tensor) – x values (n,)
ys (torch.Tensor) – y values (n,)
ps (torch.Tensor) – polarities (or weights) to apply
height (int) – height of the image
width (int) – width of the image
interpolation (string) – either ‘bilinear’ or ‘nearest’
- metavision_ml.core.warping.interpolate_to_image(pxs, pys, dxs, dys, weights, img)
Accumulate x and y coords to an image using bilinear interpolation
- Parameters
pxs (torch.Tensor) – pixel x (n,)
pys (torch.Tensor) – pixel y (n,)
dxs (torch.Tensor) – decimal part in x (n,)
dys (torch.Tensor) – decimal part in y (n,)
weights (torch.Tensor) – values to interpolate (n,)
img (torch.Tensor) – output image is updated (height, width)
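The accumulation described above can be sketched in plain Python: each value is distributed over the four pixels surrounding its sub-pixel position, with weights given by the decimal parts. The names splat_bilinear and split_coords are hypothetical, and nested lists stand in for tensors to keep the arithmetic explicit; this is not the library's implementation.

```python
import math


def split_coords(values):
    """Split float coordinates into integer pixel and decimal part."""
    pixels = [math.floor(v) for v in values]
    decimals = [v - p for v, p in zip(values, pixels)]
    return pixels, decimals


def splat_bilinear(pxs, pys, dxs, dys, weights, img):
    """Accumulate each weight onto its 4 neighbouring pixels, in place."""
    height, width = len(img), len(img[0])
    for px, py, dx, dy, w in zip(pxs, pys, dxs, dys, weights):
        corners = (
            (px, py, (1 - dx) * (1 - dy)),
            (px + 1, py, dx * (1 - dy)),
            (px, py + 1, (1 - dx) * dy),
            (px + 1, py + 1, dx * dy),
        )
        for cx, cy, k in corners:
            # Contributions that fall outside the image are ignored.
            if 0 <= cx < width and 0 <= cy < height:
                img[cy][cx] += w * k
    return img


# One event of weight 2.0 at the sub-pixel position x=1.25, y=0.5.
pxs, dxs = split_coords([1.25])
pys, dys = split_coords([0.5])
img = splat_bilinear(pxs, pys, dxs, dys, [2.0], [[0.0] * 4 for _ in range(3)])
# img[0] == [0.0, 0.75, 0.25, 0.0] and img[1] == [0.0, 0.75, 0.25, 0.0]
```

The four bilinear coefficients sum to one, so the total accumulated in the image equals the input weight as long as the event stays inside the image. That smooth dependence on position is what makes an event image built this way differentiable with respect to the event coordinates.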
- metavision_ml.core.warping.warp_events(x, y, t, flow, t0)
Moves Events Directly
- Parameters
x (torch.Tensor) – x (n,)
y (torch.Tensor) – y (n,)
t (torch.Tensor) – time (n,)
flow (torch.Tensor) – (2,height,width)
Layers of Neural network involving a displacement field, or interpolation.
- class metavision_ml.core.warp_modules.Warping(mode='bilinear')
Differentiable warping module using bilinear sampling. This calls the internal functions above while storing a grid. When the resolution changes, the grid is reallocated. This module handles sequences (T,B,C,H,W tensors).
- grid
grid used for interpolation, saved to avoid reallocations.
- Type
torch.FloatTensor
- Parameters
mode (string) – interpolation mode (can be bilinear or nearest).
- sharpen_micro_tbin(img, flow, micro_tbin, is_on_off_volume=True, align_corners=True, flow_is_normalized=False)
Applies flow to an input tensor with on/off channels to warp all micro bins onto one.
This has the effect of reducing apparent motion blur (provided the flow is correct). Unlike the function sharpen, this is only applicable to input features that have micro time bins, i.e. channels that are computed per slice of time within a delta_t (for instance event_cube).
- Parameters
img (torch.Tensor) – size (num_time_bins, batch_size, channel_count, height, width)
flow – size (num_tbins x batch_size, 2, height, width) in pixels/bin
micro_tbin (int) – index of the time bin to warp to (from zero to num_tbins -1)
is_on_off_volume (bool) – whether the input channels are organized into 2 groups of “off” and “on” channels. (This happens when you set the split_polarity option to True in a preprocessing.)
align_corners (bool) – See Pytorch Documentation for the grid_sample function
flow_is_normalized (bool) – flow is in [-2,2], no call to normalize_flow
- warp_sequence_by_one_time_bin(img, flow, align_corners=True, flow_is_normalized=False)
Uses the displacement to warp the image.
If the displacement doesn’t match the existing grid attributes, the grid is regenerated.
- Parameters
img (torch.Tensor) – size (num_tbins, batch_size, channel_count, height, width)
flow (torch.Tensor) – size (num_tbins x batch_size, 2, height, width), flow in pixels per time bin
align_corners (bool) – See Pytorch Documentation for the grid_sample function
flow_is_normalized (bool) – flow is in [-2,2], no call to normalize_flow
- metavision_ml.core.warp_modules.make_grid2d(height, width, device='cpu:0')
Generates a 2d Grid
- Parameters
height – image height
width – image width
- metavision_ml.core.warp_modules.normalize_flow(flow)
Normalizes a flow in pixel units to a flow in [-2,2] for use with grid_sample.
- Parameters
flow – tensor of shape (B,2,H,W)
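The [-2,2] range follows from the coordinate system of grid_sample, where positions span [-1,1] across the image, so the largest possible displacement is 2. The sketch below shows the conversion for a single displacement vector; the function name is hypothetical and the scaling assumes the align_corners=True convention, which may differ from what normalize_flow does internally.

```python
def normalize_displacement(flow_x, flow_y, height, width):
    """Convert a pixel displacement to normalized grid_sample units."""
    # With align_corners=True, pixel centres 0 .. width-1 map linearly to -1 .. 1,
    # so one pixel corresponds to 2 / (width - 1) normalized units.
    return 2.0 * flow_x / (width - 1), 2.0 * flow_y / (height - 1)


# A displacement spanning the full image in both directions.
nx, ny = normalize_displacement(flow_x=639.0, flow_y=-479.0, height=480, width=640)
# nx == 2.0 and ny == -2.0
```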
- metavision_ml.core.warp_modules.warp_backward_using_forward_flow(img, flow, grid=None, align_corners=True, mode='bilinear', flow_is_normalized=False)
Generates previous images given a set of next images and forward flow in pixels or in [-2,2] if flow_is_normalized is set to True.
- Parameters
img (torch.Tensor) – size (batch_size, channels, height, width)
flow (torch.Tensor) – size (batch_size, 2, height, width)
grid (torch.Tensor) – normalized meshgrid coordinates.
align_corners (bool) – See Pytorch Documentation for the grid_sample function
mode (string) – mode of interpolation used
flow_is_normalized (bool) – flow is in [-2,2], no call to normalize_flow
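A common way to implement this kind of backward warping is to add the normalized flow to an identity sampling grid and call torch.nn.functional.grid_sample, so that each output pixel reads the input at its position plus the flow. The sketch below follows that pattern; the function name backward_warp and the grid construction are assumptions made for the example, not the library's code.

```python
import torch
import torch.nn.functional as F


def backward_warp(img, flow):
    """Sample img at (x + flow_x, y + flow_y).

    img is (B, C, H, W); flow is (B, 2, H, W) in pixels.
    """
    b, _, h, w = img.shape
    # Identity sampling grid in normalized [-1, 1] coordinates, last dim is (x, y).
    xs = torch.linspace(-1.0, 1.0, w).view(1, w).expand(h, w)
    ys = torch.linspace(-1.0, 1.0, h).view(h, 1).expand(h, w)
    grid = torch.stack((xs, ys), dim=-1).unsqueeze(0).expand(b, h, w, 2)
    # Pixel flow -> normalized flow (align_corners=True convention).
    scale = torch.tensor([2.0 / (w - 1), 2.0 / (h - 1)]).view(1, 2, 1, 1)
    flow_n = (flow * scale).permute(0, 2, 3, 1)
    return F.grid_sample(img, grid + flow_n, mode="bilinear", align_corners=True)


img = torch.arange(8, dtype=torch.float32).view(1, 1, 2, 4)
flow = torch.zeros(1, 2, 2, 4)
flow[:, 0] = 1.0  # every pixel looks one pixel to its right
warped = backward_warp(img, flow)
# Approximately [[1, 2, 3, 0], [5, 6, 7, 0]]: the last column samples
# outside the image and is filled with zeros.
```

Because the output is built by sampling rather than by scattering, every output pixel gets a value and the operation stays differentiable with respect to both the image and the flow.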
- metavision_ml.core.warp_modules.warp_forward_using_backward_flow(img, flow, grid=None, align_corners=True, mode='bilinear', flow_is_normalized=False)
Generates next images given a set of previous images and backward flow in pixels or in [-2,2] if flow_is_normalized is set to True.
Note: forward warping using the backward flow is the same computation as backward warping using the forward flow
- Parameters
img (torch.Tensor) – size (batch_size, channels, height, width)
flow (torch.Tensor) – size (batch_size, 2, height, width)
grid (torch.Tensor) – normalized meshgrid coordinates.
align_corners (bool) – See Pytorch Documentation for the grid_sample function
mode (string) – mode of interpolation used
flow_is_normalized (bool) – flow is in [-2,2], no call to normalize_flow
- metavision_ml.core.warp_modules.warp_forward_using_forward_flow(img, flow, grid=None, align_corners=True, mode='bilinear', flow_is_normalized=False)
Generates next images given a set of images and forward flow in pixels or in [-2,2] if flow_is_normalized is set to True.
- Parameters
img (torch.Tensor) – size (batch_size, channels, height, width)
flow (torch.Tensor) – size (batch_size, 2, height, width)
grid (torch.Tensor) – normalized meshgrid coordinates.
align_corners (bool) – See Pytorch Documentation for the grid_sample function
mode (string) – mode of interpolation used
flow_is_normalized (bool) – flow is in [-2,2], no call to normalize_flow
- metavision_ml.core.warp_modules.warp_to_micro_tbin(inputs, flow, micro_tbin, grid=None, align_corners=True, mode='bilinear', flow_is_normalized=False)
Similar to warp_to_tbin, but here we consider a sequence of spatio-temporal volumes with the shape [T,B,C,D,H,W]. We suppose a constant flow per time bin (each of which includes multiple micro time bins).
- Parameters
inputs (torch.Tensor) – size (num_time_bins, batch_size, channels, depth, height, width)
flow (torch.Tensor) – size (num_tbins, batch_size, 2, height, width) normalized in [-2,2]
micro_tbin (int) – index of the time bin to warp to (from zero to num_tbins -1)
grid (torch.Tensor) – normalized meshgrid coordinates.
align_corners (bool) – See Pytorch Documentation for the grid_sample function
mode (string) – mode of interpolation used.
flow_is_normalized (bool) – flow is in [-2,2], no call to normalize_flow
- metavision_ml.core.warp_modules.warp_to_tbin(tensor, flow, tbin, grid=None, align_corners=True, mode='bilinear', flow_is_normalized=False)
Applies flow to an input tensor to warp all time bins onto one. This has the effect of reducing apparent motion blur (provided the flow is correct). Here we suppose a constant flow during num_tbins steps per sample.
- Parameters
tensor (torch.Tensor) – size (num_tbins, batch_size, channels, height, width).
flow (torch.Tensor) – size (batch_size, 2, height, width) normalized in [-2,2]
tbin (int) – index of the time bin to warp to (from zero to num_tbins -1).
grid (torch.Tensor) – normalized meshgrid coordinates.
align_corners (bool) – See Pytorch Documentation for the grid_sample function
mode (string) – mode of interpolation used.
flow_is_normalized (bool) – flow is in [-2,2], no call to normalize_flow
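The effect of warping all time bins onto one can be seen on a one-dimensional toy sequence. The sketch below assumes a constant integer flow in pixels per time bin and uses plain lists instead of tensors; warp_bins_to is a hypothetical helper that only illustrates the principle, not the library's interpolation-based implementation.

```python
def warp_bins_to(rows, flow, tbin):
    """Align every time bin of a 1D sequence onto bin tbin.

    Assumes a constant integer flow, in pixels per time bin.
    """
    width = len(rows[0])
    warped = []
    for t, row in enumerate(rows):
        # Offset between a pixel in bin tbin and the same content in bin t.
        shift = (t - tbin) * flow
        out = [0] * width
        for x in range(width):
            src = x + shift
            if 0 <= src < width:
                out[x] = row[src]
        warped.append(out)
    return warped


# A single active pixel moving right by one pixel per time bin.
rows = [
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
]
blurred = [sum(col) for col in zip(*rows)]
sharp = [sum(col) for col in zip(*warp_bins_to(rows, flow=1, tbin=2))]
# blurred == [0, 1, 1, 1, 0], sharp == [0, 0, 0, 3, 0]
```

Summing the raw bins smears the moving pixel over three positions, while summing the warped bins concentrates it on the position it occupies in the target bin.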