SDK ML Preprocessing API

A collection of preprocessing functions that turn a buffer of events into a sequence of two-dimensional features.

These functions only work on events represented as numpy structured arrays. They make intensive use of numba (http://numba.pydata.org/), which is awesome.

Examples

>>> delta = 100000
>>> initial_ts = record.current_time
>>> events = record.load_delta_t(delta)  # load 100 milliseconds worth of events
>>> events['t'] -= int(initial_ts)  # events timestamp should be reset
>>> output_array = np.zeros((1, 2, height, width))  # preallocate output array
>>> histo(events, output_array, delta)
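For readers without a recording at hand, the input expected by the example above can be approximated with a synthetic buffer. This is a minimal sketch: the field names x, y, p, t follow the xypt naming used in this page, while the exact dtypes, the helper make_synthetic_events and the value of initial_ts are assumptions made for illustration.

```python
import numpy as np

# Assumed event layout: x, y (pixel coordinates), p (polarity, 0 or 1)
# and t (timestamp in microseconds). The exact dtypes are an assumption.
EVENT_DTYPE = np.dtype([("x", "<u2"), ("y", "<u2"), ("p", "<i2"), ("t", "<i8")])


def make_synthetic_events(n_events, height, width, start_ts, delta_t, seed=0):
    """Build a time-sorted buffer of random events (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    events = np.zeros(n_events, dtype=EVENT_DTYPE)
    events["x"] = rng.integers(0, width, n_events)
    events["y"] = rng.integers(0, height, n_events)
    events["p"] = rng.integers(0, 2, n_events)
    events["t"] = np.sort(rng.integers(start_ts, start_ts + delta_t, n_events))
    return events


height, width, delta = 480, 640, 100000
initial_ts = 2_000_000  # stands in for record.current_time
events = make_synthetic_events(1000, height, width, initial_ts, delta)
events["t"] -= int(initial_ts)  # timestamps now start at 0 for this slice
output_array = np.zeros((1, 2, height, width), dtype=np.float32)
# With the SDK installed, histo(events, output_array, delta) would be called here.
```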

metavision_ml.preprocessing.event_to_tensor.diff(xypt, output_array, total_tbins_delta_t, downsampling_factor=0, reset=True, max_incr_per_pixel=5.0)

Computes the difference between the histogram of positive events and the histogram of negative events. output_array must have a signed dtype.

Parameters

xypt (events) – structured array containing events
output_array – numpy float32 array [tbins, 1, height, width]
total_tbins_delta_t – duration of the timeslice
downsampling_factor (int) – Parameter used to reduce the spatial dimensions of the obtained feature: event coordinates are multiplied by 2**(-downsampling_factor).
reset (boolean) – whether to reset output_array to 0 beforehand.
max_incr_per_pixel – maximum number of increments per pixel (expressed in initial resolution).
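The relation between a two-channel event count and its signed difference can be sketched in plain numpy. This is not the library implementation: the channel order (OFF first, ON second) and the clip-then-divide use of max_incr_per_pixel are assumptions, and diff_from_histo is a hypothetical name.

```python
import numpy as np


def diff_from_histo(histo_counts, max_incr_per_pixel=5.0):
    """Turn raw counts of shape (T, 2, H, W) into a signed map of shape (T, 1, H, W)."""
    counts = np.asarray(histo_counts, dtype=np.float32)
    # Assumed channel order: index 0 holds OFF counts, index 1 holds ON counts.
    signed = counts[:, 1:2] - counts[:, 0:1]
    # Assumed normalization: clip to +/- max_incr_per_pixel, then scale to [-1, 1].
    return np.clip(signed, -max_incr_per_pixel, max_incr_per_pixel) / max_incr_per_pixel


counts = np.zeros((1, 2, 2, 2), dtype=np.float32)
counts[0, 1, 0, 0] = 3  # three ON events at pixel (y=0, x=0)
counts[0, 0, 0, 0] = 1  # one OFF event at the same pixel
counts[0, 0, 1, 1] = 9  # nine OFF events at pixel (y=1, x=1), beyond the clip limit
signed_map = diff_from_histo(counts)
```

The result needs a signed dtype because pixels dominated by OFF events become negative.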

metavision_ml.preprocessing.event_to_tensor.diff_quantized(xypt, output_array, total_tbins_delta_t, downsampling_factor=0, reset=True, negative_bit_length=8, normalization=False)

Computes the difference between the histogram of positive events and the histogram of negative events. output_array must have a signed dtype.

Parameters

xypt (events) – structured array containing events
output_array (np.ndarray) – Pre allocated numpy array of shape (num_tbins, 1, height, width). If downsampling_factor is not 0 or normalization is True, the datatype should be floating. Otherwise it should be np.int8.
total_tbins_delta_t – duration of the timeslice
downsampling_factor (int) – Parameter used to reduce the spatial dimensions of the obtained feature: event coordinates are multiplied by 2**(-downsampling_factor).
reset (boolean) – whether to reset output_array to 0 beforehand.
negative_bit_length (int) – negative bit length set in the camera settings. It controls the precision used to compute the tensor and must be an integer > 2 and <= 8 in Diff3D mode.
normalization (boolean) – If False, the dtype of output_array should be int8 and the data ranges over [-2 ** (negative_bit_length - 1), 2 ** (negative_bit_length - 1) - 1]. If True, the dtype of output_array should be floating point and the data ranges over [-1.0, 1.0].
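The integer range quoted for normalization=False can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and the divisor used when normalization is True is an assumption chosen so that values stay within [-1.0, 1.0].

```python
import numpy as np


def diff_quantized_range(negative_bit_length):
    """Integer range stored when normalization is False."""
    if not 2 < negative_bit_length <= 8:
        raise ValueError("negative_bit_length must be an integer > 2 and <= 8")
    low = -2 ** (negative_bit_length - 1)
    high = 2 ** (negative_bit_length - 1) - 1
    return low, high


def quantize_diff(signed_counts, negative_bit_length=8, normalization=False):
    """Clip signed counts to the quantized range (hypothetical helper)."""
    low, high = diff_quantized_range(negative_bit_length)
    clipped = np.clip(signed_counts, low, high)
    if normalization:
        # Assumed scaling: divide by the magnitude of the lower bound.
        return (clipped / float(-low)).astype(np.float32)
    return clipped.astype(np.int8)


quantized = quantize_diff(np.array([-300, 5, 300]))
normalized = quantize_diff(np.array([-300, 5, 300]), normalization=True)
```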

metavision_ml.preprocessing.event_to_tensor.event_cube(xypt, output_array, total_tbins_delta_t, downsampling_factor=0, split_polarity=True, reset=True, max_incr_per_pixel=5)

Takes xypt events within a timeslice of length total_tbins_delta_t and updates an array of shape (num_tbins, num_utbins*2, H, W). Microbins are the time intervals used as channels [Unsupervised Event-based Learning of Optical Flow, Zhu et al. 2018].

Note: you should load delta_t * (num_tbins + 1) worth of events to avoid artefacts on the last timeslice, because the support of the temporal bilinear kernel is 2 bins.

Parameters

xypt (events) – structured array containing events
output_array (np.ndarray) – (num_tbins,num_utbins*2,H,W) dtype MUST be floating point!
total_tbins_delta_t (int) – Length in us of the interval in which events are taken
downsampling_factor – the event coordinates x and y are divided by 2**downsampling_factor. WARNING: unlike in the paper, no bilinear weights are used for downsampling. A true event-based bilinear resize would contribute to 8 different cells in the result tensor.
split_polarity (bool) – whether to split polarity into 2 channels or to treat p as a weight in {-1, 1}.
reset (bool) – whether to reset output_array to 0 beforehand. In most cases this should be True.
max_incr_per_pixel – maximum number of increments per pixel (expressed in initial resolution).
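The two-bin support of a temporal bilinear kernel can be illustrated on the time axis alone. The sketch below follows the general triangular-kernel idea of Zhu et al., not necessarily the exact bin placement used by event_cube; both function names are hypothetical.

```python
import numpy as np


def temporal_bilinear_weights(t, total_delta_t, num_bins):
    """Return (left_bin, w_left, w_right) for timestamps t in [0, total_delta_t)."""
    pos = np.asarray(t, dtype=np.float64) * num_bins / total_delta_t
    left = np.floor(pos).astype(np.int64)
    w_right = pos - left  # share given to the next bin
    w_left = 1.0 - w_right  # share given to the current bin
    return left, w_left, w_right


def accumulate_time_bins(t, polarity_weight, total_delta_t, num_bins):
    """Spread each event over two neighbouring bins; shares outside the range are dropped."""
    left, w_left, w_right = temporal_bilinear_weights(t, total_delta_t, num_bins)
    bins = np.zeros(num_bins, dtype=np.float64)
    ok_left = (left >= 0) & (left < num_bins)
    ok_right = (left + 1 >= 0) & (left + 1 < num_bins)
    np.add.at(bins, left[ok_left], (polarity_weight * w_left)[ok_left])
    np.add.at(bins, left[ok_right] + 1, (polarity_weight * w_right)[ok_right])
    return bins


t = np.array([0, 2500, 7500])
weights = np.array([1.0, 1.0, -1.0])  # p treated as a weight in {-1, 1}
bins = accumulate_time_bins(t, weights, total_delta_t=10000, num_bins=2)
```

In this toy run half of the third event's weight falls outside the two bins. A bin at the edge of the loaded window therefore sees only part of its support, which is the kind of artefact the note above warns about.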

metavision_ml.preprocessing.event_to_tensor.histo(xypt, output_array, total_tbins_delta_t, downsampling_factor=0, reset=True, max_incr_per_pixel=5.0)

Computes histogram on a sequence of events.

Parameters

xypt (events) – structured array containing events
output_array (np.ndarray) – Pre allocated numpy array of shape (num_tbins, 2, height, width)
total_tbins_delta_t (int) – Time interval of the extended time slice (with tbins).
downsampling_factor (int) – Parameter used to reduce the spatial dimensions of the obtained feature: event coordinates are multiplied by 2**(-downsampling_factor).
reset (boolean) – whether to reset output_array to 0 beforehand.
max_incr_per_pixel – maximum number of increments per pixel (expressed in initial resolution).
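How events land in time bins, polarity channels and downsampled pixels can be sketched with numpy alone. This is an illustration under stated assumptions, not the numba implementation: histo_sketch is a hypothetical name, the channel order (OFF first, ON second) is assumed, and the max_incr_per_pixel handling is left out so the result holds raw counts.

```python
import numpy as np


def histo_sketch(events, output_array, total_tbins_delta_t,
                 downsampling_factor=0, reset=True):
    """Count events per time bin, polarity and pixel (raw counts, no clipping)."""
    num_tbins = output_array.shape[0]
    if reset:
        output_array[...] = 0
    # Time bin of each event; timestamps are assumed to start at 0.
    tbin = (events["t"].astype(np.int64) * num_tbins) // total_tbins_delta_t
    keep = (tbin >= 0) & (tbin < num_tbins)
    # Multiplying integer coordinates by 2**(-downsampling_factor) is a right shift.
    x = events["x"][keep].astype(np.int64) >> downsampling_factor
    y = events["y"][keep].astype(np.int64) >> downsampling_factor
    # Assumed channel order: 0 for OFF events (p == 0), 1 for ON events.
    channel = (events["p"][keep] > 0).astype(np.int64)
    np.add.at(output_array, (tbin[keep], channel, y, x), output_array.dtype.type(1))


dtype = [("x", "<u2"), ("y", "<u2"), ("p", "<i2"), ("t", "<i8")]
events = np.array([(1, 2, 1, 10), (1, 2, 1, 60), (3, 0, 0, 70)], dtype=dtype)

full = np.zeros((2, 2, 4, 4), dtype=np.float32)
histo_sketch(events, full, total_tbins_delta_t=100)

half = np.zeros((2, 2, 2, 2), dtype=np.float32)
histo_sketch(events, half, total_tbins_delta_t=100, downsampling_factor=1)
```

np.add.at is used instead of plain fancy-index assignment so that several events hitting the same cell are all counted.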

metavision_ml.preprocessing.event_to_tensor.histo_quantized(xypt, output_array, total_tbins_delta_t, downsampling_factor=0, reset=True, negative_bit_length=4, total_bit_length=8, normalization=False)

Computes histogram on a sequence of events.

Parameters

xypt (events) – structured array containing events
output_array (np.ndarray) – Pre allocated numpy array of shape (num_tbins, 2, height, width). If downsampling_factor is not 0 or normalization is True, the datatype should be floating. Otherwise it should be np.uint8.
total_tbins_delta_t (int) – Time interval of the extended time slice (with tbins).
downsampling_factor (int) – Parameter used to reduce the spatial dimensions of the obtained feature: event coordinates are multiplied by 2**(-downsampling_factor).
reset (boolean) – whether to reset output_array to 0 beforehand.
negative_bit_length (int) – negative bit length set in the camera settings. It controls the precision used to compute the tensor and must be an integer > 0 and <= 8 in Histo3D mode.
total_bit_length – total number of bits used to store the data, total_bit_length = negative_bit_length + positive_bit_length
normalization (boolean) – If False, the dtype of output_array should be uint8 and the data ranges over [0, 2 ** negative_bit_length] for the negative channel and [0, 2 ** positive_bit_length] for the positive channel. If True, the dtype of output_array should be floating point and the data ranges over [0, 1.0].
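The split of the bit budget between the two channels follows directly from total_bit_length = negative_bit_length + positive_bit_length. A small worked example, with a hypothetical helper name and the upper bounds taken as stated above:

```python
def histo_bit_lengths(negative_bit_length=4, total_bit_length=8):
    """Split the total bit budget between the negative and positive channels."""
    if not 0 < negative_bit_length <= 8:
        raise ValueError("negative_bit_length must be an integer > 0 and <= 8")
    positive_bit_length = total_bit_length - negative_bit_length
    if positive_bit_length < 0:
        raise ValueError("negative_bit_length cannot exceed total_bit_length")
    return negative_bit_length, positive_bit_length


negative_bits, positive_bits = histo_bit_lengths()  # defaults: 4 and 8
upper_bounds = (2 ** negative_bits, 2 ** positive_bits)  # bounds as documented above
```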

metavision_ml.preprocessing.event_to_tensor.multi_channel_timesurface(xypt, output_array, total_tbins_delta_t, downsampling_factor=0, reset=True, **kwargs)

Computes a linear two-channel time surface n times per delta_t.

The number of channels of output_array must be even: 4 channels give 2 micro delta_t, 6 channels give 3, and so on. This feature contains precise time information and makes it possible to perceive phenomena at frequencies higher than 1 / delta_t. The dtype of output_array must be able to hold values up to total_tbins_delta_t without overflow.

Parameters

xypt (events) – structured array containing events
output_array (np.ndarray) – (num_tbins,num_utbins*2,H,W) dtype MUST be floating point!
total_tbins_delta_t (int) – length in us of the interval in which events are taken
downsampling_factor – the event coordinates x and y are divided by 2**downsampling_factor.
reset (bool) – whether to reset output_array to 0 before computing the feature.
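The link between the channel count and the number of micro delta_t can be written down explicitly. This sketch assumes the micro intervals have equal length, which the text above does not state; micro_delta_t is a hypothetical helper.

```python
def micro_delta_t(n_channels, delta_t):
    """Return (number of micro intervals, assumed length of each in us)."""
    if n_channels <= 0 or n_channels % 2 != 0:
        raise ValueError("the number of channels must be positive and even")
    n_micro = n_channels // 2  # two polarity channels per micro interval
    return n_micro, delta_t / n_micro


four_channels = micro_delta_t(4, 50000)
six_channels = micro_delta_t(6, 60000)
```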

metavision_ml.preprocessing.event_to_tensor.timesurface(xypt, output_array, total_tbins_delta_t, downsampling_factor=0, reset=True, normed=True, **kwargs)

Computes a linear two-channel time surface. The dtype of output_array must be able to hold values up to total_tbins_delta_t without overflow.

Parameters

xypt (events) – structured array containing events
output_array (np.ndarray) – Pre allocated numpy array of shape (num_tbins, 2, height, width)
total_tbins_delta_t (int) – Length in us of the interval in which events are taken
downsampling_factor (int) – Parameter used to reduce the spatial dimensions of the obtained feature: event coordinates are multiplied by 2**(-downsampling_factor).
reset (boolean) – whether to reset output_array to 0 beforehand.
normed (boolean) – if True, scales the timesurface between 0 and 1.
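A linear time surface stores, for each pixel and polarity, the timestamp of the most recent event. The numpy sketch below illustrates that idea under assumptions: timesurface_sketch is a hypothetical name, the channel order is assumed, downsampling is omitted, and normed is interpreted as a division by total_tbins_delta_t.

```python
import numpy as np


def timesurface_sketch(events, output_array, total_tbins_delta_t,
                       reset=True, normed=True):
    """Keep the latest timestamp per time bin, polarity and pixel."""
    num_tbins = output_array.shape[0]
    if reset:
        output_array[...] = 0
    t = events["t"].astype(np.int64)
    tbin = (t * num_tbins) // total_tbins_delta_t
    keep = (tbin >= 0) & (tbin < num_tbins)
    channel = (events["p"][keep] > 0).astype(np.int64)  # assumed order: OFF, ON
    y = events["y"][keep].astype(np.int64)
    x = events["x"][keep].astype(np.int64)
    values = t[keep].astype(np.float64)
    if normed:
        values /= total_tbins_delta_t  # assumed scaling into [0, 1]
    # maximum.at keeps the largest (latest) timestamp when a pixel fires several times.
    np.maximum.at(output_array, (tbin[keep], channel, y, x),
                  values.astype(output_array.dtype))


dtype = [("x", "<u2"), ("y", "<u2"), ("p", "<i2"), ("t", "<i8")]
events = np.array([(1, 2, 1, 10), (0, 0, 0, 50), (1, 2, 1, 80)], dtype=dtype)
surface = np.zeros((1, 2, 4, 4), dtype=np.float32)
timesurface_sketch(events, surface, total_tbins_delta_t=100)
```

With normed=False the array stores raw timestamps, which is why its dtype must hold values up to total_tbins_delta_t.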

Write HDF5 feature files from event files

metavision_ml.preprocessing.hdf5.generate_hdf5(paths, output_folder, preprocess, delta_t, height=None, width=None, start_ts=0, max_duration=None, n_processes=2, box_labels=[], store_as_uint8=False, mode='delta_t', n_events=0, preprocess_kwargs={})

Generates HDF5 files of frames at a regular interval for dataset caching.

It is meant to produce a dataset with a smaller memory footprint, that is therefore easier to load from disk. Generated files share the name of the input file but have an HDF5 extension.

If max_duration is not None, the output file is named “{output_folder}/{path basename}_{index:d}.h5”, where index distinguishes the cuts. Otherwise it is named “{output_folder}/{path basename}.h5”.
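The naming rule can be sketched as a small path helper. output_h5_path is a hypothetical function, and stripping the input extension from the basename is an assumption based on the statement that generated files share the name of the input file.

```python
import os


def output_h5_path(path, output_folder, index=None):
    """Build the output file name; index is only used when the input is cut."""
    basename = os.path.splitext(os.path.basename(path))[0]
    if index is None:
        return os.path.join(output_folder, f"{basename}.h5")
    return os.path.join(output_folder, f"{basename}_{index:d}.h5")


whole_file = output_h5_path("src_path/test/recording.dat", "dst_path/test")
third_cut = output_h5_path("src_path/test/recording.dat", "dst_path/test", index=2)
```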

Example

>>> python3 metavision_ml/preprocessing/hdf5.py src_path/test/*.dat --o dst_path/test/ --height_width 480 640 --preprocess histo --box-labels src_path/test/*bbox.npy

Parameters

paths (string list) – Paths of input files.
output_folder (string) – Folder path where data will be written.
preprocess (string) – Name of the preprocessing function to use.
delta_t (int) – Period at which tensors are computed (in us).
height (int) – if None, the features are not downsampled; otherwise they are downsampled to height, which must be the sensor’s height divided by a power of 2.
width (int) – if None, the features are not downsampled; otherwise they are downsampled to width, which must be the sensor’s width divided by a power of 2.
start_ts (int) – Timestamp (in microseconds) from which the computation begins. Either a single int for all files or a list containing exactly one int per file.
max_duration (int) – If not None, limit the duration of the file to max_duration (in us).
n_processes (int) – Number of processes writing files simultaneously.
box_labels (string list) – Paths to the npy box label files matching each input file. If start_ts or max_duration is specified, these files are cut accordingly.
store_as_uint8 (boolean) – if True, casts to byte before storing to save space. Only supports 0-1 normalized data.
mode (string) – Load by timeslice or number of events. Either “delta_t” or “n_events”.
n_events (int) – Number of events in the timeslice.
preprocess_kwargs (dictionary) – Dictionary of keyword arguments passed to the preprocessing function.
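The constraint on height and width (sensor size divided by a power of 2) can be checked before launching a long conversion. A minimal sketch, where downsampling_factor_for is a hypothetical helper and not part of the package:

```python
def downsampling_factor_for(sensor_size, target_size):
    """Return k such that target_size == sensor_size / 2**k, or raise ValueError."""
    if target_size is None:
        return 0  # None means no downsampling
    k = 0
    size = sensor_size
    while size > target_size and size % 2 == 0:
        size //= 2
        k += 1
    if size != target_size:
        raise ValueError(
            f"{target_size} is not {sensor_size} divided by a power of 2")
    return k


factor_height = downsampling_factor_for(480, 240)
factor_width = downsampling_factor_for(640, 160)
```

Height and width would normally be expected to yield the same factor; that check is left to the caller here.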

metavision_ml.preprocessing.hdf5.split_label(box_path, output_folder, start_ts=0, max_duration=None): Copies the labels according to the split and max_duration.

metavision_ml.preprocessing.hdf5.write_to_hdf5(path, start_ts=0, output_folder='.', delta_t=50000, preprocess='histo', height=None, width=None, max_duration=None, store_as_uint8=False, mode='delta_t', n_events=0, preprocess_kwargs={})

Generates a single HDF5 file of frames at a regular interval for dataset caching.

It is meant to produce a dataset with a smaller memory footprint, that is therefore easier to load from disk. Generated files share the name of the input RAW file but have an HDF5 extension.

The output file is named “{output_folder}/{path basename}_{start_ts}.h5”.

Parameters

path (string) – Path of the input file.
output_folder (string) – Folder path where data will be written.
preprocess (string) – Name of the preprocessing function to use.
delta_t (int) – Period at which tensors are computed (in us).
height (int) – if None, the features are not downsampled; otherwise they are downsampled to height, which must be the sensor’s height divided by a power of 2.
width (int) – if None, the features are not downsampled; otherwise they are downsampled to width, which must be the sensor’s width divided by a power of 2.
start_ts (int) – Timestamp (in microseconds) from which the computation begins. Either a single int for all files or a list containing exactly one int per file.
max_duration (int) – If not None, limit the duration of the file to max_duration us.
store_as_uint8 (boolean) – if True, casts to byte before storing to save space. Only supports 0-1 normalized data.
mode (string) – Load by timeslice or number of events. Either “delta_t” or “n_events”.
n_events (int) – Number of events in the timeslice.
preprocess_kwargs (dictionary) – Dictionary of keyword arguments passed to the preprocessing function.
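The restriction of store_as_uint8 to 0-1 normalized data follows from how a byte cast works. The sketch below shows one plausible encoding; the scale factor of 255 and both helper names are assumptions, not the package's confirmed behavior.

```python
import numpy as np


def to_uint8(features):
    """Encode features in [0, 1] as bytes (assumed scale factor: 255)."""
    features = np.asarray(features, dtype=np.float32)
    if features.min() < 0.0 or features.max() > 1.0:
        raise ValueError("only 0-1 normalized data can be stored as uint8")
    return np.round(features * 255.0).astype(np.uint8)


def from_uint8(stored):
    """Decode bytes back to floats in [0, 1]."""
    return stored.astype(np.float32) / 255.0


original = np.array([0.0, 0.25, 0.5, 1.0], dtype=np.float32)
stored = to_uint8(original)
restored = from_uint8(stored)
```

Storage drops from 4 bytes to 1 byte per value, at the cost of a rounding error of at most half a quantization step.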

A collection of visualization utilities for preprocessings

Examples

>>> delta = 100000
>>> initial_ts = record.current_time
>>> events = record.load_delta_t(delta)  # load 100 milliseconds worth of events
>>> events['t'] -= int(initial_ts)  # events timestamp should be reset
>>> output_array = np.zeros((1, 2, height, width))  # preallocate output array
>>> histo(events, output_array, delta)
>>> img = viz_histo(output_array[0])
>>> cv2.imshow('img', img)
>>> cv2.waitKey()

metavision_ml.preprocessing.viz.filter_outliers(input_val, num_std=2)

Filter outliers in an array

Parameters: input_val (np.ndarray) – Array of any shape
Returns: Normalized array of same shape as input
Return type: output_array (np.ndarray)

metavision_ml.preprocessing.viz.gray_to_rgb(im)

Repeats a grayscale image 3 times to build a 3-channel image

Parameters: im (np.ndarray) – Array of shape (H,W)
Returns: array of shape (H,W,3)
Return type: output_array

metavision_ml.preprocessing.viz.normalize(im)

Normalizes image by min-max

Parameters: im (np.ndarray) – Array of any shape
Returns: Normalized array of same shape as input
Return type: output_array (np.ndarray)
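The gray_to_rgb and normalize helpers described above are simple enough to sketch in numpy. These are illustrative equivalents with hypothetical names, not the package source; returning zeros for a constant image is an assumption.

```python
import numpy as np


def normalize_sketch(im):
    """Min-max normalization to [0, 1]; a constant image maps to zeros."""
    im = np.asarray(im, dtype=np.float32)
    low, high = im.min(), im.max()
    if high == low:
        return np.zeros_like(im)
    return (im - low) / (high - low)


def gray_to_rgb_sketch(im):
    """Repeat an (H, W) image along a new last axis to get (H, W, 3)."""
    return np.repeat(np.asarray(im)[..., None], 3, axis=-1)


gray = normalize_sketch([[2, 4], [6, 10]])
rgb = gray_to_rgb_sketch(gray)
```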

metavision_ml.preprocessing.viz.viz_diff(im)

Visualize difference of histogram

Parameters: im (np.ndarray) – Array of shape (H,W)
Returns: Array of shape (H,W)
Return type: output_array (np.ndarray)

metavision_ml.preprocessing.viz.viz_diff_binarized(im)

Visualize binarized difference of events (“ON”-“OFF”)

Parameters: im (np.ndarray) – Array of shape (H,W)
Returns: Array of shape (H,W,3)
Return type: output_array (np.ndarray)

metavision_ml.preprocessing.viz.viz_event_cube_rgb(im, split_polarity=True)

Visualize 3 out of 6 channels in RGB mode.

Parameters

im (np.ndarray) – Array of shape (T,H,W): T images, or T//2 groups of 2-channel images
split_polarity – whether the images are 2-channel (ON, OFF) images (True) or single-channel (ON-OFF) images (False).

Returns

Array of shape (H,W,3)

Return type

output_array (np.ndarray)

metavision_ml.preprocessing.viz.viz_histo(im)

Visualize difference of histogram

Parameters: im (np.ndarray) – Array of shape (2,H,W)
Returns: Array of shape (H,W)
Return type: output_array (np.ndarray)

metavision_ml.preprocessing.viz.viz_histo_binarized(im)

Visualize binarized histogram of events

Parameters: im (np.ndarray) – Array of shape (2,H,W)
Returns: Array of shape (H,W,3)
Return type: output_array (np.ndarray)

metavision_ml.preprocessing.viz.viz_histo_filtered(im, val_max=0.5)

Visualize a strongly filtered histo image with 3 channels

Parameters: im (np.ndarray) – Array of shape (2,H,W)
Returns: Array of shape (H,W,3)
Return type: output_array (np.ndarray)

metavision_ml.preprocessing.viz.viz_histo_rgb(im)

Visualize a histo image with 3 channels

Parameters: im (np.ndarray) – Array of shape (H,W)
Returns: Array of shape (H,W,3)
Return type: output_array (np.ndarray)

metavision_ml.preprocessing.viz.viz_multichannel_timesurface(tensor, blend=False)

Visualizes three channels of a multi channel timesurface.

Parameters

tensor (np.ndarray) – array of shape (T,H,W): T images, or T//2 groups of 2-channel images
blend (boolean) – whether to blend different channels to visualize them as one.

Returns

array of shape (H,W,3)

Return type

output_array (np.ndarray)

metavision_ml.preprocessing.viz.viz_timesurface(im)

Visualize timesurface

Note: to generate a timesurface you need to call event_to_tensor.timesurface. Typically, if you want an exponential-decay timesurface, do not set the argument “reset” to True, so that the latest event’s timestamp is kept at each pixel.

Here we assume the timesurface has already been normalized between [0,1] either by min-max normalization or with exponential time-decay.

Parameters: im (np.ndarray) – Array of shape (2,H,W) or (H,W)
Returns: Array of shape (H,W,3)
Return type: output_array (np.ndarray)
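The exponential time-decay mentioned above can be sketched as a mapping from the latest timestamp at each pixel to a value in (0, 1]. The decay constant tau and the helper name are hypothetical; the package does not document a specific formula here.

```python
import numpy as np


def exponential_decay(last_timestamps, t_now, tau):
    """Map last-event timestamps (us) to (0, 1]; recent events are close to 1."""
    age = t_now - np.asarray(last_timestamps, dtype=np.float64)
    return np.exp(-age / tau)


# Latest event timestamp per pixel of a 2x2 sensor, in microseconds.
last_ts = np.array([[100000.0, 50000.0], [0.0, 100000.0]])
decayed = exponential_decay(last_ts, t_now=100000, tau=50000)
```

Pixels that fired at t_now map to 1, and older activity fades towards 0. Note that this toy version does not distinguish a pixel that never fired from one that fired at timestamp 0.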

class metavision_ml.preprocessing.CDProcessor(height, width, num_tbins=5, preprocessing='histo', downsampling_factor=0, preprocess_kwargs={})

Wrapper that simplifies the preprocessing.

Parameters

height – image height
width – image width
num_tbins – number of time slices
preprocessing – name of method, must be registered first
downsampling_factor (int) – Parameter used to reduce the spatial dimensions of the obtained feature: event coordinates are multiplied by 2**(-downsampling_factor).

get_array_dim()

Returns the shape of the preprocessing tensor.

Returns: shape of the tensor to be preprocessed.
Return type: shape (int tuple)

Iterator of feature tensor for a source of input events.

class metavision_ml.preprocessing.preprocessor_iterator.CDProcessorIterator(path, preprocess_function_name, mode='delta_t', start_ts=0, max_duration=None, delta_t=50000, n_events=10000, num_tbins=1, preprocess_kwargs={}, device=device(type='cpu'), height=None, width=None, **kwargs)

Provides feature tensors (torch.Tensor) at regular intervals using the Python implementation of EventCD preprocessors (CDProcessor).

Relies on the EventsIterator class. The different behaviours of EventsIterator can be leveraged.

Parameters

path (string) – Path to the file to read, or empty for a camera.
preprocess_function_name (string) – Name of the preprocessing function used to turn events into features. Can be any of the functions present in metavision_ml.preprocessing or one registered by the user.
mode (str) – Streaming mode (n_events, delta_t, mixed)
start_ts (int) – Start timestamp of the EventsIterator
max_duration (int) – Total duration of the EventsIterator
delta_t (int) – Duration of used events slice in us.
num_tbins (int) – Number of TimeBins
preprocess_kwargs – dictionary of optional arguments to the preprocessing function. This can be used to override the default value of max_incr_per_pixel, for instance {“max_incr_per_pixel”: 20} to clip and normalize tensors by 20 at full resolution.
device (torch.device) – Torch device (defaults to cpu).
height (int) – if None, the features are not downsampled; otherwise they are downsampled to height, which must be the sensor’s height divided by a power of 2.
width (int) – if None, the features are not downsampled; otherwise they are downsampled to width, which must be the sensor’s width divided by a power of 2.
**kwargs – Arbitrary keyword arguments passed to the underlying EventsIterator.

mv_it

object used to read from the file or the camera.

Type: EventsIterator

array_dim

shape of the tensor (channel, height, width).

Type: tuple

cd_proc

class computing features from events into a preallocated memory array.

Type: CDProcessor

step

counter of iterations.

Type: int

event_input_height

original height of the sensor in pixels.

Type: int

event_input_width

original width of the sensor in pixels.

Type: int

Examples

>>> path = "example.raw"
>>> for tensor in CDProcessorIterator(path, "event_cube", delta_t=10000):
...     # Returns a torch Tensor.
...     print(tensor.shape)

get_time(): Cut Inner Reader Time

get_vis_func(): Returns the visualization function corresponding to the preprocessing being used.

class metavision_ml.preprocessing.preprocessor_iterator.EventPreprocessorIterator(path, preprocess_function_name, mode='delta_t', start_ts=0, max_duration=None, delta_t=50000, n_events=10000, num_tbins=1, preprocess_kwargs={}, device=device(type='cpu'), height=None, width=None, **kwargs)

Provides feature tensors (torch.Tensor) at regular intervals using the C++ implementation of EventCD preprocessors (EventPreprocessor).

Relies on the EventsIterator class. The different behaviours of EventsIterator can be leveraged.

Parameters

path (string) – Path to the file to read, or empty for a camera.
preprocess_function_name (string) –
Name of the preprocessing function used to turn events into features. Can be any of the following:
- “diff” for a differential frame
- “hardware_diff” or “diff3d” for a quantized differential frame
- “histo” for a histogram of events
- “histo3d” or “hardware_histo” for a quantized histogram
- “event_cube” for an EventCube
- “time_surface” for a TimeSurface
mode (str) – Streaming mode (n_events, delta_t, mixed)
start_ts (int) – Minimal start time of the EventsIterator
max_duration (int) – Maximum duration of the EventsIterator
delta_t (int) – Duration of each slice of events in us.
num_tbins (int) – Number of TimeBins
preprocess_kwargs – dictionary of optional arguments for the preprocessing function.
device (torch.device) – Torch device (defaults to cpu).
height (int) – if not None, the features are downsampled to the provided height.
width (int) – if not None, the features are downsampled to the provided width.
**kwargs – Arbitrary keyword arguments passed to the underlying EventsIterator.

mv_it

object used to read from the file or the camera.

Type: EventsIterator

array_dim

shape of the tensor (channel, height, width).

Type: tuple

evt_proc

class computing features from events into a preallocated memory array.

Type: EventPreprocessor

step

counter of iterations.

Type: int

event_input_height

original height of the sensor in pixels.

Type: int

event_input_width

original width of the sensor in pixels.

Type: int

Examples

>>> path = "example.raw"
>>> for tensor in EventPreprocessorIterator(path, "event_cube", delta_t=10000):
...     # Returns a torch Tensor.
...     print(tensor.shape)

get_time(): Cut Inner Reader Time

class metavision_ml.preprocessing.preprocessor_iterator.HDF5Iterator(path, num_tbins=1, preprocess_kwargs={}, start_ts=0, device=device(type='cpu'), height=None, width=None)

Provides feature tensors (torch.Tensor) at regular intervals from a precomputed HDF5 file.

Parameters

path (string) – Path to the HDF5 file containing precomputed features.
height (int) – if None, the features are not downsampled; otherwise they are downsampled to height, which must be the sensor’s height divided by a power of 2.
width (int) – if None, the features are not downsampled; otherwise they are downsampled to width, which must be the sensor’s width divided by a power of 2.
device (torch.device) – Torch device (defaults to cpu).
start_ts (int) – First timestamp to consider in us. (Must be a multiple of the HDF5 file delta_t)
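The requirement that start_ts be a multiple of the file's delta_t amounts to mapping a timestamp to a whole frame index. A minimal sketch with a hypothetical helper, assuming one stored frame per delta_t starting at timestamp 0:

```python
def first_frame_index(start_ts, delta_t):
    """Index of the first precomputed frame to read for a given start_ts (us)."""
    if start_ts % delta_t != 0:
        raise ValueError("start_ts must be a multiple of the HDF5 file delta_t")
    return start_ts // delta_t


frame_index = first_frame_index(150000, 50000)
```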

dataset

HDF5 dataset containing the precomputed features.

Type: h5py.Dataset

array_dim

shape of the tensor (channel, height, width).

Type: tuple

preprocess_dict

dictionary of the parameters used.

Type: dictionary

step

counter of iterations.

Type: int

event_input_height

original height of the sensor in pixels.

Type: int

event_input_width

original width of the sensor in pixels.

Type: int

Examples

>>> path = "example.h5"
>>> for tensor in HDF5Iterator(path, num_tbins=4):
...     # Returns a torch Tensor.
...     print(tensor.shape)

checks(preprocess_function_name, delta_t, mode='delta_t', n_events=0)

Convenience function to assert precomputed parameters

Parameters

preprocess_function_name (string) – Name of the preprocessing function used to turn events into features. Can be any of the functions present in metavision_ml.preprocessing or one registered by the user.
delta_t (int) – Duration of used events slice in us.

get_time(): Cut Inner Reader Time

get_vis_func(): Returns the visualization function corresponding to the preprocessing being used.

Registering a new preprocessing function

If you write your own preprocessing function you can make it available to existing modules with this API.

metavision_ml.preprocessing.register_new_preprocessing(preproc_name, n_input_channels, function, viz_function, kwargs={'max_incr_per_pixel': 5, 'preprocess_dtype': dtype('float32')})

Registers a new preprocessing function to be available across the package.

This must be done each time the python interpreter is invoked.

Parameters

preproc_name (string) – Name of the preprocessing function, has to be unique.
n_input_channels (int) – Number of channels in the resulting tensor.
function (function) – Preprocessing function. Its signature must be function(events, tensor, **kwargs) -> None
viz_function (function) – Visualization function. Its signature must be viz_function(tensor) -> img (np.ndarray H x W x 3)
kwargs (dict) – Dictionary of optional arguments to be passed to the function.
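A function pair matching the two required signatures could look like the sketch below. Everything in it is illustrative: the names event_count and viz_event_count, the tensor layout (num_tbins, 1, H, W), the (1, H, W) input of the visualization function, and the presence of a total_tbins_delta_t keyword are assumptions. The registration call is left as a comment so that the sketch runs without the SDK installed.

```python
import numpy as np


def event_count(events, tensor, **kwargs):
    """Hypothetical preprocessing: count events per pixel, ignoring polarity."""
    num_tbins = tensor.shape[0]
    tensor[...] = 0
    total = kwargs.get("total_tbins_delta_t")  # assumed keyword, see lead-in
    if total is None:
        total = int(events["t"].max()) + 1 if len(events) else 1
    tbin = (events["t"].astype(np.int64) * num_tbins) // total
    keep = (tbin >= 0) & (tbin < num_tbins)
    y = events["y"][keep].astype(np.int64)
    x = events["x"][keep].astype(np.int64)
    np.add.at(tensor, (tbin[keep], 0, y, x), tensor.dtype.type(1))


def viz_event_count(tensor):
    """Hypothetical visualization: (1, H, W) counts to an (H, W, 3) uint8 image."""
    im = tensor[0].astype(np.float32)
    peak = im.max()
    if peak > 0:
        im = im / peak
    gray = (im * 255).astype(np.uint8)
    return np.repeat(gray[..., None], 3, axis=-1)


# With metavision_ml installed, the pair would be registered like this:
# from metavision_ml.preprocessing import register_new_preprocessing
# register_new_preprocessing("event_count", 1, event_count, viz_event_count)

dtype = [("x", "<u2"), ("y", "<u2"), ("p", "<i2"), ("t", "<i8")]
events = np.array([(1, 2, 1, 10), (1, 2, 0, 20), (3, 0, 0, 70)], dtype=dtype)
tensor = np.zeros((1, 1, 4, 4), dtype=np.float32)
event_count(events, tensor, total_tbins_delta_t=100)
img = viz_event_count(tensor[0])
```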

metavision_ml.preprocessing.get_preprocess_dict(preproc_name)

Returns a mutable dict describing a preprocessing function.

Parameters: preproc_name (string) – Name of the preprocessing function, either an existing one or one previously registered.

metavision_ml.preprocessing.get_preprocess_function_names(): Returns the names of all existing and registered preprocessing functions.