SDK ML Preprocessing API

A collection of preprocessing functions to turn a buffer of events into a sequence of 2 dimensional features

This works only for events represented as numpy structured arrays. It makes intensive use of numba (http://numba.pydata.org/), which is awesome.

Examples

>>> delta = 100000
>>> initial_ts = record.current_time
>>> events = record.load_delta_t(delta)  # load 100 milliseconds worth of events
>>> events['t'] -= int(initial_ts)  # events timestamp should be reset
>>> output_array = np.zeros((1, 2, height, width))  # preallocate output array
>>> histo(events, output_array, delta)
metavision_ml.preprocessing.event_to_tensor.diff(xypt, output_array, total_tbins_delta_t, downsampling_factor=0, reset=True, max_incr_per_pixel=5.0)

Returns the difference between the histogram of positive events and the histogram of negative events. It requires output_array to have a signed dtype.

Parameters
  • xypt (events) – structured array containing events

  • output_array – numpy float32 array [tbins, 1, height, width]

  • total_tbins_delta_t – duration of the timeslice

  • downsampling_factor (int) – Parameter used to reduce the spatial dimension of the obtained feature. Actually multiply the coordinates by 2**(-downsampling_factor).

  • reset (boolean) – whether to reset output_array to 0 beforehand.

  • max_incr_per_pixel – maximum number of increments per pixel (expressed in initial resolution).
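
To make the description above concrete, here is a minimal NumPy sketch of a signed event difference. It is not the library's numba implementation: the event dtype, the time-bin rule and the name diff_sketch are assumptions made for illustration, and max_incr_per_pixel is deliberately left out.

```python
import numpy as np

# Assumed event layout (illustrative only, not taken from the library).
EVENT_DTYPE = np.dtype([("x", "<u2"), ("y", "<u2"), ("p", "<i2"), ("t", "<i8")])


def diff_sketch(events, output_array, total_tbins_delta_t, downsampling_factor=0, reset=True):
    """Add +1 per positive event and -1 per negative event, per time bin."""
    if reset:
        output_array[...] = 0
    num_tbins = output_array.shape[0]
    # Assumed binning rule: timestamps start at 0 and bins have equal length.
    tbin = np.clip((events["t"] * num_tbins) // total_tbins_delta_t, 0, num_tbins - 1)
    channel = np.zeros_like(tbin)  # single output channel
    # Downsampling multiplies the coordinates by 2**(-downsampling_factor).
    x = events["x"].astype(np.int64) >> downsampling_factor
    y = events["y"].astype(np.int64) >> downsampling_factor
    sign = np.where(events["p"] > 0, 1, -1).astype(output_array.dtype)
    # np.add.at accumulates correctly when the same pixel appears several times.
    np.add.at(output_array, (tbin, channel, y, x), sign)


events = np.array(
    [(1, 0, 1, 10), (1, 0, 1, 20), (1, 0, 0, 30), (2, 1, 0, 60)],
    dtype=EVENT_DTYPE,
)
out = np.zeros((2, 1, 2, 4), dtype=np.float32)
diff_sketch(events, out, total_tbins_delta_t=100)
```

A signed or floating dtype is needed here because negative events push pixel values below zero.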

metavision_ml.preprocessing.event_to_tensor.diff_quantized(xypt, output_array, total_tbins_delta_t, downsampling_factor=0, reset=True, negative_bit_length=8, normalization=False)

Returns the difference between the histogram of positive events and the histogram of negative events. It requires output_array to have a signed dtype.

Parameters
  • xypt (events) – structured array containing events

  • output_array (np.ndarray) – Pre allocated numpy array of shape (num_tbins, 1, height, width). If downsampling_factor is not 0 or normalization is True, the datatype should be floating. Otherwise it should be np.int8.

  • total_tbins_delta_t – duration of the timeslice

  • downsampling_factor (int) – Parameter used to reduce the spatial dimension of the obtained feature. Actually multiply the coordinates by 2**(-downsampling_factor).

  • reset (boolean) – whether to reset output_array to 0 beforehand.

  • negative_bit_length (int) – negative bits length set by the camera settings. It controls the precision used to compute the tensor and must be integer >2 and <=8 in Diff3D mode.

  • normalization (boolean) – If not used, the dtype of output_array should be set as int8 and the data ranges [-2 ** (negative_bit_length - 1), 2 ** (negative_bit_length - 1) - 1]. If used, the dtype of output_array should be set as floating point and the data ranges [-1.0, 1.0]
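
The value ranges listed for normalization can be illustrated with a small NumPy sketch. The function name and the scaling used when normalization is True are assumptions; only the integer range comes from the parameter description above.

```python
import numpy as np


def quantize_diff_sketch(signed_counts, negative_bit_length=8, normalization=False):
    """Clip signed event counts to the range implied by negative_bit_length."""
    low = -(2 ** (negative_bit_length - 1))
    high = 2 ** (negative_bit_length - 1) - 1
    clipped = np.clip(signed_counts, low, high)
    if normalization:
        # Assumed scaling: divide by the magnitude of the lower bound.
        return (clipped / float(-low)).astype(np.float32)
    return clipped.astype(np.int8)


counts = np.array([-300, -3, 0, 5, 300])
quantized = quantize_diff_sketch(counts, negative_bit_length=4)
normalized = quantize_diff_sketch(counts, negative_bit_length=4, normalization=True)
```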

metavision_ml.preprocessing.event_to_tensor.event_cube(xypt, output_array, total_tbins_delta_t, downsampling_factor=0, split_polarity=True, reset=True, max_incr_per_pixel=5)

Takes xypt events within a timeslice of length total_tbins_delta_t and updates an array of shape (num_tbins,num_utbins*2,H,W). Microbins are the intervals used in the channels [Unsupervised Event-based Learning of Optical Flow, Zhu et al. 2018].

Note: you should load delta_t * (num_tbins+1) to avoid artefacts on the last timeslice, because the support of the time bilinear kernel is 2 bins.

Parameters
  • xypt (events) – structured array containing events

  • output_array (np.ndarray) – (num_tbins,num_utbins*2,H,W) dtype MUST be floating point!

  • total_tbins_delta_t (int) – Length in us of the interval in which events are taken

  • downsampling_factor – divides the event coordinates x & y by this power of 2. (WARNING): This is not like in the paper, where bilinear weights are used for downsampling as well. A true event-based bilinear resize should contribute to 8 different cells in the result tensor.

  • split_polarity (bool) – whether to split polarity into 2 channels or to use p as a weight in {-1,1}

  • reset (bool) – reset output_array, in most cases you should put this to True

  • max_incr_per_pixel – maximum number of increments per pixel (expressed in initial resolution).
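
The remark that the time bilinear kernel has a support of 2 bins can be seen in a short NumPy sketch. It follows the usual triangular kernel max(0, 1 - |b - t*|) found in the event-volume literature; the exact normalization used by event_cube is not specified here, so the formula and the function name are assumptions.

```python
import numpy as np


def temporal_bilinear_weights(t, num_bins, total_delta_t):
    """Return a (num_bins, n_events) matrix of temporal interpolation weights."""
    # Assumed rescaling: map timestamps from [0, total_delta_t] to [0, num_bins - 1].
    t_star = (num_bins - 1) * t.astype(np.float64) / total_delta_t
    bins = np.arange(num_bins)[:, None]
    # Triangular kernel: each event touches at most two neighbouring bins.
    return np.maximum(0.0, 1.0 - np.abs(bins - t_star[None, :]))


timestamps = np.array([0, 25, 100])
weights = temporal_bilinear_weights(timestamps, num_bins=3, total_delta_t=100)
```

Each column of weights holds the contribution of one event: the event at t=25 is shared equally between the first two bins, which is why events slightly beyond the last bin are needed to fill it completely.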

metavision_ml.preprocessing.event_to_tensor.histo(xypt, output_array, total_tbins_delta_t, downsampling_factor=0, reset=True, max_incr_per_pixel=5.0)

Computes histogram on a sequence of events.

Parameters
  • xypt (events) – structured array containing events

  • output_array (np.ndarray) – Pre allocated numpy array of shape (num_tbins, 2, height, width)

  • total_tbins_delta_t (int) – Time interval of the extended time slice (with tbins).

  • downsampling_factor (int) – Parameter used to reduce the spatial dimension of the obtained feature. Actually multiply the coordinates by 2**(-downsampling_factor).

  • reset (boolean) – whether to reset output_array to 0 beforehand.

  • max_incr_per_pixel – maximum number of increments per pixel (expressed in initial resolution).
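
As an illustration, a two-channel event histogram can be sketched in plain NumPy as follows. The channel order (negative events first), the binning rule and the clip-and-normalize use of max_incr_per_pixel are assumptions; downsampling is omitted for brevity.

```python
import numpy as np


def histo_sketch(x, y, p, t, output_array, total_tbins_delta_t, max_incr_per_pixel=5.0):
    """Count events per pixel, polarity and time bin, then clip and normalize."""
    num_tbins = output_array.shape[0]
    tbin = np.clip((t * num_tbins) // total_tbins_delta_t, 0, num_tbins - 1)
    channel = (p > 0).astype(np.int64)  # assumed: 0 = negative, 1 = positive
    counts = np.zeros(output_array.shape, dtype=np.int64)
    np.add.at(counts, (tbin, channel, y, x), 1)
    # Assumed normalization: divide by max_incr_per_pixel and saturate at 1.
    output_array[...] = np.minimum(counts / max_incr_per_pixel, 1.0)


x = np.array([0, 0, 0, 1])
y = np.array([0, 0, 0, 0])
p = np.array([1, 1, 1, 0])
t = np.array([0, 10, 20, 30])
histogram = np.zeros((1, 2, 1, 2), dtype=np.float32)
histo_sketch(x, y, p, t, histogram, total_tbins_delta_t=100, max_incr_per_pixel=2.0)
```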

metavision_ml.preprocessing.event_to_tensor.histo_quantized(xypt, output_array, total_tbins_delta_t, downsampling_factor=0, reset=True, negative_bit_length=4, total_bit_length=8, normalization=False)

Computes histogram on a sequence of events.

Parameters
  • xypt (events) – structured array containing events

  • output_array (np.ndarray) – Pre allocated numpy array of shape (num_tbins, 2, height, width). If downsampling_factor is not 0 or normalization is True, the datatype should be floating. Otherwise it should be np.uint8.

  • total_tbins_delta_t (int) – Time interval of the extended time slice (with tbins).

  • downsampling_factor (int) – Parameter used to reduce the spatial dimension of the obtained feature. Actually multiply the coordinates by 2**(-downsampling_factor).

  • reset (boolean) – whether to reset output_array to 0 beforehand.

  • negative_bit_length (int) – negative bits length set by the camera settings. It controls the precision used to compute the tensor and must be integer >0 and <=8 in Histo3D mode.

  • total_bit_length – total number of bits used to store the data, total_bit_length = negative_bit_length + positive_bit_length

  • normalization (boolean) – If not used, the dtype of output_array should be set as uint8 and the data ranges [0, 2 ** negative_bit_length] for the negative channel and [0, 2 ** positive_bit_length] for the positive channel. If used, the dtype of output_array should be set as floating point and the data ranges [0, 1.0]

metavision_ml.preprocessing.event_to_tensor.multi_channel_timesurface(xypt, output_array, total_tbins_delta_t, downsampling_factor=0, reset=True, **kwargs)

Computes a linear two-channel time surface n times per delta_t.

The number of channels of output_array must be even. If it is 4 there will be 2 micro delta_t, if it is 6 there will be 3, and so on. This feature contains precise time information and makes it possible to perceive phenomena at frequencies higher than 1 / delta_t. The dtype of output_array must be sufficient to hold values up to total_tbins_delta_t without overflow.

Parameters
  • xypt (events) – structured array containing events

  • output_array (np.ndarray) – (num_tbins,num_utbins*2,H,W) dtype MUST be floating point!

  • total_tbins_delta_t (int) – length in us of the interval in which events are taken

  • downsampling_factor – divides the event coordinates x & y by this power of 2.

  • reset (bool) – whether to reset output_array to 0 before computing the feature.

metavision_ml.preprocessing.event_to_tensor.timesurface(xypt, output_array, total_tbins_delta_t, downsampling_factor=0, reset=True, normed=True, **kwargs)

Computes a linear two-channel time surface. The dtype of output_array must be sufficient to hold values up to total_tbins_delta_t without overflow.

Parameters
  • xypt (events) – structured array containing events

  • output_array (np.ndarray) – Pre allocated numpy array of shape (num_tbins, 2, height, width)

  • total_tbins_delta_t (int) – Length in us of the interval in which events are taken

  • downsampling_factor (int) – Parameter used to reduce the spatial dimension of the obtained feature. Actually multiply the coordinates by 2**(-downsampling_factor).

  • reset (boolean) – whether to reset output_array to 0 beforehand.

  • normed (boolean) – if True, scales the timesurface between 0 and 1.
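
A linear two-channel time surface can be sketched in NumPy as the latest timestamp seen at each pixel for each polarity. This is an illustration under assumptions (channel order, a single time bin, timestamps starting at 0), not the library code.

```python
import numpy as np


def timesurface_sketch(x, y, p, t, height, width, delta_t, normed=True):
    """Keep the most recent timestamp per pixel and polarity."""
    surface = np.zeros((2, height, width), dtype=np.float32)
    channel = (p > 0).astype(np.int64)  # assumed: 0 = negative, 1 = positive
    # maximum.at keeps the largest (latest) timestamp when a pixel fires several times.
    np.maximum.at(surface, (channel, y, x), t.astype(np.float32))
    if normed:
        surface /= float(delta_t)  # scale to [0, 1]
    return surface


x = np.array([0, 0, 1])
y = np.array([0, 0, 0])
p = np.array([1, 1, 0])
t = np.array([10, 40, 50])
surface = timesurface_sketch(x, y, p, t, height=1, width=2, delta_t=100)
```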

Write HDF5 tensor files from event files

metavision_ml.preprocessing.hdf5.generate_hdf5(paths, output_folder, preprocess, delta_t, height=None, width=None, start_ts=0, max_duration=None, n_processes=2, box_labels=[], store_as_uint8=False, mode='delta_t', n_events=0, preprocess_kwargs={})

Generates HDF5 files of frames at a regular interval for dataset caching.

It is meant to produce a dataset with a smaller memory footprint, that is therefore easier to load from disk. Generated files share the name of the input file but have an HDF5 extension.

If max_duration is not None the naming of the output_file follows this pattern:

“{output_folder}/{path basename}_{index:d}.h5” where index allows you to distinguish the cut.

otherwise “{output_folder}/{path basename}.h5”
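
The naming pattern above can be reproduced with a small helper. The function name is hypothetical, and the basename is assumed to be the input file name without its extension.

```python
import os


def hdf5_output_path(path, output_folder, index=None):
    """Build the output file name following the documented pattern."""
    # Assumption: "path basename" means the file name stripped of its extension.
    basename = os.path.splitext(os.path.basename(path))[0]
    if index is None:
        return os.path.join(output_folder, f"{basename}.h5")
    return os.path.join(output_folder, f"{basename}_{index:d}.h5")


whole_file = hdf5_output_path("src_path/test/recording.dat", "dst_path/test")
third_cut = hdf5_output_path("src_path/test/recording.dat", "dst_path/test", index=2)
```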

Example

>>> python3 metavision_ml/preprocessing/hdf5.py src_path/test/*.dat --o dst_path/test/ --height_width 480 640 --preprocess histo --box-labels src_path/test/*bbox.npy
Parameters
  • paths (string list) – Paths of input files.

  • output_folder (string) – Folder path where data will be written.

  • preprocess (string) – Name of the preprocessing function to use.

  • delta_t (int) – Period at which tensor are computed (in us).

  • height (int) – if None, the features are not downsampled; otherwise they are downsampled to height, which must be the sensor’s height divided by a power of 2.

  • width (int) – if None, the features are not downsampled; otherwise they are downsampled to width, which must be the sensor’s width divided by a power of 2.

  • start_ts (int) – Timestamp (in microseconds) from which the computation begins. Either a single int for all files or a list containing exactly one int per file.

  • max_duration (int) – If not None, limit the duration of the file to max_duration (in us).

  • n_processes (int) – Number of processes writing files simultaneously.

  • box_labels (string list) – Paths to the npy box label files matching each input file. If start_ts or max_duration are specified, these files will be cut accordingly.

  • store_as_uint8 (boolean) – if True, casts to byte before storing to save space. Only supports 0-1 normalized data.

  • mode (string) – Load by timeslice or number of events. Either “delta_t” or “n_events”.

  • n_events (int) – Number of events in the timeslice.

  • preprocess_kwargs (dictionary) – A dictionary containing the kwargs used by the different preprocessing methods.
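
The constraint on height and width (the sensor size divided by a power of 2) can be checked before launching a long conversion. The helper below is hypothetical and only shows how a target size relates to a downsampling factor.

```python
def downsampling_factor_for(sensor_size, target_size):
    """Return k such that target_size == sensor_size / 2**k, or raise ValueError."""
    if target_size is None:
        return 0  # no downsampling requested
    factor, size = 0, sensor_size
    while size > target_size and size % 2 == 0:
        size //= 2
        factor += 1
    if size != target_size:
        raise ValueError(
            f"{target_size} is not {sensor_size} divided by a power of 2"
        )
    return factor


half_height = downsampling_factor_for(480, 240)
quarter_width = downsampling_factor_for(640, 160)
```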

metavision_ml.preprocessing.hdf5.split_label(box_path, output_folder, start_ts=0, max_duration=None)

Copies the labels, cut according to start_ts and max_duration.

metavision_ml.preprocessing.hdf5.write_to_hdf5(path, start_ts=0, output_folder='.', delta_t=50000, preprocess='histo', height=None, width=None, max_duration=None, store_as_uint8=False, mode='delta_t', n_events=0, preprocess_kwargs={})

Generates a single HDF5 tensor file of frames at a regular interval for dataset caching.

It is meant to produce a dataset with a smaller memory footprint, that is therefore easier to load from disk. Generated files share the name of the input RAW file but have an HDF5 extension.

The naming of the output_file follows this pattern:

“{output_folder}/{path basename}_{start_ts}.h5”

Parameters
  • path (string) – Path of the input file.

  • output_folder (string) – Folder path where data will be written.

  • preprocess (string) – Name of the preprocessing function to use.

  • delta_t (int) – Period at which tensors are computed (in us).

  • height (int) – if None, the features are not downsampled; otherwise they are downsampled to height, which must be the sensor’s height divided by a power of 2.

  • width (int) – if None, the features are not downsampled; otherwise they are downsampled to width, which must be the sensor’s width divided by a power of 2.

  • start_ts (int) – Timestamp (in microseconds) from which the computation begins. Either a single int for all files or a list containing exactly one int per file.

  • max_duration (int) – If not None, limit the duration of the file to max_duration us.

  • store_as_uint8 (boolean) – if True, casts to byte before storing to save space. Only supports 0-1 normalized data.

  • mode (string) – Load by timeslice or number of events. Either “delta_t” or “n_events”.

  • n_events (int) – Number of events in the timeslice.

  • preprocess_kwargs (dictionary) – A dictionary containing the kwargs used by the different preprocessing methods.

A collection of visualization utilities for preprocessings

Examples

>>> delta = 100000
>>> initial_ts = record.current_time
>>> events = record.load_delta_t(delta)  # load 100 milliseconds worth of events
>>> events['t'] -= int(initial_ts)  # events timestamp should be reset
>>> output_array = np.zeros((1, 2, height, width))  # preallocate output array
>>> histo(events, output_array, delta)
>>> img = viz_histo(output_array[0])
>>> cv2.imshow('img', img)
>>> cv2.waitKey()
metavision_ml.preprocessing.viz.filter_outliers(input_val, num_std=2)

Filter outliers in an array

Parameters

input_val (np.ndarray) – Array of any shape

Returns

Normalized array of same shape as input

Return type

output_array (np.ndarray)
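
One common way to filter outliers with a num_std parameter is to clip values to the mean plus or minus num_std standard deviations. The sketch below assumes that behaviour; the actual rule used by filter_outliers is not described here.

```python
import numpy as np


def filter_outliers_sketch(input_val, num_std=2):
    """Clip values lying further than num_std standard deviations from the mean."""
    mean, std = input_val.mean(), input_val.std()
    return np.clip(input_val, mean - num_std * std, mean + num_std * std)


values = np.array([0.0, 0.0, 0.0, 0.0, 100.0])
filtered = filter_outliers_sketch(values, num_std=1)
```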

metavision_ml.preprocessing.viz.gray_to_rgb(im)

Repeats the grayscale image 3 times to form the channels of an RGB image.

Parameters

im (np.ndarray) – Array of shape (H,W)

Returns

array of shape (H,W,3)

Return type

output_array

metavision_ml.preprocessing.viz.normalize(im)

Normalizes image by min-max

Parameters

im (np.ndarray) – Array of any shape

Returns

Normalized array of same shape as input

Return type

output_array (np.ndarray)
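
For reference, min-max normalization and the channel repetition performed by gray_to_rgb can each be written in a few lines of NumPy. These are independent sketches with hypothetical names; the handling of a constant image is an assumption.

```python
import numpy as np


def normalize_sketch(im):
    """Rescale an array linearly so that its minimum is 0 and its maximum is 1."""
    low, high = im.min(), im.max()
    if high == low:
        # Assumed convention for a constant image: return zeros.
        return np.zeros(im.shape, dtype=np.float32)
    return ((im - low) / (high - low)).astype(np.float32)


def gray_to_rgb_sketch(im):
    """Turn an (H, W) image into (H, W, 3) by repeating it on a new last axis."""
    return np.repeat(im[..., None], 3, axis=2)


image = np.array([[2.0, 4.0], [6.0, 10.0]])
normalized_image = normalize_sketch(image)
rgb_image = gray_to_rgb_sketch(normalized_image)
```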

metavision_ml.preprocessing.viz.viz_diff(im)

Visualize difference of histogram

Parameters

im (np.ndarray) – Array of shape (H,W)

Returns

Array of shape (H,W)

Return type

output_array (np.ndarray)

metavision_ml.preprocessing.viz.viz_diff_binarized(im)

Visualize binarized difference of events (“ON”-“OFF”)

Parameters

im (np.ndarray) – Array of shape (H,W)

Returns

Array of shape (H,W,3)

Return type

output_array (np.ndarray)

metavision_ml.preprocessing.viz.viz_event_cube_rgb(im, split_polarity=True)

Visualize 3 out of 6 channels in RGB mode.

Parameters
  • im (np.ndarray) – Array of shape (T,H,W) T images or T//2 group of images with 2 channels

  • split_polarity – whether each image is single-channel (ON-OFF) or (ON, OFF) 2 channel images.

Returns

Array of shape (H,W,3)

Return type

output_array (np.ndarray)

metavision_ml.preprocessing.viz.viz_histo(im)

Visualize difference of histogram

Parameters

im (np.ndarray) – Array of shape (2,H,W)

Returns

Array of shape (H,W)

Return type

output_array (np.ndarray)

metavision_ml.preprocessing.viz.viz_histo_binarized(im)

Visualize binarized histogram of events

Parameters

im (np.ndarray) – Array of shape (2,H,W)

Returns

Array of shape (H,W,3)

Return type

output_array (np.ndarray)

metavision_ml.preprocessing.viz.viz_histo_filtered(im, val_max=0.5)

Visualizes a strongly filtered histo image with 3 channels.

Parameters

im (np.ndarray) – Array of shape (2,H,W)

Returns

Array of shape (H,W,3)

Return type

output_array

metavision_ml.preprocessing.viz.viz_histo_rgb(im)

Visualizes a histo image with 3 channels.

Parameters

im (np.ndarray) – Array of shape (H,W)

Returns

Array of shape (H,W,3)

Return type

output_array

metavision_ml.preprocessing.viz.viz_multichannel_timesurface(tensor, blend=False)

Visualizes three channels of a multi channel timesurface.

Parameters
  • tensor (np.ndarray) – array of shape (T,H,W) T images or T//2 group of images with 2 channels

  • blend (boolean) – whether to blend different channels to visualize them as one.

Returns

array of shape (H,W,3)

Return type

output_array (np.ndarray)

metavision_ml.preprocessing.viz.viz_timesurface(im)

Visualize timesurface

Note: In order to generate a timesurface you need to call event_to_tensor.timesurface. Typically, if you want to see an exponential decay timesurface, you don’t set the arg “reset” to True, in order to keep the latest event’s timestamp at each pixel.

Here we assume the timesurface has already been normalized between [0,1] either by min-max normalization or with exponential time-decay.

Parameters

im (np.ndarray) – Array of shape (2,H,W) or (H,W)

Returns

Array of shape (H,W,3)

Return type

output_array (np.ndarray)
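
The exponential time-decay normalization mentioned in the note can be sketched as follows. The decay constant tau and the function name are hypothetical; the input is assumed to hold the latest event timestamp at each pixel.

```python
import numpy as np


def exponential_decay_sketch(last_timestamps, current_time, tau):
    """Map per-pixel latest timestamps to (0, 1]; recent events are close to 1."""
    age = current_time - last_timestamps.astype(np.float64)
    return np.exp(-age / tau).astype(np.float32)


last_timestamps = np.array([[100, 50]])
decayed = exponential_decay_sketch(last_timestamps, current_time=100, tau=50.0)
```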

class metavision_ml.preprocessing.CDProcessor(height, width, num_tbins=5, preprocessing='histo', downsampling_factor=0, preprocess_kwargs={})

Wrapper that simplifies the preprocessing.

Parameters
  • height – image height

  • width – image width

  • num_tbins – number of time slices

  • preprocessing – name of method, must be registered first

  • downsampling_factor (int) – Parameter used to reduce the spatial dimension of the obtained feature. Actually multiply the coordinates by 2**(-downsampling_factor).

get_array_dim()

Returns the shape of the preprocessing tensor.

Returns

shape of the tensor to be preprocessed.

Return type

shape (int tuple)
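
As a rough guide to the shape to expect, the sketch below combines the (num_tbins, channels, height, width) layout used by the preprocessing functions with the effect of downsampling_factor. It is an assumption about how the dimensions combine, not the class's actual code.

```python
def array_dim_sketch(height, width, num_tbins, n_channels, downsampling_factor=0):
    """Shape of the preallocated tensor after spatial downsampling."""
    return (
        num_tbins,
        n_channels,
        height >> downsampling_factor,  # height * 2**(-downsampling_factor)
        width >> downsampling_factor,
    )


histo_shape = array_dim_sketch(480, 640, num_tbins=5, n_channels=2, downsampling_factor=1)
```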

Iterator of feature tensor for a source of input events.

class metavision_ml.preprocessing.preprocessor_iterator.CDProcessorIterator(path, preprocess_function_name, mode='delta_t', start_ts=0, max_duration=None, delta_t=50000, n_events=10000, num_tbins=1, preprocess_kwargs={}, device=device(type='cpu'), height=None, width=None, **kwargs)

Provides feature tensors (torch.Tensor) at regular intervals using python implementation of EventCD preprocessors (CDProcessor).

Relies on the EventsIterator class. The different behaviours of EventsIterator can be leveraged.

Parameters
  • path (string) – Path to the file to read, or empty for a camera.

  • preprocess_function_name (string) – Name of the preprocessing function used to turn events into features. Can be any of the functions present in metavision_ml.preprocessing or one registered by the user.

  • mode (str) – Mode of streaming (n_events, delta_t, mixed)

  • start_ts (int) – Start of the EventsIterator

  • max_duration (int) – Total duration of the EventsIterator

  • delta_t (int) – Duration of used events slice in us.

  • num_tbins (int) – Number of TimeBins

  • preprocess_kwargs – dictionary of optional arguments to the preprocessing function. This can be used to override the default value of max_incr_per_pixel, for instance {“max_incr_per_pixel”: 20} to clip and normalize tensors by 20 at full resolution.

  • device (torch.device) – Torch device (defaults to cpu).

  • height (int) – if None, the features are not downsampled; otherwise they are downsampled to height, which must be the sensor’s height divided by a power of 2.

  • width (int) – if None, the features are not downsampled; otherwise they are downsampled to width, which must be the sensor’s width divided by a power of 2.

  • **kwargs – Arbitrary keyword arguments passed to the underlying EventsIterator.

mv_it

object used to read from the file or the camera.

Type

EventsIterator

array_dim

shape of the tensor (channel, height, width).

Type

tuple

cd_proc

class computing features from events into a preallocated memory array.

Type

CDProcessor

step

counter of iterations.

Type

int

event_input_height

original height of the sensor in pixels.

Type

int

event_input_width

original width of the sensor in pixels.

Type

int

Examples

>>> path = "example.raw"
>>> for tensor in CDProcessorIterator(path, "event_cube", delta_t=10000):
...     # Returns a torch Tensor.
...     print(tensor.shape)
get_time()

Returns the current time of the inner reader.

get_vis_func()

Returns the visualization function corresponding to the preprocessing being used.

class metavision_ml.preprocessing.preprocessor_iterator.EventPreprocessorIterator(path, preprocess_function_name, mode='delta_t', start_ts=0, max_duration=None, delta_t=50000, n_events=10000, num_tbins=1, preprocess_kwargs={}, device=device(type='cpu'), height=None, width=None, **kwargs)

Provides feature tensors (torch.Tensor) at regular intervals using the C++ implementation of EventCD preprocessors (EventPreprocessor).

Relies on the EventsIterator class. The different behaviours of EventsIterator can be leveraged.

Parameters
  • path (string) – Path to the file to read, or empty for a camera.

  • preprocess_function_name (string) –

    Name of the preprocessing function used to turn events into features. Can be any of the following:

    • “diff” for a Differential frame

    • “hardware_diff” or “diff3d” for a quantized differential frame

    • “histo” for a Histogram of events

    • “histo3d” or “hardware_histo” for a quantized histogram

    • “event_cube” for an EventCube

    • “time_surface” for a TimeSurface

  • mode (str) – Mode of streaming (n_events, delta_t, mixed)

  • start_ts (int) – Minimal start time of the EventsIterator

  • max_duration (int) – Maximum duration of the EventsIterator

  • delta_t (int) – Duration of each slice of events in us.

  • num_tbins (int) – Number of TimeBins

  • preprocess_kwargs – dictionary of optional arguments for the preprocessing function.

  • device (torch.device) – Torch device (defaults to cpu).

  • height (int) – if not None, the features are downsampled to the provided height.

  • width (int) – if not None the features are downsampled to the provided width.

  • **kwargs – Arbitrary keyword arguments passed to the underlying EventsIterator.

mv_it

object used to read from the file or the camera.

Type

EventsIterator

array_dim

shape of the tensor (channel, height, width).

Type

tuple

evt_proc

class computing features from events into a preallocated memory array.

Type

EventPreprocessor

step

counter of iterations.

Type

int

event_input_height

original height of the sensor in pixels.

Type

int

event_input_width

original width of the sensor in pixels.

Type

int

Examples

>>> path = "example.raw"
>>> for tensor in EventPreprocessorIterator(path, "event_cube", delta_t=10000):
...     # Returns a torch Tensor.
...     print(tensor.shape)
get_time()

Returns the current time of the inner reader.

class metavision_ml.preprocessing.preprocessor_iterator.HDF5Iterator(path, num_tbins=1, preprocess_kwargs={}, start_ts=0, device=device(type='cpu'), height=None, width=None)

Provides feature tensors (torch.Tensor) at regular intervals from a precomputed HDF5 file.

Parameters
  • path (string) – Path to the HDF5 file containing precomputed features.

  • height (int) – if None, the features are not downsampled; otherwise they are downsampled to height, which must be the sensor’s height divided by a power of 2.

  • width (int) – if None, the features are not downsampled; otherwise they are downsampled to width, which must be the sensor’s width divided by a power of 2.

  • device (torch.device) – Torch device (defaults to cpu).

  • start_ts (int) – First timestamp to consider in us. (Must be a multiple of the HDF5 file delta_t)

dataset

HDF5 dataset containing the precomputed features.

Type

h5py.Dataset

array_dim

shape of the tensor (channel, height, width).

Type

tuple

preprocess_dict

dictionary of the parameters used.

Type

dictionary

step

counter of iterations.

Type

int

event_input_height

original height of the sensor in pixels.

Type

int

event_input_width

original width of the sensor in pixels.

Type

int

Examples

>>> path = "example.h5"
>>> for tensor in HDF5Iterator(path, num_tbins=4):
...     # Returns a torch Tensor.
...     print(tensor.shape)
checks(preprocess_function_name, delta_t, mode='delta_t', n_events=0)

Convenience function to assert precomputed parameters

Parameters
  • preprocess_function_name (string) – Name of the preprocessing function used to turn events into features. Can be any of the functions present in metavision_ml.preprocessing or one registered by the user.

  • delta_t (int) – Duration of used events slice in us.

get_time()

Returns the current time of the inner reader.

get_vis_func()

Returns the visualization function corresponding to the preprocessing being used.

Registering a new preprocessing function

If you write your own preprocessing function you can make it available to existing modules with this API.

metavision_ml.preprocessing.register_new_preprocessing(preproc_name, n_input_channels, function, viz_function, kwargs={'max_incr_per_pixel': 5, 'preprocess_dtype': dtype('float32')})

Registers a new preprocessing function to be available across the package.

This must be done each time the python interpreter is invoked.

Parameters
  • preproc_name (string) – Name of the preprocessing function, has to be unique.

  • n_input_channels (int) – Number of channels in the resulting tensor.

  • function (function) – Preprocessing function. Its signature must be function(events, tensor, **kwargs) -> None

  • viz_function (function) – Visualization function. Its signature must be viz_function(tensor) -> img (np.ndarray H x W x 3)

  • kwargs (dict) – Dictionary of optional arguments to be passed to the function.
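
To illustrate the two signatures required above, here is a sketch of a custom preprocessing function and its visualization function. The names binary_activity and viz_binary_activity are hypothetical, and the tensor layouts ((num_tbins, channels, H, W) for the preprocessing function, (channels, H, W) for the visualization function) are assumptions. The registration call is left as a comment so that the sketch runs without the package installed.

```python
import numpy as np


def binary_activity(events, tensor, **kwargs):
    """function(events, tensor, **kwargs) -> None: mark pixels that received an event."""
    tensor[...] = 0
    # For brevity, everything goes to the first time bin and the first channel.
    tensor[0, 0, events["y"], events["x"]] = 1


def viz_binary_activity(tensor):
    """viz_function(tensor) -> img: return an (H, W, 3) uint8 image."""
    gray = (np.clip(tensor[0], 0, 1) * 255).astype(np.uint8)
    return np.repeat(gray[..., None], 3, axis=2)


# Registration, following the signature documented above (not executed here):
# from metavision_ml.preprocessing import register_new_preprocessing
# register_new_preprocessing("binary_activity", 1, binary_activity, viz_binary_activity)

# Assumed event layout, for illustration only.
events = np.array(
    [(2, 1, 1, 10), (2, 1, 0, 20)],
    dtype=[("x", "<u2"), ("y", "<u2"), ("p", "<i2"), ("t", "<i8")],
)
tensor = np.zeros((1, 1, 2, 3), dtype=np.float32)
binary_activity(events, tensor)
img = viz_binary_activity(tensor[0])
```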

metavision_ml.preprocessing.get_preprocess_dict(preproc_name)

Returns a mutable dict describing a preprocessing function.

Parameters

preproc_name (string) – Name of the preprocessing function, either an existing one or one previously registered.

metavision_ml.preprocessing.get_preprocess_function_names()

Returns the names of all existing and registered preprocessing functions.