SDK ML Preprocessing API

A collection of preprocessing functions that turn a buffer of events into a sequence of 2-dimensional features.

This works only with events represented as numpy structured arrays. It makes intensive use of numba (http://numba.pydata.org/), which is awesome.
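As a rough illustration of the expected input, the sketch below builds a small structured array of events by hand. The field names x, y, p and t follow the xypt naming used throughout this page; the exact field dtypes and the helper make_events are assumptions made for this example, not taken from the library.

```python
import numpy as np

# Assumed field layout for events: x, y, polarity p and timestamp t (in us).
# The exact dtypes are an assumption made for this sketch.
EVENT_DTYPE = np.dtype([("x", "<u2"), ("y", "<u2"), ("p", "<i2"), ("t", "<i8")])


def make_events(xs, ys, ps, ts):
    """Pack four equally long sequences into one structured event array (hypothetical helper)."""
    events = np.zeros(len(ts), dtype=EVENT_DTYPE)
    events["x"] = xs
    events["y"] = ys
    events["p"] = ps
    events["t"] = ts
    return events


events = make_events([3, 4, 3], [1, 1, 2], [1, 0, 1], [1000, 2500, 99000])
events["t"] -= int(events["t"][0])  # reset timestamps so the slice starts at 0
```

Fields are accessed by name, which is why the preprocessing functions can read events['t'] or events['x'] directly.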

Examples

>>> import numpy as np
>>> from metavision_ml.preprocessing.event_to_tensor import histo
>>> # `record` is assumed to be an already opened event reader; height and width are the sensor dimensions
>>> delta = 100000
>>> initial_ts = record.current_time
>>> events = record.load_delta_t(delta)  # load 100 milliseconds worth of events
>>> events['t'] -= int(initial_ts)  # event timestamps should be reset to start at 0
>>> output_array = np.zeros((1, 2, height, width))  # preallocate output array
>>> histo(events, output_array, delta)

metavision_ml.preprocessing.event_to_tensor.diff(xypt, output_array, total_tbins_delta_t, downsampling_factor=0, reset=True, max_incr_per_pixel=5.0)

Returns the difference between the histogram of positive events and the histogram of negative events. output_array must have a signed dtype.

Parameters
  • xypt (events) – structured array containing events

  • output_array – numpy float32 array [tbins, 1, height, width]

  • total_tbins_delta_t – duration of the timeslice

  • downsampling_factor (int) – Parameter used to reduce the spatial dimension of the obtained feature. The event coordinates are multiplied by 2**(-downsampling_factor).

  • reset (boolean) – whether to reset output_array to 0 beforehand.

  • max_incr_per_pixel – maximum number of increments per pixel (expressed in initial resolution).
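The signed dtype requirement can be seen with plain NumPy. The counts below are made-up values used only for illustration; the snippet does not call the library.

```python
import numpy as np

# Hypothetical per-pixel counts of positive and negative events.
positive = np.array([[2, 0], [1, 3]], dtype=np.int16)
negative = np.array([[1, 4], [1, 0]], dtype=np.int16)

difference = positive - negative            # contains a negative entry (-4)
as_signed = difference.astype(np.int8)      # a signed dtype keeps the sign
as_unsigned = difference.astype(np.uint8)   # an unsigned dtype wraps -4 around to 252
```

With an unsigned output array, pixels dominated by negative events would show up as large positive values.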

metavision_ml.preprocessing.event_to_tensor.diff_quantized(xypt, output_array, total_tbins_delta_t, downsampling_factor=0, reset=True, negative_bit_length=8, normalization=False)

Returns the difference between the histogram of positive events and the histogram of negative events. output_array must have a signed dtype.

Parameters
  • xypt (events) – structured array containing events

  • output_array (np.ndarray) – Preallocated numpy array of shape (num_tbins, 1, height, width). If downsampling_factor is not 0 or normalization is True, the dtype should be floating point. Otherwise it should be np.int8.

  • total_tbins_delta_t – duration of the timeslice

  • downsampling_factor (int) – Parameter used to reduce the spatial dimension of the obtained feature. The event coordinates are multiplied by 2**(-downsampling_factor).

  • reset (boolean) – whether to reset output_array to 0 beforehand.

  • negative_bit_length (int) – negative bit length set in the camera settings. It controls the precision used to compute the tensor and must be an integer > 2 and <= 8 in Diff3D mode.

  • normalization (boolean) – If False, the dtype of output_array should be int8 and the data lies in [-2 ** (negative_bit_length - 1), 2 ** (negative_bit_length - 1) - 1]. If True, the dtype of output_array should be floating point and the data lies in [-1.0, 1.0].

metavision_ml.preprocessing.event_to_tensor.event_cube(xypt, output_array, total_tbins_delta_t, downsampling_factor=0, split_polarity=True, reset=True, max_incr_per_pixel=5)

Takes xypt events within a time slice of length total_tbins_delta_t and updates an array of shape (num_tbins, num_utbins*2, H, W). Micro time bins (utbins) are the intervals used in the channels [Unsupervised Event-based Learning of Optical Flow, Zhu et al. 2018].

Note: you should load delta_t * (num_tbins+1) worth of events to avoid artefacts on the last time slice, because the support of the time bilinear kernel is 2 bins.

Parameters
  • xypt (events) – structured array containing events

  • output_array (np.ndarray) – Array of shape (num_tbins, num_utbins*2, H, W). Its dtype MUST be floating point!

  • total_tbins_delta_t (int) – Length in us of the interval in which events are taken

  • downsampling_factor – The event coordinates x and y are divided by 2**downsampling_factor. WARNING: unlike in the paper, no bilinear weights are used for downsampling. A true event-based bilinear resize would contribute to 8 different cells of the result tensor.

  • split_polarity (bool) – whether to split polarity into 2 channels or to treat p as a weight in {-1, 1}.

  • reset (bool) – whether to reset output_array to 0 beforehand. In most cases this should be True.

  • max_incr_per_pixel – maximum number of increments per pixel (expressed in initial resolution).
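The note about the 2-bin support of the time bilinear kernel can be illustrated in one dimension. The sketch below is not the library's implementation: the helper names and the alignment of bin centres on the time axis are assumptions chosen for illustration. Each event contributes to its two neighbouring bins with weights that sum to 1.

```python
import numpy as np


def temporal_bilinear_weights(t, total_delta_t, num_bins):
    """Return, per event, the two neighbouring bin indices and their weights."""
    # Continuous bin coordinate in [0, num_bins - 1] (this alignment is an assumption).
    pos = np.asarray(t, dtype=np.float64) / total_delta_t * (num_bins - 1)
    left = np.floor(pos).astype(np.int64)
    right = np.minimum(left + 1, num_bins - 1)
    w_right = pos - left
    w_left = 1.0 - w_right
    return left, right, w_left, w_right


def accumulate_in_time(t, total_delta_t, num_bins):
    """Spread each event over its two neighbouring time bins."""
    bins = np.zeros(num_bins)
    left, right, w_left, w_right = temporal_bilinear_weights(t, total_delta_t, num_bins)
    np.add.at(bins, left, w_left)    # np.add.at accumulates correctly on repeated indices
    np.add.at(bins, right, w_right)
    return bins


# An event at t=25 lies halfway between bins 0 and 1 and is split evenly between them.
bins = accumulate_in_time([0, 25, 50], total_delta_t=100, num_bins=3)
```

Because every bin also receives weight from events on its other side, the last bin of a slice stays incomplete unless the events that follow it are loaded as well.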

metavision_ml.preprocessing.event_to_tensor.histo(xypt, output_array, total_tbins_delta_t, downsampling_factor=0, reset=True, max_incr_per_pixel=5.0)

Computes histogram on a sequence of events.

Parameters
  • xypt (events) – structured array containing events

  • output_array (np.ndarray) – Preallocated numpy array of shape (num_tbins, 2, height, width)

  • total_tbins_delta_t (int) – Total duration of the time slice, covering all time bins (tbins).

  • downsampling_factor (int) – Parameter used to reduce the spatial dimension of the obtained feature. The event coordinates are multiplied by 2**(-downsampling_factor).

  • reset (boolean) – whether to reset output_array to 0 beforehand.

  • max_incr_per_pixel – maximum number of increments per pixel (expressed in initial resolution).
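To make the parameters above concrete, here is a plain NumPy sketch of a two-channel event histogram. It is not the numba implementation shipped with the library: histo_sketch is a hypothetical name, max_incr_per_pixel is left out, and the channel order (channel 0 for negative events, channel 1 for positive ones) is an assumption.

```python
import numpy as np


def histo_sketch(events, output_array, total_tbins_delta_t, downsampling_factor=0, reset=True):
    """Count events per (time bin, polarity channel, y, x) cell."""
    if reset:
        output_array[...] = 0
    num_tbins, _, height, width = output_array.shape
    # Time bin of each event, assuming timestamps were reset to start at 0.
    tbin = events["t"].astype(np.int64) * num_tbins // total_tbins_delta_t
    # Dividing coordinates by 2**downsampling_factor is a right shift on integers.
    x = events["x"].astype(np.int64) >> downsampling_factor
    y = events["y"].astype(np.int64) >> downsampling_factor
    channel = (events["p"] > 0).astype(np.int64)  # assumed channel order
    keep = (tbin >= 0) & (tbin < num_tbins) & (x < width) & (y < height)
    one = output_array.dtype.type(1)
    np.add.at(output_array, (tbin[keep], channel[keep], y[keep], x[keep]), one)
    return output_array


dtype = [("x", "<u2"), ("y", "<u2"), ("p", "<i2"), ("t", "<i8")]
events = np.array([(3, 2, 1, 10), (2, 3, 1, 20), (0, 0, 0, 60)], dtype=dtype)
out = np.zeros((2, 2, 2, 2), dtype=np.float32)  # 2 tbins, 2 channels, 2x2 after downsampling
histo_sketch(events, out, total_tbins_delta_t=100, downsampling_factor=1)
```

With downsampling_factor=1, the first two events fall into the same downsampled pixel (1, 1) of the first time bin, and the third event lands in the second time bin.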

metavision_ml.preprocessing.event_to_tensor.histo_quantized(xypt, output_array, total_tbins_delta_t, downsampling_factor=0, reset=True, negative_bit_length=4, total_bit_length=8, normalization=False)

Computes histogram on a sequence of events.

Parameters
  • xypt (events) – structured array containing events

  • output_array (np.ndarray) – Preallocated numpy array of shape (num_tbins, 2, height, width). If downsampling_factor is not 0 or normalization is True, the dtype should be floating point. Otherwise it should be np.uint8.

  • total_tbins_delta_t (int) – Total duration of the time slice, covering all time bins (tbins).

  • downsampling_factor (int) – Parameter used to reduce the spatial dimension of the obtained feature. The event coordinates are multiplied by 2**(-downsampling_factor).

  • reset (boolean) – whether to reset output_array to 0 beforehand.

  • negative_bit_length (int) – negative bit length set in the camera settings. It controls the precision used to compute the tensor and must be an integer > 0 and <= 8 in Histo3D mode.

  • total_bit_length – total number of bits used to store the data: total_bit_length = negative_bit_length + positive_bit_length.

  • normalization (boolean) – If False, the dtype of output_array should be uint8 and the data lies in [0, 2 ** negative_bit_length] for the negative channel and [0, 2 ** positive_bit_length] for the positive channel. If True, the dtype of output_array should be floating point and the data lies in [0, 1.0].
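The relation between the bit lengths and the value ranges above can be sketched as follows. This is only an illustration under stated assumptions: normalize_quantized_histo is a hypothetical helper, channel 0 is assumed to hold negative events, and dividing each channel by its upper bound is one plausible way to reach [0, 1.0], not necessarily what the library does.

```python
import numpy as np


def normalize_quantized_histo(histo, negative_bit_length=4, total_bit_length=8):
    """Scale each polarity channel of a (num_tbins, 2, H, W) histogram by its upper bound."""
    positive_bit_length = total_bit_length - negative_bit_length
    out = histo.astype(np.float32)
    out[:, 0] /= 2 ** negative_bit_length   # negative channel bound (assumed channel order)
    out[:, 1] /= 2 ** positive_bit_length   # positive channel bound
    return out


raw = np.zeros((1, 2, 1, 2), dtype=np.uint8)
raw[0, 0, 0, 0] = 16   # upper bound of the negative channel for 4 bits
raw[0, 1, 0, 1] = 8    # half of the positive channel bound for 4 bits
normalized = normalize_quantized_histo(raw)
```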

metavision_ml.preprocessing.event_to_tensor.multi_channel_timesurface(xypt, output_array, total_tbins_delta_t, downsampling_factor=0, reset=True, **kwargs)

Computes a linear two-channel time surface n times per delta_t.

The number of channels of output_array must be even. With 4 channels there are 2 micro delta_t, with 6 channels there are 3, and so on. This feature contains precise time information and makes it possible to perceive phenomena with frequencies higher than 1 / delta_t. The dtype of output_array must be sufficient to hold values up to total_tbins_delta_t without overflow.

Parameters
  • xypt (events) – structured array containing events

  • output_array (np.ndarray) – Array of shape (num_tbins, num_utbins*2, H, W). Its dtype MUST be floating point!

  • total_tbins_delta_t (int) – length in us of the interval in which events are taken

  • downsampling_factor – The event coordinates x and y are divided by 2**downsampling_factor.

  • reset (bool) – whether to reset output_array to 0 before computing the feature.

metavision_ml.preprocessing.event_to_tensor.timesurface(xypt, output_array, total_tbins_delta_t, downsampling_factor=0, reset=True, normed=True, **kwargs)

Computes a linear two-channel time surface. The dtype of output_array must be sufficient to hold values up to total_tbins_delta_t without overflow.

Parameters
  • xypt (events) – structured array containing events

  • output_array (np.ndarray) – Preallocated numpy array of shape (num_tbins, 2, height, width)

  • total_tbins_delta_t (int) – Length in us of the interval in which events are taken

  • downsampling_factor (int) – Parameter used to reduce the spatial dimension of the obtained feature. The event coordinates are multiplied by 2**(-downsampling_factor).

  • reset (boolean) – whether to reset output_array to 0 beforehand.

  • normed (boolean) – if True, scales the timesurface between 0 and 1.
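A minimal NumPy sketch of a linear two-channel time surface is given below. It is an illustration, not the library's implementation: timesurface_sketch is a hypothetical name, it allocates its own array instead of updating output_array, and both the channel order and the use of timestamps relative to the start of the slice are assumptions.

```python
import numpy as np


def timesurface_sketch(events, num_tbins, height, width, total_tbins_delta_t, normed=True):
    """Keep, per time bin, polarity and pixel, the timestamp of the latest event."""
    surface = np.zeros((num_tbins, 2, height, width), dtype=np.float64)
    t = events["t"].astype(np.int64)
    tbin = np.clip(t * num_tbins // total_tbins_delta_t, 0, num_tbins - 1)
    channel = (events["p"] > 0).astype(np.int64)  # assumed channel order
    y = events["y"].astype(np.int64)
    x = events["x"].astype(np.int64)
    # np.maximum.at keeps the largest (latest) timestamp when a pixel fires several times.
    np.maximum.at(surface, (tbin, channel, y, x), t.astype(np.float64))
    if normed:
        surface /= total_tbins_delta_t  # scale values to [0, 1]
    return surface


dtype = [("x", "<u2"), ("y", "<u2"), ("p", "<i2"), ("t", "<i8")]
events = np.array([(1, 0, 1, 10), (1, 0, 1, 40), (0, 1, 0, 25)], dtype=dtype)
surface = timesurface_sketch(events, num_tbins=1, height=2, width=2, total_tbins_delta_t=100)
```

The pixel at x=1, y=0 fires twice with positive polarity, so only its latest timestamp (40 us, or 0.4 after scaling) is kept.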

Write HDF5 feature files from event files

metavision_ml.preprocessing.hdf5.generate_hdf5(paths, output_folder, preprocess, delta_t, height=None, width=None, start_ts=0, max_duration=None, n_processes=2, box_labels=[], store_as_uint8=False, mode='delta_t', n_events=0, preprocess_kwargs={})

Generates HDF5 files of frames at a regular interval for dataset caching.

It is meant to produce a dataset with a smaller memory footprint, that is therefore easier to load from disk. Generated files share the name of the input file but have an HDF5 extension.

If max_duration is not None, the output file is named “{output_folder}/{path basename}_{index:d}.h5”, where index distinguishes the cuts. Otherwise it is named “{output_folder}/{path basename}.h5”.
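The naming pattern can be reproduced with the standard library. The helper below is hypothetical and only mirrors the pattern described above; it assumes that "path basename" means the input file name without its extension.

```python
import os


def hdf5_output_path(path, output_folder, index=None):
    """Build the output file name for an input file (hypothetical helper)."""
    # Assumption: the basename is the input file name stripped of its extension.
    basename = os.path.splitext(os.path.basename(path))[0]
    if index is None:
        return os.path.join(output_folder, f"{basename}.h5")
    return os.path.join(output_folder, f"{basename}_{index:d}.h5")


whole_file = hdf5_output_path("src_path/test/recording.dat", "dst_path/test")
third_cut = hdf5_output_path("src_path/test/recording.dat", "dst_path/test", index=2)
```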

Example

$ python3 metavision_ml/preprocessing/hdf5.py src_path/test/*.dat --o dst_path/test/ --height_width 480 640 --preprocess histo --box-labels src_path/test/*bbox.npy

Parameters
  • paths (string list) – Paths of input files.

  • output_folder (string) – Folder path where data will be written.

  • preprocess (string) – Name of the preprocessing function to use.

  • delta_t (int) – Period at which tensors are computed (in us).

  • height (int) – If None, the features are not downsampled. Otherwise they are downsampled to height, which must be the sensor’s height divided by a power of 2.

  • width (int) – If None, the features are not downsampled. Otherwise they are downsampled to width, which must be the sensor’s width divided by a power of 2.

  • start_ts (int) – Timestamp (in microseconds) from which the computation begins. Either a single int for all files or a list containing exactly one int per file.

  • max_duration (int) – If not None, limit the duration of the file to max_duration (in us).

  • n_processes (int) – Number of processes writing files simultaneously.

  • box_labels (string list) – Paths to the npy box label files matching each input file. If start_ts or max_duration is specified, these files will be cut accordingly.

  • store_as_uint8 (boolean) – if True, casts to byte before storing to save space. Only supports 0-1 normalized data.

  • mode (string) – Load by timeslice or number of events. Either “delta_t” or “n_events”.

  • n_events (int) – Number of events in the timeslice.

  • preprocess_kwargs (dictionary) – Dictionary containing the kwargs passed to the chosen preprocessing method.
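The constraint on height and width (sensor size divided by a power of 2) can be checked before launching a long conversion. The helper below is a hypothetical sketch, not part of the package; it returns the exponent that relates the two sizes and raises if the target is not reachable by halving.

```python
def downsampling_factor_for(sensor_size, target_size):
    """Return k such that target_size == sensor_size / 2**k (hypothetical helper)."""
    if target_size is None:
        return 0  # None means the features are not downsampled
    factor = 0
    size = sensor_size
    while size > target_size and size % 2 == 0:
        size //= 2
        factor += 1
    if size != target_size:
        raise ValueError(
            f"{target_size} is not {sensor_size} divided by a power of 2"
        )
    return factor


height_factor = downsampling_factor_for(480, 240)
width_factor = downsampling_factor_for(640, 320)
```

For a 640 x 480 sensor, requesting 320 x 240 features corresponds to one halving in each dimension.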

metavision_ml.preprocessing.hdf5.split_label(box_path, output_folder, start_ts=0, max_duration=None)

Copies the labels, cut according to start_ts and max_duration.

metavision_ml.preprocessing.hdf5.write_to_hdf5(path, start_ts=0, output_folder='.', delta_t=50000, preprocess='histo', height=None, width=None, max_duration=None, store_as_uint8=False, mode='delta_t', n_events=0, preprocess_kwargs={})

Generates a single HDF5 file of frames at a regular interval for dataset caching.

It is meant to produce a dataset with a smaller memory footprint, that is therefore easier to load from disk. Generated files share the name of the input RAW file but have an HDF5 extension.

The naming of the output_file follows this pattern:

“{output_folder}/{path basename}_{start_ts}.h5”

Parameters
  • path (string) – Path of the input file.

  • output_folder (string) – Folder path where data will be written.

  • preprocess (string) – Name of the preprocessing function to use.

  • delta_t (int) – Period at which tensors are computed (in us).

  • height (int) – If None, the features are not downsampled. Otherwise they are downsampled to height, which must be the sensor’s height divided by a power of 2.

  • width (int) – If None, the features are not downsampled. Otherwise they are downsampled to width, which must be the sensor’s width divided by a power of 2.

  • start_ts (int) – Timestamp (in microseconds) from which the computation begins. Either a single int for all files or a list containing exactly one int per file.

  • max_duration (int) – If not None, limit the duration of the file to max_duration us.

  • store_as_uint8 (boolean) – if True, casts to byte before storing to save space. Only supports 0-1 normalized data.

  • mode (string) – Load by timeslice or number of events. Either “delta_t” or “n_events”.

  • n_events (int) – Number of events in the timeslice.

  • preprocess_kwargs (dictionary) – Dictionary containing the kwargs passed to the chosen preprocessing method.

A collection of visualization utilities for preprocessings

Examples

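>>> import cv2
>>> import numpy as np
>>> from metavision_ml.preprocessing.event_to_tensor import histo
>>> from metavision_ml.preprocessing.viz import viz_histo
>>> # `record` is assumed to be an already opened event reader; height and width are the sensor dimensions
>>> delta = 100000
>>> initial_ts = record.current_time
>>> events = record.load_delta_t(delta)  # load 100 milliseconds worth of events
>>> events['t'] -= int(initial_ts)  # event timestamps should be reset to start at 0
>>> output_array = np.zeros((1, 2, height, width))  # preallocate output array
>>> histo(events, output_array, delta)
>>> img = viz_histo(output_array[0])  # viz_histo expects an array of shape (2,H,W)
>>> cv2.imshow('img', img)
>>> cv2.waitKey()

metavision_ml.preprocessing.viz.filter_outliers(input_val, num_std=2)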

Filter outliers in an array

Parameters

input_val (np.ndarray) – Array of any shape

Returns

Normalized array of same shape as input

Return type

output_array (np.ndarray)

metavision_ml.preprocessing.viz.gray_to_rgb(im)

Repeats a single-channel image 3 times to form a 3-channel image

Parameters

im (np.ndarray) – Array of shape (H,W)

Returns

array of shape (H,W,3)

Return type

output_array

metavision_ml.preprocessing.viz.normalize(im)

Normalizes image by min-max

Parameters

im (np.ndarray) – Array of any shape

Returns

Normalized array of same shape as input

Return type

output_array (np.ndarray)
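The min-max normalization and the channel repetition described above can be written in a few lines of NumPy. The functions below are illustrative stand-ins with hypothetical names, not the library's code; the handling of a constant image (returning zeros) is an assumption of this sketch.

```python
import numpy as np


def normalize_sketch(im):
    """Min-max normalize an array of any shape to [0, 1]."""
    im = np.asarray(im, dtype=np.float32)
    low, high = im.min(), im.max()
    if high == low:
        return np.zeros_like(im)  # constant image: avoid dividing by zero
    return (im - low) / (high - low)


def gray_to_rgb_sketch(im):
    """Repeat an (H, W) image along a new last axis to get (H, W, 3)."""
    return np.repeat(im[..., None], 3, axis=2)


gray = normalize_sketch([[0, 5], [10, 5]])
rgb = gray_to_rgb_sketch(gray)
```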

metavision_ml.preprocessing.viz.viz_diff(im)

Visualize difference of histogram

Parameters

im (np.ndarray) – Array of shape (H,W)

Returns

Array of shape (H,W)

Return type

output_array (np.ndarray)

metavision_ml.preprocessing.viz.viz_diff_binarized(im)

Visualize binarized difference of events (“ON”-“OFF”)

Parameters

im (np.ndarray) – Array of shape (H,W)

Returns

Array of shape (H,W,3)

Return type

output_array (np.ndarray)

metavision_ml.preprocessing.viz.viz_event_cube_rgb(im, split_polarity=True)

Visualize 3 out of 6 channels in RGB mode.

Parameters
  • im (np.ndarray) – Array of shape (T,H,W) T images or T//2 group of images with 2 channels

  • split_polarity – whether each image is single-channel (ON-OFF) or (ON, OFF) 2 channel images.

Returns

Array of shape (H,W,3)

Return type

output_array (np.ndarray)

metavision_ml.preprocessing.viz.viz_histo(im)

Visualize difference of histogram

Parameters

im (np.ndarray) – Array of shape (2,H,W)

Returns

Array of shape (H,W)

Return type

output_array (np.ndarray)

metavision_ml.preprocessing.viz.viz_histo_binarized(im)

Visualize binarized histogram of events

Parameters

im (np.ndarray) – Array of shape (2,H,W)

Returns

Array of shape (H,W,3)

Return type

output_array (np.ndarray)

metavision_ml.preprocessing.viz.viz_histo_filtered(im, val_max=0.5)

Visualize strongly filtered histo image with 3 channels

Parameters

im (np.ndarray) – Array of shape (2,H,W)

Returns

Array of shape (H,W,3)

Return type

output_array

metavision_ml.preprocessing.viz.viz_histo_rgb(im)

Visualize histo image with 3 channels

Parameters

im (np.ndarray) – Array of shape (H,W)

Returns

Array of shape (H,W,3)

Return type

output_array

metavision_ml.preprocessing.viz.viz_multichannel_timesurface(tensor, blend=False)

Visualizes three channels of a multi channel timesurface.

Parameters
  • tensor (np.ndarray) – array of shape (T,H,W) T images or T//2 group of images with 2 channels

  • blend (boolean) – whether to blend different channels to visualize them as one.

Returns

array of shape (H,W,3)

Return type

output_array (np.ndarray)

metavision_ml.preprocessing.viz.viz_timesurface(im)

Visualize timesurface

Note: In order to generate a timesurface you need to call event_to_tensor.timesurface. Typically, if you want to see an exponential decay timesurface, you do not set the arg “reset” to True, so that the latest event’s timestamp is kept at each pixel.

Here we assume the timesurface has already been normalized to [0,1], either by min-max normalization or with exponential time-decay.

Parameters

im (np.ndarray) – Array of shape (2,H,W) or (H,W)

Returns

Array of shape (H,W,3)

Return type

output_array (np.ndarray)

class metavision_ml.preprocessing.CDProcessor(height, width, num_tbins=5, preprocessing='histo', downsampling_factor=0, preprocess_kwargs={})

Wrapper that simplifies the preprocessing.

Parameters
  • height – image height

  • width – image width

  • num_tbins – number of time slices

  • preprocessing – name of the preprocessing method, which must be registered first

  • downsampling_factor (int) – Parameter used to reduce the spatial dimension of the obtained feature. The event coordinates are multiplied by 2**(-downsampling_factor).

get_array_dim()

Returns the shape of the preprocessing tensor.

Returns

shape of the tensor to be preprocessed.

Return type

shape (int tuple)

Registering a new preprocessing function

If you write your own preprocessing function you can make it available to existing modules with this API.

metavision_ml.preprocessing.register_new_preprocessing(preproc_name, n_input_channels, function, viz_function, kwargs={'max_incr_per_pixel': 5, 'preprocess_dtype': dtype('float32')})

Registers a new preprocessing function to be available across the package.

This must be done each time the python interpreter is invoked.

Parameters
  • preproc_name (string) – Name of the preprocessing function, has to be unique.

  • n_input_channels (int) – Number of channels in the resulting tensor.

  • function (function) – Preprocessing function. Its signature must be function(events, tensor, **kwargs) -> None

  • viz_function (function) – Visualization function. Its signature must be viz_function(tensor) -> img (np.ndarray of shape H x W x 3)

  • kwargs (dict) – Dictionary of optional arguments to be passed to the function.
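As a sketch of what a pair of functions matching these signatures could look like, the example below defines a hypothetical event_count preprocessing and its visualization. Both functions are invented for illustration. The tensor layouts, (num_tbins, channels, H, W) for the preprocessing and (channels, H, W) for the visualization, are assumptions, and time bins are ignored for brevity. The registration call is left as a comment so that the snippet runs without the package installed.

```python
import numpy as np


def event_count(events, tensor, **kwargs):
    """Hypothetical preprocessing: count events per pixel into tensor[0, 0], in place."""
    tensor[...] = 0
    y = events["y"].astype(np.int64)
    x = events["x"].astype(np.int64)
    # tensor[0, 0] is a view, so the accumulation modifies `tensor` itself.
    np.add.at(tensor[0, 0], (y, x), tensor.dtype.type(1))


def viz_event_count(tensor):
    """Hypothetical visualization: map a (1, H, W) count image to an H x W x 3 uint8 image."""
    im = tensor[0].astype(np.float32)
    peak = im.max()
    if peak > 0:
        im = im / peak
    gray = (im * 255).astype(np.uint8)
    return np.repeat(gray[..., None], 3, axis=2)


# With metavision_ml installed, registration would follow the documented signature:
# from metavision_ml.preprocessing import register_new_preprocessing
# register_new_preprocessing("event_count", 1, event_count, viz_event_count)

dtype = [("x", "<u2"), ("y", "<u2"), ("p", "<i2"), ("t", "<i8")]
events = np.array([(0, 1, 1, 10), (0, 1, 0, 20), (1, 0, 1, 30)], dtype=dtype)
tensor = np.zeros((1, 1, 2, 2), dtype=np.float32)
event_count(events, tensor)
img = viz_event_count(tensor[0])
```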

metavision_ml.preprocessing.get_preprocess_dict(preproc_name)

Returns a mutable dict describing a preprocessing function.

Parameters

preproc_name (string) – Name of the preprocessing function, either an existing one or one previously registered.

metavision_ml.preprocessing.get_preprocess_function_names()

Returns the names of all existing and registered preprocessing functions.