SDK Core ML Video to Event Simulator API

Video stream dataset

Image stream data loader

class metavision_core_ml.video_to_event.video_stream_dataset.VideoDatasetIterator(metadata, height, width, rgb, mode='frames', min_tbins=3, max_tbins=10, min_dt=3000, max_dt=50000, batch_times=1, pause_probability=0.5, max_optical_flow_threshold=2.0, max_interp_consecutive_frames=20, max_number_of_batches_to_produce=None, crop_image=False)

Dataset Iterator streaming images and timestamps

Parameters
  • metadata (object) – path to picture or video

  • height (int) – height of input images / video clip

  • width (int) – width of input images / video clip

  • rgb (bool) – stream RGB videos

  • mode (str) – batch sampling mode: 'frames', 'delta_t', or 'random'

  • min_tbins (int) – minimum number of frames per batch step

  • max_tbins (int) – maximum number of frames per batch step

  • min_dt (int) – minimum duration of a batch step, in microseconds

  • max_dt (int) – maximum duration of a batch step, in microseconds

  • batch_times (int) – number of timesteps of training sequences

  • pause_probability (float) – probability to add a pause (no events) (works only with PlanarMotionStream)

  • max_optical_flow_threshold (float) – maximum allowed optical flow between two consecutive frames (works only with PlanarMotionStream)

  • max_interp_consecutive_frames (int) – maximum number of interpolated frames between two consecutive frames (works only with PlanarMotionStream)

  • max_number_of_batches_to_produce (int) – maximum number of batches to produce

  • crop_image (bool) – if True, crop images to the target size; otherwise resize them

metavision_core_ml.video_to_event.video_stream_dataset.make_video_dataset(path, num_workers, batch_size, height, width, min_length, max_length, mode='frames', min_frames=5, max_frames=30, min_delta_t=5000, max_delta_t=50000, rgb=False, seed=None, batch_times=1, pause_probability=0.5, max_optical_flow_threshold=2.0, max_interp_consecutive_frames=20, max_number_of_batches_to_produce=None, crop_image=False)

Makes a video / moving picture dataset. A usage sketch follows the parameter list.

Parameters
  • path (str) – path to the dataset folder

  • batch_size (int) – number of video clips / batch

  • height (int) – height

  • width (int) – width

  • min_length (int) – min length of video

  • max_length (int) – max length of video

  • mode (str) – 'frames' or 'delta_t'

  • min_frames (int) – minimum number of frames per batch

  • max_frames (int) – maximum number of frames per batch

  • min_delta_t (int) – in microseconds, minimum duration per batch

  • max_delta_t (int) – in microseconds, maximum duration per batch

  • rgb (bool) – retrieve frames in RGB

  • seed (int) – seed for randomness

  • batch_times (int) – number of time steps in training sequence

  • pause_probability (float) – probability to add a pause during the sequence (works only with PlanarMotionStream)

  • max_optical_flow_threshold (float) – maximum allowed optical flow between two consecutive frames (works only with PlanarMotionStream)

  • max_interp_consecutive_frames (int) – maximum number of interpolated frames between two consecutive frames (works only with PlanarMotionStream)

  • max_number_of_batches_to_produce (int) – maximum number of batches to produce. Makes sure the stream will not produce more than this number of consecutive batches using the same image or video.
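A minimal usage sketch (the folder path, sizes, and lengths below are placeholders, not defaults): building the dataloader and iterating over batches, which are padded by pad_collate_fn below.

from metavision_core_ml.video_to_event.video_stream_dataset import make_video_dataset

dataloader = make_video_dataset(
    "path/to/videos",        # folder containing the input videos / pictures
    num_workers=2,
    batch_size=4,
    height=240,
    width=320,
    min_length=200,
    max_length=2000,
    mode='frames',
    min_frames=5,
    max_frames=30,
)

for batch in dataloader:
    # each batch holds image sequences and their timestamps, padded with the
    # last image / timestamp to a common length (see pad_collate_fn below)
    ...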

metavision_core_ml.video_to_event.video_stream_dataset.pad_collate_fn(data_list)

Pads each sequence with its last image / timestamp to produce a contiguous batch.

CPU Event simulator

EventSimulator: Load a .mp4 video and start streaming events

class metavision_core_ml.video_to_event.simulator.EventSimulator(height, width, Cp, Cn, refractory_period, sigma_threshold=0.0, cutoff_hz=0, leak_rate_hz=0, shot_noise_rate_hz=0, verbose=False)

Event Simulator

Implementation is based on the following publications:

  • Video to Events: Recycling Video Datasets for Event Cameras, Daniel Gehrig et al.

  • V2E: From video frames to realistic DVS event camera streams, Tobi Delbruck et al.

This object accumulates events as it is fed images and (increasing) timestamps. The events are returned as type EventCD (see the definition in event_io/dat_tools or metavision_sdk_base). A usage sketch follows the parameter list.

Parameters
  • Cp (float) – mean for ON threshold

  • Cn (float) – mean for OFF threshold

  • refractory_period (float) – min time between 2 events / pixel

  • sigma_threshold (float) – standard deviation for threshold array

  • cutoff_hz (float) – cutoff frequency for photodiode latency simulation

  • leak_rate_hz (float) – frequency of reference value leakage

  • shot_noise_rate_hz (float) – frequency for shot noise events
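A minimal sketch (not an official example; the file name and parameter values are placeholders): streaming grayscale frames from an OpenCV capture into the simulator and collecting the accumulated EventCD buffer.

import cv2
from metavision_core_ml.video_to_event.simulator import EventSimulator

cap = cv2.VideoCapture("my_video.mp4")            # hypothetical input video
fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))

simu = EventSimulator(height, width, Cp=0.11, Cn=0.1, refractory_period=100)

frame_idx = 0
while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    ts_us = int(frame_idx * 1e6 / fps)            # timestamps must be increasing (us)
    total = simu.image_callback(gray, ts_us)      # returns current total number of events
    frame_idx += 1

events = simu.get_events()                        # accumulated EventCD events
cap.release()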

dynamic_moving_average(new_frame, ts, eps=1e-07)

Applies a nonlinear low-pass filter. The filter is a second-order low-pass IIR that uses two internal state variables to store the stages of cascaded first-order RC filters. The time constant of the filter is proportional to the intensity value (with an offset to handle DN=0).

Parameters
  • new_frame (np.ndarray) – new image

  • ts (int) – new timestamp (us)
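An illustrative sketch of the cascaded first-order update described above, with a fixed time constant for simplicity (the library's filter derives it from the per-pixel intensity); this is not the library's exact code.

import numpy as np

def lowpass_step(new_frame, state1, state2, dt_us, tau_us=1000.0):
    # discrete update factor for a first-order RC stage; in the real
    # simulator this factor is intensity-dependent per pixel
    alpha = np.clip(dt_us / tau_us, 0.0, 1.0)
    state1 = state1 + alpha * (new_frame - state1)   # first RC stage
    state2 = state2 + alpha * (state1 - state2)      # second RC stage
    return state1, state2                            # state2 is the filtered frame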

flush_events()

Erases the currently accumulated events.

get_events()

Grabs the accumulated events.

get_size()

Function returning the size of the imager which produced the events.

Returns

Tuple of int (height, width) which might be (None, None)

image_callback(img, img_ts)

Accumulates events into the internal buffer.

Parameters
  • img (np.ndarray) – uint8 gray image of shape (H,W)

  • img_ts (int) – timestamp in microseconds.

Returns

current total number of events

Return type

int

leak_events(delta_t)

Leak events: the switch in the differencing amplifier leaks at a rate equivalent to some Hz of ON events. The actual leak rate depends on each pixel's threshold. We want a nominal rate leak_rate_hz, so R_l = (dI/dt) / Theta_ON, hence dI/dt = R_l * Theta_ON, and dI = R_l * Theta_ON * dt.

Parameters

delta_t (int) – time between 2 images (us)
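A hedged numeric illustration of the formula above: with leak_rate_hz R_l = 0.1 Hz, an ON threshold Theta_ON = 0.15, and delta_t = 2 s, the reference values leak by dI = 0.1 * 0.15 * 2 = 0.03 log-intensity units over the interval.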

log_image_callback(log_img, img_ts)

For debugging: same as image_callback, but the caller supplies the image already converted to log.

reset()

Resets buffers

set_config(config='noisy')

Set configuration

Parameters

config (str) – name for configuration

shot_noise_events(event_buffer, ts, num_events, num_iters)

Adds temporal noise via a simple Poisson process with base rate self.shot_noise_rate_hz; when such a noise event occurs at a pixel, an event is output from that pixel.

The shot-noise rate varies with intensity: at the lowest intensities it rises to the nominal rate, and it is reduced by the factor SHOT_NOISE_INTEN_FACTOR at the brightest intensities.

Parameters
  • ts (int) – timestamp

  • num_events (int) – current number of events

  • num_iters (int) – max events per pixel since last round

metavision_core_ml.video_to_event.simulator.eps_log(x, eps=1e-05)

Takes the logarithm of an image; eps avoids taking the log of zero.

Parameters

x – uint8 gray frame

metavision_core_ml.video_to_event.single_image_make_events_cpu.make_events_cpu(events, ref_values, last_img, last_event_timestamp, log_img, last_img_ts, delta_t, Cps, Cns, refractory_period)

Produces events in AER format.

Parameters
  • events (np.ndarray) – array in format EventCD

  • ref_values (np.ndarray) – current log intensity state / pixel (H,W)

  • last_img (np.ndarray) – last image log intensity (H,W)

  • last_event_timestamp (np.ndarray) – last event timestamp emitted per pixel and polarity (2,H,W)

  • log_img (np.ndarray) – current log intensity image (H,W)

  • last_img_ts (int) – last image timestamp

  • delta_t (int) – current duration (us) since last image.

  • Cps (np.ndarray) – array of ON thresholds

  • Cns (np.ndarray) – array of OFF thresholds

  • refractory_period (int) – minimum time between 2 events / pixel

Simple Iterator built around the Metavision Reader classes.

class metavision_core_ml.video_to_event.simu_events_iterator.SimulatedEventsIterator(input_path, start_ts=0, mode='delta_t', delta_t=10000, n_events=10000, max_duration=None, relative_timestamps=False, height=-1, width=-1, Cp=0.11, Cn=0.1, refractory_period=0.001, sigma_threshold=0.0, cutoff_hz=0, leak_rate_hz=0, shot_noise_rate_hz=0, override_fps=0)

SimulatedEventsIterator is a small convenience class that generates an iterator of events from any video.

reader

Class handling the video (an iterator over the frames and their timestamps).

delta_t

Duration of served event slice in us.

Type

int

max_duration

If not None, maximal duration of the iteration in us.

Type

int

end_ts

If max_duration is not None, last time_stamp to consider.

Type

int

relative_timestamps

Whether the timestamps of served events are relative to the current reader timestamp, or since the beginning of the recording.

Type

boolean

Parameters
  • input_path (str) – Path to the file to read.

  • start_ts (int) – First timestamp to consider (in us).

  • mode (string) – Load by time slice or by number of events; either "delta_t" or "n_events".

  • delta_t (int) – Duration of served event slice in us.

  • n_events (int) – Number of events in the timeslice.

  • max_duration (int) – If not None, maximal duration of the iteration in us.

  • relative_timestamps (boolean) – Whether the timestamps of served events are relative to the current reader timestamp, or since the beginning of the recording.

  • Cp (float) – mean for ON threshold

  • Cn (float) – mean for OFF threshold

  • refractory_period (float) – min time between 2 events / pixel

  • sigma_threshold (float) – standard deviation for threshold array

  • cutoff_hz (float) – cutoff frequency for photodiode latency simulation

  • leak_rate_hz (float) – frequency of reference value leakage

  • shot_noise_rate_hz (float) – frequency for shot noise events

  • override_fps (int) – override fps of the input video.

Examples

>>> for ev in SimulatedEventsIterator("beautiful_record.mp4", delta_t=1000000, max_duration=1e6*60):
...     print("Rate : {:.2f} Mev/s".format(ev.size * 1e-6))

get_size()

Function returning the size of the imager which produced the events.

Returns

Tuple of int (height, width) which might be (None, None)

GPU Event Simulator

A more efficient reimplementation. The main differences are CUDA kernels and the possibility to stream the voxel grid directly.

class metavision_core_ml.video_to_event.gpu_simulator.GPUEventSimulator(batch_size, height, width, c_mu=0.1, c_std=0.022, refractory_period=10, leak_rate_hz=0, cutoff_hz=0, shot_noise_hz=0)

GPU Event Simulator of events from frames & timestamps.

Implementation is based on the following publications:

  • Video to Events: Recycling Video Datasets for Event Cameras, Daniel Gehrig et al.

  • V2E: From video frames to realistic DVS event camera streams, Tobi Delbruck et al.

Parameters
  • batch_size (int) – number of video clips / batch

  • height (int) – height

  • width (int) – width

  • c_mu (float or list) – threshold average; if a scalar, the same threshold is used for OFF and ON; if a list, it is interpreted as [ths_OFF, ths_ON]

  • c_std (float) – threshold standard deviation

  • refractory_period (float) – time before an event can be triggered again at the same pixel

  • leak_rate_hz (float) – frequency of reference voltage leakage

  • cutoff_hz (float) – frequency for photodiode latency

  • shot_noise_hz (float) – frequency of shot noise events

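A minimal sketch, assuming a CUDA device (the simulator relies on Numba CUDA kernels); the frame stacking layout, timestamp values, and flag dtypes below are illustrative assumptions, not an official example.

import torch
from metavision_core_ml.video_to_event.gpu_simulator import GPUEventSimulator

B, T, H, W = 2, 5, 120, 160                       # clips, frames per clip, size
simu = GPUEventSimulator(B, H, W, c_mu=0.1, c_std=0.022).cuda()

# frames stacked along the last dimension: (H, W, total_num_frames)
u8_frames = torch.randint(0, 256, (H, W, B * T), dtype=torch.uint8, device='cuda')
video_len = torch.full((B,), T, dtype=torch.long, device='cuda')
image_ts = (torch.arange(T, device='cuda') * 10000).repeat(B, 1)  # microseconds
first_times = torch.ones(B, device='cuda')        # every clip starts fresh

log_frames = simu.log_images(u8_frames)           # byte -> log intensity

# AER events as an (N, 5) tensor: batch_index, x, y, polarity, timestamp
events = simu.get_events(log_frames, video_len, image_ts, first_times)

# or go straight to a space-time quantized representation
voxels = simu.event_volume(log_frames, video_len, image_ts, first_times, nbins=5)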

count_events(log_images, video_len, image_ts, first_times, reset=True, persistent=True)

Estimates the number of events per pixel.

Parameters
  • log_images (Tensor) – shape (H, W, total_num_frames) tensor containing the video frames

  • video_len (Tensor) – shape (B,) len of each video in the batch.

  • image_ts (Tensor) – shape (B, max(video_len)); timestamp associated with each frame.

  • first_times (Tensor) – shape (B,); whether each video is a new one or the continuation of a previous one.

  • reset (bool) – whether to reset the count variable

Returns

per-pixel event counts, shape (B, H, W)

Return type

Tensor

dynamic_moving_average(images, num_frames, timestamps, first_times, min_pixel_range=20, max_pixel_incr=20, eps=1e-07)

Converts byte images to log and performs a pass-band (motion-blur) filtering of the incoming images. This simulates the latency of the photodiode w.r.t. the incoming light dynamics.

Parameters
  • images (torch.Tensor) – H,W,T byte or float images in the 0 to 255 range

  • num_frames (torch.Tensor) – shape (B,) len of each video in the batch.

  • timestamps (torch.Tensor) – B,T timestamps

  • first_times (torch.Tensor) – B flags

  • eps (float) – epsilon factor

event_volume(log_images, video_len, image_ts, first_times, nbins, mode='bilinear', split_channels=False)

Computes a volume of discretized images formed from the events, without storing the AER events themselves; we go directly from simulation to this space-time quantized representation. Setting mode to "bilinear" yields the event volume of [Unsupervised Event-based Learning of Optical Flow, Zhu et al. 2018]; setting mode to "nearest" yields a stack of histograms.

Parameters
  • log_images (Tensor) – shape (H, W, total_num_frames) tensor containing the video frames

  • video_len (Tensor) – shape (B,) len of each video in the batch.

  • image_ts (Tensor) – shape (B, max(video_len)); timestamp associated with each frame.

  • first_times (Tensor) – shape (B,); whether each video is a new one or the continuation of a previous one.

  • nbins (int) – number of time-bins for the voxel grid

  • mode (str) – bilinear or nearest

  • split_channels – if True, positive and negative events get distinct channels instead of being merged (by difference) into a single channel.

event_volume_sequence(log_images, video_len, image_ts, target_timestamps, first_times, nbins, mode='bilinear', split_channels=False)

Computes a volume of discretized images formed from the events, without storing the AER events themselves; we go directly from simulation to this space-time quantized representation. Setting mode to "bilinear" yields the event volume of [Unsupervised Event-based Learning of Optical Flow, Zhu et al. 2018]; setting mode to "nearest" yields a stack of histograms. Here, a sequence of target timestamps is also provided to cut the event volumes non-uniformly.

Parameters
  • log_images (Tensor) – shape (H, W, total_num_frames) tensor containing the video frames

  • video_len (Tensor) – shape (B,) len of each video in the batch.

  • image_ts (Tensor) – shape (B, max(video_len)); timestamp associated with each frame.

  • first_times (Tensor) – shape (B,); whether each video is a new one or the continuation of a previous one.

  • nbins (int) – number of time-bins for the voxel grid

  • mode (str) – bilinear or nearest

  • split_channels – if True, positive and negative events get distinct channels instead of being merged (by difference) into a single channel.

forward()

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

get_events(log_images, video_len, image_ts, first_times)

Retrieves the AER event list in a pytorch array.

Parameters
  • log_images (Tensor) – shape (H, W, total_num_frames) tensor containing the video frames

  • video_len (Tensor) – shape (B,) len of each video in the batch.

  • image_ts (Tensor) – shape (B, max(video_len)); timestamp associated with each frame.

  • first_times (Tensor) – shape (B,); whether each video is a new one or the continuation of a previous one.

Returns

an (N, 5) tensor with columns batch_index, x, y, polarity, timestamp (microseconds)

Return type

events

log_images(u8imgs, eps=1e-07)

Converts byte images to log

Parameters
  • u8imgs (torch.Tensor) – B,C,H,W,T byte images

  • eps (float) – epsilon factor

randomize_broken_pixels(first_times, video_proba=0.01, crazy_pixel_proba=0.0005, dead_pixel_proba=0.005)

Simulates dead & crazy pixels

Parameters
  • first_times – B video just started flags

  • video_proba – probability to simulate broken pixels

randomize_cutoff(first_times, cutoff_min=0, cutoff_max=900)

Randomizes the cutoff rates per video

Parameters
  • first_times – B video just started flags

  • cutoff_min – in Hz

  • cutoff_max – in Hz

randomize_leak(first_times, leak_min=0, leak_max=1)

Randomizes the leak rates per video

Parameters
  • first_times – B video just started flags

  • leak_min – in Hz

  • leak_max – in Hz

randomize_refractory_periods(first_times, ref_min=10, ref_max=1000)

Randomizes the refractory period per video

Parameters
  • first_times – B video just started flags

  • ref_min – in microseconds

  • ref_max – in microseconds

randomize_shot(first_times, shot_min=0, shot_max=1)

Randomizes the shot noise per video

Parameters
  • first_times – B video just started flags

  • shot_min – in Hz

  • shot_max – in Hz

randomize_thresholds(first_times, th_mu_min=0.05, th_mu_max=0.2, th_std_min=0.001, th_std_max=0.01)

Re-Randomizes thresholds per video

Parameters
  • first_times – B video just started flags

  • th_mu_min (scalar or list of scalars) – min average threshold; if a list, it is interpreted as [th_mu_min_OFF, th_mu_min_ON]

  • th_mu_max (scalar or list of scalars) – max average threshold; if a list, it is interpreted as [th_mu_max_OFF, th_mu_max_ON]

  • th_std_min – min threshold standard deviation

  • th_std_max – max threshold standard deviation
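The randomize_* methods above can be combined to re-draw sensor parameters for clips flagged as new, a common domain-randomization step during training. A hedged sketch, reusing simu and first_times from the GPUEventSimulator sketch above:

simu.randomize_thresholds(first_times)
simu.randomize_cutoff(first_times)
simu.randomize_leak(first_times)
simu.randomize_shot(first_times)
simu.randomize_refractory_periods(first_times)
simu.randomize_broken_pixels(first_times)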

GPU kernels, implemented with Numba CUDA, used to simulate events from images.

CPU and CUDA kernels for the GPU simulation of the cutoff filter.