SDK Core ML Data API

Synthetic video generation

Camera Pose Generator:

This module lets you define a trajectory of camera poses and generate a series of continuous homographies, interpolating additional poses whenever the maximum optical flow between two consecutive frames exceeds a predefined threshold.

class metavision_core_ml.data.camera_poses.CameraPoseGenerator(height, width, max_frames=300, pause_probability=0.5, max_optical_flow_threshold=2.0, max_interp_consecutive_frames=20)

CameraPoseGenerator generates a series of continuous homographies with interpolation.

Parameters
  • height (int) – height of image

  • width (int) – width of image

  • max_frames (int) – maximum number of poses

  • pause_probability (float) – probability that the sequence contains a pause

  • max_optical_flow_threshold (float) – maximum optical flow between two consecutive frames

  • max_interp_consecutive_frames (int) – maximum number of interpolated frames between two consecutive frames

get_flow(rvec1, tvec1, rvec2, tvec2, height, width)

Computes Optical flow between 2 poses

Parameters
  • rvec1 (np.array) – rotation vector 1

  • tvec1 (np.array) – translation vector 1

  • rvec2 (np.array) – rotation vector 2

  • tvec2 (np.array) – translation vector 2

  • nt (np.array) – plane normal

  • depth (float) – depth from camera

  • height (int) – height of image

  • width (int) – width of image

get_image_transform(rvec1, tvec1, rvec2, tvec2)

Get Homography between 2 poses

Parameters
  • rvec1 (np.array) – rotation vector 1

  • tvec1 (np.array) – translation vector 1

  • rvec2 (np.array) – rotation vector 2

  • tvec2 (np.array) – translation vector 2

metavision_core_ml.data.camera_poses.add_random_pause(signal, max_pos_size_ratio=0.3)

Adds a random pause in a multidimensional signal

Parameters
  • signal (np.array) – TxD signal

  • max_pos_size_ratio (float) – maximum size of the pause relative to the signal length
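
A minimal NumPy sketch of such a pause insertion (how the actual implementation samples the pause position and length, and the `rng` parameter, are assumptions):

```python
import numpy as np

def add_random_pause(signal, max_pos_size_ratio=0.3, rng=None):
    # Hold the signal constant over a randomly placed interval.
    # Position/length sampling is an assumption about the real implementation.
    rng = np.random.default_rng() if rng is None else rng
    T = len(signal)
    pause_len = int(rng.integers(1, max(2, int(T * max_pos_size_ratio)) + 1))
    start = int(rng.integers(0, T - pause_len + 1))
    out = signal.copy()
    out[start:start + pause_len] = signal[start]  # freeze all D dimensions
    return out
```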

metavision_core_ml.data.camera_poses.generate_homographies(rvecs, tvecs, nt, d)

Generates multiple homographies from rotation and translation vectors

Parameters
  • rvecs (np.array) – N,3 rotation vectors

  • tvecs (np.array) – N,3 translation vectors

  • nt (np.array) – normal to camera

  • d (float) – depth

metavision_core_ml.data.camera_poses.generate_homographies_from_rotation_matrices(rot_mats, tvecs, nt, depth)

Generates multiple homographies from rotation matrices

Parameters
  • rot_mats (np.array) – N,3,3 rotation matrices

  • tvecs (np.array) – N,3 translation vectors

  • nt (np.array) – normal to camera

  • depth (float) – depth to camera

metavision_core_ml.data.camera_poses.generate_homography(rvec, tvec, nt, depth)

Generates a single homography

Parameters
  • rvec (np.array) – rotation vector

  • tvec (np.array) – translation vector

  • nt (np.array) – normal to camera

  • depth (float) – depth to camera
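
The plane-induced homography behind this function can be sketched in plain NumPy; the helper names and the H = R + t·nᵀ/d sign/normalization convention are assumptions about the actual implementation:

```python
import numpy as np

def rodrigues(rvec):
    # Rotation vector -> 3x3 rotation matrix (Rodrigues formula),
    # equivalent to cv2.Rodrigues for this direction.
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    k = np.asarray(rvec, dtype=float).ravel() / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def planar_homography(rvec, tvec, nt, depth):
    # Homography induced by a plane with normal `nt` at distance `depth`:
    # H = R + t * n^T / d.
    R = rodrigues(rvec)
    t = np.asarray(tvec, dtype=float).reshape(3, 1)
    n = np.asarray(nt, dtype=float).reshape(1, 3)
    return R + (t @ n) / depth
```

At the identity pose this yields the identity homography; a pure translation along the plane normal scales the last row.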

metavision_core_ml.data.camera_poses.generate_image_homographies_from_homographies(h, K, Kinv)

Multiplies each homography on the left by the intrinsic matrix K and on the right by its inverse Kinv

Parameters
  • h (np.array) – homographies N,3,3

  • K (np.array) – intrinsic

  • Kinv (np.ndarray) – inverse intrinsic
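
This conjugation maps geometric (metric-coordinate) homographies into pixel coordinates. A sketch of the batched operation (the function name is illustrative):

```python
import numpy as np

def image_homographies(h, K, Kinv=None):
    # Map geometric homographies into pixel coordinates: G_i = K @ H_i @ K^{-1}.
    Kinv = np.linalg.inv(K) if Kinv is None else Kinv
    return np.einsum('ij,njk,kl->nil', K, h, Kinv)
```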

metavision_core_ml.data.camera_poses.generate_image_homography(rvec, tvec, nt, depth, K, Kinv)

Generates a single image homography

Parameters
  • rvec (np.array) – rotation vector

  • tvec (np.array) – translation vector

  • nt (np.array) – normal to camera

  • depth (float) – depth

  • K (np.array) – intrinsic matrix

  • Kinv (np.array) – inverse intrinsic matrix

metavision_core_ml.data.camera_poses.generate_smooth_signal(num_signals, num_samples, min_speed=0.0001, max_speed=0.1)

Generates a smooth signal

Parameters
  • num_signals (int) – number of signals to generate

  • num_samples (int) – length of multidimensional signal

  • min_speed (float) – minimum rate of change

  • max_speed (float) – maximum rate of change
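
One simple way to produce such a signal is a low-pass-filtered random walk; the actual smoothing scheme used by generate_smooth_signal is an assumption here, as is the `rng` parameter:

```python
import numpy as np

def smooth_signal(num_signals, num_samples, min_speed=1e-4, max_speed=0.1, rng=None):
    # Random-walk increments with a per-signal speed in [min_speed, max_speed],
    # lightly smoothed with a moving average. Illustrative stand-in only.
    rng = np.random.default_rng() if rng is None else rng
    speeds = rng.uniform(min_speed, max_speed, size=(num_signals, 1))
    steps = rng.standard_normal((num_signals, num_samples)) * speeds
    walk = np.cumsum(steps, axis=1)
    kernel = np.ones(5) / 5.0  # simple moving-average smoothing
    return np.stack([np.convolve(s, kernel, mode='same') for s in walk])
```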

metavision_core_ml.data.camera_poses.get_flow(rvec1, tvec1, rvec2, tvec2, nt, depth, K, Kinv, height, width)

Computes Optical Flow between 2 poses

Parameters
  • rvec1 (np.array) – rotation vector 1

  • tvec1 (np.array) – translation vector 1

  • rvec2 (np.array) – rotation vector 2

  • tvec2 (np.array) – translation vector 2

  • nt (np.array) – plane normal

  • depth (float) – depth from camera

  • K (np.array) – intrinsic matrix

  • Kinv (np.array) – inverse intrinsic matrix

  • height (int) – height of image

  • width (int) – width of image

  • infinite (bool) – whether the plane is infinite
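
Given an image homography, the induced dense flow can be sketched as warping a pixel grid and subtracting the original positions (the function name is illustrative):

```python
import numpy as np

def homography_flow(G, height, width):
    # Dense optical flow induced by an image homography G (pixel coords):
    # warp every pixel of a (height, width) grid, then subtract its
    # original position.
    y, x = np.mgrid[0:height, 0:width].astype(np.float64)
    pts = np.stack([x, y, np.ones_like(x)])   # homogeneous coords, (3, H, W)
    warped = np.tensordot(G, pts, axes=1)     # apply homography
    warped = warped[:2] / warped[2:3]         # dehomogenize
    return warped - pts[:2]                   # (2, H, W) flow: (dx, dy)
```

The identity homography yields zero flow everywhere, and a pure pixel translation yields that constant flow.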

metavision_core_ml.data.camera_poses.get_grid(height, width)

Computes a 2d meshgrid

Parameters
  • height (int) – height of grid

  • width (int) – width of grid

metavision_core_ml.data.camera_poses.get_image_transform(rvec1, tvec1, rvec2, tvec2, nt, depth, K, Kinv)

Get image Homography between 2 poses (includes cam intrinsics)

Parameters
  • rvec1 (np.array) – rotation vector 1

  • tvec1 (np.array) – translation vector 1

  • rvec2 (np.array) – rotation vector 2

  • tvec2 (np.array) – translation vector 2

  • nt (np.array) – plane normal

  • depth (float) – depth from camera

  • K (np.array) – intrinsic

  • Kinv (np.ndarray) – inverse intrinsic

metavision_core_ml.data.camera_poses.get_transform(rvec1, tvec1, rvec2, tvec2, nt, depth)

Get geometric Homography between 2 poses

Parameters
  • rvec1 (np.array) – rotation vector 1

  • tvec1 (np.array) – translation vector 1

  • rvec2 (np.array) – rotation vector 2

  • tvec2 (np.array) – translation vector 2

  • nt (np.array) – plane normal

  • depth (float) – depth from camera

metavision_core_ml.data.camera_poses.interpolate_poses(rvecs, tvecs, nt, depth, K, Kinv, height, width, opt_flow_threshold=2, max_frames_per_bin=20)

Interpolate given poses

Parameters
  • rvecs (np.array) – N,3 rotation vectors

  • tvecs (np.array) – N,3 translation vectors

  • nt (np.array) – plane normal

  • depth (float) – depth to camera

  • K (np.array) – camera intrinsic

  • Kinv (np.array) – inverse camera intrinsic

  • height (int) – height of image

  • width (int) – width of image

  • opt_flow_threshold (float) – maximum flow threshold

  • max_frames_per_bin (int) – maximum number of pose interpolations between two consecutive poses of the original list of poses
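
The interpolation logic can be sketched as: measure the peak flow between two consecutive key poses, derive how many intermediate steps keep each step under the threshold, then interpolate. Both helpers below are illustrative assumptions, not the library API:

```python
import numpy as np

def num_interp_steps(max_flow_magnitude, opt_flow_threshold=2.0, max_frames_per_bin=20):
    # How many interpolated poses are needed so each step's flow stays
    # under the threshold, capped at max_frames_per_bin.
    steps = int(np.ceil(max_flow_magnitude / opt_flow_threshold))
    return int(np.clip(steps, 1, max_frames_per_bin))

def lerp_poses(rvec1, tvec1, rvec2, tvec2, num):
    # Linear interpolation of rotation/translation vectors; the real code
    # may interpolate rotations differently (e.g. on the SO(3) manifold).
    alphas = np.linspace(0.0, 1.0, num, endpoint=False)[:, None]
    rvecs = (1.0 - alphas) * np.asarray(rvec1, float) + alphas * np.asarray(rvec2, float)
    tvecs = (1.0 - alphas) * np.asarray(tvec1, float) + alphas * np.asarray(tvec2, float)
    return rvecs, tvecs
```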

metavision_core_ml.data.camera_poses.interpolate_times_tvecs(tvecs, key_times, inter_tvecs, inter_times, nums)

Interpolates between key times and key translation vectors

Parameters
  • tvecs (np.array) – key translation vectors (N, 3)

  • key_times (np.array) – key times (N, )

  • inter_tvecs (np.array) – interpolated translations (nums.sum(), 3)

  • inter_times (np.array) – interpolated times (nums.sum(),)

  • nums (np.array) – number of interpolation points between key points (N-1,). nums[i] is the number of points between key_times[i] (included) and key_times[i+1] (excluded); the minimum is 1, which corresponds to key_times[i] alone.

Simulates 6-DOF motion in front of an image plane, implemented entirely in NumPy and OpenCV. Applies continuous homographies to your image over time; the optical flow for this motion can also be retrieved.

class metavision_core_ml.data.image_planar_motion_stream.PlanarMotionStream(image_filename, height, width, max_frames=1000, rgb=False, infinite=True, pause_probability=0.5, max_optical_flow_threshold=2.0, max_interp_consecutive_frames=20, crop_image=False)

Generates a planar motion in front of the image

Parameters
  • image_filename (str) – path to image

  • height (int) – desired height

  • width (int) – desired width

  • max_frames (int) – number of frames to stream

  • rgb (bool) – color images or gray

  • infinite (bool) – if True, the image border is mirrored

  • pause_probability (float) – probability to add a pause during the stream

  • max_optical_flow_threshold (float) – maximum optical flow between two consecutive frames

  • max_interp_consecutive_frames (int) – maximum number of interpolated frames between two consecutive frames

  • crop_image (bool) – crop images or resize them

Simulates 6-DOF motion in front of an image plane and returns corner positions, implemented entirely in NumPy and OpenCV. Applies continuous homographies to your image over time.

class metavision_core_ml.data.corner_planar_motion_stream.CornerPlanarMotionStream(image_filename, height, width, max_frames=1000, rgb=False, infinite=True, pause_probability=0.5, draw_corners_as_circle=True)

Generates a planar motion in front of the image, returning both images and Harris corners

Parameters
  • image_filename – path to image

  • height – desired height

  • width – desired width

  • max_frames – number of frames to stream

  • rgb – color images or gray

  • infinite – if True, the image border is mirrored

  • pause_probability – probability of stream to pause

  • draw_corners_as_circle – if True, corners are drawn as 2-pixel circles

Video Streaming

Scheduling System for Videos

class metavision_core_ml.data.scheduling.Metadata(path, start_frame, end_frame)

Represents part of a file to be read.

Parameters
  • path (str) – path to video

  • start_frame (int) – first frame to seek to

  • end_frame (int) – last frame to read

metavision_core_ml.data.scheduling.build_image_metadata(folder, min_size, max_size, denominator=1)

Build Metadata from images

Parameters
  • folder (str) – path to pictures

  • min_size (int) – minimum number of frames

  • max_size (int) – maximum number of frames

  • denominator (int) – num_frames will always be a multiple of denominator. It is used to avoid having batches that are missing some frames and need to be padded. This happens when the number of time steps is not a multiple of num_frames.
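
The denominator rounding could look like the following sketch (the sampling and rounding-down choice are assumptions, not the library's exact behavior):

```python
import random

def sample_num_frames(min_size, max_size, denominator=1, rng=random):
    # Draw a sequence length in [min_size, max_size], then round it down
    # to a multiple of `denominator` so batches never need padding.
    n = rng.randint(min_size, max_size)
    return max(denominator, (n // denominator) * denominator)
```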

metavision_core_ml.data.scheduling.build_metadata(folder, min_length, max_length, denominator=1)

Builds Metadata for Videos and Images

Parameters
  • folder (str) – path to videos or images

  • min_length (int) – minimum number of frames

  • max_length (int) – maximum number of frames

  • denominator (int) – denominator of number of frames for image metadata

metavision_core_ml.data.scheduling.build_video_metadata(folder)

Builds Metadata from videos

Parameters

folder (str) – path to videos (only looks in current directory, not subfolders)

metavision_core_ml.data.scheduling.split_video_metadata(metadatas, min_size, max_size)

Split video metadata into smaller ones.

Parameters
  • metadatas (list) – list of metadata objects

  • min_size (int) – minimum number of frames

  • max_size (int) – maximum number of frames

Iterator for .mp4 or .avi videos. The current backend is OpenCV.

class metavision_core_ml.data.video_stream.TimedVideoStream(video_filename, height=-1, width=-1, start_frame=0, max_frames=0, rgb=False, override_fps=0)

Video iterator opening both a video stream and a file of timestamps. If the timestamp file does not exist, timestamps are generated at a regular period, using the frequency described in the video's metadata or 1/override_fps if given. Timestamps are delivered in microseconds.

Parameters
  • video_filename (str) – Path to the video file to read.

  • height (int) – Height of the output frames.

  • width (int) – Width of the output frames.

  • start_frame (int) – First frame to seek to

  • max_frames (int) – Maximum number of frames loaded (if greater than the number of frames in the video it will return all of them).

  • rgb (bool) – Whether the output should be in rgb or greyscale.

  • override_fps (int) – Frequency of the generated timestamps in Hz (used if no timestamp file is available). If equal to 0, the frequency will be taken from the video's metadata.

get_size()

Returns the size of the imager which produced the events.

Returns

Tuple of int (height, width) which might be (None, None)

Generic data Streaming

Module that enables Parallel Multistreaming.

We define an IterableDataset that streams several iterables. When fed to a PyTorch DataLoader with batch_size=None, this streams batches from one worker at a time. This has the effect of enabling parallel streaming.

The StreamDataLoader is a class built on top of DataLoader, that fuses batches together so batches are always temporally coherent.

Notice that you can also avoid using this fusion and just use a regular DataLoader, and have multiple neural networks indexed by worker’s id.
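
The zipping idea itself can be sketched in pure Python (`stream_batches` is an illustrative name, not part of the API):

```python
from itertools import zip_longest

def stream_batches(streams, fill_value=None):
    # Zip several iterable streams into temporally coherent batches:
    # batch t holds element t of every stream; exhausted streams are
    # padded with fill_value (mirroring the "zeros"/"data" padding modes).
    return zip_longest(*(iter(s) for s in streams), fillvalue=fill_value)
```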

class metavision_core_ml.data.stream_dataloader.StreamDataLoader(dataset, num_workers, collate_fn)

Wraps around the DataLoader to handle the asynchronous batches.

Parameters
  • dataset (StreamDataset) – dataset streaming multiple iterables

  • num_workers (int) – number of workers

  • collate_fn (function) – function to collate batch parts

cpu()

Sets the StreamDataLoader to leave tensors on CPU.

cuda(device=device(type='cuda'))

Sets the StreamDataLoader to copy tensors to GPU memory before returning them.

Parameters

device (torch.device) – The destination GPU device. Defaults to the current CUDA device.

to(device)

Sets the StreamDataLoader to copy tensors to the given device before returning them.

Parameters

device (torch.device) – The destination GPU device. For instance torch.device(‘cpu’) or torch.device(‘cuda’).

class metavision_core_ml.data.stream_dataloader.StreamDataset(stream_list, streamer, batch_size, padding_mode, fill_value, seed=None)

An IterableDataset zipping a group of iterable streams together.

Parameters
  • stream_list (list) – list of streams (path/ metadata)

  • streamer (object) – an iterator (user defined)

  • batch_size (int) – total batch size

  • padding_mode (str) – “zeros” “data” or “none”, see “get_zip” function

  • fill_value (object) – padding value

  • seed (int) – seed integer to make the dataloading deterministic

metavision_core_ml.data.stream_dataloader.resample_to_batch_size(stream_list, batch_size)

Resamples the stream list to fit batch_size iterators

Parameters
  • stream_list (list) – list of streams

  • batch_size (int) – batch size

metavision_core_ml.data.stream_dataloader.split_batch_size(batch_size, num_workers)

Returns the number of files to handle per worker

Parameters
  • batch_size (int) – total batch size

  • num_workers (int) – number of workers
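
A plausible splitting scheme (the actual remainder handling in the library is an assumption):

```python
def split_batch_size(batch_size, num_workers):
    # Divide batch_size across workers as evenly as possible; the first
    # workers absorb the remainder.
    base, rem = divmod(batch_size, num_workers)
    return [base + (1 if i < rem else 0) for i in range(num_workers)]
```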

metavision_core_ml.data.stream_dataloader.split_dataset_sizes(stream_list, split_sizes)

Splits the stream list across workers, with split sizes proportional to the number of files each worker has to handle.

Parameters
  • stream_list (list) – list of stream path

  • split_sizes (list) – batch size per worker