SDK Core ML Data API
Synthetic video generation
Camera Pose Generator:
This module lets you define a trajectory of camera poses and generate a continuous series of homographies, interpolating extra poses whenever the maximum optical flow between two consecutive frames exceeds a predefined threshold.
-
class
metavision_core_ml.data.camera_poses.
CameraPoseGenerator
(height, width, max_frames=300, pause_probability=0.5, max_optical_flow_threshold=2.0, max_interp_consecutive_frames=20) CameraPoseGenerator generates a series of continuous homographies with interpolation.
- Parameters
height (int) – height of image
width (int) – width of image
max_frames (int) – maximum number of poses
pause_probability (float) – probability that the sequence contains a pause
max_optical_flow_threshold (float) – maximum optical flow between two consecutive frames
max_interp_consecutive_frames (int) – maximum number of interpolated frames between two consecutive frames
-
get_flow
(rvec1, tvec1, rvec2, tvec2, height, width) Computes optical flow between two poses
- Parameters
rvec1 (np.array) – rotation vector 1
tvec1 (np.array) – translation vector 1
rvec2 (np.array) – rotation vector 2
tvec2 (np.array) – translation vector 2
nt (np.array) – plane normal
depth (float) – depth from camera
height (int) – height of image
width (int) – width of image
-
get_image_transform
(rvec1, tvec1, rvec2, tvec2) Gets the homography between two poses
- Parameters
rvec1 (np.array) – rotation vector 1
tvec1 (np.array) – translation vector 1
rvec2 (np.array) – rotation vector 2
tvec2 (np.array) – translation vector 2
-
metavision_core_ml.data.camera_poses.
add_random_pause
(signal, max_pos_size_ratio=0.3) Adds a random pause in a multidimensional signal
- Parameters
signal (np.array) – TxD signal
max_pos_size_ratio (float) – maximum size of the pause relative to the signal length
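The idea behind such a pause can be sketched in a few lines of NumPy: replace a contiguous chunk of the (T, D) signal with the value at the chunk's start, so the trajectory "freezes" while the total length is preserved. This is an illustrative sketch, not the library implementation; the function name is hypothetical.

```python
import numpy as np

def add_random_pause_sketch(signal, max_pos_size_ratio=0.3, rng=None):
    """Hold a (T, D) signal at a random position for a random duration.

    The pause replaces a contiguous chunk with the value at the chunk's
    start, so the output keeps shape (T, D). Sketch only.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = signal.copy()
    T = len(out)
    pause_len = int(rng.integers(1, max(2, int(max_pos_size_ratio * T))))
    start = int(rng.integers(0, T - pause_len))
    out[start:start + pause_len] = out[start]
    return out

traj = np.cumsum(np.ones((100, 3)) * 0.01, axis=0)  # simple (T, D) ramp
paused = add_random_pause_sketch(traj)
```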
-
metavision_core_ml.data.camera_poses.
generate_homographies
(rvecs, tvecs, nt, d) Generates multiple homographies from rotation vectors
- Parameters
rvecs (np.array) – N,3 rotation vectors
tvecs (np.array) – N,3 translation vectors
nt (np.array) – normal to camera
d (float) – depth
-
metavision_core_ml.data.camera_poses.
generate_homographies_from_rotation_matrices
(rot_mats, tvecs, nt, depth) Generates multiple homographies from rotation matrices
- Parameters
rot_mats (np.array) – N,3,3 rotation matrices
tvecs (np.array) – N,3 translation vectors
nt (np.array) – normal to camera
depth (float) – depth to camera
-
metavision_core_ml.data.camera_poses.
generate_homography
(rvec, tvec, nt, depth) Generates a single homography
- Parameters
rvec (np.array) – rotation vector
tvec (np.array) – translation vector
nt (np.array) – normal to camera
depth (float) – depth to camera
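The underlying math is the standard plane-induced homography H = R − t·nᵀ/d, where R is the rotation obtained from the rotation vector via the Rodrigues formula. A minimal NumPy sketch (the library may use a different sign convention or parameterization):

```python
import numpy as np

def rodrigues(rvec):
    """Rotation matrix from a rotation vector (axis * angle)."""
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rvec / theta
    K = np.array([[0, -k[2], k[1]],
                  [k[2], 0, -k[0]],
                  [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

def generate_homography_sketch(rvec, tvec, nt, depth):
    """Plane-induced homography H = R - t @ n^T / d (sketch)."""
    R = rodrigues(np.asarray(rvec, dtype=float))
    t = np.asarray(tvec, dtype=float).reshape(3, 1)
    n = np.asarray(nt, dtype=float).reshape(1, 3)
    return R - (t @ n) / depth

# The identity pose yields the identity homography.
H = generate_homography_sketch([0, 0, 0], [0, 0, 0], [0, 0, 1], 1.0)
```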
-
metavision_core_ml.data.camera_poses.
generate_image_homographies_from_homographies
(h, K, Kinv) Multiplies each homography on the left by the intrinsic matrix and on the right by its inverse
- Parameters
h (np.array) – homographies N,3,3
K (np.array) – intrinsic
Kinv (np.ndarray) – inverse intrinsic
-
metavision_core_ml.data.camera_poses.
generate_image_homography
(rvec, tvec, nt, depth, K, Kinv) Generates a single image homography
- Parameters
rvec (np.array) – rotation vector
tvec (np.array) – translation vector
nt (np.array) – normal to camera
depth (float) – depth
K (np.array) – intrinsic matrix
Kinv (np.array) – inverse intrinsic matrix
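Mapping a geometric homography into pixel coordinates uses the standard relation G = K·H·K⁻¹. A small sketch with made-up intrinsics (the values of K here are purely illustrative):

```python
import numpy as np

# Hypothetical pinhole intrinsics: focal length 100 px, principal point (64, 64).
K = np.array([[100.0, 0.0, 64.0],
              [0.0, 100.0, 64.0],
              [0.0, 0.0, 1.0]])
Kinv = np.linalg.inv(K)

H = np.eye(3)       # identity geometric homography for illustration
G = K @ H @ Kinv    # image-space homography
```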
-
metavision_core_ml.data.camera_poses.
generate_smooth_signal
(num_signals, num_samples, min_speed=0.0001, max_speed=0.1) Generates a smooth signal
- Parameters
num_signals (int) – number of signals to generate
num_samples (int) – length of multidimensional signal
min_speed (float) – minimum rate of change
max_speed (float) – maximum rate of change
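One simple way to obtain such a signal is a random walk whose step size is bounded by the speed range, followed by light smoothing. This sketch is an assumption about the construction, not the library's actual method:

```python
import numpy as np

def generate_smooth_signal_sketch(num_signals, num_samples,
                                  min_speed=1e-4, max_speed=0.1, rng=None):
    """Random walk with step magnitudes in [min_speed, max_speed],
    smoothed by a short moving average. Illustrative sketch only."""
    rng = np.random.default_rng() if rng is None else rng
    steps = rng.uniform(min_speed, max_speed, size=(num_signals, num_samples))
    steps *= rng.choice([-1.0, 1.0], size=steps.shape)   # random direction
    signal = np.cumsum(steps, axis=1)
    kernel = np.ones(5) / 5.0                            # light smoothing
    return np.stack([np.convolve(s, kernel, mode='same') for s in signal])

sig = generate_smooth_signal_sketch(3, 200)
```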
-
metavision_core_ml.data.camera_poses.
get_flow
(rvec1, tvec1, rvec2, tvec2, nt, depth, K, Kinv, height, width) Computes Optical Flow between 2 poses
- Parameters
rvec1 (np.array) – rotation vector 1
tvec1 (np.array) – translation vector 1
rvec2 (np.array) – rotation vector 2
tvec2 (np.array) – translation vector 2
nt (np.array) – plane normal
depth (float) – depth from camera
K (np.array) – intrinsic matrix
Kinv (np.array) – inverse intrinsic matrix
height (int) – height of image
width (int) – width of image
infinite (bool) – whether the plane is infinite
-
metavision_core_ml.data.camera_poses.
get_grid
(height, width) Computes a 2D meshgrid
- Parameters
height (int) – height of grid
width (int) – width of grid
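A meshgrid like this underpins dense warping and flow evaluation. A minimal sketch; the axis order and indexing convention here are assumptions:

```python
import numpy as np

# Pixel-coordinate grids: ys[i, j] = i (row), xs[i, j] = j (column).
height, width = 4, 5
ys, xs = np.meshgrid(np.arange(height), np.arange(width), indexing='ij')
```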
-
metavision_core_ml.data.camera_poses.
get_image_transform
(rvec1, tvec1, rvec2, tvec2, nt, depth, K, Kinv) Get image Homography between 2 poses (includes cam intrinsics)
- Parameters
rvec1 (np.array) – rotation vector 1
tvec1 (np.array) – translation vector 1
rvec2 (np.array) – rotation vector 2
tvec2 (np.array) – translation vector 2
nt (np.array) – plane normal
depth (float) – depth from camera
K (np.array) – intrinsic
Kinv (np.ndarray) – inverse intrinsic
-
metavision_core_ml.data.camera_poses.
get_transform
(rvec1, tvec1, rvec2, tvec2, nt, depth) Get geometric Homography between 2 poses
- Parameters
rvec1 (np.array) – rotation vector 1
tvec1 (np.array) – translation vector 1
rvec2 (np.array) – rotation vector 2
tvec2 (np.array) – translation vector 2
nt (np.array) – plane normal
depth (float) – depth from camera
-
metavision_core_ml.data.camera_poses.
interpolate_poses
(rvecs, tvecs, nt, depth, K, Kinv, height, width, opt_flow_threshold=2, max_frames_per_bin=20) Interpolate given poses
- Parameters
rvecs (np.array) – N,3 rotation vectors
tvecs (np.array) – N,3 translation vectors
nt (np.array) – plane normal
depth (float) – depth to camera
K (np.array) – camera intrinsic
Kinv (np.array) – inverse camera intrinsic
height (int) – height of image
width (int) – width of image
opt_flow_threshold (float) – maximum flow threshold
max_frames_per_bin (int) – maximum number of pose interpolations between two consecutive poses of the original list of poses
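The interpolation logic can be sketched as: for each consecutive pose pair, estimate the flow, derive the number of in-between poses from the flow-to-threshold ratio (capped at max_frames_per_bin), and linearly interpolate. Here the image-space flow is approximated by the norm of the pose change, which is a stand-in for the real computation:

```python
import numpy as np

def interpolate_poses_sketch(rvecs, tvecs, opt_flow_threshold=2.0,
                             max_frames_per_bin=20, flow_fn=None):
    """Insert linearly interpolated poses between consecutive key poses.

    The number of in-between poses grows with the estimated flow
    magnitude. Illustrative sketch only.
    """
    if flow_fn is None:
        flow_fn = lambda i: (np.linalg.norm(rvecs[i + 1] - rvecs[i]) +
                             np.linalg.norm(tvecs[i + 1] - tvecs[i]))
    out_r, out_t = [], []
    for i in range(len(rvecs) - 1):
        n = int(np.clip(np.ceil(flow_fn(i) / opt_flow_threshold),
                        1, max_frames_per_bin))
        for a in np.linspace(0.0, 1.0, n, endpoint=False):
            out_r.append((1 - a) * rvecs[i] + a * rvecs[i + 1])
            out_t.append((1 - a) * tvecs[i] + a * tvecs[i + 1])
    out_r.append(rvecs[-1])
    out_t.append(tvecs[-1])
    return np.array(out_r), np.array(out_t)

rvecs = np.zeros((3, 3))
tvecs = np.array([[0, 0, 0], [4, 0, 0], [4, 0, 0]], dtype=float)
r, t = interpolate_poses_sketch(rvecs, tvecs)
```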
-
metavision_core_ml.data.camera_poses.
interpolate_times_tvecs
(tvecs, key_times, inter_tvecs, inter_times, nums) Interpolates between key times & translation vectors
- Parameters
tvecs (np.array) – key translation vectors (N, 3)
key_times (np.array) – key times (N, )
inter_tvecs (np.array) – interpolated translations (nums.sum(), 3)
inter_times (np.array) – interpolated times (nums.sum(),)
nums (np.array) – number of interpolation points between key points (N-1,). nums[i] is the number of points between key_times[i] (included) and key_times[i+1] (excluded); the minimum is 1, which corresponds to key_times[i] alone
6-DOF motion in front of the image plane, implemented entirely in NumPy + OpenCV. Applies continuous homographies to your picture over time; you can also get the optical flow for this motion.
-
class
metavision_core_ml.data.image_planar_motion_stream.
PlanarMotionStream
(image_filename, height, width, max_frames=1000, rgb=False, infinite=True, pause_probability=0.5, max_optical_flow_threshold=2.0, max_interp_consecutive_frames=20, crop_image=False) Generates a planar motion in front of the image
- Parameters
image_filename (str) – path to image
height (int) – desired height
width (int) – desired width
max_frames (int) – number of frames to stream
rgb (bool) – color images or gray
infinite (bool) – border is mirrored
pause_probability (float) – probability to add a pause during the stream
max_optical_flow_threshold (float) – maximum optical flow between two consecutive frames
max_interp_consecutive_frames (int) – maximum number of interpolated frames between two consecutive frames
crop_image (bool) – crop images or resize them
6-DOF motion in front of the image plane that also returns corner positions. Implemented entirely in NumPy + OpenCV; applies continuous homographies to your picture over time.
-
class
metavision_core_ml.data.corner_planar_motion_stream.
CornerPlanarMotionStream
(image_filename, height, width, max_frames=1000, rgb=False, infinite=True, pause_probability=0.5, draw_corners_as_circle=True) Generates a planar motion in front of the image, returning both images and Harris corners
- Parameters
image_filename – path to image
height – desired height
width – desired width
max_frames – number of frames to stream
rgb – color images or gray
infinite – border is mirrored
pause_probability – probability that the stream pauses
draw_corners_as_circle – if True, corners are drawn as 2-pixel circles
Video Streaming
Scheduling System for Videos
-
class
metavision_core_ml.data.scheduling.
Metadata
(path, start_frame, end_frame) Represents part of a file to be read.
- Parameters
path (str) – path to video
start_frame (int) – first frame to seek to
end_frame (int) – last frame to read
-
metavision_core_ml.data.scheduling.
build_image_metadata
(folder, min_size, max_size, denominator=1) Builds Metadata from images
- Parameters
folder (str) – path to pictures
min_size (int) – minimum number of frames
max_size (int) – maximum number of frames
denominator (int) – num_frames will always be a multiple of denominator. It is used to avoid having batches that are missing some frames and need to be padded. This happens when the number of time steps is not a multiple of num_frames.
-
metavision_core_ml.data.scheduling.
build_metadata
(folder, min_length, max_length, denominator=1) Builds Metadata for Videos and Images
- Parameters
folder (str) – path to videos or images
min_length (int) – minimum number of frames
max_length (int) – maximum number of frames
denominator (int) – denominator of number of frames for image metadata
-
metavision_core_ml.data.scheduling.
build_video_metadata
(folder) Builds Metadata from videos
- Parameters
folder (str) – path to videos (only looks in current directory, not subfolders)
-
metavision_core_ml.data.scheduling.
split_video_metadata
(metadatas, min_size, max_size) Split video metadata into smaller ones.
- Parameters
metadatas (list) – list of metadata objects
min_size (int) – minimum number of frames
max_size (int) – maximum number of frames
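The splitting behavior can be sketched as follows: walk through each video entry and emit chunks of at most max_size frames, dropping a final remainder shorter than min_size. The names and fields are assumptions based on the Metadata(path, start_frame, end_frame) signature above; this is not the library code:

```python
from dataclasses import dataclass

@dataclass
class Metadata:
    path: str
    start_frame: int
    end_frame: int

def split_video_metadata_sketch(metadatas, min_size, max_size):
    """Split each video entry into chunks of [min_size, max_size] frames."""
    out = []
    for md in metadatas:
        start = md.start_frame
        while start <= md.end_frame:
            end = min(start + max_size - 1, md.end_frame)
            if end - start + 1 >= min_size:   # drop too-short remainders
                out.append(Metadata(md.path, start, end))
            start = end + 1
    return out

# A 250-frame video split into chunks of 50-100 frames.
chunks = split_video_metadata_sketch([Metadata("a.mp4", 0, 249)], 50, 100)
```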
Video (.mp4 or .avi) iterator. The current backend is OpenCV.
-
class
metavision_core_ml.data.video_stream.
TimedVideoStream
(video_filename, height=-1, width=-1, start_frame=0, max_frames=0, rgb=False, override_fps=0) Video iterator opening both a video stream and a file of timestamps. If the timestamp file does not exist, timestamps are generated with a regular period, at the frequency described in the video's metadata, or at override_fps if given. Timestamps are delivered in microseconds.
- Parameters
input_path (str) – Path to the file to read.
height (int) – Height of the output frames.
width (int) – Width of the output frames.
start_frame (int) – First frame to seek to
max_frames (int) – Maximum number of frames loaded (if greater than the number of frames in the video it will return all of them).
rgb (bool) – Whether the output should be in rgb or greyscale.
override_fps (int) – frequency in Hz of the generated timestamps (used if the timestamp file is not available). If equal to 0, the frequency is taken from the video's metadata.
-
get_size
() Returns the size of the imager which produced the events.
- Returns
Tuple of int (height, width) which might be (None, None)
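When no timestamp file exists, generating timestamps at a fixed frequency amounts to assigning frame i the time t_i = i · 10⁶ / fps in microseconds. A sketch of that behavior (not the library code):

```python
import numpy as np

# Microsecond timestamps at a regular period for a 25 fps video.
fps = 25
n_frames = 5
timestamps_us = np.arange(n_frames) * (1e6 / fps)
```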
Generic data Streaming
Module that enables parallel multistreaming.
We define an IterableDataset that streams several iterables. When fed to a PyTorch DataLoader with batch_size=None, it streams batches from one worker at a time; this has the effect of enabling parallel streaming.
The StreamDataLoader is a class built on top of DataLoader that fuses batch parts together so that batches are always temporally coherent.
Note that you can also skip this fusion, use a regular DataLoader, and have multiple neural networks indexed by worker id.
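The core idea of zipping a group of streams into batch parts, padding exhausted streams so every part keeps the same width, can be sketched in plain Python. The function name and the padding behavior shown here are illustrative assumptions, loosely modeled on the padding modes of StreamDataset below:

```python
def zip_streams(streams, fill_value=None):
    """Zip iterable streams into fixed-width batch parts, padding
    exhausted streams with fill_value. Sketch of the concept only."""
    iters = [iter(s) for s in streams]
    alive = [True] * len(iters)
    while any(alive):
        part = []
        for i, it in enumerate(iters):
            if alive[i]:
                try:
                    part.append(next(it))
                except StopIteration:
                    alive[i] = False
                    part.append(fill_value)
            else:
                part.append(fill_value)
        if any(alive):            # stop once every stream is exhausted
            yield part

# Two streams of unequal length; the shorter one is padded with 0.
batches = list(zip_streams([[1, 2, 3], "ab"], fill_value=0))
```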
-
class
metavision_core_ml.data.stream_dataloader.
StreamDataLoader
(dataset, num_workers, collate_fn) Wraps around the DataLoader to handle the asynchronous batches.
- Parameters
dataset (StreamDataset) – dataset streaming multiple iterables
num_workers (int) – number of workers
collate_fn (function) – function to collate batch parts
-
cpu
() Sets the StreamDataLoader to leave tensors on CPU.
-
cuda
(device=device(type='cuda')) Sets the StreamDataLoader to copy tensors to GPU memory before returning them.
- Parameters
device (torch.device) – The destination GPU device. Defaults to the current CUDA device.
-
to
(device) Sets the StreamDataLoader to copy tensors to the given device before returning them.
- Parameters
device (torch.device) – The destination GPU device. For instance torch.device(‘cpu’) or torch.device(‘cuda’).
-
class
metavision_core_ml.data.stream_dataloader.
StreamDataset
(stream_list, streamer, batch_size, padding_mode, fill_value, seed=None) An IterableDataset zipping a group of iterable streams together.
- Parameters
stream_list (list) – list of streams (path/ metadata)
streamer (object) – an iterator (user defined)
batch_size (int) – total batch size
padding_mode (str) – “zeros” “data” or “none”, see “get_zip” function
fill_value (object) – padding value
seed (int) – seed integer to make the dataloading deterministic
-
metavision_core_ml.data.stream_dataloader.
resample_to_batch_size
(stream_list, batch_size) Resamples the stream list to fit batch_size iterators
- Parameters
stream_list (list) – list of streams
batch_size (int) – batch size
-
metavision_core_ml.data.stream_dataloader.
split_batch_size
(batch_size, num_workers) Returns the number of files to handle per worker
- Parameters
batch_size (int) – total batch size
num_workers (int) – number of workers
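Distributing a total batch size over workers is a simple even split with the remainder spread across the first workers. A sketch of this arithmetic (the library may order the remainder differently):

```python
def split_batch_size_sketch(batch_size, num_workers):
    """Distribute batch_size across workers as evenly as possible."""
    base, rem = divmod(batch_size, num_workers)
    return [base + (1 if i < rem else 0) for i in range(num_workers)]

# 10 streams over 4 workers: two workers get 3, two get 2.
sizes = split_batch_size_sketch(10, 4)
```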
-
metavision_core_ml.data.stream_dataloader.
split_dataset_sizes
(stream_list, split_sizes) Splits the stream list into chunks with sizes proportional to the number of files each worker has to handle.
- Parameters
stream_list (list) – list of stream path
split_sizes (list) – batch size per worker