SDK Core ML Data API
Synthetic video generation
Camera Pose Generator:
This module lets you define a trajectory of camera poses and generate a continuous series of homographies, interpolating additional poses whenever the maximum optical flow between two consecutive frames exceeds a predefined threshold.
- class metavision_core_ml.data.camera_poses.CameraPoseGenerator(height, width, max_frames=300, pause_probability=0.5, max_optical_flow_threshold=2.0, max_interp_consecutive_frames=20)
CameraPoseGenerator generates a series of continuous homographies with interpolation.
- Parameters
height (int) – height of image
width (int) – width of image
max_frames (int) – maximum number of poses
pause_probability (float) – probability that the sequence contains a pause
max_optical_flow_threshold (float) – maximum optical flow between two consecutive frames
max_interp_consecutive_frames (int) – maximum number of interpolated frames between two consecutive frames
- get_flow(rvec1, tvec1, rvec2, tvec2, height, width)
Computes the optical flow between two poses.
- Parameters
rvec1 (np.array) – rotation vector 1
tvec1 (np.array) – translation vector 1
rvec2 (np.array) – rotation vector 2
tvec2 (np.array) – translation vector 2
nt (np.array) – plane normal
depth (float) – depth from camera
height (int) – height of image
width (int) – width of image
- get_image_transform(rvec1, tvec1, rvec2, tvec2)
Gets the homography between two poses.
- Parameters
rvec1 (np.array) – rotation vector 1
tvec1 (np.array) – translation vector 1
rvec2 (np.array) – rotation vector 2
tvec2 (np.array) – translation vector 2
- metavision_core_ml.data.camera_poses.add_random_pause(signal, max_pos_size_ratio=0.3)
Adds a random pause in a multidimensional signal
- Parameters
signal (np.array) – TxD signal
max_pos_size_ratio (float) – maximum size of the pause relative to the signal length
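One plausible way to realize the behavior described above is to freeze the signal at a random position for a random number of steps while keeping the total length unchanged. This is a minimal sketch in plain NumPy, not the library's implementation:

```python
import numpy as np

def add_random_pause(signal, max_pos_size_ratio=0.3, rng=None):
    """Sketch: freeze a TxD signal at a random position.

    One sample is repeated for a random number of steps (at most
    max_pos_size_ratio * T), and the result is cropped back to length T.
    """
    rng = np.random.default_rng() if rng is None else rng
    T = len(signal)
    pause_len = int(rng.integers(1, max(2, int(max_pos_size_ratio * T))))
    pos = int(rng.integers(0, T - pause_len))
    frozen = np.repeat(signal[pos:pos + 1], pause_len, axis=0)
    # Insert the frozen segment and crop back to the original length T.
    return np.concatenate([signal[:pos], frozen, signal[pos:]], axis=0)[:T]
```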
- metavision_core_ml.data.camera_poses.generate_homographies(rvecs, tvecs, nt, d)
Generates multiple homographies from rotation vectors
- Parameters
rvecs (np.array) – N,3 rotation vectors
tvecs (np.array) – N,3 translation vectors
nt (np.array) – normal to camera
d (float) – depth
- metavision_core_ml.data.camera_poses.generate_homographies_from_rotation_matrices(rot_mats, tvecs, nt, depth)
Generates multiple homographies from rotation matrices
- Parameters
rot_mats (np.array) – N,3,3 rotation matrices
tvecs (np.array) – N,3 translation vectors
nt (np.array) – normal to camera
depth (float) – depth to camera
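The batched construction follows the textbook planar-homography formula, H_i = R_i - t_i · ntᵀ / depth. This self-contained NumPy sketch reuses the function name only for illustration; it is not the library's implementation:

```python
import numpy as np

def generate_homographies_from_rotation_matrices(rot_mats, tvecs, nt, depth):
    """Sketch: batched planar homographies H_i = R_i - t_i * nt^T / depth."""
    n = np.asarray(nt, dtype=float).reshape(1, 3)
    t = np.asarray(tvecs, dtype=float).reshape(-1, 3, 1)
    # (N,3,1) @ (1,1,3) broadcasts to (N,3,3) outer products t_i * nt^T.
    return rot_mats - (t @ n[None]) / depth
```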
- metavision_core_ml.data.camera_poses.generate_homography(rvec, tvec, nt, depth)
Generates a single homography
- Parameters
rvec (np.array) – rotation vector
tvec (np.array) – translation vector
nt (np.array) – normal to camera
depth (float) – depth to camera
- metavision_core_ml.data.camera_poses.generate_image_homographies_from_homographies(h, K, Kinv)
Multiplies each homography on the left by the intrinsic matrix and on the right by its inverse.
- Parameters
h (np.array) – homographies N,3,3
K (np.array) – intrinsic
Kinv (np.ndarray) – inverse intrinsic
- metavision_core_ml.data.camera_poses.generate_image_homography(rvec, tvec, nt, depth, K, Kinv)
Generates a single image homography
- Parameters
rvec (np.array) – rotation vector
tvec (np.array) – translation vector
nt (np.array) – normal to camera
depth (float) – depth
K (np.array) – intrinsic matrix
Kinv (np.array) – inverse intrinsic matrix
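Both constructions above follow the standard formulas from multiple-view geometry: H = R - t · ntᵀ / depth in normalized camera coordinates, then G = K · H · K⁻¹ to map pixels to pixels. A minimal self-contained NumPy sketch (with a hand-rolled Rodrigues conversion, not the library's actual code):

```python
import numpy as np

def rodrigues(rvec):
    """Rotation vector -> rotation matrix via Rodrigues' formula."""
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    k = np.asarray(rvec, dtype=float).ravel() / theta
    K = np.array([[0, -k[2], k[1]],
                  [k[2], 0, -k[0]],
                  [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

def generate_homography(rvec, tvec, nt, depth):
    """Planar homography in normalized coordinates: H = R - t * nt^T / depth."""
    R = rodrigues(rvec)
    t = np.asarray(tvec, dtype=float).reshape(3, 1)
    n = np.asarray(nt, dtype=float).reshape(1, 3)
    return R - (t @ n) / depth

def generate_image_homography(rvec, tvec, nt, depth, K, Kinv):
    """Pixel-space homography: conjugate the geometric one by the intrinsics."""
    return K @ generate_homography(rvec, tvec, nt, depth) @ Kinv
```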
- metavision_core_ml.data.camera_poses.generate_smooth_signal(num_signals, num_samples, min_speed=0.0001, max_speed=0.1)
Generates a smooth signal
- Parameters
num_signals (int) – number of signals to generate
num_samples (int) – length of multidimensional signal
min_speed (float) – minimum rate of change
max_speed (float) – maximum rate of change
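One plausible construction matching this description (the actual implementation may differ): a random walk whose per-step increment magnitude lies in [min_speed, max_speed], smoothed by a moving average so the trajectory has no jerky direction changes:

```python
import numpy as np

def generate_smooth_signal(num_signals, num_samples,
                           min_speed=0.0001, max_speed=0.1, rng=None):
    """Sketch: smooth random walks, one per row, shape (num_signals, num_samples)."""
    rng = np.random.default_rng() if rng is None else rng
    speeds = rng.uniform(min_speed, max_speed, size=(num_signals, num_samples))
    signs = rng.choice([-1.0, 1.0], size=(num_signals, num_samples))
    steps = speeds * signs
    # A moving average keeps each increment within [-max_speed, max_speed].
    kernel = np.ones(5) / 5.0
    smooth_steps = np.apply_along_axis(
        lambda s: np.convolve(s, kernel, mode="same"), 1, steps)
    return np.cumsum(smooth_steps, axis=1)
```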
- metavision_core_ml.data.camera_poses.get_flow(rvec1, tvec1, rvec2, tvec2, nt, depth, K, Kinv, height, width)
Computes the optical flow between two poses.
- Parameters
rvec1 (np.array) – rotation vector 1
tvec1 (np.array) – translation vector 1
rvec2 (np.array) – rotation vector 2
tvec2 (np.array) – translation vector 2
nt (np.array) – plane normal
depth (float) – depth from camera
K (np.array) – intrinsic matrix
Kinv (np.array) – inverse intrinsic matrix
height (int) – height of image
width (int) – width of image
infinite (bool) – plan is infinite or not
- metavision_core_ml.data.camera_poses.get_grid(height, width)
Computes a 2d meshgrid
- Parameters
height (int) – height of grid
width (int) – width of grid
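The meshgrid itself is one line of NumPy. Returning homogeneous coordinates is an assumption made here so the grid can be pushed directly through a 3x3 homography; the real return format may differ:

```python
import numpy as np

def get_grid(height, width):
    """Sketch: 2-D meshgrid of pixel coordinates, stacked in homogeneous form."""
    xx, yy = np.meshgrid(np.arange(width), np.arange(height))
    ones = np.ones_like(xx)
    return np.stack([xx, yy, ones], axis=-1)  # shape (height, width, 3)
```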
- metavision_core_ml.data.camera_poses.get_image_transform(rvec1, tvec1, rvec2, tvec2, nt, depth, K, Kinv)
Gets the image homography between two poses (includes camera intrinsics).
- Parameters
rvec1 (np.array) – rotation vector 1
tvec1 (np.array) – translation vector 1
rvec2 (np.array) – rotation vector 2
tvec2 (np.array) – translation vector 2
nt (np.array) – plane normal
depth (float) – depth from camera
K (np.array) – intrinsic
Kinv (np.ndarray) – inverse intrinsic
- metavision_core_ml.data.camera_poses.get_transform(rvec1, tvec1, rvec2, tvec2, nt, depth)
Gets the geometric homography between two poses.
- Parameters
rvec1 (np.array) – rotation vector 1
tvec1 (np.array) – translation vector 1
rvec2 (np.array) – rotation vector 2
tvec2 (np.array) – translation vector 2
nt (np.array) – plane normal
depth (float) – depth from camera
- metavision_core_ml.data.camera_poses.interpolate_poses(rvecs, tvecs, nt, depth, K, Kinv, height, width, opt_flow_threshold=2, max_frames_per_bin=20)
Interpolates the given poses.
- Parameters
rvecs (np.array) – N,3 rotation vectors
tvecs (np.array) – N,3 translation vectors
nt (np.array) – plane normal
depth (float) – depth to camera
K (np.array) – camera intrinsic
Kinv (np.array) – inverse camera intrinsic
height (int) – height of image
width (int) – width of image
opt_flow_threshold (float) – maximum flow threshold
max_frames_per_bin (int) – maximum number of pose interpolations between two consecutive poses of the original list of poses
- metavision_core_ml.data.camera_poses.interpolate_times_tvecs(tvecs, key_times, inter_tvecs, inter_times, nums)
Interpolates between key times and translation vectors.
- Parameters
tvecs (np.array) – key translation vectors (N, 3)
key_times (np.array) – key times (N, )
inter_tvecs (np.array) – interpolated translations (nums.sum(), 3)
inter_times (np.array) – interpolated times (nums.sum(),)
nums (np.array) – number of interpolation points between key points (N-1,); nums[i] is the number of points between key_times[i] (included) and key_times[i+1] (excluded). The minimum is 1, which corresponds to key_times[i] alone.
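The per-segment counting convention above can be sketched with piecewise-linear interpolation. Note the documented function fills preallocated output arrays (inter_tvecs, inter_times); this illustrative version returns fresh arrays instead:

```python
import numpy as np

def interpolate_times_tvecs(tvecs, key_times, nums):
    """Sketch: nums[i] points in [key_times[i], key_times[i+1]), the first
    being key_times[i] itself (nums[i] == 1 keeps only the key pose)."""
    inter_t, inter_times = [], []
    for i, n in enumerate(nums):
        alphas = np.arange(n) / n  # n points, endpoint excluded
        inter_times.append(key_times[i] + alphas * (key_times[i + 1] - key_times[i]))
        inter_t.append(tvecs[i] + alphas[:, None] * (tvecs[i + 1] - tvecs[i]))
    return np.concatenate(inter_t), np.concatenate(inter_times)
```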
6-DOF motion in front of the image plane, implemented entirely in NumPy + OpenCV. Applies continuous homographies to your image over time; you can also retrieve the optical flow for this motion.
- class metavision_core_ml.data.image_planar_motion_stream.PlanarMotionStream(image_filename, height, width, max_frames=1000, rgb=False, infinite=True, pause_probability=0.5, max_optical_flow_threshold=2.0, max_interp_consecutive_frames=20, crop_image=False, saturation_max_factor=1.0)
Generates a planar motion in front of the image
- Parameters
image_filename (str) – path to image
height (int) – desired height
width (int) – desired width
max_frames (int) – number of frames to stream
rgb (bool) – color images if True, grayscale otherwise
infinite (bool) – if True, the image border is mirrored
pause_probability (float) – probability to add a pause during the stream
max_optical_flow_threshold (float) – maximum optical flow between two consecutive frames
max_interp_consecutive_frames (int) – maximum number of interpolated frames between two consecutive frames
crop_image (bool) – crop images rather than resizing them
saturation_max_factor (float) – multiplicative factor for saturated pixels (only for 16-bit TIFF images; use 1.0 to disable)
6-DOF motion in front of the image plane that also returns corner positions, implemented entirely in NumPy + OpenCV. Applies continuous homographies to your image over time.
- class metavision_core_ml.data.corner_planar_motion_stream.CornerPlanarMotionStream(image_filename, height, width, max_frames=1000, rgb=False, infinite=True, pause_probability=0.5, draw_corners_as_circle=True)
Generates a planar motion in front of the image, returning both images and Harris corners
- Parameters
image_filename – path to image
height – desired height
width – desired width
max_frames – number of frames to stream
rgb – color images if True, grayscale otherwise
infinite – if True, the image border is mirrored
pause_probability – probability that the stream pauses
draw_corners_as_circle – if True, corners are drawn as 2-pixel circles
Video Streaming
Scheduling System for Videos
- class metavision_core_ml.data.scheduling.Metadata(path, start_frame, end_frame)
Represents part of a file to be read.
- Parameters
path (str) – path to video
start_frame (int) – first frame to seek to
end_frame (int) – last frame to read
- metavision_core_ml.data.scheduling.build_image_metadata(folder, min_size, max_size, denominator=1)
Builds Metadata from images
- Parameters
folder (str) – path to pictures
min_size (int) – minimum number of frames
max_size (int) – maximum number of frames
denominator (int) – num_frames will always be a multiple of denominator. It is used to avoid having batches that are missing some frames and need to be padded. This happens when the number of time steps is not a multiple of num_frames.
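The denominator constraint described above can be sketched as a small rounding helper. round_to_multiple is a hypothetical name used only for illustration; it is not part of the API, and how the library actually enforces the constraint may differ:

```python
def round_to_multiple(num_frames, denominator):
    """Sketch: floor num_frames to a multiple of denominator,
    never going below one full multiple."""
    return max(denominator, (num_frames // denominator) * denominator)
```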
- metavision_core_ml.data.scheduling.build_metadata(folder, min_length, max_length, denominator=1)
Builds Metadata for Videos and Images
- Parameters
folder (str) – path to videos or images
min_length (int) – minimum number of frames
max_length (int) – maximum number of frames
denominator (int) – denominator of number of frames for image metadata
- metavision_core_ml.data.scheduling.build_tiff_image_metadata(folder, min_size, max_size, denominator=1)
Builds Metadata from TIFF images
- Parameters
folder (str) – path to pictures
min_size (int) – minimum number of frames
max_size (int) – maximum number of frames
denominator (int) – num_frames will always be a multiple of denominator. It is used to avoid having batches that are missing some frames and need to be padded. This happens when the number of time steps is not a multiple of num_frames.
- metavision_core_ml.data.scheduling.build_video_metadata(folder)
Builds Metadata from videos
- Parameters
folder (str) – path to videos (only looks in current directory, not subfolders)
- metavision_core_ml.data.scheduling.split_video_metadata(metadatas, min_size, max_size)
Splits video Metadata objects into smaller ones.
- Parameters
metadatas (list) – list of metadata objects
min_size (int) – minimum number of frames
max_size (int) – maximum number of frames
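A sketch of how such a split could work, with Metadata simplified to (path, start_frame, end_frame) tuples; dropping trailing chunks shorter than min_size is an assumption, not documented behavior:

```python
def split_video_metadata(metadatas, min_size, max_size):
    """Sketch: cut each frame range into chunks of at most max_size frames,
    discarding trailing chunks shorter than min_size."""
    out = []
    for path, start, end in metadatas:
        pos = start
        while pos < end:
            stop = min(pos + max_size, end)
            if stop - pos >= min_size:
                out.append((path, pos, stop))
            pos = stop
    return out
```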
Iterator over .mp4 or .avi videos. The current backend is OpenCV.
- class metavision_core_ml.data.video_stream.TimedVideoStream(video_filename, height=-1, width=-1, start_frame=0, max_frames=0, rgb=False, override_fps=0)
Video iterator opening both a video stream and a file of timestamps. If the timestamp file does not exist, timestamps are generated at a regular period derived from the frequency in the video's metadata, or with period 1/override_fps if given. Timestamps are delivered in microseconds.
- Parameters
input_path (str) – Path to the file to read.
height (int) – Height of the output frames.
width (int) – Width of the output frames.
start_frame (int) – First frame to seek to
max_frames (int) – Maximum number of frames loaded (if greater than the number of frames in the video it will return all of them).
rgb (bool) – Whether the output should be in rgb or greyscale.
override_fps (int) – Frequency of the generated timestamps in Hz (used when no timestamp file is available). If equal to 0, the frequency is taken from the video's metadata.
- get_size()
Function returning the size of the imager which produced the events.
- Returns
Tuple of int (height, width) which might be (None, None)
Generic data Streaming
Module that enables Parallel Multistreaming.
We define an IterableDataset that streams several iterables. When fed to a PyTorch DataLoader with batch_size=None, this streams batches from one worker at a time, which enables parallel streaming.
The StreamDataLoader is a class built on top of DataLoader that fuses batch parts together so that batches are always temporally coherent.
Note that you can also skip this fusion, use a regular DataLoader, and keep multiple neural networks indexed by worker id.
- class metavision_core_ml.data.stream_dataloader.StreamDataLoader(dataset, num_workers, collate_fn)
Wraps around the DataLoader to handle the asynchronous batches.
- Parameters
dataset (StreamDataset) – dataset streaming multiple iterables
num_workers (int) – number of workers
collate_fn (function) – function to collate batch parts
- cpu()
Sets the StreamDataLoader to leave tensors on CPU.
- cuda(device=device(type='cuda'))
Sets the StreamDataLoader to copy tensors to GPU memory before returning them.
- Parameters
device (torch.device) – The destination GPU device. Defaults to the current CUDA device.
- to(device)
Sets the StreamDataLoader to copy tensors to the given device before returning them.
- Parameters
device (torch.device) – The destination GPU device. For instance torch.device(‘cpu’) or torch.device(‘cuda’).
- class metavision_core_ml.data.stream_dataloader.StreamDataset(stream_list, streamer, batch_size, padding_mode, fill_value, seed=None)
An IterableDataset zipping a group of iterable streams together.
- Parameters
stream_list (list) – list of streams (path/ metadata)
streamer (object) – an iterator (user defined)
batch_size (int) – total batch size
padding_mode (str) – "zeros", "data", or "none"; see the get_zip function
fill_value (object) – padding value
seed (int) – seed integer to make the dataloading deterministic
- metavision_core_ml.data.stream_dataloader.resample_to_batch_size(stream_list, batch_size)
Resamples list to fit batch_size iterators
- Parameters
stream_list (list) – list of streams
batch_size (int) – batch size
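One way such a resampling could work (the actual strategy is not documented here, so this is only a guess): pad the list with randomly chosen repeats until its length is a multiple of batch_size, so every iterator slot stays busy:

```python
import random

def resample_to_batch_size(stream_list, batch_size, seed=0):
    """Sketch: pad a non-empty stream list with random repeats until its
    length is a multiple of batch_size."""
    rng = random.Random(seed)
    out = list(stream_list)
    while len(out) % batch_size:
        out.append(rng.choice(stream_list))
    return out
```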
- metavision_core_ml.data.stream_dataloader.split_batch_size(batch_size, num_workers)
Returns the number of files to handle per worker
- Parameters
batch_size (int) – total batch size
num_workers (int) – number of workers
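The even split across workers can be sketched in a few lines; whether the library distributes the remainder to the first workers exactly like this is an assumption:

```python
def split_batch_size(batch_size, num_workers):
    """Sketch: distribute a total batch size across workers as evenly as
    possible, giving the remainder to the first workers."""
    base, rem = divmod(batch_size, num_workers)
    return [base + (1 if i < rem else 0) for i in range(num_workers)]
```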
- metavision_core_ml.data.stream_dataloader.split_dataset_sizes(stream_list, split_sizes)
Splits with different sizes proportional to the number of files each worker has to handle.
- Parameters
stream_list (list) – list of stream path
split_sizes (list) – batch size per worker
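A proportional split of the stream list can be sketched as follows; the rounding scheme (and giving the last worker the remainder) is an illustrative choice, not necessarily the library's:

```python
def split_dataset_sizes(stream_list, split_sizes):
    """Sketch: give each worker a slice of the stream list proportional
    to its share of the total batch size."""
    total = sum(split_sizes)
    out, start = [], 0
    for i, size in enumerate(split_sizes):
        if i == len(split_sizes) - 1:
            stop = len(stream_list)  # last worker takes the remainder
        else:
            stop = start + round(len(stream_list) * size / total)
        out.append(stream_list[start:stop])
        start = stop
    return out
```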