SDK ML Algorithms
-
class Metavision::CDProcessing
Processes CD events to compute the neural network input frame (a 3-dimensional tensor).
This is the base class. It handles the rescaling of the events if necessary and provides accessors to get the shape of the output tensor. Derived classes implement the computation; calling operator() on this base class triggers it.
Public Functions
-
inline CDProcessing(timestamp delta_t, int network_input_width, int network_input_height, int event_input_width = 0, int event_input_height = 0, bool use_CHW = true)
Constructs a CDProcessing object to ease the generation of the neural network input frame.
- Parameters
delta_t – Delta time used to accumulate events inside the frame
network_input_width – Neural network input frame’s width
network_input_height – Neural network input frame’s height
event_input_width – Sensor’s width
event_input_height – Sensor’s height
use_CHW – Boolean defining the frame's dimension order: true if the order is (Channel, Height, Width)
-
inline size_t get_frame_size() const
Gets the frame size.
- Returns
The frame size in pixels (height * width * channels)
-
inline size_t get_frame_width() const
Gets the network input frame's width.
- Returns
Network input frame’s width
-
inline size_t get_frame_height() const
Gets the network input frame's height.
- Returns
Network input frame’s height
-
inline size_t get_frame_channels() const
Gets the number of channels in the network input frame.
- Returns
Number of channels in the network input frame
-
inline bool is_CHW() const
Checks the tensor's dimension order.
- Returns
true if the dimension order is (channel, height, width)
-
inline std::vector<size_t> get_frame_shape() const
Gets the shape of the frame (3 dimensions, either CHW or HWC).
- Returns
A vector of sizes
-
template<typename InputIt>
inline void operator()(const timestamp cur_frame_start_ts, InputIt begin, InputIt end, float *frame, int frame_size) const
Updates the frame depending on the input events.
- Template Parameters
InputIt – Type of input iterator (either a container iterator or a raw pointer to EventCD)
- Parameters
cur_frame_start_ts – starting timestamp of the current frame
begin – Begin iterator
end – End iterator
frame – Pointer to the frame (input/output)
frame_size – Input frame size
-
class Metavision::NonMaximumSuppressionWithRescaling
Rescales detection boxes from the network input size to the sensor's size and suppresses non-maximum overlapping boxes.
Public Functions
-
inline NonMaximumSuppressionWithRescaling()
Builds a non-configured NonMaximumSuppressionWithRescaling object.
-
inline NonMaximumSuppressionWithRescaling(std::size_t num_classes, int events_input_width, int events_input_height, int network_input_width, int network_input_height, float iou_threshold)
Constructs an object that rescales detected boxes and suppresses non-maximum overlapping boxes.
- Parameters
num_classes – Number of possible classes returned by the neural network
events_input_width – Sensor’s width
events_input_height – Sensor’s height
network_input_width – Neural network input frame’s width
network_input_height – Neural network input frame’s height
iou_threshold – Threshold on the IOU metric to consider that two boxes match
-
template<typename InputIt, typename OutputIt>
inline void process_events(const InputIt it_begin, const InputIt it_end, OutputIt inserter)
Rescales and filters boxes.
- Template Parameters
InputIt – Read-Only input iterator type
OutputIt – Read-Write output iterator type
- Parameters
it_begin – Iterator to the first box
it_end – Iterator to the past-the-end box
inserter – Output iterator or back inserter
-
inline void set_iou_threshold(float threshold)
Sets the Intersection Over Union (IOU) threshold.
Note
Intersection Over Union (IOU) is the ratio of the intersection area over the union area
- Parameters
threshold – Threshold on the IOU metric to consider that two boxes match
-
inline void ignore_class_id(std::size_t class_id)
Configures the computation to ignore a given class identifier.
- Parameters
class_id – Identifier of the class to be ignored
Public Static Functions
-
static inline void compute_nms_per_class(std::list<EventBbox> &bbox_list, float iou_threshold)
Suppresses non-maximum overlapping boxes over a list of EventBbox-es.
Note
The list is modified in-place. The result is sorted by confidence.
- Parameters
bbox_list – [inout] List of EventBbox on which to apply the Non-maximum suppression
iou_threshold – Threshold above which two boxes are considered to overlap
-
class Metavision::ObjectDetectorTorchJit
Public Functions
-
inline ObjectDetectorTorchJit(const std::string &directory, int frame_width, int frame_height, int network_input_width = 0, int network_input_height = 0, bool use_cuda = false, int ignore_first_n_prediction_steps = 0, int gpu_id = 0)
Constructor for ObjectDetectorTorchJit.
Note
When network_input_width and network_input_height differ from frame_width and frame_height, the corresponding rescaling is performed on the output bounding boxes, such that the output detections are still returned in the original coordinate frame of the events
- Parameters
directory – Name of the directory containing at least two files:
model.ptjit : PyTorch model exported using torch.jit
info_ssd_jit.json : JSON file which contains information about the neural network (type of input features, dimensions, accumulation time, list of classes, default thresholds, etc.)
frame_width – Sensor’s width
frame_height – Sensor’s height
network_input_width – Neural network input width, which can be smaller than frame_width. In this case the network will work on a downscaled size
network_input_height – Neural network input height, which can be smaller than frame_height. In this case the network will work on a downscaled size
use_cuda – Boolean to indicate whether to use the GPU
ignore_first_n_prediction_steps – Number of discarded neural network predictions at the beginning of a sequence. Depending on initial conditions, recurrent models sometimes have a transitory regime in which they initially produce unreliable detections before they enter normal working regime.
gpu_id – GPU identification number allowing selection of the GPU if several are available
-
inline void use_cpu()
Performs all computations on the CPU.
-
inline bool use_gpu_if_available(int gpu_id = 0)
Performs the computations on the GPU if there is one.
- Parameters
gpu_id – ID of the GPU on which the computations must be performed
- Returns
Boolean to indicate if the provided gpu_id is available
-
template<typename OutputIt>
inline void process(Frame_t &input, OutputIt bbox_first, timestamp ts)
Computes the detections given the provided input tensor.
- Parameters
input – Chunk of memory which corresponds to the input tensor
bbox_first – Output iterator to add the detection boxes
ts – Timestamp of the current timestep. Output boxes will have this timestamp
-
inline bool is_half() const
Returns true if the model runs at half precision.
- Returns
Whether the model runs at half precision
-
inline int get_network_height() const
Returns the input frame height.
- Returns
Network input height in pixels
-
inline int get_network_width() const
Returns the input frame width.
- Returns
Network input width in pixels
-
inline int get_network_input_channels() const
Returns the number of channels in the input frame.
- Returns
Network input channel number
-
inline int get_network_input_size() const
Returns the network input size.
- Returns
Size of the input frame
-
inline Metavision::timestamp get_accumulation_time() const
Returns the time during which the events are accumulated to compute the NN input tensor.
- Returns
Delta time used to generate the input frame
-
inline CDProcessing &get_cd_processor()
Returns the object responsible for computing the content of the input tensor.
- Returns
CDProcessing to ease the input frame generation
-
inline const std::vector<std::string> &get_labels() const
Returns a vector of labels for the classes of the neural network.
- Returns
Vector of strings containing labels
-
inline void set_ts(Metavision::timestamp ts)
Initializes the internal timestamp of the object detector.
This is needed in order to use the start_ts parameter in the pipeline to start at a ts > 0.
- Parameters
ts – time at which the first slice of time starts
-
inline void set_detection_threshold(float threshold)
Uses this detection threshold instead of the default value read from the JSON file.
This is the lower bound on the confidence score for a detection box to be accepted. It takes values in the range ]0;1[. A low value yields more detections; a high value yields fewer.
- Parameters
threshold – Lower bound on the detector confidence score
-
inline void set_iou_threshold(float threshold)
Uses this IOU threshold for NMS instead of the default value read from the JSON file.
Non-maximum suppression discards detection boxes which are too similar to each other, keeping only the best one of each such group. This similarity criterion is based on the Intersection-Over-Union between the considered boxes. This threshold is the upper bound on the IOU for two boxes to be considered distinct (and therefore not filtered out by the non-maximum suppression). It takes values in the range ]0;1[. A low value yields fewer overlapping boxes; a high value allows more overlapping boxes.
- Parameters
threshold – Upper bound on the IOU for two boxes to be considered distinct
-
inline void reset()
Resets the memory cells of the neural network.
Neural networks used as object detectors are usually RNNs (typically LSTMs). Use this function to reset the memory of the neural network when feeding it new inputs unrelated to the previous ones: call reset() before applying the same object detector to a new sequence.
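Putting the pieces together, a typical detection loop with this class might look as follows. This is illustrative pseudocode based only on the signatures documented above; event acquisition, the Frame_t input binding, and box consumption are elided, and the variable names are assumptions:

```
// Illustrative pseudocode -- not a compilable sample.
Metavision::ObjectDetectorTorchJit detector("/path/to/model_dir", sensor_w, sensor_h);
detector.use_gpu_if_available();
detector.set_detection_threshold(0.4f); // optional override of the JSON default
detector.reset();                       // fresh memory before a new sequence

Metavision::CDProcessing &cd = detector.get_cd_processor();
std::vector<float> frame(cd.get_frame_size(), 0.f);

for each time slice [ts, ts + detector.get_accumulation_time()):
    cd(ts, events_begin, events_end, frame.data(), frame.size()); // build input tensor
    detector.process(input, std::back_inserter(boxes), ts);       // run the network
```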