Note
This C++ sample has a corresponding Python sample.
Generic Tracking using C++
The Analytics API provides the Metavision::TrackingAlgorithm
class for generic object tracking.
The sample metavision_generic_tracking
shows how to use Metavision::TrackingAlgorithm
to track objects.
Note that the Analytics API also provides a lighter implementation of Metavision::TrackingAlgorithm
restricted to non-colliding objects: Metavision::SpatterTrackerAlgorithm,
which is demonstrated in the
Spatter Tracking sample.
The source code of this sample can be found in
<install-prefix>/share/metavision/sdk/analytics/cpp_samples/metavision_generic_tracking
when installing Metavision SDK
from installer or packages. For other deployment methods, check the page
Path of Samples.
Expected Output
The Metavision Generic Tracking sample visualizes all events and the tracked objects, drawing a bounding box around each tracked object with its ID shown next to the box:
Setup & requirements
By default, Metavision Generic Tracking looks for objects of size between 10x10 and 300x300 pixels.
Use the command line options --min-size
and --max-size
to adapt the sample to your scene.
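For example, to ignore smaller objects and allow larger ones, you could pass different bounds (the values below are only illustrative; each option is assumed to take a single size in pixels applied to both dimensions):
Linux
./metavision_generic_tracking --min-size 30 --max-size 500
Windows
metavision_generic_tracking.exe --min-size 30 --max-size 500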
How to start
You can directly execute the pre-compiled binary installed with Metavision SDK or compile the source code as described in this tutorial.
To start the sample based on recorded data, provide the full path to a RAW file (here, we use a file from our Sample Recordings):
Linux
./metavision_generic_tracking -i traffic_monitoring.raw
Windows
metavision_generic_tracking.exe -i traffic_monitoring.raw
To start the sample based on the live stream from your camera, run:
Linux
./metavision_generic_tracking
Windows
metavision_generic_tracking.exe
When using a live camera, we recommend using the sensor’s STC (available on Gen4.1 sensors and newer).
To do so, you can use the command line option --input-camera-config
(or -j
) with a JSON file containing the STC settings
(to create such a JSON file, check the camera settings section).
If your sensor does not have an STC or if you don’t enable it, our software STC algorithm will be enabled
(see SpatioTemporalContrast class
).
You can tune the threshold of this algorithm using the --sw-stc-threshold
option and disable it by passing the value 0 to this option.
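For example, to disable the software STC (e.g. when the hardware STC is already enabled via a camera settings file), you can run:
Linux
./metavision_generic_tracking --sw-stc-threshold 0
Windows
metavision_generic_tracking.exe --sw-stc-threshold 0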
Here is how to launch the sample with a JSON camera settings file:
Linux
./metavision_generic_tracking -j path/to/my_settings.json
Windows
metavision_generic_tracking.exe -j path\to\my_settings.json
To check for additional options:
Linux
./metavision_generic_tracking -h
Windows
metavision_generic_tracking.exe -h
Code Overview
The sample implements the following pipeline:
Note
The pipeline also allows applying a software filter to the events using the
SpatioTemporalContrastAlgorithm
class, which can be enabled and configured using the --sw-stc-threshold
command line option.
Other software filters could be used as well, such as ActivityNoiseFilter
or TrailFilter
.
Alternatively, to filter out events on a live camera, you could enable some hardware filters of the ESP blocks
of your sensor (on Gen4.1 and newer). For example, you could configure an STC using the --input-camera-config
option,
as explained above in the “How to start” section.
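As an illustration, here is a minimal sketch of how such a software STC filter could be instantiated before being plugged into the events callback shown further below. The header path and the constructor arguments are assumptions; check the SpatioTemporalContrastAlgorithm API reference for the exact signature:
// Assumed include path; adjust to your SDK installation.
#include <metavision/sdk/cv/algorithms/spatio_temporal_contrast_algorithm.h>

#include <memory>

// Assumed constructor arguments: sensor width, sensor height, STC threshold in us.
// In the sample, a threshold of 0 (--sw-stc-threshold 0) disables this filter.
std::unique_ptr<Metavision::SpatioTemporalContrastAlgorithm> stc_algo =
    std::make_unique<Metavision::SpatioTemporalContrastAlgorithm>(1280, 720, 10000);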
The tracking algorithm consumes CD events and produces tracking results (i.e.
Metavision::EventTrackingData
).
These tracking results contain the bounding boxes and unique IDs of the tracked objects.
The algorithm is synchronous, meaning that it will try to detect and track objects at every call. However, the tracking
results may be affected by the amount of data passed during the call. That’s why the
RollingEventBuffer
and the
EventBufferReslicer
classes are both used in the sample to
decouple the tracking frequency from the accumulation time.
Indeed, the RollingEventBuffer
class first allows creating overlapping
time-slices of events by implementing a rolling window over the event stream. Two strategies are available to define the
rolling window width: the last N events or the last N us. The second strategy is used in this sample:
using RollingEventBufferConfig = Metavision::RollingEventBufferConfig;
event_buffer_ = RollingEventBuffer(
    RollingEventBufferConfig::make_n_us(static_cast<Metavision::timestamp>(accumulation_time_us_)));
In addition to that, the EventBufferReslicer
class is used to control
the frequency at which the algorithm is called. To do so, the slicer is configured to re-slice the events at the
expected frequency:
const auto update_period = static_cast<Metavision::timestamp>(1'000'000.f / update_frequency_);
const auto cond = Metavision::EventBufferReslicerAlgorithm::Condition::make_n_us(update_period);
slicer_ = std::make_unique<Metavision::EventBufferReslicerAlgorithm>(
    [&](Metavision::EventBufferReslicerAlgorithm::ConditionStatus status, Metavision::timestamp ts,
        std::size_t n_events) { tracker_callback(status, ts, n_events); },
    cond);
Events produced by the camera are first filtered and then pushed to the slicer, which pushes them into the rolling buffer:
camera_->cd().add_callback([this](const Metavision::EventCD *begin, const Metavision::EventCD *end) {
    const Metavision::EventCD *begin_it = begin;
    const Metavision::EventCD *end_it = end;
    filtered_events_.clear();
    if (stc_algo_) {
        filtered_events_.resize(std::distance(begin_it, end_it));
        const auto last = stc_algo_->process_events(begin_it, end_it, filtered_events_.begin());
        const auto size = std::distance(filtered_events_.begin(), last);
        begin_it = filtered_events_.data();
        end_it = begin_it + size;
    }
    slicer_->process_events(begin_it, end_it, [&](const auto sub_slice_begin_it, const auto sub_slice_end_it) {
        event_buffer_.insert_events(sub_slice_begin_it, sub_slice_end_it);
    });
});
When the end of a slice is detected, the tracking callback is executed:
void Pipeline::tracker_callback(Metavision::EventBufferReslicerAlgorithm::ConditionStatus, Metavision::timestamp ts,
                                std::size_t) {
    tracked_objects_.clear();
    tracker_->process_events(event_buffer_.cbegin(), event_buffer_.cend(), std::back_inserter(tracked_objects_));
    for (const auto &obj : tracked_objects_)
        tracked_object_ids_.insert(obj.object_id_);
    if (display_) {
        Metavision::BaseFrameGenerationAlgorithm::generate_frame_from_events(event_buffer_.cbegin(),
                                                                             event_buffer_.cend(), back_img_);
        Metavision::draw_tracking_results(ts, tracked_objects_.cbegin(), tracked_objects_.cend(), back_img_);
        if (video_writer_)
            video_writer_->write_frame(ts, back_img_);
        if (window_)
            window_->show(back_img_);
    }
}
In this callback, the tracking algorithm is applied to the events accumulated in the rolling buffer, and the tracking results are immediately displayed if visualization is enabled. The following image shows an example of output:
Note
Both the tracking frequency and the accumulation time are configurable with command line parameters.
Algorithms Overview
The generic tracking algorithm consists of 4 main parts:
Cluster making
Data association
Tracker initialization
Tracking
The tracking algorithm can be configured via Metavision::TrackingConfig
.
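To give an idea of how these pieces fit together, here is a minimal sketch of instantiating the tracker with a default configuration and processing a slice of events. The include paths and the constructor arguments (sensor geometry followed by a TrackingConfig) are assumptions; the process_events call matches the one used in the sample above:
// Hypothetical include paths; adjust them to your SDK installation.
#include <metavision/sdk/analytics/algorithms/tracking_algorithm.h>
#include <metavision/sdk/base/events/event_cd.h>

#include <iterator>
#include <vector>

void track_slice(const std::vector<Metavision::EventCD> &events, int sensor_width, int sensor_height) {
    // Default configuration: SIMPLE_GRID clustering, IOU association, SIMPLE motion model, CLUSTERKF tracker.
    Metavision::TrackingConfig config;

    // Assumed constructor signature: sensor geometry followed by the configuration.
    Metavision::TrackingAlgorithm tracker(sensor_width, sensor_height, config);

    // Each tracking result carries a bounding box and a unique object ID.
    std::vector<Metavision::EventTrackingData> tracked_objects;
    tracker.process_events(events.cbegin(), events.cend(), std::back_inserter(tracked_objects));
}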
1. Cluster making
In the cluster making part, clusters are built from input events.
Two clustering methods are implemented:
Metavision::TrackingConfig::ClusterMaker::SIMPLE_GRID - builds clusters based on a regular grid (default):
The camera FOV is divided into elementary cells using a regular grid; the size of the cells is defined by Metavision::TrackingConfig::cell_width_ x Metavision::TrackingConfig::cell_height_.
For each cell, the number of events received during a given time-slice (Metavision::TrackingConfig::cell_deltat_) is compared to the activation threshold (Metavision::TrackingConfig::activation_threshold_); if it exceeds the threshold, the cell is considered active.
Active cells are connected into clusters.
Metavision::TrackingConfig::ClusterMaker::MEDOID_SHIFT - builds clusters from events based on spatial and temporal distances between neighboring events. If the spatial and temporal distances between an event and its neighboring event are smaller than Metavision::TrackingConfig::medoid_shift_spatial_dist_ and Metavision::TrackingConfig::medoid_shift_temporal_dist_, the event is added to the same cluster as its neighbor; otherwise, it creates a new cluster.
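As a sketch, the grid-based clustering could be tuned like this. The cell_width_, cell_height_, cell_deltat_ and activation_threshold_ members are the ones listed above; the cluster_maker_ member used to select the method, as well as the values, are assumptions:
Metavision::TrackingConfig config;
config.cluster_maker_        = Metavision::TrackingConfig::ClusterMaker::SIMPLE_GRID; // assumed member name
config.cell_width_           = 10;    // cell width in pixels
config.cell_height_          = 10;    // cell height in pixels
config.cell_deltat_          = 1000;  // time-slice in us over which events are counted per cell
config.activation_threshold_ = 5;     // minimum event count for a cell to be considered active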
2. Data association
In the data association part, clusters are associated to trackers.
The following data association methods are implemented:
Metavision::TrackingConfig::DataAssociation::NEAREST - associates a cluster to the nearest tracker.
Metavision::TrackingConfig::DataAssociation::IOU - associates a cluster to the tracker with the largest IOU (Intersection Over Union) area between the tracker and the cluster (default). If no IOU association is available but the distance between the tracker and the cluster is smaller than Metavision::TrackingConfig::iou_max_dist_, the cluster is associated using the nearest criterion. The IOU criterion has priority over the distance criterion.
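For example, keeping the default IOU association but adjusting the fallback distance could look like this (the data_association_ member name and the value are assumptions; iou_max_dist_ is listed above):
Metavision::TrackingConfig config;
config.data_association_ = Metavision::TrackingConfig::DataAssociation::IOU; // assumed member name
config.iou_max_dist_     = 150; // max tracker-to-cluster distance in pixels for the nearest fallback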
3. Tracker initialization
In the tracker initialization part, new trackers are initialized, and bounding box proposals are made from input clusters/events.
One of the following motion models is used to predict the position of the tracker:
Metavision::TrackingConfig::MotionModel::SIMPLE - assumes that the velocity is constant (default).
Metavision::TrackingConfig::MotionModel::INSTANT - takes the last measured velocity.
Metavision::TrackingConfig::MotionModel::SMOOTH - models the velocity as a smoothly evolving quantity.
Metavision::TrackingConfig::MotionModel::KALMAN - models the velocity with a Kalman filter.
4. Tracking
Two types of trackers are implemented:
Metavision::TrackingConfig::Tracker::ELLIPSE - based on an event-by-event update of the pose and the shape of the tracker. The tracker is represented as a Gaussian, and the update is performed by weighting each event's contribution with Metavision::TrackingConfig::EllipseUpdateFunction and updating the tracker's pose/size using Metavision::TrackingConfig::EllipseUpdateMethod.
Metavision::TrackingConfig::Tracker::CLUSTERKF - Kalman tracker (default); uses the result of the clustering as an observation to predict the current state of the tracker.
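Similarly, the motion model and the tracker type could be selected as follows (the motion_model_ and tracker_ member names and the chosen values are assumptions; the enum values are the ones listed above):
Metavision::TrackingConfig config;
config.motion_model_ = Metavision::TrackingConfig::MotionModel::SMOOTH; // assumed member name; smooth velocity evolution
config.tracker_      = Metavision::TrackingConfig::Tracker::CLUSTERKF;  // assumed member name; default Kalman-based tracker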