Note
This C++ sample has a corresponding Python sample.
Generic Tracking using C++
The Analytics API provides the Metavision::TrackingAlgorithm
class for generic object tracking.
The sample metavision_generic_tracking
shows how to use Metavision::TrackingAlgorithm
to track objects.
Note that the Analytics API also provides a lighter implementation of Metavision::TrackingAlgorithm
restricted to non-colliding objects: Metavision::SpatterTrackerAlgorithm,
which is demonstrated in the
Spatter Tracking sample.
The source code of this sample can be found in
<install-prefix>/share/metavision/sdk/analytics/cpp_samples/metavision_generic_tracking
when installing Metavision SDK
from installer or packages. For other deployment methods, check the page
Path of Samples.
Expected Output
The Metavision Generic Tracking sample visualizes all events and the tracked objects, drawing a bounding box around each tracked object with its ID shown next to the box:
Setup & requirements
By default, Metavision Generic Tracking looks for objects of size between 10x10 and 300x300 pixels.
Use the command line options --min-size
and --max-size
to adapt the sample to your scene.
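For example, to ignore smaller objects and allow larger ones, you could pass different bounds (the values below are only illustrative; each option is assumed to take a single size in pixels applied to both dimensions):
Linux
./metavision_generic_tracking --min-size 30 --max-size 500
Windows
metavision_generic_tracking.exe --min-size 30 --max-size 500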
How to start
You can directly execute the pre-compiled binary installed with Metavision SDK or compile the source code as described in this tutorial.
To start the sample based on recorded data, provide the full path to a RAW file (here, we use a file from our Sample Recordings):
Linux
./metavision_generic_tracking -i traffic_monitoring.raw
Windows
metavision_generic_tracking.exe -i traffic_monitoring.raw
To start the sample based on the live stream from your camera, run:
Linux
./metavision_generic_tracking
Windows
metavision_generic_tracking.exe
When using a live camera, we recommend using the sensor’s STC (available on Gen4.1 sensors and newer).
To do so, you can use the command line option --input-camera-config
(or -j
) with a JSON file containing the STC settings
(to create such a JSON file, check the camera settings section).
If your sensor does not have an STC or if you don’t enable it, our software STC algorithm will be enabled
(see SpatioTemporalContrast class
).
You can tune the threshold of this algorithm using the --sw-stc-threshold
option and disable it by passing the value 0 to this option.
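For example, to disable the software STC (e.g. when the hardware STC is already enabled via a camera settings file), you can run:
Linux
./metavision_generic_tracking --sw-stc-threshold 0
Windows
metavision_generic_tracking.exe --sw-stc-threshold 0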
Here is how to launch the sample with a JSON camera settings file:
Linux
./metavision_generic_tracking -j path/to/my_settings.json
Windows
metavision_generic_tracking.exe -j path\to\my_settings.json
To check for additional options:
Linux
./metavision_generic_tracking -h
Windows
metavision_generic_tracking.exe -h
Code Overview
The sample implements the following pipeline:
Note
The pipeline also allows applying a software filter to the events using the
SpatioTemporalContrastAlgorithm
class, which can be enabled and configured using the --sw-stc-threshold
command line option.
Other software filters could be used as well, such as ActivityNoiseFilter
or TrailFilter
.
Alternatively, to filter out events on a live camera, you could enable some hardware filters of the ESP blocks
of your sensor (on Gen4.1 and newer). For example, you could configure an STC using the --input-camera-config
option,
as explained above in the “How to start” section.
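As an illustration, here is a minimal sketch of how such a software STC filter could be instantiated before being plugged into the events callback shown further below. The header path and the constructor arguments are assumptions; check the SpatioTemporalContrastAlgorithm API reference for the exact signature:
// Assumed include path; adjust to your SDK installation.
#include <metavision/sdk/cv/algorithms/spatio_temporal_contrast_algorithm.h>

#include <memory>

// Assumed constructor arguments: sensor width, sensor height, STC threshold in us.
// In the sample, a threshold of 0 (--sw-stc-threshold 0) disables this filter.
std::unique_ptr<Metavision::SpatioTemporalContrastAlgorithm> stc_algo =
    std::make_unique<Metavision::SpatioTemporalContrastAlgorithm>(1280, 720, 10000);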
The tracking algorithm consumes CD events and produces tracking results (i.e.
Metavision::EventTrackingData
).
These tracking results contain the bounding boxes and unique IDs of the tracked objects.
The algorithm is synchronous, meaning that it will try to detect and track objects at every call. However, the tracking
results may be affected by the amount of data passed during the call. That’s why the
RollingEventBuffer
and the
EventBufferReslicer
classes are both used in the sample to
decouple the tracking frequency from the accumulation time.
Indeed, the RollingEventBuffer
class first allows creating overlapping
time-slices of events by implementing a rolling window over the event stream. Two strategies are available to define the
rolling window width: the last N events or the last N us. The second strategy is used in this sample:
using RollingEventBufferConfig = Metavision::RollingEventBufferConfig;
event_buffer_ = RollingEventBuffer(
    RollingEventBufferConfig::make_n_us(static_cast<Metavision::timestamp>(accumulation_time_us_)));
In addition to that, the EventBufferReslicer
class is used to control
the frequency at which the algorithm is called. To do so, the slicer is configured to re-slice the events at the
expected frequency:
const auto update_period = static_cast<Metavision::timestamp>(1'000'000.f / update_frequency_);
const auto cond = Metavision::EventBufferReslicerAlgorithm::Condition::make_n_us(update_period);
slicer_ = std::make_unique<Metavision::EventBufferReslicerAlgorithm>(
    [&](Metavision::EventBufferReslicerAlgorithm::ConditionStatus status, Metavision::timestamp ts,
        std::size_t n_events) { tracker_callback(status, ts, n_events); },
    cond);
Events produced by the camera are first filtered and then pushed to the slicer, which pushes them into the rolling buffer:
camera_->cd().add_callback([this](const Metavision::EventCD *begin, const Metavision::EventCD *end) {
    const Metavision::EventCD *begin_it = begin;
    const Metavision::EventCD *end_it = end;
    filtered_events_.clear();
    if (stc_algo_) {
        filtered_events_.resize(std::distance(begin_it, end_it));
        const auto last = stc_algo_->process_events(begin_it, end_it, filtered_events_.begin());
        const auto size = std::distance(filtered_events_.begin(), last);
        begin_it = filtered_events_.data();
        end_it = begin_it + size;
    }
    slicer_->process_events(begin_it, end_it, [&](const auto sub_slice_begin_it, const auto sub_slice_end_it) {
        event_buffer_.insert_events(sub_slice_begin_it, sub_slice_end_it);
    });
});
When the end of a slice is detected, the tracking callback is executed:
void Pipeline::tracker_callback(Metavision::EventBufferReslicerAlgorithm::ConditionStatus, Metavision::timestamp ts,
                                std::size_t) {
    tracked_objects_.clear();
    tracker_->process_events(event_buffer_.cbegin(), event_buffer_.cend(), std::back_inserter(tracked_objects_));
    for (const auto &obj : tracked_objects_)
        tracked_object_ids_.insert(obj.object_id_);
    if (display_) {
        Metavision::BaseFrameGenerationAlgorithm::generate_frame_from_events(event_buffer_.cbegin(),
                                                                             event_buffer_.cend(), back_img_);
        Metavision::draw_tracking_results(ts, tracked_objects_.cbegin(), tracked_objects_.cend(), back_img_);
        if (video_writer_)
            video_writer_->write_frame(ts, back_img_);
        if (window_)
            window_->show(back_img_);
    }
}
In this callback, the tracking algorithm is applied to the events accumulated in the rolling buffer, and the tracking results are immediately displayed if visualization is enabled. The following image shows an example of output:
Note
Both the tracking frequency and the accumulation time are configurable with command line parameters.
Algorithms Overview
The generic tracking algorithm consists of 4 main parts:
Cluster making
Data association
Tracker initialization
Tracking
The tracking algorithm can be configured via Metavision::TrackingConfig
.
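To give an idea of how these pieces fit together, here is a minimal sketch of instantiating the tracker with a default configuration and processing a slice of events. The include paths and the constructor arguments (sensor geometry followed by a TrackingConfig) are assumptions; the process_events call matches the one used in the sample above:
// Hypothetical include paths; adjust them to your SDK installation.
#include <metavision/sdk/analytics/algorithms/tracking_algorithm.h>
#include <metavision/sdk/base/events/event_cd.h>

#include <iterator>
#include <vector>

void track_slice(const std::vector<Metavision::EventCD> &events, int sensor_width, int sensor_height) {
    // Default configuration: SIMPLE_GRID clustering, IOU association, SIMPLE motion model, CLUSTERKF tracker.
    Metavision::TrackingConfig config;

    // Assumed constructor signature: sensor geometry followed by the configuration.
    Metavision::TrackingAlgorithm tracker(sensor_width, sensor_height, config);

    // Each tracking result carries a bounding box and a unique object ID.
    std::vector<Metavision::EventTrackingData> tracked_objects;
    tracker.process_events(events.cbegin(), events.cend(), std::back_inserter(tracked_objects));
}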
1. Cluster making
In the cluster making part, clusters are built from input events.
Two clustering methods are implemented:
Metavision::TrackingConfig::ClusterMaker::SIMPLE_GRID - builds clusters based on a regular grid (default):
The camera FOV is divided into elementary cells using a regular grid; the size of the cells is defined by Metavision::TrackingConfig::cell_width_ x Metavision::TrackingConfig::cell_height_.
For each cell, the number of events received during a given time-slice (Metavision::TrackingConfig::cell_deltat_) is compared to the activation threshold (Metavision::TrackingConfig::activation_threshold_); if it exceeds the threshold, the cell is considered active.
Active cells are connected into clusters.
Metavision::TrackingConfig::ClusterMaker::MEDOID_SHIFT - builds clusters from events based on spatial and temporal distances between neighboring events. If the spatial and temporal distances between an event and its neighboring event are smaller than Metavision::TrackingConfig::medoid_shift_spatial_dist_ and Metavision::TrackingConfig::medoid_shift_temporal_dist_, the event is added to the same cluster as its neighbor; otherwise, it creates a new cluster.
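As a sketch, the grid-based clustering could be tuned like this. The cell_width_, cell_height_, cell_deltat_ and activation_threshold_ members are the ones listed above; the cluster_maker_ member used to select the method, as well as the values, are assumptions:
Metavision::TrackingConfig config;
config.cluster_maker_        = Metavision::TrackingConfig::ClusterMaker::SIMPLE_GRID; // assumed member name
config.cell_width_           = 10;    // cell width in pixels
config.cell_height_          = 10;    // cell height in pixels
config.cell_deltat_          = 1000;  // time-slice in us over which events are counted per cell
config.activation_threshold_ = 5;     // minimum event count for a cell to be considered active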
2. Data association
In the data association part, clusters are associated to trackers.
The following data association methods are implemented:
Metavision::TrackingConfig::DataAssociation::NEAREST - associates a cluster to the nearest tracker.
Metavision::TrackingConfig::DataAssociation::IOU - associates a cluster to the tracker with the largest IOU (Intersection Over Union) area between the tracker and the cluster (default). If no IOU association is available but the distance between the tracker and the cluster is smaller than Metavision::TrackingConfig::iou_max_dist_, the cluster is associated using the nearest criterion. The IOU criterion has priority over the distance criterion.
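For example, keeping the default IOU association but adjusting the fallback distance could look like this (the data_association_ member name and the value are assumptions; iou_max_dist_ is listed above):
Metavision::TrackingConfig config;
config.data_association_ = Metavision::TrackingConfig::DataAssociation::IOU; // assumed member name
config.iou_max_dist_     = 150; // max tracker-to-cluster distance in pixels for the nearest fallback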
3. Tracker initialization
In the tracker initialization part, new trackers are initialized, and bounding box proposals are made from input clusters/events.
One of the following motion models is used to predict the position of the tracker:
Metavision::TrackingConfig::MotionModel::SIMPLE - assumes that the velocity is constant (default).
Metavision::TrackingConfig::MotionModel::INSTANT - takes the last measured velocity.
Metavision::TrackingConfig::MotionModel::SMOOTH - models the velocity as a smoothly evolving quantity.
Metavision::TrackingConfig::MotionModel::KALMAN - models the velocity with a Kalman filter.
4. Tracking
Two types of trackers are implemented:
Metavision::TrackingConfig::Tracker::ELLIPSE - based on an event-by-event update of the pose and the shape of the tracker. The tracker is represented as a Gaussian, and the update is performed by weighting each event's contribution with Metavision::TrackingConfig::EllipseUpdateFunction and updating the tracker's pose/size using Metavision::TrackingConfig::EllipseUpdateMethod.
Metavision::TrackingConfig::Tracker::CLUSTERKF - Kalman tracker (default); uses the result of the clustering as an observation to predict the current state of the tracker.
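Similarly, the motion model and the tracker type could be selected as follows (the motion_model_ and tracker_ member names and the chosen values are assumptions; the enum values are the ones listed above):
Metavision::TrackingConfig config;
config.motion_model_ = Metavision::TrackingConfig::MotionModel::SMOOTH; // assumed member name; smooth velocity evolution
config.tracker_      = Metavision::TrackingConfig::Tracker::CLUSTERKF;  // assumed member name; default Kalman-based tracker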