Inference Pipeline of Optical Flow using C++

Two of our pre-trained models, together with their associated C++ samples, are provided to demonstrate Optical Flow inference using the Metavision SDK inference Model class:

  • model_flow.ptjit is a Torchjit model which takes single event cubes as input.

  • model_flow.onnx is an ONNX model which takes two successive histograms as input.

The source code of these samples can be found respectively in <install-prefix>/share/metavision/sdk/ml/cpp_samples/metavision_optical_flow and <install-prefix>/share/metavision/sdk/ml/cpp_samples/metavision_optical_flow_dual_input when installing Metavision SDK from the installer or packages. For other deployment methods, check the page Path of Samples.

Warning

The SDK is delivered as pre-compiled binaries with support for Torch models. Support for ONNX models can be enabled by compiling the SDK from source with the options -DUSE_ONNXRUNTIME=ON and -DONNXRUNTIME_DIR=<ONNX_FOLDER>.
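For illustration only, these options would be appended to your usual SDK source configuration command; all other required options are omitted here and the source directory below is a placeholder to adapt to your own build setup:

cmake <metavision-sdk-source-dir> -DUSE_ONNXRUNTIME=ON -DONNXRUNTIME_DIR=<ONNX_FOLDER>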

Expected Output

These samples work as follows:

  • Read events from a camera or a recording

  • Preprocess them into the dedicated input format (event cubes or histograms here), stored in a Tensor structure, using an event preprocessor

  • Feed these inputs to the Model instance

  • Retrieve the optical flow from the output tensor (see the sketch after this list)

  • Display it
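As a rough illustration of the inference step only (this is not the sample's actual source code, which relies on the SDK Model class and event preprocessor), the Torchjit model can also be exercised directly with libtorch. The model path, the input shape and the single-tensor signature below are assumptions made for the sake of the sketch:

// Minimal libtorch sketch: load the Torchjit flow model and run one forward
// pass on an already preprocessed tensor. The input shape and the assumption
// that the model takes a single tensor are illustrative; adapt them to the
// actual preprocessing configuration.
#include <torch/torch.h>
#include <torch/script.h>
#include <iostream>

int main() {
    // Load the TorchScript model shipped with the SDK
    torch::jit::script::Module model = torch::jit::load("/path/to/model_flow.ptjit");
    model.eval();

    // Placeholder event-cube input: [batch, channels, height, width]
    torch::Tensor event_cube = torch::zeros({1, 10, 480, 640});

    // Forward pass without gradient tracking
    torch::NoGradGuard no_grad;
    torch::Tensor flow = model.forward({event_cube}).toTensor();

    // The output tensor holds the predicted flow components; print its shape
    std::cout << "Flow tensor shape: " << flow.sizes() << std::endl;
    return 0;
}

Compiling such a standalone file requires the same libtorch setup as the sample itself (see the CMake commands in the "How to start" section below).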

An example of the output is shown below:


Warning

Disclaimer: The provided models are neither expected to provide state-of-the-art results nor supposed to be computationally efficient. They provide a basis for ML exploration with our event-based technology. In particular, with default parameters and depending on the sensor used, they might not run live with your camera and ML pipeline.

Setup & requirements

To run the sample, you will need:

Note

Since an HDF5 tensor file contains preprocessed features, you need to make sure that the same preprocessing method is used for the flow model and the HDF5 file. For instance, our trained flow model uses the event_cube method, so if you want to use HDF5 tensor files as input, they need to be generated with the event_cube preprocessing as well.
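For instance, assuming HDF5 tensor files are passed through the same -i option as RAW recordings, a run on preprocessed event cubes could look like this (the file path is a placeholder):

metavision_optical_flow -m /path/to/model_flow.ptjit -i /path/to/event_cubes.hdf5 --display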

How to start

First, you need to compile the sample. Here, we assume you followed the Machine Learning Module Dependencies section of the installation guide, which requires deploying libtorch in a LIBTORCH_DIR_PATH directory. If so, use the following CMake commands to compile:

cmake .. -DCMAKE_PREFIX_PATH=<LIBTORCH_DIR_PATH> -DCMAKE_BUILD_TYPE=Release
cmake --build . --config Release

To start the sample on recorded data, you need to provide the full path to the input file and the path to the pre-trained model. For example:

Linux

metavision_optical_flow -m /path/to/model_flow.ptjit -i pedestrians.raw --display

Windows

metavision_optical_flow.exe -m /path/to/model_flow.ptjit -i pedestrians.raw --display

Note

  1. To read directly from a camera, you don’t need the -i option

  2. Use -o /path/to/output.avi to generate an AVI video
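For instance, to process a recording and save the visualization to a video file (paths are placeholders):

metavision_optical_flow -m /path/to/model_flow.ptjit -i pedestrians.raw -o /path/to/output.avi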

Warning

To make the best use of the model, it is important to set the accumulation time interval (--delta-t) so that it matches the speed of motion in the scene. This is especially important for fast-moving objects: too long an accumulation time would produce erroneous results.
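For example, a shorter accumulation time can be requested for a scene with fast motion; the value below is purely illustrative and should be tuned to your recording:

metavision_optical_flow -m /path/to/model_flow.ptjit -i pedestrians.raw --delta-t 5000 --display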

To find the full list of options, run:

Linux

metavision_optical_flow -h

Windows

metavision_optical_flow.exe -h