Inference Pipeline of Object Classification

The script allows you to quickly set up an inference pipeline for object classification.

The source code of this sample can be found in <install-prefix>/share/metavision/sdk/ml/python_samples/classification_inference when installing Metavision SDK from the installer or packages. For other deployment methods, check the Path of Samples page.

Expected Output

The script takes an event stream as input and generates a sequence of predictions.
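Conceptually, the pipeline accumulates events over fixed time slices, converts each slice to a tensor, and feeds it to the classifier. The sketch below illustrates this loop; event_slices and events_to_tensor are hypothetical stand-ins for what the sample does internally, not part of its actual API.

import torch

def classify(model_path, event_slices, events_to_tensor):
    """Run a TorchScript classifier over successive event slices.

    `event_slices` yields the events of each accumulation interval;
    `events_to_tensor` converts one slice into the model's input tensor.
    Both are hypothetical placeholders for the sample's internals.
    """
    model = torch.jit.load(model_path)
    model.eval()
    predictions = []
    with torch.no_grad():
        for events in event_slices:
            scores = model(events_to_tensor(events))  # forward pass on one slice
            predictions.append(int(scores.argmax(dim=-1)))
    return predictions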

The demo below shows a live Rock-Paper-Scissors game based on our inference script:

Setup & requirements

To run the script, you will need:

  • a pre-trained TorchScript model with a JSON file of hyperparameters (e.g. convRNN_chifoumi.zip from our pre-trained models); see the loading sketch after this list

  • an event-based camera or a RAW, DAT or HDF5 input file.
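The archive bundles the TorchScript model together with a JSON file of hyperparameters. A minimal sketch of inspecting both is shown below; the file names inside the archive are assumptions, so check the extracted files for the actual ones.

import json
import torch

model_dir = "/path/to/extracted/convRNN_chifoumi"  # unzip the archive first

# File names below are assumptions; check the extracted archive for the
# actual TorchScript and JSON file names.
model = torch.jit.load(model_dir + "/model.ptjit")
with open(model_dir + "/model.json") as f:
    hparams = json.load(f)

print(hparams)  # e.g. input resolution, accumulation interval, class labels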

Note

The model might be sensitive to the input frame resolution. If you downsampled the tensor input during training, pass the same tensor resolution during inference as well, using the --height-width argument.
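As an illustration of why the resolutions must match, the snippet below resizes a dummy event-frame tensor to a hypothetical training resolution with torch.nn.functional.interpolate; all shapes are made up, and this only mirrors in spirit what the --height-width argument achieves.

import torch
import torch.nn.functional as F

# Dummy event frame at a hypothetical sensor resolution of 1280x720,
# with 2 polarity channels; all values are illustrative.
sensor_frame = torch.rand(1, 2, 720, 1280)  # (batch, channels, height, width)

train_h, train_w = 360, 640  # hypothetical resolution used at training time
resized = F.interpolate(sensor_frame, size=(train_h, train_w),
                        mode="bilinear", align_corners=False)
print(resized.shape)  # torch.Size([1, 2, 360, 640])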

How to start

To run the script on recorded data, provide the full path to the pre-trained model and the path to the input file. Leave the file path empty if you want to use a live camera. For example:

Linux

python3 classification_inference.py /path/to/model -p "gesture_a.raw"

Windows

python classification_inference.py /path/to/model -p "gesture_a.raw"
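If you want to consume the same inputs from your own Python code, the Metavision SDK provides EventsIterator in metavision_core. The sketch below reads a recording in fixed 10 ms slices; passing an empty input path opens a live camera instead.

from metavision_core.event_io import EventsIterator

# Pass a RAW/DAT/HDF5 path; an empty string opens the first available camera.
mv_iterator = EventsIterator(input_path="gesture_a.raw", delta_t=10000)

for events in mv_iterator:
    # `events` is a numpy structured array with fields x, y, p, t
    print(f"{len(events)} events in this 10 ms slice")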

Note

  1. To read directly from a camera, provide the camera's serial number if several cameras are connected; otherwise, leave it blank.

  2. Use -w /path/to/output to generate an MP4 video, as shown in the example below.
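For example, to run on a recording and also write the visualization to an MP4 file (the output path is illustrative):

Linux

python3 classification_inference.py /path/to/model -p "gesture_a.raw" -w /path/to/output

Windows

python classification_inference.py /path/to/model -p "gesture_a.raw" -w /path/to/output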

Warning

Normally, you should set the accumulation time interval (--delta-t) to the same value as the one used during training. But if bandwidth constraints make live execution difficult, you can increase the value, at a potential loss of accuracy.
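To see the bandwidth trade-off in numbers, the toy arithmetic below compares model invocation rates for two accumulation intervals; both interval values are illustrative, not recommendations.

# Toy arithmetic for the --delta-t trade-off.
train_delta_t_us = 10_000  # accumulation interval used at training time
live_delta_t_us = 20_000   # doubled interval for a bandwidth-constrained live run

print(1e6 / train_delta_t_us)  # 100.0 model calls per second of event data
print(1e6 / live_delta_t_us)   # 50.0 model calls per second of event data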

To find the full list of options, run:

Linux

python3 classification_inference.py -h

Windows

python classification_inference.py -h