Thumbnail collection generator

Register name used to load filter: thumbs
This is a JavaScript filter. It is not checked during graph resolution and must be loaded explicitly.
Author: GPAC team

This filter generates screenshots from a video stream.

The input video is down-sampled by the scale factor. The output size is configured based on the number of images per line and per column in the grid.
Once configured, the output size is no longer modified.

The snap option selects one video frame every given number of seconds. If the value is 0, all input frames are used.

If the number of rows is 0, it will be computed based on the source duration and desired snap time, and will default to 10 if it cannot be resolved.
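
The sizing rules above can be sketched in a few lines of Python. This is an illustration only, not the filter's code: the function names are hypothetical, the row estimate (one thumbnail per snap interval, rounded up to full rows) is an assumption about how the duration is used, and the separator line width (lw) is ignored.

```python
import math

def resolve_rows(cols, rows, duration_s, snap_s, fallback_rows=10):
    """Return the number of grid rows, estimating it when rows is 0."""
    if rows > 0:
        return rows
    # Assumption: the estimate needs a known duration and a positive snap time;
    # otherwise fall back to 10 rows, as the text describes.
    if duration_s is None or duration_s <= 0 or snap_s <= 0:
        return fallback_rows
    thumbnails = math.ceil(duration_s / snap_s)
    return max(1, math.ceil(thumbnails / cols))

def output_size(src_w, src_h, scale, cols, rows):
    """Return (width, height) of the grid image, ignoring separator lines."""
    thumb_w = int(src_w / scale)  # input is down-sampled by the scale factor
    thumb_h = int(src_h / scale)
    return cols * thumb_w, rows * thumb_h

# One-hour source, one thumbnail every 30 seconds, 6 thumbnails per line
rows = resolve_rows(cols=6, rows=0, duration_s=3600, snap_s=30)
print(rows, output_size(1920, 1080, 10, 6, rows))
```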

To output one image per input frame, use :grid=1x1.

If a single image per output frame is used, the default value for snap is 0 and for scale is 1.
Otherwise, the default value for snap is 1 second and for scale is 10.
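
As a rough sketch of these defaults, assuming that a negative option value means "not set" (the option list gives -1 as the default for both snap and scale), the resolution could look like this. The function name is hypothetical.

```python
def resolve_defaults(cols, rows, snap=-1.0, scale=-1.0):
    """Return the effective (snap, scale) pair for a given grid."""
    single_image = (cols == 1 and rows == 1)
    if snap < 0:   # assumption: negative means the user did not set snap
        snap = 0.0 if single_image else 1.0
    if scale < 0:  # assumption: negative means the user did not set scale
        scale = 1.0 if single_image else 10.0
    return snap, scale

print(resolve_defaults(1, 1))  # grid=1x1: every frame, no down-sampling
print(resolve_defaults(6, 0))  # default grid: 1 second, scale 10
```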

A single line of text can be inserted over each frame. Predefined keywords can be used in input text, identified as $KEYWORD$:
ts: replaced by packet timestamp
timescale: replaced by PID timescale
time: replaced by packet time as
cpu: replaced by current CPU usage of process
mem: replaced by current memory usage of process
version: replaced by GPAC version
fversion: replaced by GPAC full version
mae: replaced by Mean Absolute Error with previous frame
mse: replaced by Mean Square Error with previous frame
P4CC, PropName: replaced by corresponding PID property
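
The $KEYWORD$ substitution can be pictured with the following Python sketch. It is not the filter's implementation: the function name and the sample values are hypothetical, and leaving unknown keywords untouched is an assumption.

```python
import re

def expand_keywords(text, values):
    """Replace each $KEYWORD$ in text with its value from the given mapping."""
    def substitute(match):
        key = match.group(1)
        # Assumption: keywords without a known value are left as written
        return str(values[key]) if key in values else match.group(0)
    return re.sub(r"\$([A-Za-z0-9_]+)\$", substitute, text)

# Hypothetical per-packet values
packet_values = {"ts": 180000, "timescale": 90000}
print(expand_keywords("ts=$ts$ scale=$timescale$", packet_values))
```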


gpac -i src reframer:saps=1 thumbs:snap=30:grid=6x30 -o dump/$num$.png
This will generate images from key-frames only, inserting one image every 30 seconds. Using key-frame filtering is much faster but may give unexpected results if there are not enough key-frames in the source.


gpac -i src thumbs:snap=0:grid=5x5 -o dump/$num$.png
This will generate one image containing 25 frames every second at 25 fps.
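
The arithmetic behind this example is simply the number of grid cells divided by the frame rate. A small sketch with a hypothetical helper name:

```python
def seconds_per_image(cols, rows, fps):
    """Duration of source video covered by one grid image when snap is 0."""
    return (cols * rows) / fps

print(seconds_per_image(5, 5, 25))  # 25 frames per image at 25 fps
print(seconds_per_image(5, 5, 50))  # the same grid fills twice as fast at 50 fps
```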

If a single image per output frame is used and the scaling factor is 1, the input packet is reused for the output, with text and graphics overlaid on it.


gpac -i src thumbs:grid=1x1:txt='Frame $time$' -o dump/$num$.png
This will inject text over each frame and keep timing and other packet properties.

A JSON output file can be specified with the list option to let applications retrieve the position of a frame in the output image from its timing.

Scene change detection

The filter can compute the absolute and/or square error metrics between consecutive images and drop an image if the computed metric is less than the given threshold.
If both mae and mse thresholds are 0, scene detection is not performed (default).
If both mae and mse thresholds are not 0, the frame is added if it passes both thresholds.

For both metrics, a value of 0 means all pixels are the same, a value of 100 means all pixels have 100% intensity difference (e.g. black versus white).
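
One normalization that matches this 0 to 100 scale is sketched below. It assumes 8-bit samples and divides by the maximum possible difference. The filter's exact computation (color planes, weighting) is not described here, so treat this as an illustration with hypothetical function names.

```python
def mae_percent(prev, cur, max_value=255):
    """Mean Absolute Error between two equally sized sample lists, in 0..100."""
    total = sum(abs(a - b) for a, b in zip(prev, cur))
    return 100.0 * total / (len(prev) * max_value)

def mse_percent(prev, cur, max_value=255):
    """Mean Square Error between two equally sized sample lists, in 0..100."""
    total = sum((a - b) ** 2 for a, b in zip(prev, cur))
    return 100.0 * total / (len(prev) * max_value ** 2)

black = [0, 0, 0, 0]
white = [255, 255, 255, 255]
print(mae_percent(black, white), mse_percent(black, white))  # both at the maximum
print(mae_percent(black, black), mse_percent(black, black))  # identical images
```

Under this normalization, a uniform partial change gives a smaller mse than mae because the fractional difference is squared.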

The scene detection is performed after the snap filtering and uses:
- the previous frame in the stream, whether it was added or not, if scref is not set,
- the last added frame otherwise.

Typical thresholds for scene cut detection are 14 to 20 for mae and 5 to 7 for mse.

Since this is a costly process, it is recommended to use it combined with key-frames selection:


gpac -i src reframer:saps=1 thumbs:mae=15 -o dump/$num$.png

The maxsnap option can be used to force insertion after the given time if no scene cut is found.
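
The selection rules of this section (thresholds, scref and maxsnap) can be summarized with the Python sketch below. It is a simplified model, not the filter's code: the function names are hypothetical, the diff callback stands in for the metric computation, and keeping the first frame unconditionally is an assumption.

```python
def is_scene_change(mae_value, mse_value, mae_thr, mse_thr):
    """A threshold of 0 disables that metric; every enabled metric must pass."""
    if mae_thr and mae_value < mae_thr:
        return False
    if mse_thr and mse_value < mse_thr:
        return False
    return True

def select_frames(frames, diff, mae_thr=0, mse_thr=0, scref=False, maxsnap=-1.0):
    """frames: iterable of (time_s, image); diff(a, b) returns (mae, mse) in 0..100.
    Returns the times of the frames that would be added."""
    detect = bool(mae_thr or mse_thr)
    kept = []
    reference = None
    last_kept_time = None
    for time_s, image in frames:
        if not detect or reference is None:
            keep = True  # no detection, or first frame (assumed always kept)
        else:
            mae_value, mse_value = diff(reference, image)
            keep = is_scene_change(mae_value, mse_value, mae_thr, mse_thr)
            # maxsnap forces an insertion when no cut was found for that long
            if not keep and maxsnap > 0 and time_s - last_kept_time >= maxsnap:
                keep = True
        if keep:
            kept.append(time_s)
            last_kept_time = time_s
        # scref: compare against the last added frame, else the previous frame
        if keep or not scref:
            reference = image
    return kept

# Toy input: each "image" is one brightness value drifting slowly upward
frames = [(0, 0), (1, 10), (2, 20), (3, 30)]
toy_diff = lambda a, b: (abs(a - b), abs(a - b))
print(select_frames(frames, toy_diff, mae_thr=15))
print(select_frames(frames, toy_diff, mae_thr=15, scref=True))
```

On this slow drift, comparing against the previous frame never reaches the threshold, while comparing against the last added frame (scref) accumulates the difference until it does.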


grid (v2di, default: 6x0): number of images per line and per column
scale (dbl, default: -1): scale factor for input size
mae (uint, default: 0, minmax: 0,100): scene diff threshold using Mean Absolute Error
mse (uint, default: 0, minmax: 0,100): scene diff threshold using Mean Square Error
lw (dbl, default: 0.0): line width between images in pixels
lc (str, default: white): line color
clear (str, default: white): clear color
snap (dbl, default: -1): duration between images, 0 for all images
maxsnap (dbl, default: -1): maximum duration between two thumbnails when scene change detection is enabled
pfmt (pfmt, default: rgb): output pixel format
txt (str, default: ): text to insert per thumbnail
(str, default: white): text color
tb (str, default: black): text shadow
font (str, default: SANS): font to use
fs (dbl, default: 10): font size to use in percent of scaled height
tv (dbl, default: 0): text vertical position in percent of scaled height
thread (sint, default: -1): number of threads for software rasterizer, -1 for all available cores
blt (bool, default: true): use blit instead of software rasterizer
scref (bool, default: false): use last inserted image as reference for scene change detection
dropfirst (bool, default: false): drop first image
list (str, default: null): export json list of frame times and positions to given file
lxy (bool, default: false): add explicit x and y in json export