
ISOBMFF/QT multiplexer

Register name used to load filter: mp4mx
This filter may be automatically loaded during graph resolution.

This filter multiplexes streams to ISOBMFF (14496-12 and derived specifications) or QuickTime.

Tracks and Items

By default all input PIDs with ItemID property set are multiplexed as items, otherwise they are multiplexed as tracks.
To prevent source items from being multiplexed as items, use the -itemid option of the ISOBMFF demultiplexer.

gpac -i source.mp4:itemid=false -o file.mp4

To force non-item streams to be multiplexed as items, use #ItemID option on that PID:

gpac -i source.jpg:#ItemID=1 -o file.mp4


The store option allows controlling if the file is fragmented or not, and when not fragmented, how interleaving is done. For cases where disk requirements are tight and fragmentation cannot be used, it is recommended to use either flat or fstart modes.

The vodcache option allows controlling how DASH onDemand segments are generated:
- If set to on, file data is stored to a temporary file on disk and flushed upon completion, no padding is present.
- If set to insert, SIDX/SSIX will be injected upon completion of the file by shifting bytes in file. In this case, no padding is required but this might not be compatible with all output sinks and will take longer to write the file.
- If set to replace, SIDX/SSIX size will be estimated based on duration and DASH segment length, and padding will be used in the file before the final SIDX. If input PIDs have the property DSegs set, its value will be used as the number of segments.
The on and insert modes will produce exactly the same file, while the mode replace may inject a free box before the sidx.
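
The size estimate used by the replace mode can be pictured with a small calculation. The sketch below is not GPAC's code: it only applies the sidx box layout from ISO/IEC 14496-12 to a segment count derived from duration and segment length, or taken from DSegs when given. The function name and parameters are hypothetical.

```python
import math

def estimate_sidx_size(duration_s, segment_s, dsegs=None, version=1):
    """Estimate the byte size of one flat sidx box (illustrative only)."""
    # Number of references: DSegs if known, else duration / segment length.
    if dsegs is not None:
        nb_refs = dsegs
    else:
        nb_refs = math.ceil(duration_s / segment_s)
    # FullBox header (12) + reference_ID (4) + timescale (4)
    # + earliest_presentation_time and first_offset (4+4 in v0, 8+8 in v1)
    # + reserved (2) + reference_count (2)
    fixed = 12 + 4 + 4 + (8 if version == 0 else 16) + 2 + 2
    # Each reference entry is 12 bytes.
    return fixed + 12 * nb_refs

print(estimate_sidx_size(60, 2))
```

If the real index ends up smaller than the reserved range, the difference has to be absorbed somehow, which is consistent with the free box mentioned for the replace mode.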

Custom boxes

Custom boxes can be specified as box patches:
For movie-level patch, the boxpatch option of the filter should be used.
Per PID box patch can be specified through the PID property boxpatch.

gpac -i source:#boxpatch=myfile.xml -o mux.mp4
Per Item box patch can be specified through the PID property boxpatch.
gpac -i source:#ItemID=1:#boxpatch=myfile.xml -o mux.mp4

The box patch is applied before writing the initial moov box in fragmented mode, or when writing the complete file otherwise.
The box patch can either be a filename or the full XML string.


When tagging is enabled, the filter will watch the property CoverArt and all custom properties on incoming PID.
The built-in tag names are indicated by MP4Box -h tags.
QT tags can be specified using qtt_NAME property names, and will be added using formatting specified in MP4Box -h tags.
Other tag classes may be specified using tag_NAME property names, and will be added if itags is set to all using:
- NAME as a box 4CC if NAME is four characters long
- NAME prefixed by 0xA9 as a box 4CC if NAME is three characters long
- the CRC32 of NAME as a box 4CC otherwise
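
The three naming rules can be sketched as follows. This is an illustration under assumptions: the three-character rule is taken to win over the CRC32 rule, names are assumed to be ASCII, and zlib's CRC32 with big-endian packing stands in for whatever CRC variant and byte order the filter actually uses. The function name is hypothetical.

```python
import zlib

def tag_box_4cc(name):
    """Map the NAME part of a tag_NAME property to a 4-byte box type."""
    raw = name.encode("ascii")  # assumption: tag names are ASCII
    if len(raw) == 4:
        return raw
    if len(raw) == 3:
        # 0xA9 is the copyright-sign byte used by QT/iTunes style tags.
        return b"\xa9" + raw
    # Assumption: zlib CRC32, big-endian; GPAC may use another variant.
    return zlib.crc32(raw).to_bytes(4, "big")

print(tag_box_4cc("nam"))
```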

User data

The filter will look for the following PID properties to create user data entries:
udtab: set the track user-data box to the property value which must be a serialized box array blob
mudtab: set the movie user-data box to the property value which must be a serialized box array blob
udta_U4CC: set track user-data box entry of type U4CC to property value
mudta_U4CC: set movie user-data box entry of type U4CC to property value
tkgp_T4CC: set/remove membership to track group with type T4CC and ID given by property value. A negative value N removes from track group with ID -N


gpac -i src.mp4:#udta_tagc='My Awesome Tag' -o tag.mp4  
gpac -i src.mp4:#mudtab=data@box.bin -o tag.mp4
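
The sign convention of tkgp_T4CC can be read as a small decision rule. The helper below is a hypothetical sketch of that rule only. It assumes the type is exactly four characters and treats every non-negative value as a membership request, since the text only singles out negative values.

```python
def track_group_action(prop_name, value):
    """Return (action, group_type, group_id) for a tkgp_T4CC property."""
    prefix = "tkgp_"
    if not prop_name.startswith(prefix) or len(prop_name) != len(prefix) + 4:
        raise ValueError("expected a name of the form tkgp_T4CC")
    group_type = prop_name[len(prefix):]
    # A negative value N removes the track from the group with ID -N.
    if value < 0:
        return ("remove", group_type, -value)
    return ("add", group_type, value)

print(track_group_action("tkgp_msrc", -3))
```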

Custom sample group descriptions and sample auxiliary info

The filter watches the following custom data properties on incoming packets:
grp_A4CC: maps packet to sample group description of type A4CC and entry set to property payload
grp_A4CC_param: same as above and sets sample to group grouping_type_parameter to param
sai_A4CC: adds property payload as sample auxiliary information of type A4CC
sai_A4CC_param: same as above and sets aux_info_type_parameter to param
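
The naming scheme of these four properties can be made concrete with a small parser. This is only a sketch: it assumes the 4CC is exactly four characters, keeps param as an uninterpreted string because the text does not say how it is encoded, and uses hypothetical names throughout.

```python
from typing import NamedTuple, Optional

class SampleProp(NamedTuple):
    kind: str             # "sample_group" or "aux_info"
    fourcc: str           # the A4CC part of the property name
    param: Optional[str]  # the optional trailing param, kept as text

def parse_sample_prop(name):
    """Split grp_A4CC[_param] / sai_A4CC[_param]; None if not matching."""
    for prefix, kind in (("grp_", "sample_group"), ("sai_", "aux_info")):
        if name.startswith(prefix):
            rest = name[len(prefix):]
            break
    else:
        return None
    if len(rest) == 4:
        return SampleProp(kind, rest, None)
    if len(rest) > 5 and rest[4] == "_":
        return SampleProp(kind, rest[:4], rest[5:])
    return None

print(parse_sample_prop("sai_cenc_1"))
```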

The property grp_EMSG consists of one or more EventMessageBox as defined in MPEG-DASH.
- in fragmented mode, presence of these boxes in a packet will start a new fragment, with the boxes written before the moof
- in regular mode, an internal sample group of type EMSG is currently used for emsg box storage


The filter watches the property FileNumber on incoming packets to create new files (regular mode) or new segments (DASH mode).

The filter watches the property DSIWrap (4CC as int or string) on incoming PID to wrap decoder configuration in a box of given type (unknown wrapping)

-i unkn.mkv:#ISOMSubtype=VIUK:#DSIWrap=cfgv -o t.mp4
This will wrap the unknown stream using VIUK code point in stsd and wrap any decoder configuration data in a cfgv box.
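
Since DSIWrap accepts a 4CC either as an integer or as a string, it can help to see how the two forms relate. The sketch below assumes the usual big-endian packing of box types in ISOBMFF. The helper names are hypothetical and not part of GPAC.

```python
def fourcc_to_int(code):
    """Pack a four-character code into a 32-bit big-endian integer."""
    raw = code.encode("ascii")
    if len(raw) != 4:
        raise ValueError("a 4CC must be exactly four ASCII characters")
    return int.from_bytes(raw, "big")

def int_to_fourcc(value):
    """Unpack a 32-bit integer back into its four-character code."""
    return value.to_bytes(4, "big").decode("ascii")

print(hex(fourcc_to_int("cfgv")))
```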

If pad_sparse is set, the filter watches the property Sparse on incoming PID to decide whether empty packets should be injected to keep packet duration info.
Such packets are only injected when a hole in the timeline is detected.
- if Sparse is absent, empty packet is inserted for unknown text and metadata streams
- if Sparse is true, empty packet is inserted for all stream types
- if Sparse is false, empty packet is never injected
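
The rules above amount to a small truth table. The function below is a sketch of that table, not the filter's implementation. Its name and arguments are hypothetical, and "unknown text and metadata streams" is reduced to a single boolean supplied by the caller.

```python
from typing import Optional

def inject_empty_packet(pad_sparse: bool,
                        sparse: Optional[bool],
                        unknown_text_or_meta: bool,
                        hole_detected: bool) -> bool:
    """Decide whether an empty packet is injected, per the rules above."""
    # Nothing is injected unless pad_sparse is set and a hole was detected.
    if not pad_sparse or not hole_detected:
        return False
    # Sparse absent: only unknown text and metadata streams are padded.
    if sparse is None:
        return unknown_text_or_meta
    # Sparse present: true pads every stream type, false never pads.
    return sparse

print(inject_empty_packet(True, None, True, True))
```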

The default media type used for a PID can be overridden using property StreamSubtype.

-i [-i ...]  -o test.mp4 
This will force the text stream to use sbtl handler type instead of default text one.
Subtitle streams may be used as chapters by setting the property IsChap on the desired PID.
-i  [-i ...] -o test.mp4 
This will force the text stream to be used as a QT chapter track.


m4sys (bool, default: false): force MPEG-4 Systems signaling of tracks
dref (bool, default: false): only reference data from source file - not compatible with all media sources
ctmode (enum, default: edit): set composition offset mode for video tracks
edit: uses edit lists to shift first frame to presentation time 0
noedit: ignores edit lists and does not shift timeline
negctts: uses ctts v1 with possibly negative offsets and no edit lists

dur (frac, default: 0): only import the specified duration. If negative, specify the number of coded frames to import
pack3gp (uint, default: 1): pack a given number of 3GPP audio frames in one sample
importer (bool, default: false): compatibility with old importer, displays import progress
pack_nal (bool, default: false): repack NALU size length to minimum possible size for NALU-based video (AVC/HEVC/...)
xps_inband (enum, default: no): use inband (in sample data) parameter set for NALU-based video (AVC/HEVC/...)
no: parameter sets are not inband, several sample descriptions might be created
pps: picture parameter sets are inband, all other parameter sets are in sample description
all: parameter sets are inband, no parameter sets in sample description
both: parameter sets are inband, signaled as inband, and also first set is kept in sample description
mix: creates non-standard files using single sample entry with first PSs found, and moves other PS inband
auto: keep source config, or defaults to no if source is not ISOBMFF

store (enum, default: inter): file storage mode
inter: perform precise interleave of the file using cdur (requires temporary storage of all media)
flat: write samples as they arrive and moov at end (fastest mode)
fstart: write samples as they arrive and moov before mdat
tight: uses per-sample interleaving of all tracks (requires temporary storage of all media)
frag: fragments the file using cdur duration
sfrag: fragments the file using cdur duration but adjusting to start with SAP1/3

cdur (frac, default: -1/1): chunk duration for flat and interleaving modes or fragment duration for fragmentation modes
0: no specific interleaving but moov first
negative: defaults to 1.0 unless overridden by storage profile

moovts (sint, default: 600): timescale to use for movie. A negative value picks the media timescale of the first track added
moof_first (bool, default: true): generate fragments starting with moof then mdat
abs_offset (bool, default: false): use absolute file offset in fragments rather than offsets from moof
fsap (bool, default: true): split truns in video fragments at SAPs to reduce file size
subs_sidx (sint, default: -1): number of subsegments per sidx
0: single sidx
>0: hierarchical or daisy-chained sidx
<0: disables sidx
-2: removes sidx if present in source PID
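
The subs_sidx values can be summarized as a lookup. The sketch below is illustrative only. It assumes that chain_sidx is what selects daisy-chaining over a hierarchical index for positive values, which the option descriptions suggest but do not state outright, and the returned labels are made up for the example.

```python
def sidx_mode(subs_sidx, chain_sidx=False):
    """Describe the sidx layout implied by subs_sidx (labels are made up)."""
    if subs_sidx == -2:
        # Disabled, and any sidx present in the source PID is removed.
        return "disabled, source sidx removed"
    if subs_sidx < 0:
        return "disabled"
    if subs_sidx == 0:
        return "single sidx"
    # Assumption: chain_sidx picks daisy-chaining over hierarchy.
    return "daisy-chained sidx" if chain_sidx else "hierarchical sidx"

print(sidx_mode(0))
```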

m4cc (str): 4 character code of empty box to append at the end of a segment (DASH mode) or of a fragment (non-DASH mode)
chain_sidx (bool, default: false): use daisy-chaining of SIDX
msn (uint, default: 1): set sequence number of first moof to N
msninc (uint, default: 1): sequence number increase between moof boxes
tfdt (lfrac, default: 0): set initial decode time (tfdt) of first traf
tfdt_traf (bool, default: false): force tfdt box in each traf
nofragdef (bool, default: false): disable default flags in fragments
straf (bool, default: false): use a single traf per moof (smooth streaming and co)
strun (bool, default: false): use a single trun per traf (smooth streaming and co)
psshs (enum, default: moov): set pssh boxes store mode
moof: in first moof of each segment
moov: in movie box
both: in movie box and in first moof of each segment
none: pssh is discarded

sgpd_traf (bool, default: false): store sample group descriptions in traf (duplicated for each traf). If not used, sample group descriptions are stored in the movie box
vodcache (enum, default: replace): enable temp storage for VoD dash modes
on: use temp storage of complete file for sidx and ssix injection
insert: insert sidx and ssix by shifting bytes in output file
replace: precompute space requirements for sidx and ssix and rewrite file range at end

noinit (bool, default: false): do not produce initial moov, used for DASH bitstream switching mode
tktpl (enum, default: yes): use track box from input if any as a template to create new track
no: disables template
yes: clones the track (except edits and decoder config)
udta: only loads udta

mudta (enum, default: yes): use udta and other moov extension boxes from input if any
no: disables import
yes: clones all extension boxes
udta: only loads udta

mvex (bool, default: false): set mvex boxes after trak boxes
sdtp_traf (enum, default: no): use sdtp box in traf box rather than using flags in trun sample entries
no: do not use sdtp
sdtp: use sdtp box to indicate sample dependencies and do not write info in trun sample flags
both: use sdtp box to indicate sample dependencies and also write info in trun sample flags

trackid (uint, default: 0): track ID of created track for single track. Default 0 uses next available trackID
fragdur (bool, default: false): fragment based on fragment duration rather than CTS. Mostly used for MP4Box -frag option
btrt (bool, default: true): set btrt box in sample description
styp (str): set segment styp major brand (and optionally version) to the given 4CC[.version]
mediats (sint, default: 0): set media timescale. A value of 0 means inherit from PID, a value of -1 means derive from samplerate or frame rate
ase (enum, default: v0): set audio sample entry mode for more than stereo layouts
v0: use v0 signaling but channel count from stream, recommended for backward compatibility
v0s: use v0 signaling and force channel count to 2 (stereo) if more than 2 channels
v1: use v1 signaling, ISOBMFF style (will mux raw PCM as ISOBMFF style)
v1qt: use v1 signaling, QTFF style
v2qt: use v2 signaling, QTFF style (lpcm entry type)

ssix (bool, default: false): create ssix box when sidx box is present, level 1 mapping I-frames byte ranges, level 0xFF mapping the rest
ccst (bool, default: false): insert coding constraint box for video tracks
maxchunk (uint, default: 0): set max chunk size in bytes for runs (only used in non-fragmented mode). 0 means no constraints
noroll (bool, default: false): disable roll sample grouping
norap (bool, default: false): disable rap sample grouping
saio32 (bool, default: false): use 32 bit offset for side data location instead of 64 bit offset
tfdt64 (bool, default: false): use 64 bit tfdt and sidx even for 32 bits timestamps
compress (enum, default: no): set top-level box compression mode
no: disable box compression
moov: compress only moov box (uses cmov for QT)
moof: compress only moof boxes
sidx: compress moof and sidx boxes
ssix: compress moof, sidx and ssix boxes
all: compress moov, moof, sidx and ssix boxes

fcomp (bool, default: false): force using compress box even when compressed size is larger than uncompressed
otyp (bool, default: false): inject original file type when using compressed boxes
trun_inter (bool, default: false): interleave samples in trun based on the temporal level, with the lowest levels stored first (this will create as many trun boxes as required)
truns_first (bool, default: false): store track runs before sample group description and sample encryption information
block_size (uint, default: 10000): target output block size, 0 for default internal value (10k)
boxpatch (str): apply box patch before writing
deps (bool, default: true): add samples dependencies information
mfra (bool, default: false): enable movie fragment random access when fragmenting (ignored when dashing)
forcesync (bool, default: false): force all SAP types to be considered sync samples (might produce non-compliant files)
refrag (bool, default: false): use track fragment defaults from initial file if any rather than computing them from PID properties (used when processing standalone segments/fragments)
itags (enum, default: strict): tag injection mode
none: do not inject tags
strict: only inject recognized itunes tags
all: inject all possible tags

keep_utc (bool, default: false): force all new files and tracks to keep the source UTC creation and modification times
pps_inband (bool, default: false): when xps_inband is set, inject PPS in each non SAP 1/2/3 sample
moovpad (uint, default: 0): insert free box of given size after moov for future in-place editing
cmaf (enum, default: no): use CMAF guidelines (turns on mvex, truns_first, strun, straf, tfdt_traf, chain_sidx and restricts subs_sidx to -1 or 0)
no: CMAF not enforced
cmfc: use CMAF cmfc guidelines
cmf2: use CMAF cmf2 guidelines (turns on nofragdef)

pad_sparse (bool, default: true): inject sample with no data (size 0) to keep durations in unknown sparse text and metadata tracks
force_dv (bool, default: false): force DV sample entry types even when AVC/HEVC compatibility is signaled
dvsingle (bool, default: false): ignore DolbyVision profile 8 in xps inband mode if profile 5 is already set
tsalign (bool, default: true): enable timeline realignment to 0 for first sample. If false, this will keep original timing with an empty edit (possibly long) at the beginning
chapm (enum, default: both): chapter storage mode
off: disable chapters
tk: use chapter track (QT-style)
udta: use user-data box chapters
both: use both chapter tracks and udta

patch_dts (bool, default: false): patch previous samples duration when dts do not increase monotonically
uncv (enum, default: prof): use uncv (ISO 23001-17) for raw video
off: disabled (always the case when muxing to QT)
gen: enabled, do not write profile
prof: enabled and write profile if known
tiny: enabled and write reduced version if profile known and compatible

trunv1 (bool, default: false): force using version 1 of trun regardless of media type or CMAF brand