Subtitle loader¶
Register name used to load filter: txtin
This filter may be automatically loaded during graph resolution.
This filter reads subtitle data from input PID to produce subtitle frames on a single PID.
The filter supports the following formats:
- SRT: https://en.wikipedia.org/wiki/SubRip
- WebVTT: https://www.w3.org/TR/webvtt1/
- TTXT: https://wiki.gpac.io/xmlformats/TTXT-Format-Documentation
- QT 3GPP Text XML (TexML): Apple QT6, likely deprecated
- TTML: https://www.w3.org/TR/ttml2/
- SUB: one subtitle per line formatted as {start_frame}{end_frame}text
- SSA (Substation Alpha): basic parsing support for common files
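As an illustration of the SUB line layout listed above, here is a minimal Python sketch of a line parser. It is not the filter's implementation: the function name `parse_sub_line` and the `fps` parameter (used to turn frame numbers into seconds) are assumptions made for this example.

```python
import re

# Line shape taken from the format description: {start_frame}{end_frame}text
SUB_LINE = re.compile(r"^\{(\d+)\}\{(\d+)\}(.*)$")


def parse_sub_line(line, fps=25.0):
    """Return (start_seconds, end_seconds, text), or None if the line does not match.

    fps is a hypothetical parameter: the SUB format stores frame numbers only,
    so a frame rate has to be supplied to obtain times.
    """
    match = SUB_LINE.match(line.strip())
    if match is None:
        return None
    start_frame, end_frame, text = match.groups()
    return int(start_frame) / fps, int(end_frame) / fps, text


print(parse_sub_line("{25}{75}Hello world"))  # (1.0, 3.0, 'Hello world')
```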
Input files must be in UTF-8 or UTF-16 format, with or without BOM. The internal frame format is:
- WebVTT (and SRT if desired): ISO/IEC 14496-30 VTT cues
- TTML: ISO/IEC 14496-30 XML subtitles
- stxt and sbtt: ISO/IEC 14496-30 text stream and text subtitles
- Others: 3GPP/QT Timed Text
TTML Support¶
If the ttml_split option is set, the TTML document is split into independent time segments by inspecting all overlapping subtitles in the body.
Empty periods in TTML will result in empty TTML documents, or will be skipped if the no_empty option is set.
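The splitting described above can be pictured with the following Python sketch. It shows one plausible reading of the behavior, not the filter's actual algorithm: cues are assumed to be simple `(begin_ms, end_ms, text)` tuples, and the function name `split_segments` is hypothetical.

```python
def split_segments(cues, no_empty=False):
    """Cut the timeline at every cue boundary and list the cues active in each piece.

    cues: list of (begin_ms, end_ms, text) tuples (assumed representation).
    Returns a list of (segment_begin_ms, segment_end_ms, [active texts]).
    """
    # Every cue begin or end is a potential segment boundary.
    bounds = sorted({t for begin, end, _ in cues for t in (begin, end)})
    segments = []
    for seg_begin, seg_end in zip(bounds, bounds[1:]):
        # A cue is active in a segment when the two intervals overlap.
        active = [text for begin, end, text in cues
                  if begin < seg_end and end > seg_begin]
        # Empty periods are kept unless no_empty is requested.
        if active or not no_empty:
            segments.append((seg_begin, seg_end, active))
    return segments


cues = [(0, 2000, "A"), (1000, 3000, "B"), (5000, 6000, "C")]
for segment in split_segments(cues, no_empty=True):
    print(segment)
```

With `no_empty=False` the same input also yields the empty segment between 3000 and 5000 ms.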
The first sample has a CTS assigned as indicated by ttml_cts:
- a numerator of -2 indicates the first CTS is 0
- a numerator of -1 indicates the first CTS is the first active time in document
- a numerator >= 0 indicates the CTS to use for first sample
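The three ttml_cts cases above can be summarized in a short Python sketch. The function name `first_sample_cts` is hypothetical, and treating the fraction as a time in seconds is an assumption of this example rather than something stated in the filter help.

```python
from fractions import Fraction


def first_sample_cts(num, den, first_active_time):
    """Resolve the CTS of the first sample from a ttml_cts value num/den.

    first_active_time: first active time in the document.
    Assumption: all times are expressed in seconds.
    """
    if num == -2:
        return Fraction(0)                   # first CTS is 0
    if num == -1:
        return Fraction(first_active_time)   # first active time in the document
    if num >= 0:
        return Fraction(num, den)            # explicit CTS for the first sample
    raise ValueError("unsupported ttml_cts numerator: %d" % num)


print(first_sample_cts(-1, 1, Fraction(5)))  # default -1/1: first active time
```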
When TTML splitting is disabled, the duration of the TTML sample is given by ttml_dur if not 0, or set to the document duration.
By default, media resources are kept as declared in TTML2 documents.
ttml_embed can be used to embed inside the TTML sample the resources in <head> or <body>:
- for <source>, <image>, <audio>, <font>, local URIs indicated in src will be loaded and src rewritten.
- for <data> with base64 coding, the data will be decoded, the <data> element removed and the parent <source> rewritten with a src attribute inserted.
The embedded data is added as a subsample to the TTML frame, and the referring elements will use src=urn:mpeg:14496-30:N, with N the index of the subsample.
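The mapping from embedded resources to subsample URNs can be sketched in Python as follows. This is only an illustration: the function name `embed_resources`, the `(kind, payload)` input shape and the `first_index` parameter are assumptions, and the index actually assigned to the first embedded resource is not stated here.

```python
import base64

URN_PREFIX = "urn:mpeg:14496-30:"


def embed_resources(resources, first_index=1):
    """Build the subsample payloads and the URNs that referring elements would use.

    resources: list of (kind, payload) in document order (assumed representation):
      - ("file", bytes)  for a local URI whose content was already loaded
      - ("data", str)    for base64 text taken from a <data> element
    first_index: hypothetical starting value of N.
    """
    subsamples, urns = [], []
    for kind, payload in resources:
        # Base64-coded <data> content is decoded before being embedded.
        data = base64.b64decode(payload) if kind == "data" else payload
        subsamples.append(data)
        urns.append(f"{URN_PREFIX}{first_index + len(subsamples) - 1}")
    return subsamples, urns


subsamples, urns = embed_resources([("file", b"PNG"), ("data", "aGVsbG8=")])
print(urns)
```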
A subtitle zero may be specified using ttml_zero. This will remove all subtitles before the given time T0, and rewrite each subtitle begin/end T to T-T0 using millisecond accuracy.
Warning: Original time formatting (tick, frames/subframe ...) will be lost when this option is used, converted to HH:MM:SS.ms.
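The time rewriting performed for a subtitle zero can be illustrated with a small Python sketch. The names `rebase_times` and `format_ms` are hypothetical, cues are assumed to be `(begin_ms, end_ms, text)` tuples, and dropping a cue whenever its begin time precedes T0 is an assumption about what "before the given time" means.

```python
def format_ms(ms):
    """Format a millisecond count as HH:MM:SS.ms."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def rebase_times(cues, zero_ms):
    """Drop cues starting before zero_ms and rewrite begin/end T as T - T0."""
    rebased = []
    for begin, end, text in cues:
        if begin < zero_ms:
            continue  # assumption: a cue beginning before T0 is removed
        rebased.append((format_ms(begin - zero_ms), format_ms(end - zero_ms), text))
    return rebased


zero = 10 * 3_600_000  # T0 = 10:00:00
cues = [(35_999_000, 36_000_500, "early"), (36_001_500, 36_004_250, "kept")]
print(rebase_times(cues, zero))  # [('00:00:01.500', '00:00:04.250', 'kept')]
```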
The subtitle zero time must be prefixed with T when the option is not set as a global argument:
Example
gpac -i test.ttml:ttml_zero=T10:00:00 [...]
MP4Box -add test.ttml:sopt:ttml_zero=T10:00:00 [...]
gpac -i test.ttml --ttml_zero=10:00:00 [...]
gpac -i test.ttml --ttml_zero=T10:00:00 [...]
MP4Box -add test.ttml --ttml_zero=10:00:00 [...]
Simple Text Support¶
The text loader can convert input files into simple text streams of a single packet, by forcing the codec type on the input:
Example
The content of the source file will be the payload of the text sample. The stxtmod option allows specifying WebVTT, TX3G or simple text mode for output format.
In this mode, the stxtdur option is used to control the duration of the generated subtitle:
- a positive value always forces the duration
- a negative value forces the duration if input packet duration is not known
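The two stxtdur rules above can be written as a short Python sketch. The function name `text_sample_duration` is hypothetical. Two points are assumptions not stated in the text: a negative value is applied using its magnitude, and a value of 0 leaves the input duration untouched.

```python
from fractions import Fraction


def text_sample_duration(stxtdur, input_duration=None):
    """Pick the duration of the generated subtitle.

    stxtdur: the option value as a Fraction.
    input_duration: duration of the input packet, or None when not known.
    """
    if stxtdur > 0:
        return stxtdur            # a positive value always forces the duration
    if stxtdur < 0 and input_duration is None:
        return -stxtdur           # assumption: the magnitude is used as the duration
    return input_duration         # otherwise keep whatever the input packet carries


print(text_sample_duration(Fraction(-3), None))
```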
Notes¶
When reframing simple text streams from demuxers (e.g. subtitles from MKV), the output format of these streams can be selected using stxtmod.
When importing SRT, SUB or SSA files, the output format of the PID can be selected using stxtmod.
Options¶
nodefbox (bool, default: false): skip default text box
noflush (bool, default: false): skip final sample flush for srt
fontname (str): default font
fontsize (uint, default: 18): default font size
lang (str): default language
width (uint, default: 0): default width of text area
height (uint, default: 0): default height of text area
txtx (uint, default: 0): default horizontal offset of text area: -1 (left), 0 (center) or 1 (right)
txty (uint, default: 0): default vertical offset of text area: -1 (bottom), 0 (center) or 1 (top)
zorder (sint, default: 0): default z-order of the PID
timescale (uint, default: 1000): default timescale of the PID
ttml_split (bool, default: true): split ttml doc in non-overlapping samples
ttml_cts (lfrac, default: -1/1): first sample cts - see filter help
ttml_dur (frac, default: 0/1): sample duration when not splitting - see filter help
ttml_embed (bool, default: false): force embedding TTML resources
ttml_zero (str): set subtitle zero time for TTML
no_empty (bool, default: false): do not send empty samples
stxtdur (frac, default: 1): duration for simple text
stxtmod (enum, default: tx3g): text stream mode for simple text streams and SRT inputs
- stxt: output PID formatted as simple text stream
- sbtt: output PID formatted as subtitle text stream
- tx3g: output PID formatted as TX3G/Apple stream
- vtt: output PID formatted as WebVTT stream
- webvtt: same as vtt (for backward compatibility)