Subtitles come in a myriad of formats. Not all of them work in every container format, and not all containers are equally easy to work with. Converting Matroska (mkv) files to MP4 causes issues when the mkv has PGS (Blu-ray) subtitles.
PGS subtitles can be embedded in Matroska video files, but not in MP4s. Tagging and putting other information in Matroska files, however, is a pain in the butt, and there aren’t many good tools for it. DVD subtitles can be put into MP4 files. Even though that technically isn’t part of the standard, most players recognize and play them fine.
PGS and DVD (VobSub) subtitles are both bitmap formats, and are trickier to convert than text-based subtitles. I have found two methods that work well for converting from PGS to VobSub, as opposed to converting with OCR, which is often messy, especially if the subtitles aren’t pure text.
The first method is with BDSup2Sub. This is a <shudder> Java-based tool that does a very good job of converting and preserving as much information as possible. It has both a GUI and a CLI. It seems to do a slightly better job than the second method. You will have to manually set the produced tracks to the proper size (usually 1920×1080) in Subler after adding them, or they will be horribly large and unusable.
The second method is using ffmpeg. Finding details on the ffmpeg method has been a challenge. The documentation is sparse, but I’ve pieced enough together to make it work acceptably.
This method lets you re-container from mkv to MP4 and convert the subtitles in the same pass. An example command:
ffmpeg -fix_sub_duration -i source.mkv -c:v copy -c:a copy -c:s dvdsub destination.mp4
The -fix_sub_duration option is critical. Without it, the subtitles never clear, so each one is drawn on top of the ones before it.
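To run the same conversion over a whole folder, a loop along these lines might work. This is only a sketch: remux_dir is a hypothetical helper name, ffmpeg is assumed to be on the PATH, and files whose .mp4 already exists are skipped rather than overwritten.

```shell
#!/bin/bash
# remux_dir (hypothetical helper): convert every .mkv in a directory to .mp4,
# copying video and audio and converting the subtitles to dvdsub.
remux_dir() {
    local dir="${1:-.}" src dst
    for src in "$dir"/*.mkv; do
        # If nothing matched, the glob stays literal; skip it.
        [ -e "$src" ] || continue
        dst="${src%.mkv}.mp4"
        if [ -e "$dst" ]; then
            echo "skipping ${src}: ${dst} already exists" >&2
            continue
        fi
        # -fix_sub_duration goes before -i, as in the single-file command.
        ffmpeg -fix_sub_duration -i "$src" -c:v copy -c:a copy -c:s dvdsub "$dst"
    done
}

# Usage (assumed path): remux_dir /path/to/movies
```

Quoting "$src" and "$dst" keeps file names with spaces intact.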
Other options available:
Subtitle options:
  -s size              set frame size (WxH or abbreviation)
  -sn                  disable subtitle
  -scodec codec        force subtitle codec ('copy' to copy stream)
  -stag fourcc/tag     force subtitle tag/fourcc
  -fix_sub_duration    fix subtitles duration
  -canvas_size size    set canvas size (WxH or abbreviation)
  -spre preset         set the subtitle options to the indicated preset
You will want to set the canvas size to the original source video size, regardless of what size you’ve rescaled the video to. For example, if the source is a Blu-ray at 1080p (1920×1080), and you encode the video to 720p (1280×720), you would set the canvas size (i.e. the size of the subtitles) to 1920×1080, and the player will scale it to the video playback size correctly.
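The canvas size can be read from the source with ffprobe instead of typed by hand. The sketch below makes several assumptions: the helper names source_size and encode_720p_with_subs are hypothetical, the first video stream carries the size you want, -canvas_size is placed before -i on the assumption that it applies to the input like -fix_sub_duration, and libx264 is available in your ffmpeg build.

```shell
#!/bin/bash
# source_size (hypothetical helper): print the first video stream's size as WxH.
source_size() {
    ffprobe -v error -select_streams v:0 \
        -show_entries stream=width,height -of csv=s=x:p=0 "$1"
}

# encode_720p_with_subs (hypothetical helper): scale the video to 720p but keep
# the subtitle canvas at the original source size.
encode_720p_with_subs() {
    local src="$1" dst="$2" canvas
    canvas=$(source_size "$src") || return 1
    if [ -z "$canvas" ]; then
        echo "could not determine the video size of ${src}" >&2
        return 1
    fi
    # scale=-2:720 keeps the aspect ratio and rounds the width to an even number.
    ffmpeg -fix_sub_duration -canvas_size "$canvas" -i "$src" \
        -vf scale=-2:720 -c:v libx264 -c:a copy -c:s dvdsub "$dst"
}

# Usage: encode_720p_with_subs source.mkv destination.mp4
```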
ffmpeg doesn’t seem to bring any palette information across, or just defaults to gray. I need to do more rigorous testing when I can find some good samples of PGS subtitles that have fades, or multiple colors.
I wrote a script that uses BDSup2Sub and mkvextract to automatically pull all the subs out and convert them into a usable form for Subler. It takes one parameter: the name of the file (usually a .MKV):
#!/bin/bash
# Clean up this run's temporary files on exit or on a signal.
trap 'rm -f /tmp/*.$$ ; exit 0' 0 1 2 3 13 15
# Build one "tracknum:filename" entry per subtitle stream that ffprobe reports.
ffprobe "$1" 2>&1 | grep Stream | grep "Subtitle:" | while read -r line
do
    tracknum=$( echo "${line}" | cut -f 2 -d ':' | grep -E -o "^[0-9]+" )
    lang=$( echo "${line}" | cut -f 2 -d ':' | cut -f 2- -d '(' | tr -d ')' )
    type=$( echo "${line}" | awk '{ print $4;}' )
    # Map the codec name to a file extension; some names carry a trailing comma.
    case "${type}"
    in
        "hdmv_pgs_subtitle" | "hdmv_pgs_subtitle,")
            ext="sup"
            ;;
        "subrip")
            ext="srt"
            ;;
        "dvd_subtitle" | "dvd_subtitle,")
            ext="sub"
            ;;
        "ass")
            ext="ass"
            ;;
        "ssa")
            ext="ssa"
            ;;
        *)
            ext="unknown_type"
            ;;
    esac
    echo "${tracknum}:$( basename "$1" | tr ' ' '_' ).${tracknum}.${lang}.${ext}"
done > /tmp/extractsubs.$$
# For the life of me I can't get it to work with spaces in the names, so I just convert
# spaces to underscores, and it works. Maybe I'll use another character, then convert it
# back when it's done.
/Applications/MKVToolNix-*.0.0.app/Contents/MacOS/mkvextract "$1" tracks $( cat /tmp/extractsubs.$$ )
# Convert each extracted PGS (.sup) track to VobSub, then delete the .sup file.
grep -E '\.sup$' /tmp/extractsubs.$$ | cut -f 2- -d ':' | while read -r line
do
    java -jar /Applications/BDSup2Sub512.jar -o "$( basename "${line}" .sup ).sub" "${line}"
    rm "${line}"
done
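On the spaces problem mentioned in the script's comment: one likely cause is that the unquoted $( cat ... ) is split on whitespace, so a name containing a space reaches mkvextract as two separate arguments. A possible workaround, sketched below, reads the list line by line and passes each line as a single argument. extract_tracks is a hypothetical helper name, and mkvextract is assumed to be on the PATH (substitute the full path under /Applications otherwise).

```shell
#!/bin/bash
# extract_tracks (hypothetical helper): run mkvextract with one argument per
# line of specfile, where each line is "tracknum:filename".
extract_tracks() {
    local src="$1" specfile="$2" spec
    set --                              # start with an empty argument list
    while IFS= read -r spec; do
        if [ -n "$spec" ]; then
            set -- "$@" "$spec"         # append the whole line, spaces intact
        fi
    done < "$specfile"
    if [ "$#" -eq 0 ]; then
        echo "no subtitle tracks listed in ${specfile}" >&2
        return 1
    fi
    mkvextract "$src" tracks "$@"
}

# Usage: extract_tracks "My Movie.mkv" /tmp/extractsubs.$$
```

With something like this in place, the tr ' ' '_' step in the script would no longer be needed, though the later loop that feeds BDSup2Sub would also need read -r and quoted names, which it already has.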
Neat!
I stumbled across your site looking at another topic, but I’ll have to give this a try. I had been using Subler to OCR PGS subtitles, but this seems more ideal.
Having an option to avoid burning in PGS subs is great. I wonder if there’s a way to flag forced subtitles only in the MP4 container after.
Thanks!
You can use Subler to set the forced flag, but it only works with text-based subs. I am not sure if BDSup2Sub will pass the forced information across, and if so, if the player will honor it, since *technically* putting DVD image-based subs in an MP4 isn’t valid (though almost every player will use them). For the forced ones, which is usually only a handful of lines, I will just OCR them, clean up any mistakes by hand, put them in as an SRT, and set the flags with Subler. Let me know if you manage to get them working as images.