Extracting Subtitles from MKV Files

When I convert videos from one format to another, I often need to convert the subtitles as well. I wrote a script to extract them (and to convert PGS to VobSub). It is a bash script that I run on my Mac. If you want to run it on Linux, just update the paths to the commands as necessary. The major tools/dependencies you need to install that aren’t available by default are ffprobe, mkvextract (part of the MKVToolNix suite), Java, and BDSup2Sub.

The script takes one parameter: the filename of the MKV.

If the filename has spaces, the subtitle files that are extracted will have underscores in place of the spaces. It makes the script a lot easier to code.
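As a rough sketch of the renaming described above: the helper below is hypothetical (the name subtitle_name and the sample path are made up for illustration), but it uses the same basename and tr calls as the script to build one output name per track.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original script: builds the output
# name for one subtitle track, replacing spaces with underscores.
subtitle_name() {
	local mkv="$1" tracknum="$2" lang="$3" ext="$4"
	echo "$( basename "${mkv}" | tr ' ' '_' ).${tracknum}.${lang}.${ext}"
}

# Sample path is an assumption; prints My_Movie.mkv.2.eng.sup
subtitle_name "/Volumes/Media/My Movie.mkv" 2 eng sup
```

Note that the directory part is dropped by basename, so the subtitle files land in the current directory rather than next to the MKV.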

If PGS (.sup) subtitles are extracted, it will automatically convert them to VobSub (SUB/IDX) files for embedding in MP4 containers.


#!/bin/bash

# Remove this run's temp files on exit or when interrupted.
trap 'rm -f /tmp/*.$$ ; exit 0' 0 1 2 3 13 15

# List the subtitle streams and write one "tracknum:filename" spec per line.
ffprobe "$1"  2>&1 | grep Stream | grep "Subtitle:" | while read line
do
	tracknum=$( echo "${line}" | cut -f 2 -d ':' | grep -E -o "^[0-9]+" )
	lang=$( echo "${line}" | cut -f 2 -d ':' | cut -f 2- -d '(' | tr -d ')' )
	type=$( echo "${line}" | awk '{ print $4;}' )

	# Map the codec name to a file extension.
	case "${type}" in
		"hdmv_pgs_subtitle" | "hdmv_pgs_subtitle,")
			ext="sup"
			;;
		"dvd_subtitle" | "dvd_subtitle,")
			ext="sub"
			;;
		*)
			# Assumption: for any other type, fall back to the codec name.
			ext="${type%,}"
			;;
	esac

	echo "${tracknum}:$( basename "$1" | tr ' ' '_' ).${tracknum}.${lang}.${ext}"
	# echo "${tracknum}:$( basename "$1" ).${tracknum}.${lang}.${ext}"
done > /tmp/extractsubs.$$

# For the life of me I can't get it to work with spaces in the names, so I just convert
# spaces to underscores, and it works.  Maybe I'll use another character, then convert it
# back when it's done.

/Applications/MKVToolNix-*.0.0.app/Contents/MacOS/mkvextract "$1" tracks $( cat /tmp/extractsubs.$$ )

# Convert each extracted PGS (.sup) file to VobSub, then delete the .sup.
grep -E '\.sup$' /tmp/extractsubs.$$ | cut -f 2- -d ':' | while read line
do
	java -jar /Applications/BDSup2Sub512.jar -o "$( basename "${line}" .sup ).sub" "${line}"
	rm "${line}"
done
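
One possible way around the spaces problem mentioned in the script's comments is to keep the track specs in a bash array instead of expanding them with an unquoted $( cat ... ). This is only a sketch under assumptions: read_specs and the sample track names are hypothetical, and the list file is assumed to hold the names with their spaces left in.

```shell
#!/bin/bash
# Sketch only: collect "tracknum:filename" specs in an array so that
# names containing spaces survive as single arguments.
read_specs() {
	# $1 is a file with one "tracknum:filename" spec per line.
	specs=()
	while IFS= read -r spec
	do
		if [ -n "${spec}" ]
		then
			specs+=( "${spec}" )
		fi
	done < "$1"
}

# Demo with a throwaway list file (hypothetical track names).
listfile=$( mktemp )
printf '%s\n' '2:My Movie.mkv.2.eng.sup' '3:My Movie.mkv.3.fre.sub' > "${listfile}"
read_specs "${listfile}"
rm -f "${listfile}"

# Each spec is still one argument, spaces included.
echo "${#specs[@]} arguments"
printf '[%s]\n' "${specs[@]}"

# The array would then be passed quoted, for example:
#   mkvextract "$1" tracks "${specs[@]}"
```

Quoting "${specs[@]}" expands to exactly one word per array element, which is what the unquoted command substitution cannot do. The .sup conversion loop would also need IFS= read -r so that the names come through unchanged.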
