There’s been some discussion on the Matterhorn list recently about caption file formats, and I thought it might be useful to describe what we’re doing with file formats for SpokenMedia.
SpokenMedia uses two file formats: our original .wrd files, which are output from the recognition process, and Timed Text Markup Language (TTML). We also need to handle two other caption file formats, .srt and .sbv.
There is a nice discussion of the YouTube format at "SBV file format for Youtube Subtitles and Captions" and a link to a web-based tool to convert .srt files to .sbv files.
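To give a feel for how close the two formats are, here is a minimal Python sketch of an .srt to .sbv conversion. It is not the web-based tool mentioned above, and it is not part of SpokenMedia; the function name srt_to_sbv is made up for illustration. It assumes that .srt cues are a number line, a timing line such as 00:00:01,000 --> 00:00:04,000, and one or more text lines, and that .sbv drops the number and writes the timing as 0:00:01.000,0:00:04.000. Styling tags and positioning hints are not handled.

```python
import re

# SRT timing line, e.g. "00:00:01,000 --> 00:00:04,000"
SRT_TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def srt_to_sbv(srt_text):
    """Convert SRT caption text to SBV text (hypothetical helper, simple cues only)."""
    cues = []
    normalized = srt_text.replace("\r\n", "\n").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        # Locate the timing line; it normally follows the numeric cue index.
        for i, line in enumerate(lines):
            match = SRT_TIMING.search(line)
            if match:
                break
        else:
            continue  # no timing line in this block, so skip it
        h1, m1, s1, ms1, h2, m2, s2, ms2 = match.groups()
        # Assumed SBV timing: H:MM:SS.mmm,H:MM:SS.mmm with no cue number.
        timing = f"{int(h1)}:{m1}:{s1}.{ms1},{int(h2)}:{m2}:{s2}.{ms2}"
        text = "\n".join(lines[i + 1:])
        cues.append(f"{timing}\n{text}")
    return "\n\n".join(cues) + "\n"
```

Calling srt_to_sbv on the contents of an .srt file returns a string that can be written out as an .sbv file. Blocks without a recognizable timing line are skipped rather than raising an error, which seemed the safer choice for hand-edited caption files.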
We’ll cover our implementation of TTML in a separate post.