This site has been archived

How Google Translate Works

Google posted a high-level overview of how Google Translate works.

Source: Google

An interesting hack from Yahoo! Openhack India

Sound familiar?

Automatic, Real-time close captioning/translation for flickr videos.

We captured the audio stream that comes out to speaker and gave as input to mic. Used Microsoft Speech API and Julius to convert the speech to text. Used a GreaseMonkey script to sync with transcription server(our local box) and video and displayed the transcribed text on the video. Before displaying the actual text on the video, based on the user’s choice we translate the text and show it on video. (We used Google’s Translate API for this).

Srithar, B. (2010). Yahoo! Openhack India 2010- FlicksubZ. Retrieved on July 28, 2010 from Srithar’s Blog Website:

Check out the whole post.
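The syncing step in the quoted description (showing the right piece of transcribed text at the right moment in the video) can be sketched roughly as follows. This is a minimal illustration and not the FlicksubZ code: the `Cue` type, the `cue_at` function, and the sample cues are all hypothetical, and the sketch assumes the cues are sorted by start time and do not overlap.

```python
from bisect import bisect_right
from typing import NamedTuple, Optional, Sequence


class Cue(NamedTuple):
    """One piece of transcribed text with its display window (hypothetical type)."""
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str


def cue_at(cues: Sequence[Cue], playback_time: float) -> Optional[Cue]:
    """Return the cue that should be on screen at playback_time, or None.

    Assumes cues are sorted by start time and do not overlap.
    """
    starts = [cue.start for cue in cues]
    # Index of the last cue that starts at or before playback_time.
    index = bisect_right(starts, playback_time) - 1
    if index < 0:
        return None
    cue = cues[index]
    # The cue is shown only while playback is inside its window.
    return cue if playback_time < cue.end else None


# Made-up sample data for illustration only.
cues = [
    Cue(0.0, 2.5, "Welcome to the lecture."),
    Cue(2.5, 6.0, "Today we look at speech recognition."),
    Cue(8.0, 11.0, "Let's start with an example."),
]

current = cue_at(cues, 3.0)
print(current.text if current else "(no caption)")
```

A player script would call something like `cue_at` on each time update and, if the user asked for a translation, pass the cue text through a translation service before drawing it over the video.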

Making Progress

In the last month or two we’ve made some good progress with getting additional parts of the SpokenMedia workflow into a working state.

Here’s a workflow diagram showing what we can do with SpokenMedia today.

SpokenMedia Workflow, June 2010
Source: Brandon Muramatsu


(The bright yellow indicates features working in the last two months, the gray indicates features we’ve had working since December 2009, and the light yellow indicates features on which we’ve just started working.)


Universal Subtitles

Here’s the problem: web video is beginning to rival television, but there isn’t a good open resource for subtitling. Here’s our mission: we’re trying to make captioning, subtitling, and translating video publicly accessible in a way that’s free and open, just like the Web.

The SpokenMedia project was born out of the research into automatic lecture transcription from the Spoken Language Systems group at MIT. Our approach has been twofold. First, we have been working with researchers to improve the automatic creation of transcripts, both to enable search and, perhaps, to provide accessible captions, and doing what we can from a process standpoint to improve accuracy. Second, we have been working on tools to address accuracy from a human editing perspective. In this approach we would provide these tools to lecture video publishers, but we have also considered setting up a process to enable crowdsourced editing.

Recently we learned of a new project, Universal Subtitles (now Amara), and their Mozilla design challenge for Collaborative Subtitles. Both approaches are interesting and we’ll be keeping our eye on their progress, as we will with UToronto’s OpenCaps project, which is part of the Opencast Matterhorn suite.

Here’s a screenshot from the Universal Subtitles project.

Universal Subtitle Project


Here’s a screenshot of the caption widget from the Collaborative Subtitling project.

Collaborative Subtitling Mockup
Source: Brandon Muramatsu/Collaborative Subtitling Challenge Mockup


HTML5 Video

In a recent email from the Opencast community, I received a link to a post titled “HTML5 video Libraries, Toolkits and Players” that gathers some of the currently available information on HTML5 video. HTML5 video is something that the SpokenMedia project will begin investigating “soon”.

To help you understand and get the most from this new tag, we have listed below a selection of the best HTML5 video libraries, frameworks, toolkits and players.

Source: Speckyboy Design Magazine. (2010, April 23). HTML5 video Libraries, Toolkits and Players. Retrieved on April 25, 2010 from Speckyboy Design Magazine Website:

YouTube Auto-Captions

YouTube announced in early March that they would be extending their pilot program to enable auto-captioning for all channels.

The highlights: YouTube has announced that they’re doing this to improve accessibility, and that:

  • Captions will initially only be available for English videos.
  • Auto-captions require clearly spoken audio.
  • Auto-captions aren’t perfect; the owner will need to check that they’re accurate.
  • Auto-captions will be available for all channels.

We think this is great. If YouTube can automatically caption videos at scale and with high accuracy, that’s a great step forward for all videos, and definitely for the lecture videos that we’ve been interested in for the SpokenMedia project.

Though, as with SpokenMedia’s approach that builds on Jim Glass’ Spoken Language Systems research, they still have a ways to go on accuracy.

At this early date though, we can still see some significant advantages to our approach:

  • You don’t have to host your videos through YouTube to use the service SpokenMedia is developing. (YouTube locks the videos you upload into their player and service.)
  • SpokenMedia will provide a time-aligned transcript file that you can download and use in other applications. (YouTube allows the channel publishers to download a transcript, edit it, and then reupload it for time code alignment. However, they don’t allow the public at large to download the transcript.)
  • SpokenMedia will provide an editor to improve the accuracy of the transcripts.
  • SpokenMedia will enable you to use the transcripts in other applications like search, and will let you start playing a segment within a video. (Though I’m pretty sure YouTube will be using transcripts to help users find videos, and I personally think that’s the real driver behind auto-captions: search and keyword advertising. And if you know how to do it, you can construct a URL to link to a particular timepoint in a YouTube-hosted video.)
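As a rough illustration of the last point, here is a minimal sketch of building a link to a particular timepoint in a YouTube-hosted video. It assumes the `t` query parameter on YouTube watch URLs, which may not be the URL form in use when this post was written; the function name `youtube_link_at` and the `VIDEO_ID` placeholder are hypothetical.

```python
from urllib.parse import urlencode


def youtube_link_at(video_id: str, seconds: int) -> str:
    """Build a watch URL that starts playback `seconds` into the video.

    Assumes YouTube's `t` query parameter with a seconds value such as "90s".
    """
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    query = urlencode({"v": video_id, "t": f"{seconds}s"})
    return f"https://www.youtube.com/watch?{query}"


# "VIDEO_ID" is a placeholder, not a real video.
print(youtube_link_at("VIDEO_ID", 90))
```

Combined with a time-aligned transcript, the start time of a matching segment could be passed in as `seconds` to jump straight to the relevant part of a lecture.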

In any event, if you’ve watched the recent slidecasts of the last couple of SpokenMedia presentations, you’ll see that we’ve included the impact of Auto-Captions on SpokenMedia.

YouTube EDU and iTunesU

An interesting article on TechCrunch today about YouTube EDU and iTunesU.

YouTube has reported on the one-year anniversary of the launch of YouTube EDU:

MIT on YouTube EDU
Source: Brandon/YouTube EDU


YouTube EDU is now one of the largest online video repositories of higher education content in the world. We have tripled our partner base to over 300 universities and colleges, including University of Cambridge, Yale, Stanford, MIT, University of Chicago and The Indian Institutes of Technology. We have grown to include university courses in seven languages across 10 countries. We now have over 350 full courses, a 75% increase from a year ago and thousands of aspiring students have viewed EDU videos tens of millions of times. And today, the EDU video library stands at over 65,000 videos.

Source: YouTube. (2010, March 25). More Courses and More Colleges – YouTube EDU Turns One. Retrieved on March 25, 2010 from YouTube Website:

The TechCrunch article also lists the stats of iTunesU as 600 university partners and 250,000 videos.

IIHS Demo: How’d we do it?

Workflow Used in IIHS Demo
Source: Brandon Muramatsu


SpokenMedia and NPTEL: Initial Thoughts

During our trip to India in early January 2010, Brandon Muramatsu, Andrew McKinney and Vijay Kumar met with Prof. Mangala Sunder and the Indian National Programme on Technology Enhanced Learning (NPTEL) team at the Indian Institute of Technology-Madras.

The SpokenMedia project and NPTEL are in discussions to bring the automated lecture transcription process under development at MIT to NPTEL to:

  • Radically reduce transcription and captioning time (from 26 hours to as little as 2 hours).
  • Improve initial transcription accuracy via a research and development program.
  • Improve search and discovery of lecture video via transcripts.
  • Improve accessibility of video lectures for the diverse background of learners in India, and worldwide, via captioned video.


SpokenMedia at the IIHS Curriculum Conference

Brandon Muramatsu and Andrew McKinney presented on SpokenMedia at the Indian Institute for Human Settlements (IIHS) Curriculum Conference in Bangalore, India on January 5, 2010.

Along with Peter Wilkins, we developed a demonstration of SpokenMedia technology using automatic lecture transcription to transcribe videos from IIHS. We developed a new JavaScript player that allowed us to view and search transcripts, and that supports transcripts in multiple languages. View the demo.
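To give a sense of what searching transcripts in multiple languages involves, here is a minimal sketch in Python. It is not the demo's player, which was written in JavaScript; the `Segment` type, the `search` function, the language codes, and the sample text are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, NamedTuple, Sequence


class Segment(NamedTuple):
    """One time-aligned transcript segment (hypothetical data model)."""
    start: float           # seconds from the start of the video
    text: Dict[str, str]   # language code -> transcript text for this segment


def search(segments: Sequence[Segment], query: str, language: str) -> List[float]:
    """Return the start times of segments whose text in `language` contains `query`.

    Matching is a case-insensitive substring test; segments that have no text
    in the requested language never match.
    """
    needle = query.casefold()
    return [
        segment.start
        for segment in segments
        if needle in segment.text.get(language, "").casefold()
    ]


# Made-up sample data for illustration only.
segments = [
    Segment(0.0, {"en": "Welcome to the course.", "es": "Bienvenidos al curso."}),
    Segment(4.2, {"en": "Cities are growing quickly.", "es": "Las ciudades crecen con rapidez."}),
    Segment(9.8, {"en": "Growing cities need planning.", "es": "Las ciudades necesitan planes."}),
]

print(search(segments, "cities", "en"))
print(search(segments, "curso", "es"))
```

Each returned start time is a point the player could seek to, so a search hit doubles as a way to start playback at the matching part of the lecture.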

Cite as: Muramatsu, B., McKinney, A. & Wilkins, P. (2010, January 6). IIHS Open Framework-SpokenMedia. Presentation at the Indian Institute for Human Settlements Curriculum Conference: Bangalore, India, January 5, 2010. Retrieved January 23, 2010 from Slideshare Web site:

Unless otherwise specified, the Spoken Media Website by the MIT Office of Digital Learning, Strategic Education Initiatives is licensed under a Creative Commons Attribution 4.0 International License.