SpokenMedia – Transcripts for Lecture Video
https://spokenmedia.mit.edu

Video from OCWC Global Presentation
https://spokenmedia.mit.edu/2010/05/video-from-ocwc-global-presentation/
Wed, 26 May 2010

The OpenCourseWare Consortium has posted the video of our talk during OCWC Global 2010 in Hanoi, Vietnam.

Cite as: Muramatsu, B., McKinney, A. & Wilkins, P. (2010, May 5). Opening Up IIHS Video with SpokenMedia. Presentation at OCWC Global 2010: Hanoi, Vietnam, May 5, 2010. Retrieved May 6, 2010 from Vimeo Web site: http://vimeo.com/11969270
SpokenMedia at OER10
https://spokenmedia.mit.edu/2010/03/spokenmedia-at-oer10/
Tue, 23 Mar 2010

Brandon Muramatsu presented on SpokenMedia at the OER10 Conference in Cambridge, UK on March 23, 2010.

Cite as: Muramatsu, B., McKinney, A. & Wilkins, P. (2010, March 23). Improving the OER Experience: Rich Media Notebooks of OER Video and Audio. Presentation at OER10: Cambridge, UK, March 23, 2010. Retrieved March 23, 2010 from Slideshare Web site: http://www.slideshare.net/bmuramatsu/improving-the-oer-experience-enabling-rich-media-notebooks-of-oer-video-and-audio
SpokenMedia at NERCOMP 2010
https://spokenmedia.mit.edu/2010/03/spokenmedia-at-nercomp-2010/
Tue, 09 Mar 2010

Brandon Muramatsu, Andrew McKinney and Peter Wilkins presented on SpokenMedia at the NERCOMP 2010 Conference in Providence, Rhode Island on March 9, 2010.

Cite as: Muramatsu, B., McKinney, A. & Wilkins, P. (2010, March 9). SpokenMedia: Automatic Lecture Transcription and Rich Media Notebooks. Presentation at NERCOMP 2010: Providence, Rhode Island, March 9, 2010. Retrieved March 9, 2010 from Slideshare Web site: http://www.slideshare.net/bmuramatsu/spokenmedia-automatic-lecture-transcription-and-rich-media-notebooks
SpokenMedia and NPTEL: Initial Thoughts
https://spokenmedia.mit.edu/2010/01/spokenmedia-and-nptel-initial-thoughts/
Sun, 31 Jan 2010

During our trip to India in early January 2010, Brandon Muramatsu, Andrew McKinney and Vijay Kumar met with Prof. Mangala Sunder and the Indian National Programme on Technology Enhanced Learning (NPTEL) team at the Indian Institute of Technology-Madras.

The SpokenMedia project and NPTEL are in discussions to bring the automated lecture transcription process under development at MIT to NPTEL to:

  • Radically reduce transcription and captioning time (from 26 hours to as little as 2 hours).
  • Improve initial transcription accuracy via a research and development program.
  • Improve search and discovery of lecture video via transcripts.
  • Improve accessibility of video lectures for the diverse background of learners in India, and worldwide, via captioned video.


NPTEL, the National Programme on Technology Enhanced Learning, is a program funded by the Indian Ministry of Human Resource Development and a collaboration among a number of participating Indian Institutes of Technology. As part of Phase I, they have published approximately 4,500 hours of lecture videos in engineering courses that comply with the model curriculum suggested by the All India Council for Technical Education. In an even more ambitious Phase II, they plan to add approximately 40,000 additional hours of lecture video for science and engineering courses.

The current NPTEL transcription and captioning process is labor and time intensive. During our discussions, we learned that it takes approximately 26 hours on average to transcribe and caption a single hour of video. Even with the initial hand transcription, they are averaging 50% accuracy.

[Figure: NPTEL Current Transcription and Captioning Process. Source: Brandon Muramatsu]

We discussed using the untrained SpokenMedia software to improve the efficiency of this initial process. Our initial experiments suggest that the untrained recognizer can achieve 40-60% accuracy, which is in the same range as the current NPTEL hand process. We therefore propose replacing the hand transcription and captioning steps with a two-step recognition and editing process. On a single processor, the recognition step takes on the order of 1.5 hours per hour of video. Combining this estimate with the existing editing time in use at NPTEL, the overall process might be reduced from 26 hours to approximately 10 hours.
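The arithmetic behind that estimate can be sketched directly. This is a rough back-of-the-envelope calculation in Python; the 8.5-hour editing figure is an assumption inferred from the approximately 10-hour total, not a measured value:

```python
# Rough estimate of the proposed NPTEL workflow, per hour of lecture video.
# All figures are the approximations quoted above, not measurements.

current_total = 26.0   # current hand transcription + captioning, hours

# Proposed process: automated recognition followed by hand editing.
recognition = 1.5      # recognizer time on a single processor, hours
editing = 8.5          # assumed residual editing time (inferred, ~10 h total)

proposed_total = recognition + editing
print(f"current: {current_total:.0f} h, proposed: {proposed_total:.0f} h")
print(f"savings: {current_total - proposed_total:.0f} h per hour of video")
```

With improved acoustic models and a parallelized recognition step, the recognition and editing terms would both shrink toward the 2-hour goal discussed below.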

[Figure: Initial Improved NPTEL Transcription and Captioning Process. Source: Brandon Muramatsu]

We also discussed initiating an applied research and development project to create baseline acoustic (speaker) models for Indian English for use in the automated lecture transcription process, with the goal of improving automated transcription accuracy to the same range as American English transcription (as high as 80-85% accuracy). The use of improved acoustic models and parallelizing the recognition process might reduce the total transcription time to approximately 2 hours.

[Figure: Goal NPTEL Transcription and Captioning Process. Source: Brandon Muramatsu]

SpokenMedia at the IIHS Curriculum Conference
https://spokenmedia.mit.edu/2010/01/spokenmedia-at-iihs-curriculum-conference/
Sat, 23 Jan 2010

Brandon Muramatsu and Andrew McKinney presented on SpokenMedia at the Indian Institute for Human Settlements (IIHS) Curriculum Conference in Bangalore, India on January 5, 2010.

Along with Peter Wilkins, we developed a demonstration of SpokenMedia technology using automatic lecture transcription to transcribe videos from IIHS. We developed a new JavaScript player that allowed us to view and search transcripts, and that supports transcripts in multiple languages. View the demo.

Cite as: Muramatsu, B., McKinney, A. & Wilkins, P. (2010, January 6). IIHS Open Framework-SpokenMedia. Presentation at the Indian Institute for Human Settlements Curriculum Conference: Bangalore, India, January 5, 2010. Retrieved January 23, 2010 from Slideshare Web site: http://www.slideshare.net/bmuramatsu/iihs-open-frameworkspoken-media
SpokenMedia at EdTech Fair
https://spokenmedia.mit.edu/2009/10/spokenmedia-at-edtech-fair/
Fri, 16 Oct 2009

Brandon Muramatsu, Andrew McKinney and Phillip Long presented at the EdTech Fair at MIT in Cambridge, MA on October 14, 2009. We provided an ongoing demonstration of the automated lecture transcription, search and playback functions of the SpokenMedia project.

Welcome to the SpokenMedia Project
https://spokenmedia.mit.edu/2009/10/welcome-to-the-spokenmedia-project/
Tue, 13 Oct 2009

Introduction

The SpokenMedia Project is developing a software application suite and web-based service that automatically creates transcripts of academic-style lectures and provides the basis for a rich media notebook for learning. The system takes lecture media in standard digital formats, such as .mp4 and .mp3, and processes them to produce a searchable archive of digital video- and audio-based learning materials. The system allows for ad hoc retrieval of the media stream associated with a section of the audio track containing the target words or phrases. The system plays back the media, presenting the transcript of the spoken words synchronized with the speaker’s voice and marked by a cursor that follows along in sync with the lecture audio. The project’s goal is to increase the effectiveness of web-based lecture media by improving the search and discoverability of specific, relevant media segments and enabling users to interact with rich media segments in more educationally relevant ways.
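As a sketch of the ad hoc retrieval idea described above, the Python fragment below searches a time-coded transcript for a phrase and returns the time at which it is first spoken, which a player could use to seek the media. The (start_seconds, word) pair format is a hypothetical representation for illustration, not the project's actual data format:

```python
def find_phrase(transcript, phrase):
    """Return the start time of the first occurrence of `phrase`, or None."""
    words = [w.lower() for _, w in transcript]
    target = phrase.lower().split()
    for i in range(len(words) - len(target) + 1):
        if words[i:i + len(target)] == target:
            return transcript[i][0]  # start time of the first matched word
    return None

# A tiny made-up transcript: (start time in seconds, word) pairs.
transcript = [(12.0, "today"), (12.4, "we"), (12.6, "discuss"),
              (13.1, "linear"), (13.6, "algebra")]
print(find_phrase(transcript, "linear algebra"))  # → 13.1
```

A player would then seek the media stream to the returned time and start the synchronized transcript cursor from there.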

Where does it fit in?

The system is envisioned as a service that can be integrated directly into individual campus podcasting solutions; the architecture will be flexible enough to integrate with existing workflows associated with lecture recording systems, learning management systems and repositories. The service is intended to plug in to the processing stage of a simplified podcast workflow, as illustrated below. A goal of this project is to integrate the system as part of a Podcast Producer-based workflow using an underlying Xgrid to perform the automated speech recognition processing.

[Figure: (Simplified) Workflow with Transcript Creation. Source: Brandon Muramatsu]

How does it work?

The process for creating media-linked transcripts, as illustrated below, takes as inputs the lecture media, a domain model containing words likely to be used in the lecture, and a speaker model selected to most closely match the speaker(s) in the lecture. The output from the speech processor is an XML file containing the words spoken and their time codes. The time-coded transcripts and lecture media are brought back together and are viewable through a rich media browser.
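The exact XML schema of the speech processor's output is not specified here, so the sketch below assumes a hypothetical format with one `word` element per recognized word and a `start` attribute in seconds. It shows how such a file could be turned into the time-coded (start, word) pairs a rich media browser would consume:

```python
import xml.etree.ElementTree as ET

# A hypothetical speech-processor output: one <word> element per recognized
# word, with its start time in seconds (illustrative format, not the real schema).
SAMPLE = """
<transcript>
  <word start="0.00">welcome</word>
  <word start="0.55">to</word>
  <word start="0.70">the</word>
  <word start="0.85">lecture</word>
</transcript>
"""

def parse_transcript(xml_text):
    """Return a list of (start_seconds, word) pairs from the XML transcript."""
    root = ET.fromstring(xml_text)
    return [(float(w.get("start")), w.text) for w in root.iter("word")]

for start, word in parse_transcript(SAMPLE):
    print(f"{start:6.2f}  {word}")
```

The resulting pairs are what the player needs to keep its cursor in sync with the audio and to answer word-level searches.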

[Figure: SpokenMedia Workflow. Source: Brandon Muramatsu]

Key steps in the workflow include the use of a collection of texts, such as lecture notes, slide presentations, reference papers and articles, that are processed to extract all key words and phrases to form a domain model. The creation of individual speaker models (such as from faculty teaching term-long courses) increases the accuracy of the recognizer and therefore transcription accuracy. The speech processing is multithreaded to process audio file segments in parallel; similar to video processing in existing Podcast Producer workflows, the speech processing is suited for an Xgrid or cloud-based (e.g., Amazon EC2) deployment.
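The segment-level parallelism described above can be sketched as follows. `recognize_segment()` is a hypothetical stand-in for the real recognizer, and a local thread pool stands in for the Xgrid or cloud (e.g., Amazon EC2) deployment; the point is only that independent segments can be processed concurrently and reassembled in order:

```python
from concurrent.futures import ThreadPoolExecutor

def recognize_segment(segment):
    # Hypothetical stand-in for the speech recognizer: returns the segment's
    # index and a fake transcript chunk so the structure is runnable.
    index, audio_file = segment
    return index, f"<transcript for {audio_file}>"

# Pretend the lecture audio has already been split into four segments.
segments = [(i, f"lecture.part{i}.wav") for i in range(4)]

# Segments are independent, so they can be recognized in parallel and the
# results sorted back into lecture order afterwards.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = sorted(pool.map(recognize_segment, segments))

full_transcript = " ".join(chunk for _, chunk in results)
print(full_transcript)
```

With N workers the wall-clock recognition time falls roughly by a factor of N, which is the mechanism behind the 2-hour goal in the NPTEL discussion.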

SpokenMedia: Content, Content Everywhere…What video? Where? at OpenEd 2009
https://spokenmedia.mit.edu/2009/08/spokenmedia-content-content-everywhere-what-video-where-at-opened-2009/
Thu, 13 Aug 2009

Brandon Muramatsu presented on SpokenMedia at the Open Education 2009 Conference in August 2009 in Vancouver, British Columbia, Canada.

Cite as: Muramatsu, B. (2009). SpokenMedia: Content, Content Everywhere…What video? Where?: Improving the Discoverability of OER video and audio lectures. Presentation at the Open Education 2009: Vancouver, British Columbia, August 12, 2009. Retrieved August 17, 2009 from Slideshare Web site: http://www.slideshare.net/bmuramatsu/spokenmedia-content-content-everywherewhat-video-where-at-opened-2009#?type=presentation

In the uStream video below, the SpokenMedia presentation starts at about 19:30 in. The first part of the presentation is Mara Hancock from UC Berkeley talking about Opencast Matterhorn. (Unfortunately they forgot to start saving the stream at the start of her talk.)

Cite as: Muramatsu, B. (2009). SpokenMedia: Content, Content Everywhere…What video? Where? Presentation at Open Education 2009: Vancouver, British Columbia, August 12, 2009. Retrieved August 17, 2009 from uStream Web site: http://www.ustream.tv/flash/video/1972941
SpokenMedia Project: Media-Linked Transcripts and Rich Media Notebooks for Learning and Teaching at T4E 2009
https://spokenmedia.mit.edu/2009/08/spokenmedia-project-media-linked-transcripts-and-rich-media-notebooks-for-learning-and-teaching-at-t4e-2009/
Fri, 07 Aug 2009

Brandon Muramatsu presented on SpokenMedia three times in India in August 2009: at the 2009 Technology for Education Workshop, Microsoft Research India, and the IEEE Computer Society Bangalore Section.

The presentation to the IEEE-CS Bangalore Section was the best of the three; this material really wants to be an hour long, and we got great questions from the audience. Unfortunately I forgot to record the presentation; it would have made a great slidecast.

Embedded below is the presentation to the Technology for Education 2009 Conference, the one with a slidecast.

Cite as: Muramatsu, B. (2009). SpokenMedia Project: Media-Linked Transcripts and Rich Media Notebooks for Learning and Teaching at T4E 2009. Presentation at the Technology for Education Conference, Bangalore, India, August 4, 2009. Retrieved August 6, 2009 from Slideshare Web site: http://www.slideshare.net/bmuramatsu/spokenmedia-project-medialinked-transcripts-and-rich-media-notebooks-for-learning-and-teaching
Building Community for Rich Media Notebooks: The SpokenMedia Project at NMC 2009
https://spokenmedia.mit.edu/2009/06/building-community-for-rich-media-notebooks-the-spokenmedia-project-at-nmc-2009/
Sat, 20 Jun 2009

Brandon Muramatsu and Phillip Long presented at the NMC Summer Conference in Monterey, CA on June 12, 2009.

Cite as: Muramatsu, B., McKinney, A., Long, P.D. & Zornig, J. (2009, June 12). Building Community for Rich Media Notebooks: The SpokenMedia Project. 2009 New Media Consortium Summer Conference, Monterey, CA on June 12, 2009. Retrieved October 13, 2009 from Slideshare Web site: http://www.slideshare.net/bmuramatsu/building-community-for-rich-media-notebooks-the-spokenmedia-project