Wednesday, March 10, 2010

Tutorial on Automated Captions

First off, I want to commend Google for working to improve accessibility for the Deaf and hard of hearing in the United States and around the world. In this post, I want to explain how simple it is to add captions to YouTube videos. Fair warning to the reader, though: I use far too many words to explain things, but I'd like to think I've made the big picture much clearer and given the reader all the options they need to begin captioning their own videos or encouraging others to do so.

Here we go. If there is a YouTube video you would like to see that is not captioned (which likely means the video is a month old or more, as every video now uploaded to YouTube has automated captioning built in), you have three ways to go about it.

1) If the video is incredibly popular (50,000 hits or more) and you have time to wait, you can send the owner of the video a message asking whether he or she would be willing to add captions by uploading a transcript file that you send via e-mail. Always emphasize to the owner that the captions can be toggled on and off, as hearing people can be weird about seeing captions on videos. Keep your fingers crossed too, because some owners don't respond and others are inactive. I have been lucky with two owners, Marcaeld and Aravoth, who graciously agreed to add my text transcripts to their very highly viewed videos. Other folks may not be as lucky, as I was helped by the fact that both owners frequent the same political forum that I do and that we are philosophically aligned in many ways.

Here is Marcaeld's "Ron Paul and Stephen Baldwin Debate Marijuana", which has my text transcript file serving as the captions for the video. You will notice that some of his videos captured the actual closed captioning shown on screen when the segment aired on TV, but the problem is that real-time TV captions lag, which is why I created a more accurate and better-timed text transcript.

Ron Paul and Stephen Baldwin Debate Marijuana Laws:


Here is Aravoth's video with my text transcript. The problem here is that there is "dead air" for 10-15 seconds in many of Aravoth's videos, as he relies a lot on black backgrounds with white text and no video or sound other than music. This confuses YouTube's automated captioning tool, as you will see within this video. For folks who wish to create captions of their own, try to avoid uploading videos like this, unless you want to also take the time to personally edit the time codes within the video.

"When In The Course Of Human Events...:"


2) If you decide to do it yourself, either because the owner did not respond to your request or because you are in a hurry/bored, you need to google "download free flv youtube videos" and choose from the several links the best fit for you and your computer. I was partial to Moyeo, but it had bugs, and I've been using http://catchvideo.net/ since then (the download process is much slower, though). Once the program is installed, simply download a video from YouTube by copying and pasting the URL of the video into the program. Once the download is complete, go ahead and upload that same video to your own YouTube account. Voila, YouTube's new tools now apply automated captioning to it.

Here's the problem with automated captions that rely solely on voice recognition: they're barely 40% accurate at best (not that I'm complaining; if I need to repeat myself, I will -- BRAVO, GOOGLE!). In a few months to a couple of years at most, the technology should be perfected. In the meantime, this is what we have. The automated captions give us a fairly good sense of what is being discussed, but they are prone to distorting what is actually said. At times it almost looks like a second grader is writing the captions. An example:

Ron Paul to Barack Obama: Don't Assassinate US Citizens!: (You may have to open the video to a new window, click on the CC button, and select "transcribe audio" to activate the voice-recognition non-transcript guided captions)


3) This is where the text transcript comes in. If you can find a transcript of the dialogue, all you have to do is copy and paste it into a Notepad document. I get rid of the "John: ______" "Rob: ________" speaker labels and everything else except what is actually said. I then do a little brush-up work on the transcript by breaking it up into cohesive sentences, e.g.

Barry met Sally who met Robert. Then, they went to the store. Then they all bought Ice Cream. Sally ended up getting sick and went home early

becomes:
Barry met Sally who met Robert. Then, they went to the store.

Then they all bought Ice Cream.

Sally ended up getting sick and went home early
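
For anyone who would rather not do this cleanup by hand, here is a rough Python sketch of the same two steps: stripping speaker labels and putting the sentences on separate lines. It is only an illustration and not something I used for the videos here. The function name clean_transcript and the SPEAKER_LABEL pattern are made up for this example, and it assumes each speaker label sits at the start of a line in the form "Name: ". It also splits after every sentence, whereas by hand I sometimes keep two short sentences together.

```python
import re

# Assumed label format: a capitalized name of up to ~30 characters followed by
# a colon at the start of a line, e.g. "John: " or "Rob: ".
# Caveat: a spoken line that itself starts with a short capitalized phrase and
# a colon would also be trimmed, so check the result by eye.
SPEAKER_LABEL = re.compile(r"^\s*[A-Z][\w .'-]{0,30}:\s+")


def clean_transcript(raw: str) -> str:
    """Remove speaker labels and place each sentence in its own paragraph."""
    spoken = []
    for line in raw.splitlines():
        line = SPEAKER_LABEL.sub("", line).strip()
        if line:  # skip lines that are empty once the label is removed
            spoken.append(line)

    text = " ".join(spoken)
    # Split wherever ., ! or ? is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    # A blank line between sentences matches the layout shown above.
    return "\n\n".join(s for s in sentences if s)


if __name__ == "__main__":
    raw = (
        "John: Barry met Sally who met Robert. Then, they went to the store.\n"
        "Rob: Then they all bought Ice Cream.\n"
        "Sally ended up getting sick and went home early"
    )
    print(clean_transcript(raw))
```

Save whatever it prints as a plain .txt file and give it a read-through before uploading, since a simple pattern like this can trip on abbreviations such as "Mr." or "U.S.".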

Once I feel satisfied, I save the file. I then go to "My Videos" on my YouTube account and click on "Captions". I then click on "Add New Captions or Transcript", select the text transcript I just worked on, click on "transcript file", click on the empty "Name (optional)" field (leaving it blank), and finally click on "upload file". Now, YouTube uses both the transcript text file AND the voice recognition software to create much more professional and appealing captions.

Here's the same video from above, now "guided" with a text transcript found somewhere on the web.

Ron Paul to Barack Obama: Don't Assassinate US Citizens!:


After doing it a few times, the process becomes so simple, it's scary. In one or two years, things will be amazing in terms of where voice recognition software takes the deaf community.