Video and Accessibility / Closed Captioning

Now that video production and distribution are within reach of all of us, we need to learn more about the tools that make videos 508 compliant. My halting efforts are detailed below:

There’s a paucity of software applications that convert sound into text. The $300 Dragon NaturallySpeaking 10 is highly rated, but does not offer a sample download. So I was stuck with the $39, try-it-you’ll-like-it Wave-To-Text Converter from United Research Labs. After 30 minutes of churning on a 2-minute sound file, it came up with this cryptic gem:

alley and governmental action program will win infants and children the ninth and Nevada -hyphen Agers is vents nutrition counseling health care referral residents of violent authorized under everyone likes is still a shares every responsibility act like to live and work program the ice storm drains that a nutrition to mothers don’t because of a violent and it’s important to understand properly only the lonely transaction the procedure for checking out is that it’s not high and that’s a quiet little more things that incorrectly that could cause the stolen announced on Tuesday that costs the swift.

Here’s the real text from ‘Cashiers Make the Difference’:

WIC is a federally funded supplemental nutrition program for women, infants, and children.
WIC provides, at no cost to the participants, nutritious food, nutrition counseling, healthcare referrals, and breastfeeding support.
For a WIC authorized vendor, everyone who works in the store shares a great responsibility. By participating in the WIC program, your store brings better nutrition to mothers and their children. Because of the vital role that the vendor has, it’s important to understand the proper way to handle a WIC transaction. The procedure for checking out is different; it’s not hard, but it does require a little more attention. If done incorrectly, it can cost the store money, and that sometimes means it costs you.

So much for machine aided stenography 😉
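To put a rough number on how far off a machine transcript is, one common measure is word error rate: the number of word insertions, deletions, and substitutions needed to turn the machine output into the real text, divided by the length of the real text. What follows is only a minimal sketch, not something I ran as part of this test; the function name `word_error_rate` is made up for illustration, and it splits on whitespace without stripping punctuation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Naive tokenization: lowercase and split on whitespace.
    # Punctuation stays attached to words, which slightly inflates the score.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference text must contain at least one word")

    # Standard dynamic-programming edit distance, one row at a time.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1      # reference word missing from output
            insertion = current[j - 1] + 1  # extra word in the output
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


if __name__ == "__main__":
    real = "WIC is a federally funded supplemental nutrition program"
    machine = "alley and governmental action program"
    print(f"word error rate: {word_error_rate(real, machine):.2f}")
```

A score of 0 means a perfect transcript. The score can exceed 1 when the machine output contains many extra words.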

Using the freeware that Google suggests for editing the closed-caption text, which basically puts a phrase-based timestamp on the video, I was able to link the video and caption file without difficulty.
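For anyone curious what a phrase-based caption file looks like, here is a minimal sketch that writes one in the SubRip (.srt) layout, one of the common caption formats. The choice of format is my assumption for illustration, as are the helper names `format_timestamp` and `build_srt` and the cue timings, which are invented and not taken from the actual video.

```python
def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SubRip timestamp form HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def build_srt(cues) -> str:
    """Build SubRip text from (start_seconds, end_seconds, phrase) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical timings; the phrases come from the real transcript.
    cues = [
        (0.0, 5.0, "WIC is a federally funded supplemental nutrition program"),
        (5.0, 8.0, "for women, infants, and children."),
    ]
    with open("captions.srt", "w", encoding="utf-8") as handle:
        handle.write(build_srt(cues))
```

Each cue is a sequence number, a start and end time, and the phrase to display, so the file stays easy to fix by hand in a text editor.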

When a closed-caption video is uploaded to YouTube, a ‘cc’ icon appears in the lower-right of the screen. Alternatively, you can embed the captions directly into the movie using two free Windows tools, DivXLand Media Subtitler and VirtualDub.

The end result looks like this (captioning done in first minute of video; click on ‘cc’ to toggle):

Allen Sheaprd

Thank you for the update.

I saw one CC version that runs on two PCs. It not only does speech to text but also inserts the speaker’s ID you put in. It will also add metadata like “Door Bell sounds” or “Music plays” that is needed but not always included.

HHS has a pretty good version they use for their webcasts – but it has trouble with some words and technical terms.

Roy Stiles

We have been working to conquer the same problems on a larger scale. From a price perspective, it will be MUCH cheaper to find an automated solution with an upfront cost than one that charges you based on throughput, as a transcription service will. Automated solutions like the ones you’re trying, as well as the higher-end, pricier stuff that we’ve looked at, all have the same issue with accuracy and require varying levels of human intervention to make corrections. More expensive solutions are more accurate for the most part, but it takes the same amount of time, if not more, for a human to check and correct an inaccurate transcript as it does to just create it from scratch on your own.

There is a company that I’ve looked at that provides a blend of automated transcription and human verification at a lower cost than 100% human transcription. 3playmedia.com is the site, and they may be a good fit for occasional media producers that don’t need to automate their processes. Another company that claims an automated solution with a high accuracy rate and inline captioning on the fly is VidiTalk, but what I’ve seen so far leads me to think they are more in the R&D phase than in execution.

Whoever cracks the code for automated transcription with timecoded embedded captioning will have the holy grail of 508 compliance within the media industry. I don’t believe anyone is truly there yet, but there should be some promising stuff on the horizon as increased use of media in government pushes that demand higher.

Allen Sheaprd

MSG Roy,

That is a good point – real time vs. recorded. From what I have seen, the recorded playback is better.

It is not only 508 compliance; having second-language subtitles is another high goal. Interesting that it’s CC for the deaf and subtitles for the hearing.