Text-to-Speech Narration is Being Forced on Audio Description Users

In a cartoon restaurant, a robot with a television for a head puts its hands on its hips beside SpongeBob. The anthropomorphic yellow sponge frowns deeply, hangs his arms at his sides, and looks sideways at the robot.

A debate continues among audio description users: Should audio description (AD) narrators perform in a neutral style that mirrors the objective quality of description, or opt for a more performance-oriented cadence that reacts to each scene’s tone? A case can be made for either style but, despite this disagreement, AD users seem to agree on at least one thing: Text-to-Speech (TTS) narration is terrible.

 

Users’ complaints about audio description are often peeves, issues that could use some massaging to improve the experience by a small degree. However, grievances concerning a TTS narrator nearly always describe a ruined experience and an inability to suffer through this type of narration.

 

If it seems obvious that a grating computer voice is no substitute for mellifluous human tones, that’s because it is. The thousands of complaints and internet comments on the subject merely confirm what is all but a fact.

 

Given the obvious drop in quality from a human voice actor to a TTS narrator, we must conclude that the latter’s use is willful ignorance on the part of providers. It’s especially upsetting that the offending streaming services use a mixed bag of TTS and human description. This tactic intentionally makes it harder for consumers to ‘vote with their dollars’ because in on-demand marketplaces they have no way of knowing if a title has TTS description before purchasing it.

 

This issue widens a familiar chink in the armor of the otherwise fabulous 21st Century Communications and Video Accessibility Act. The act specifies that a certain percentage of a company’s content must be described but does not ensure the quality of the audio description. As a result, companies that provide description only to stay out of legal trouble have free rein to produce unlistenable audio description narration tracks.

 

If litigation is the only thing that will motivate some folks, we’ve got to implement legislative protections against the production of this type of low-end content. Therefore, I urge the reader to reach out to the American Council of the Blind or similar organizations that consolidate the voice of the VI community. Make these representatives aware of the egregiousness of this issue and how common your grievance is.

 

Some visually impaired users think that a Devil’s bargain can be struck. They believe that while TTS is lower quality, its automated nature would make audio description proliferate more quickly. This misconception stems from some users’ belief that text-to-speech audio description is also written by a computer. This is not so. A program advanced enough to decide what images best serve a visual story and craft a supplemental narrative has not yet been built. Given that scriptwriting is the longest, most costly part of audio description production, further implementing TTS would have only a marginal impact on audio description’s availability.

 

Wadjet will never produce a description track with a TTS narrator. We are committed to hiring wonderful performers who not only voice our scripts with verve and style, but who also reflect the cultures and life experiences represented in the programs and visual narratives we proudly make accessible.

1 comment

  1. Thanks for illuminating the subject of writing costs and for this:

    “…TTS is lower quality. It’s automated nature would proliferate audio description more quickly. This misconception stems from some users’ belief that text-to-speech audio description is also written by a computer. This is not so.”
