Can machines make audio description easier?

Tuesday, 11 December 2012, 4:02pm

Whether automated tools help or hinder the production of access services such as audio description was a much-discussed topic at the Languages and the Media conference held in Berlin in November.

The main issue is how replacing a human voice with a synthetic one affects the viewer’s experience. This and a number of other potential issues are discussed in more detail below.

Are machine voices less effective than human voices?

Beatrice Caruso of SWISS TXT Ltd argues that we should take our cues from the intended audience for audio description. Blind people are used to the artificial voices already found in screen readers, navigation devices and smartphones (often with a choice of voice). These voices have also become more ‘natural’ over time.

Aline Remael from the University of Antwerp has looked at the translation of audio description from one language to another, and at how audio description works with audio subtitles (where the subtitles on a subtitled film are read out so that a blind person can access them). She suggests that machine voices can be used to set the audio subtitling and audio description apart from the human dialogue in the soundtrack, making it easier to tell the different elements apart. Remael speculates that this could be taken further in the future by using multiple artificial voices.

Are there cost efficiencies from using machine voices?

Caruso argues that replacing the voice talent with a machine removes the cost of paying that talent. However, this is not the only saving. Because no human has to be in a studio at a particular time, you can edit, fix and modify the description at any time and the machine will ‘revoice’ it as required. This also means that editors and quality controllers can work in their own time and off-site.

Anecdotally, the Canadians report that the voiceover talent is paid six times more per hour than the person creating the audio description. It should be noted that the recording of the voice is a small portion of the audio description task. In many cases, the person writing the description also records it.

Is the quality from a machine lower than from a human?

Again, Caruso argues that poor quality is inevitably down to poor scripting, not the method used for the voiceover. Furthermore, the speech synthesis system used by SWISS TXT Ltd flags any words it does not know, which then have to be transcribed phonetically to ensure that they are pronounced properly. These additions to the system’s lexicon are saved, so the synthesis engine improves continually.

Where to next?

Matamala, Fernandez and Cortiz from the Universitat Autonoma de Barcelona are doing further research into how acceptable and comprehensible blind people find audio description voiced by machines. They are working primarily in Catalan and Spanish, which have a smaller choice of synthesised voices than English. However, their preliminary work suggests that synthesised voices are acceptable to blind audiences.

What does the Australian expert say?

Chris Mikul, project manager at Media Access Australia, is a qualified audio describer and has trained people in Australia and New Zealand in how to audio describe. “The voiceover component of audio description is only a small part of the overall task. Clearly, production companies will look for ways to save costs and synthetic voicing can generate some small cost savings.”

“I take the view that we need to encourage more audio description on television, in cinemas and online. Anything that helps to reduce barriers to that uptake is a good thing. I’m confident that there is enough work being done to ensure that the scripts are good quality and that flows on to good voice description, whether that is done by a human or a machine.”
 

For a more detailed look at how audio description is written and performed, see our feature Aussie TV finds its voice.

