Five live-caption quality issues from the UK

Friday, 28 November 2014 10:08am

The quality of live captioning was a major topic at the recent Future of Subtitling conference, held in London on 10 November 2014. Media Access Australia CEO Alex Varley presented at the conference and gives his impressions of the main issues from the UK perspective.

Woman using live captioning set-up with microphone, headphones and computer

1. The regulator sees value in monitoring quality

The UK regulator Ofcom is actively monitoring live captioning and trialling new techniques, such as delaying the broadcast by 25 seconds so that captioners can prepare synchronised captions (this is on Welsh-language channel S4C). Ofcom is also assessing caption quality on randomly selected samples of live programs. The broadcasters do not know which programs will be picked, and the results are reported publicly.

According to Peter Bourton, Head of TV Content Policy at Ofcom, “We believe that even the process of monitoring these categories of programs has helped to focus the attention of the broadcasters and their access service contractors to maximise the quality and accuracy of their news products.”

2. The best way to improve live captioning is to avoid it

The Ofcom research has shown that the process of monitoring quality has encouraged broadcasters to focus on getting scripts, running orders and pre-recorded video packages to caption providers in advance. This allows prepared captions to be created and sent out in synchronisation with the program, so that live captions are only used for live crosses, live interviews and other truly live segments. The next Ofcom report will also analyse the delivery times for pre-recorded programs to see which are delivered so late that they have to be captioned live.

3. Access suppliers like Ericsson/Red Bee Media are working on continuous improvement

Improvements in speech recognition software assist with accuracy, as does having the time and information about what is in a live program for a captioner to prepare in advance. As more news programs use a mix of prepared and live captions, access suppliers will get better at making the transitions between the two types smoother, giving a better viewing experience. The key is greater cooperation between access suppliers and their broadcast clients, and the evidence is that in the UK this is rapidly improving, with some general prodding and encouragement from Ofcom.

4. Measuring quality and comparing results produces better captions

Dr Pablo Romero Fresco from the University of Roehampton is the creator of the NER model and is actively involved in the measurement of live caption quality. His research focuses on what causes errors in captions and whether software or training can overcome them. Omission errors (information left out of captions) are usually caused by very fast speech, people talking over each other and unscripted programs. These are difficult to overcome except through preparation. Recognition errors (where the wrong word is captioned), on the other hand, are related to speech recognition software. Research here is focused on how the English language works, and the findings will be shared with caption providers to improve their software. Latency (the delay between the soundtrack and the caption appearing) is generally around 6 seconds, which compares well with non-English speaking countries. There are also issues with prepared captions in live programs being displayed for too short a period to be read properly. Training and management can overcome these issues.
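The article does not spell out how the NER model arrives at a score. As a rough sketch, published descriptions of the model calculate accuracy as (N - E - R) / N x 100, where N is the number of words in the captions, E is the weighted sum of edition errors (such as omissions) and R is the weighted sum of recognition errors. The function name and the example severity weights below (0.25, 0.5 and 1.0 for minor, standard and serious errors) are illustrative assumptions, not something taken from the conference presentations.

```python
def ner_accuracy(word_count, edition_errors, recognition_errors):
    """Return an NER-style accuracy percentage.

    word_count         -- N, the number of words in the captions
    edition_errors     -- list of severity weights, one per edition error
    recognition_errors -- list of severity weights, one per recognition error

    The weights themselves (e.g. 0.25 minor, 0.5 standard, 1.0 serious)
    are assumed here for illustration.
    """
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    e = sum(edition_errors)       # E: weighted edition (e.g. omission) errors
    r = sum(recognition_errors)   # R: weighted recognition errors
    return (word_count - e - r) / word_count * 100


if __name__ == "__main__":
    # Hypothetical sample: 1,000 captioned words, three edition errors
    # and two recognition errors of varying severity.
    score = ner_accuracy(1000, [0.5, 0.25, 1.0], [0.25, 0.5])
    print(f"NER accuracy: {score:.2f}%")
```

Because each error is weighted by severity, a serious error that changes the meaning counts for more than a minor slip that most viewers would read past, which is what lets results be compared across programs and providers.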

5. Some presenters speak too fast, impacting on captions

A common problem is that not all live presenters are trained to speak at a comfortable speed. The industry norm of 180 words per minute (wpm) as the maximum caption speed is based on a trained newsreader or narrator. According to Romero Fresco, the biggest issue is with chat shows, where speeds of over 300 wpm have been found. Conventional speech recognition software works well up to about 200 wpm. Beyond this, good stenocaptioners can keep up and deliver captions at 250-300 wpm, but the captions are displayed too quickly for most people to be able to read. Access supplier Deluxe is doing some interesting work on fast sports commentators and demonstrated at the conference how it deals with Formula 1 racing. In essence, respeakers are used, but they condense the speech to more realistic reading speeds. This “editing on the fly” is a tricky skill to learn, but does improve comprehension. Romero Fresco’s research suggests that at 200 wpm for live scrolling captions, viewers spend 80% of their time reading the captions, leaving only 20% to watch what is happening on screen.
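The arithmetic behind these speeds is simple enough to sketch. The example below works out a speech rate in words per minute and estimates how much a respeaker would need to condense to bring it down to the 180 wpm norm. The function names are hypothetical, and treating condensation as a simple proportion of words dropped is a simplifying assumption; real respeakers rephrase rather than just delete words.

```python
def words_per_minute(text, duration_seconds):
    """Speech or caption rate for a passage lasting duration_seconds."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    word_count = len(text.split())  # whitespace-separated words
    return word_count * 60 / duration_seconds


def condensation_needed(spoken_wpm, target_wpm=180):
    """Fraction of words to cut so that captions run at target_wpm.

    Returns 0.0 when the speaker is already at or below the target.
    The 180 wpm default is the industry norm quoted in the article.
    """
    if spoken_wpm <= 0 or target_wpm <= 0:
        raise ValueError("rates must be positive")
    if spoken_wpm <= target_wpm:
        return 0.0
    return 1 - target_wpm / spoken_wpm


if __name__ == "__main__":
    # A chat show running at 300 wpm, as reported by Romero Fresco.
    fraction = condensation_needed(300)
    print(f"Words to cut: {fraction:.0%}")
```

On these assumptions, a 300 wpm chat show would need roughly two words in every five removed to reach 180 wpm, which gives a sense of why editing on the fly is hard to learn.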

According to Varley, the UK approaches can be used in other parts of the world to great effect. 

“The techniques used in the UK for captioning are the same the world over and broadcasters and access suppliers can apply them anywhere to good effect. A key difference is a proactive regulator that really understands the captioning process and is willing to monitor, encourage and openly report on approaches, rather than just focus on compliance and complaints.”

Media Access Australia’s white paper, Caption quality: International approaches to standards and measurement, looks at many of these issues, and compares the NER model to others that have been proposed.
