Translation, 7 min read

Real-time translation in the newsroom: what works in 2026

Cascaded ASR plus machine translation is not new, but the gap between a demo and production is still enormous. Lessons from real newsroom deployments.

Frans Olsthoorn
Founder & CEO, Scriptix

Real-time translation looks easy in a demo and very hard in a live newsroom. A demo runs on clean audio, a cooperative speaker and a single language pair. A newsroom runs on a fast-moving anchor, crosstalk from a remote correspondent, three guests on a panel and a tight on-air budget. Three things explain most of the gap between those two worlds.

1. The pipeline has to commit

ASR engines emit partial hypotheses that get rewritten as more context arrives. A translation engine that retranslates every revision produces unreadable, flickering captions. A production system needs a stable commit policy: when do we lock a segment, send it for translation and stop changing it?

The naive answer (commit at every full stop) breaks down on long sentences and on speakers who do not pause cleanly. A better policy commits on a combination of acoustic silence, syntactic completeness and a maximum hold time. Tuning that policy is half the engineering work in a real deployment.
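A minimal sketch of such a combined policy, in Python, may make the trade-off concrete. Everything here is an assumption for illustration: the class name `CommitPolicy`, the threshold values, and the punctuation check, which is only a crude stand-in for real syntactic completeness. It is not a description of any particular production system.

```python
from dataclasses import dataclass


@dataclass
class CommitPolicy:
    # All thresholds are illustrative assumptions, not tuned values.
    min_silence_s: float = 0.4            # pause length treated as acoustic silence
    max_hold_s: float = 4.0               # never hold a segment longer than this
    terminators: tuple = (".", "?", "!")  # crude proxy for a finished sentence

    def looks_complete(self, text: str) -> bool:
        # Stand-in for syntactic completeness: ends in sentence punctuation.
        return text.rstrip().endswith(self.terminators)

    def should_commit(self, text: str, silence_s: float, held_s: float) -> bool:
        """Decide whether the current ASR hypothesis should be locked."""
        if not text.strip():
            return False  # nothing to commit yet
        if held_s >= self.max_hold_s:
            return True   # hard ceiling keeps caption latency bounded
        # Otherwise require both a pause and a complete-looking sentence.
        return silence_s >= self.min_silence_s and self.looks_complete(text)


policy = CommitPolicy()
print(policy.should_commit("The vote passed.", silence_s=0.5, held_s=1.0))
print(policy.should_commit("and then the minister", silence_s=0.0, held_s=4.5))
```

The maximum hold time acts as a safety valve: a speaker who never pauses still gets captions, at the cost of an occasional mid-sentence break.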

2. Context windows decide quality

Translating a single sentence in isolation discards the topical context that human translators rely on. Modern systems carry a rolling context of recent segments and pass it to the MT engine so that pronouns resolve, terminology stays consistent and tense stays coherent across a long passage. The difference in fluency is large, and it is one of the easiest wins available to anyone running a translation pipeline today.
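The rolling context can be sketched as a small bounded buffer in front of the MT call. This is a hedged illustration only: `RollingContext` is a hypothetical helper, and `mt` stands for any callable that accepts a segment plus a list of preceding source segments. Real MT engines expose context in different ways, so the signature is an assumption.

```python
from collections import deque
from typing import Callable, List


class RollingContext:
    """Keeps the last few committed source segments as MT context."""

    def __init__(self, max_segments: int = 3):
        # deque with maxlen silently drops the oldest segment when full.
        self._recent = deque(maxlen=max_segments)

    def translate(self, segment: str, mt: Callable[[str, List[str]], str]) -> str:
        # Hand the engine the preceding segments, oldest first.
        context = list(self._recent)
        translated = mt(segment, context)
        # Only after translating does the segment become context itself.
        self._recent.append(segment)
        return translated


def echo_mt(segment: str, context: List[str]) -> str:
    # Hypothetical stand-in for an MT engine; shows how much context it received.
    return f"[{len(context)} ctx] {segment}"


ctx = RollingContext(max_segments=2)
for line in ["The minister arrived.", "She declined to comment.", "Her office will respond."]:
    print(ctx.translate(line, echo_mt))
```

Keeping the window small is a deliberate choice in this sketch: a longer window gives the engine more to work with but adds latency and cost per segment.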

3. Domain adaptation still wins

Generic MT struggles with broadcaster-specific terminology, named people, place names and house style. A glossary and a customer-specific MT model meaningfully reduce post-edit effort, often by 30 percent or more in the deployments we have measured. For newsrooms that subtitle in three or four target languages, that is the difference between staffing a live editing desk and not.
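One simple way to see a glossary at work is a post-translation check that flags segments where a required term was not used. The sketch below assumes a plain source-to-target dictionary and a hypothetical function name, `glossary_violations`; the example entries are invented for illustration. It only detects misses. It does not constrain the MT engine itself, which is what a customer-specific model or constrained decoding would do.

```python
import re
from typing import Dict, List


def glossary_violations(source: str, translation: str, glossary: Dict[str, str]) -> List[str]:
    """Return source terms whose required target term is missing from the translation."""
    missing = []
    for src_term, tgt_term in glossary.items():
        # Whole-word, case-insensitive match of the source term.
        in_source = re.search(rf"\b{re.escape(src_term)}\b", source, re.IGNORECASE)
        if in_source and tgt_term.lower() not in translation.lower():
            missing.append(src_term)
    return missing


# Illustrative glossary entries, not taken from any real deployment.
glossary = {"Tweede Kamer": "House of Representatives", "kabinet": "cabinet"}

print(glossary_violations(
    "De Tweede Kamer stemt vandaag.",
    "The Second Chamber votes today.",
    glossary,
))
```

A check like this could feed a live editing desk, so that editors look only at the segments where house terminology was missed.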

What Scriptix ships

Scriptix combines streaming ASR with an MT layer tuned for broadcast latency and a controlled commit policy, so captions stay legible on air. Glossaries lock in terminology per programme or per beat. Both layers can be customer-tuned, and both can be deployed inside the customer's tenant when residency or sovereignty requires it. The result is a translation pipeline that survives the messy reality of live news without flickering its way through every breaking story.

About the author

Frans Olsthoorn founded Scriptix in 2010 and has spent more than fifteen years shipping speech recognition into European broadcasters, courts and government bodies. He writes about ASR, accessibility regulation and the realities of running AI workloads inside customer infrastructure.
