What translation models are better than LLMs? The problem with Google-Translate-...

thatjoeoverthr · 2025-06-02T22:14:07 1748902447

This is true, and LLMs crush Google in many translation tasks, but they do too many other things. They can and do go off script, especially if they "object" to the content being translated.

"As a safe AI language model, I refuse to translate this" is not a valid translation of "spierdalaj".

selfhoster11 · 2025-06-02T22:18:51 1748902731

That's literally an issue with the tool being made defective by design by the manufacturer. Not with the tool-category itself.

thatjoeoverthr · 2025-06-03T09:39:13 1748943553

Indeed. 200 OK with "I refuse" in the body is not valid JSON, either, nor is it decodable by any backend or classical program.
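
To make that concrete, here is a minimal sketch of how a backend might guard against it, assuming the model was asked to reply with JSON shaped like `{"translation": "..."}`. That schema and the names `parse_translation` and `TranslationError` are made up for illustration, not any particular API.

```python
import json


class TranslationError(Exception):
    """Raised when the model's reply is not the JSON payload we asked for."""


def parse_translation(body: str) -> str:
    """Return the translated text from a JSON reply, or raise TranslationError.

    Assumes a hypothetical schema: {"translation": "<text>"}.
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError as exc:
        # A refusal written as plain prose lands here: HTTP said 200 OK,
        # but the body is not parseable JSON.
        raise TranslationError(f"reply is not JSON: {body[:60]!r}") from exc

    if not isinstance(payload, dict) or not isinstance(payload.get("translation"), str):
        # Valid JSON, but not the shape the caller expects.
        raise TranslationError("reply JSON lacks a string 'translation' field")
    return payload["translation"]
```

The status code alone says nothing here, so the caller has to treat "parsed and has the expected field" as the real success condition.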

Aachen · 2025-06-02T23:38:35 1748907515

Was thinking the same about the censoring, but going off-script? Have you seen DeepL or similar tools invent things?

thatjoeoverthr · 2025-06-03T09:40:32 1748943632

I've seen people use ChatGPT to translate for them, and seen it embellish texts with its typical obsessions, like "combining" and "engagement".

raphlinus · 2025-06-02T23:26:58 1748906818

The converse, however, is a different story. "Spierdalaj" is quite a good translation of "As a safe AI language model, I refuse to translate this."

bird0861 · 2025-06-03T16:30:06 1748968206

One would have to be absolutely cooked to consider using a censored model to translate or talk about anything a preschooler's ears can't hear.

There are plenty of uncensored models that will run on less than 8 GB of VRAM.

ifdefdebug · 2025-06-03T01:40:26 1748914826

Haha, that word. Back in the 80s, some Polish friends of mine taught me that but refused to tell me what it meant and instructed me to never, ever use it. To this day I don't know what it is about...

gpm · 2025-06-02T22:46:33 1748904393

I've been using small local LLMs for translation recently (<=7 GB total VRAM usage) and they, even the small ones, definitely beat Google Translate in my experience. And they don't require sharing whatever I'm reading with Google, which is nice.

yubblegum · 2025-06-02T22:52:57 1748904777

What are you using? whisper?

gpm · 2025-06-02T22:57:38 1748905058

Edit: Huh, didn't know whisper could translate.

Just whatever small LLM I have installed as the default for the `llm` command line tool at the time. Currently that's gemma3:4b-it-q8_0, though it's generally been some version of llama in the past. And then this fish shell function (basically a bash alias):

    function trans
        llm "Translate \"$argv\" from French to English please"
    end
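
For anyone not on fish, roughly the same thing can be sketched in Python with the language pair as parameters instead of hardcoded. This assumes the `llm` command is on PATH with a default model configured; `build_prompt` and `translate` are names invented for this sketch.

```python
import subprocess


def build_prompt(text: str, source: str = "French", target: str = "English") -> str:
    """Build the same kind of instruction the fish function sends."""
    return f'Translate "{text}" from {source} to {target} please'


def translate(text: str, source: str = "French", target: str = "English") -> str:
    """Run the `llm` CLI with the prompt and return its stdout.

    Assumes `llm` is on PATH with a default model already configured.
    """
    result = subprocess.run(
        ["llm", build_prompt(text, source, target)],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if llm exits non-zero
    )
    return result.stdout.strip()
```

Passing the prompt as a single list element avoids shell quoting problems when the text itself contains quotes.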

codethief · 2025-06-02T23:50:03 1748908203

> Uh, translation, not transcription

Whisper can translate to English (and maybe other languages these days?), too.

albertzeyer · 2025-06-02T22:36:32 1748903792

I'm not sure what type of model Google uses nowadays for their web interface. I know that they also provide LLM-based translation via their API.

The traditional cross-attention-based encoder-decoder translation models also support document-level translation, including with context. And Google definitely has all those models. But I think the Google web interface has used much weaker models (for whatever reason; maybe inference costs?).

I think DeepL is quite good. For business applications, there is Lilt or AppTek and many others. They can easily set up a model for you that allows you to specify context, or be trained for some specific domain, e.g. medical texts.

I don't really have a good reference for a similar leaderboard for translation models. For translation, measuring quality with a metric is in any case much more problematic than for speech recognition. I think for the best models, only human evaluation works well right now.
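
A rough illustration of why the metric is harder for translation: word error rate, the usual speech recognition metric, compares word sequences against one reference. That works when there is a single correct transcript, but a translation can be fine while sharing few words with the reference. The sentences below are invented for the example, and this is only a sketch of the problem, not a metric anyone uses for translation leaderboards.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Classic dynamic-programming Levenshtein distance over words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


reference = "he gave up smoking last year"
paraphrase = "he quit cigarettes a year ago"  # acceptable meaning, different words
near_copy = "he gave up smoking last week"    # one word off, but the meaning is wrong

print(word_error_rate(reference, paraphrase))
print(word_error_rate(reference, near_copy))
```

The paraphrase scores far worse than the near copy even though only the near copy changes the meaning, which is the kind of mismatch that pushes evaluation of the best models toward human judgment.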