Hi! Here is some information about romanisation of Cantonese if you are interest...

s_Hogg · on Aug 18, 2019

Somewhat off topic: any chance you know why Google doesn't have an explicitly Cantonese model for translation?

spacehunt · on Aug 18, 2019

Not a Googler, so I can only guess. But it seems Google did try to treat Cantonese as a Chinese variant in the past, and eventually dropped that approach, probably because they realised Cantonese and Mandarin are too different.

I know Google is actively working on the Cantonese version of Google Assistant, though not sure when it'll be officially released.

toastal · on Aug 18, 2019

It is a variant of Chinese though. Chinese is a language family, not a single language; it includes Mandarin, Cantonese, Hakka, etc.

spacehunt · on Aug 18, 2019

Whatever it is, Cantonese differs from Mandarin in pronunciation, vocabulary and even grammar, which means it takes a non-trivial amount of work to adapt a language model designed for one to the other.

Source: I'm a native speaker of one and fully fluent in the other.

ferzul · on Aug 18, 2019

afaik Google needs a multilingual corpus. so if Cantonese is mostly written using Chinese characters, the corpus will be in Chinese characters.

and if written Cantonese is mostly informal (conversation, shop signs) it will not often be multilingual. so the approach that has worked for most languages wouldn't work then.

and it surely wouldn't work for a completely different, lossy orthography - without independent training.