
>> Google Translate is often almost as accurate as a human translator.

This is the kind of overhyped reporting of results highlighted by Douglas Hofstadter in his recent article about Google Translate:

I’ve recently seen bar graphs made by technophiles that claim to represent the “quality” of translations done by humans and by computers, and these graphs depict the latest translation engines as being within striking distance of human-level translation. To me, however, such quantification of the unquantifiable reeks of pseudoscience, or, if you prefer, of nerds trying to mathematize things whose intangible, subtle, artistic nature eludes them. To my mind, Google Translate’s output today ranges all the way from excellent to grotesque, but I can’t quantify my feelings about it.

https://www.theatlantic.com/technology/archive/2018/01/the-s...

It's funny how the article above claims to cover "the downsides" of deep learning, yet spends a few paragraphs repeating the marketing pitch of Google, Amazon and Facebook: that their AI is now as good as humans at some tasks (limited as those tasks may be), all thanks to deep learning. To me that runs exactly counter to the article's main point and makes me wonder what the author is trying to say, and whether they even know what they're talking about.



There's only a contradiction if you insist on simplistic binary thinking. (Either translation works or it doesn't. Machine learning will either solve everything or is useless garbage.)

Instead, we can acknowledge that Google Translate has reached a useful level of skill (enough to usually get the gist of an article), but not enough to produce fully reliable, accurate translation. The Hofstadter article usefully demonstrates how far we have to go.


>> There's only a contradiction if you insist on simplistic binary thinking. (Either translation works or it doesn't. Machine learning will either solve everything or is useless garbage.)

I'm sorry, but I don't see where in my comment it looks like I'm saying anything like that.

>> Instead, we can acknowledge that Google Translate has reached a useful level of skill (enough to usually get the gist of an article), but not enough to be fully reliable, accurate translation.

Hofstadter's article actually argues that Google Translate is not enough to get the gist of many passages, probably all non-trivial ones.


Okay, in that case I'm not sure what you're trying to say. Maybe something got lost in translation :-)


You can ask me what you don't understand.


That article pretty much sums up what my sister-in-law (who runs a professional translation business) says.

As someone who doesn't have a second language strong enough to verify the accuracy of a translation, I simply run things through Google Translate to the target language and then run the output back to English. It's the translation equivalent of the game of telephone.

Take "the fat cat sat on the mat":

"the big cat was sitting on the carpet" or "The cats of oil sat on the bed"

and lots of other things which are just odd.
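The telephone-game failure mode described above can be sketched as a toy word-for-word substitution. The "dictionaries" below are invented for illustration (this is not how Google Translate works internally), but they reproduce the same polysemy loss: when a target-language word covers several English senses, the round trip picks the wrong one.

```python
# Toy round-trip "translation" by naive word substitution.
# Both dictionaries are made up for illustration only.

en_to_xx = {
    "fat": "gordo",     # toy word covering both "fat" and "oily"
    "cat": "gato",
    "sat": "sento",
    "on": "su",
    "the": "el",
    "mat": "tappeto",   # toy word covering "mat", "rug", and "carpet"
}

# The reverse dictionary commits to ONE sense per word, losing information.
xx_to_en = {
    "gordo": "oily",
    "gato": "cat",
    "sento": "sat",
    "su": "on",
    "el": "the",
    "tappeto": "carpet",
}

def substitute(sentence, table):
    # Replace each word via the table; leave unknown words untouched.
    return " ".join(table.get(word, word) for word in sentence.split())

forward = substitute("the fat cat sat on the mat", en_to_xx)
back = substitute(forward, xx_to_en)
print(back)  # "the oily cat sat on the carpet" -- the telephone effect
```

The overweight house cat becomes an oily one and the mat becomes a carpet, exactly the kind of drift the round-trip test surfaces.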


This is a nice example of adversarial input, but I think it misses the point a bit. The point being: if I take a couple of paragraphs of a technical text, online article, or the like, I will get a correct translation most of the time.

It won't do poetry, but GT has removed the need for a human translator in many everyday tasks.

Just last week a friend asked me to translate the preamble to a master's thesis (political economy). I fed it into GT and it just worked.


What does this prove?

Languages are different; you should expect any automatic translation program to do something like this, as it is trying to translate without prior knowledge of the 'conversation' you are having.

Many languages pack a lot of meaning into a single word. Look at English: "tank" is a good example. It's either a thing that holds a lot of fluid, a thing on treads that blows up buildings, or a verb that means you are taking a lot of damage in lieu of others taking it. One word carries a lot of meaning. Then you can get into conjugations and tenses, yeesh.

Google Translate is not meant to be a natural language processor; it's just a dumb translator. It can't figure out context, as it is just a simple text box and doesn't look at the million and one things natural language processing would use.

Trying to play telephone with it just proves that a dumb text box is dumb.


It proves that _really_ simple concepts aren't being conveyed, despite the claims that it is more than simple word substitution. The linked article talks about translating things which require some cultural knowledge. The idea that an overweight house cat is sitting on a protective piece of material is simply lost, and replacing it with "big cat" (which could just as well be a lion) or with "bed" is misleading at best and just plain wrong at worst.

I might understand if the target language didn't have a concept because it's cultural. But that sentence failing to be translated would imply that the target language has no concept of people/animals being overweight, or of what a mat is. I might be more accepting if it came back with "throw rug" or something instead of mat, but it never does that; it's like it has a list of rough synonyms and picks one at random. Hence the "fat cat" bit becoming what was likely "oily cat". The more subtle things (cat in this case being a house cat/pet) might be the bit of cultural information that lends understanding to the whole sentence, but in most cases that isn't really what the translation is getting wrong.

Claiming any of this is even near human translation levels is misleading at best, considering it falls down worse than most poorly translated computer manuals I've read. Despite all the claims about the wonders of DNNs, the translations look little more advanced than direct word translations (fat=oil, or mat=something you sleep on) with a bit of fuzz that fails in strange ways.


Rather than translating the description into an abstract idea, and then translating that idea into a different language, it seems like it's trying to go straight from English to Chinese.


To be fair, translating to 'idea-space' is really hard. The medium is the message. We have ideas in English that other languages do not have, and vice versa. Chinese is famous for not marking past tense on verbs. Spanish has a subjunctive mood that is difficult for fluent speakers to translate into English. Some languages build cardinal directions right into their grammar. The word 'che' in Italian is a head-tweaker for English speakers.

The problem grows quadratically (the handshake problem). Google Translate supports 104 languages, which means 5,356 language pairs (n(n-1)/2) for each 'idea' present. As such, the 'idea' could be translated, but it would result in a mess of a response that no native speaker would recognize. You'd have to teach the person the basics of the language before real translation could occur. Languages are meant to communicate between two people (writing is subtly, but importantly, different), and context, body language, and recent history play large roles in that communication.
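The pair count above follows directly from the handshake formula; a two-line sketch makes the arithmetic explicit (104 is the language count cited in the comment):

```python
# Unordered language pairs for n languages: each pair is counted once,
# regardless of translation direction (the handshake formula).
def language_pairs(n: int) -> int:
    return n * (n - 1) // 2

print(language_pairs(104))  # 5356 unordered pairs
print(104 * 103)            # 10712 if each direction counts separately
```

Counting each translation direction separately (English→Chinese vs. Chinese→English) doubles the figure, which is why intermediate "pivot" languages are attractive despite the quality cost.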


> Claiming any of this is even near human translation levels is misleading at best

Oh, got it. The issue is that Google is misrepresenting its Translate service as an 'end-all-be-all'. Yes, I would agree with that.

But as hackers, we should know better than to think that it could ever be such, given that it is just a text-entry box. NLP needs a lot of context that a text box can never have.


This is actually a very useful measure of the amount of error in Google Translate, or any similar process. I see it as sending a noisy signal over some medium: you send the signal back and forth and observe how much information is lost along the way. The speed at which the signal is corrupted beyond the ability to extract useful information from it is indicative of the amount of noise (but also tells us something about the medium itself).
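The noisy-channel analogy can be sketched with a toy simulation. Random character corruption here stands in for one translate-and-back pass; the corruption model and the 10% error rate are invented for illustration, but the shape of the experiment (repeat the round trip, measure similarity to the original each time) is exactly the back-and-forth test described above.

```python
import difflib
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz "

def noisy_roundtrip(text, error_rate, rng):
    """Corrupt each character with probability error_rate, as a crude
    stand-in for one translate-and-back pass over a noisy channel."""
    out = []
    for ch in text:
        if rng.random() < error_rate:
            # Force a real change: pick any alphabet character except ch.
            out.append(rng.choice([c for c in ALPHABET if c != ch]))
        else:
            out.append(ch)
    return "".join(out)

rng = random.Random(0)  # fixed seed so the run is repeatable
original = "the fat cat sat on the mat"
signal = original
for step in range(1, 6):
    signal = noisy_roundtrip(signal, error_rate=0.1, rng=rng)
    # Similarity to the original, 1.0 = identical, 0.0 = nothing shared.
    similarity = difflib.SequenceMatcher(None, original, signal).ratio()
    print(step, round(similarity, 2), repr(signal))
```

The similarity score generally decays with each pass; how fast it decays is the noise estimate the comment is describing.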


Hofstadter makes it obvious that 'Google Translate' is a misnomer (and not even a naive one). Even "dumb translator" seems overblown. 'Magic decoder ring' might be closer.

What's troubling is that there are, undoubtedly, people out there who actually believe that the 'translation' they're seeing is a halfway-accurate representation of the original. When that happens to be true, it's sleight-of-hand. When it isn't, it can be dangerously misleading.


It's kind of like everything Google does these days. They wowed you 5-10 years ago, and now they've regressed. Look at Maps: it was better in 2010. Now? I have to second-guess it at just about every freeway on-ramp.


This business of 'constant improvement' very often leads to an inferior, harder-to-use product. When the person who 'got it' and 'really cared' moves on, you can see it in the result.


Maps was NOT better in 2010. That's an absurd claim to make.


Then I'm absurd. To me, the interface was better then. It really started dropping off in ~2015.


Where is the contradiction?



