About a month ago I enabled Microsoft Translate as a source of suggested translations in Crowdin. I don’t really know if these translations are good, but I suspect they’re good enough and I’d like to turn on a feature that automatically translates all new strings using Microsoft Translate. Do translators have any thoughts about this?
My own thoughts:
Volunteers would still have the ability to suggest better translations if the automatic ones are bad
This would change the work of volunteer translators to reviewing and editing automatic translations; hopefully that’s less work
If I also auto-approve these automatic translations, text would get translated sooner in the product
This would provide more comprehensive (though possibly imperfect) translations for languages that don’t get much / any attention, like Afrikaans and Vietnamese
In general, automatic translations can be helpful to get an idea of what is being said, but they should never be trusted to be a perfect rendering of the original text.
There are a few important points:
The more text, the better the translation. A single word may have many different meanings (like bear, which could mean the animal, or endure), and is therefore often misinterpreted by a translation program. A complete sentence (A brown bear may be very dangerous) often gives a better translation.
When it comes to very specific content, like common names of organisms on iNat, translation programs may have difficulty recognizing those and give a quite literal translation, which could produce very strange, wrong names in the target language. An example: the small butterfly European common blue (Polyommatus icarus) is called Icarus blue in Dutch, but translation programs will almost always come up with weird names, like European blue, European blue songbird, or even European robin.
I’m not saying enabling translation is good or bad; just pointing out some things to always keep in mind. No matter how good translation programs get, they are never flawless.
That surprises me, since South Africa is an observose iNat country.
Maybe a blog post from iNat about the need for translators, with a how-to and some mentors?
To clarify, Crowdin is not the source of common names in different languages on iNat and this proposal is not for automatic translation of common names. The strings available for translation on Crowdin are text in the user interface, see https://crowdin.com/project/inaturalistweb for more info.
It’s also less rewarding work. Sort of like helping someone with an ID that they don’t know vs. correcting the CV’s default suggestion that they mindlessly picked. So this may not go in a productive direction despite your expectations.
Are there non-English iNat forums? I feel a little bit unable to contribute much to this discussion, but I hope there is some way to figure out whether we end up with something helpful or with incomprehensible slop, which is worse than not having a translation (come to think of it, it would be good to know whether a translation is automated or not).
The only one I know of is Charla a Naturaleza, a Spanish counterpart to Nature Talk. And even then, I find that new threads there tend to cross-post to the English-language forum.
With that said, though, there is no requirement to post in English. I have seen posts in various languages, and the community replies as best we can.
Many if not most scientific names are not Latin, but are instead “latinized” from many other languages. Most of them would end up being untranslatable, or would produce very wrong translations.
It might be a useful option in the app, but in a browser I’d suggest not, as it’s easy to have the browser do the translation for you.
In my experience living overseas, in Vietnam, sites that have automatic translation enabled on them often have issues and you sometimes have to force the site back to its original language to make it usable.
I’m often using sites in Vietnamese, English, and German and having to translate back and forth on those, as well as documents in the office in those languages, and if the translations are used for anything official they take a lot of manual correcting afterward.
@joeybom - a caveat with your “the more text the better the translation” note. This is only really true if the sentences themselves are relatively short and simple. Automatic translation systems tend to struggle with more complicated sentences.
A lot of the strings are short phrases that fit into a longer string, and some of them have errors because the translator didn’t have the context. For instance “abierto/difusa/privado” (these are three separate strings, I think) should be “abierta/difusa/privada” because the noun is either “geoprivacidad” or “ubicación”, both of which are feminine. This is on top of polysemous words like “bear”, which can mean “oso”, “soportar”, “dar a luz”, or “parir” (the last two mean the same but are used differently).
I vote no on any automatic translations of user interface strings.
I once saw a profile of someone on an international matchmaking service which said something like “I do exercise, road, and as many vegetables.” This was auto-translated from Spanish. Only because I know Spanish could I figure out that it’s actually supposed to mean “I do exercise, walk, and eat many vegetables”. Someone who didn’t know Spanish would have been lost.
Sorry I haven’t done any translating in a while. I’ve been busy surveying and programming, and I’ve even been slow uploading observations.
From my perspective, translating iNaturalist from English to French, I don’t really know what to expect. While it might be easier to translate long texts (explanations), it can make some smaller interaction text inconsistent. Maybe it could work out with more extensive glossaries to avoid any confusion?
I ID a lot of South American Araucaria like this: https://www.inaturalist.org/observations/107929635 , and people there write their observation comments in Spanish or Portuguese. I would appreciate a button to automatically translate them, as I do not speak Spanish or Portuguese, so all I am ever doing is copy-pasting them into Google Translate. Machine translation lets me interact more with the other observers and get more information for making an ID than if I just completely ignored them.
I think the topic is just referring to Crowdin, which is a site where people can help with translating the user interface of the website and apps (not comments or taxon names), so I added that to the title to make it a bit clearer.
This would be a rather big change with the risk of hurting UX and alienating volunteers, so I recommend running a test first with a handful of languages. I definitely don’t recommend turning it on and auto-approving, as having a confusing string published is a much worse user experience than coming across an untranslated English string in the UI. With that in mind, I recommend adopting the principle of prioritizing translation quality over the speed at which translated strings go into prod.
Ideally, I would actually measure the current adoption of MS machine translations “as is” in the current flow, but I doubt that Crowdin provides those metrics. If the adoption passes a certain threshold, that would be a clearer signal to go ahead and switch them on.
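A rough sketch of what that measurement could look like, assuming the machine suggestion and the finally approved translation for each string can be exported somehow (whether Crowdin makes this available is exactly the open question). The function name, the 0.5 threshold, and the sample pairs are all invented for illustration; "adopted" here simply means the approved text is identical to the machine suggestion.

```python
def adoption_rate(pairs):
    """Share of strings whose approved translation equals the MT suggestion.

    `pairs` is a list of (mt_suggestion, approved_translation) tuples.
    This is a hypothetical export format, not something Crowdin is known to provide.
    """
    if not pairs:
        return 0.0
    adopted = sum(1 for mt, approved in pairs if mt.strip() == approved.strip())
    return adopted / len(pairs)


# Invented sample data: two suggestions kept as is, two replaced by a human.
sample = [
    ("Rechercher", "Rechercher"),
    ("Observations", "Observations"),
    ("Ours", "Supporter"),
    ("Télécharger", "Téléverser"),
]

THRESHOLD = 0.5  # hypothetical cut-off; the right value is a judgment call
rate = adoption_rate(sample)
should_enable = rate >= THRESHOLD
print(f"adoption rate: {rate:.0%}, enable pre-translation: {should_enable}")
```

Exact-match comparison is deliberately strict: a suggestion that a translator only lightly edited counts as not adopted, so this would understate how useful the suggestions are as a starting point.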
Also a few observations from translating into Hungarian:
Currently we use the “informal you” style for iNat in Hungarian. MS machine translations mostly use the formal you, but I have seen cases where the two styles are mixed within the same string.
Reviewing and editing machine translations may save me time.
I appreciate all the input, everyone, especially from the folks doing translation work.
Indeed, this is about translating the user interface text, like the word “Upload” in the upload button in the header on the website, not common names.
The problem with auto-approval is that I kind of already do that for all languages in my monthly review. I spot check translations for technical problems like missing variables, and I run some contributions by new contributors through Google Translate to make sure they’re not spam, but otherwise I am not capable of assessing translation quality so I approve everything else. Ideally proofreaders would do this, but with a few exceptions (German, Hungarian, maybe a few others), most proofreaders do not approve translations, even their own. The whole point of manual approval is to provide the kind of human review some of you are advocating for, but for a lot of languages it just creates situations where translators think they’re helping, but their contributions never make it into the software… unless I perform the kind of blind approval I do every month.
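For anyone curious what a spot check for missing variables can look like, here is a minimal sketch. It assumes placeholders are written as `%{name}`, the Rails-style interpolation syntax; other projects use other syntaxes, and this is not the actual review script. The function name and the example strings are made up.

```python
import re

# Assumed placeholder syntax: %{variable_name}
PLACEHOLDER = re.compile(r"%\{(\w+)\}")


def placeholder_problems(source, translation):
    """Compare the variables used in a source string and its translation."""
    src = set(PLACEHOLDER.findall(source))
    dst = set(PLACEHOLDER.findall(translation))
    return {
        "missing": src - dst,      # in the source but dropped by the translation
        "unexpected": dst - src,   # in the translation but not in the source
    }


# Made-up example: the translation dropped the %{date} variable.
result = placeholder_problems(
    "Observed by %{user} on %{date}",
    "Observado por %{user}",
)
print(result)
```

A check like this only catches technical breakage. It says nothing about whether the translation reads well, which is the part that still needs a human who knows the language.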
If I only turn on pre-translation for languages that have proofreaders that do approval, that will probably help by reducing translation work for those languages, but it will not help the languages that don’t have much human attention to begin with.
I think I’ll try turning it on for the web project for a few languages this month and see how it goes. I’ll do some languages with lots of human attention (Hungarian, Danish, Russian, French) and some with very little but that might trigger some feedback (Afrikaans, Japanese, Korean; Japanese actually has a fair bit of attention, but I don’t get as much feedback from those translators). If you translate in those languages or use the site in those languages, please let me know how you feel about the change (if you notice anything at all).
This is only relevant to translators for now, which is a tiny fraction of overall users, so I think this post and this one on Crowdin will suffice for now. If we ever want to make a more general push for more translators, that might merit a blog post.