Wednesday 28 July 2021

Translating LWBs – How I go about it

I mentioned the other day that I spend a fair chunk of my Card-related time translating the books/booklets that come with the decks I get from other countries. I also said I was looking for possible topics for this blog. 

A big "Thank you" to Monika for responding and prompting this post. 🙂

My process is possibly/probably not the most efficient way of doing things, but it works for me and, if nothing else, it may give you an idea of how to get started.

This is the basic step-by-step process I go through. 

1 – Create a document to "house" the translation

2 – Create images of the pages I want to translate

3 – Upload the images to an Optical Character Recognition (OCR) program to convert them to editable text

4 – Paste the output into the document and reformat/check

5 – Run the text through a translation program

6 – Paste the translated text into the document and tidy up

1 – Create a document

I start by creating a document to hold the original text and the translation. 

I like to keep the original in case I later spot something odd or uncertain in the translation. I can then easily paste the offending bit of text back into Google Translate to see what's what.

2 – Scan/photograph pages

I used to manually type the text into the translation software, but that's slow, error-prone work, and particularly onerous when you're translating a language that has its own unique alphabet – Russian, Greek, etc. 

So I began scanning the pages, but have since discovered that it's quicker and easier to take photos with my phone, and they work just as well. I like to trim them to just the text, but I have found that the OCR software copes well with extraneous backgrounds.

3 – Upload image into OCR software

Optical Character Recognition software, as the name suggests, recognises characters in an image, extracts them from the image and converts them to editable text. I use Online OCR. There are other OCR tools available but I find this one reliable.

Upload an image into the software, select the input language, and hit "Convert". Online OCR (others may vary) shows the resulting text on-screen. I simply copy that and paste it into my document, but you can also download the output if you prefer.

4 – Paste into document and tidy up

The output text is unformatted, so you'll need to put line and paragraph breaks back in.

Also, the text is often full of hyphens where words were split across lines in the booklet. In order for the translator to work properly these need to be removed. 

If it's a small amount of text or I can see from the original that there are only a few hyphens, I'll edit them out manually. 

But when I'm working with big chunks of text it's easier to let the program's Replace option do the work. In MS Word (again, it may vary if you use a different program), I highlight the text and select Replace from the menu. I tell it to Find "-" (without the quotes) and Replace with nothing, i.e. I leave that box blank, then click "Replace All". This strips every hyphen in the selection, so afterwards check the original for genuinely hyphenated words and put those hyphens back, as they too can affect the translation.
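If you're comfortable with a little scripting, the hyphen clean-up can also be done outside the word processor. The following is only a rough Python sketch, not part of my own process. It assumes that a split word shows up in the OCR output as a hyphen followed by a line break or a space, and the function name join_split_words is made up for illustration.

```python
import re

# A hyphen that sits between two word characters and is followed by
# whitespace (a newline, or the space left behind once lines are joined).
# ASSUMPTION: this pattern marks a word that was split across lines.
LINE_BREAK_HYPHEN = re.compile(r"(?<=\w)-\s+(?=\w)")


def join_split_words(text: str) -> str:
    """Remove line-break hyphens while leaving in-word hyphens alone."""
    return LINE_BREAK_HYPHEN.sub("", text)


sample = "This card stands for trans-\nformation and a step-by-step approach."
print(join_split_words(sample))
```

Unlike a blanket "Replace All", this leaves hyphens such as the ones in "step-by-step" untouched, because they aren't followed by a space or line break. It will still wrongly join a genuinely hyphenated word that happened to break at the end of a line, and it won't catch split words where the OCR dropped the space after the hyphen, so a quick check against the original is still worthwhile.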

5 – Run text through translation software

Now I paste the text into Google Translate and let it do its thing. Again, there are other translation programs, but I like this one.

I then copy the translated text and…

6 – Paste translation into document

… Paste it into the document.

At this stage, I'll give the output a quick scan just to see if it all makes sense. If something doesn't, I check the OCR conversion against the original text to see if there are any mis-recognised characters and, if so, correct them. If your keyboard doesn't have the necessary alphabet, just copy and paste the characters you need from elsewhere in the text.

Then I pop the problem sentence back into Google Translate for another attempt. Sometimes isolating a sentence from the rest of the paragraph helps. It can also be useful to break the sentence up into phrases in order to gather the sense of what it's trying to get across.
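For a long paragraph, pulling the sentences apart by hand can get tedious. Here is a minimal Python sketch that does a rough split so each sentence can be pasted into the translator on its own. It assumes sentences end with a full stop, question mark, exclamation mark or ellipsis character followed by a space; the name split_sentences is hypothetical, and abbreviations containing full stops will be split too.

```python
import re

# Split at whitespace that follows ., !, ? or the ellipsis character.
# ASSUMPTION: these marks end a sentence; abbreviations such as "z. B."
# will also be split, so treat the result as a rough cut.
SENTENCE_END = re.compile(r"(?<=[.!?…])\s+")


def split_sentences(paragraph: str) -> list[str]:
    """Return the paragraph's sentences as a list of trimmed strings."""
    parts = SENTENCE_END.split(paragraph.strip())
    return [part for part in parts if part]


paragraph = "Erster Satz. Zweiter Satz? Dritter Satz!"
for number, sentence in enumerate(split_sentences(paragraph), start=1):
    print(number, sentence)
```

Numbering the sentences makes it easier to match each retranslated piece back to its place in the original paragraph.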

Sometimes different words in the original text get translated as the same word. In Google Translate, if you highlight the word, it brings up a ranked list of alternative translations. You can then cherry-pick the one you think is most appropriate.

6b – Reverso Context

Another software option for ambiguous words or ones that don't quite make sense in the context of the translated sentence is Reverso Context. Here you can enter a word and it will show you translated instances of that word in context. You can then determine which makes the most sense and tweak your translation accordingly.

6c – Idioms

Additionally, you can sometimes find yourself confronted with what must be an idiom peculiar to the original language: the literal translation makes no apparent sense, or hints at a concept that only a native speaker would understand. Often you can guess what it's getting at, but if you're stumped or just want to double-check, try Googling the phrase in its original language. I have found websites that reference local idioms and explain their meanings. You will, of course, have to translate the website. All in a day's work. 😉


That's pretty much it. I know it looks like a lot, but in practice it's usually quite straightforward. The level of translation "perfection" required is up to you. Most of the time the initial "raw" translation is adequate to get the basic idea of what the book is saying.

If you have any questions about any of this, feel free to comment or contact me and I'll try to help. Just don't ask me to translate something for you. I have more than enough of my own to do! 😎


2 comments:

  1. WOW Judy, thank you so much for taking the time to share your process! I knew about Google Translate and recently discovered Reverso Context, but Online OCR seems to be the program I was missing! Very cool! 🙏

    1. You're very welcome, Monika. Thanks for the inspiration. 🙂 Happy translating!
