Saturday, October 15, 2011

Translation Memories: Pros and Cons

To say that translation memories are popular in the translation/localization industry would be a gross understatement. Chances are, if you have worked with or in the industry in any way, you have heard "TM," "translation memory" or some other term referring to translation memories.

Almost every translation system or tool that claims to increase productivity uses a translation memory. Some make claims that their memory is somehow better than another tool's, but the basics are the same.

TMs are exactly what their name describes: They are a way for a computer to remember what has been translated before. This is done on a segment level (or on a sentence level, basically, though there are exceptions). If you come across the same segment later, or a similar enough one, your translation tool will be able to look it up in the translation memory and either offer it up as a suggestion or simply plug it in as the translation.
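To make the lookup idea concrete, here is a minimal sketch in Python. It is not how any particular tool works: the `TranslationMemory` class, the 0.75 threshold, and the use of the standard library's `difflib.SequenceMatcher` as the similarity measure are all assumptions chosen for illustration. Real tools use their own matching algorithms.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Toy translation memory: maps source segments to stored translations."""

    def __init__(self):
        self.entries = {}  # source segment -> target segment

    def add(self, source, target):
        """Remember a translated segment."""
        self.entries[source] = target

    def lookup(self, segment, threshold=0.75):
        """Return (stored_source, translation, score) for the best match,
        or None if nothing is similar enough. The threshold is hypothetical."""
        best = None
        for source, target in self.entries.items():
            score = SequenceMatcher(None, segment, source).ratio()
            if best is None or score > best[2]:
                best = (source, target, score)
        if best is not None and best[2] >= threshold:
            return best
        return None


tm = TranslationMemory()
tm.add("Click the Save button.", "Haga clic en el botón Guardar.")

print(tm.lookup("Click the Save button."))    # exact match, score 1.0
print(tm.lookup("Click the Cancel button."))  # fuzzy match, score below 1.0
print(tm.lookup("Restart your computer."))    # nothing similar enough: None
```

An exact match comes back with a score of 1.0 and could simply be plugged in. A fuzzy match comes back with a lower score and the old translation, which the translator still has to adapt.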

There are a lot of benefits to having a translation memory, when it is managed correctly. For example, if a client has an update to a file, they might ask you to only translate what has changed and paste it into the previously translated file. If you have a translation memory, you can just stick the whole file into the tool, and it will fill in what has already been translated, leaving you to only translate what has changed. No need to search for the changes, find the corresponding place in the previous translation, paste, etc.

There might also be a lot of occurrences of the same segment in the same document. Translation memories can be updated as you work, so if you get to a segment that you've seen before, the translation memory can fill it in for you. This is also the case with slogans, etc., where the chance of repetition is high. Translate once, then reuse as needed.
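The "translate once, then reuse" workflow can be sketched roughly like this. The `translate_document` function and the plain dictionary standing in for the memory are hypothetical, and `str.upper` is only a placeholder for the human translator. The sketch handles exact matches only.

```python
def translate_document(segments, memory, translate):
    """Translate segments in order, reusing exact matches from the memory
    and updating the memory as new segments are translated.

    Returns the translated segments and the number of segments that
    actually needed new translation work."""
    output = []
    new_work = 0
    for segment in segments:
        if segment not in memory:
            memory[segment] = translate(segment)  # done only once per segment
            new_work += 1
        output.append(memory[segment])
    return output, new_work


memory = {}
first_version = ["Save", "Cancel", "Save", "Save"]
translated, new_work = translate_document(first_version, memory, str.upper)
print(translated, new_work)  # 4 segments out, only 2 needed translating

# The client sends an updated file: only the new segment needs work.
updated_version = ["Save", "Cancel", "Open"]
translated, new_work = translate_document(updated_version, memory, str.upper)
print(translated, new_work)  # 3 segments out, only 1 needed translating
```

The same loop covers both cases described above: repetitions inside one document, and an updated file where most segments are already in the memory.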

With advantages such as these, productivity can and does go up. However, translation memories have their drawbacks, several of which are often ignored. I'm not making these up: in about a year of working in the industry as a localization engineer, I've come across each of these drawbacks at least once.

The first drawback that I'll mention has to do with translator laziness or complacency. It's not really the translator's fault, either. It's something of a trap. Most translation memory tools have the option to fill in "fuzzy" (inexact) matches if their degree of similarity to the segment in question is above a certain threshold. It's very easy, as a translator, to see that a translation has already been inserted and move on. After all, your time is valuable. However, the meaning of a sentence can change without the sentence changing much at all. Some theoretical examples: "You must do that." vs "You must not do that."; "John has 1 dog." vs "John has 3 dogs."; etc. If a translator misses one change, the wrong translation gets put into the translation memory. Then you have an incorrect translation all ready to be used again. That's not even mentioning exact matches that may be out of context, so their translations should be different but aren't, because it's not usually considered worth a translator's time to look at something that's already translated at a 100% match.
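To see why the trap is so easy to fall into, here is a small sketch that scores the example pairs above. The character-based `difflib.SequenceMatcher` ratio is an assumption standing in for whatever similarity measure a real tool uses, so the exact numbers will differ from tool to tool.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Character-level similarity between two segments, from 0.0 to 1.0."""
    return SequenceMatcher(None, a, b).ratio()


pairs = [
    ("You must do that.", "You must not do that."),
    ("John has 1 dog.", "John has 3 dogs."),
]

for old, new in pairs:
    print(f"{similarity(old, new):.2f}  {old!r} vs {new!r}")
```

By this measure both pairs score around 0.9, which would clear a typical fuzzy-match threshold, even though the meaning has flipped in one and the number has changed in the other. A high score says the strings are similar, not that the old translation is still right.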

The second drawback can be seen as an extension of the first, except that this time it's not the translator at all who's at fault, but a side effect of what may have happened previously. That drawback is error propagation. If a segment is translated incorrectly, it will be incorrect everywhere that segment is reused. I've seen this happen literally hundreds of times within a very technical set of files. The worst part? Sometimes it's very difficult to fix. Because the same source segment can be translated in different ways given different contexts, some translation memory tools will save any changes as a different entry in the translation memory. Depending on the features of the tool, the incorrect translation may pop up again. Sure, you can look in the memory itself and delete the offending segment, but translation memories can get big, so it can be difficult to manage them sometimes.
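Here is a rough sketch of how a correction can fail to get rid of a bad entry. The `MultiEntryMemory` class is hypothetical and only models the behavior described above, where a changed translation is stored as an additional entry for the same source segment rather than replacing the old one.

```python
from collections import defaultdict


class MultiEntryMemory:
    """Toy memory that keeps every translation ever saved for a source segment."""

    def __init__(self):
        self.entries = defaultdict(list)  # source -> list of translations

    def add(self, source, target):
        """Save a translation as a new entry; existing entries are kept."""
        if target not in self.entries[source]:
            self.entries[source].append(target)

    def suggestions(self, source):
        """Return every stored translation for this source segment."""
        return list(self.entries.get(source, []))

    def remove(self, source, target):
        """Explicitly delete one stored translation, if it exists."""
        if target in self.entries.get(source, []):
            self.entries[source].remove(target)


tm = MultiEntryMemory()
tm.add("Do not open the valve.", "Abra la válvula.")     # wrong: negation lost
tm.add("Do not open the valve.", "No abra la válvula.")  # the later correction

print(tm.suggestions("Do not open the valve."))  # both entries are still offered

tm.remove("Do not open the valve.", "Abra la válvula.")
print(tm.suggestions("Do not open the valve."))  # only the correct one remains
```

Until someone goes into the memory and removes the bad entry by hand, it stays available as a suggestion for every future occurrence of that segment.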

The third drawback (and the last that I'm going to mention) is unrealistic expectations. While this is not really the fault of translation memories, it affects people who use them. I offer the following examples, though there are a lot of ways expectations can be unrealistic.

Example #1: Almost all translation memories work on a segment level. They don't deal with anything smaller than that. A lot of companies will charge/pay depending on the degree of matches in a translation memory. Why pay for something that's already translated, right? Well, some people seem to think that if they change "a few words" in some sentences, they will only be charged a lower match rate on those words. However, because translation memories work at a segment level, if the match rate (how much of the segment is similar to the entry in the translation memory) is low enough to fall below a pricing threshold, the higher price applies to every word in that segment, not just the words that have been changed. I've seen a few clients get into arguments because "only a few words" had been changed.

Example #2: Turnaround times. Some clients, and I'm talking both about the end client and the translation companies that use freelancers, seem to think that because you have a translation memory tool, all translations can get done lightning fast, regardless of how repetitive the text is, or how long it is, or if you've ever translated anything like it before (i.e. if you have segments in your translation memory from similar texts). Translation is still work, and sometimes, regardless of what tools you're using, it takes time.
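The pricing point in Example #1 can be sketched as follows. Everything here is made up for illustration: the `RATE_BANDS` rate card, the `billable_words` function, and the character-based similarity score are assumptions, not any company's actual pricing scheme.

```python
from difflib import SequenceMatcher

# Hypothetical rate card: (minimum match score, fraction of the full word rate).
RATE_BANDS = [
    (1.00, 0.00),  # exact match: not charged
    (0.85, 0.50),  # high fuzzy match: half rate
    (0.75, 0.75),  # low fuzzy match: three-quarter rate
    (0.00, 1.00),  # no usable match: full rate
]


def billable_words(segment, best_match=None):
    """Return the weighted word count charged for one segment.

    The band is chosen from the segment's overall match score, and the
    band's fraction is applied to every word in the segment."""
    if best_match is None:
        score = 0.0
    else:
        score = SequenceMatcher(None, segment, best_match).ratio()
    words = len(segment.split())
    for minimum, fraction in RATE_BANDS:
        if score >= minimum:
            return words * fraction


old = "Press the green button to start the machine."
new = "Press the red button to stop the machine."

print(billable_words(old, old))  # unchanged segment: 0.0
print(billable_words(new, old))  # two words changed, all 8 words weighted
print(billable_words(new))       # nothing in the memory: 8.0
```

Only two words differ between the old and new segment, but the charge is computed from all eight words at whatever band the segment lands in. There is no per-word discount for the six words that stayed the same.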

So translation memories can be huge time savers. They can make sure that your wording is consistent in similar segments, and they can give suggestions based on previous translations. However, they are not miracle workers and in some cases can make things worse. Overlooked differences in fuzzy matches can allow errors to sneak in, and sometimes errors can be propagated to other sections, or even completely different documents. You have to be careful when using a translation memory, and you have to make sure to manage clients' expectations.
