
by Dmitri Popov
Although computers have yet to take over the business of language translation, they have become an essential part of the translation process. Many professional translators use computer-assisted translation (CAT) tools such as SDLX, TRADOS, Déjà Vu and WordFast. But since these tools integrate tightly with Microsoft Word, you can’t use them with OpenOffice.org Writer. The Anaphraseus extension provides a solution to the problem: using it, you can turn Writer into a powerful CAT tool.
CAT terminology
Before you start exploring Anaphraseus, you should understand how this extension – or any other CAT tool for that matter – works. Anaphraseus is a so-called translation memory application; that is, it doesn’t translate texts for you. Instead, it stores pieces of text (called “segments”) and their corresponding translations in a file called a “translation memory” (TM). During translation, Anaphraseus divides the source text into conceptual segments; in most cases, a segment equals a sentence. When you select a segment for translation, Anaphraseus scans the TM file for possible matches, and displays the closest match right under the current segment.
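To make the idea concrete, here is a minimal Python sketch of segmentation and exact lookup. It is not how Anaphraseus itself is implemented: the sentence-splitting rule, the function names and the in-memory dictionary standing in for a TM file are all assumptions chosen for illustration.

```python
import re

def split_into_segments(text):
    """Split text into sentence-like segments.

    Assumption: a segment ends at '.', '!' or '?' followed by whitespace.
    Real CAT tools use more elaborate rules (abbreviations, numbers, etc.).
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]

# Hypothetical translation memory: source segment -> stored translation.
tm = {"Open the file.": "Öffnen Sie die Datei."}

def lookup_exact(segment, memory):
    """Return the stored translation for an identical segment, or None."""
    return memory.get(segment)

segments = split_into_segments("Open the file. Save your changes.")
for segment in segments:
    print(segment, "->", lookup_exact(segment, tm))
```

The first segment is found in the memory, while the second one has no stored translation yet and would have to be translated by hand.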
There are two types of matches: exact and fuzzy. If the segment in the current text is identical to the one stored in the translation memory, you have an exact match. In the real world, however, you rarely have exact matches: some words or forms in the segment can vary from the segment in the translation memory. Fortunately, Anaphraseus supports partial matches, which in translation lingo are called “fuzzy matches”. This means that Anaphraseus can find segments in the TM file that are not identical, but similar to the one in the current text.
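The difference between exact and fuzzy matches can be sketched in a few lines of Python. Note that the article doesn't say how Anaphraseus scores similarity; this example swaps in `difflib.SequenceMatcher` from the Python standard library as a stand-in, and the function names and the sample memory are made up for illustration.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Return a 0-100 similarity score for two segments.

    Assumption: character-based similarity via difflib, ignoring case.
    This is a stand-in, not the scoring used by any particular CAT tool.
    """
    return round(SequenceMatcher(None, a.lower(), b.lower()).ratio() * 100)

def best_match(segment, memory, threshold=70):
    """Return (score, source, translation) for the closest stored segment
    scoring at or above the threshold, or None if nothing qualifies."""
    best = None
    for source, target in memory.items():
        score = similarity(segment, source)
        if score >= threshold and (best is None or score > best[0]):
            best = (score, source, target)
    return best

# Hypothetical translation memory with a single stored segment.
tm = {"Open the file.": "Öffnen Sie die Datei."}

exact = best_match("Open the file.", tm)    # identical segment
fuzzy = best_match("Open the files.", tm)   # similar, but not identical
nothing = best_match("Zzz qqq", tm)         # no usable match
```

An identical segment scores 100, a near-identical one scores somewhat lower but still clears the threshold, and an unrelated one is rejected.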
Before Anaphraseus can be really useful, you have to use it for some time to build a usable translation memory. The good news is that Anaphraseus works with translation memories in TMX format, which is supported by almost every CAT tool on the market, so you can easily use existing translation memories and exchange memories with other users.
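TMX (Translation Memory eXchange) is an XML-based format in which each translation unit (`tu`) holds one variant (`tuv`) per language, with the text itself in a `seg` element. The following Python sketch reads such a file into a simple dictionary using the standard library. The sample document is hand-written for illustration and its header values are placeholders; a file exported from a real tool will carry more metadata.

```python
import xml.etree.ElementTree as ET

# The xml:lang attribute lives in the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hand-written sample; header attribute values are placeholders.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="example" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Open the file.</seg></tuv>
      <tuv xml:lang="de"><seg>Öffnen Sie die Datei.</seg></tuv>
    </tu>
  </body>
</tmx>"""

def load_tmx(xml_text, source_lang, target_lang):
    """Return a dict mapping source segments to target segments."""
    memory = {}
    root = ET.fromstring(xml_text)
    for tu in root.iter("tu"):
        texts = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            lang = tuv.get(XML_LANG)
            if seg is not None and lang:
                # itertext() also collects text inside inline markup.
                texts[lang.lower()] = "".join(seg.itertext())
        if source_lang in texts and target_lang in texts:
            memory[texts[source_lang]] = texts[target_lang]
    return memory

memory = load_tmx(SAMPLE_TMX, "en", "de")
print(memory)
```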
Working with Anaphraseus
Since Anaphraseus is just a regular OpenOffice.org extension, you can install it using the Extension Manager (Tools -> Extension Manager). Before you can start using Anaphraseus, you have to configure its options. To do this, choose Anaphraseus -> Setup. Assuming you are starting from scratch, you have to create a new TM file. In the TM section, press the New TM button, and a simple wizard guides you through the process of setting up a TM file. Another important option you might want to configure right from the start is Fuzzy Threshold in the Setup section. By default, it’s set to 0, which returns only exact matches. For better results, set it to 70.

Figure 1: Specifying Anaphraseus’ settings
Now you can start using Anaphraseus. Open the document you want to translate (also called the source document), place the cursor at the very beginning of the text, and choose Translate from the Anaphraseus menu. Alternatively, you can enable the keyboard shortcuts first by choosing the Activate keyboard shortcuts item from the menu. Anaphraseus then puts the first sentence (segment) into a turquoise box, and you enter the translation in the gray box below.
If for stylistic or other reasons you want to expand the current segment to include the next sentence, choose the Anaphraseus -> Expand Segment command. This adds the next sentence to the current segment. Once you’ve translated the segment, you can move to the next one by choosing the Translate command once again. When all segments have been translated, choose the End Translation command to finish the translation process. Now you can clean the translated text. The cleaning process does two things: it removes the original text and all the codes, and saves the segments and their translations in the TM file. To clean the document, choose the Clean command or press Alt+Q. Save the cleaned file, and you are done.
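The two jobs of the cleaning step can be illustrated with a short Python sketch. It assumes that an uncleaned document marks each segment with Wordfast-style delimiters of the form `{0>source<}score{>translation<0}`; the article doesn't spell out the codes Anaphraseus writes, so treat the pattern, like the function name, as an assumption.

```python
import re

# Assumed segment markup: {0>source<}score{>translation<0}
SEGMENT = re.compile(r"\{0>(.*?)<\}(\d+)\{>(.*?)<0\}", re.DOTALL)

def clean(text):
    """Return (cleaned_text, pairs).

    cleaned_text keeps only the translations; pairs lists the
    (source, translation) tuples that would be saved to the TM.
    """
    pairs = [(m.group(1), m.group(3)) for m in SEGMENT.finditer(text)]
    cleaned = SEGMENT.sub(lambda m: m.group(3), text)
    return cleaned, pairs

uncleaned = (
    "{0>Open the file.<}0{>Öffnen Sie die Datei.<0} "
    "{0>Save it.<}100{>Speichern.<0}"
)
cleaned, pairs = clean(uncleaned)
print(cleaned)
print(pairs)
```

The cleaned text contains only the translations, and the collected pairs are what a subsequent translation session would search for matches.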

Figure 2: Translating with Anaphraseus
Next time you want to translate a document using the TM file, Anaphraseus will attempt to find exact or fuzzy matches for each segment. When it finds a match to the currently translated segment, it inserts it in the gray box. The value between the segment and translation boxes indicates how exact the match is. As you continue to translate and clean the document, the TM file will grow, and Anaphraseus will get better and better at finding translations for you.
Although Anaphraseus is still at an early stage of development, it is already quite usable and holds a lot of promise. If you need a CAT tool that won’t cost you a dime and works with OpenOffice.org, Anaphraseus is just the ticket.
by Dmitri Popov of Nothickmanuals.info
It’s fantastic! But still have little bugs when export TM
Comment by esperisto — 15 October 2008 @ 5:52 am