Chinese Dictionaries for OmegaT

by Weedy Tan on January 25, 2014

When I first started using OmegaT, I couldn’t figure out how to find and install a suitable Chinese dictionary. I didn’t care much at first, since I could use online Chinese <> English dictionaries while I was still learning OmegaT. However, after reading some old posts and discussions in the OmegaT Yahoo support group, I decided to research this and find a way to install Chinese dictionaries.

There are, in fact, many Chinese <> English dictionaries available, in both Traditional and Simplified Chinese. However, as a novice OmegaT user, I couldn’t understand the differences among them. According to the OmegaT manual, I needed to find a packed (zipped) file with the *.tar.bz2 extension. When unpacked, it should contain 3 files with the following extensions:

1. *.dict.dz
2. *.idx
3. *.ifo

All three files should share the same base file name.
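If you want to double-check an unpacked folder before handing it to OmegaT, the rule above can be sketched in a few lines of Python. This is only a rough helper of my own, not something that comes with OmegaT; the function name and the sample file names are made up for illustration.

```python
from pathlib import PurePosixPath

# The three extensions an unpacked dictionary should contain.
REQUIRED_SUFFIXES = (".dict.dz", ".idx", ".ifo")


def complete_dictionaries(file_names):
    """Return the base names for which all three required files are present."""
    found = {}
    for name in file_names:
        file_name = PurePosixPath(name).name  # ignore any folder part
        for suffix in REQUIRED_SUFFIXES:
            if file_name.endswith(suffix):
                base = file_name[: -len(suffix)]
                found.setdefault(base, set()).add(suffix)
    return sorted(
        base for base, suffixes in found.items()
        if suffixes == set(REQUIRED_SUFFIXES)
    )


# Hypothetical file names, used only to show the check.
print(complete_dictionaries(["demo/demo.dict.dz", "demo/demo.idx", "demo/demo.ifo"]))
```

A dictionary whose three files do not share one base name, or that is missing one of them, is simply left out of the result.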

One of the best websites I found for good dictionaries was http://abloz.com/huzheng/stardict-dic/, and two of its subdirectories had what I needed.

http://abloz.com/huzheng/stardict-dic/zh_CN/
Here, you can find various Simplified Chinese dictionaries.
Some of the dictionaries (ZH-CN) I have tested and found working are:

- langdao-ce-gb dictionary (zh_CN – en) 朗道汉英字典 – (stardict-langdao-ce-gb-2.4.2.tar.bz2)
- the MDBG CC-CEDICT Chinese-English dictionary – (stardict-mdbg-cc-cedict-2.4.2.tar.bz2)

http://abloz.com/huzheng/stardict-dic/zh_TW/
And here, you can find Traditional Chinese dictionaries.
Here are the tested and working dictionaries for ZH-TW:

- langdao-ce-big5 dictionary (zh_TW – en) 朗道漢英字典 – (stardict-langdao-ce-big5-2.4.2.tar.bz2)
- xdict-ce-big5 dictionary (zh_TW – en) – (stardict-xdict-ce-big5-2.4.2.tar.bz2)

After downloading the “*.tar.bz2” files above and unpacking each one into its 3 files, put those files in the “dictionary” subdirectory of your project (for example, c:/project name/dictionary, where “project name” is the name you gave the project when you created it in OmegaT).
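If you prefer not to unpack by hand, the same step can be sketched with Python's standard tarfile module. Treat this as a rough sketch under my own assumptions: the function name is invented, and the archive path and project folder are whatever you pass in. It copies only the three dictionary files straight into the project's “dictionary” folder and ignores any sub-folder inside the archive.

```python
import tarfile
from pathlib import Path

# Only these three kinds of files are copied out of the archive.
DICTIONARY_SUFFIXES = (".dict.dz", ".idx", ".ifo")


def install_dictionary(archive_path, project_dir):
    """Copy the dictionary files from a .tar.bz2 archive into <project>/dictionary."""
    target_dir = Path(project_dir) / "dictionary"
    target_dir.mkdir(parents=True, exist_ok=True)
    installed = []
    with tarfile.open(archive_path, "r:bz2") as archive:
        for member in archive.getmembers():
            file_name = Path(member.name).name
            if member.isfile() and file_name.endswith(DICTIONARY_SUFFIXES):
                # Write the file directly into the dictionary folder,
                # dropping any sub-folder used inside the archive.
                source = archive.extractfile(member)
                (target_dir / file_name).write_bytes(source.read())
                installed.append(file_name)
    return sorted(installed)
```

For example, `install_dictionary("stardict-langdao-ce-gb-2.4.2.tar.bz2", "c:/project name")` would return the names of the files it copied, assuming the archive sits in the current folder.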

Now you are all set and ready to go!

You open your project, load your source file, and expect to see some Chinese-to-English dictionary words pop up in the “Dictionary” pane of OmegaT as you go from one segment to another.

Surprise! Surprise! No dictionary words come up! Why?

If you are like me, then your “Source Language Tokenizer” in the project “Properties” was set to the default “LuceneSmartChineseTokenizer”.

This particular tokenizer seems unable to find any dictionary words, though that is not entirely true. I will explain in another post what I observed about how the different tokenizers behave.

In the meantime, to see the dictionary words, I suggest you change the tokenizer to “LuceneCJKTokenizer”. To do this, go to “Project”, then “Properties”, and choose “LuceneCJKTokenizer” under “Source Language Tokenizer”. This tokenizer can find two-character Chinese words that are in your dictionary. Example: 書籍，預約。
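To get a feel for why two-character words are the ones that match, here is a small Python illustration of the general idea behind a CJK bigram tokenizer: each run of Chinese characters is cut into overlapping pairs, and each pair can then be looked up in the dictionary. This is not OmegaT's or Lucene's actual code. The function name, the character range it checks, and the way it handles a lone character are my own simplifications.

```python
def cjk_bigrams(text):
    """Split text into overlapping two-character pairs, breaking at punctuation."""
    tokens = []
    run = ""
    # The trailing space flushes the last run of Chinese characters.
    for ch in text + " ":
        if "\u4e00" <= ch <= "\u9fff":  # basic CJK ideograph range only
            run += ch
            continue
        if len(run) == 1:
            tokens.append(run)  # a lone character is kept as-is (my choice)
        tokens.extend(run[i:i + 2] for i in range(len(run) - 1))
        run = ""
    return tokens


print(cjk_bigrams("書籍，預約。"))  # ['書籍', '預約']
print(cjk_bigrams("預約書籍"))      # ['預約', '約書', '書籍']
```

In the second call, the pair 約書 straddles two words. Pairs like that would simply find no dictionary entry, while 預約 and 書籍 would.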

Unless you have done something terribly wrong <g>, you should see some dictionary words as you go from one segment to another.

If for some reason you are still having trouble seeing the Chinese dictionary words, feel free to let me know and I’ll try my best to help you.

Enjoy!

5 thoughts on “Chinese Dictionaries for OmegaT”

Skybridge Translation says:

January 28, 2014 at 9:40 am

Reblogged this on Skybridge Translation and commented:
Thanks for sharing this Weedy! Our discussions along with James have been really constructive for me. Thx guys! Learning what school doesn’t seem up-to-date on teaching. OmT for freelance translator! Cheers

Felipe says:

August 30, 2014 at 4:40 pm

Thanks a lot,

Liú Zhōngjūn says:

April 7, 2015 at 2:30 pm

This information is great. I am trying to use OmegaT for French – Chinese translation, but I can’t find dictionaries. Could you help, as the links above are dead now?

Nora says:

November 13, 2016 at 6:26 am

Thank you very much for this how-to-do explanation! Really made my day! Best from Germany!
Do you have any idea why the entries from the MDBG dictionary don’t come up in the order they appear in the text, nor alphabetically, nor by stroke number? Is there an option to change that? I’m still a beginner with OmegaT.

kachuuu says:

December 20, 2017 at 10:38 pm

Yeees Thank you so much! Peace from Marseille!
