Moheb's Coptic Pages
Tuesday, February 26, 2013

Coptic text proofing with Hunspell


Hunspell is a powerful spell-checking library and command-line utility with Unicode support. It is the spell checker of LibreOffice, OpenOffice.org, Mozilla Firefox 3 and Thunderbird, and Google Chrome, and it is also used by proprietary software packages.
Hunspell is open source, written mainly in C++ for Linux, and is available as source code from http://hunspell.sourceforge.net. Most readers will have little need to use hunspell directly under Linux; those who do can download the language files I have written for Coptic here.
These language files include all the Coptic (Bohairic) words of the liturgies and the New Testament. The .tgz file also contains a sample test file in the format used by Abiword and in UTF-8.
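For readers who do want to run hunspell directly, the following is a minimal sketch of listing the misspelled words of a UTF-8 text file. The function name list_misspelled is hypothetical, and the sketch assumes that hunspell is in the PATH and that the Coptic dictionary is installed under the name cop_EG in a place where hunspell looks for dictionaries.

```shell
# list_misspelled FILE [DICT]
# Print each misspelled word of FILE once, sorted.
# Assumption: the dictionary name defaults to cop_EG.
list_misspelled() {
    file="$1"
    dict="${2:-cop_EG}"

    if ! command -v hunspell >/dev/null 2>&1; then
        echo "hunspell not found in PATH" >&2
        return 127
    fi

    # -l lists misspelled words, -i sets the input encoding,
    # -d selects the dictionary; the text is read from standard input.
    hunspell -l -i utf-8 -d "$dict" < "$file" | sort -u
}
```

A call such as list_misspelled sample.txt (with sample.txt standing for any UTF-8 Coptic text file) would then print the words hunspell does not know.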

Below, I describe how to configure hunspell for:

Coptic Spell Checking in OpenOffice
(for Linux & Windows)

OpenOffice "knows" Coptic and can handle it: you will find the language "Coptic" in the pull-down menu of languages. If you have installed a Coptic keymap, you can write Coptic in OpenOffice and use all functions such as searching, changing to upper or lower case, etc.

Coptic Text in OpenOffice

Starting from OpenOffice version 3.0, the installation of a Coptic spell checker is quite simple:
  • Install a Unicode font that includes the Coptic code points.
  • Install a Unicode Coptic keymap as described here: for Linux or for Windows
  • Download the latest version of OpenOffice at OpenOffice.org and install it.
  • The Coptic dictionary is available as an extension. You can get the extension from my server or from the extension server of OpenOffice.
  • The spell checker should now work. Try opening any Unicode Coptic text file, for example this sample text file or this sample odt file. Don't forget to choose a Unicode font which supports Coptic, and to set the language to Coptic in the dialog Format -> Character. You should see a small symbol in the language selection list to the left of the entry "Coptic", indicating that the dictionary has been installed correctly. Note that you may have to restart OpenOffice to see this symbol, which also means exiting the quick launcher (under Windows).
  • Right-clicking a misspelled word displays suggestions in a pop-up menu. The font may be ugly, small, unreadable or all of these. This also applies to the font in the "Spelling and Grammar..." dialog. You can change this.

Uncheck System Font

  • Go to Tools -> Options -> OpenOffice.org -> View and uncheck the option "Use system font for Interface".

UI Font Replacement

  • In the same dialog, select Font and check "Apply replacement table". Replace the font "Open Symbol" with, for example, "ArialCoptic" or any other Unicode font you like that supports the Coptic glyphs. In certain versions, the default UI font is "Andale Sans UI" instead of "Open Symbol". Try replacing both fonts with a Unicode Coptic font.
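
The extension step above can also be done from a terminal with unopkg, the extension manager tool shipped with OpenOffice and LibreOffice. The following is only a sketch: the function name install_oxt is hypothetical, and it assumes that unopkg is in the PATH and that the downloaded extension file is passed as the argument.

```shell
# install_oxt FILE.oxt
# Install a downloaded dictionary extension for the current user.
install_oxt() {
    oxt="$1"

    if [ ! -f "$oxt" ]; then
        echo "no such file: $oxt" >&2
        return 1
    fi
    if ! command -v unopkg >/dev/null 2>&1; then
        echo "unopkg not found in PATH" >&2
        return 127
    fi

    # "unopkg add" registers the extension with the office suite.
    unopkg add "$oxt"
}
```

As with the graphical installation, the office suite (including the quick launcher under Windows) may have to be restarted before the dictionary shows up.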

Coptic Spell Checking for Abiword

Abiword uses a spell-checking wrapper called Enchant, which can work with several back ends; Hunspell is one of them. The installation process differs between Linux and Windows.
Installation for Windows
The latest stable version, 2.8.6, can only flag misspelled words by highlighting them (wavy red lines). To achieve this, create a new directory named "myspell" under the dictionary directory of Abiword (for example: C:\Program Files\AbiWord\dictionary). Then copy the hunspell files for Coptic there (unzip to get the cop-Eg.dic and cop-Eg.aff files). According to the developer of Abiword, the next version, 3.0, will have full Unicode support and hopefully full spell checking support.

Installation for Linux

Abiword Snapshot

Follow these steps:
  • Get and install hunspell as described here. Download the dictionary files from there, create a new directory in your home directory and name it .enchant. Create a subdirectory in .enchant and name it hunspell. Copy the downloaded and untarred dictionary files there.
  • If you have installed hunspell with ncurses support, you can test whether it works by typing:
hunspell -d cop_EG <sample_text>
  • Install at least one Unicode font with Coptic glyphs. I have re-mapped some well-known fonts in the public domain (CS Coptic) and merged them into the FreeSerif Unicode font. I advise you to use FreeSerifAthanasius, since I have adjusted the anchor points for the combining marks in it (to position the Jinkim accurately). To get this font and a few others, follow this link.
  • Get the source code of Enchant and install it; make sure that, when running ./configure, it reports that it found hunspell.
  • Create a file named enchant.ordering in the .enchant directory, in which you define which spell checker Enchant shall use for which language. It might look something like:
*:ispell
cop:hunspell
en_UK:myspell,aspell,ispell
  • Download the patch I have prepared and apply it to the base directory of Abiword. This patch adds the language "Coptic" to the pull-down menu of languages in Abiword. Move to the parent directory of Abiword and type:
 patch -p0 < abiword-2.4.6-coptic.patch
  • Make sure that all the packages that Abiword needs to compile are installed on your system. You will probably only need to compile the core abi package itself (the abi sub-directory in abiword-2.4.6). All other packages should either be already installed on your system (like iconv) or are not needed for the Linux installation. If some library is missing, consult the manual of your Linux distribution to learn how to install it. Then compile abi and install it:
cd abi
./configure
make
make install
  • Start Abiword, load the sample file that comes with the Coptic dictionary files and test whether it works. If not, make sure that you are using a Unicode Coptic font. Also check whether the text is recognized as cop_EG (this should be displayed at the right of the lower bar of Abiword, see the screenshot above).
  • To write your own text, you have to install a Coptic keymap as described in this page. Be aware that, in contrast to the old CS-Coptic fonts (non-Unicode), the Jinkim and the overbar (in fact all combining diacritical marks) have to be entered after the letter, not before.
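
The directory and enchant.ordering steps above can be sketched as one small shell function. This is only an illustration under stated assumptions: the function name setup_enchant_dir and its two arguments are hypothetical, the dictionary files are assumed to be already untarred into the source directory, and the ordering file simply repeats the example content shown in the steps.

```shell
# setup_enchant_dir ENCHANT_DIR DICT_SRC
#   ENCHANT_DIR  the per-user Enchant directory, e.g. "$HOME/.enchant"
#   DICT_SRC     directory holding the untarred .aff and .dic files
setup_enchant_dir() {
    enchant_dir="$1"
    dict_src="$2"

    # Create .enchant and its hunspell subdirectory in one go.
    mkdir -p "$enchant_dir/hunspell" || return 1

    # Copy whatever .aff/.dic files are present in the source directory.
    for f in "$dict_src"/*.aff "$dict_src"/*.dic; do
        if [ -f "$f" ]; then
            cp "$f" "$enchant_dir/hunspell/" || return 1
        fi
    done

    # Tell Enchant which spell checker to use for which language.
    cat > "$enchant_dir/enchant.ordering" <<'EOF'
*:ispell
cop:hunspell
en_UK:myspell,aspell,ispell
EOF
}
```

It could be called as setup_enchant_dir "$HOME/.enchant" ./coptic-dict, where ./coptic-dict stands for wherever the dictionary archive was unpacked. Note that the function overwrites an existing enchant.ordering file, so adapt it if you already have one.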

Coptic Spell Checking for LibreOffice

The installation procedure is identical to that for OpenOffice, as described above.
Good luck!


Moheb Mekhaiel email