[PLUG] Devanagari unicode: composite letters ( जोडाक्षरे ); latex

Mayuresh mayuresh at acm.org
Sat Jan 23 22:38:08 PST 2010


> How to write "varyavar" (read in marathi) using SCIM mr-itrans?

During this mail thread I realized the following things that I wish to summarize:

Correct rendering of a letter depends on the following factors:

1. Availability of Unicode for a certain letter

This is usually not a problem, as Unicode covers Devanagari quite comprehensively. Problems are more likely to arise from one of the remaining factors below.

2. Availability of right ligatures when joining codes together to form composite letters

Unicode's strategy is to stay out of specifying ligatures. (See the references cited earlier on this thread.) Ligatures are about joining letters into composite forms as per the rules of the language, and it is left to the fonts to provide them. This is fairly complex in Devanagari, but to the extent I saw, even this is well taken care of in various fonts.
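To illustrate the split between codes and ligatures, here is a minimal Python 3 sketch (assuming only the standard unicodedata module; the helper name describe is made up for this example). It lists the code points stored for the word in the subject line. The composite letter in the middle is stored as a consonant, a virama (halant) and another consonant; drawing them as one joined shape is left to the font.

```python
import unicodedata

# The word from the subject line, written as escapes so that the
# stored code points are unambiguous.
word = "\u091c\u094b\u0921\u093e\u0915\u094d\u0937\u0930\u0947"


def describe(text):
    """Return a (hex code, Unicode name) pair for each code point in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


# Only ASCII is printed here, so this works even on a terminal
# that cannot render Devanagari.
for code, name in describe(word):
    print(code, name)
```

Positions 5 to 7 of the listing (KA, VIRAMA, SSA) are the composite letter; nothing in the encoded text says what the joined form should look like.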

3. Support for both these in the application

Applications like editors, mail clients etc. are usually good at supporting this. Applications like terminals lag a bit here, particularly with ligatures, but what they offer is usually sufficient.

4. Input method to enter such characters

Input methods such as SCIM do a pretty decent job. When there is a problem in producing some letter, it usually lies here: the input method has no way to enter that letter. I encountered problems with Au as in August, Aa as in Actor and, as the mail above mentioned, rya as in varyawar. Fortunately factors 1 through 3 are all right for all these examples.

I deal with such letters in the following way:

i.  Search for the Unicode code points (on the web). There will be documents providing bitmaps of the various letters. Type the code points using Ctrl-v in vi, then copy the text from there and paste it where you need it. (No matter how it looks on your terminal when you enter it in vi, copy-pasting it into a ligature-aware program will render it correctly.) In this case the code points are u0931, u094d, u092f. This specific appearance of "ra" is called "eyelash ra".

OR

ii. Search for a related word directly on the web. If you are lucky you'll find some context that you can type into a search. E.g. if you want varyawar, search by varat. Maybe 1 out of 10 pages gives you the correctly written word (since many people face this problem). Copy-paste from there.

iii. Maintain an oowriter (or similar) file of frequently encountered, difficult-to-enter letters as a library. (Fortunately these are few.) This comes in handy if you type Devanagari frequently.
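As an alternative to typing the codes in vi, the same sequence can be produced with a few lines of Python 3. This is only a sketch: the names LIBRARY and code_points are made up for this example, and the dictionary is a stand-in for the library file suggested in iii. The only data in it is the sequence quoted in i.

```python
import unicodedata

# A tiny "library" of hard-to-enter sequences, keyed by a reminder.
# u0931, u094d, u092f is the sequence mentioned in i above.
LIBRARY = {
    "rya (eyelash ra + ya)": "\u0931\u094d\u092f",
}


def code_points(text):
    """Return 'U+XXXX NAME' for each code point, to check what was stored."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


if __name__ == "__main__":
    for label, text in LIBRARY.items():
        # The terminal may draw this badly; the copied text is still
        # correct and renders properly in a ligature-aware program.
        print(label, "->", text)
        for line in code_points(text):
            print("   ", line)
```

Run it in a terminal with a UTF-8 locale and copy the printed letters from there, the same way as from vi.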

I know this doesn't answer the question of how to get an input method to do all this. But this is something practicable, I felt.


PS: I just changed my mailing system to mutt. This mail is perhaps the last that will appear to break threads in threaded clients.


