Finally, UTF-8 locale (and about Compose)
#########################################

:date: 2005-09-23T01:16:00
:category: computer
:tags: utf8, linux

I have finally bitten the bullet and switched my locale to
`cs\_CZ.UTF-8`_. While I was still writing this blog in gvim (see `the
end of my relation with vim`_ and here_), I began to write it in
`UTF-8`_, and it was such a relief. Suddenly, I didn’t have to use ugly
kludges like \`\` or --.

Of course, the problem is that there are suddenly so many supplementary
characters available that no keyboard layout can handle all of them (I
think), so some other solution has to be found. Vim has digraphs_, which
are really quite useful, but as with everything else in vim, there is no
connection to the outside world. Switching to Kate/KWrite was a very
pleasant experience, but obviously they have no native digraphs.

My first reaction was to use `HTML entities`_ and translate them to the
pure UTF-8 version with `my special Python script`_. However, I felt
very strongly that this was not the way. `I asked on cz.comp.linux`_
about people's experience with inserting these non-keyboardish
characters, and the answer was “Use the Compose key”.

I began to search on Google for how to make it work, and finally I found
that the best source of information about the Compose key combinations
(aside from `the article on Wikipedia`_) is actually `directly in my
computer`_. The only problem was that with an ISO 8859-2 based locale
only a very small part of the key combinations actually worked. This was
the last straw that broke the back of my resistance to switching the
whole computer to UTF-8.

The problem is (as always) `Midnight Commander`_, whose Debian version
doesn’t work with UTF-8 at all (the panel frames are especially
affected).
So, again, Googling and Googling until I found `this thread on some
discussion board`_, which contains `a link to patched version of MC`_
(it also requires a `non-standard version of slang`_), which somehow
works in my console. However, MC is not critical for me anymore, now
that Krusader_ is finally `stable enough and featurefull enough to
compete with MC`_.

One more problem: when I switched to UTF-8, many filenames with accented
characters were suddenly broken. I thought that Linux filesystems
already stored all metadata in UTF-8. Oh well, they probably don’t. So I
had to run the output of ``locate`` through ``cstocs`` and then use
``diff`` to find out what had changed.

Looking at this whole issue from at least some distance, it seems to me
that the Compose key actually combines the best of all the options: it
works as well as vim’s digraphs, but it is X11-wide, which is cool (and
yes, of course, it is much better than M$-Windows’s ``Alt+``).

.. _`cs\_CZ.UTF-8`: http://www.cestina.cz/pocestovani/unix/
.. _`the end of my relation with vim`: https://groups.google.com/forum/#!topic/linux.debian.maint.kde/F7Knza9mkmY
.. _here: {filename}why-yzis.rst
.. _`UTF-8`: http://www.cl.cam.ac.uk/%7Emgk25/unicode.html
.. _digraphs: http://www.vim.org/htmldoc/digraph.html
.. _`HTML entities`: http://www.htmlhelp.com/reference/html40/entities/
.. _`my special Python script`: http://www.ceplovi.cz/matej/progs/scripts/deent
.. _`I asked on cz.comp.linux`: https://groups.google.com/group/cz.comp.linux/browse_frm/thread/f86c0425956b296c/
.. _`the article on Wikipedia`: https://en.wikipedia.org/wiki/Compose
.. _`directly in my computer`: file:///usr/X11R6/lib/X11/locale/en_US.UTF-8/Compose
.. _`Midnight Commander`: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=242194
.. _`this thread on some discussion board`: http://forum.ubuntu.ru/index.php?topic=83.msg529#msg529
.. _`a link to patched version of MC`: http://yuozhny.ru/deb/mc/mc_4.6.0-1_i386.deb
.. _`non-standard version of slang`: http://yuozhny.ru/deb/mc/mc_4.6.0-1_i386.deb
.. _Krusader: http://www.krusader.org
.. _`stable enough and featurefull enough to compete with MC`: http://linuxtoday.com/developer/2005060900426NWCYSW
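
For the broken filenames, a small Python sketch along the following
lines could stand in for the ``locate``/``cstocs``/``diff`` pipeline
described above. It is only an illustration, not the procedure actually
used: the function names ``propose_utf8_name`` and ``find_legacy_names``
are made up for this example, and it assumes that every name which is
not valid UTF-8 was written in ISO 8859-2. It only lists proposed
renames and never touches the files.

.. code-block:: python

    import os


    def propose_utf8_name(raw_name, legacy_encoding="iso8859_2"):
        """Return the UTF-8 form of a legacy file name, or None if unchanged.

        Assumption: anything that is not valid UTF-8 is in legacy_encoding.
        """
        try:
            raw_name.decode("utf-8")
            return None  # already valid UTF-8 (pure ASCII included)
        except UnicodeDecodeError:
            pass
        # ISO 8859-2 maps every byte, so this decode cannot fail.
        return raw_name.decode(legacy_encoding).encode("utf-8")


    def find_legacy_names(root):
        """Yield (old_path, new_path) byte-string pairs below root.

        root must be bytes so that os.walk returns raw, undecoded names.
        Nothing is renamed here; the pairs are meant for manual review.
        """
        for dirpath, dirnames, filenames in os.walk(root):
            for name in dirnames + filenames:
                new_name = propose_utf8_name(name)
                if new_name is not None:
                    yield (os.path.join(dirpath, name),
                           os.path.join(dirpath, new_name))

Something like ``list(find_legacy_names(b"/home"))`` (the path is just
an example) would give the pairs to check before any ``os.rename``. The
validity test is a heuristic: a legacy name whose bytes happen to form
valid UTF-8 would be skipped, although that should be rare for Czech
text.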