aboutsummaryrefslogtreecommitdiffstats
path: root/hyph/README
diff options
context:
space:
mode:
Diffstat (limited to 'hyph/README')
-rw-r--r--hyph/README124
1 files changed, 124 insertions, 0 deletions
diff --git a/hyph/README b/hyph/README
new file mode 100644
index 0000000..84409bb
--- /dev/null
+++ b/hyph/README
@@ -0,0 +1,124 @@
+Format of Lout hyphenation information files
+
+Jeffrey H. Kingston
+22 December 1992
+21 September 1994
+6 June 1995
+3 April 1996
+
+Basser Lout Version 3 incorporates automatic hyphenation using the
+method introduced by TeX (see Appendix H of the TeXBook by D. E. Knuth),
+with support for multilingual hyphenation. No special action is required
+to install hyphenation unless it is desired to change the hyphenation
+information that controls it.
+
+There is one hyphenation information file for each language, and it is
+named in the langdef of that language. For example:
+
+ langdef German Deutsch { german }
+
+(There will usually be other information between the hyphenation file
+name and the closing brace, not relevant here.) This example means that
+unpacked Lout hyphenation file german.lh or its packed equivalent
+german.lp (see below) is to be used when hyphenating German words. These
+files are kept in the Lout system hyphenation directory (this directory). If
+a language is desired but no hyphenation information file is available, the
+file name may be replaced with -, and then the language will be defined but
+hyphenation in that language will never be attempted. Another possibility
+is to include a placeholder file for the language (see below).
+
+The first time on any run that German hyphenation is required, Lout will
+search the directories of the hyphenation path for a binary file called
+german.lp, which contains a binary form of the hyphenation patterns in
+german.lh, modified so that the file may be shared by big-endian and
+little-endian machines. If german.lp cannot be found, Lout then searches
+for the text file german.lh instead, and uses it to construct german.lp.
+To change the German hyphenation patterns, delete german.lp and modify
+german.lh; the rest is automatic.
+
+Alternatively, if lout is invoked with the -x flag and the langdef line
+above appears in its input, it will read german.lh and produce german.lp
+immediately. This is intended for setting up: it is good to create all
+these packed files at setup time, since a subsequent lout run that needs them
+will not have write permission in the Lout system hyphenation directory.
+
+An unpacked Lout hyphenation information (.lh) file mainly contains a
+long list of TeX hyphenation patterns. It must begin with either
+
+ Lout hyphenation information
+
+or
+ Lout hyphenation placeholder
+
+alone on the first line. In the second case, it is understood that the
+file is a placeholder (i.e. a stub file which might be overwritten with
+a real file in the future), and Lout does not read any futher; the effect
+is that Lout will not hyphenate this language, but not complain about the
+absence of the file either.
+
+In the non-placeholder case, following the header line comes the "Classes:"
+heading followed by the character classes. For example:
+
+ Classes:
+ @!$%^&*()_-+=~`{[}]:;'|<,.>?/0123456789
+ aA
+ bB
+ cC
+ ...
+ yY
+ zZ
+
+The hyphenation process treats the characters in each class as identical
+(so the classes above ensure that the distinction between upper and lower
+case is ignored). By definition, the characters of the first class are
+"non-letters", and the characters of the remaining classes are "letters".
+Notice that these are actual characters, not character names: hyphenation
+files are encoding-specific.
+
+Next comes the "Exceptions:" heading followed by the exceptions, which
+are words (composed of letters and "-" only) whose hyphenation is to be
+treated as a special case. For example:
+
+ Exceptions:
+ ta-ble
+ phil-an-thropic
+
+These words may be hyphenated in the places shown by the "-" characters.
+Character classes are in effect here (Table will be hyphenated as Ta-ble).
+If there are no exceptions, "Exceptions:" may be omitted.
+
+Next comes an optional LengthLimit section, which tells Lout to ignore
+some patterns. For example,
+
+ LengthLimit:
+ 4
+
+means that patterns containing more than 4 letters (note that
+. counts as a letter) are to be ignored. The purpose is to discard
+the least important patterns from files that are too large for Lout
+to handle otherwise. None of the files actually use this at present,
+but hyphenation files seem to be getting larger and larger, and if
+any whoppers come along they might have to be trimmed in this way.
+
+Finally comes the "Patterns:" heading followed by the list of TeX
+hyphenation patterns. Apart from the weighting digits, the patterns
+should contain only letters. Lout understands some TeX escape sequences
+e.g. it will accept \^e anywhere in a hyphenation file as the ecircumflex
+character.
+
+The file may contain comments, which begin with % (either at the start
+of a line or after a white space character) and go to end of line. The
+headings, classes, exceptions and patterns are separated by arbitrary
+white space.
+
+Briefly, hyphenation of a word works like this. If the word contains a
+character not found in any character class, it will not be hyphenated.
+Otherwise the word is analysed into sequences of letters separated by
+sequences of non-letters. Each sequence of five or more letters is
+then matched, either with an exception or else with the hyphenation
+patterns, and hyphenated. The hyphen character "-" is treated specially.
+
+Extreme lengths were resorted to to compress the .lp file as much as
+possible. Files significantly larger than german.lh are likely to cause
+Lout to abort with an error message. Please contact jeff@cs.su.oz.au if
+you have problems with this or anything else.