summaryrefslogblamecommitdiffstats
path: root/modules/hebrew-wlc/source/wlc/michigan.man
blob: 3f6228712072f3d632f7695467e1bd8a18d4ef7f (plain) (tree)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579


































































































































































































































































































































































































































































































































































































                                                                  




                         CODE MANUAL FOR
                   THE MICHIGAN OLD TESTAMENT

                   Research Memorandum UM82-1
                          19 March 1982

                     Dr. H. Van Dyke Parunak
                        1027 Ferdon Road
                      Ann Arbor, MI  48104



                            ABSTRACT

       This document describes the transcription code used in
encoding the Massoretic Text of the Old Testament at the
University of Michigan.  The project was made possible by grants
from the Packard Foundation and the University of Michigan
Computing Center, and by the gracious release granted by the
Deutsche Bibelstiftung, Stuttgart, publishers of Biblia Hebraica
Stuttgartensia.

       This document renders obsolete RMUM8 1-11  (the original
coding manual) and RMUM8 1-22  (addendum 1 to the original
manual).  The code defined here differs from that in those two
memos in six details.

1.   We now distinguish holem waw (`OW') from waw followed by
holem.
2.   Meteg with hatep vowels now has three variants, as defined
in 3.6.1. below.
3.   Telisa qaton, usually postpositive (04), has a distinct code
(24) when it is internal to a word.
4.   Telisa gadol, usually prepositive (14), has a distinct code
(44) when it is internal to a word.
5.   Galgal (formerly 55) is now 93.
6.   Darga (formerly 66) is now 94.

All texts released by the Michigan Project for Computer Assisted
Biblical Studies on or after 1 March 1982 conform to these
modifications.


                                   1. BASIC PRINCIPLES

       The code described in this paper was developed with
several basic principles in view.

       1.1.  This text is the first complete machine-readable Old
Testament text in the public domain.  Because of this, its coding
scheme will probably become the de facto standard transliteration
of biblical Hebrew for computers.  It should be capable of entry
on a wide variety of input devices, including card punches,
optical character recognition (OCR) typewriter elements, and
ASCII and EBCDIC terminal keyboards, and should be printable on
the most limited output devices.  Not all of these devices offer
the same character sets.  For instance, keypunches and some line
printers do not have lower case letters, and some OCR systems do
not have symbol `%'.  In order to allow our code to be used by as
many workers as possible, we have restricted the character set
severely.

       1.2.  Except for the accents, one Hebrew character has one
non-numeric transcription character.  The accents are all coded
with two characters, both of which are digits.  These conventions
greatly simplify automatic processing of the text.  But they rule
out the use of occasional digits as alphabetic characters; the
transcription of some consonants with double characters (for
example, sin with `SH'); or the use of explicit diacritics to
indicate the difference among short, long, and historically long
vowels.

       1.3.  This text is a complete transcription of the
graphical form of the Massoretic Text, as recorded in Biblia
Hebraica Stuttgartensia, with as little analysis as possible. 
Thus, for instance, qames has the same transcription (`F')
whether it represents a short `o' vowel or a long `a' vowel. 
Similarly, both sureq and double waw are written as `W' plus
dages.  Experience indicates that if graphical details are
leveled out by recording only analysis, inevitably those very
details will be needed later.  We have recorded as much as
possible at the outset.  Our overriding rule has been,

           "Code what is WRITTEN, not what is MEANT."


                                   2. RECORD FORMAT

2.1. GENERAL PRINCIPLES

       2.1.1.  The basic units of our text are graphical words 
(or words for short), verses, lines, logical records (records for
short), and files.  A  graphical word is a string of characters
which is separated from other characters in Biblia Hebraica
Stuttgartensia by blank space, the end of a line, or a maqqep (a
dash).  A verse is a series of graphical words which is delimited
in the Hebrew text by the end of a book; occurrences of the
accent sop pasuq, which resembles a large colon (`:'); or a verse
number.  A line is a line on a page of Biblia Hebraica
Stuttgartensia.  One line may contain parts of two different
verses, and one verse may require several lines.  A logical
record is the amount of text keyed at a terminal between carriage
returns or line feeds.  Each logical record of our text contains
80 characters or less.  A file is a series of logical records
which are stored together for the computer's use, and contains
the text of one book of the Old Testament.

       2.1.2.  Every new verse begins on a new logical record. 
Apart from this restriction, each logical record contains as many
whole graphical words as will fit in 80 spaces.  We code the
space between successive words as a space, and maqqep as a minus
sign.  The maqqep is considered part of the word before it, and
so is not separated from that word.  It may be separated,
however, from the following word, if there is not room on the
current record for the following word.  If there is not enough
room on a record for a complete graphical word, we do not code
any of that word on the record.  Instead, we leave the rest of
the record blank, and place the whole word on the next record. 
We record the location of line endings, by placing a question
mark (`?') after the last word in each line of Biblia Hebraica
Stuttgartensia.  This question mark is a legitimate separator for
graphical words, and need not be preceded or followed by a blank
space.  It may appear with a space, though, since extra spaces
between words do not violate the code.

       2.1.3.  The first three logical records of each file
contain, not text, but special header information.  The first of
these records contains the name of the book which that file
contains, in the Latin spelling used in Biblia Hebraica
Stuttgartensia.  The second record contains the following string
of characters:

       )BGDHWZX+YKLMNS(PCQR&$TI"EAFOU:.,-/?#!;*1234567890

The third record contains this same string, in reverse order:

       0987654321*;!#?/-,.:UOFAE"IT$&RQCP(SNMLKY+XZWHDGB)

These records can be used by programs that read the text to
adjust for any idiosyncratic differences between the character
codes in use on different equipment.  The character list occurs
twice so that if it itself contains an error, decoding programs
can detect it and alert the user of the text instead of uncoding
the entire text with a defective key.

       2.1.4.  Lines which begin a chapter have as their first
characters the chapter number, followed by a colon, followed by
the verse number (`1'), followed by a blank space, followed by
the text.  Lines which begin the second and later verses of each
chapter begin with the verse number, followed by a blank space,
followed by the text.  The specimens of coded text for Ruth 4 and
Ps 1 in the Appendix illustrate the record format.


                                   3. CODING WORDS

3.1. OVERVIEW

       Six details of Biblia Hebraica Stuttgartensia are encoded
in this text:  the open and closed paragraph marks; the
consonantal skeleton of the inflected forms of words; the vowels;
the accentuation signs; ketib-qere variants; and morphological
divisions for certain morphemes.  The codes for these different
details are interspersed with one another.  For instance, the
first word of the Bible is coded `B.:/R")$I73YT'.  The
consonantal skeleton of this word is represented by the letters
`BR)$YT', while the signs `:"I' are the vowels.  The number `73'
represents the tipha accent on the word, the period after `B' is
the dages, and the slash before `R' indicates that the symbols
before it are a separate morpheme from those after it.  We will
discuss each category of symbol separately.  However, in coding,
all are used at once.  Within each syllable, the consonant is
coded first, followed by the sign for dages or rape, followed by
the vowel, followed by the accentual code, followed by a closing
consonant or mater lectionis.  This order is rigidly followed,
except in the exceptions noted below.

       The Appendix contains photocopies of Ruth 4:4-6 and Ps
1:1-3 from BHS.  We will illustrate the coding principles from
these passages.  We refer to these texts in the form `Ruth
4:4.12', where `12' refers to the twelfth graphical word of the
verse.  Usually, in presenting an example, we will only code the
features which we have already discussed.  Examples illustrating
the consonantal code will not be vocalized or accented, because
the vocalization and accentuation codes are discussed after the
consonantal code.   The Appendix contains completely coded forms
of the two specimen passages, in correct record format, so that
the full effect of the code may be seen.  These specimens should
be studied in detail after the entire description of the code has
been read.

3.2. PARAGRAPH INDICATORS

       Biblia Hebraica Stuttgartensia uses the characters pe and
samek between words (usually between verses) to mark two levels
of paragraphs, commonly referred to as "open" and "closed,"
respectively.  We code these symbols as `P' and `S',
respectively, set off from their surrounding words by blanks,
line markers (`?'), or the end of a logical record.  The
specimens in the Appendix contain no examples of these paragraph
markers.

3.3 CONSONANTS

       3.3.1.  The Hebrew alphabet in traditional order is
transcribed

                     )BGDHWZX+YKLMNS(PCQR&$T

We do not distinguish medial and final forms of letters, since we
wish to keep our character set small and since these are
completely predictable from their position in a word.  Some OCR
type faces do not have parentheses, but do have left and right
curly brackets (`{' and `}'), which may be used for the `ayin and
'alep, respectively.  We relax our strictly graphical stance a
bit in distinguishing sin (`$') and sin (`&') as separate
characters, because in this case the alternatives complicate
further processing too much.  In the rare cases when the sin/sin
character occurs without a point (as in the name `Issachar' in
Judges 5:15.2), we code it with the character `#'.

       3.3.2.  A consonant with a dages, whether strong or weak,
is followed immediately, before any vowels or other consonants,
by a period to represent the dages.  Ignoring vowels, accents,
and morphological divisions, Ruth 4:4.8 is coded `HY.$BYM', not
`HYY$BYM'.  The interpretation of the dages as representing
doubling of the consonant belongs to the analysis of our text. 
We record here simply the graphic image on the page, which in
this case consists of consonant and dages.  We do not distinguish
dages indicating consonantal doubling from dages indicating the
hardening of a `BGDKPT' letter or mappiq (the dot in a final `H'
indicating that it is a real consonant and not a mater
lectionis).  All are coded simply as a period after the
consonant.  The rule is,
           "Code what is WRITTEN, not what is MEANT."

       3.3.3.  In keeping with this philosophy, a waw with a dot
in it is coded as `W.' whether it represents a doubled consonant
or a historically long `u' vowel.  Thus the imperative "turn!" is
`$W.B' (where `W.' registers a historical long `u') while "he
will hope" is `Y:QAW.EH'  (where `W.' is the doubled second
radical of the pi`el).  Both uses occur in the pi`el third person
plural suffix conjugation form `QIW.W.'  "they hoped"  (Ps
56:7.7).  The two uses of `W.' are readily distinguished in later
processing because doubled consonantal waw always follows a
vowel, while sureq never does.

       3.3.4.  One of the uses of the dages is to indicate the
plosive pronunciation of the six consonants `BGDKPT', which have
alternative fricative pronunciations.  The rule is that if one of
these consonants has a dages, it is pronounced hard, and
otherwise soft.  To emphasize the soft nature of these consonants
in some contexts, the Massoretes used the rape, a horizontal
stroke over the consonant, which simply repeats the information
conveyed by the lack of the dages.  When rape occurs over a
consonant, we code it as a comma (`,') immediately following the
consonant.  Rarely  (Exod 20:13,15) a consonant contains both
dages and rape!  In such cases, we code the consonant, then
dages, then rape, and finally any vowel.

3.4. VOWELS

       3.4.1.  The vowels are coded as indicated in the Appendix. 
In keeping with the graphical nature of our undertaking, we make
no distinction between vocal, silent, and medium sewa.  Thus both
Ruth 4:4.2   `)FMAR:T.IY' and 4:4.4  `)FZ:N:KF' use the same sign
for sewa.  We also do not distinguish between the `o' and `a'
pronunciations of qames.  Both of the last examples illustrate
the use of `F' for qames.  The same sign is used to code qames
hatup in the qere of Ruth 4:6.5, `LIG:)FL-'.

       3.4.2.  We have already noted that the sureq, the long `u'
vowel written with a waw, is coded as `W.', just like a doubled
waw.  Other vowels written with matres lectionis are coded as the
simple vowel followed by the appropriate consonant.  Thus hireq
yod is coded as `IY', holem waw is `OW'; sere yod is `"Y', and so
forth.  Holem waw  (`OW') is distinguished from waw followed by
holem  (`WO'), since Biblia Hebraica Stuttgartensia makes a
graphical distinction between these two forms.  Quiescent 'alep
follows the vowel, as in Ruth 4:5.1 `WAY.O)MER'.

       3.4.3.  Hatep vowels are coded as sewa followed by the
appropriate simple vowel.  Thus hatep qames is `:F', hatep segol
is `:E', and hatep patah is `:A'.  The relative pronoun in Ps
1:1.3 is coded `):A$ER'.

       3.4.4.  When the letter sin is followed, or sin preceded,
by holem, the single dot over one horn of the consonant often
serves to mark both the consonant and the vowel.  Because this
text is a graphical transcription, no vowel is coded in these
cases.  However, when the holem is represented in the text by a
separate dot, it is coded separately, as in Ruth 4:4.8:
`HAY.O$:BIYM'.

       3.4.5.  Furtive patah is coded AFTER the guttural. Ps
1:3.17  (the last word in the verse) is `YAC:LIYXA'.  Though this
convention conflicts with the traditional pronunciation, it is
less ambiguous for automatic processing than the alternatives.

3.5. ACCENTS

       3.5.1.  The accent code which we use was devised by Robert
Eckert, and is highly mnemonic.  It is our only violation of the
"one character for one sign" rule in transliteration.  We use two
digits to code each accent.  The first digit for each accent
records the POSITION of the accent with respect to the consonant,
while the second digit records the SHAPE of the accentual sign. 
The accents are summarized in the table at the back of this memo.

       The first digit is zero for postponed accents and one for
preposed accents.  Other initial digits are even for accents that
appear above consonants, and odd for those that appear below. 
The first digit is six for superposed marks which also occur
under letters, eight for superposed marks which never occur under
letters, seven for subposed marks which can also occur over
letters, and nine for subposed marks which never occur over
letters.  Codes beginning with two, three, four, and five are
used for miscellaneous special characters, following the pattern
"even above, odd below."

       The second digit (the rows of the table) records the
graphic appearance of the accent.  To remember the graphic digit,
associate the numbers 0-5 with sewa, 'alep, bet, gimel, dalet,
and he.  Then remember the mnemonic line, "sewa means pass
quickly left, 'alep likes to take segol, bet means house, gimel
has a wrong-way tittle, dalet has Jachin and Boaz, he has a
detached bar."

(0)  Accents with second digit 0 either resemble sewa, or are a
left arrow.  Sewa hastens the pronunciation of a syllable, moving
the reader over it more rapidly than if a full vowel were given. 
Since one reads Hebrew from right to left, sewa moves the reader
to the left rapidly.

(1)  'alep as a verbal prefix frequently takes the vowel segol
rather than hireq.  Some of the accents in this row resemble
segol and hireq.  The others are (or include) a single slash
upward to the right, which is the side of a verb to which 'alep
is joined. 

(2)  A house (bet) has a peaked roof, at least in the western
European tradition.   The accents in this row are either the peak
of the roof, or two slanted lines which could be rearranged to
make a roof.

(3)  The tittle on most tittled letters (bet, dalet, zayin) is
more or less horizontal.  On gimel, it slopes sharply down to the
right, like most of the accents in this row.  The two exceptions
may also be considered "wrong-way" characters.  Pazer (83) is a
"wrong-way" qames, and galgal (93) is a "wrong-way" 'atnah.

(4)  The pillars at the doorway (dalet) of Solomon's temple were
decorated with square checkerboard patterns and circular chains 
(1 Kings 7:17), corresponding to the circles with links, the "s"
curve, and the square corners of the accents in this row.

(5)  All the accents in this row have a vertical detached bar,
just as does the character he.  Accent 75 serves both for silluq
and for meteg when meteg occurs (as it does most often) to the
left of its vowel.  Accent 95 is reserved for meteg when it
occurs to the right of its vowel, and 35 codes a meteg which
falls between the components of a hatep vowel as at Judges 9:27. 
We cheat a bit to include salselet in the fifth row.  Its
predominant shape is that of a vertical line, though it does have
some squiggles in it.

3.5.2.  The accents in column 0 are coded at the very end of a
word, after all other characters (except maqqep).  Those in
column 1 come at the very beginning, before all other characters
(except *).  Otherwise, the accent comes in the order, consonant,
dages, vowel, accent.  For example, in Ruth 4:5.7, the digits
`92' indicate the accent in the coded form `NF(:FMI92Y'.  Note
that the 92 PRECEDES the consonantal `Y'.  To code `NF(:FMIY92'
would imply that the accent was under the yod rather than (as is
the case here) the mem.  Accents are placed after the first
vocalic code of a syllable.  Accents thus precede matres
lectionis and quiescent consonants.  With accent, Ruth 4:5.1 is
`WAY.O74)MER'.  The accentual symbol `74' follows the vowel of
its syllable immediately, and precedes the quiescent 'alep. 
Because of our strictly graphical encoding of sureq, the accent
on a syllable vocalized with this vowel comes immediately after
the consonant (and dages or rape, if any), and before the `W.'. 
Ruth 4:5.9, the proper name "Ruth," appears `R74W.T', not
`RW.74T'.  Ps 1:3.3 is `$FT93W.L'.

       3.5.3.  Following our graphical philosophy, we have not in
general adopted different codes for two accents which have the
same form but appear in different contexts.  For instance, both
'azla and qadma have the code `63', and that whether they occur
in poetic or prose books.  Because tipha and mayela have the same
form in our text, both are coded as `73', even though one is a
disjunctive accent and the other is conjunctive.  The two can be
distinguished automatically because mayela occurs only in words
which either have 'atnah (92) or silluq (75), or which are joined
by maqqep to such words.  The code for silluq, `75', is also used
for meteg.  Meteg always precedes another accent in the same word
(or pair of words joined with maqqep), while silluq is the major
accent on the last word of each verse.

       3.5.4.  Some accents are compounded of pieces which
resemble other individual accents.  Rather than add further codes
for these compound accents, we code each of the parts separately
as if it were a separate accent.  Thus the rebia` mugras on Ps
1:1.13 appears in two parts, as though two separate accents
rebia` (81) and mugras (11) were used, `11L"CI81YM'.  Note the
positioning of `11' before the first consonant of the form,
reflecting its prepositive position in the text.  The common
poetic disjunctive `ole weyored is coded as in Ps 1:1.7,
`R:$F60(I71YM'.  Paseq (05) and sop pasuq (00) are part of the
preceding word, which will also have another accentual sign. 
Examples are Ps 1:1.3  `):A$E70R05' and Ps 1:1.15  `YF$F75B00'.

       3.5.5.  There are, however, seven pairs of accents whose
members have the same graphic form but which we have coded
separately.  These are yetib (10) and mahpak (70); mugras (11)
and geres, dehi (13) and tipha (73); zarqa (02) and sinnorit
(82); pasta (03) and 'azla (63); and two forms each of telisa
qaton (04,24) and telisa gadol (14,44).  In each of these pairs,
the first member does not fall on the consonant which begins the
accented syllable, as does the second, but comes either before
(yetib, mugras, dehi, and telisa gadol) or after (zarqa, pasta,
and telisa qaton) the entire word.  To guard against mislocating
these accents, we have established separate numerical codes for
each member of these pairs.

3.6. KETIB-QERE

       A word which is marked as ketib in the text is immediately
preceded by an asterisk.  The corresponding qere entry is coded
immediately following, preceded by two asterisks.  The accent in
the text belongs to the qere.  None is coded for the ketib.  Ruth
4:4.20 appears `*W:)"DA( **W:)"75D:(FH03'.  If the ketib is
lacking, the corresponding single asterisk appears, followed by a
blank.  If Ruth 4:4.20 had a qere with no ketib, we would code,
`* **W:)"75D:(FH03'.  Sometimes a ketib has two words and the
qere only one, or vice versa.  In such a case, the * (or **)
precedes EACH of the words involved.

       3.6.1.  The ketib is vocalized according to the editorial
footnote, if one is supplied.  Otherwise we have supplied the
vocalization.  The qere receives the vowels and accents that are
written in the text with the ketib.

       3.6.2.  Sometimes one of two words joined by a maqqep has
a ketib/qere variant.  If it is the second word that is thus
varied, the code is `WORD1-*WORD2  **WORD2QERE'.  On the other
hand, if the first word is varied, the code is `*WORD1 
**WORD1QERE-WORD2'.  Ruth 4:6.5,6 is coded `*LIG:)OWL  **LIG:)FL-
LI80Y'.  The gere to the first word intervenes between the two
words, so as to come directly after its ketib.  The maqqep
belongs to the qere, not the ketib, and is only written once.

       3.6.3.  Perpetual ketib/gere readings are coded just as
they occur in the text.  Thus the Tetragrammaton appears as
`Y:HWFH' (Ps 1:2.4) or `Y:EHOWAH' (with appropriate accent). 
Jerusalem appears as `Y:RW.$FLAIM' (with the accent between the
`A' and the `I'!), and the third feminine singular independent
pronoun in the Pentateuch is `HIW)'.  No asterisks mark any of
these forms.

3.7. MORPHOLOGICAL DIVISION

       3.7.1.  The only analysis in this text marks some basic
morphemes to assist in lemmatization.  The slash (`/') marks the
division of inseparable prepositions (`K', `B', `L', and
contracted `MN', and others with pronouns); the article (only
when marked with a consonant); the interrogative prefix `H' and
its vocalization; the locative suffix `FH'; the conjunctive waw;
and pronominal genitive or accusative suffixes.  Nominative
verbal affixes are not separated from the verb stem.

       3.7.2.  The article is NOT marked when the `H' is absent,
even if it is represented by an `a' vowel or doubling of the
first radical of a word.  Ruth 4:5.5. is `HA/&.FDE73H'.  But "to
the field" would be `LA/&.FDE73H', not `L/A/&.FDE73H'.

       3.7.3.  The separating slash comes after the vowels and
accents of a prefix, but before the first consonant of the base. 
In Ruth 4:5.5, just cited, the sin is doubled by the article, and
its dages thus might be thought to belong to the article.  But
for our purposes the dividing slash comes before the sin.  In Ps
1:3.1, the prefix vowel is accented, and the slash comes just
after the accent: `W:75/HFYF81H'.

       3.7.4.  Unless a pronominal suffix is entirely vocalic, it
is joined to the base by a vowel, whose quantity may range from
silent sewa to a long vowel represented with the mater lectionis
yod.  When the joining vowel is written with yod, the slash
dividing the suffix from the base comes between the yod and the
suffix.  Ruth 4:4.26 is `)AX:ARE92Y/KF'.  In all other cases, the
slash comes just before the joining vowel, even if that vowel is
a silent sewa.  In many cases, the slash will divide the initial
consonant of a syllable from its vowel.  Ruth 4:4.4. is
`)FZ:N/:KF74', and 4:4.11 is `(AM./IY01'.   This is no cause for
alarm, since the slash carries morphological, not phonetic or
graphemic, information.  Even more bizarre cases arise when the
last consonant of a third-weak form disappears before a
pronominal suffix   Ps 1:3.11 is coded `W:/(FL/"71HW.'.

       3.7.5.  The last two paragraphs conflict when a pronominal
suffix is attached directly to a preposition.  In this case, the
joining vowel is reckoned with the pronominal suffix.  Ruth
4:6.6,12 are coded `L/I80Y' and `L/:KF70', respectively.  When a
preposition is lengthened with `MO' or `MOW', the division falls
after the entire syllable:`K.:MOW/HW.', `K.:MO/HW.'.  But when
the `MO' is the suffix, rather than an extension of the
preposition, we code `L/FMOW' (Isa 26:14.13).  Prepositions are
also lengthened at times with `AD,' as in I Sam 20:14.7, where we
also leave the extension as part of the preposition,
`(IM.FD/I91Y'.  The reduplication of the preposition is
separated, though, in `MI/M.EN./IY'.

       3.7.6.  No dividing slash is needed between words which
are separated by maqqep, since in these cases the hyphen shows
the morphological boundary.

        4. DEVIATIONS FROM BIBLIA HEBRAICA STUTTGARTENSIA

4.1. DEVIATIONS IN VERSIFICATION

       We record the versification which Biblia Hebraica
Stuttgartensia gives in unparenthesized Arabic numerals.  Biblia
Hebraica Stuttgartensia gives alternative verse numbers in Deut
5:21-33 (18-30), and the numeration of the Psalms in the
Leningrad Codex is irregular.  Neither of these variations is
recorded in our text.

4.2. OTHER DEVIATIONS

       The character `!' immediately follows any word which we
have coded at variance with Biblia Hebraica Stuttgartensia.  A
printed introduction to the text, now in preparation, will list
all such deviations.  They arise for two reasons.

       4.2.1.  First, Biblia Hebraica Stuttgartensia has some
singular features at isolated, well-known passages, which we have
not incorporated into our code.  The extraordinary points at Gen
33:4.9 and the inverted nuns at Num 10:34.7 are examples.  Since
computer analysis in general deals with repeated phenomena, we do
not complicate the code to include these singularities, but
instead flag the forms to which they are attached.

       4.2.2.  Second, Biblia Hebraica Stuttgartensia contains
many forms that are anomalous from the point of view of
"standard" Hebrew grammar.  These may arise because of an error
in our understanding of Hebrew grammar, an error by the scribe of
the Leningrad codex, or an error by the editors of Biblia
Hebraica Stuttgartensia.  When we judge that an anomalous form
arises because of an error in the preparation of Biblia Hebraica
Stuttgartensia, we correct the form, usually by the Kittel Bible,
and mark it with `!' to show our intervention.  We do not correct
anomalous forms which are attested by several editions, or errors
in the Leningrad codex which the editors of Biblia Hebraica
Stuttgartensia have noted as such.


                       5. ACKNOWLEDGEMENTS

       The design of this code and production of the manual were
supported by the Computing Center, the Project for Computer-
Assisted Biblical Studies, and the Society of Fellows, all of the
University of Michigan.  A preliminary version of this manual was
presented to the Workshop on Computers and the Bible of the
College of Wooster, Wooster, OH, 21-27 June 1981.  The present
scheme owes much to simplifications and modifications suggested
there, as well as to the suggestions and improvements supplied by
the coders who worked with an interim version.  The coding of the
text itself was supported by generous grants from the Packard
Foundation and the University of Michigan.