1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
|
@Section
@Title { {"@Common"}, {"@Rump"}, and "@Meld" }
@Tag { rump }
@Begin
@PP
common.sym @Index { @@Common symbol }
rump.sym @Index { @@Rump symbol }
meld.sym @Index { @@Meld symbol }
The @@Common and @@Rump symbols compare two paragraph objects:
@ID @Code "{ Aardvark, 29 } @Common { Aardvark, 359 }"
If either parameter is not a paragraph object, it is converted into
a single-object paragraph first. The result of @@Common is the
common prefix of the two paragraphs; that is, those initial objects
which are equal in the two paragraphs. In the example above, the
result is {@Code "Aardvark,"}. The result of @@Rump is that part of
the second object which is not included in @@Common; the result of
@ID @Code "{ Aardvark, 29 } @Rump { Aardvark, 359 }"
is {@Code "359"}.
@PP
If the two objects have nothing in common, the result of @@Common will
be an empty object and the result of @@Rump will be the second
object. If the two objects are identical, the result of @@Common will
be the first object, and the result of @@Rump will be an empty object.
@PP
The only known use for @@Rump and @@Common is to implement merged index
entries (Section {@NumberOf sorted}).
@PP
The @@Meld symbol returns the minimum meld of two paragraphs, that
is, the shortest paragraph that contains the two original paragraphs
as subsequences. For example,
@ID @Code "{ Aardvark , 1 , 2 } @Meld { Aardvark , 2 , 3 }"
produces
@ID { Aardvark , 1 , 2 } @Meld { Aardvark , 2 , 3 }
The result is related to the well-known longest common substring, in
that the meld contains everything not in the lcs plus one copy of
everything in the lcs. Where there are several minimum melds, @@Meld
returns the one in which the components of the first parameter are as
far left as possible.
@PP
Determining the values of all these symbols requires testing whether
one component of the first paragraph is equal to one component of the
second. Since Version 3.25, the objects involved may be arbitrary
and Lout will perform the necessary detailed checking for equality;
previously, only simple words were guaranteed to be tested correctly.
Two words are equal if they contain the same sequence of characters,
regardless of whether they are enclosed in quotes, and regardless
of the current font or any other style information. Otherwise,
objects are equal if they are of the same type and have the same
parameters, including gaps in concatenation objects. The sole
exception is @@LinkSource, whose left parameter is ignored during
equality testing, since otherwise there would be problems in the
appearance of melded clickable index entries.
@PP
Style changing operations (@@Font, @@Colour etc.) are not considered
in equality testing, since these have been processed and deleted by the
time that the tests are done. Also, Lout tries hard to get rid of
redundant braces around concatenation objects, which is why
@ID @Code "{ a { b c } } @Meld { { a b } c }"
produces
@ID { { a { b c } } @Meld { { a b } c } }
The two parameters are equal by the time they are compared by @@Meld.
@PP
One problematic area in the use of these operators is the definition
of equality when objects are immediately adjacent. Lout contains an
optimization which merges immediately adjacent words whenever they
have the same style. For example,
@ID @Code "{Hello}{world}"
would be treated internally as one word, whereas
@ID @Code "{Hello}{yellow @Colour world}"
would be treated as two adjacent words. Thus, although @@Font,
@@Colour, and the other style operators are ignored in equality
testing, they may affect the structure of the objects they lie
within.
@PP
At present, @@Common and @@Rump treat all unmerged components of
their paragraph as separate, even if one is immediately adjacent
to another. @@Common and @@Rump would thus see one component in
the first example and two in the second. @@Meld treats each group
of immediately adjacent components as a single component, so it
would see one component in both examples; but it would still not
report them as equal, since one is a single word and the other is a
pair of adjacent words. These confusing and inconsistent properties
might be revised in the future. See Section {@NumberOf exa_inde}
for an example of the practical use of these operators, in which
very small unbreakable gaps are used to ensure that apparently
adjacent components are separate, and @@OneCol is used to prevent
the word merging optimization from taking effect when it would
otherwise cause trouble.
@End @Section
|