aboutsummaryrefslogtreecommitdiffstats
path: root/doc/expert/exa_inde
diff options
context:
space:
mode:
authorJeffrey H. Kingston <jeff@it.usyd.edu.au>2010-09-14 20:38:02 +0000
committerJeffrey H. Kingston <jeff@it.usyd.edu.au>2010-09-14 20:38:02 +0000
commit9daa98ce90ceeeaba9e942d28575d8fcfe36db4b (patch)
tree94d258acb2a5bdddc318dbdc59b788bb8f52c704 /doc/expert/exa_inde
parentc89f0bc2209f7f98695e6b94fbac316c84fbf9d4 (diff)
downloadlout-9daa98ce90ceeeaba9e942d28575d8fcfe36db4b.tar.gz
Lout 3.26.
git-svn-id: http://svn.savannah.nongnu.org/svn/lout/trunk@21 9365b830-b601-4143-9ba8-b4a8e2c3339c
Diffstat (limited to 'doc/expert/exa_inde')
-rw-r--r--doc/expert/exa_inde179
1 files changed, 179 insertions, 0 deletions
diff --git a/doc/expert/exa_inde b/doc/expert/exa_inde
new file mode 100644
index 0000000..648f587
--- /dev/null
+++ b/doc/expert/exa_inde
@@ -0,0 +1,179 @@
+@Section
+ @Title { Merged index entries }
+ @Tag { exa_inde }
+@Begin
+@PP
+Getting index entries to merge correctly has been quite a struggle.
+It is easy to specify what is wanted, but Lout lacks the lists and
+objects (in the object-oriented sense) that would make the
+implementation straightforward. The whole problem was reanalysed
+for Version 3.26, reimplemented, tested more carefully than is
+usually necessary in Lout, and proved correct as follows.
+@PP
+We ignore page number ranges in this proof. It is not hard to
+show that they will be handled correctly too, provided they
+do not overlap with other entries with the same key. The
+effect of such overlaps is undefined, leaving us nothing to
+prove. We also assume that every entry with a given
+key has the same label, including any indent (that is, the same
+initial part before the page number). If labels differ the
+result is undefined and there is nothing to prove.
+@PP
+We will prove that raw entries always have the form
+@ID @Code "label &0.03fu {}"
+and that non-raw entries always have the form
+@ID @Code "label &0.03fu {}{@OneCol ,} pn1{@OneCol ,} pn2"
+where the pattern may repeat for any number of page numbers
+{@Code pn1}, {@Code pn2}, etc. In addition, the page numbers
+will be distinct, monotone increasing, and consist of exactly
+the numbers in the original unmerged entries.
+@PP
+These expressions are not the simplest that would give the
+correct appearance. Without @Code "&0.03fu {}" the code
+would not work correctly, as will be explained below. Without
+@@OneCol the commas would be subject to an optimization which
+can merge them into the previous word. It's too difficult to
+explain when this optimization will and will not be applied;
+suffice to say that it will sometimes not happen when melding,
+and this will cause @@Meld to get its equality testing wrong,
+so it must be prevented from happening at all.
+@PP
+Our proof is by induction on the number of entries merged
+together. First, we need to establish the base cases. If the
+index entry is raw, the following expression is used to define
+its value:
+@ID @Code "label &0.03fu {}"
+If the index entry is non-raw, the following expression is
+used to define its value:
+@ID @Code "label &"0.03fu" {}{@OneCol ,} pn"
+where @Code pn is the page number or page number range of
+the entry. In each case we clearly have an entry that
+satisfies all the requirements of the theorem.
+@PP
+Now consider what happens when we come to merge two
+entries. The code used to carry out this merge is
+@ID @OneRow @Code @Verbatim {
+def @Merge left x right y
+{
+ { x @Rump { x @Meld y } } @Case
+ {
+ "" @Yield x
+ else @Yield { { x{@OneCol ,} } @Meld y }
+ }
+
+}
+}
+where @Code x is the first entry and {@Code y} is
+the second.
+@PP
+We call the expression
+@ID @Code "x @Rump { x @Meld y }"
+the {@I discriminant}, since it determines which case
+to apply. We will track this in detail below, but
+approximately, its function is to determine whether @Code y
+contains something that is different from anything in
+{@Code x}. If so, then @Code "x @Meld y" differs from
+@Code "x" and the discriminant is non-empty; if not,
+@Code "x @Meld y" is equal to @Code "x" and the discriminant
+is empty.
+@PP
+The first entry, @Code { x }, may be raw or non-raw, and the
+second, @Code { y }, may also be raw or non-raw, together
+giving four cases, which we take in turn.
+@PP
+If both entries are raw, then by assumption they have the
+same labels and so are identical. Thus, @Code "x @Meld y"
+equals @Code { x }, the discriminant is empty,
+and the result is @Code { x }, which is correct.
+@PP
+If @Code { x } is raw and @Code { y } is non-raw, then
+the discriminant is non-empty and the result is the meld
+of two objects, the first having the form
+@ID @Code "label &0.03fu {}{@OneCol ,}"
+being @Code "x" with a comma appended, and
+the second being some non-raw entry such as
+@ID @Code "label &0.03fu {}{@OneCol ,} pn1{@OneCol ,} pn2"
+where the pattern may repeat. We are assuming by induction
+that @Code y has this form. Clearly, this meld gives a
+value equal to @Code { y }, which is the correct result.
+@PP
+If @Code { x } is non-raw and @Code { y } is raw, the
+@@Meld in the discriminant melds two values typified
+by
+@ID @Code "label &0.03fu {}{@OneCol ,} pn1{@OneCol ,} pn2"
+and
+@ID @Code "label &0.03fu {}"
+The result of this is @Code { x } with an empty object added
+at the end. This empty object is the second element of @Code { y },
+which is not equal to any element of @Code { x }: the second
+element of @Code x is not @Code "{}" but rather
+@Code { "{}{@OneCol ,}" }, because @@Meld treats immediately adjacent
+objects as single elements. The result of @@Rump is then this
+extra empty object, so the discriminant is the empty object and
+we return @Code { x }, correctly. It is this case that requires
+us to use {@Code "0.03fu"}; without it we would be melding
+@ID @Code "label{@OneCol ,} pn1{@OneCol ,} pn2"
+with
+@ID @Code "label"
+producing
+@ID @Code "label{@OneCol ,} pn1{@OneCol ,} pn2 label"
+leading to a non-empty discriminant and the wrong answer.
+@PP
+This leaves just the case where both @Code x and @Code y
+are non-raw. We will divide this last case into three
+sub-cases, but first we need some general observations.
+@PP
+Index entries are sorted for merging in the order in
+which their anchor points appear in the final printed
+document. This means that over the course of these
+entries the page numbers are non-decreasing. It is
+therefore clear that, although the order of merging
+is undefined (actually a balanced tree order is used),
+whenever two entries are presented for merging, all the
+page numbers in the first entry are no larger than all
+the page numbers in the second entry. We are also
+assuming inductively that the page numbers in each entry
+are distinct and monotone increasing. Thus, there can
+be at most one page number common to any two entries
+being merged, and if there is one in common it is the
+last page number of the first entry and the first of
+the second.
+@PP
+Our first sub-case is when the two entries have no
+page number in common. Since @Code { y } is non-raw,
+it has a page number not equal to any page number in
+@Code { x }. Therefore the discriminant is non-empty
+and the result is the meld of @Code "x{@OneCol ,}"
+with @Code { y }, which for example could be the
+meld of
+@ID @Code "label &0.03fu {}{@OneCol ,} pn1{@OneCol ,} pn2{@OneCol ,}"
+with
+@ID @Code "label &0.03fu {}{@OneCol ,} pn3{@OneCol ,} pn4"
+This will give the right answer, since @@Meld treats
+adjacent objects as single elements, and always incorporates
+elements from the first parameter first when it has a choice.
+@PP
+Our second sub-case is when the two entries have a
+page number in common and @Code { y } has two or more
+page numbers. The common page number must be the last
+of @Code x and the first of @Code { y }, so again
+@Code { y } has something (its last page number)
+distinct from @Code { x }, the discriminant is non-empty,
+and we end up for example melding
+@ID @Code "label &0.03fu {}{@OneCol ,} pn1{@OneCol ,} pn2{@OneCol ,}"
+with
+@ID @Code "label &0.03fu {}{@OneCol ,} pn2{@OneCol ,} pn3"
+Again it's clear that the meld will produce the right
+answer; in fact, this second sub-case could be unified
+with the first sub-case.
+@PP
+Our third sub-case is when the two entries have a page
+number in common and @Code { y } has only one page
+number. In this case, typified by @Code { x } with value
+@ID @Code "label &0.03fu {}{@OneCol ,} pn1{@OneCol ,} pn2"
+and @Code y with value
+@ID @Code "label &0.03fu {}{@OneCol ,} pn2"
+it is clear that @Code y offers nothing new, the
+discriminant is empty, and the result, quite correctly,
+is @Code { x }. This completes the proof.
+@End @Section