aboutsummaryrefslogtreecommitdiffstats
path: root/doc/design/s2_4
diff options
context:
space:
mode:
Diffstat (limited to 'doc/design/s2_4')
-rw-r--r--doc/design/s2_4345
1 files changed, 345 insertions, 0 deletions
diff --git a/doc/design/s2_4 b/doc/design/s2_4
new file mode 100644
index 0000000..38d25e7
--- /dev/null
+++ b/doc/design/s2_4
@@ -0,0 +1,345 @@
+@SubSection
+ @Tag { objects.impl }
+ @Title { Implementation of objects and concatenation }
+@Begin
+@PP
+In this section we discuss the implementation of objects and concatenation,
+and especially mark alignment. The first step is to use an operator
+precedence parser to convert input such as
+@ID @Code "a |0.5i b /0.2i c | d"
+into parse trees such as
+@ID @Eq {
+@Fig {
+@Tree {
+@Node @FCircle fraction
+@FirstSub {
+ @Node @FCircle bar
+ @FirstSub { @Node @FCircle a }
+ @NextSub { @Node @FEllipse 0.5i }
+ @NextSub { @Node @FCircle b }
+}
+@NextSub { @Node @FEllipse 0.2i }
+@NextSub {
+ @Node @FCircle bar
+ @FirstSub { @Node @FCircle c }
+ @NextSub { @Node @FCircle }
+ @NextSub { @Node @FCircle d }
+}
+}
+}
+}
+Missing objects are replaced by empty objects, and sequences of
+concatenation operators are consolidated:
+@ID @Eq {
+@Fig {
+@Tree {
+@Node @FCircle bar
+@FirstSub { @Node @FCircle a }
+@NextSub { @Node @FEllipse 0.2i }
+@NextSub {
+ @Node @FCircle bar
+ @FirstSub { @Node @FCircle c }
+ @NextSub { @Node @FEllipse 0.3i }
+ @NextSub { @Node @FCircle d }
+}
+}
+}
+&2m ==> &2m
+@Fig {
+@Tree {
+@Node @FCircle bar
+@FirstSub { @Node @FCircle a }
+@NextSub { @Node @FEllipse 0.2i }
+@NextSub { @Node @FCircle c }
+@NextSub { @Node @FEllipse 0.3i }
+@NextSub { @Node @FCircle d }
+}
+}
+}
+to make manifest their associativity and reduce the depth of the tree
+for efficiency later.
+@PP
+The required semantic information is the size of each subobject,
+consisting of four integers: width to left and right of the
+distinguished column mark, and height above and below the distinguished
+row mark. These numbers are always non-negative in Basser Lout, but
+this restriction is unnecessary and should be dropped.
+@PP
+For the leaves, which are simple words, the numbers are obtained from
+font tables. For the higher levels we apply recursive rules. Suppose
+that @Eq { hgap(x, g, y) } returns the desired distance between the
+column marks of objects @Eq { x } and @Eq { y } when they are separated by
+gap @Eq { g }: @Eq { right(x) + length(g) + left(y) } when the gap mode is
+edge-to-edge, the larger of @Eq { length(g) } and
+@Eq { right(x) + left(y) } when the mode is mark-to-mark, and so on. Given
+an object
+@ID @Eq {
+X = x sub 1 ````` bar g sub 1 ````` ... ````` { "-2p" @Font "^"}bar g sub i-1
+````` x sub i ````` ... ````` bar g sub n-1 ````` x sub n
+}
+we may calculate its size as follows:
+@ID @Eq {
+left(X) ^= left( x sub 1 ) + hgap( x sub 1 , g sub 1 , x sub 2 )
++ ... + hgap( x sub i-1 , g sub i-1 , x sub i )
+/1.4vx
+right(X) ^= hgap( x sub i , g sub i , x sub i+1 )
++ ... + hgap( x sub n-1 , g sub n-1 , x sub n ) + right( x sub n )
+/1.4vx
+"above"(X) ^= "above"(x sub 1 ) up ... up "above"(x sub n )
+/1.4vx
+"below"(X) ^= "below"(x sub 1 ) up ... up "below"(x sub n )
+}
+where @Eq { non up } returns the larger of its two parameters. Similar
+formulas are easily derived for the other operators.
+@PP
+For purposes of exposition we will now make the simplifying
+assumptions that all gaps are {@Code "0i"}, all column marks lie at
+the left edge, and all row marks lie at the top edge. Then the size
+of each object can be expressed by just two numbers, width and
+height, and the four formulas reduce to
+@ID @Eq {
+width( x sub 1 rel bar ... rel bar x sub n ) ^=
+width( x sub 1 ) + ... + width( x sub n )
+/1.4vx
+height( x sub 1 rel bar ... rel bar x sub n ) ^=
+height( x sub 1 ) up ... up height( x sub n )
+}
+The corresponding formulas for vertical concatenation are
+@ID @Eq {
+width( x sub 1 rel "/" ... rel "/" x sub n ) ^=
+width( x sub 1 ) up ... up width( x sub n )
+/1.4vx
+height( x sub 1 rel "/" ... rel "/" x sub n ) ^=
+height( x sub 1 ) + ... + height( x sub n )
+}
+According to these formulas, the height of
+@ID @Eq { @Fig { @Tree {
+@Node @FCircle fraction
+@LeftSub {
+ @Node @FCircle bar
+ @LeftSub { @Node @FCircle a }
+ @RightSub { @Node @FCircle b }
+}
+@RightSub {
+ @Node @FCircle bar
+ @LeftSub { @Node @FCircle c }
+ @RightSub { @Node @FCircle d }
+}
+}}}
+is
+@ID @Eq {
+[ height(a) up height(b)] + [ height(c) up height(d)]
+}
+which is correct, but for width they yield
+@ID @Eq {
+[ width(a) + width(b)] up [ width(c) + width(d)]
+}
+which is not, since it does not take the merging of column marks into
+account. The asymmetry between horizontal and vertical has come
+about because the row entries, such as @Eq {a} and {@Eq {b}}, are
+adjacent in the tree, but the column entries, such as @Eq {a} and
+{@Eq {c}}, are not. It would be possible to solve this cross-linking
+problem by augmenting the size information stored in each node to
+record the number of marks and the size of each, but the author has
+preferred the following method which makes structural changes to the
+tree instead.
+@PP
+If @Eq { a } and @Eq { c } share a column mark, they each might as well
+have width { @Eq {width(a) up width(c) }}, since all width calculations
+apply to entire columns. Accordingly, we introduce a new operator,
+@Eq {COL}, defined by
+@ID @Eq { width( x sub 1 bin COL ... bin COL x sub n ) =
+width( x sub 1 ) up ... up width( x sub n )
+}
+and replace both @Eq { a } and @Eq { c } by {@Eq { a bin COL c }}. To
+prevent @Eq { COL } operators from disturbing height calculations, we
+define a binary operator called @Eq { SPLIT } by
+@ID @Eq { width( x bin SPLIT y) ^= width(x)
+/1.4vx
+height( x bin SPLIT y) ^= height(y) }
+which switches height and width calculations onto different
+subtrees. Then the transformation
+@ID @Eq {
+@Fig { @Tree {
+ @Node @FCircle a
+}}
+&2m ==> &2m
+@Fig { @Tree {
+ @Node @FEllipse SPLIT
+ @LeftSub {
+ @Node @FEllipse COL
+ @LeftSub { @Node @FCircle a }
+ @RightSub { @Node @FCircle c }
+ }
+ @RightSub { @Node @FCircle a }
+}}
+}
+# where @Eq { S } denotes a @Eq { SPLIT } node and @Eq { C } denotes a
+# @Eq { COL } node,
+widens @Eq { a } to @Eq {width(a) up width(c) } without affecting its height;
+it is applied to every object that shares its column mark with at least
+one other object. A similar transformation involving a @Eq { ROW } operator
+deals with shared row marks. The effect on our little table is to replace
+@ID @Eq { @Fig { @Tree {
+@Node @FCircle fraction
+@LeftSub {
+ @Node @FCircle bar
+ @LeftSub { @Node @FCircle a }
+ @RightSub { @Node @FCircle b }
+}
+@RightSub {
+ @Node @FCircle bar
+ @LeftSub { @Node @FCircle c }
+ @RightSub { @Node @FCircle d }
+}
+}}}
+by
+@ID @Eq { @Fig maxlabels { "70" } { @Tree hmargin { "0.1c" } {
+@Node @FCircle fraction
+@FirstSub {
+ @Node @FCircle bar
+ @FirstSub {
+ @Node @FEllipse SPLIT
+ @FirstSub {
+ @Node @FEllipse COL
+ @FirstSub { @Node @FCircle a }
+ @NextSub { @Node @FCircle c }
+ }
+ @NextSub {
+ @Node @FEllipse ROW
+ @FirstSub { @Node @FCircle a }
+ @NextSub { @Node @FCircle b }
+ }
+ }
+ @NextSub {
+ @Node @FEllipse SPLIT
+ @FirstSub {
+ @Node @FEllipse COL
+ @FirstSub { @Node @FCircle b }
+ @NextSub { @Node @FCircle d }
+ }
+ @NextSub {
+ @Node @FEllipse ROW
+ @FirstSub { @Node @FCircle a }
+ @NextSub { @Node @FCircle b }
+ }
+ }
+}
+@NextSub {
+ @Node @FCircle bar
+ @FirstSub {
+ @Node @FEllipse SPLIT
+ @FirstSub {
+ @Node @FEllipse COL
+ @FirstSub { @Node @FCircle a }
+ @NextSub { @Node @FCircle c }
+ }
+ @NextSub {
+ @Node @FEllipse ROW
+ @FirstSub { @Node @FCircle c }
+ @NextSub { @Node @FCircle d }
+ }
+ }
+ @NextSub {
+ @Node @FEllipse SPLIT
+ @FirstSub {
+ @Node @FEllipse COL
+ @FirstSub { @Node @FCircle b }
+ @NextSub { @Node @FCircle d }
+ }
+ @NextSub {
+ @Node @FEllipse ROW
+ @FirstSub { @Node @FCircle c }
+ @NextSub { @Node @FCircle d }
+ }
+ }
+}
+}}}
+In fact, common subexpressions are identified (trivially) and the result
+is a directed acyclic graph; each affected leaf has two parents, one for
+width and one for height; and each @Eq { COL } or @Eq { ROW } node has
+one parent and one child for each object lying on the corresponding
+mark. The data structure roughly doubles in size, and this occurs only
+rarely in practice.
+@PP
+This method can cope with any legal input, including
+@ID @OneRow @Code {
+"{ a // c | d } | { b / e }"
+"/ { f / i } | { g | h // j }"
+}
+which produces overlapping spanning columns:
+@ID @I @Fig {
+ @FBox margin { 0.2c } width { 1.6c } 1.2f @Font a |
+ @FBox margin { 0.2c } width { 0.6c } 1.2f @Font b |
+// @FBox margin { 0.2c } width { 0.6c } 1.2f @Font c |
+ @FBox margin { 0.2c } width { 0.6c } 1.2f @Font d |
+ @FBox margin { 0.2c } width { 0.6c } 1.2f @Font e |
+// @FBox margin { 0.2c } width { 0.6c } 1.2f @Font f |
+ @FBox margin { 0.2c } width { 0.6c } 1.2f @Font g |
+ @FBox margin { 0.2c } width { 0.6c } 1.2f @Font h |
+// @FBox margin { 0.2c } width { 0.6c } 1.2f @Font i |
+ @FBox margin { 0.2c } width { 1.6c } 1.2f @Font j |
+}
+The boxes have been added to clarify the structure. The width of this
+object is formally
+@ID @Eq { ((width(a) up (x + y)) + z) up (x + ((y + z) up width(j))) }
+where
+@IL
+@ListItem @Eq { x = width(c) up width(`f`) up width(i) }
+@ListItem @Eq { y = width(d`) up width(g) }
+@ListItem @Eq { z = width(b) up width(e) up width(h) }
+@EL
+It seems clear that @Eq { y } at least must appear twice in any
+expression for the width of this object made out of simple addition
+and maxing operations, showing that an ordinary tree
+structure is insufficient for overlapping spanning columns. The Basser
+Lout interpreter actually rejects such structures, owing to the author's
+doubts about the implementability of @I Constrained and @I AdjustSize
+(Section {@NumberOf constraints}) on them; but with hindsight this caution
+was unnecessary.
+@PP
+The directed acyclic graph is ordered in the sense that the order of
+the edges entering and leaving each node matters. The structure is
+highly dynamic, and traversals both with and against the arrows are
+required. After a few ad-hoc attempts to extend the usual tree
+representation had failed, the author developed a representation based
+on doubly linked lists of records denoting links, whose flexibility more
+than compensated for the somewhat excessive memory consumption. For example,
+@ID @Eq { @Fig {
+ A:: @FCircle a |2c |2c B:: @FCircle b
+/1.5c C:: @FCircle c | D:: @FCircle d
+// A @JoinFigures arrow { forward } C
+// A @JoinFigures arrow { forward } D
+// B @JoinFigures arrow { forward } D
+}}
+is represented by
+@CD @Eq { @Fig maxlabels { "300" } {
+A:: @DagBox mid { @BlackDot } base { a } |2c |2c |2c |2c
+B:: @DagBox mid { @BlackDot } base { b }
+/1c L:: @DagBox top { @BlackDot } mid { @BlackDot } base { LK }
+| M:: @DagBox top { @BlackDot } mid { @BlackDot } base { LK }
+| | N:: @DagBox top { @BlackDot } mid { @BlackDot } base { LK }
+/1c
+C:: @DagBox top { @BlackDot } base { c } | |
+D:: @DagBox top { @BlackDot } base { d }
+// @TVShape nw { A@MID@CTR } ne { A@MID@CTR } sw {L@MID@CTR } se { M@MID@CTR }
+// @TVShape nw { L@TOP@CTR } ne { L@TOP@CTR } sw {C@TOP@CTR } se { C@TOP@CTR }
+// @TVShape nw { M@TOP@CTR } ne { N@TOP@CTR } sw {D@TOP@CTR } se { D@TOP@CTR }
+// @TVShape nw { B@MID@CTR } ne { B@MID@CTR } sw {N@MID@CTR } se { N@MID@CTR }
+}}
+where @Eq { LK } tags a record representing a link. The first list
+in any node contains all the incoming links, the second contains the
+outgoing ones. The node serves as the header for both lists. The
+required operations reduce to simple appends, deletes, and traversals
+of doubly linked lists, all having small constant cost. There is a
+highly tuned memory allocator, and care is taken to dispose of each node
+when the last incoming link is deleted, so that there is no need for
+garbage collection.
+@PP
+In normal use the number of nodes at higher levels of the dag is small
+in comparison with the leaves and their incoming links, so we may
+estimate the space complexity at about 60 bytes per input word (20 bytes
+per link, 40 per leaf node). Careful optimization could easily halve
+this, but since memory is reclaimed after printing each page there is
+little need.
+@End @SubSection