aboutsummaryrefslogtreecommitdiffstats
path: root/doc/design/s2_4
blob: 38d25e7db4d47801dc8c47554eec1fd0df0bb01b (plain) (blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
@SubSection
    @Tag { objects.impl }
    @Title { Implementation of objects and concatenation }
@Begin
@PP
In this section we discuss the implementation of objects and concatenation,
and especially mark alignment.  The first step is to use an operator
precedence parser to convert input such as
@ID @Code "a  |0.5i  b  /0.2i  c  |  d"
into parse trees such as
@ID @Eq {
@Fig {
@Tree {
@Node @FCircle fraction
@FirstSub {
   @Node @FCircle bar
   @FirstSub { @Node @FCircle a }
   @NextSub  { @Node @FEllipse 0.5i }
   @NextSub  { @Node @FCircle b }
}
@NextSub { @Node @FEllipse 0.2i }
@NextSub {
   @Node @FCircle bar
   @FirstSub { @Node @FCircle c }
   @NextSub  { @Node @FCircle   }
   @NextSub  { @Node @FCircle d }
}
}
}
}
Missing objects are replaced by empty objects, and sequences of
concatenation operators are consolidated:
@ID @Eq {
@Fig {
@Tree {
@Node @FCircle bar
@FirstSub { @Node @FCircle a }
@NextSub { @Node @FEllipse 0.2i }
@NextSub {
   @Node @FCircle bar
   @FirstSub { @Node @FCircle c     }
   @NextSub  { @Node @FEllipse 0.3i }
   @NextSub  { @Node @FCircle d     }
}
}
}
&2m ==> &2m
@Fig {
@Tree {
@Node @FCircle bar
@FirstSub { @Node @FCircle a     }
@NextSub  { @Node @FEllipse 0.2i }
@NextSub  { @Node @FCircle c     }
@NextSub  { @Node @FEllipse 0.3i }
@NextSub  { @Node @FCircle d     }
}
}
}
to make manifest their associativity and reduce the depth of the tree
for efficiency later.
@PP
The required semantic information is the size of each subobject,
consisting of four integers:  width to left and right of the
distinguished column mark, and height above and below the distinguished
row mark.  These numbers are always non-negative in Basser Lout, but
this restriction is unnecessary and should be dropped.
@PP
For the leaves, which are simple words, the numbers are obtained from
font tables.  For the higher levels we apply recursive rules.  Suppose
that @Eq { hgap(x, g, y) } returns the desired distance between the
column marks of objects @Eq { x } and @Eq { y } when they are separated by
gap @Eq { g }:  @Eq { right(x) + length(g) + left(y) } when the gap mode is
edge-to-edge, the larger of @Eq { length(g) } and
@Eq { right(x) + left(y) } when the mode is mark-to-mark, and so on.  Given
an object
@ID @Eq {
X = x sub 1 ````` bar g sub 1 ````` ... ````` { "-2p" @Font "^"}bar g sub i-1
````` x sub i ````` ... ````` bar g sub n-1 ````` x sub n
}
we may calculate its size as follows:
@ID @Eq {
left(X) ^= left( x sub 1 ) + hgap( x sub 1 , g sub 1 , x sub 2 )
+ ... + hgap( x sub i-1 , g sub i-1 , x sub i )
/1.4vx
right(X) ^= hgap( x sub i , g sub i , x sub i+1 )
+ ... + hgap( x sub n-1 , g sub n-1 , x sub n ) + right( x sub n )
/1.4vx
"above"(X) ^= "above"(x sub 1 ) up ... up "above"(x sub n )
/1.4vx
"below"(X) ^= "below"(x sub 1 ) up ... up "below"(x sub n )
}
where @Eq { non up } returns the larger of its two parameters.  Similar
formulas are easily derived for the other operators.
@PP
For purposes of exposition we will now make the simplifying
assumptions that all gaps are {@Code "0i"}, all column marks lie at
the left edge, and all row marks lie at the top edge.  Then the size
of each object can be expressed by just two numbers, width and
height, and the four formulas reduce to
@ID @Eq {
width( x sub 1 rel bar ... rel bar x sub n ) ^=
width( x sub 1 ) + ... + width( x sub n )
/1.4vx
height( x sub 1 rel bar ... rel bar x sub n ) ^=
height( x sub 1 ) up ... up height( x sub n )
}
The corresponding formulas for vertical concatenation are
@ID @Eq {
width( x sub 1 rel "/" ... rel "/" x sub n ) ^=
width( x sub 1 ) up ... up width( x sub n )
/1.4vx
height( x sub 1 rel "/" ... rel "/" x sub n ) ^=
height( x sub 1 ) + ... + height( x sub n )
}
According to these formulas, the height of
@ID @Eq { @Fig { @Tree {
@Node @FCircle fraction
@LeftSub {
   @Node @FCircle bar
   @LeftSub  { @Node @FCircle a }
   @RightSub { @Node @FCircle b }
}
@RightSub {
   @Node @FCircle bar
   @LeftSub  { @Node @FCircle c }
   @RightSub { @Node @FCircle d }
}
}}}
is
@ID @Eq {
[ height(a) up height(b)] + [ height(c) up height(d)]
}
which is correct, but for width they yield
@ID @Eq {
[ width(a) + width(b)] up [ width(c) + width(d)]
}
which is not, since it does not take the merging of column marks into
account.  The asymmetry between horizontal and vertical has come
about because the row entries, such as @Eq {a} and {@Eq {b}}, are
adjacent in the tree, but the column entries, such as @Eq {a} and
{@Eq {c}}, are not.  It would be possible to solve this cross-linking
problem by augmenting the size information stored in each node to
record the number of marks and the size of each, but the author has
preferred the following method which makes structural changes to the
tree instead.
@PP
If @Eq { a } and @Eq { c } share a column mark, they each might as well
have width { @Eq {width(a) up width(c) }}, since all width calculations
apply to entire columns.  Accordingly, we introduce a new operator,
@Eq {COL}, defined by
@ID @Eq { width( x sub 1 bin COL ... bin COL x sub n ) =
width( x sub 1 ) up ... up width( x sub n )
}
and replace both @Eq { a } and @Eq { c } by {@Eq { a bin COL c }}.  To
prevent @Eq { COL } operators from disturbing height calculations, we
define a binary operator called @Eq { SPLIT } by
@ID @Eq { width( x bin SPLIT y) ^= width(x)
/1.4vx
height( x bin SPLIT y) ^= height(y) }
which switches height and width calculations onto different
subtrees.  Then the transformation
@ID @Eq {
@Fig { @Tree {
   @Node @FCircle a
}}
&2m ==> &2m
@Fig { @Tree {
   @Node @FEllipse SPLIT
   @LeftSub {
      @Node @FEllipse COL
      @LeftSub  { @Node @FCircle a }
      @RightSub { @Node @FCircle c }
   }
   @RightSub { @Node @FCircle a }
}}
}
# where @Eq { S } denotes a @Eq { SPLIT } node and @Eq { C } denotes a
# @Eq { COL } node,
widens @Eq { a } to @Eq {width(a) up width(c) } without affecting its height;
it is applied to every object that shares its column mark with at least
one other object.  A similar transformation involving a @Eq { ROW } operator
deals with shared row marks.  The effect on our little table is to replace
@ID @Eq { @Fig { @Tree {
@Node @FCircle fraction
@LeftSub {
   @Node @FCircle bar
   @LeftSub  { @Node @FCircle a }
   @RightSub { @Node @FCircle b }
}
@RightSub {
   @Node @FCircle bar
   @LeftSub  { @Node @FCircle c }
   @RightSub { @Node @FCircle d }
}
}}}
by
@ID @Eq { @Fig maxlabels { "70" } { @Tree hmargin { "0.1c" } {
@Node @FCircle fraction
@FirstSub {
   @Node @FCircle bar
   @FirstSub  {
      @Node @FEllipse SPLIT
      @FirstSub {
         @Node @FEllipse COL
         @FirstSub  { @Node @FCircle a }
         @NextSub { @Node @FCircle c }
      }
      @NextSub {
         @Node @FEllipse ROW
         @FirstSub  { @Node @FCircle a }
         @NextSub { @Node @FCircle b }
      }
   }
   @NextSub {
      @Node @FEllipse SPLIT
      @FirstSub {
         @Node @FEllipse COL
         @FirstSub  { @Node @FCircle b }
         @NextSub { @Node @FCircle d }
      }
      @NextSub {
         @Node @FEllipse ROW
         @FirstSub  { @Node @FCircle a }
         @NextSub { @Node @FCircle b }
      }
   }
}
@NextSub {
   @Node @FCircle bar
   @FirstSub  {
      @Node @FEllipse SPLIT
      @FirstSub {
         @Node @FEllipse COL
         @FirstSub  { @Node @FCircle a }
         @NextSub { @Node @FCircle c }
      }
      @NextSub {
         @Node @FEllipse ROW
         @FirstSub  { @Node @FCircle c }
         @NextSub { @Node @FCircle d }
      }
   }
   @NextSub {
      @Node @FEllipse SPLIT
      @FirstSub {
         @Node @FEllipse COL
         @FirstSub  { @Node @FCircle b }
         @NextSub { @Node @FCircle d }
      }
      @NextSub {
         @Node @FEllipse ROW
         @FirstSub  { @Node @FCircle c }
         @NextSub { @Node @FCircle d }
      }
   }
}
}}}
In fact, common subexpressions are identified (trivially) and the result
is a directed acyclic graph; each affected leaf has two parents, one for
width and one for height; and each @Eq { COL } or @Eq { ROW } node has
one parent and one child for each object lying on the corresponding
mark.  The data structure roughly doubles in size, and this occurs only
rarely in practice.
@PP
This method can cope with any legal input, including
@ID @OneRow @Code {
"{  a  //  c  |  d  }  |  {  b  /  e  }"
"/  {  f  /  i  }  |  {  g  |  h  //  j  }"
}
which produces overlapping spanning columns:
@ID @I @Fig {
    @FBox margin { 0.2c } width { 1.6c } 1.2f @Font a |
    @FBox margin { 0.2c } width { 0.6c } 1.2f @Font b |
//  @FBox margin { 0.2c } width { 0.6c } 1.2f @Font c |
    @FBox margin { 0.2c } width { 0.6c } 1.2f @Font d |
    @FBox margin { 0.2c } width { 0.6c } 1.2f @Font e |
//  @FBox margin { 0.2c } width { 0.6c } 1.2f @Font f |
    @FBox margin { 0.2c } width { 0.6c } 1.2f @Font g |
    @FBox margin { 0.2c } width { 0.6c } 1.2f @Font h |
//  @FBox margin { 0.2c } width { 0.6c } 1.2f @Font i |
    @FBox margin { 0.2c } width { 1.6c } 1.2f @Font j |
}
The boxes have been added to clarify the structure.  The width of this
object is formally
@ID @Eq { ((width(a) up (x + y)) + z) up (x + ((y + z) up width(j))) }
where
@IL
@ListItem @Eq { x = width(c) up width(`f`) up width(i) }
@ListItem @Eq { y = width(d`) up width(g) }
@ListItem @Eq { z = width(b) up width(e) up width(h) }
@EL
It seems clear that @Eq { y } at least must appear twice in any
expression for the width of this object made out of simple addition
and maxing operations, showing that an ordinary tree
structure is insufficient for overlapping spanning columns.  The Basser
Lout interpreter actually rejects such structures, owing to the author's
doubts about the implementability of @I Constrained and @I AdjustSize
(Section {@NumberOf constraints}) on them; but with hindsight this caution
was unnecessary.
@PP
The directed acyclic graph is ordered in the sense that the order of
the edges entering and leaving each node matters.  The structure is
highly dynamic, and traversals both with and against the arrows are
required.  After a few ad-hoc attempts to extend the usual tree
representation had failed, the author developed a representation based
on doubly linked lists of records denoting links, whose flexibility more
than compensated for the somewhat excessive memory consumption.  For example,
@ID @Eq { @Fig {
       A:: @FCircle a   |2c                  |2c  B:: @FCircle b
/1.5c  C:: @FCircle c   |    D:: @FCircle d
// A @JoinFigures arrow { forward } C
// A @JoinFigures arrow { forward } D
// B @JoinFigures arrow { forward } D
}}
is represented by
@CD @Eq { @Fig maxlabels { "300" } {
A:: @DagBox mid { @BlackDot } base { a } |2c |2c |2c |2c
B:: @DagBox mid { @BlackDot } base { b }
/1c L:: @DagBox top { @BlackDot } mid { @BlackDot } base { LK }
|   M:: @DagBox top { @BlackDot } mid { @BlackDot } base { LK }
| | N:: @DagBox top { @BlackDot } mid { @BlackDot } base { LK }
/1c
C:: @DagBox top { @BlackDot } base { c } | |
D:: @DagBox top { @BlackDot } base { d }
// @TVShape nw { A@MID@CTR } ne { A@MID@CTR } sw {L@MID@CTR } se { M@MID@CTR }
// @TVShape nw { L@TOP@CTR } ne { L@TOP@CTR } sw {C@TOP@CTR } se { C@TOP@CTR }
// @TVShape nw { M@TOP@CTR } ne { N@TOP@CTR } sw {D@TOP@CTR } se { D@TOP@CTR }
// @TVShape nw { B@MID@CTR } ne { B@MID@CTR } sw {N@MID@CTR } se { N@MID@CTR }
}}
where @Eq { LK } tags a record representing a link.  The first list
in any node contains all the incoming links, the second contains the
outgoing ones.  The node serves as the header for both lists.  The
required operations reduce to simple appends, deletes, and traversals
of doubly linked lists, all having small constant cost.  There is a
highly tuned memory allocator, and care is taken to dispose of each node
when the last incoming link is deleted, so that there is no need for
garbage collection.
@PP
In normal use the number of nodes at higher levels of the dag is small
in comparison with the leaves and their incoming links, so we may
estimate the space complexity at about 60 bytes per input word (20 bytes
per link, 40 per leaf node).  Careful optimization could easily halve
this, but since memory is reclaimed after printing each page there is
little need.
@End @SubSection