| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
inside \w arguments, and skip most other escape sequences when measuring
the output length in this way because most escape sequences contribute
little or nothing to text width: for example, consider font escapes in
terminal output.
This implementation is very rudimentary. In particular, it assumes that
every character has the same width. No attempt is made to detect
double-width or zero-width Unicode characters or to take dependencies on
output devices or fonts into account. These limitations are hard to
avoid because mandoc has to interpolate \w at the parsing stage when the
output device is not yet known. I really do not want the content of the
syntax tree to depend on the output device.
Feature requested by Paul <Eggert at cs dot ucla dot edu>, who also
submitted a patch, but i chose to commit this very different patch
with almost the same functionality.
His input was still very valuable because complete support for \w is
out of the question, and consequently, the main task is identifying
subsets of the feature that are needed for real-world manual pages
and can be supported without uprooting the whole forest.
|
|
|
|
| |
infinite recursion in macro argument expansion
|
|
|
|
| |
recursive delayed expansion of escape sequences in macro arguments
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
vertical space after boxed tables).
I'm committing this separately because trying to regenerate the
desired output with groff-current reveals an unrelated, recent
regression in groff. So i fixed the groff output by hand before
committing it, to get rid of the effect of the roff regression.
|
|
|
|
|
| |
and tbl_term.c rev. 1.79 cause quite a bit of churn, unfortunately.
This commit cleans up most of it.
|
|
|
|
|
|
|
|
|
|
|
|
| |
when comparing section headers. For example, ".Sh SEE ELSEWHERE"
and ".Sh SEE Em ALSO" were considered instances of a SEE ALSO
section. In groff-current, exact matches with no sub-macros are
required. Adjust mandoc behaviour.
While here, also fix a very minor mandoc bug, even though no
detrimental effect of the bug on formatting is known. While using
sub-macros in the .Sh HEAD is bad style, the parsers accept it, so
setting the section attribute on the HEAD needs to act recursively.
|
|
|
|
| |
The new version of the output file was generated with groff-current.
|
|
|
|
| |
when multiple input or output lines are involved.
|
|
|
|
|
|
| |
The new version of this file was generated with groff-current.
Heirloom nroff produces exactly the same output for the content
of the DESCRIPTION.
|
|
|
|
| |
after vertical spacing was improved in man_term.c rev. 1.239.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
whatsoever (for example \fR) and escape sequences that produce
invisible zero-width output (for example \&). No, i'm not joking,
groff does make that distinction, and it has consequences in some
situations, for example for vertical spacing in no-fill mode.
Heirloom and Plan 9 behaviour is subtly different, but in case of
doubt, we want to follow groff.
While this fixes the behaviour for the majority of escape sequences,
in particular for those most likely to occur in practice, it is not
perfect yet because some of the more exotic ESCAPE_IGNORE sequences
are actually of the "no output whatsoever" type but treated
as "invisible zero-width" for now. With the new ASCII_NBRZW mechanism
in place, switching them over one by one when the need arises will
no longer be very difficult.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Leaving the body empty is legitimate in this case if the author only
wants to display a mail address or URI without providing a link text.
Output modules already handle this correctly: terminal output shows
just the URI without an accompanying text, HTML output uses the URI
for *both* the href= attribute and as the content of the <a> element.
The documentation was also wrong and claimed that an .MT or .UR block
with an empty body would produce no output. As explained above,
this isn't true.
Bogus warning reported by
Alejandro Colomar <alx dot manpages at gmail dot com>.
|
|
|
|
|
| |
after they were changed in OpenBSD.
Tracking these rules here would be useless.
|
|
|
|
|
|
|
|
|
|
|
|
| |
document, <h1> is intended for top level headers, and most of the
sections in a manual page can hardly be considered top-level.
It is more usual to use <h1> only for the main title of the document
of for the site name.
Consequently, move .Sh/.SH from <h1> to <h2> and .Ss/.SS from <h2>
to <h3>, freeing <h1> for use by header.html in man.cgi(8).
Discussed with Anna Vyalkova <cyber at sysrq dot in>.
|
|
|
|
|
|
| |
G. Branden Robinson changed the -T ascii rendering
of \(sd, the "second" symbol, U+2033 DOUBLE PRIME, from '' to ".
Follow suit in mandoc.
|
|
|
|
|
|
|
| |
delimiter for an outer escape sequence, in which case the delimiting
escape sequence retains its syntax but usually ignores its argument
and loses its inherent effect. Add rudimentary support for this
syntax quirk in order to improve parsing compatibility with groff.
|
|
|
|
|
| |
into the more specific messages "invalid escape argument delimiter"
and "invalid escape sequence argument".
|
|
|
|
| |
improved diagnostics for the \C escape sequence
|
|
|
|
|
|
|
|
|
| |
diagnostics. Distinguish "incomplete escape sequence", "invalid special
character", and "unknown special character" from the generic "invalid
escape sequence", also promoting them from WARNING to ERROR because
incomplete escape sequences are severe syntax violations and because
encountering an invalid or unknown special character makes it likely
that part of the document content intended by the authors gets lost.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
in the same way as groff:
* \\ is always reduced to \
* \. is always reduced to .
* other undefined escape sequences are usually reduced to the escape name,
for example \G to G, except during the expansion of expanding escape
sequences having the standard argument form (in particular \* and \n),
in which case the backslash is preserved literally.
Yes, this is confusing indeed.
For example, the following have the same meaning:
* .ds \. and .ds . which is not the same as .ds \\.
* \*[\.] and \*[.] which is not the same as \*[\\.]
* .ds \G and .ds G which is not the same as .ds \\G
* \*[\G] and \*[\\G] which is not the same as \*[G] <- sic!
To feel less dirty, have a leaning toothpick, if you are so inclined.
This patch also slightly improves the string shown by the "escaped
character not allowed in a name" error message.
|
|
|
|
|
|
|
|
|
| |
escape sequence. This is needed to get \V into the correct parsing
class, ESCAPE_EXPAND.
It is intentional that mandoc(1) output is *not* influenced by environment
variables, so interpolate the name of the variable with some decorating
punctuation rather than interpolating its value.
|
|
|
|
|
|
|
| |
from "ignore" to "unsupported" because when an input file uses it,
mandoc(1) is likely to significantly misformat the output,
usually showing parts of the output in a different order
than the author intended.
|
|
|
|
|
| |
that take no argument and are ignored: \% \& \^ \a \d \t \u \{ \| \}
No change to parsing or formatting needed.
|
|
|
|
| |
some diagnostics now appear in a more reasonable order, too
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
after the roff_expand() reorganization in roff.c rev. 1.388.
The new parsing direction has two effects:
1. Correct output when a line contains more than one expanding
escape sequence that has a side effect.
2. Column numbers in diagnostic messages now report the changed
column numbers after any expansions left of them have taken place;
in the past, column numbers refered to the original input line.
Arguably, item 2 was a bit better in its old state, but slightly
less helpful diagnostics are a small price to pay for correct
output. Besides, when the expansion of user-defined strings or
macros is involved, in many cases, mandoc(1) is already unable to
report meaningful line and column numbers, so item 2 is not a
noteworthy regression. The effort and code complication for fixing
that would probably be excessive, in particular since well-written
manual pages are not supposed to use such features in the first place.
|
|
|
|
| |
of the roff_expand() reorganization in roff.c rev. 1.388
|
|
|
|
| |
of the roff_expand() reorganization in roff.c rev. 1.388
|
|
|
|
|
|
|
| |
functionality is not needed when called from roff_getarg(). This makes the
long and complicated function roff_expand() significantly shorter, and also
simpler in so far as it no longer needs to return ROFF_APPEND.
No functional change intended.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
or macro, including context-dependent error handling inside tbl(7) code
and inside .ce/.rj blocks. Use it both in the top level roff(7) parser
and inside conditional blocks.
This fixes an assertion failure triggered by ".if 1 .ce" inside tbl(7)
code, found by tb@ using afl(1).
As a side benefit for readability, only one place remains in the
code that calls the main handler functions for the various roff(7)
requests. This patch also improves column numbers in some error
messages and various comments.
|
|
|
|
|
|
|
|
|
| |
and the roff_onearg() parsing function is too generic,
so provide a dedicated parsing function instead.
This fixes an assertion failure when an \o escape sequence is
passed as the argument; the bug was found by tb@ using afl(1).
It also makes mandoc output more similar to groff in various cases.
|
|
|
|
|
|
|
|
| |
break multiple element next-line scopes at the same time, similar to
what man_descope() already does for unconditional rewinding.
This fixes an assertion failure that tb@ found with afl(1), caused
by .SH .I .I .BI and similar sequences of macros without arguments.
|
|
|
|
|
|
| |
and never produce output at the place of their invocation.
Minibugs found while investigating unrelated afl(1) reports from tb@.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. The combination \z\h is a no-op whatever the argument may be.
In the past, the \z only affected the first space character generated
by the \h, which was wrong.
2. For the conbination \zX\h with a positive argument, the first
space resulting from the \h is not printed but consumed by the \z.
3. For the combination \zX\h with a negative argument, application
of the \z needs to be completed before the \h can be started.
In the past, if this combination occurred at the beginning of an
output line, the \h backed up to the beginning of the line and
after that, the \z attempted to back up even further, triggering
an assertion.
Bugs found during an audit of assignments to termp->col that i
started after the bugfix tbl_term.c rev. 1.65. The assertion
triggered by bug 3 was *not* yet found by afl(1).
|
| |
|
|
|
|
|
|
|
|
|
| |
This is needed because the TERMP_MULTICOL mode is designed such
that term_tbl() buffers all the cells of the table row before the
normal reset logic near the end of term_flushln() can be reached.
This fixes an assertion failure triggered by \z near the end
of a table cell, found by tb@ using afl(1).
|
|
|
|
|
|
|
|
| |
Apart from making sense in the first place, this fixes an assertion
failure that happened when the calculated implicit tag did not match
the string value of the first child of the node,
Bug found by tb@ using afl(1).
|
|
|
|
|
|
|
|
| |
index but use 0 instead of the argument, just like groff.
Warn about the invalid argument.
While here, fix the column number in another warning message.
Segfault reported by tb@, found with afl(1).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
we must not reset the recursion counter when moving beyond the end
of the *previous* expansion, but we may only do so when moving
beyond the rightmost position reached by *any* expansion in the
current equation. This matters because definitions can nest;
consider:
.EQ
define inner "content"
define outer "inner outer"
outer
.EN
This endless loop was found by tb@ using afl(1).
Incidentally, GNU eqn(1) also performs an infinite loop in this
situation and then crashes when memory runs out, but that's not an
excuse for nasty behaviour of mandoc(1).
While here, consistently print the expanded content even when the
expansion is finally truncated. While that is not likely to help
end-users, it may help authors of eqn(7) code to understand what's
going on. Besides, it sends a very clear signal that something is
amiss, which was easy to miss in the past unless people
enabled -W error or used -T lint.
|
|
|
|
|
|
|
|
|
| |
whatsoever and ends with a broken next-line scope. Obviously, this
cannot happen in a real manual page, but mandoc(1) should not die
even when fed absurd input.
This bug was independently reported by both jsg@ and tb@ who both
found it with afl(1).
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
beginning of an escape sequence: \, \E, \EE, \EEE, and so on all do
the same outside copy mode, so let them do the same in mandoc(1), too.
This fixes an assertion failure triggered by \EE*X that tb@ found
with afl(1). The first E was consumed by roff_expand(), but that
function failed to recognize the escape sequence as the expansion
of a user-defined string and handed it over to mandoc_escape(),
which consumed the second E and then died on an assertion because
it is not prepared to handle user-defined strings. Fix this by
letting *both* functions handle arbitrary numbers of 'E's correctly.
|
|
|
|
| |
issue found on Oracle Solaris 11
|
|
|
|
|
|
|
| |
cells that horizontally span columns which contains "n" (number)
formatted cells on other rows. This requires updating total column
widths from "n" formatted cells before starting width distribution
from the spanning cells to their constituent columns.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
in the tbl(7) layout font modifier.
Get rid of the TBL_CELL_BOLD and TBL_CELL_ITALIC flags and use
the usual ESCAPE_FONT* enum mandoc_esc members from mandoc.h instead,
which simplifies and unifies some code.
While here, also support CB and CI in roff(7) \f escape sequences
and in roff(7) .ft requests for all output modes. Using those is
certainly not recommended because portability is limited even with
groff, but supporting them makes some existing third-party manual
pages look better, in particular in HTML output mode.
Bug-compatible with groff as far as i'm aware, except that i consider
font names starting with the '\n' (ASCII 0x0a line feed) character
so insane that i decided to not support them.
Missing feature reported by nabijaczleweli dot xyz in
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=992002.
I used none of the code from the initial patch submitted by
nabijaczleweli, but some of their ideas.
Final patch tested by them, too.
|
|
|
|
| |
output that are no longer printed since man_term.c rev. 1.236
|
| |
|
|
|
|
|
|
|
|
|
|
| |
1. Move invalid two-byte sequences after valid ones
and make their descriptions easier to understand.
2. Replace the wrong and confusing expression "middle byte"
with the correct term "start byte".
3. Add test lines for U+EFFFF and U+F0000.
4. Replace the unhelpful word "strange" with more descriptive terms.
Arguably, nothing about this (or maybe everything?) is strange.
|
|
|
|
|
| |
was changed from 0 to 1. Adjust the test results accordingly.
Issue reported by bluhm@
|
| |
|