| Commit message (Collapse) | Author | Age | Files | Lines |
| |
when comparing section headers. For example, ".Sh SEE ELSEWHERE"
and ".Sh SEE Em ALSO" were considered instances of a SEE ALSO
section. In groff-current, exact matches with no sub-macros are
required. Adjust mandoc behaviour.
While here, also fix a very minor mandoc bug, even though no
detrimental effect of the bug on formatting is known. While using
sub-macros in the .Sh HEAD is bad style, the parsers accept it, so
setting the section attribute on the HEAD needs to act recursively.
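The exact-match rule described above can be sketched as follows. This is a Python model, not mandoc's C code: the representation of a .Sh head as a list of (text, is_macro) pairs and the function name are assumptions made for illustration only.

```python
def is_see_also(head):
    """Return True only if the .Sh head is exactly the words SEE ALSO.

    head is a list of (text, is_macro) pairs, one per head argument.
    This is a hypothetical model, not mandoc's node tree.
    """
    # Any sub-macro in the head, such as Em, disqualifies the match.
    if any(is_macro for _, is_macro in head):
        return False
    # Otherwise the words must match exactly, with nothing extra.
    return [text for text, _ in head] == ["SEE", "ALSO"]
```

With this model, ".Sh SEE ELSEWHERE" and ".Sh SEE Em ALSO" are both rejected, while a plain ".Sh SEE ALSO" is accepted.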
| |
The new version of the output file was generated with groff-current.
| |
macros .B, .I, .SM, and .SB that the next-line scope extends
to the end of the next logical input line and is not extended
if that line ends with a \c (no-space) escape sequence.
While improving a loosely related feature in the man(7) .TP
macro, a regression entered the groff codebase in groff
commit 3549fd9f (28-Apr-2017) caused by the usual sloppiness
of Bjarni Ingi Gislason. Since that time, groff wrongly had \c
extend next-line scope to a second line for these macros.
In man.c rev. 1.127 (25-Aug-2018) i synched mandoc behaviour
with groff in this respect, unfortunately failing to notice
the recent regression in groff. The groff regression was
finally fixed by gbranden@ in commit 09c028f3 (07-Jun-2022).
With the present commit, mandoc is back in sync with both GNU and
Heirloom roff regarding the interaction of single-font macros with \c.
| |
when multiple input or output lines are involved.
| |
The new version of this file was generated with groff-current.
Heirloom nroff produces exactly the same output for the content
of the DESCRIPTION.
| |
line, use the current output position as the reference position
for tabs on that input line. This brings mandoc in line with the
behaviour of GNU, Heirloom, and Plan 9 roff.
| |
move it to the top level include file mandoc.h to reduce the risk of causing
clashes when introducing new ASCII_* constants in the future.
| |
after vertical spacing was improved in man_term.c rev. 1.239.
| |
at the beginning of the node handler, in the same way as it is done
in the mdoc(7) node handler.
As a side effect, this also fixes a bug: if an input line contained
nothing but an escape sequence producing no output whatsoever (for
example, \fR), the old code incorrectly emitted a blank line anyway,
whereas the new code only emits such a blank line if the input line
actually produces output (even invisible zero-width output). To make
the distinction, the ASCII_NBRZW -> lastcol -> term_newln() mechanism
established in term.c rev. 1.289 is used.
| |
whatsoever (for example \fR) and escape sequences that produce
invisible zero-width output (for example \&). No, i'm not joking,
groff does make that distinction, and it has consequences in some
situations, for example for vertical spacing in no-fill mode.
Heirloom and Plan 9 behaviour is subtly different, but in case of
doubt, we want to follow groff.
While this fixes the behaviour for the majority of escape sequences,
in particular for those most likely to occur in practice, it is not
perfect yet because some of the more exotic ESCAPE_IGNORE sequences
are actually of the "no output whatsoever" type but treated
as "invisible zero-width" for now. With the new ASCII_NBRZW mechanism
in place, switching them over one by one when the need arises will
no longer be very difficult.
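The distinction between the two kinds of escape sequences can be illustrated with a small Python sketch. It is deliberately simplified and is not how mandoc implements the ASCII_NBRZW mechanism: as an assumption, only one-character font escapes such as \fR are treated as producing no output whatsoever, and everything else, including \&, is treated as output. Empty input lines are not modelled.

```python
import re

# Simplified assumption: only font escapes of the form \fX produce
# "no output whatsoever"; all other text, including \&, counts as output.
NO_OUTPUT = re.compile(r"\\f[A-Za-z0-9]")

def produces_output(line):
    """Return True if a non-empty input line yields any output,
    even invisible zero-width output."""
    return NO_OUTPUT.sub("", line) != ""
```

In this model, an input line consisting only of \fR produces nothing, whereas a line consisting only of \& does count as producing output.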
| |
not to *output* lines. In particular, if an input line gets broken in
fill mode and a tab occurs in the second output line, it advances to a
position of at least (width of the first output line) + (width of a
space character even though this is never printed) + (width of the part
of the second output line that precedes the tab).
Implement the same logic in mandoc.
Again, do not use tabs in filled text: they have surprising effects,
including this one.
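The lower bound stated above is plain addition, shown here as a Python sketch with widths in character cells. The function names, the one-cell space width, and the tab interval of 8 are illustrative assumptions and are not taken from the commit.

```python
def min_tab_position(first_line_width, before_tab_width, space_width=1):
    """Lower bound for the position a tab advances to when it occurs
    on the second output line produced by a single input line."""
    # first output line + the space that is never printed
    # + the part of the second output line that precedes the tab
    return first_line_width + space_width + before_tab_width

def next_tab_stop(position, tab_interval=8):
    """First tab stop strictly beyond position (hypothetical helper,
    assuming evenly spaced tab stops)."""
    return (position // tab_interval + 1) * tab_interval
```

For example, with a first output line of 60 cells and 5 cells of text before the tab, the tab advances to a position of at least 66.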
| |
non-breakable in exactly the same way as "\ ". That is, the preceding
word, the tab character, and the following word are always kept
together on the same output line. If filling is enabled and an
output line break is required before the end of the following word,
the break occurs before the beginning of the preceding word.
Make mandoc behave in the same way.
Of course, using literal tab characters in filled text remains a
bad idea, and the "WARNING: tab in filled text" remains unchanged.
| |
from being turned into underscores;
bug reported by <Eldred dot fr> Habert
| |
Also, mention /usr/ucb/man because /usr/bin/man did not provide -f in 4.0BSD.
| |
Leaving the body empty is legitimate in this case if the author only
wants to display a mail address or URI without providing a link text.
Output modules already handle this correctly: terminal output shows
just the URI without an accompanying text, HTML output uses the URI
for *both* the href= attribute and as the content of the <a> element.
The documentation was also wrong and claimed that an .MT or .UR block
with an empty body would produce no output. As explained above,
this isn't true.
Bogus warning reported by
Alejandro Colomar <alx dot manpages at gmail dot com>.
| |
after they were changed in OpenBSD.
Tracking these rules here would be useless.
| |
Patch from Anna Vyalkova <cyber at sysrq dot in>, significantly tweaked by me.
| |
"Start names with a capital letter;
it helps some screen readers speak them with appropriate inflection."
Anna Vyalkova already did that correctly when sending patches,
but i ruined it when committing, so fix it now.
| |
discussed with Anna Vyalkova <cyber at sysrq dot in>
| |
document, <h1> is intended for top level headers, and most of the
sections in a manual page can hardly be considered top-level.
It is more usual to use <h1> only for the main title of the document
or for the site name.
Consequently, move .Sh/.SH from <h1> to <h2> and .Ss/.SS from <h2>
to <h3>, freeing <h1> for use by header.html in man.cgi(8).
Discussed with Anna Vyalkova <cyber at sysrq dot in>.
| |
and use flexbox CSS instead. Improve accessibility by adding role
and aria-label attributes to these header and footer lines.
Using ideas from both Anna Vyalkova <cyber at sysrq dot in> and myself.
As a welcome side effect, this also resolves the long-standing issue
that the rendering was always 65em wide, requiring horizontal scrolling
when the window was narrower. Now, rendering nicely adapts to browser
windows of arbitrary narrowness.
| |
before and outside the <header> element.
Fix this by moving it into the <header> element where it belongs.
While here, also wrap footer.html in a <footer> element.
| |
in particular adding <header>, <main>, and <nav> elements
and role and aria-label attributes in several places.
Patch from Anna Vyalkova <cyber at sysrq dot in>,
minimally tweaked by me.
| |
between the <head> and the <body> rather than before the <head>
because the <meta charset="utf-8"/> element ought to be within
the first 1024 bytes of the HTML code.
Issue found with validator.w3.org.
| |
HTML <main> element. The benefit is that it has the ARIA landmark
role "main" by default. To ease the transition for people using
their own CSS file instead of mandoc.css, retain the custom class
for now.
I had this idea in a discussion with Anna Vyalkova <cyber at sysrq dot in>.
Patch from Anna, slightly tweaked by me.
| |
G. Branden Robinson changed the -T ascii rendering
of \(sd, the "second" symbol, U+2033 DOUBLE PRIME, from '' to ".
Follow suit in mandoc.
| |
such that users of screen readers aren't forced to listen to lengthy and
distracting readings like "mdoc, left parenthesis, 7, right parenthesis".
Based on a patch from Anna Vyalkova <cyber at sysrq dot in>,
significantly tweaked by me.
| |
in the DPUB-ARIA doc-toc role.
Patch from Anna Vyalkova <cyber at sysrq dot in> slightly tweaked by me.
This is hopefully the start of a collaboration to improve accessibility
of Unix manual pages using the WAI-ARIA, HTML-ARIA, and DPUB-ARIA standards.
Progress appears to be possible without changing *anything* with respect to
the way manual pages are written. Instead, it seems sufficient to properly
translate semantic cues already implied by existing mdoc(7) markup into the
appropriate HTML elements and ARIA attributes. Overall, the total length
of HTML output is likely to increase slightly, but not much.
| |
because that has no longer been true for some time now.
I would certainly like to adhere to a coherent standard and state
which one that is. Unfortunately, the W3C deliberately smashed
the CSS standard into pieces such that a coherent standard no
longer exists and such that statements about standard conformance
have become next to meaningless. Consequently, i now remain
reluctantly silent regarding CSS standard(s) conformance.
Going back to CSS2.1, published in 2011, which was the last CSS
standard in the proper sense of the word, is not an option because
it has gaping holes in functionality and is no longer adequate for
use on today's WWW.
| |
of the current block but really want the next block instead. This fixes
a segfault reported by Evan Silberman <evan at jklol dot net> on bugs@.
| |
delimiter for an outer escape sequence, in which case the delimiting
escape sequence retains its syntax but usually ignores its argument
and loses its inherent effect. Add rudimentary support for this
syntax quirk in order to improve parsing compatibility with groff.
| |
into the more specific messages "invalid escape argument delimiter"
and "invalid escape sequence argument".
| |
the error was already reported earlier when roff_expand()
called roff_escape().
| |
operators as argument delimiters for some escape sequences that take
numerical arguments, in the same way as it had already been done for \h.
Argument delimiter parsing for escape sequences taking numerical arguments
is not perfect yet. In particular, when a character representing a
scaling unit is abused as the argument delimiter, parsing for that
character becomes context-dependent, and it is no longer possible to
find the end of the escape sequence without calling the full numerical
expression parser, which i refrain from attempting in this commit.
For now, continuing to misparse insane constructions like \Bc1c+1cc
(which is valid in groff and resolves to "1" because 1c+1c = two
centimeters is a valid numerical expression and 'c' is also a valid
delimiter) is a small price to pay for keeping complexity at bay
and for not losing focus in the ongoing series of refinements.
| |
improved diagnostics for the \C escape sequence
| |
The restriction of only allowing ' as the delimiter was introduced
by kristaps@ on 2011/04/09 when he first supported \C.
For most other escape sequences, similar restrictions were relaxed
later on, but for the rarely used \C, it was apparently forgotten.
While here, reject empty character names: they are never valid.
| |
diagnostics. Distinguish "incomplete escape sequence", "invalid special
character", and "unknown special character" from the generic "invalid
escape sequence", also promoting them from WARNING to ERROR because
incomplete escape sequences are severe syntax violations and because
encountering an invalid or unknown special character makes it likely
that part of the document content intended by the authors gets lost.
| |
call mandoc_msg() only once at the end, not sometimes in the middle,
classify incomplete, non-expanding escape sequences as ESCAPE_ERROR,
and also reduce the number of return statements;
no formatting change intended.
| |
in the same way as groff:
* \\ is always reduced to \
* \. is always reduced to .
* other undefined escape sequences are usually reduced to the escape name,
for example \G to G, except during the expansion of expanding escape
sequences having the standard argument form (in particular \* and \n),
in which case the backslash is preserved literally.
Yes, this is confusing indeed.
For example, the following have the same meaning:
* .ds \. and .ds . which is not the same as .ds \\.
* \*[\.] and \*[.] which is not the same as \*[\\.]
* .ds \G and .ds G which is not the same as .ds \\G
* \*[\G] and \*[\\G] which is not the same as \*[G] <- sic!
To feel less dirty, have a leaning toothpick, if you are so inclined.
This patch also slightly improves the string shown by the "escaped
character not allowed in a name" error message.
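The reduction rules listed in this commit message can be sketched in Python as follows. This is a model for illustration, not mandoc's parser: the function name and the boolean parameter are hypothetical, and the sketch assumes that every escape sequence in the input is either \\, \., or undefined. Defined escape sequences are not modelled.

```python
def reduce_undefined_escapes(name, expanding=False):
    """Reduce \\\\, \\. and undefined escape sequences in a roff name.

    expanding=True models the expansion of escape sequences having the
    standard argument form, such as the name inside \\*[...] or \\n[...].
    """
    out = []
    i = 0
    while i < len(name):
        ch = name[i]
        # Ordinary characters and a lone trailing backslash are copied.
        if ch != "\\" or i + 1 == len(name):
            out.append(ch)
            i += 1
            continue
        nxt = name[i + 1]
        if nxt in "\\.":
            # \\ is always reduced to \ and \. is always reduced to .
            out.append(nxt)
        elif expanding:
            # During expansion, the backslash is preserved literally.
            out.append("\\" + nxt)
        else:
            # Otherwise, reduce to the escape name, for example \G to G.
            out.append(nxt)
        i += 2
    return "".join(out)
```

Under these assumptions, the model reproduces the four equivalences listed in the message: for example, the names in \*[\G] and \*[\\G] both reduce to a backslash followed by G, which differs from the name in \*[G].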
| |
wrong parsing class ESCAPE_SPECIAL to the better-suited parsing class
ESCAPE_UNDEF, exactly like it is already done for the similar \\,
which isn't a character escape sequence either.
No formatting change is intended just yet, but this will matter for
upcoming improvements in the parser for roff(7) macro, string, and
register names.
See the node "5.23.2 Copy Mode" in "info groff" regarding
what \\ and \. really mean.
| |
To that end, add another argument to roff_escape()
returning the index of the escape name.
This also makes the code in roff_escape() a bit more uniform
in so far as it no longer needs the "char esc_name" local variable
but now does everything with indices into buf[].
No functional change.
| |
be triggered by macro arguments ending in double backslashes, for
example if people wrote .Sq "\\" instead of the correct .Sq "\e".
The bug was hard to find because it caused a segfault only very rarely,
according to my measurements with a probability of less than one permille.
I'm sorry that the first one to hit the bug was an arm64 release build
run by deraadt@. Thanks to bluhm@ for providing access to an arm64
machine for debugging purposes. In the end, the bug turned out to be
architecture-independent.
The reason for the bug was that i assumed an invariant that does not exist.
The function roff_parse_comment() is very careful to make sure that the
input buffer does not end in an escape character before passing it on,
so i assumed this is still true when reaching roff_expand() immediately
afterwards. But roff_expand() can also be reached from roff_getarg(),
in which case there *can* be a lone escape character at the end of the
buffer in case copy mode processing found and converted a double
backslash.
Fix this by handling a trailing escape character correctly in the
function roff_escape().
The lesson here probably is to refrain from assuming an invariant
unless verifying that the invariant actually holds is reasonably
simple. In some cases, in particular for invariants that are important
but not simple, it might also make sense to assert(3) rather than just
assume the invariant. An assertion failure is so much better than a
buffer overrun...
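The two ideas in this message, handling a trailing escape character explicitly and asserting an invariant instead of assuming it, can be illustrated with a minimal Python sketch. The helper below is hypothetical and only knows one-character escape names; it is not the code of roff_escape(). Note that Python would raise an IndexError where C would read past the end of the buffer.

```python
def escape_length(buf, pos):
    """Length of the escape sequence starting at buf[pos] (simplified).

    Hypothetical helper: it only handles one-character escape names.
    """
    # This precondition is cheap to verify, so assert it rather than
    # assuming that every caller guarantees it.
    assert 0 <= pos < len(buf) and buf[pos] == "\\"
    # A lone escape character at the end of the buffer is incomplete:
    # report length 1 instead of looking at a character that is not there.
    if pos + 1 == len(buf):
        return 1
    return 2
```

The explicit end-of-buffer check covers the case where a caller passes in a buffer that ends in a lone escape character.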
| |
semantics (test identifier for syntactical validity), not at all
following the completely unrelated Heirloom semantics (define
hyperlink target position).
The main motivation for providing this implementation is to get \A
into the parsing class ESCAPE_EXPAND that corresponds to groff parsing
behaviour, which is quite similar to the \B escape sequence (test
numerical expression for syntactical validity). This is likely
to improve parsing of nested escape sequences in the future.
Validation isn't perfect yet. In particular, this implementation
rejects \A arguments containing some escape sequences that groff
allows to slip through. But that is unlikely to cause trouble even
in documents using \A for non-trivial purposes. Rejecting the nested
escapes in question might even improve robustness because the rejected
names are unlikely to really be usable for practical purposes - no
matter that groff dubiously considers them syntactically valid.
| |
escape sequence into the correct parsing class, ESCAPE_EXPAND.
Expansion of \g is supposed to work exactly like the expansion
of the related escape sequence \n (interpolate register value),
but since we ignore the .af (assign output format) request,
we just interpolate an empty string to replace the \g sequence.
Surprising as it may seem, this actually makes a formatting difference
for deviant input like ".O\gNx" which used to raise bogus "escaped
character not allowed in a name" and "skipping unknown macro" errors
and printed nothing, whereas now it correctly prints "OpenBSD".
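Interpolating an empty string for \g can be sketched in Python as follows. This is a simplified model, not mandoc's roff_expand(): the function name is hypothetical, and the sketch assumes that \g takes its register name in one of the standard forms \gx, \g(xx, or \g[name] and that the backslash is not itself escaped.

```python
import re

# Standard argument forms: \g[name], \g(xx, or \gx.
G_ESCAPE = re.compile(r"\\g(?:\[[^\]]*\]|\(..|.)")

def expand_g(text):
    """Replace every \\g escape sequence with an empty string."""
    return G_ESCAPE.sub("", text)
```

Applied to the input from the message, ".O\gNx" becomes ".Ox": the sequence \gN is consumed as an escape with the register name N, leaving the macro .Ox.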