| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
of a conditional inside a .ce request block. Instead, abort the .ce
block just like when there is no conditional in between.
Bug found by espie@ working on the textproc/fstrcmp port.
|
|
|
|
|
|
|
|
|
|
|
| |
macro, which is usually close to the beginning of the file, right
after the Copyright header comments. But espie@ found horrible
input files in the textproc/fstrcmp port that generate lots of parse
nodes before even getting to the header macro. In some formatters,
comment nodes after some kinds of real content triggered assertions.
So make sure generation of comment nodes stops once real content is
encountered.
|
|
|
|
|
| |
patch from Michal Nowak <mnowak at startmail dot com>
who found these with git pbchk in the illumos tree
|
|
|
|
|
| |
Fixing a bug found with the groffer(1) version 1.19 manual page
following a report from Jan Stary.
|
|
|
|
|
|
|
| |
Jan Stary <hans at stare dot cz> found it in an ancient groffer(1)
manual page (version 1.19) on MacOS X Mojave.
Having .break not implemented wasn't a particularly bright idea
because obviously, it tended to cause infinite loops.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
and audit all its callers whether termination is handled correctly.
Resulting improvements:
* An escape or tab ending the macro name in a macro invocation
is discarded, and argument processing is started after it.
* An escape or tab ending a name in ".if d" and ".if r" is preserved.
* An escape ending a name in ".ds" causes the whole request to be ignored.
* A tab ending a name in ".ds" becomes part of the string.
* An escape or tab ending a name in ".rm"
causes the rest of the line to be ignored.
* An escape or tab ending the first name in ".als", ".rn", or ".nr"
causes the whole request to be ignored.
Kurt Jaeger <pi at FreeBSD> made me aware of
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=235456#c0
and in that bug report, comment 0 item (3) is a special case
of this class of issues.
Yes, the "mh" manual pages are no doubt among the worst on the planet.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
suspending no-fill mode during their head. Model this with an
additional roff parser state flag ROFF_NONOFILL. That is much
simpler than it would be to save and restore the ROFF_NOFILL flag
itself, in particular since the latter can be switched (with lasting
effect) by the .nf and .fi requests even while its effect is
temporarily suspended.
This commit does not change formatting yet, but prepares for future
formatting simplifications and improvements.
|
|
|
|
|
|
| |
like it is already done with NODE_SYNPRETTY, such that the fill
mode becomes more directly available to the formatters.
Not used yet, but will be used by upcoming commits.
|
|
|
|
|
|
| |
parser to the roff(7) parser. As a side effect, .nf and .fi are
now also parsed in mdoc(7) input, though the mdoc(7) formatters
still ignore most of their effect.
|
|
|
|
|
|
|
|
|
| |
Simplify the way the man(7) and mdoc(7) validators are called.
Reset the parser state with a common function before calling them.
There is no need to again reset the parser state afterwards,
the parsers are no longer used after validation.
This allows getting rid of man_node_validate() and mdoc_node_validate()
as separate functions.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The struct roff_man used to be a bad mixture of internal parser
state and public parsing results. Move the public results to the
parsing result struct roff_meta, which is already public. Move the
rest of struct roff_man to the parser-internal header roff_int.h.
Since the validators need access to the parser state, call them
from the top level parser during mparse_result() rather than from
the main programs, also reducing code duplication.
This keeps parser internal state out of thee main programs (five
in mandoc portable) and out of eight formatters.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
struct as an argument such that after copy-in, it can call roff_expand()
once again, which used to be called roff_res() before this. This
fixes a subtle low-level roff(7) parsing bug reported by Fabio
Scotoni <fabio at esse dot ch> in the 4.4BSD-Lite2 mdoc.samples(7)
manual page, because that page used an escaped escape sequence in
a macro argument.
To expand escaped escape sequences in quoted mdoc(7) arguments, too,
stop bypassing the call to roff_getarg() in mdoc_argv.c, function args()
for this case. This does not solve the case of escaped escape sequences
in quoted .Bl -column phrases yet.
Because roff_expand() can make the string longer, roff_getarg() can no
longer operate in-place but needs to malloc(3) the returned string.
In the high-level parsers, free(3) that string after processing it.
|
|
|
|
|
|
|
| |
When after a \\, \t, or \a, another \t or \a had to be resolved
in copy mode within the same argument, the argument got corrupted.
Found while working on a loosely related bug report
from Fabio Scotoni <fabio at esse dot ch>.
|
|
|
|
|
|
|
|
|
| |
move the function mandoc_getarg() from mandoc.c to roff.c. It was
misplaced in mandoc.c in the first place; that file is intended for
utilities needed both by parsers and by formatters, while reading
macro arguments in copy mode is purely a task of the roff(7) parser.
Needed as a preliminary for an upcoming bugfix.
No code change.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add the missing special character \_ (underscore).
* Partial implementations of \a (leader character)
and \E (uninterpreted escape character).
* Parse and ignore \r (reverse line feed).
* Add a WARNING message about undefined escape sequences.
* Add an UNSUPP message about unsupported escape sequences.
* Mark \! and \? (transparent throughput)
and \O (suppress output) as unsupported.
* Treat the various variants of zero-width spaces as one-byte escape
sequences rather than as special characters, to avoid defining bogus
forms with square brackets.
* For special characters with one-byte names, do not define bogus
forms with square brackets, except for \[-], which is valid.
* In the form with square brackets, undefined special characters do not
fall back to printing the name verbatim, not even for one-byte names.
* Starting a special character name with a blank is an error.
* Undefined escape sequences never abort formatting of the input
string, not even in HTML output mode.
* Document the newly handled escapes, and a few that were missing.
* Regression tests for most of the above.
|
|
|
|
|
|
|
|
|
|
| |
Now that message handling is properly encapsulated,
remove struct mparse pointers from four structs (roff, roff_man,
tbl_node, eqn_node) and from the argument lists of five functions
(roff_alloc, roff_man_alloc, mandoc_getarg, tbl_alloc, eqn_alloc).
Except for being passed to the main program as an opaque object,
it now only occurs in read.c, as it should, and not across 15 files
like in the past.
|
|
|
|
|
|
|
|
| |
from mandoc_msg(), where it is no longer used.
While here, rename mandoc_vmsg() to mandoc_msg() and retire the
old version: There is really no point in having another function
merely to save "%s" in a few places.
Minus 140 lines of code.
|
|
|
|
|
|
|
|
|
|
| |
Split the top level parser interface out of the utility header
mandoc.h, into a new header mandoc_parse.h, for use in the main
program and in the main parser only.
Move enum mandoc_os into roff.h because struct roff_man is the
place where it is stored.
This allows removal of mandoc.h from seven files in low-level
parsers and in formatters.
|
|
|
|
|
|
| |
No need to expose the eqn(7) syntax tree data structures everywhere.
Move them to their own include file, "eqn.h".
While here, delete the unused enum eqn_pilet.
|
|
|
|
|
|
|
|
| |
In libroff.h, nothing was left except the eqn(7) parser interface, which
isn't really part of the roff(7) parser, so rename it to eqn_parse.h.
While here, move struct eqn_def to eqn.c because that's the only
file using it, and let eqn_box_free() and eqn_free() handle NULL.
|
|
|
|
|
| |
Move tbl(7)-specific parser internals out of libroff.h.
Move some tbl(7)-internal processing from roff.c to tbl.c.
|
|
|
|
|
| |
No need to expose the tbl(7) syntax tree data structures everywhere.
Move them to their own include file, "tbl.h", and improve comments.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
combinations are handled, and are handled in a systematic manner.
This resolves some erratic duplicate handling, handles a number of
missing cases, and improves diagnostics in various respects.
Move validation of .br and .sp to the roff validation module
rather than doing that twice in the mdoc and man validation modules.
Move the node relinking function to the roff library where it belongs.
In validation functions, only look at the node itself, at previous
nodes, and at descendants, not at following nodes or ancestors,
such that only nodes are inspected which are already validated.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
by itself (which is somewhat unusual but not invalid; most authors
use the empty macro line ".\}" instead), agree more closely with
groff and do not produce a double space in the output.
Quirk reported by millert@.
While here, tweak the rest of the function body of roff_cond_text()
to more closely match roff_cond_sub(). The subtly different handling
could make people (including myself) wonder whether there is any
point in being different. Testing shows there is not.
|
|
|
|
|
|
|
|
|
| |
for HTML output. Somewhat relevant because pod2man(1) relies on this.
Missing feature reported by Pali dot Rohar at gmail dot com.
Note that constant width font was already correctly selected before
this when required by semantic markup. Only attempting physical
markup with the low-level escape sequence was ineffective.
|
|
|
|
|
|
|
|
|
| |
definition) request, used for example by groff_hdtbl(7).
This simplistic implementation may interact incorrectly
with the .tr (input character translation) request.
But come on, you are not only using .char *and* .tr, but you do so
with respect to the same character in the same manual page?
|
|
|
|
|
|
|
|
|
|
|
| |
Needed for example by groff_hdtbl(7).
There are two limitations:
It does not support nested .while requests yet,
and each .while loop must start and end in the same scope.
The roff_parseln() return codes are now more flexible
and allow OR'ing options.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
for example used by groff_hdtbl(7) and groff_mom(7).
Also correctly interpolate arguments during nested macro execution
even after .shift and .return, implemented using a stack of argument
arrays.
Note that only read.c, but not roff.c can detect the end of a macro
execution, and the existence of .shift implies that arguments cannot
be interpolated up front, so unfortunately, this includes a partial
revert of roff.c rev. 1.337, moving argument interpolation back into
the function roff_res().
|
|
|
|
|
|
|
|
|
|
|
|
| |
quoted) in addition to the already supported \\$* (similar, but
unquoted). Then use \\$@ to improve the implementation of
the .als request (macro alias).
Needed by groff_hdtbl(7).
Gosh, it feels like the manual pages of the groff package are
exercising every bloody roff(7) feature under the sun. In the
manual page source code itself, not merely in the implementation
of the used macro packages, that is.
|
|
|
|
|
|
|
|
| |
before even reparsing the expanded macro.
That is the least dirty way to fix the bug that \(.$ remained set
after execution of the user-defined macro ended. Any other way
to fix it would probably require changes to read.c, which really
shouldn't be bothered with such roff(7) internals.
|
|
|
|
|
|
|
|
|
|
|
| |
roff conditional, except that the .char request still isn't supported
and that behaviour differs from groff in many edge cases.
But at least valid character names and numbers are now distinguished
from invalid ones.
This also fixes the bug that parsing of the 'c' conditional was
incomplete, which resulted in leaking the tested character to the
input parser at the beginning of the body when the condition was
inverted.
|
|
|
|
| |
because that turned it into a bogus line continuation.
|
|
|
|
| |
with comment); used for example by gropdf(1)
|
|
|
|
| |
used in most manual pages of the groff package
|
|
|
|
| |
used for example by groff_diff(7)
|
|
|
|
|
| |
by allowing the preprocessor to pass it through to the formatters.
Used for example by the groff_char(7) manual page.
|
|
|
|
|
| |
Examples of manual pages (ab)using it
include groff(7), chem(1), groff_mom(7), and groff_hdtbl(7).
|
|
|
|
|
|
| |
the parse point to the beginning of the new buffer or we risk out
of bounds accesses. Bug found by Leah Neukirchen <leah at vuxu dot
org> with valgrind on Void Linux.
|
|
|
|
| |
with mandoc -Tman; suggested by Thomas Klausner <wiz at NetBSD>
|
|
|
|
|
|
| |
* .nr optional third argument (auto-increment step size)
* \n+ and \n- numerical register auto-increment and -decrement
bentley@ reported on Dec 9, 2013 that lang/sbcl(1) uses these.
|
|
|
|
|
|
| |
the previous commit for strings and macros, only technically simpler.
Desired behaviour also mentioned by Werner Lemberg in 2011.
This diff adds functionality but is -21 +19 LOC. :-)
|
|
|
|
|
| |
Observed by Werner Lemberg on Nov 14, 2011
and rotting on my TODO list ever since.
|
|
|
|
| |
fixing tree corruption and assertion failure found by jsg@ with afl(1)
|
|
|
|
|
|
| |
unable to figure out that it is never used uninitialized.
While here, tweak the content of the variable to make its usage
easier to understand. No functional change.
|
|
|
|
| |
and use after free many ensue; again found by jsg@ with afl(1)
|
|
|
|
|
|
|
|
|
|
|
|
| |
right from roff_parseln() rather than delegating to read.c,
similar to what i just did for eqn(7).
The interface function roff_span() becomes obsolete and is deleted,
the former interface function roff_addtbl() becomes static,
the interface functions tbl_read() and tbl_cdata() become void,
and minus twelve linus of code.
No functional change.
|
|
|
|
| |
found by jsg@ with afl(1)
|
|
|
|
|
|
| |
of struct roff_node which is allocated for each equation anyway.
2. Do not keep a list of equation parsers, one parser is enough.
Minus fifty lines of code, no functional change.
|