| Commit message | Author | Age | Files | Lines |
|
|
|
| |
designed and written last autumn, polished today
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
as defining a term. Please only use it when automatic tagging does
not work. Manual page authors will not be required to add the new
macro; using it remains optional. HTML output is still rudimentary
in this version and will be polished later.
Thanks to kn@ for reminding me that i have been considering since
BSDCan 2014 whether something like this might be useful. Given
that possibilities of making automatic tagging better are running
out and there are still several situations where automatic tagging
cannot do the job, i think the time is now ripe.
Feedback and no objection from millert@; OK espie@ inoguchi@ kn@.
|
|
|
|
|
|
|
|
| |
without an argument, use the empty string, and always concatenate
all arguments, no matter their number.
This allows reducing the number of arguments of mandoc_normdate()
and some other simplifications, at the same time polishing some
error messages by adding the name of the macro in question.
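
A minimal sketch of the "empty string without arguments, otherwise
concatenate everything" rule. The helper name join_args() and the
single-blank separator are assumptions made for illustration; this
is not the mandoc code.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: join all arguments with single blanks.
 * With argc == 0, the result is the empty string, never NULL.
 * The caller owns the returned memory and must free(3) it.
 * Returns NULL only if allocation fails.
 */
char *
join_args(int argc, const char *const *argv)
{
	size_t	 len;
	int	 i;
	char	*s;

	len = 1;	/* terminating NUL */
	for (i = 0; i < argc; i++)
		len += strlen(argv[i]) + (i > 0);	/* word plus separator */

	if ((s = malloc(len)) == NULL)
		return NULL;

	*s = '\0';
	for (i = 0; i < argc; i++) {
		if (i > 0)
			strcat(s, " ");
		strcat(s, argv[i]);
	}
	return s;
}
```

Because the helper accepts any number of arguments, a caller does
not need separate code paths for zero, one, or many arguments.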
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
by the <p> HTML element and use the html_fillmode() mechanism
for .Bd -unfilled, just like it was done for man(7) earlier, finally
getting rid both of the horrible <div class="Pp"></div> hack and
of the worst HTML syntax violations caused by nested displays.
Care is needed because in some situations, paragraphs have to remain
open across several subsequent macros, whereas in other situations,
they must get closed together with a block containing them.
Some implementation details include:
* Always close paragraphs before emitting HTML flow content.
* Let html_close_paragraph() also close <pre> for extra safety.
* Drop the old, now unused function print_paragraph().
* Minor adjustments in the top-level man(7) node formatter for symmetry.
* Bugfix: .Ss heads suspend no-fill mode, even though .Ss doesn't end it.
* Bugfix: give up on .Op semantic markup for now, see the comment.
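
The first item in the list above can be pictured with a small
sketch. All names here (struct html, emit(), open_paragraph(),
close_paragraph(), open_flow()) are hypothetical and only model the
idea of tracking one open paragraph; the real formatter is
considerably more involved.

```c
#include <string.h>

/* Hypothetical output state: a buffer and a "paragraph open" flag. */
struct html {
	char	buf[256];
	int	in_para;	/* nonzero while a <p> is open */
};

/* Append s to the buffer, silently truncating if it is full. */
void
emit(struct html *h, const char *s)
{
	strncat(h->buf, s, sizeof(h->buf) - strlen(h->buf) - 1);
}

/* Close the paragraph if one is open; otherwise do nothing. */
void
close_paragraph(struct html *h)
{
	if (h->in_para) {
		emit(h, "</p>");
		h->in_para = 0;
	}
}

/* Paragraphs do not nest, so close any open one first. */
void
open_paragraph(struct html *h)
{
	close_paragraph(h);
	emit(h, "<p>");
	h->in_para = 1;
}

/* Flow content such as <div> must not end up inside <p>. */
void
open_flow(struct html *h, const char *tag)
{
	close_paragraph(h);
	emit(h, tag);
}
```

Funnelling every flow-level element through one function that
closes the paragraph first keeps the rule in a single place.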
|
|
|
|
|
|
| |
that children and later siblings get correct NODE_NOFILL assignments.
This doesn't change rendering yet but prepares for future rendering
improvements.
|
|
|
|
|
|
|
|
|
| |
Simplify the way the man(7) and mdoc(7) validators are called.
Reset the parser state with a common function before calling them.
There is no need to again reset the parser state afterwards,
the parsers are no longer used after validation.
This allows getting rid of man_node_validate() and mdoc_node_validate()
as separate functions.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The struct roff_man used to be a bad mixture of internal parser
state and public parsing results. Move the public results to the
parsing result struct roff_meta, which is already public. Move the
rest of struct roff_man to the parser-internal header roff_int.h.
Since the validators need access to the parser state, call them
from the top level parser during mparse_result() rather than from
the main programs, also reducing code duplication.
This keeps parser internal state out of the main programs (five
in mandoc portable) and out of eight formatters.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
struct as an argument such that after copy-in, it can call roff_expand()
once again, which used to be called roff_res() before this. This
fixes a subtle low-level roff(7) parsing bug reported by Fabio
Scotoni <fabio at esse dot ch> in the 4.4BSD-Lite2 mdoc.samples(7)
manual page, because that page used an escaped escape sequence in
a macro argument.
To expand escaped escape sequences in quoted mdoc(7) arguments, too,
stop bypassing the call to roff_getarg() in mdoc_argv.c, function args()
for this case. This does not solve the case of escaped escape sequences
in quoted .Bl -column phrases yet.
Because roff_expand() can make the string longer, roff_getarg() can no
longer operate in-place but needs to malloc(3) the returned string.
In the high-level parsers, free(3) that string after processing it.
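
Why expansion forces allocation can be shown with a toy example.
The function expand_arg() and its '$' placeholder are made up for
this sketch and have nothing to do with real roff(7) escape syntax;
the point is only that the result may be longer than the input, so
it cannot be written in place.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for an expanding argument parser:
 * every '$' in src is replaced by the string repl.
 * The result is returned in freshly allocated memory
 * that the caller must free(3); NULL on allocation failure.
 */
char *
expand_arg(const char *src, const char *repl)
{
	const char	*cp;
	char		*dst, *out;
	size_t		 rlen, len;

	/* First pass: compute the length of the expanded string. */
	rlen = strlen(repl);
	len = 0;
	for (cp = src; *cp != '\0'; cp++)
		len += (*cp == '$') ? rlen : 1;

	if ((dst = malloc(len + 1)) == NULL)
		return NULL;

	/* Second pass: copy, substituting as we go. */
	out = dst;
	for (cp = src; *cp != '\0'; cp++) {
		if (*cp == '$') {
			memcpy(out, repl, rlen);
			out += rlen;
		} else
			*out++ = *cp;
	}
	*out = '\0';
	return dst;
}
```

A caller uses the string and then releases it, mirroring the
free(3) described above for the high-level parsers.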
|
|
|
|
|
|
|
|
| |
from mandoc_msg(), where it is no longer used.
While here, rename mandoc_vmsg() to mandoc_msg() and retire the
old version: There is really no point in having another function
merely to save "%s" in a few places.
Minus 140 lines of code.
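
A sketch of the single-function design, assuming a printf-style
interface. The name report() and its buffer-based signature are
hypothetical and chosen to keep the example testable; they are not
the mandoc_msg() prototype.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical message function taking a printf-style format.
 * Writes "line:col: message" into buf.  Returns the length the
 * complete message needs, or a negative value on output error.
 */
int
report(char *buf, size_t bufsz, int line, int col, const char *fmt, ...)
{
	va_list	 ap;
	int	 n, m;

	n = snprintf(buf, bufsz, "%d:%d: ", line, col);
	if (n < 0 || (size_t)n >= bufsz)
		return n;	/* error, or no room for the message */

	va_start(ap, fmt);
	m = vsnprintf(buf + n, bufsz - n, fmt, ap);
	va_end(ap);
	return m < 0 ? m : n + m;
}
```

A caller with a fixed string passes "%s" and the string, as in
report(buf, sizeof(buf), 3, 7, "%s", "skipping macro"), which is
the small cost mentioned above for not keeping a second function.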
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
combinations are handled, and are handled in a systematic manner.
This resolves some erratic duplicate handling, handles a number of
missing cases, and improves diagnostics in various respects.
Move validation of .br and .sp to the roff validation module
rather than doing that twice in the mdoc and man validation modules.
Move the node relinking function to the roff library where it belongs.
In validation functions, only look at the node itself, at previous
nodes, and at descendants, not at following nodes or ancestors,
such that only nodes are inspected which are already validated.
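
One way to obtain that guarantee is a post-order walk: descendants
and earlier siblings are always finished before a node is visited.
The sketch below uses a hypothetical struct vnode and is only an
illustration of the ordering, not a claim about how the mandoc
validators traverse the tree.

```c
#include <stddef.h>

/* Hypothetical tree node, reduced to what the walk needs. */
struct vnode {
	struct vnode	*child;	/* first child */
	struct vnode	*next;	/* next sibling */
	int		 valid;	/* set once validated */
	int		 order;	/* sequence number of validation */
};

static int seq;	/* counts validated nodes */

/*
 * Post-order validation: when a node is reached, all of its
 * descendants and all of its previous siblings are already valid,
 * while following siblings and ancestors are not.
 */
void
validate(struct vnode *n)
{
	struct vnode	*c;

	for (c = n->child; c != NULL; c = c->next)
		validate(c);
	n->valid = 1;
	n->order = ++seq;
}
```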
|
|
|
|
|
| |
that is undefined according to the C standard. Robert Elz <kre at
munnari dot oz dot au> pointed out i wasn't quite done yet.
|
|
|
|
|
|
|
|
|
|
|
| |
and of called macros.
This bug affects almost all macros, and fixing it simplifies the
code. It is amazing that the bogus ARGS_QWORD feature got implemented
in the first place, and then carried along for more than eight years
without anybody ever noticing that it was pointless.
Reported by Leah Neukirchen <leah at vuxu dot org>, found on Void Linux.
|
|
|
|
| |
now that this actually saves code: -70 LOC.
|
| |
|
|
|
|
|
| |
Generate the first node on the roff level: .br
Fix some column numbers in diagnostic messages while here.
|
|
|
|
| |
no functional change, minus two source files, minus 200 lines of code.
|
|
|
|
|
|
|
|
| |
* Make enum rofft an internal interface as enum roff_tok in "roff.h".
* Represent mdoc and man macros in enum roff_tok.
* Make TOKEN_NONE a proper enum value and use it throughout.
* Put the prologue macros first in the macro tables.
* Unify mdoc_macroname[] and man_macroname[] into roff_name[].
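
The shape of such a unified token space can be sketched in
miniature. The identifiers enum tok, TOK_*, tok_name[] and
tok_lookup() are hypothetical, and the six entries are a tiny
sample; the real tables are much longer.

```c
#include <string.h>

/* Hypothetical miniature of one enum covering all three levels. */
enum tok {
	TOK_br,		/* roff request */
	TOK_sp,		/* roff request */
	TOK_Dd,		/* mdoc prologue macro */
	TOK_Sh,		/* mdoc macro */
	TOK_TH,		/* man prologue macro */
	TOK_SH,		/* man macro */
	TOK_MAX,	/* number of real tokens */
	TOK_NONE	/* proper enum value meaning "no token" */
};

/* One name table indexed by the enum, instead of one per macro set. */
static const char *const tok_name[TOK_MAX] = {
	"br", "sp", "Dd", "Sh", "TH", "SH"
};

/* Linear lookup, sufficient for a sketch; names are case sensitive. */
enum tok
tok_lookup(const char *name)
{
	int	 i;

	for (i = 0; i < TOK_MAX; i++)
		if (strcmp(name, tok_name[i]) == 0)
			return (enum tok)i;
	return TOK_NONE;
}
```

With a single enum, code that only needs to tell tokens apart no
longer has to know which macro set a token came from.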
|
|
|
|
|
|
|
|
|
| |
This macro is unusual in so far as trailing punctuation needs to remain
inside the scope because it must be inside, not after the display
of long URIs in terminal output mode.
Improves formatting of fw_update(1), help(1), less(1), sendbug(1),
acx(4), inet6(4), ipsec(4), oce(4), isakmpd.conf(5), afterboot(8),
release(8), traceroute(8).
|
|
|
|
|
|
|
|
|
| |
implicit blocks (.Aq Bq Po .Pc) that left the outer breaker open
and could in exceptional cases, like between .Bl and .It, cause
tree corruption leading to NULL dereference.
Found by tb@ with afl(1).
While here, do not mark intermediate ENDBODY markers as broken.
|
|
|
|
|
|
|
|
| |
Comparing to groff output, it appears that all cases where it was used
and made a difference actually require the opposite, ENDBODY_SPACE.
I have no idea why i added it back in 2010; maybe to compensate for
some other bug that has long been fixed.
|
|
|
|
|
| |
Fixes the last of the tree corruptions sometimes causing NULL dereference
reported by tb@; this one triggered in cases like: .Bl -column .It Pq Ta
|
|
|
|
|
|
| |
Fixes tree corruption leading to NULL dereference
in insane cases like .Oo Oo .Nd .Pq Oc .Oc Oc
found by tb@ with afl(1).
|
|
|
|
|
| |
backwards. Only do so when a block is found that is actually broken.
Logic error found while investigating crashes reported by tb@.
|
|
|
|
|
|
|
|
| |
breakers unless the parent of the block is already closed. While
the scanning is needed in cases like ".Ac Bo" for broken Ao, it is
useless and crashy in cases like ".Ac Bc" for non-broken Ao.
This fixes a NULL pointer dereference that tb@ found with afl(1).
|
|
|
|
|
|
|
|
|
|
|
|
| |
gets broken. In that case, mark them as BROKEN and ENDED and make
sure they get closed out together with the child.
Fixes tree corruption leading to a NULL dereference found by tb@
with afl(1) in: .Sh SYNOPSIS .Bl .Oo .Nm .Bk .Oc .It (where .Bk is
the child and .Oo is the breaker).
A simpler form of the same corruption (without crash) is visible in:
.Sh SYNOPSIS .Ao .Nm .Bo .Ac .Bc text
where the text ended up inside the .Nm (child .Bo, breaker .Ao).
|
| |
|
|
|
|
|
|
|
|
| |
are open, all except the innermost open block got a bogus MDOC_ENDED
marker, in some situations triggering segfaults down the road
which tb@ found with afl(1).
Fix the logic error by figuring out up front whether an end macro
has a matching body, and if it hasn't, don't mark any blocks as broken.
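
The "decide up front" idea can be sketched as a two-pass scan over
a stack of open blocks. The struct block, its fields, and
close_block() are hypothetical simplifications; the real parser
works on a syntax tree with considerably more state.

```c
#include <stddef.h>

/* Hypothetical open block: which macro opened it, and a flag. */
struct block {
	int	 tok;		/* identifies the kind of block */
	int	 broken;	/* set when an outer block ends first */
};

/*
 * stack[0..n-1] holds the open blocks, innermost last.
 * Returns the index of the block that the end macro closes,
 * or -1 if no open block matches.
 */
int
close_block(struct block *stack, int n, int tok)
{
	int	 i, match;

	/* First pass: is there a matching open block at all? */
	match = -1;
	for (i = n - 1; i >= 0; i--) {
		if (stack[i].tok == tok) {
			match = i;
			break;
		}
	}
	if (match == -1)
		return -1;	/* stray end macro: mark nothing */

	/* Second pass: blocks opened after the match are broken. */
	for (i = n - 1; i > match; i--)
		stack[i].broken = 1;
	return match;
}
```

Marking blocks only after the match is known means a stray end
macro leaves the flags of all open blocks untouched.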
|
|
|
|
|
| |
ignore body end markers of lists breaking other blocks.
Fixing a logical error that caused a NULL deref found by tb@ with afl(1).
|
|
|
|
|
|
| |
continue scanning upwards, because the enclosing block might already
be pending as well, e.g. .Bl .Bl .It Bo .El .It.
Tree corruption leading to a later NULL deref found by tb@ with afl(1).
|
|
|
|
|
|
|
|
| |
level, validation must be separated from parsing and rewinding.
This first big step moves calling of the mdoc(7) post_*() functions
out of the parser loop into their own mdoc_validate() pass, while
using a new mdoc_state() module to make syntax tree state handling
available to both the parser loop and the validation pass.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
in .Bl -column; it took me more than a day to get this right.
Triggered by a loosely related bug report from tim@.
The lesson for you is: Use .Ta macros in .Bl -column, avoid tabs,
or you are in for surprises: The last word before a tab is not
interpreted as a macro (unless there is a blank in between), the
first word after a tab isn't either (unless there is a blank in
between), and a blank after a tab causes a leading blank in the
respective output cell. Yes, "blank", "tab", "blank tab" and "tab
blank" all have different semantics; if you write code relying on
that, good luck maintaining it afterwards...
|
|
|
|
|
|
|
|
| |
calls phrase_ta() to handle a .Ta child macro, advance the body
pointer accordingly, such that a subsequent tab character rewinds
the right body block and doesn't fail an assertion. That happened
when there was nothing between the .Ta and the tab character.
Bug reported by tim@ some time ago.
|
|
|
|
|
|
| |
that were right between two adjacent case statements. Keep only
those 24 where the first case actually executes some code before
falling through to the next case.
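
The distinction is easy to see in a toy switch. The function
weight() is invented for this illustration.

```c
/*
 * Hypothetical classifier showing where a FALLTHROUGH comment
 * carries information and where it is just noise.
 */
int
weight(char c)
{
	int	 w;

	w = 0;
	switch (c) {
	case 'a':	/* adjacent labels, nothing runs in between: */
	case 'b':	/* no comment is needed here */
		w += 1;
		/* FALLTHROUGH */
	case 'c':	/* reached directly, or after the code above */
		w += 10;
		break;
	default:
		break;
	}
	return w;
}
```

Here 'a' and 'b' both yield 11 and 'c' yields 10; only the
transition into case 'c' executes code before falling through, so
only that one keeps its comment.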
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
set ROFF_NEXT_CHILD, which is desirable for the final call to
mdoc_valid_post() - in case the target itself gets deleted, the
parse point may need this adjustment - but not for the intermediate
calls - if intermediate nodes get deleted, that mustn't clobber the
parse point. So move setting ROFF_NEXT_SIBLING to the proper place
in rew_last().
This fixes the assertion failure in jsg@'s afl test case 108/Apr27.
|
|
|
|
|
|
|
|
| |
weird place. Move it to the obviously correct place.
Surprisingly, this didn't cause any misformatting in the test suite
or in any base system manuals, but i cannot believe the code was
really correct for all conceivable input, and it would be very hard
to verify. At the very least, it cannot have worked for man(7).
|
|
|
|
|
| |
not just the body. In some unusual edge cases, this caused
the .Pp to become a sibling of the .Nm body inside the .Nm block.
|
|
|
|
|
|
|
| |
scope of the end macro. Instead, only keep the tail scope open if
the end macro calls an explicit macro and actually breaks
that. This corrects syntax tree structure and fixes an assertion
found by jsg@ with afl (test case 098/Apr27).
|
|
|
|
|
|
| |
a mismatching explicit end macro without actually being broken.
Avoids a subsequent upward search for the non-existent breaker
ending up in a NULL pointer access; afl test case 005/Apr27 from jsg@.
|
| |
|
|
|
|
| |
Bug reported by jsg@.
|
|
|
|
|
|
|
| |
* man_elem_alloc() -> roff_elem_alloc()
* man_block_alloc() -> roff_block_alloc()
The functions mdoc_elem_alloc() and mdoc_block_alloc() remain for
now because they need to do mdoc(7)-specific argument processing.
|
|
|
|
|
|
|
|
| |
* mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
* mdoc_word_append(), man_word_append() -> roff_word_append()
* mdoc_addspan(), man_addspan() -> roff_addtbl()
* mdoc_addeqn(), man_addeqn() -> roff_addeqn()
Minus 50 lines of code, no functional change.
|
|
|
|
|
|
| |
high-level parsers to allow further unification of functions that
only need to recognize this code, but that don't care about different
high-level macrosets beyond that.
|
|
|
|
|
|
|
|
|
|
|
| |
* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
* node_append() for mdoc and man_node_append() -> roff_node_append()
* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
* mdoc_node_free() and man_node_free() -> roff_node_free()
* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
Minus 130 lines of code, no functional change.
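
What a shared set of node functions looks like can be sketched
with a generic tree. The struct node layout and the names
tree_alloc(), tree_append() and tree_unlink() are hypothetical and
much smaller than the real roff_node interface.

```c
#include <stdlib.h>

/* Hypothetical generic node, usable by any macro set. */
struct node {
	struct node	*parent;
	struct node	*child;	/* first child */
	struct node	*last;	/* last child */
	struct node	*next;	/* next sibling */
	struct node	*prev;	/* previous sibling */
	int		 tok;	/* token that created the node */
};

/* Allocate a zeroed node; returns NULL if allocation fails. */
struct node *
tree_alloc(int tok)
{
	struct node	*n;

	if ((n = calloc(1, sizeof(*n))) != NULL)
		n->tok = tok;
	return n;
}

/* Append n as the last child of parent. */
void
tree_append(struct node *parent, struct node *n)
{
	n->parent = parent;
	n->prev = parent->last;
	n->next = NULL;
	if (parent->last != NULL)
		parent->last->next = n;
	else
		parent->child = n;
	parent->last = n;
}

/* Detach n from parent and siblings; its own subtree stays intact. */
void
tree_unlink(struct node *n)
{
	if (n->prev != NULL)
		n->prev->next = n->next;
	else if (n->parent != NULL)
		n->parent->child = n->next;
	if (n->next != NULL)
		n->next->prev = n->prev;
	else if (n->parent != NULL)
		n->parent->last = n->prev;
	n->parent = n->prev = n->next = NULL;
}
```

None of these operations depends on what the token means, which is
why one copy can serve more than one parser.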
|
|
|
|
|
| |
Almost completely mechanical, no functional change.
Written on the train from Exeter to London returning from p2k15.
|
|
|
|
|
| |
the end macro of a broken block, put all of it into the breaking block.
Needed for example by mutella(1).
|
|
|
|
|
| |
Both partial and full implicit blocks can break explicit blocks.
Put the code to handle both cases into a common function.
|
|
|
|
|
|
|
|
| |
must go inside the breaking block. For example, in
.It Ic cmd Oo
.Ar optional_arg Oc Ar mandatory_arg
the mandatory_arg is still inside the .It block.
Used for example by mutella(1).
|