mandoc - UNIX manpage compiler toolset

	Commit message	Author	Age	Files	Lines
*	Avoid the layering violation of re-parsing for \E in roff_expand().	Ingo Schwarze	2022-06-02	1	-1/+1
	To that end, add another argument to roff_escape() returning the index of the escape name. This also makes the code in roff_escape() a bit more uniform in so far as it no longer needs the "char esc_name" local variable but now does everything with indices into buf[]. No functional change.
*	Make roff_expand() parse left-to-right rather than right-to-left.	Ingo Schwarze	2022-05-19	1	-1/+3
	Some escape sequences have side effects on global state, implying that the order of evaluation matters. For example, this fixes the long-standing bug that "\n+x\n+x\n+x" after ".nr x 0 1" used to print "321"; now it correctly prints "123".
	Right-to-left parsing was convenient because it implicitly handled nested escape sequences. With correct left-to-right parsing, nesting now requires an explicit implementation, here solved as follows:
	1. Handle nested expanding escape sequences iteratively. When finding one, expand it, then retry parsing the enclosing escape sequence from the beginning, which will ultimately succeed as soon as it no longer contains any nested expanding escape sequences.
	2. Handle nested non-expanding escape sequences recursively. When finding one, the escape sequence parser calls itself to find the end of the inner sequence, then continues parsing the outer sequence after that point.
	This requires the mandoc_escape() function to operate in two different modes. The roff(7) parser uses it in a mode where it generates diagnostics and may return an expansion request instead of a parse result. All other callers, in particular the formatters, use it in a simpler mode that never generates diagnostics and always returns a definite parsing result, but that requires all expanding escape sequences to already have been expanded earlier. The bulk of the code is the same for both modes.
	Since this required a major rewrite of the function anyway, move it into its own new file roff_escape.c and out of the file mandoc.c, which was misnamed in the first place and lacks a clear focus.
	As a side benefit, this also fixes a number of assertion failures that tb@ found with afl(1), for example "\n\\\\0", "\v\-\\0", and "\w\-\\\\\$0*0". As another side benefit, it also resolves some code duplication between mandoc_escape() and roff_expand() and centralizes all handling of escape sequences (except for expansion) in roff_escape.c, hopefully easing maintenance and feature improvements in the future.
	While here, also move end-of-input handling out of the complicated function roff_expand() and into the simpler function roff_parse_comment(), making the logic easier to understand.
	Since this is a major reorganization of a central component of mandoc(1), stability of the program might slightly suffer for a few weeks, but i believe that's not a problem at this point of the release cycle. The new code already satisfies the regression suite, but more tweaking and regression testing to further improve the handling of various escape sequences will likely follow in the near future.
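The effect of evaluation order described in this commit message can be illustrated with a toy model. The sketch below is not mandoc code: struct toy_reg, TOY_ESC, toy_expand_ltr() and toy_expand_rtl() are hypothetical names, and the model only understands the single escape sequence \n+x on one register.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical toy model of one number register with auto-increment. */
struct toy_reg {
	int val;	/* current value, as set by ".nr x 0 1" */
	int inc;	/* step applied by each \n+x */
};

#define TOY_ESC "\\n+x"	/* the four characters \n+x */

/* Expand every \n+x while scanning the input from left to right. */
static void
toy_expand_ltr(const char *src, struct toy_reg *r, char *dst, size_t dstsz)
{
	size_t len = strlen(TOY_ESC), o = 0;

	assert(dst != NULL && dstsz > 0);
	while (*src != '\0' && o + 1 < dstsz) {
		if (strncmp(src, TOY_ESC, len) == 0) {
			int n;

			r->val += r->inc;	/* side effect on global state */
			n = snprintf(dst + o, dstsz - o, "%d", r->val);
			if (n < 0 || (size_t)n >= dstsz - o)
				break;
			o += (size_t)n;
			src += len;
		} else
			dst[o++] = *src++;
	}
	dst[o] = '\0';
}

/* Expand every \n+x starting with the last one, replacing in place. */
static void
toy_expand_rtl(const char *src, struct toy_reg *r, char *dst, size_t dstsz)
{
	char num[32];
	size_t len = strlen(TOY_ESC), n = strlen(src), i;

	assert(dst != NULL && dstsz > n);
	memcpy(dst, src, n + 1);
	for (i = n; i-- > 0; ) {
		int d;

		if (strncmp(dst + i, TOY_ESC, len) != 0)
			continue;
		r->val += r->inc;	/* same side effect, wrong order */
		d = snprintf(num, sizeof(num), "%d", r->val);
		if (d < 0 || (size_t)d > len)
			break;		/* toy limit: number must fit the escape */
		memmove(dst + i + d, dst + i + len, strlen(dst + i + len) + 1);
		memcpy(dst + i, num, (size_t)d);
	}
}
```

Starting from a register value of 0 with increment 1, the left-to-right variant turns "\n+x\n+x\n+x" into "123", while the right-to-left variant yields "321", matching the behaviour change described above.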
*	store the operating system name obtained from uname(3) in the adequate	Ingo Schwarze	2021-10-04	1	-0/+1
	struct together with similar state data rather than in a function-scope static variable, such that it can be free(3)d in roff_man_free(); no functional change
*	provide a STYLE message when mandoc knows the file name and the extension	Ingo Schwarze	2020-04-24	1	-2/+3
	disagrees with the section number given in the .Dt or .TH macro; feature suggested and patch tested by jmc@
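A check of this kind might look roughly like the following sketch. The function name toy_sec_mismatch() is hypothetical, and comparing the extension to the section with an exact string match is an assumption; the commit message does not state the precise rule mandoc applies.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper, not the mandoc implementation.
 * Return 1 if the file name has an extension that differs from
 * the section string taken from .Dt or .TH, and 0 otherwise.
 */
static int
toy_sec_mismatch(const char *fname, const char *sec)
{
	const char *base, *dot;

	assert(fname != NULL && sec != NULL);

	/* Only look at the last path component. */
	base = strrchr(fname, '/');
	base = base == NULL ? fname : base + 1;

	/* No extension: nothing to compare, so no message. */
	dot = strrchr(base, '.');
	if (dot == NULL || dot[1] == '\0')
		return 0;

	/* Assumption: exact match between extension and section. */
	return strcmp(dot + 1, sec) != 0;
}
```

With this sketch, a file named ls.1 declaring section 8 would be flagged, while ls.1 declaring section 1 would not.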
*	Some high-level block macros have an effect similar to temporarily	Ingo Schwarze	2019-01-05	1	-1/+2
	suspending no-fill mode during their head. Model this with an additional roff parser state flag ROFF_NONOFILL. That is much simpler than it would be to save and restore the ROFF_NOFILL flag itself, in particular since the latter can be switched (with lasting effect) by the .nf and .fi requests even while its effect is temporarily suspended. This commit does not change formatting yet, but prepares for future formatting simplifications and improvements.
*	Cleanup, no functional change:	Ingo Schwarze	2018-12-31	1	-1/+0
	Use the new parser flag ROFF_NOFILL in the mdoc(7) parser, too, instead of the old MDOC_LITERAL, which was an alias for the former MAN_LITERAL.
*	Move parsing of the .nf and .fi (fill mode) requests from the man(7)	Ingo Schwarze	2018-12-31	1	-2/+2
	parser to the roff(7) parser. As a side effect, .nf and .fi are now also parsed in mdoc(7) input, though the mdoc(7) formatters still ignore most of their effect.
*	Cleanup, minus 15 LOC, no functional change:	Ingo Schwarze	2018-12-31	1	-0/+1
	Simplify the way the man(7) and mdoc(7) validators are called. Reset the parser state with a common function before calling them. There is no need to again reset the parser state afterwards because the parsers are no longer used after validation. This allows getting rid of man_node_validate() and mdoc_node_validate() as separate functions.
*	Cleanup, no functional change:	Ingo Schwarze	2018-12-30	1	-0/+46
	The struct roff_man used to be a bad mixture of internal parser state and public parsing results. Move the public results to the parsing result struct roff_meta, which is already public. Move the rest of struct roff_man to the parser-internal header roff_int.h. Since the validators need access to the parser state, call them from the top level parser during mparse_result() rather than from the main programs, also reducing code duplication. This keeps parser internal state out of the main programs (five in mandoc portable) and out of eight formatters.
*	Cleanup, no functional change:	Ingo Schwarze	2018-12-13	1	-1/+7
	Move the roffhash_*() functions from roff.h to roff_int.h because they are only intended for use by parsers, neither by main programs nor by formatters.
*	Clean up the validation of .Pp, .PP, .sp, and .br. Make sure all	Ingo Schwarze	2018-12-04	1	-1/+2
	combinations are handled, and are handled in a systematic manner. This resolves some erratic duplicate handling, handles a number of missing cases, and improves diagnostics in various respects. Move validation of .br and .sp to the roff validation module rather than doing that twice in the mdoc and man validation modules. Move the node relinking function to the roff library where it belongs. In validation functions, only look at the node itself, at previous nodes, and at descendants, not at following nodes or ancestors, such that only nodes are inspected which are already validated.
*	Simplify by creating struct roff_node syntax tree nodes for tbl(7)	Ingo Schwarze	2017-07-08	1	-1/+0
	right from roff_parseln() rather than delegating to read.c, similar to what i just did for eqn(7). The interface function roff_span() becomes obsolete and is deleted, the former interface function roff_addtbl() becomes static, the interface functions tbl_read() and tbl_cdata() become void, and minus twelve lines of code. No functional change.
*	1. Eliminate struct eqn, instead use the existing members	Ingo Schwarze	2017-07-08	1	-1/+0
	of struct roff_node which is allocated for each equation anyway.
	2. Do not keep a list of equation parsers, one parser is enough.
	Minus fifty lines of code, no functional change.
*	In private header files, __BEGIN_DECLS and __END_DECLS are pointless.	Ingo Schwarze	2015-11-07	1	-4/+0
	Because these work slightly differently on different systems, they are becoming a maintenance burden in the portable version, so delete them. Besides, one of the chief design goals of the mandoc toolbox is to make sure that nothing related to documentation requires C++. Consequently, linking mandoc against any kind of C++ program would defeat the purpose and is not supported. I don't understand why kristaps@ added them in the first place.
*	move man(7) validation into the dedicated validation phase, too	Ingo Schwarze	2015-10-22	1	-1/+0
*	Move all mdoc(7) node validation done before child parsing	Ingo Schwarze	2015-10-21	1	-2/+0
	to the new separate validation pass, except for a tiny bit needed by the parser which goes to the new mdoc_state() module; cleaner, simpler, and surprisingly also shorter by 15 lines.
*	In order to become able to generate syntax tree nodes on the roff(7)	Ingo Schwarze	2015-10-20	1	-1/+0
	level, validation must be separated from parsing and rewinding. This first big step moves calling of the mdoc(7) post_*() functions out of the parser loop into their own mdoc_validate() pass, while using a new mdoc_state() module to make syntax tree state handling available to both the parser loop and the validation pass.
*	Unify trickier node handling functions.	Ingo Schwarze	2015-04-19	1	-0/+2
	* man_elem_alloc() -> roff_elem_alloc()
	* man_block_alloc() -> roff_block_alloc()
	The functions mdoc_elem_alloc() and mdoc_block_alloc() remain for now because they need to do mdoc(7)-specific argument processing.
*	Unify some node handling functions that use TOKEN_NONE.	Ingo Schwarze	2015-04-19	1	-2/+19
	* mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
	* mdoc_word_append(), man_word_append() -> roff_word_append()
	* mdoc_addspan(), man_addspan() -> roff_addtbl()
	* mdoc_addeqn(), man_addeqn() -> roff_addeqn()
	Minus 50 lines of code, no functional change.
*	Unify node handling functions:	Ingo Schwarze	2015-04-19	1	-0/+30
	* node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
	* node_append() for mdoc and man_node_append() -> roff_node_append()
	* mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
	* mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
	* mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
	* mdoc_node_free() and man_node_free() -> roff_node_free()
	* mdoc_node_delete() and man_node_delete() -> roff_node_delete()
	Minus 130 lines of code, no functional change.