mandoc - UNIX manpage compiler toolset

	Commit message (Collapse)	Author	Age	Files	Lines
*	Up to version 1.22.4, groff_mdoc(7) only considered the first word	Ingo Schwarze	2022-08-19	4	-13/+8
\| \| \| \| \| \| \| \| \| \| \| \|	when comparing section headers. For example, ".Sh SEE ELSEWHERE" and ".Sh SEE Em ALSO" were considered instances of a SEE ALSO section. In groff-current, exact matches with no sub-macros are required. Adjust mandoc behaviour. While here, also fix a very minor mandoc bug, even though no detrimental effect of the bug on formatting is known. While using sub-macros in the .Sh HEAD is bad style, the parsers accept it, so setting the section attribute on the HEAD needs to act recursively.
*	Adjust desired output after the bugfix man.c rev. 1.189.	Ingo Schwarze	2022-08-16	1	-1/+1
\| \| \| \|	The new version of the output file was generated with groff-current.
*	New tests of tabs in fill mode, in particular	Ingo Schwarze	2022-08-16	3	-2/+140
\| \| \| \|	when multiple input or output lines are involved.
*	Adjust the desired output after the improvements in term.c rev. 1.290.	Ingo Schwarze	2022-08-16	1	-1/+1
\| \| \| \| \| \|	The new version of this file was generated with groff-current. Heirloom nroff produces exactly the same output for the content of the DESCRIPTION.
*	Some more tests of no-fill mode similar to mdoc/Bd/blank.in	Ingo Schwarze	2022-08-15	2	-10/+29
\| \| \| \|	after vertical spacing was improved in man_term.c rev. 1.239.
*	Distinguish between escape sequences that produce no output	Ingo Schwarze	2022-08-15	4	-8/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	whatsoever (for example \fR) and escape sequences that produce invisible zero-width output (for example \&). No, i'm not joking, groff does make that distinction, and it has consequences in some situations, for example for vertical spacing in no-fill mode. Heirloom and Plan 9 behaviour is subtly different, but in case of doubt, we want to follow groff. While this fixes the behaviour for the majority of escape sequences, in particular for those most likely to occur in practice, it is not perfect yet because some of the more exotic ESCAPE_IGNORE sequences are actually of the "no output whatsoever" type but treated as "invisible zero-width" for now. With the new ASCII_NBRZW mechanism in place, switching them over one by one when the need arises will no longer be very difficult.
*	If the body of a man(7) .MT or .UR block is empty, do not emit a warning.	Ingo Schwarze	2022-08-02	2	-4/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Leaving the body empty is legitimate in this case if the author only wants to display a mail address or URI without providing a link text. Output modules already handle this correctly: terminal output shows just the URI without an accompanying text, HTML output uses the URI for both the href= attribute and as the content of the <a> element. The documentation was also wrong and claimed that an .MT or .UR block with an empty body would produce no output. As explained above, this isn't true. Bogus warning reported by Alejandro Colomar <alx dot manpages at gmail dot com>.
*	Delete OpenBSD-only rules from the regress/roff/de Makefile	Ingo Schwarze	2022-08-02	1	-38/+0
\| \| \| \| \|	after they were changed in OpenBSD. Tracking these rules here would be useless.
*	While the HTML standard allows multiple <h1> elements in the same	Ingo Schwarze	2022-07-06	9	-25/+25
\| \| \| \| \| \| \| \| \| \| \| \|	document, <h1> is intended for top level headers, and most of the sections in a manual page can hardly be considered top-level. It is more usual to use <h1> only for the main title of the document or for the site name. Consequently, move .Sh/.SH from <h1> to <h2> and .Ss/.SS from <h2> to <h3>, freeing <h1> for use by header.html in man.cgi(8). Discussed with Anna Vyalkova <cyber at sysrq dot in>.
*	In groff commit 78e66624 on May 7 20:15:33 2021 +1000,	Ingo Schwarze	2022-06-26	1	-1/+1
\| \| \| \| \| \|	G. Branden Robinson changed the -T ascii rendering of \(sd, the "second" symbol, U+2033 DOUBLE PRIME, from '' to ". Follow suit in mandoc.
*	Surprisingly, every escape sequence can also be used as an argument	Ingo Schwarze	2022-06-08	28	-27/+1005
\| \| \| \| \| \| \|	delimiter for an outer escape sequence, in which case the delimiting escape sequence retains its syntax but usually ignores its argument and loses its inherent effect. Add rudimentary support for this syntax quirk in order to improve parsing compatibility with groff.
*	Split the excessively generic diagnostic message "invalid escape sequence"	Ingo Schwarze	2022-06-07	6	-9/+9
\| \| \| \| \|	into the more specific messages "invalid escape argument delimiter" and "invalid escape sequence argument".
*	adjust two desired error messages after roff_escape.c rev. 1.11	Ingo Schwarze	2022-06-06	1	-2/+2
\| \| \| \|	improved diagnostics for the \C escape sequence
*	With the improved escape sequence parser, it becomes easy to also improve	Ingo Schwarze	2022-06-05	12	-70/+70
\| \| \| \| \| \| \| \| \|	diagnostics. Distinguish "incomplete escape sequence", "invalid special character", and "unknown special character" from the generic "invalid escape sequence", also promoting them from WARNING to ERROR because incomplete escape sequences are severe syntax violations and because encountering an invalid or unknown special character makes it likely that part of the document content intended by the authors gets lost.
*	During identifier parsing, handle undefined escape sequences	Ingo Schwarze	2022-06-03	17	-48/+174
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	in the same way as groff:
	* \\ is always reduced to \
	* \. is always reduced to .
	* other undefined escape sequences are usually reduced to the escape name, for example \G to G, except during the expansion of expanding escape sequences having the standard argument form (in particular \* and \n), in which case the backslash is preserved literally.
	Yes, this is confusing indeed. For example, the following have the same meaning:
	* .ds \. and .ds . which is not the same as .ds \\.
	* \[\.] and \[.] which is not the same as \[\\.]
	* .ds \G and .ds G which is not the same as .ds \\G
	* \[\G] and \[\\G] which is not the same as \*[G] <- sic!
	To feel less dirty, have a leaning toothpick, if you are so inclined. This patch also slightly improves the string shown by the "escaped character not allowed in a name" error message.
*	Dummy implementation of the roff(7) \V (interpolate environment variable)	Ingo Schwarze	2022-05-30	4	-3/+32
\| \| \| \| \| \| \| \| \|	escape sequence. This is needed to get \V into the correct parsing class, ESCAPE_EXPAND. It is intentional that mandoc(1) output is not influenced by environment variables, so interpolate the name of the variable with some decorating punctuation rather than interpolating its value.
*	Re-classify the roff(7) \r (reverse line feed) escape sequence	Ingo Schwarze	2022-05-20	4	-5/+31
\| \| \| \| \| \| \|	from "ignore" to "unsupported" because when an input file uses it, mandoc(1) is likely to significantly misformat the output, usually showing parts of the output in a different order than the author intended.
*	Test the handling of some additional one-character escape sequences	Ingo Schwarze	2022-05-20	3	-13/+43
\| \| \| \| \|	that take no argument and are ignored: \% \& \^ \a \d \t \u \{ \\| \} No change to parsing or formatting needed.
*	following the fixed parsing direction of roff_expand() in roff.c rev. 1.388,	Ingo Schwarze	2022-05-19	3	-29/+29
\| \| \| \|	some diagnostics now appear in a more reasonable order, too
*	Adjust a column number in an error message	Ingo Schwarze	2022-05-19	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	after the roff_expand() reorganization in roff.c rev. 1.388. The new parsing direction has two effects: 1. Correct output when a line contains more than one expanding escape sequence that has a side effect. 2. Column numbers in diagnostic messages now report the changed column numbers after any expansions left of them have taken place; in the past, column numbers referred to the original input line. Arguably, item 2 was a bit better in its old state, but slightly less helpful diagnostics are a small price to pay for correct output. Besides, when the expansion of user-defined strings or macros is involved, in many cases, mandoc(1) is already unable to report meaningful line and column numbers, so item 2 is not a noteworthy regression. The effort and code complication for fixing that would probably be excessive, in particular since well-written manual pages are not supposed to use such features in the first place.
*	fix a wrong column number that got fixed as a side effect	Ingo Schwarze	2022-05-19	1	-1/+1
\| \| \| \|	of the roff_expand() reorganization in roff.c rev. 1.388
*	remove a bogus warning that went away as a side effect	Ingo Schwarze	2022-05-19	1	-1/+0
\| \| \| \|	of the roff_expand() reorganization in roff.c rev. 1.388
*	Split a new function roff_parse_comment() out of roff_expand() because this	Ingo Schwarze	2022-05-01	4	-3/+48
\| \| \| \| \| \| \|	functionality is not needed when called from roff_getarg(). This makes the long and complicated function roff_expand() significantly shorter, and also simpler in so far as it no longer needs to return ROFF_APPEND. No functional change intended.
*	Provide a new function roff_req_or_macro() to parse and handle a request	Ingo Schwarze	2022-04-30	8	-8/+126
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	or macro, including context-dependent error handling inside tbl(7) code and inside .ce/.rj blocks. Use it both in the top level roff(7) parser and inside conditional blocks. This fixes an assertion failure triggered by ".if 1 .ce" inside tbl(7) code, found by tb@ using afl(1). As a side benefit for readability, only one place remains in the code that calls the main handler functions for the various roff(7) requests. This patch also improves column numbers in some error messages and various comments.
*	The syntax of the roff(7) .mc request is quite special	Ingo Schwarze	2022-04-28	5	-2/+65
\| \| \| \| \| \| \| \| \|	and the roff_onearg() parsing function is too generic, so provide a dedicated parsing function instead. This fixes an assertion failure when an \o escape sequence is passed as the argument; the bug was found by tb@ using afl(1). It also makes mandoc output more similar to groff in various cases.
*	Element next-line scopes may nest, so man_breakscope() may have to	Ingo Schwarze	2022-04-28	4	-3/+41
\| \| \| \| \| \| \| \|	break multiple element next-line scopes at the same time, similar to what man_descope() already does for unconditional rewinding. This fixes an assertion failure that tb@ found with afl(1), caused by .SH .I .I .BI and similar sequences of macros without arguments.
*	The .AT, .DT, and .UC macros are allowed inside next-line scope	Ingo Schwarze	2022-04-27	10	-2/+98
\| \| \| \| \| \|	and never produce output at the place of their invocation. Minibugs found while investigating unrelated afl(1) reports from tb@.
*	Fix three bugs regarding the interaction of \z and \h:	Ingo Schwarze	2022-04-27	5	-4/+41
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	1. The combination \z\h is a no-op whatever the argument may be. In the past, the \z only affected the first space character generated by the \h, which was wrong. 2. For the combination \zX\h with a positive argument, the first space resulting from the \h is not printed but consumed by the \z. 3. For the combination \zX\h with a negative argument, application of the \z needs to be completed before the \h can be started. In the past, if this combination occurred at the beginning of an output line, the \h backed up to the beginning of the line and after that, the \z attempted to back up even further, triggering an assertion. Bugs found during an audit of assignments to termp->col that i started after the bugfix tbl_term.c rev. 1.65. The assertion triggered by bug 3 was not yet found by afl(1).
*	typo in example text: unsused -> unused; noticed by tb@	Ingo Schwarze	2022-04-26	4	-5/+5
*	At the end of every tbl(7) cell, clear the \z state.	Ingo Schwarze	2022-04-26	5	-4/+63
\| \| \| \| \| \| \| \| \|	This is needed because the TERMP_MULTICOL mode is designed such that term_tbl() buffers all the cells of the table row before the normal reset logic near the end of term_flushln() can be reached. This fixes an assertion failure triggered by \z near the end of a table cell, found by tb@ using afl(1).
*	If a node is tagged explicitly, skip implicit tagging for that node.	Ingo Schwarze	2022-04-26	6	-4/+51
\| \| \| \| \| \| \| \|	Apart from making sense in the first place, this fixes an assertion failure that happened when the calculated implicit tag did not match the string value of the first child of the node. Bug found by tb@ using afl(1).
*	If a .shift request has a negative argument, do not use a negative array	Ingo Schwarze	2022-04-24	3	-6/+13
\| \| \| \| \| \| \| \|	index but use 0 instead of the argument, just like groff. Warn about the invalid argument. While here, fix the column number in another warning message. Segfault reported by tb@, found with afl(1).
*	To prevent infinite recursion while expanding eqn(7) definitions,	Ingo Schwarze	2022-04-13	3	-6/+75
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	we must not reset the recursion counter when moving beyond the end of the previous expansion, but we may only do so when moving beyond the rightmost position reached by any expansion in the current equation. This matters because definitions can nest; consider:
	.EQ
	define inner "content"
	define outer "inner outer"
	outer
	.EN
	This endless loop was found by tb@ using afl(1). Incidentally, GNU eqn(1) also performs an infinite loop in this situation and then crashes when memory runs out, but that's not an excuse for nasty behaviour of mandoc(1). While here, consistently print the expanded content even when the expansion is finally truncated. While that is not likely to help end-users, it may help authors of eqn(7) code to understand what's going on. Besides, it sends a very clear signal that something is amiss, which was easy to miss in the past unless people enabled -W error or used -T lint.
*	Do not die on an assertion if an input file contains no section	Ingo Schwarze	2022-04-13	3	-3/+5
\| \| \| \| \| \| \| \| \|	whatsoever and ends with a broken next-line scope. Obviously, this cannot happen in a real manual page, but mandoc(1) should not die even when fed absurd input. This bug was independently reported by both jsg@ and tb@ who both found it with afl(1).
*	Surprisingly, groff supports multiple copy mode escapes at the	Ingo Schwarze	2022-04-13	3	-2/+50
\| \| \| \| \| \| \| \| \| \| \| \| \|	beginning of an escape sequence: \, \E, \EE, \EEE, and so on all do the same outside copy mode, so let them do the same in mandoc(1), too. This fixes an assertion failure triggered by \EEX that tb@ found with afl(1). The first E was consumed by roff_expand(), but that function failed to recognize the escape sequence as the expansion of a user-defined string and handed it over to mandoc_escape(), which consumed the second E and then died on an assertion because it is not prepared to handle user-defined strings. Fix this by letting both functions handle arbitrary numbers of 'E's correctly.
*	do not use the sed(1) -i option, it is not portable;	Ingo Schwarze	2021-09-19	1	-5/+6
\| \| \| \|	issue found on Oracle Solaris 11
*	Correctly calculate required column widths for tables containing	Ingo Schwarze	2021-09-07	2	-3/+20
\| \| \| \| \| \| \|	cells that horizontally span columns which contain "n" (number) formatted cells on other rows. This requires updating total column widths from "n" formatted cells before starting width distribution from the spanning cells to their constituent columns.
*	Support two-character font names (BI, CW, CR, CB, CI)	Ingo Schwarze	2021-08-10	10	-23/+55
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	in the tbl(7) layout font modifier. Get rid of the TBL_CELL_BOLD and TBL_CELL_ITALIC flags and use the usual ESCAPE_FONT* enum mandoc_esc members from mandoc.h instead, which simplifies and unifies some code. While here, also support CB and CI in roff(7) \f escape sequences and in roff(7) .ft requests for all output modes. Using those is certainly not recommended because portability is limited even with groff, but supporting them makes some existing third-party manual pages look better, in particular in HTML output mode. Bug-compatible with groff as far as i'm aware, except that i consider font names starting with the '\n' (ASCII 0x0a line feed) character so insane that i decided to not support them. Missing feature reported by nabijaczleweli dot xyz in https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=992002. I used none of the code from the initial patch submitted by nabijaczleweli, but some of their ideas. Final patch tested by them, too.
*	delete the two pairs of extra blank lines from expected man(7) terminal	Ingo Schwarze	2021-06-28	216	-864/+22
\| \| \| \|	output that are no longer printed since man_term.c rev. 1.236
*	test private use areas some more as they have proven fragile	Ingo Schwarze	2021-06-02	8	-38/+74
*	Cleanup:	Ingo Schwarze	2021-06-02	4	-53/+57
\| \| \| \| \| \| \| \| \| \|	1. Move invalid two-byte sequences after valid ones and make their descriptions easier to understand. 2. Replace the wrong and confusing expression "middle byte" with the correct term "start byte". 3. Add test lines for U+EFFFF and U+F0000. 4. Replace the unhelpful word "strange" with more descriptive terms. Arguably, nothing about this (or maybe everything?) is strange.
*	The wcwidth(3) of Plane 15 and Plane 16 Private Use Characters	Ingo Schwarze	2021-06-02	2	-4/+4
\| \| \| \| \|	was changed from 0 to 1. Adjust the test results accordingly. Issue reported by bluhm@
*	test font modifiers in the layout; related to tbl_html.c rev. 1.34	Ingo Schwarze	2021-05-16	4	-2/+49
*	In HTML output, correctly render .Bd -unfilled in proportionally-spaced	Ingo Schwarze	2021-03-30	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	font, rather than with the monospace font appropriate for .Bd -literal. This fixes a minibug reported by anton@. Implemented by no longer relying on the typical browser default of "pre { font-family: monospace }" but instead letting <pre> elements inherit the font family from their parent, then adding an explicit CSS .Li class only for those displays where the manual page author requested it by using the -literal option on the .Bd macro.
*	Rename syntax test of the \O escape sequence (suppress output groff	Ingo Schwarze	2020-12-21	6	-26/+26
\| \| \| \| \| \|	extension; mandoc only implements syntax checking but ignores the sequence) to please Bill Gates and didickman@: avoid path names that only differ by case, like o.in vs. O.in.
*	The GNU tbl(1) program contained in the groff package internally	Ingo Schwarze	2020-10-25	6	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	uses roff(7) tabulator settings to implement tables, and it used to leak the changed tabulator settings from tables to the subsequent roff(7) code. In mandoc/tbl_term.c rev. 1.54 (June 17, 2017), code was added to be bug-compatible with groff. In commit d0e03cf6 (Oct 20, 2020), GNU tbl(1) changed behaviour to save the tabulator settings before starting a table and restore them afterwards. Adjust mandoc for compatibility. Since mandoc implements tables without using roff(7) tabulator settings, saving and restoring tabulator settings is not needed in mandoc. Simply deleting the code that changed tabulator settings by reverting tbl_term.c rev. 1.54 is sufficient in mandoc. Also adjust the desired output of the regression tests to match the new behaviour of both groff and mandoc.
*	Treat \[.T] in the same way as \(.T rather than calling abort(3).	Ingo Schwarze	2020-10-24	7	-13/+20
\| \| \| \| \|	Bug found because the groff-current manual pages started using the variant form of this predefined string.
*	In HTML output, avoid printing a newline right after <pre>	Ingo Schwarze	2020-10-16	10	-97/+47
\| \| \| \| \| \| \| \|	and right before </pre> because that resulted in vertical whitespace not requested by the manual page author. Formatting bug reported by Aman Verma <amanraoverma plus vim at gmail dot com> on discuss@.
*	Element next-line scopes can nest. Consequently, even when closing	Ingo Schwarze	2020-09-09	5	-10/+34
\| \| \| \| \| \| \| \|	one element next-line scope, the MAN_ELINE flag must not yet be cleared if the parent macro is another element macro having next-line scope, or an assertion failure is caused if all this is wrapped in another macro that has block next-line scope, for example .TP. Bug found in an afl run performed by Jan Schreiber <jes at posteo dot de>.
*	Fix two issues with .po (page offset) formatting:	Ingo Schwarze	2020-09-03	3	-2/+53
\| \| \| \| \| \| \| \| \| \| \| \|	1. Truncate excessive offsets to a width reasonable in the context of manual pages instead of printing excessively long lines and sometimes causing assertion failures; found in an afl run performed by Jan Schreiber <jes at posteo dot de>. 2. Remember both the requested and the applied page offset; otherwise, subtracting an excessive width, then adding it again, would end up with an incorrectly large offset. While here, simplify the code by reverting the previous offset up front, and also add some comments to make the general ideas easier to understand.
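
The second .po item above (remember both the requested and the applied page offset) can be illustrated with a minimal C sketch. This is not the code from the commit: the names po_state, po_clamp(), po_adjust() and the limit PO_MAX are hypothetical and chosen only for illustration. The idea shown is that relative changes are applied to the requested value, and the applied value is always derived from it by truncation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical upper limit in columns; not mandoc's actual value. */
#define PO_MAX 60

/* Hypothetical state: keep the requested and the applied offset apart. */
struct po_state {
	int requested;	/* offset as asked for by the document */
	int applied;	/* offset in effect after truncation */
};

/* Truncate a requested offset to the range [0, PO_MAX]. */
static int
po_clamp(int requested)
{
	if (requested < 0)
		return 0;
	if (requested > PO_MAX)
		return PO_MAX;
	return requested;
}

/*
 * Apply a relative change, as a request like ".po +N" or ".po -N" would.
 * The arithmetic happens on the requested value, so subtracting an
 * excessive width and then adding it again returns to the old offset
 * instead of ending up with an incorrectly large one.
 */
static void
po_adjust(struct po_state *po, int delta)
{
	assert(po != NULL);
	po->requested += delta;
	po->applied = po_clamp(po->requested);
}
```

Starting from a zero offset, calling po_adjust(&po, -1000) and then po_adjust(&po, 1000) leaves the applied offset at 0. If only the truncated value were remembered, the second call would compute 0 + 1000 and print with the maximum offset.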