path: root/regress/roff
Commit message | Author | Age | Files | Lines
* Split a new function roff_parse_comment() out of roff_expand() because this functionality is not needed when called from roff_getarg(). This makes the long and complicated function roff_expand() significantly shorter, and also simpler in so far as it no longer needs to return ROFF_APPEND. No functional change intended.
  Ingo Schwarze | 2022-05-01 | 4 files | -3/+48
* Provide a new function roff_req_or_macro() to parse and handle a request or macro, including context-dependent error handling inside tbl(7) code and inside .ce/.rj blocks. Use it both in the top-level roff(7) parser and inside conditional blocks.
  This fixes an assertion failure triggered by ".if 1 .ce" inside tbl(7) code, found by tb@ using afl(1).
  As a side benefit for readability, only one place remains in the code that calls the main handler functions for the various roff(7) requests. This patch also improves column numbers in some error messages and various comments.
  Ingo Schwarze | 2022-04-30 | 3 files | -2/+64
* The syntax of the roff(7) .mc request is quite special and the roff_onearg() parsing function is too generic, so provide a dedicated parsing function instead.
  This fixes an assertion failure when an \o escape sequence is passed as the argument; the bug was found by tb@ using afl(1). It also makes mandoc output more similar to groff in various cases.
  Ingo Schwarze | 2022-04-28 | 5 files | -2/+65
* Fix three bugs regarding the interaction of \z and \h:
  1. The combination \z\h is a no-op whatever the argument may be. In the past, the \z only affected the first space character generated by the \h, which was wrong.
  2. For the combination \zX\h with a positive argument, the first space resulting from the \h is not printed but consumed by the \z.
  3. For the combination \zX\h with a negative argument, application of the \z needs to be completed before the \h can be started. In the past, if this combination occurred at the beginning of an output line, the \h backed up to the beginning of the line and after that, the \z attempted to back up even further, triggering an assertion.
  Bugs found during an audit of assignments to termp->col that I started after the bugfix tbl_term.c rev. 1.65. The assertion triggered by bug 3 was *not* yet found by afl(1).
  Ingo Schwarze | 2022-04-27 | 5 files | -4/+41
* If a .shift request has a negative argument, do not use a negative array index but use 0 instead of the argument, just like groff. Warn about the invalid argument. While here, fix the column number in another warning message.
  Segfault reported by tb@, found with afl(1).
  Ingo Schwarze | 2022-04-24 | 3 files | -6/+13
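The clamping described in this entry can be sketched roughly as follows. This is a minimal illustration, not mandoc's actual code: the helper name shift_levels() and its parameters are hypothetical, and the upper bound (never shifting past the last argument) is an assumption that the commit message does not mention.

```c
#include <assert.h>

/*
 * Hypothetical helper, not taken from mandoc: decide how many
 * macro arguments a .shift request should drop.
 *   requested: the numeric argument given to .shift
 *   argc:      the number of arguments currently available
 */
static int
shift_levels(int requested, int argc)
{
	assert(argc >= 0);
	if (requested < 0)
		return 0;	/* invalid argument: use 0, like groff */
	if (requested > argc)
		return argc;	/* assumption: never shift past the end */
	return requested;
}
```

A caller would presumably also emit the warning about the invalid argument before using the clamped value as an array index.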
* Surprisingly, groff supports multiple copy mode escapes at the beginning of an escape sequence: \, \E, \EE, \EEE, and so on all do the same outside copy mode, so let them do the same in mandoc(1), too.
  This fixes an assertion failure triggered by \EE*X that tb@ found with afl(1). The first E was consumed by roff_expand(), but that function failed to recognize the escape sequence as the expansion of a user-defined string and handed it over to mandoc_escape(), which consumed the second E and then died on an assertion because it is not prepared to handle user-defined strings. Fix this by letting *both* functions handle arbitrary numbers of 'E's correctly.
  Ingo Schwarze | 2022-04-13 | 3 files | -2/+50
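As a rough illustration of "handle arbitrary numbers of 'E's", the prefix of an escape sequence could be skipped like this. The function name skip_escape_prefix() is hypothetical; the real logic lives inside roff_expand() and mandoc_escape() and is more involved.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch: given a pointer to the backslash that starts
 * an escape sequence, return a pointer to the first character after
 * the backslash and after any number of 'E' copy mode escapes.
 */
static const char *
skip_escape_prefix(const char *p)
{
	assert(p != NULL && *p == '\\');
	p++;			/* the backslash itself */
	while (*p == 'E')	/* \E, \EE, \EEE, ... all behave alike */
		p++;
	return p;
}
```

With this approach, \*X and \EE*X both leave the parser looking at "*X", so both can be recognized as the expansion of a user-defined string.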
* Support two-character font names (BI, CW, CR, CB, CI) in the tbl(7) layout font modifier.
  Get rid of the TBL_CELL_BOLD and TBL_CELL_ITALIC flags and use the usual ESCAPE_FONT* enum mandoc_esc members from mandoc.h instead, which simplifies and unifies some code.
  While here, also support CB and CI in roff(7) \f escape sequences and in roff(7) .ft requests for all output modes. Using those is certainly not recommended because portability is limited even with groff, but supporting them makes some existing third-party manual pages look better, in particular in HTML output mode.
  Bug-compatible with groff as far as I'm aware, except that I consider font names starting with the '\n' (ASCII 0x0a line feed) character so insane that I decided to not support them.
  Missing feature reported by nabijaczleweli dot xyz in https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=992002. I used none of the code from the initial patch submitted by nabijaczleweli, but some of their ideas. Final patch tested by them, too.
  Ingo Schwarze | 2021-08-10 | 2 files | -4/+5
* delete the two pairs of extra blank lines from expected man(7) terminal output that are no longer printed since man_term.c rev. 1.236
  Ingo Schwarze | 2021-06-28 | 54 files | -216/+0
* Rename syntax test of the \O escape sequence (suppress output groff extension; mandoc only implements syntax checking but ignores the sequence) to please Bill Gates and didickman@: avoid path names that only differ by case, like o.in vs. O.in.
  Ingo Schwarze | 2020-12-21 | 6 files | -26/+26
* Treat \*[.T] in the same way as \*(.T rather than calling abort(3).
  Bug found because the groff-current manual pages started using the variant form of this predefined string.
  Ingo Schwarze | 2020-10-24 | 7 files | -13/+20
* In HTML output, avoid printing a newline right after <pre> and right before </pre> because that resulted in vertical whitespace not requested by the manual page author.
  Formatting bug reported by Aman Verma <amanraoverma plus vim at gmail dot com> on discuss@.
  Ingo Schwarze | 2020-10-16 | 2 files | -5/+2
* Fix two issues with .po (page offset) formatting:
  1. Truncate excessive offsets to a width reasonable in the context of manual pages instead of printing excessively long lines and sometimes causing assertion failures; found in an afl run performed by Jan Schreiber <jes at posteo dot de>.
  2. Remember both the requested and the applied page offset; otherwise, subtracting an excessive width, then adding it again, would end up with an incorrectly large offset.
  While here, simplify the code by reverting the previous offset up front, and also add some comments to make the general ideas easier to understand.
  Ingo Schwarze | 2020-09-03 | 3 files | -2/+53
* If .ti had an excessive argument, mandoc attempted to use it, in some cases resulting in an assertion failure. Instead, truncate the temporary indent to a width reasonable in a manual page.
  I found the issue in an afl run that was performed by Jan Schreiber <jes at posteo dot de>.
  Ingo Schwarze | 2020-09-03 | 3 files | -2/+49
* Do not indent by SIZE_MAX/2 when .ce occurs inside explicit no-fill mode.
  While here, drop two unused arguments from the function term_field(); the related work was already done by term_fill() before this commit.
  I found the bug in an afl run that was performed by Jan Schreiber <jes at posteo dot de>.
  Ingo Schwarze | 2020-09-02 | 2 files | -4/+20
* Put the code handling \} into a new function roff_cond_checkend() and call that function not only from both places where copies existed - when processing text lines and when processing request/macro lines in conditional block scope - but also when closing a macro definition request, such that this construction works:

    .if n \{.de macroname
    macro content
    .. \} ignored arguments
    .macroname

  This fixes a bug reported by John Gardner <gardnerjohng at gmail dot com>.
  While here, avoid a confusing decrement of the line scope counter in roffnode_cleanscope() for conditional blocks that do not have line scope in the first place (no functional change for this part). Also improve validation of an internal invariant in roff_cblock() and polish some comments.
  Ingo Schwarze | 2020-08-03 | 7 files | -6/+95
* trivial sync with OpenBSD in parts of these files that are not used by -portable; consequently, no functional change
  Ingo Schwarze | 2020-07-30 | 1 file | -6/+6
* adapt to new <p> output logic (html.c rev. 1.260)
  Ingo Schwarze | 2019-09-03 | 4 files | -14/+6
* In HTML output, allow switching the desired font for subsequent text without printing an opening tag right away, and use that in the .ft request handler. While here, garbage collect redundant enum htmlfont and reduce code duplication in print_text().
  Fixing an assertion failure reported by Michael <Stapelberg at Debian> in pmRegisterDerived(3) from libpcp3-dev.
  Ingo Schwarze | 2019-04-30 | 1 file | -5/+4
* When calling an empty macro, do not clobber existing arguments.
  Fixing a bug found with the groffer(1) version 1.19 manual page following a report from Jan Stary.
  Ingo Schwarze | 2019-04-21 | 3 files | -3/+30
* Implement the roff .break request (break out of a .while loop).
  Jan Stary <hans at stare dot cz> found it in an ancient groffer(1) manual page (version 1.19) on MacOS X Mojave.
  Having .break not implemented wasn't a particularly bright idea because obviously, it tended to cause infinite loops.
  Ingo Schwarze | 2019-04-21 | 3 files | -2/+27
* Wrap .Sh/.SH sections and .Ss/.SS subsections in HTML <section> elements as recommended for accessibility by the HTML 5 standard.
  Triggered by a similar, but slightly different suggestion from Laura Morales <lauretas at mail dot com>.
  Ingo Schwarze | 2019-03-01 | 1 file | -1/+1
* Let roff_getname() end the roff identifier at a tab character and audit all its callers whether termination is handled correctly. Resulting improvements:
  * An escape or tab ending the macro name in a macro invocation is discarded, and argument processing is started after it.
  * An escape or tab ending a name in ".if d" and ".if r" is preserved.
  * An escape ending a name in ".ds" causes the whole request to be ignored.
  * A tab ending a name in ".ds" becomes part of the string.
  * An escape or tab ending a name in ".rm" causes the rest of the line to be ignored.
  * An escape or tab ending the first name in ".als", ".rn", or ".nr" causes the whole request to be ignored.
  Kurt Jaeger <pi at FreeBSD> made me aware of https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=235456#c0 and in that bug report, comment 0 item (3) is a special case of this class of issues.
  Yes, the "mh" manual pages are no doubt among the worst on the planet.
  Ingo Schwarze | 2019-02-06 | 17 files | -15/+199
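The termination rule can be pictured with a small sketch. It is only an approximation: the function name_length() is hypothetical, it is not how roff_getname() is written, and treating a blank as a terminator alongside the tab and the escape character is an assumption based on general roff syntax. What each caller then does with the terminating character is the subject of the list above and is not modelled here.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical sketch: return the length of the roff identifier at
 * the start of s.  The name ends at a blank, at a tab character, or
 * at a backslash starting an escape sequence, whichever comes first.
 */
static size_t
name_length(const char *s)
{
	assert(s != NULL);
	return strcspn(s, " \t\\");
}
```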
* Test handling of escaped backslashes because the code related to copy mode is complicated and prone to regressions.
  Ingo Schwarze | 2019-01-17 | 5 files | -2/+97
* Represent mdoc(7) .Pp (and .sp, and some SYNOPSIS and .Rs features) by the <p> HTML element and use the html_fillmode() mechanism for .Bd -unfilled, just like it was done for man(7) earlier, finally getting rid both of the horrible <div class="Pp"></div> hack and of the worst HTML syntax violations caused by nested displays.
  Care is needed because in some situations, paragraphs have to remain open across several subsequent macros, whereas in other situations, they must get closed together with a block containing them.
  Some implementation details include:
  * Always close paragraphs before emitting HTML flow content.
  * Let html_close_paragraph() also close <pre> for extra safety.
  * Drop the old, now unused function print_paragraph().
  * Minor adjustments in the top-level man(7) node formatter for symmetry.
  * Bugfix: .Ss heads suspend no-fill mode, even though .Ss doesn't end it.
  * Bugfix: give up on .Op semantic markup for now, see the comment.
  Ingo Schwarze | 2019-01-07 | 5 files | -30/+14
* Finally, represent the man(7) .PP and .HP macros by the natural choice, which is the <p> HTML element. On top of the previous fill-mode improvements, the key to making this possible is to automatically close the <p> when required: before headers, subsequent paragraphs, lists, indented blocks, synopsis blocks, tbl(7) blocks, and before blocks using no-fill mode.
  In man(7) documents, represent the .sp request by a blank line in no-fill mode and in the same way as .PP in fill mode.
  Ingo Schwarze | 2019-01-06 | 4 files | -3/+53
* test the roff(7) .ce and .rj requests; they were already supported in the past
  Ingo Schwarze | 2019-01-04 | 4 files | -2/+43
* merge a test update from OpenBSD that was forgotten in April
  Ingo Schwarze | 2018-12-21 | 2 files | -1/+11
* Rename mandoc_getarg() to roff_getarg() and pass it the roff parser struct as an argument such that after copy-in, it can call roff_expand() once again, which used to be called roff_res() before this. This fixes a subtle low-level roff(7) parsing bug reported by Fabio Scotoni <fabio at esse dot ch> in the 4.4BSD-Lite2 mdoc.samples(7) manual page, because that page used an escaped escape sequence in a macro argument.
  To expand escaped escape sequences in quoted mdoc(7) arguments, too, stop bypassing the call to roff_getarg() in mdoc_argv.c, function args() for this case. This does not solve the case of escaped escape sequences in quoted .Bl -column phrases yet.
  Because roff_expand() can make the string longer, roff_getarg() can no longer operate in-place but needs to malloc(3) the returned string. In the high-level parsers, free(3) that string after processing it.
  Ingo Schwarze | 2018-12-21 | 1 file | -6/+10
* Yet another round of improvements to manual font selection. Unify handling of \f and .ft. Support \f4 (bold+italic). Support ".ft BI" and ".ft CW" for terminal output. Support the .ft request in HTML output. Reject the bogus fonts \f(C1, \f(C2, \f(C3, and \f(CP. In regress.pl, only strip leading whitespace in math mode.
  Ingo Schwarze | 2018-12-16 | 9 files | -25/+68
* Several improvements to escape sequence handling.
  * Add the missing special character \_ (underscore).
  * Partial implementations of \a (leader character) and \E (uninterpreted escape character).
  * Parse and ignore \r (reverse line feed).
  * Add a WARNING message about undefined escape sequences.
  * Add an UNSUPP message about unsupported escape sequences.
  * Mark \! and \? (transparent throughput) and \O (suppress output) as unsupported.
  * Treat the various variants of zero-width spaces as one-byte escape sequences rather than as special characters, to avoid defining bogus forms with square brackets.
  * For special characters with one-byte names, do not define bogus forms with square brackets, except for \[-], which is valid.
  * In the form with square brackets, undefined special characters do not fall back to printing the name verbatim, not even for one-byte names.
  * Starting a special character name with a blank is an error.
  * Undefined escape sequences never abort formatting of the input string, not even in HTML output mode.
  * Document the newly handled escapes, and a few that were missing.
  * Regression tests for most of the above.
  Ingo Schwarze | 2018-12-15 | 15 files | -38/+218
* Clean up the validation of .Pp, .PP, .sp, and .br. Make sure all combinations are handled, and are handled in a systematic manner. This resolves some erratic duplicate handling, handles a number of missing cases, and improves diagnostics in various respects.
  Move validation of .br and .sp to the roff validation module rather than doing that twice in the mdoc and man validation modules. Move the node relinking function to the roff library where it belongs.
  In validation functions, only look at the node itself, at previous nodes, and at descendants, not at following nodes or ancestors, such that only nodes are inspected which are already validated.
  Ingo Schwarze | 2018-12-04 | 2 files | -0/+3
* When a conditional block is closed by putting "\}" on a text line by itself (which is somewhat unusual but not invalid; most authors use the empty macro line ".\}" instead), agree more closely with groff and do not produce a double space in the output. Quirk reported by millert@.
  While here, tweak the rest of the function body of roff_cond_text() to more closely match roff_cond_sub(). The subtly different handling could make people (including myself) wonder whether there is any point in being different. Testing shows there is not.
  Ingo Schwarze | 2018-11-26 | 3 files | -4/+30
* Rudimentary implementation of the roff(7) .char (output glyph definition) request, used for example by groff_hdtbl(7).
  This simplistic implementation may interact incorrectly with the .tr (input character translation) request. But come on, you are not only using .char *and* .tr, but you do so with respect to the same character in the same manual page?
  Ingo Schwarze | 2018-08-25 | 7 files | -2/+61
* If man(7) next-line scope is open and the line ends with \c, the scope remains open. Needed for example for groff_man(7).
  Ingo Schwarze | 2018-08-25 | 2 files | -3/+29
* Rudimentary implementation of the roff(7) .while request. Needed for example by groff_hdtbl(7).
  There are two limitations: It does not support nested .while requests yet, and each .while loop must start and end in the same scope.
  The roff_parseln() return codes are now more flexible and allow OR'ing options.
  Ingo Schwarze | 2018-08-24 | 16 files | -2/+182
* Implement the roff(7) .shift and .return requests, for example used by groff_hdtbl(7) and groff_mom(7).
  Also correctly interpolate arguments during nested macro execution even after .shift and .return, implemented using a stack of argument arrays.
  Note that only read.c, but not roff.c can detect the end of a macro execution, and the existence of .shift implies that arguments cannot be interpolated up front, so unfortunately, this includes a partial revert of roff.c rev. 1.337, moving argument interpolation back into the function roff_res().
  Ingo Schwarze | 2018-08-23 | 14 files | -8/+184
* Disable one test for now that is broken after the addition of \).
  It is not broken because of \), which is correctly implemented, but the addition merely reveals a hidden bug elsewhere, almost certainly in \\ handling. Given that \\ is among the most mysterious escape sequences and using it is very strongly discouraged in manual pages, fixing that is not urgent - and may be hard.
  Ingo Schwarze | 2018-08-19 | 2 files | -4/+3
* Implement the \*(.T predefined string (interpolate device name) by allowing the preprocessor to pass it through to the formatters. Used for example by the groff_char(7) manual page.
  Ingo Schwarze | 2018-08-16 | 6 files | -1/+90
* Two new low-level roff(7) features:
  * .nr optional third argument (auto-increment step size)
  * \n+ and \n- numerical register auto-increment and -decrement
  bentley@ reported on Dec 9, 2013 that lang/sbcl(1) uses these.
  Ingo Schwarze | 2018-04-10 | 3 files | -1/+50
* When accessing an undefined number register, define it to be zero, like the previous commit for strings and macros, only technically simpler. Desired behaviour also mentioned by Werner Lemberg in 2011.
  This diff adds functionality but is -21 +19 LOC. :-)
  Ingo Schwarze | 2018-04-09 | 3 files | -1/+46
* Using an undefined string or macro will cause it to be defined as empty. Observed by Werner Lemberg on Nov 14, 2011 and rotting on my TODO list ever since.
  Ingo Schwarze | 2018-04-09 | 4 files | -2/+110
* catch up with ASCII renderings in chars.c rev. 1.72
  Ingo Schwarze | 2017-08-23 | 2 files | -10/+14
* Now that we have the -Wstyle message level, downgrade to style suggestions six warnings that are not syntax mistakes and that do not cause wrong formatting or content. Also upgrade to errors two warnings that may cause information loss.
  Ingo Schwarze | 2017-07-06 | 5 files | -44/+44
* Fix handling of \} on roff request lines. Cures bogus error messages in pages generated with pod2man(1).
  Ingo Schwarze | 2017-07-04 | 2 files | -3/+12
* Messages of the -Wbase level now print STYLE:. Since this causes horrible churn anyway, take the opportunity to stop excessive testing, such that this is hopefully the last instance of such churn.
  Consistently use OpenBSD RCS tags, blank .Os, blank fourth .TH argument, and Mdocdate like everywhere else. Use -Ios=OpenBSD for platform-independent predictable output.
  Ingo Schwarze | 2017-07-04 | 145 files | -315/+341
* cope with changes in BASE messages
  Ingo Schwarze | 2017-06-25 | 7 files | -9/+11
* cope with changes in BASE messages
  Ingo Schwarze | 2017-06-25 | 18 files | -29/+40
* Implement appending to standard man(7) and mdoc(7) macros with .am.
  With roff_getstrn(), provide finer control which definitions can be used for what:
  * All definitions can be used for .if d tests and .am appending.
  * User-defined for \* expansion, .dei expansion, and macro calling.
  * Predefined for \* expansion.
  * Standard macros, original or renamed, for macro calling.
  Several related improvements while here:
  * Do not return string table entries that have explicitly been removed.
  * Do not create a rentab entry when trying to rename a non-existent macro.
  * Clear an existing rentab entry when the external interface roff_setstr() is called with its name.
  * Avoid trailing blanks in macro lines generated from renamed and from aliased macros.
  * Delete the duplicate __m*_reserved[] tables, just use roff_name[].
  Ingo Schwarze | 2017-06-18 | 12 files | -6/+100
* style message about missing RCS ids; inspired by mdoclint
  Ingo Schwarze | 2017-06-17 | 21 files | -4/+21
* style message about missing RCS ids; inspired by mdoclint
  Ingo Schwarze | 2017-06-17 | 11 files | -4/+11