path: root/regress
Commit message (Author, Age, Files, Lines)
* Introduce the concept of nodes that are semantically transparent: (Ingo Schwarze, 2020-02-27, 54 files, -36/+834)
  they are skipped when looking for previous or following high-level macros.
  Examples include roff(7) .ft, .ll, and .ta, mdoc(7) .Sm and .Tg,
  and man(7) .DT and .PD.
  Use this concept for a variety of improved decisions in various
  validators and formatters.
  While here,
  * remove a few const qualifiers on struct arguments that caused trouble;
  * get rid of some more Yoda notation in the vicinity;
  * and apply some other stylistic improvements in the vicinity.
  I found this class of issues while considering .Tg patches from kn@.
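The idea of skipping transparent nodes can be illustrated with a small Python sketch. The macro names are taken from the commit message above; the flat list of node names and the helper `previous_visible` are hypothetical and do not mirror mandoc's actual C data structures.

```python
# Hypothetical sketch: node names come from the commit message,
# the list layout and helper name are assumptions for illustration.
TRANSPARENT = {"ft", "ll", "ta", "Sm", "Tg", "DT", "PD"}


def previous_visible(nodes, index):
    """Return the nearest node before `index` that is not transparent."""
    for name in reversed(nodes[:index]):
        if name not in TRANSPARENT:
            return name
    return None  # only transparent nodes (or nothing) precede `index`


nodes = ["Sh", "Pp", "Tg", "Sm", "Bl"]
print(previous_visible(nodes, 4))  # prints "Pp": .Tg and .Sm are skipped
```

A validator asking "what came before this .Bl?" would then see the .Pp rather than the intervening .Sm.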
* Fix this test after the recent Unicode update in OpenBSD base. (Ingo Schwarze, 2020-02-27, 1 file, -1/+1)
  The test uses U+07FF NKO TAMAN SIGN because it is the highest code point
  having a two-byte UTF-8 representation.  This character is a new
  single-width punctuation character in Unicode 11, such that mandoc
  now does correct horizontal spacing.
  We already used the code point for the test before it was assigned,
  which resulted in weird spacing because wcwidth(3) returns -1 for
  unassigned code points.
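The claim about the two-byte boundary is easy to check. The following Python snippet is only a verification aid, not part of the test suite; the helper name `utf8_len` is made up for this example.

```python
def utf8_len(code_point):
    """Number of bytes in the UTF-8 encoding of a single code point."""
    return len(chr(code_point).encode("utf-8"))


print(utf8_len(0x07FF))  # prints 2: the last two-byte code point
print(utf8_len(0x0800))  # prints 3: the first three-byte code point
```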
* Align to the new, sane behaviour of the groff_mdoc(7) .Dd macro: (Ingo Schwarze, 2020-01-19, 20 files, -30/+42)
  without an argument, use the empty string, and always concatenate
  all arguments, no matter their number.
  This allows reducing the number of arguments of mandoc_normdate()
  and some other simplifications, at the same time polishing some
  error messages by adding the name of the macro in question.
* test tbl_term.c rev. 1.73 and tbl_data.c rev. 1.53: (Ingo Schwarze, 2020-01-11, 6 files, -11/+39)
  incomplete short layout lines followed by longer lines,
  and spans at the beginning of layout lines
* Skip whitespace before tokens, too. (Ingo Schwarze, 2020-01-08, 4 files, -3/+23)
  Bug found by bentley@ with input like "delim $$ delim off".
* Improve the test case by changing the eqn(7) delimiters such that it (Ingo Schwarze, 2020-01-08, 2 files, -7/+7)
  actually tests which parts of text lines are processed with eqn(7)
  and which are not.
* Enable generation of the desired delim/basic output with groff(1). (Ingo Schwarze, 2020-01-08, 1 file, -1/+3)
  No functional change for the portable test suite.
* Simplify maintainer targets in OpenBSD: EQN and TBL variables (Ingo Schwarze, 2020-01-08, 5 files, -29/+8)
  no longer exist and NROFF/NOPTS were replaced with GROFF/GOPTS.
  This doesn't change how things work in the portable version
  of the test suite.
* When all cells in a tbl(1) column are empty, set the column width (Ingo Schwarze, 2019-12-31, 3 files, -2/+97)
  to 1n rather than to 0n, in the same way as groff does.
  This fixes misformatting reported by bentley@ in xkeyboard-config(7).
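As a rough illustration of the rule, a column width can be computed as the widest cell, with a floor of one. This Python sketch measures width in characters and ignores everything else that real tbl(1) layout takes into account; `column_width` is a hypothetical name.

```python
def column_width(cells):
    """Width of a column: the widest cell, but never less than 1."""
    widest = max((len(cell) for cell in cells), default=0)
    return max(widest, 1)  # an all-empty column still gets width 1


print(column_width(["", "", ""]))  # prints 1
print(column_width(["ab", ""]))    # prints 2
```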
* Improve validation of function names: (Ingo Schwarze, 2019-09-13, 4 files, -3/+63)
  1. Relax checking to accept function types of the form
     "ret_type (fname)(args)" (suggested by Yuri Pankov <yuripv dot net>).
  2. Tighten checking to require the closing parenthesis.
* adapt to print_indent() HTML_NOSPACE fix, html.c rev. 1.261 (Ingo Schwarze, 2019-09-05, 2 files, -18/+6)
* adapt to new <p> output logic (html.c rev. 1.260) (Ingo Schwarze, 2019-09-03, 27 files, -78/+29)
* new test for an empty text block; from rea@ via bapt@ (FreeBSD) (Ingo Schwarze, 2019-07-18, 3 files, -2/+44)
* When parsing a tab character that is not preceded by a space character (Ingo Schwarze, 2019-07-11, 4 files, -5/+6)
  on an .It -column line, args() sets the MDOC_PHRASEQL flag
  to Quote the Last word of the Phrase.
  Even if it turns out this quoting is not needed because the word
  is already quoted for other reasons, clear the flag at the end of
  parsing the phrase, such that the flag does not leak to the next phrase.
  This patch fixes the bug that the trailing Macro on a line of the form
    .It "word<tab>word" Ta word Macro<eol>
  was incorrectly considered quoted and hence not parsed.
  Bug found by Havard Eidnes (he@) with the NetBSD gettytab(5) manual page:
  https://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=54361
  Reported via Thomas Klausner (wiz@).
* delete trailing whitespace and space-tab sequences; no code change; (Ingo Schwarze, 2019-07-01, 1 file, -1/+1)
  patch from Michal Nowak <mnowak at startmail dot com>
  who found these with git pbchk in the illumos tree
* Do not access a NULL pointer if a table contains a horizontal line (Ingo Schwarze, 2019-06-11, 3 files, -2/+99)
  next to a table line having fewer columns than the table as a whole.
  Bug found by Stephen Gregoratto <dev at sgregoratto dot me>
  with aerc-config(5).
* In HTML output, allow switching the desired font for subsequent (Ingo Schwarze, 2019-04-30, 1 file, -5/+4)
  text without printing an opening tag right away, and use that in
  the .ft request handler.  While here, garbage collect redundant
  enum htmlfont and reduce code duplication in print_text().
  Fixing an assertion failure reported by Michael <Stapelberg at Debian>
  in pmRegisterDerived(3) from libpcp3-dev.
* When calling an empty macro, do not clobber existing arguments. (Ingo Schwarze, 2019-04-21, 3 files, -3/+30)
  Fixing a bug found with the groffer(1) version 1.19 manual page
  following a report from Jan Stary.
* Implement the roff .break request (break out of a .while loop). (Ingo Schwarze, 2019-04-21, 3 files, -2/+27)
  Jan Stary <hans at stare dot cz> found it in an ancient groffer(1)
  manual page (version 1.19) on MacOS X Mojave.
  Having .break not implemented wasn't a particularly bright idea
  because obviously, it tended to cause infinite loops.
* Automatically detect whether diff(1) supports the -a option. (Ingo Schwarze, 2019-03-10, 1 file, -0/+1)
  Useful on illumos and on Oracle Solaris, where it doesn't.
  Patch written based on a report from Sevan Janiyan.
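One plausible way to do such a detection is to run diff(1) with -a on two identical files and check the exit status. The commit does not say how the test suite actually performs the check, so the Python sketch below is an assumption throughout, including the function name `diff_supports_a`.

```python
import os
import subprocess
import tempfile


def diff_supports_a(diff_cmd="diff"):
    """Guess whether `diff_cmd` accepts -a by comparing two identical files.

    Assumption: an unsupported option makes diff exit with a status
    other than 0, whereas identical files compare with status 0.
    """
    with tempfile.TemporaryDirectory() as tmp:
        paths = [os.path.join(tmp, name) for name in ("a.txt", "b.txt")]
        for path in paths:
            with open(path, "w") as handle:
                handle.write("same\n")
        try:
            result = subprocess.run(
                [diff_cmd, "-a", *paths],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except OSError:  # diff_cmd is missing or not executable
            return False
        return result.returncode == 0


print(diff_supports_a())
```

A caller could then append -a to the diff arguments only when the function returns True.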
* mention Solaris BUGS in regress.pl(1) (Ingo Schwarze, 2019-03-06, 1 file, -0/+18)
* Wrap .Sh/.SH sections and .Ss/.SS subsections in HTML <section> elements (Ingo Schwarze, 2019-03-01, 6 files, -1/+17)
  as recommended for accessibility by the HTML 5 standard.
  Triggered by a similar, but slightly different suggestion
  from Laura Morales <lauretas at mail dot com>.
* Let roff_getname() end the roff identifier at a tab character (Ingo Schwarze, 2019-02-06, 17 files, -15/+199)
  and audit all its callers whether termination is handled correctly.
  Resulting improvements:
  * An escape or tab ending the macro name in a macro invocation
    is discarded, and argument processing is started after it.
  * An escape or tab ending a name in ".if d" and ".if r" is preserved.
  * An escape ending a name in ".ds" causes the whole request
    to be ignored.
  * A tab ending a name in ".ds" becomes part of the string.
  * An escape or tab ending a name in ".rm" causes the rest of the line
    to be ignored.
  * An escape or tab ending the first name in ".als", ".rn", or ".nr"
    causes the whole request to be ignored.
  Kurt Jaeger <pi at FreeBSD> made me aware of
  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=235456#c0
  and in that bug report, comment 0 item (3) is a special case
  of this class of issues.
  Yes, the "mh" manual pages are no doubt among the worst on the planet.
* Since resetting of offsets works quite differently in man(7) and mdoc(7), (Ingo Schwarze, 2019-01-31, 3 files, -2/+29)
  test table centering in an mdoc(7) document as well.
  Related to tbl_term.c rev. 1.67.
* Test handling of escaped backslashes because the code related to (Ingo Schwarze, 2019-01-17, 5 files, -2/+97)
  copy mode is complicated and prone to regressions.
* Remove the HTML title= attributes which harmed accessibility and (Ingo Schwarze, 2019-01-11, 6 files, -19/+18)
  violated the principle of separation of content and presentation.
  Instead, implement the tooltips purely in CSS.
  Thanks to John Gardner <gardnerjohng at gmail dot com>
  for suggesting most of the styling in the new ::before rules.
* Represent mdoc(7) .Pp (and .sp, and some SYNOPSIS and .Rs features) (Ingo Schwarze, 2019-01-07, 35 files, -60/+450)
  by the <p> HTML element and use the html_fillmode() mechanism
  for .Bd -unfilled, just like it was done for man(7) earlier,
  finally getting rid both of the horrible <div class="Pp"></div> hack
  and of the worst HTML syntax violations caused by nested displays.
  Care is needed because in some situations, paragraphs have to remain
  open across several subsequent macros, whereas in other situations,
  they must get closed together with a block containing them.
  Some implementation details include:
  * Always close paragraphs before emitting HTML flow content.
  * Let html_close_paragraph() also close <pre> for extra safety.
  * Drop the old, now unused function print_paragraph().
  * Minor adjustments in the top-level man(7) node formatter for symmetry.
  * Bugfix: .Ss heads suspend no-fill mode, even though .Ss doesn't end it.
  * Bugfix: give up on .Op semantic markup for now, see the comment.
* Finally, represent the man(7) .PP and .HP macros by the natural (Ingo Schwarze, 2019-01-06, 28 files, -45/+380)
  choice, which is the <p> HTML element.
  On top of the previous fill-mode improvements, the key to making
  this possible is to automatically close the <p> when required:
  before headers, subsequent paragraphs, lists, indented blocks,
  synopsis blocks, tbl(7) blocks, and before blocks using no-fill mode.
  In man(7) documents, represent the .sp request by a blank line
  in no-fill mode and in the same way as .PP in fill mode.
* In no-fill mode, avoid bogus blank lines in two situations: (Ingo Schwarze, 2019-01-05, 2 files, -1/+33)
  1. After the last child; the parent will take care of the line break.
  2. At the .YS macro; the end of the preceding .SY already broke the line.
* In groff, when the .SY block macro occurs in no-fill mode, (Ingo Schwarze, 2019-01-05, 4 files, -2/+66)
  the output line gets broken after the head.  Do the same.
* Slowly start doing more HTML output tests, in this case for the (Ingo Schwarze, 2019-01-05, 4 files, -12/+44)
  interaction of .nf and .RS, related to man_macro.c rev. 1.106.
  HTML regression testing is tricky because it is extremely prone
  to over-testing, i.e. unintentional testing for volatile formatting
  details which are irrelevant for deciding whether the HTML output
  is good or bad.  Minor changes to the formatter - which is still
  heavily under development - might result in the necessity to
  repeatedly adjust many test cases.
  Then again, HTML syntax rules are so complicated that without
  regression testing, the risk is simply too high that later changes
  will re-introduce issues that were already fixed earlier.
  Let's just try to design the tests very carefully in such a way
  that the *.out_html files contain nothing that is likely to change,
  and defer testing in cases where the HTML output is not yet clean
  enough to allow designing tests in such a way.
* Test interaction of low-level roff(7) filling requests with .Bd in general (Ingo Schwarze, 2019-01-04, 7 files, -12/+97)
  and filling in .Bd -centered in particular;
  related to mdoc_term.c rev. 1.372.
* test the roff(7) .ce and .rj requests; (Ingo Schwarze, 2019-01-04, 4 files, -2/+43)
  they were already supported in the past
* catch up with the changed order of warnings; (Ingo Schwarze, 2018-12-31, 2 files, -2/+2)
  related to man_validate.c rev. 1.145
* merge a test update from OpenBSD that was forgotten in April (Ingo Schwarze, 2018-12-21, 2 files, -1/+11)
* Rename mandoc_getarg() to roff_getarg() and pass it the roff parser (Ingo Schwarze, 2018-12-21, 29 files, -31/+365)
  struct as an argument such that after copy-in, it can call
  roff_expand() once again, which used to be called roff_res()
  before this.  This fixes a subtle low-level roff(7) parsing bug
  reported by Fabio Scotoni <fabio at esse dot ch> in the
  4.4BSD-Lite2 mdoc.samples(7) manual page, because that page used
  an escaped escape sequence in a macro argument.
  To expand escaped escape sequences in quoted mdoc(7) arguments, too,
  stop bypassing the call to roff_getarg() in mdoc_argv.c, function
  args() for this case.  This does not solve the case of escaped
  escape sequences in quoted .Bl -column phrases yet.
  Because roff_expand() can make the string longer, roff_getarg()
  can no longer operate in-place but needs to malloc(3) the returned
  string.  In the high-level parsers, free(3) that string after
  processing it.
* Bugfix: (Ingo Schwarze, 2018-12-20, 3 files, -3/+8)
  When after a \\, \t, or \a, another \t or \a had to be resolved
  in copy mode within the same argument, the argument got corrupted.
  Found while working on a loosely related bug report
  from Fabio Scotoni <fabio at esse dot ch>.
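To make the scenario concrete, here is a Python sketch of resolving several copy-mode escapes within one argument in a single left-to-right pass, writing into a fresh output buffer so that earlier replacements cannot disturb later ones. It is not a transcription of the C fix; mapping \a to the byte 0x01 is an assumption chosen for illustration, and `resolve_copy_mode` is a hypothetical name.

```python
# Escapes resolved in copy mode in this sketch.  The leader character
# is represented as ASCII SOH here, which is an illustrative assumption.
COPY_MODE = {"\\": "\\", "t": "\t", "a": "\x01"}


def resolve_copy_mode(arg):
    """Resolve \\\\, \\t and \\a in one pass; leave other escapes alone."""
    out = []
    i = 0
    while i < len(arg):
        if arg[i] == "\\" and i + 1 < len(arg) and arg[i + 1] in COPY_MODE:
            out.append(COPY_MODE[arg[i + 1]])
            i += 2  # consume the backslash and the escape name
        else:
            out.append(arg[i])
            i += 1
    return "".join(out)


# Two \t escapes in the same argument are both resolved.
print(repr(resolve_copy_mode("a\\tb\\tc")))  # prints 'a\tb\tc'
```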
* Yet another round of improvements to manual font selection. (Ingo Schwarze, 2018-12-16, 10 files, -34/+79)
  Unify handling of \f and .ft.
  Support \f4 (bold+italic).
  Support ".ft BI" and ".ft CW" for terminal output.
  Support the .ft request in HTML output.
  Reject the bogus fonts \f(C1, \f(C2, \f(C3, and \f(CP.
  In regress.pl, only strip leading whitespace in math mode.
* Several improvements to escape sequence handling. (Ingo Schwarze, 2018-12-15, 27 files, -58/+303)
  * Add the missing special character \_ (underscore).
  * Partial implementations of \a (leader character)
    and \E (uninterpreted escape character).
  * Parse and ignore \r (reverse line feed).
  * Add a WARNING message about undefined escape sequences.
  * Add an UNSUPP message about unsupported escape sequences.
  * Mark \! and \? (transparent throughput)
    and \O (suppress output) as unsupported.
  * Treat the various variants of zero-width spaces as one-byte
    escape sequences rather than as special characters, to avoid
    defining bogus forms with square brackets.
  * For special characters with one-byte names, do not define
    bogus forms with square brackets, except for \[-], which is valid.
  * In the form with square brackets, undefined special characters
    do not fall back to printing the name verbatim, not even for
    one-byte names.
  * Starting a special character name with a blank is an error.
  * Undefined escape sequences never abort formatting of the
    input string, not even in HTML output mode.
  * Document the newly handled escapes, and a few that were missing.
  * Regression tests for most of the above.
* Clean up the validation of .Pp, .PP, .sp, and .br.  Make sure all (Ingo Schwarze, 2018-12-04, 8 files, -4/+19)
  combinations are handled, and are handled in a systematic manner.
  This resolves some erratic duplicate handling, handles a number
  of missing cases, and improves diagnostics in various respects.
  Move validation of .br and .sp to the roff validation module
  rather than doing that twice in the mdoc and man validation modules.
  Move the node relinking function to the roff library where it belongs.
  In validation functions, only look at the node itself, at previous
  nodes, and at descendants, not at following nodes or ancestors,
  such that only nodes are inspected which are already validated.
* In the validators, translate obsolete macro aliases (Lp, Ot, LP, P) (Ingo Schwarze, 2018-12-03, 1 file, -2/+2)
  to the standard forms (Pp, Ft, PP) up front, such that later code
  does not need to look for the obsolete versions.
  This reduces the risk of incomplete handling.
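The up-front translation amounts to a single lookup before any further validation. A minimal Python sketch follows; the alias pairs are taken from the commit message, while the table and function names are hypothetical and not mandoc's own.

```python
# Obsolete alias -> standard form, as listed in the commit message.
ALIASES = {"Lp": "Pp", "Ot": "Ft", "LP": "PP", "P": "PP"}


def canonical_macro(name):
    """Map an obsolete alias to its standard form; pass others through."""
    return ALIASES.get(name, name)


print([canonical_macro(m) for m in ("Lp", "Ot", "LP", "P", "Sh")])
# prints ['Pp', 'Ft', 'PP', 'PP', 'Sh']
```

Code running after this step only ever needs to compare against Pp, Ft, and PP.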
* When a conditional block is closed by putting "\}" on a text line (Ingo Schwarze, 2018-11-26, 3 files, -4/+30)
  by itself (which is somewhat unusual but not invalid; most authors
  use the empty macro line ".\}" instead), agree more closely
  with groff and do not produce a double space in the output.
  Quirk reported by millert@.
  While here, tweak the rest of the function body of roff_cond_text()
  to more closely match roff_cond_sub().  The subtly different
  handling could make people (including myself) wonder whether there
  is any point in being different.  Testing shows there is not.
* Render the eqn(7) "sqrt" function as U+221A in UTF-8 output. (Ingo Schwarze, 2018-10-02, 2 files, -4/+4)
  This also agrees with what groff does.
  Suggested by an attendee of EuroBSDCon 2018 in Bucuresti.
  Written on the plane Bucuresti-Frankfurt returning from EuroBSDCon.
* Rudimentary implementation of the roff(7) .char (output glyph (Ingo Schwarze, 2018-08-25, 7 files, -2/+61)
  definition) request, used for example by groff_hdtbl(7).
  This simplistic implementation may interact incorrectly
  with the .tr (input character translation) request.
  But come on, you are not only using .char *and* .tr, but you do so
  with respect to the same character in the same manual page?
* If man(7) next-line scope is open and the line ends with \c, (Ingo Schwarze, 2018-08-25, 2 files, -3/+29)
  the scope remains open.  Needed for example for groff_man(7).
* Rudimentary implementation of the roff(7) .while request. (Ingo Schwarze, 2018-08-24, 16 files, -2/+182)
  Needed for example by groff_hdtbl(7).
  There are two limitations:
  It does not support nested .while requests yet,
  and each .while loop must start and end in the same scope.
  The roff_parseln() return codes are now more flexible
  and allow OR'ing options.
* Implement the roff(7) .shift and .return requests, (Ingo Schwarze, 2018-08-23, 14 files, -8/+184)
  for example used by groff_hdtbl(7) and groff_mom(7).
  Also correctly interpolate arguments during nested macro execution
  even after .shift and .return, implemented using a stack of
  argument arrays.
  Note that only read.c, but not roff.c can detect the end of a macro
  execution, and the existence of .shift implies that arguments cannot
  be interpolated up front, so unfortunately, this includes a partial
  revert of roff.c rev. 1.337, moving argument interpolation back
  into the function roff_res().
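The phrase "a stack of argument arrays" can be pictured with the following Python sketch. It models only the bookkeeping: entering a macro pushes its arguments, .shift drops leading arguments of the innermost call, and leaving the macro pops the frame. The class and method names are invented for this illustration and the real parser is considerably more involved.

```python
class MacroArgStack:
    """Toy model of per-call argument arrays for nested macro execution."""

    def __init__(self):
        self.frames = []

    def enter(self, args):
        """Start executing a macro with the given arguments."""
        self.frames.append(list(args))

    def shift(self, count=1):
        """Drop the first `count` arguments of the innermost call."""
        del self.frames[-1][:count]

    def arg(self, number):
        """Interpolate argument `number` (1-based); empty if absent."""
        frame = self.frames[-1]
        return frame[number - 1] if 1 <= number <= len(frame) else ""

    def leave(self):
        """End the innermost macro execution."""
        self.frames.pop()


stack = MacroArgStack()
stack.enter(["a", "b", "c"])
stack.shift()
stack.enter(["x"])            # nested call with its own arguments
inner = stack.arg(1)
stack.leave()
print(inner, stack.arg(1))    # prints "x b"
```

Because .shift changes what the first argument refers to while the macro body is running, arguments have to be looked up at interpolation time rather than substituted once before execution, which matches the reasoning in the commit message.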
* Improve the ASCII rendering of \(Po (Pound Sterling) (Ingo Schwarze, 2018-08-21, 16 files, -90/+86)
  and of the playing card suits to match groff, using feedback
  from Ralph Corderoy <ralph at inputplus dot co dot uk>.
* Fix some issues found looking at groff_char(7): (Ingo Schwarze, 2018-08-21, 8 files, -12/+12)
  * Add two missing characters, \('Y and \('y.
  * The Weierstrass p is not capital, see http://unicode.org/notes/tn27/.
  * Add a groff-compatible ASCII transliteration for U+02DC: "~".
* Disable one test for now that is broken after the addition of \). (Ingo Schwarze, 2018-08-19, 2 files, -4/+3)
  It is not broken because of \), which is correctly implemented,
  but the addition merely reveals a hidden bug elsewhere, almost
  certainly in \\ handling.  Given that \\ is among the most
  mysterious escape sequences and using it is very strongly
  discouraged in manual pages, fixing that is not urgent - and
  may be hard.