Commit message [Author, Age, Files, Lines]
* Make the "make depend" maintainer target more convenient [Ingo Schwarze, 2020-03-13, 2 files, -6/+20]
  by having it run ./configure with native fts and ohash disabled.
* Properly reset the validation part of the tagging module between files. [Ingo Schwarze, 2020-03-13, 2 files, -0/+5]
  This fixes a crash in makewhatis(8) encountered by naddy@.
* Split tagging into a validation part including prioritization [Ingo Schwarze, 2020-03-13, 121 files, -909/+2339]
  in tag.{h,c} and {mdoc,man}_validate.c and into a formatting part
  including command line argument checking in term_tag.{h,c}, html.c,
  and {mdoc|man}_{term|html}.c.

  Immediate functional benefits include:
  * Improved prioritization of automatic tags for .Em and .Sy.
  * Avoiding bogus automatic tags when .Em, .Fn, or .Sy are
    explicitly tagged.
  * Explicit tagging of .Er and .Fl now works in HTML output.
  * Automatic tagging of .IP and .TP now works in HTML output.
  But mainly, this patch provides clean earth to build further
  improvements on.

  Technical changes:
  * Main program: Write a tag file for ASCII and UTF-8 output only.
  * All formatters: There is no more need to delay writing the tags.
  * mdoc(7)+man(7) formatters: No more need for elaborate syntax tree
    inspection.
  * HTML formatter: If available, use the "string" attribute as the tag.
  * HTML formatter: New function to write permalinks, to reduce code
    duplication.

  Style cleanup in the vicinity while here:
  * mdoc(7) terminal formatter: To set up bold font for children,
    defer to termp_bold_pre() rather than calling term_fontpush()
    manually.
  * mdoc(7) terminal formatter: Garbage collect some duplicate
    functions.
  * mdoc(7) HTML formatter: Unify <code> handling, delete redundant
    functions.
  * Where possible, use switch statements rather than if cascades.
  * Get rid of some more Yoda notation.

  The necessity for such changes was first discussed with kn@,
  but i didn't bother him with a request to review the resulting
  -673/+782 line patch.
* The HTML standard does not allow self-closing syntax for non-void elements. [Ingo Schwarze, 2020-02-27, 2 files, -3/+3]
  Consequently, write an explicit end tag for <mark> elements.
* Fully support explicit tagging of .Sh and .Ss. [Ingo Schwarze, 2020-02-27, 10 files, -18/+141]
  This fixes the offset of two lines in terminal output and this
  improves HTML output by putting the id= attribute and <a> element
  into the respective <h1> or <h2> element rather than writing an
  additional <mark> element.

  To that end, introduce node flags NODE_ID (to make the node a link
  target, for example by writing an HTML id= attribute or by calling
  tag_put()) and NODE_HREF (to make the node a link source, used only
  in HTML output, used only to write an <a class="permalink"> element).

  In particular:
  * In the validator, generalize the concept of the "next node"
    such that it also works before .Sh and .Ss.
  * If the first argument of .Tg is empty, don't forget to complain
    if there are additional arguments, which will be ignored.
  * In the terminal formatter, support writing of explicit tags
    for all kinds of nodes, not just for .Tg.
  * In deroff(), allow nodes to have an explicit string representation
    even when they aren't text nodes. Use this for explicitly tagged
    section headers. Surprisingly, this is sufficient to make HTML
    output work, without explicit code changes in the HTML formatter.
  * In syntax tree output, display NODE_ID and NODE_HREF.
* Introduce the concept of nodes that are semantically transparent: [Ingo Schwarze, 2020-02-27, 64 files, -311/+1156]
  they are skipped when looking for previous or following high-level
  macros. Examples include roff(7) .ft, .ll, and .ta, mdoc(7) .Sm
  and .Tg, and man(7) .DT and .PD. Use this concept for a variety
  of improved decisions in various validators and formatters.

  While here,
  * remove a few const qualifiers on struct arguments that caused
    trouble;
  * get rid of some more Yoda notation in the vicinity;
  * and apply some other stylistic improvements in the vicinity.

  I found this class of issues while considering .Tg patches from kn@.
* Fix this test after the recent Unicode update in OpenBSD base. [Ingo Schwarze, 2020-02-27, 1 file, -1/+1]
  The test uses U+07FF NKO TAMAN SIGN because it is the highest code
  point having a two-byte UTF-8 representation. This character is a
  new single-width punctuation character in Unicode 11, such that
  mandoc now does correct horizontal spacing. We already used the
  code point for the test before it was assigned, which resulted in
  weird spacing because wcwidth(3) returns -1 for unassigned code
  points.
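To see why U+07FF sits exactly at the two-byte boundary, here is a minimal sketch of a UTF-8 encoder in C. The function name utf8_encode is hypothetical and not taken from mandoc; it handles only code points up to U+FFFF and does not reject surrogates.

```c
#include <stddef.h>

/*
 * Encode the code point cp as UTF-8 into buf and return the number
 * of bytes written, or 0 for code points above U+FFFF.
 * Hypothetical helper for illustration only.
 */
size_t
utf8_encode(unsigned int cp, unsigned char *buf)
{
	if (cp < 0x80) {		/* 7 bits: one byte */
		buf[0] = (unsigned char)cp;
		return 1;
	}
	if (cp < 0x800) {		/* 11 bits: two bytes */
		buf[0] = (unsigned char)(0xC0 | (cp >> 6));
		buf[1] = (unsigned char)(0x80 | (cp & 0x3F));
		return 2;
	}
	if (cp < 0x10000) {		/* 16 bits: three bytes */
		buf[0] = (unsigned char)(0xE0 | (cp >> 12));
		buf[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		buf[2] = (unsigned char)(0x80 | (cp & 0x3F));
		return 3;
	}
	return 0;
}
```

With this sketch, U+07FF encodes as the two bytes 0xDF 0xBF, while the next code point, U+0800, already needs the three bytes 0xE0 0xA0 0x80.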
* Marc Espie reported that "man p*ipc" displayed the perlipc(1) manual. [Ingo Schwarze, 2020-02-24, 1 file, -3/+25]
  The reason was that as a last resort when failing to find a page
  name in mandoc.db(5) or at a few well-defined fully qualified
  file names, man(1) uses glob(3) to look for candidate files in
  relevant directories, because some operating systems have weird
  file name extensions, for example pcap.3pcap and BF_set_key.3ssl
  on Linux. But during that globbing, the metacharacters "*?[" need
  to be escaped in the name, section, and path supplied by the user,
  or you would get weird false positives and misleading warning
  messages and would be unable to use the fallback for path or file
  names that actually contain an opening bracket.
  Feedback and OK espie@.
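The escaping step described above can be sketched as follows. This is an illustration under assumptions, not the code from the commit: the helper name escape_glob_meta and its allocate-and-return interface are hypothetical. It relies only on the glob(3) convention that a backslash quotes the following character.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated copy of s in which each of the glob(3)
 * metacharacters '*', '?', and '[' is preceded by a backslash,
 * or NULL if memory allocation fails.  The caller frees the result.
 * Hypothetical helper for illustration only.
 */
char *
escape_glob_meta(const char *s)
{
	char	*out, *p;

	/* Worst case: every byte needs a backslash, plus the NUL. */
	if ((out = malloc(2 * strlen(s) + 1)) == NULL)
		return NULL;
	for (p = out; *s != '\0'; s++) {
		if (strchr("*?[", *s) != NULL)
			*p++ = '\\';
		*p++ = *s;
	}
	*p = '\0';
	return out;
}
```

A user-supplied name such as p*ipc becomes p\*ipc, so a pattern built from it matches only a file whose name literally contains the asterisk.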
* bugfix: indented paragraph macros need a space character [Ingo Schwarze, 2020-02-20, 1 file, -3/+3]
  before the width argument
* bugfix: .Tg must be ignored completely in these output modes [Ingo Schwarze, 2020-02-20, 2 files, -2/+2]
* Mention that .AT, .P, .SB, and .UC are extensions; it really matters [Ingo Schwarze, 2020-02-18, 1 file, -3/+14]
  because we only retain the language for backward compatibility
  in the first place.
  Part of the research done by
  <G dot Branden dot Robinson at gmail dot com>,
  see the list <groff at gnu dot org> for details.

  No change to the following conventions:
  Consider portable whatever made it into GNU troff no later than
  4.4BSD. For portable extensions, mention their origin at the end
  of the description. For non-portable extensions, for example from
  man-ext, usually warn earlier, near the beginning of the description.
* mention that -T man does not support eqn(7) and tbl(7); [Ingo Schwarze, 2020-02-15, 1 file, -1/+6]
  triggered by a question from
  Stephen Gregoratto <dev at sgregoratto dot me>
* two new entries: "Fl Fl" to "Fl \-" in validation and eqn/tbl in -T man [Ingo Schwarze, 2020-02-15, 1 file, -0/+8]
* Mention that the .Dd "date" argument is the date of the last change. [Ingo Schwarze, 2020-02-13, 1 file, -15/+2]
  Triggered by a question from Jason A. Donenfeld.
  While here, delete three COMPATIBILITY entries that i fixed
  some time ago.
* Digit-width and narrow spaces are non-breaking. [Ingo Schwarze, 2020-02-13, 2 files, -8/+9]
  Noticed because Branden Robinson worked on related documentation
  in groff.
* In roff, a space character at the beginning of an input line requires [Ingo Schwarze, 2020-02-12, 1 file, -2/+2]
  starting a new output line, and merely starting a new line of HTML
  code isn't sufficient to achieve that. Solve this in the same way
  as mdoc_html.c already does it, by printing a <br/> element.
  Fixing a bug reported by Jason A. Donenfeld <Jason at zx2c4 dot com>
  in the wg-quick(8) manual page on manpages.debian.org.
* Finally delete support for the "_whatdb" configuration directive, [Ingo Schwarze, 2020-02-10, 2 files, -14/+1]
  which has a misleading syntax. It was declared obsolete and
  superseded by the "manpath" directive five years ago.
* Reduce the diff to OpenBSD by making FILES a list, [Ingo Schwarze, 2020-02-10, 1 file, -1/+3]
  even though it has only one entry in the portable version.
  Do not add /etc/examples/man.conf for the portable version, though.
* The man(1) command was already available in AT&T Version 2 UNIX. [Ingo Schwarze, 2020-02-10, 1 file, -1/+1]
  Jonathan Gray found it in the "Combined Table of Contents" in
  Doug McIlroy's "A Research UNIX Reader", which contains a table of
  which edition manuals appeared in, and in both the "Table of
  Contents" (page vi) and the body (page 89) of the printed UNIX
  Programmer's Manual (June 12, 1972) from bitsavers.
* For compatibility with the man(1) implementations of the man-1.6 [Ingo Schwarze, 2020-02-10, 2 files, -3/+18]
  and man-db packages, print the manpath if the -w option is given
  without a following name argument.
  This quirk has been in man-1.6 since at least man-1.5e (1998)
  and in man-db since 2012.

  Using this feature in portable software is a dubious idea because
  the internal organization of manual page directories varies in
  about a dozen respects among operating systems, so even if you get
  the answer, there is no portable way to use it for looking up
  anything inside. However, Matej Cepl <mcepl at suse dot cz> made
  me aware that some software, for example the manual viewing
  functionality in the newest editors/neovim code, unwisely relies
  on this feature anyway.

  No objections were raised when this patch was shown on tech@.
* Make sure that -l always causes -w to be ignored, as documented [Ingo Schwarze, 2020-02-06, 1 file, -2/+13]
  in the man(1) manual page. This bugfix is needed to prevent the
  command "man -lw" from dereferencing a NULL pointer.
* No longer try to ask make(1) what the default compiler is, just use "cc". [Ingo Schwarze, 2020-02-05, 2 files, -16/+9]
  That line was a bad idea in the first place, it tried to be too
  clever, and it failed in different ways on different platforms.
  Even when it succeeded, what make(1) considered the default wasn't
  always useful. Having a simple and robust default and asking users
  to override it when needed is better.
* Repair more of the issues that i found in filescan() while investigating [Ingo Schwarze, 2020-01-26, 1 file, -35/+85]
  the report from <Andreas dot Kahari at abc dot se> on ports@:

  For a symlink, use the first of the following names that is
  available:
  1. In -t mode, the symlink itself (unchanged).
  2. When the (unresolved) symlink already resides inside the manpath,
     just strip the manpath and use the rest (unchanged).
  3. When prefix(es) of the unresolved symlink point to the manpath,
     strip the longest such prefix and use the rest (new);
     this fixes situations where the manpath or one of its parent
     directories is a symlink and at the same time contains symlinks
     to manual pages.
  4. Fall back to the fully resolved symlink, with the manpath
     stripped (new); this may for example happen when the command
     line passes symlinks from outside the manpath that point to
     manual pages inside the manpath, or if manual page trees contain
     symlinks to symlinks and not all of them are given on the
     command line.

  The fallback (4) isn't perfect. You can construct symlink spaghetti
  in such a way that this algorithm will not enter all manual page
  names into the database that a human would be able to deduce.
  But i do not expect such spaghetti to actually occur in practice
  (not even in ports), and a full fix would require re-implementing
  realpath(3) in terms of step-by-step readlink(2) calls, repeating
  the complicated algorithm (3) after each step.

  While here, also stop using PATH_MAX as the size of a static buffer
  in filescan(); on some systems, it can be unreasonably large.
  Instead, allocate path strings dynamically.
* Fix incorrect file type tests. [Ingo Schwarze, 2020-01-26, 1 file, -2/+2]
  This bug caused sockets and character special devices to be accepted
  as manual pages if they appeared inside manpaths, and it caused
  incorrect file names to be entered into the database when the
  manpath or one of its parent directories was a symbolic link.
  This fixes the issues reported by
  <Andreas dot Kahari at abc dot se> on ports@, but additional issues
  remain when symbolic links are contained in a manpath that involves
  another symbolic link.
* Minor cleanup, no functional change: [Ingo Schwarze, 2020-01-25, 1 file, -54/+59]
  Do not abuse strstr(3) to check whether one long string starts
  with another long string. Instead, use strncmp(3) with the proper
  length. In set_basedir(), also reset *basedir in the error branches
  for extra safety. While here, invert some more Yoda conditions in
  the neighbourhood.
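A minimal sketch of the strncmp(3) idiom mentioned above, using a hypothetical helper name. Comparing the result of strstr(3) with the start of the string gives the same answer, but on a mismatch it scans the whole haystack for a later occurrence that is of no interest.

```c
#include <string.h>

/*
 * Return 1 if str begins with prefix, 0 otherwise.
 * Looks at no more than strlen(prefix) bytes of str.
 * Hypothetical helper for illustration only.
 */
int
has_prefix(const char *str, const char *prefix)
{
	return strncmp(str, prefix, strlen(prefix)) == 0;
}
```

For example, has_prefix("/usr/share/man/man1/ls.1", "/usr/share/man") is 1, while has_prefix("/opt/usr/share/man", "/usr/share/man") is 0 even though the second string occurs inside the first.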
* Make the code more readable by introducing [Ingo Schwarze, 2020-01-20, 6 files, -25/+42]
  symbolic constants for tagging priorities.
  This review also made me find a minor bug: do not upgrade
  TAG_FALLBACK to TAG_WEAK when there is trailing whitespace.
* Introduce a new mdoc(7) macro .Tg ("tag") to explicitly mark a place [Ingo Schwarze, 2020-01-19, 15 files, -12/+125]
  as defining a term. Please only use it when automatic tagging
  does not work. Manual page authors will not be required to add
  the new macro; using it remains optional.
  HTML output is still rudimentary in this version and will be
  polished later.

  Thanks to kn@ for reminding me that i have been considering since
  BSDCan 2014 whether something like this might be useful. Given
  that possibilities of making automatic tagging better are running
  out and there are still several situations where automatic tagging
  cannot do the job, i think the time is now ripe.

  Feedback and no objection from millert@; OK espie@ inoguchi@ kn@.
* Align to the new, sane behaviour of the groff_mdoc(7) .Dd macro: [Ingo Schwarze, 2020-01-19, 28 files, -75/+96]
  without an argument, use the empty string, and always concatenate
  all arguments, no matter their number.
  This allows reducing the number of arguments of mandoc_normdate()
  and some other simplifications, at the same time polishing some
  error messages by adding the name of the macro in question.
* delete the entry for a crash that was already fixed [Ingo Schwarze, 2020-01-19, 1 file, -8/+0]
* test tbl_term.c rev. 1.73 and tbl_data.c rev. 1.53: [Ingo Schwarze, 2020-01-11, 6 files, -11/+39]
  incomplete short layout lines followed by longer lines,
  and spans at the beginning of layout lines
* When autogenerating one layout cell from a data cell just beyond the [Ingo Schwarze, 2020-01-11, 1 file, -0/+2]
  last layout cell that was explicitly specified, properly initialize
  the spacing attribute to indicate that the default is to be used.
  Failing to do so and leaving the spacing at zero in this case caused
  misformatting when another row further down the table had even more
  explicitly specified cells.
  Bug found while trying to write regression tests for
  tbl_term.c rev. 1.73.
* Fix a logic error: [Ingo Schwarze, 2020-01-11, 1 file, -13/+12]
  When both the first and the third column are spans, do not use the
  number of columns of the span starting in column two for the span
  starting in column zero.
  With afl, Jan Schreiber <jes at posteo dot de> found cases where
  this caused NULL pointer accesses because too many layout cells
  were consumed.
  While here, make the code more similar at the three places that
  iterate over data cells.
* Print more tbl(7) details to help debugging: [Ingo Schwarze, 2020-01-11, 1 file, -11/+94]
  column numbers, options, layout rows, cell types, cell modifiers.
* autocapitalize=none; also from Tim Baumgard [Ingo Schwarze, 2020-01-10, 1 file, -1/+2]
* Switch off the useless and annoying "autocomplete" feature; [Ingo Schwarze, 2020-01-10, 1 file, -1/+1]
  issue reported by Tim Baumgard <at bmgrd dot com>.
  landry@ and florian@ agree with the general direction.
* Document the "delim" syntax and its usage. [Ingo Schwarze, 2020-01-10, 1 file, -22/+29]
  Closing a gap reported by bentley@, who also sent a patch,
  but i'm explaining it somewhat differently.
  While here, remove duplicate information from the text.
  OK bentley@
* Skip whitespace before tokens, too. [Ingo Schwarze, 2020-01-08, 5 files, -5/+33]
  Bug found by bentley@ with input like "delim $$ delim off".
* Improve the test case by changing the eqn(7) delimiters such that it [Ingo Schwarze, 2020-01-08, 2 files, -7/+7]
  actually tests which parts of text lines are processed with eqn(7)
  and which are not.
* Enable generation of the desired delim/basic output with groff(1). [Ingo Schwarze, 2020-01-08, 1 file, -1/+3]
  No functional change for the portable test suite.
* Simplify maintainer targets in OpenBSD: EQN and TBL variables [Ingo Schwarze, 2020-01-08, 5 files, -29/+8]
  no longer exist and NROFF/NOPTS were replaced with GROFF/GOPTS.
  This doesn't change how things work in the portable version
  of the test suite.
* Improve the description of -m/-M/MANPATH/man.conf in multiple respects [Ingo Schwarze, 2020-01-07, 1 file, -33/+46]
  after kn@ reported that the descriptions were incomplete
  and somewhat inaccurate.
  OK jmc@ kn@
* When all cells in a tbl(1) column are empty, set the column width [Ingo Schwarze, 2019-12-31, 4 files, -7/+114]
  to 1n rather than to 0n, in the same way as groff does.
  This fixes misformatting reported by bentley@ in xkeyboard-config(7).
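The rule above can be pictured with a small sketch. It assumes, for illustration only, that a column's width is the maximum of its cell widths measured in n units; the function name and the array-based interface are hypothetical and do not reflect how the tbl formatter actually stores its data.

```c
#include <stddef.h>

/*
 * Return the width of a table column in n units: the widest cell,
 * but never less than 1n, so that a column consisting only of
 * empty cells still occupies one character position.
 * Hypothetical helper for illustration only.
 */
size_t
column_width(const size_t *cell_widths, size_t ncells)
{
	size_t	 i, width;

	width = 0;
	for (i = 0; i < ncells; i++)
		if (cell_widths[i] > width)
			width = cell_widths[i];
	return width > 0 ? width : 1;
}
```

With cell widths {0, 0, 0} the sketch yields 1, and with {0, 4, 2} it yields 4.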
* Do not fail an assertion when a high level macro occurs in the body [Ingo Schwarze, 2019-12-26, 1 file, -1/+13]
  of a conditional inside a .ce request block. Instead, abort the
  .ce block just like when there is no conditional in between.
  Bug found by espie@ working on the textproc/fstrcmp port.
* distinction between .Vt and .Va [Ingo Schwarze, 2019-12-25, 1 file, -0/+5]
* two new entries: make .Sh/.Ss parsed in mdoc(7) [Ingo Schwarze, 2019-12-22, 1 file, -0/+8]
  and delete release number verification from groff_mdoc(7)
* In HTML, display straight quotes, not curly quotes, for Qq/Qo/Qc macros. [Ingo Schwarze, 2019-12-11, 1 file, -3/+7]
  This is the intended behavior and already the case in terminal
  output.
  Incorrect output noticed by Eldred Habert.
  Patch from bentley@.
* Add a Content-Security-Policy HTTP header that allows only CSS. [Ingo Schwarze, 2019-11-10, 1 file, -0/+2]
  This ensures that in a modern browser that understands the header,
  mandoc rendering bugs cannot possibly be interpreted as JavaScript.
  Patch from bentley@.
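A sketch of what emitting such a header can look like in a CGI program written in C. The exact policy string used by the commit is not shown in this log, so the value below is an assumption: it denies everything by default and then permits style sheets only. The function name is hypothetical.

```c
#include <stdio.h>

/*
 * Format the HTTP header block for an HTML response into buf.
 * The policy value is an assumption chosen for illustration:
 * "default-src 'none'" forbids scripts, images, frames, and so on,
 * and "style-src" re-enables CSS only.
 * Returns the result of snprintf(3).
 */
int
format_headers(char *buf, size_t sz)
{
	return snprintf(buf, sz,
	    "Content-Type: text/html; charset=utf-8\r\n"
	    "Content-Security-Policy: default-src 'none'; "
	    "style-src 'self' 'unsafe-inline'\r\n"
	    "\r\n");
}
```

Because no script-src directive is granted, a browser honoring the header refuses to run inline or external scripts even if a rendering bug lets markup slip into the page.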
* want to get rid of the last style= attributes, suggested by bentley@ [Ingo Schwarze, 2019-11-10, 1 file, -0/+5]
* .ce .if .B crash reported by espie@, and one other bug [Ingo Schwarze, 2019-11-09, 1 file, -0/+13]
* In the past, generating comment nodes stopped at the .TH or .Dd [Ingo Schwarze, 2019-11-09, 2 files, -3/+8]
  macro, which is usually close to the beginning of the file, right
  after the Copyright header comments. But espie@ found horrible
  input files in the textproc/fstrcmp port that generate lots of
  parse nodes before even getting to the header macro. In some
  formatters, comment nodes after some kinds of real content
  triggered assertions.
  So make sure generation of comment nodes stops once real content
  is encountered.