| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
in tag.{h,c} and {mdoc,man}_validate.c
and into a formatting part including command line argument checking
in term_tag.{h,c}, html.c, and {mdoc|man}_{term|html}.c.
Immediate functional benefits include:
* Improved prioritization of automatic tags for .Em and .Sy.
* Avoiding bogus automatic tags when .Em, .Fn, or .Sy are explicitly tagged.
* Explicit tagging of .Er and .Fl now works in HTML output.
* Automatic tagging of .IP and .TP now works in HTML output.
But mainly, this patch provides clean earth to build further improvements on.
Technical changes:
* Main program: Write a tag file for ASCII and UTF-8 output only.
* All formatters: There is no more need to delay writing the tags.
* mdoc(7)+man(7) formatters: No more need for elaborate syntax tree inspection.
* HTML formatter: If available, use the "string" attribute as the tag.
* HTML formatter: New function to write permalinks, to reduce code duplication.
Style cleanup in the vicinity while here:
* mdoc(7) terminal formatter: To set up bold font for children,
defer to termp_bold_pre() rather than calling term_fontpush() manually.
* mdoc(7) terminal formatter: Garbage collect some duplicate functions.
* mdoc(7) HTML formatter: Unify <code> handling, delete redundant functions.
* Where possible, use switch statements rather than if cascades.
* Get rid of some more Yoda notation.
The necessity for such changes was first discussed with kn@, but i didn't
bother him with a request to review the resulting -673/+782 line patch.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
as defining a term. Please only use it when automatic tagging does
not work. Manual page authors will not be required to add the new
macro; using it remains optional. HTML output is still rudimentary
in this version and will be polished later.
Thanks to kn@ for reminding me that i have been considering since
BSDCan 2014 whether something like this might be useful. Given
that possibilities of making automatic tagging better are running
out and there are still several situations where automatic tagging
cannot do the job, i think the time is now ripe.
Feedback and no objection from millert@; OK espie@ inoguchi@ kn@.
|
| |
|
|
|
|
|
|
|
| |
Tested on the complete manual page trees of Version 7 AT&T UNIX,
4.4BSD-Lite2, POSIX-2013, OpenBSD 2.2 to 6.5 and -current,
FreeBSD 10.0 to 12.0, NetBSD 6.1.5 to 8.1, DragonFly 3.8.2 to 5.6.1,
and Linux 4.05 to 5.02.
|
|
|
|
|
|
|
|
| |
text without printing an opening tag right away, and use that in
the .ft request handler. While here, garbage collect redundant
enum htmlfont and reduce code duplication in print_text().
Fixing an assertion failure reported by Michael <Stapelberg at Debian>
in pmRegisterDerived(3) from libpcp3-dev.
|
|
|
|
|
|
| |
as recommended for accessibility by the HTML 5 standard.
Triggered by a similar, but slightly different suggestion
from Laura Morales <lauretas at mail dot com>.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
which establish phrasing context, but they can contain paragraph
breaks (which is relevant for terminal formatting, so we can't just
change the structure of the syntax tree), which are respresented
by <p> elements and cannot occur inside <a>.
Fix this by prematurely closing the <a> element in the HTML formatter.
This menas that the clickable text in HTML output is shorter than
what is represented as the link text in terminal output, but in
HTML, it is frankly impossible to have the clickable area of a
hyperlink extend across a paragraph break. The difference in
presentation is not a major problem, and besides, paragraph breaks
inside .UR are rather poor style in the first place.
The implementation is quite tricky. Naively closing out the <a>
prematurely would result in accessing a stale pointer when later
reaching the physical end of the .UR block. So this commit separates
visual and structural closing of "struct tag" stack items. Visual
closing means that the HTML element is closed but the "struct tag"
remains on the stack, to avoid later access to a stale pointer and
to avoid closing the same HTML element a second time later.
This also needs reference counting of pointers to "struct tag" stack
items because often more than one child holds a pointer to the same
parent item, and only the outermost child can safely do the physical
closing.
In the whole corpus of nearly half a million manual pages on
man.openbsd.org, this problem occurs in exactly one page: the
groff(1) version 1.20.1 manual contained in DragonFly-3.8.2, which
contains a formatting error triggering the bug.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
by the <p> HTML element and use the html_fillmode() mechanism
for .Bd -unfilled, just like it was done for man(7) earlier, finally
getting rid both of the horrible <div class="Pp"></div> hack and
of the worst HTML syntax violations caused by nested displays.
Care is needed because in some situations, paragraphs have to remain
open across several subsequent macros, whereas in other situations,
they must get closed together with a block containing them.
Some implementation details include:
* Always close paragraphs before emitting HTML flow content.
* Let html_close_paragraph() also close <pre> for extra safety.
* Drop the old, now unused function print_paragraph().
* Minor adjustments in the top-level man(7) node formatter for symmetry.
* Bugfix: .Ss heads suspend no-fill mode, even though .Ss doesn't end it.
* Bugfix: give up on .Op semantic markup for now, see the comment.
|
|
|
|
|
|
|
|
|
|
|
| |
choice, which is the <p> HTML element. On top of the previous
fill-mode improvements, the key to making this possible is to
automatically close the <p> when required: before headers, subsequent
paragraphs, lists, indented blocks, synopsis blocks, tbl(7) blocks,
and before blocks using no-fill mode.
In man(7) documents, represent the .sp request by a blank line in
no-fill mode and in the same way as .PP in fill mode.
|
|
|
|
|
|
|
|
|
|
| |
use it in the man(7) HTML formatter rather than keeping fill mode
state locally, resulting in massive simplification (minus 40 LOC).
Move the html_fillmode() state handler function to the html.c module
such that both the man(7) and the roff(7) formatter (and in the future,
also the mdoc(7) formatter) can use it. Give it a query mode, to be
invoked with TOKEN_NONE.
|
| |
|
|
|
|
|
|
|
|
|
| |
Unify handling of \f and .ft.
Support \f4 (bold+italic).
Support ".ft BI" and ".ft CW" for terminal output.
Support the .ft request in HTML output.
Reject the bogus fonts \f(C1, \f(C2, \f(C3, and \f(CP.
In regress.pl, only strip leading whitespace in math mode.
|
|
|
|
|
|
|
|
|
| |
for HTML output. Somewhat relevant because pod2man(1) relies on this.
Missing feature reported by Pali dot Rohar at gmail dot com.
Note that constant width font was already correctly selected before
this when required by semantic markup. Only attempting physical
markup with the low-level escape sequence was ineffective.
|
|
|
|
|
| |
the top of HTML pages containing at least two non-standard sections.
Suggested by Adam Kalisz and discussed with kristaps@ during EuroBSDCon 2018.
|
|
|
|
|
|
| |
selecting the format according to local existence of the file.
Suggested by kristaps@ during EuroBSDCon 2018.
Written on the train Frankfurt-Karlsruhe returning from EuroBSDCon.
|
|
|
|
| |
now that we no longer use variable style= attributes.
|
|
|
|
|
|
| |
author-specified column widths, which can harm responsive design and
provide no real benefit: HTML rendering engines usually do just
fine automatically selecting appropriate column widths.
|
|
|
|
|
| |
Append suffixes for disambiguation. Issue first reported by Jakub
Klinkovsky <j dot l dot k at gmx dot com> (Arch Linux).
|
|
|
|
|
|
|
|
|
| |
Some macros (Nd, Oo) can contain blocks but rendered as elements that
can only contain phrasing content, resulting in invalid HTML nesting.
Switch them to <div>.
Also move the related "display: inline" style from the HTML to the CSS.
Reminded during a conversation with John Gardner.
|
|
|
|
|
| |
in full HTML output, but not with -Ofragment, e.g. in man.cgi(8);
suggested by Thomas Klausner <wiz at NetBSD>
|
|
|
|
|
|
| |
of struct roff_node which is allocated for each equation anyway.
2. Do not keep a list of equation parsers, one parser is enough.
Minus fifty lines of code, no functional change.
|
|
|
|
|
| |
and write fontstyle or fontweight attributes where required.
Missing features reported by bentley@.
|
|
|
|
|
|
|
| |
used by both the mdoc and man formatters, with the ultimate
goal of reducing code duplication between the two macro formatters.
Made possible by the parser unification.
Add the first formatting function (for the .br request).
|
|
|
|
|
| |
As the man(7) language does not provide semantic markup,
only .SH, .SS, and .UR become anchors for now.
|
|
|
|
| |
suggested by bentley@ long ago, but needed lots of cleanup first
|
| |
|
|
|
|
|
|
|
|
| |
The <col> element can only appear inside <colgroup>, so use <colgroup>.
The <tbody> element is optional and useless, so don't use it.
Even if we would ever need <thead> or <tfoot>, <tbody> would still be
optional and useless; besides, we will likely never need <thead> or <tfoot>,
simply because our languages don't support such functionality.
|
|
|
|
| |
no functional change
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
with "literal", by the way, it means "no fill"):
* Use <pre> such that whitespace is preserved.
* Preserve lines breaks.
* For font alternating macros, avoid node recursion which required
scary juggling with the fill state. Instead, simply print the text
children directly.
Missing feature first noticed by kristaps@ in 2011,
the again reported by afresh1@ in 2016,
and finally reported here: https://github.com/Debian/debiman/issues/21 ,
which i only found because of Shane Kerr's comment here:
https://plus.google.com/110314300533310775053/posts/H1eaw9Yskoc
|
|
|
|
| |
in particular, stop abuse of <blockquote>
|
|
|
|
|
|
|
|
|
| |
in filled text. This does not affect HTML semantics, but makes the
HTML code even more humanly readable.
While here,
- collapse multiple consecutive space characters in filled text
- and insert a blank between style entries.
|
|
|
|
|
| |
around tags and by introducing some simple indentation.
No change of HTML semantics intended.
|
|
|
|
|
|
|
|
|
| |
interfaces. Such a static buffer was a bad idea in the first place,
causing unfixable truncation that was only prevented by triggering
an assertion failure. Instead, let the small number of remaining
users allocate and free their own, temporary dynamic buffers,
or for the case of .Xr and .In, pass the original data to be
assembled in print_otag().
|
|
|
|
|
|
|
|
|
|
| |
number of arguments.
Delete struct htmlpair and all the PAIR_*() macros.
Delete enum htmlattr, handle that in print_otag() instead.
Minus 190 lines of code; no functional change except better ordering
of attributes (class before style) in three cases.
|
|
|
|
|
|
|
| |
Triggered by a smaller patch from Christos Zoulas.
While here, unify style, move several config tests to config.h,
and delete the useless MANDOC_CONFIG_H.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Because these work slightly differently on different systems,
they are becoming a maintenance burden in the portable version,
so delete them.
Besides, one of the chief design goals of the mandoc toolbox is to
make sure that nothing related to documentation requires C++.
Consequently, linking mandoc against any kind of C++ program would
defeat the purpose and is not supported.
I don't understand why kristaps@ added them in the first place.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Use ohash(3) rather than a hand-rolled hash table.
* Make the character table static in the chars.c module:
There is no need to pass a pointer around, we most certainly
never want to use two different character tables concurrently.
* No need to keep the characters in a separate file chars.in;
that merely encourages downstream porters to mess with them.
* Sort the characters to agree with the mandoc_chars(7) manual page.
* Specify Unicode codepoints in hex, not decimal (that's the detail
that originally triggered this patch).
No functional change, minus 100 LOC, and i don't see a performance change.
|
|
|
|
|
| |
In particular, make it work in no-fill mode, too.
Reminded by Carsten dot Kunze at arcor dot de (Heirloom roff).
|
|
|
|
|
|
| |
* add missing forward declarations
* remove needless header inclusions
* some style unification
|
| |
|
|
|
|
|
|
|
|
| |
validity of character escape names and warn about unknown ones.
This requires mchars_spec2cp() to report unknown names again.
Fortunately, that doesn't require changing the calling code because
according to groff, invalid character escapes should not produce
output anyway, and now that we warn about them, that's fine.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This adds parser-level support for the grammar described by the eqn
second-edition technical paper, "Typesetting Mathematics — User's Guide"
(Kernighan, Cherry).
The reason for this re-write is the grouping rules, which were not
possible given the existing implementation.
The re-write has also considerably simplified the HTML (and, if it ever
is completed, terminal) front-end.
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
| |
This is good because <p> is brittle: it can't appear within other block
macros.
This fixes a regression of the original HTML5 patch as noted by schwarze@
on the tech@ list, 14/8/2014.
|
| |
|
| |
|
| |
|