| Commit message | Author | Age | Files | Lines |
|
* Add the missing special character \_ (underscore).
* Partial implementations of \a (leader character)
and \E (uninterpreted escape character).
* Parse and ignore \r (reverse line feed).
* Add a WARNING message about undefined escape sequences.
* Add an UNSUPP message about unsupported escape sequences.
* Mark \! and \? (transparent throughput)
and \O (suppress output) as unsupported.
* Treat the various variants of zero-width spaces as one-byte escape
sequences rather than as special characters, to avoid defining bogus
forms with square brackets.
* For special characters with one-byte names, do not define bogus
forms with square brackets, except for \[-], which is valid.
* In the form with square brackets, undefined special characters do not
fall back to printing the name verbatim, not even for one-byte names.
* Starting a special character name with a blank is an error.
* Undefined escape sequences never abort formatting of the input
string, not even in HTML output mode.
* Document the newly handled escapes, and a few that were missing.
* Regression tests for most of the above.
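The escape handling listed above can be sketched as a small classifier. This is only an illustration: the names esc_class, esc_classify() and esc_msglevel() are hypothetical and not taken from the source tree, and only the escapes named in this message are covered.

```c
#include <stddef.h>

/* Hypothetical classes; the names are illustrative only. */
enum esc_class {
	ESC_HANDLED,	/* formatted, possibly only partially */
	ESC_IGNORED,	/* parsed and discarded */
	ESC_UNSUPP,	/* known escape, but not supported */
	ESC_UNDEF	/* not listed in this sketch */
};

/*
 * Classify the byte following the backslash.
 * Everything this sketch does not list is treated as undefined;
 * a complete table would of course know many more escapes.
 */
enum esc_class
esc_classify(char c)
{
	switch (c) {
	case '_':	/* underscore special character */
	case 'a':	/* leader character, partial */
	case 'E':	/* uninterpreted escape character, partial */
		return ESC_HANDLED;
	case 'r':	/* reverse line feed */
		return ESC_IGNORED;
	case '!':	/* transparent throughput */
	case '?':
	case 'O':	/* suppress output */
		return ESC_UNSUPP;
	default:
		return ESC_UNDEF;
	}
}

/* Message level to report for a class, or NULL for none. */
const char *
esc_msglevel(enum esc_class ec)
{
	switch (ec) {
	case ESC_UNSUPP:
		return "UNSUPP";
	case ESC_UNDEF:
		return "WARNING";
	default:
		return NULL;
	}
}
```

Neither message level stops processing in this sketch, in line with the rule that undefined escape sequences never abort formatting.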
|
for HTML output. Somewhat relevant because pod2man(1) relies on this.
Missing feature reported by Pali dot Rohar at gmail dot com.
Note that constant width font was already correctly selected before
this when required by semantic markup. Only attempting physical
markup with the low-level escape sequence was ineffective.
|
by allowing the preprocessor to pass it through to the formatters.
Used for example by the groff_char(7) manual page.
|
patch from florian@, found with clang
|
used for example by zoem(1)
|
in horizontal orientation in the terminal formatter
|
inside individual table cells that contain text blocks.
This cures overlong lines in various Xenocara manuals.
|
a pointer to the end of the parsed data, making it easier to
parse subsequent bytes
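A minimal sketch of the end-pointer pattern, in the spirit of strtol(3). The helper parse_uint() is hypothetical and not taken from the source tree; it does no overflow checking.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: parse an unsigned decimal number at src,
 * store it in *res, and return a pointer to the first byte
 * that was not consumed.  Return NULL if there are no digits.
 */
const char *
parse_uint(const char *src, unsigned int *res)
{
	const char	*cp;

	assert(src != NULL && res != NULL);
	if (!isdigit((unsigned char)*src))
		return NULL;
	*res = 0;
	for (cp = src; isdigit((unsigned char)*cp); cp++)
		*res = *res * 10 + (unsigned int)(*cp - '0');
	return cp;
}
```

The caller can keep parsing at the returned pointer, for example to read a unit suffix, without recomputing how many bytes were consumed.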
|
second step: make the per-column byte pointer persistent across
term_flushln() calls, such that a subsequent call can continue at
the point where the previous call left off. If more than one column
is in use, return from term_flushln() when the column is full,
rather than breaking the output line.
No functional change, because nothing sets up multiple columns yet.
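A rough sketch of the idea, assuming a hypothetical struct col_state and a col_flush() function that are not the actual terminal formatter code. It counts plain bytes and ignores visual width, escapes, and padding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical column state; field names are illustrative. */
struct col_state {
	const char	*buf;	/* bytes queued for this column */
	size_t		 len;	/* number of bytes in buf */
	size_t		 pos;	/* persistent: next byte to write */
	size_t		 width;	/* maximum bytes per output line */
};

/*
 * Copy at most col->width bytes from the column into out and
 * NUL-terminate.  Return the number of bytes written.
 * Because col->pos survives the call, the next call continues
 * where this one stopped instead of starting over.
 */
size_t
col_flush(struct col_state *col, char *out, size_t outsz)
{
	size_t	 n;

	assert(col != NULL && out != NULL && outsz > 0);
	n = col->len - col->pos;
	if (n > col->width)
		n = col->width;
	if (n > outsz - 1)
		n = outsz - 1;
	memcpy(out, col->buf + col->pos, n);
	out[n] = '\0';
	col->pos += n;
	return n;
}
```

With several such structs in an array, a caller could alternate between columns and emit one line fragment from each per output line.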
|
first step: split column data out of the terminal state struct into
a new column state struct and use an array of such column state
structs. No functional change.
|
and after that, previously written output gets overwritten, but
overwriting with blanks does *not* erase previously written content.
Yes, manual pages exist that are crazy enough to rely on that...
|
The Tcl/Tk manual pages use this extensively.
Delete the TERM_MAXMARGIN hack, which breaks .mc inside .nf;
instead, implement a proper TERMP_BRNEVER flag.
|
Eliminate the "overstep" state variable.
The information is already contained in "viscol".
Minus 60 lines of code, no functional change intended.
|
A full implementation would require access to output device properties
and state variables (both only available after the main parser has
finalized the parse tree) before numerical expansions in the roff
preprocessor (i.e., before the main parser is even started).
Not trying to pull that stunt right now because the static-width
implementation committed here is sufficient for tcl-style manual pages
and already more complicated than i would have suspected.
|
Good enough to cope with the average DocBook insanity.
|
This is the first feature made possible by the parser reorganization.
Improves the formatting of the SYNOPSIS in many Xenocara GL manuals.
Also important for ports, as reported by many, including naddy@.
|
Reported by jsg@ after an afl(1) run long ago.
|
sequences that jsg@ found with afl(1):
* Avoid writing \t\b in term.c.
* Handle trailing \b in term_ps.c.
|
where only sizeof(enum termfont) is needed.
Fixes CID 1288941. From christos@ via wiz@, both at NetBSD.
|
fixing input like \fB\('e; issue reported by bentley@
|
* Use ohash(3) rather than a hand-rolled hash table.
* Make the character table static in the chars.c module:
There is no need to pass a pointer around, we most certainly
never want to use two different character tables concurrently.
* No need to keep the characters in a separate file chars.in;
that merely encourages downstream porters to mess with them.
* Sort the characters to agree with the mandoc_chars(7) manual page.
* Specify Unicode codepoints in hex, not decimal (that's the detail
that originally triggered this patch).
No functional change, minus 100 LOC, and i don't see a performance change.
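A small sketch of a file-local character table with code points written in hex. The struct layout, the four sample entries, and the function chars_lookup() are illustrative assumptions. A linear scan stands in for ohash(3) to keep the example portable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct chars_entry {
	const char	*name;		/* roff special character name */
	const char	*ascii;		/* ASCII approximation */
	int		 unicode;	/* code point, in hex */
};

/* File-local table: no pointer to it is ever passed around. */
static const struct chars_entry chars_table[] = {
	{ "em",	"--",	0x2014 },
	{ "hy",	"-",	0x2010 },
	{ "lq",	"\"",	0x201c },
	{ "rq",	"\"",	0x201d },
};

/*
 * Look up a name of the given length; return the code point or -1.
 * A linear scan keeps the sketch short; the commit uses ohash(3).
 */
int
chars_lookup(const char *name, size_t sz)
{
	size_t	 i;

	assert(name != NULL);
	for (i = 0; i < sizeof(chars_table) / sizeof(chars_table[0]); i++)
		if (strlen(chars_table[i].name) == sz &&
		    memcmp(chars_table[i].name, name, sz) == 0)
			return chars_table[i].unicode;
	return -1;
}
```

Hex constants can be compared directly against the U+XXXX notation in Unicode charts, which is harder with decimal values.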
|
that were right between two adjacent case statements. Keep only
those 24 where the first case actually executes some code before
falling through to the next case.
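The distinction can be illustrated with a made-up function. classify() and its return values are purely hypothetical and only serve to show where a comment still earns its place.

```c
/*
 * Hypothetical example: return a score for an input byte.
 * The values have no meaning beyond showing the control flow.
 */
int
classify(int c)
{
	int	 score;

	score = 0;
	switch (c) {
	case ' ':	/* adjacent labels: no comment needed */
	case '\t':
		score = 1;
		break;
	case '\n':
		score = 10;	/* code runs before falling through */
		/* FALLTHROUGH */
	case '\r':
		score += 2;
		break;
	default:
		break;
	}
	return score;
}
```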
|
in mdoc(7) .Bl -tag and man(7) .TP, but not in man(7) .IP.
Quirk reported by Jan Stary <hans at stare dot cz> on ports@.
|
escape sequences; that's cleaner for all output modes, and it's required
to prevent the PostScript/PDF formatter from dying on assertions.
Bug found by jsg@ with afl.
|
implementation. As a side effect, minus ten lines of code.
As another side effect, this also fixes the assertion failure that
used to be triggered by "\z\o'ab'c" at the beginning of an output
line, found by jsg@ with afl (test case 022/Apr27).
|
There is a first rounding to basic units on the input side.
After that, rounding rules differ between requests and macros.
Requests round to the nearest possible character position.
Macros round to the next character position to the left.
Implement that by changing the return value of term_hspan()
to basic units and leaving the second scaling and rounding stage
to the formatters instead of doing it in the terminal handler.
Improves for example argtable2(3).
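The two rounding rules for the second stage can be sketched as follows. The constant of 24 basic units per character cell, the function names, and the handling of exact ties (rounded up here) are assumptions of this sketch. Only non-negative widths are considered.

```c
#include <assert.h>

/* Assumption for this sketch: 24 basic units per character cell. */
#define	BU_PER_CHAR	24

/* Requests: round to the nearest character position. */
int
bu2chars_request(int bu)
{
	assert(bu >= 0);
	return (bu + BU_PER_CHAR / 2) / BU_PER_CHAR;
}

/* Macros: round to the next character position to the left. */
int
bu2chars_macro(int bu)
{
	assert(bu >= 0);
	return bu / BU_PER_CHAR;
}
```

For a width of 47 basic units, the request variant yields 2 character positions while the macro variant yields 1.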
|
Replace struct mdoc_meta and struct man_meta by a unified struct roff_meta.
Written on the train from London to Exeter on the way to p2k15.
|
font stack. The latter fail after the stack is grown with realloc().
Fixing an assertion failure found by jsg@ with afl some time ago
(test case number 51).
|
This is of some relevance because the pod2man(1) preamble abuses it
for the Icelandic letter Thorn, instead of simply using \(TP and \(Tp.
Missing feature found by sthen@ in DateTime::Locale::is_IS(3p).
|
Not exactly recommended for use, rather for groff compatibility.
While here, introduce similar SHRT_MAX limits as in man(7),
fixing a few cases of infinite output found by jsg@ with afl.
|
indentations or paragraph distances, large output may be generated,
which is practically the same as an endless loop; found by jsg@
with afl.
Reject such unreasonably large numbers beyond arbitrary limits
similar to those used by groff (max. 65 blank lines between paragraphs
and max. SHRT_MAX characters per output line) and fall back to
defaults when exceeded. Having the limits behave in exactly the
same way is not relevant.
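A minimal sketch of the limit check. The two limits come from the text above, while the function names and the fallback defaults (1 blank line, 78 characters) are assumptions made for the example.

```c
#include <limits.h>

#define	MAX_BLANK_LINES	65	/* limit named in the commit message */

/* Return val unless it exceeds max; in that case, return dflt. */
long
limit_or_default(long val, long max, long dflt)
{
	return val > max ? dflt : val;
}

/* Paragraph distance in blank lines; the default 1 is an assumption. */
long
paragraph_distance(long requested)
{
	return limit_or_default(requested, MAX_BLANK_LINES, 1);
}

/* Output line width in characters; the default 78 is an assumption. */
long
line_width(long requested)
{
	return limit_or_default(requested, SHRT_MAX, 78);
}
```

Falling back to a default instead of clamping to the limit means that an absurd request still produces output of ordinary size.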
|
minus twenty lines of code in spite of enhanced functionality
|
Basic units, centimeters, points, ens, ems, and the rounding algorithm
were all wrong; only inches, picas, and the default vertical span worked.
|
by calling assert() when valid user input exceeds it is a bad idea.
Allocate the terminal font stack dynamically instead of crashing
above 10 entries. Issue found by jsg@ with afl.
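A sketch of a font stack that grows on demand. The struct fontstack, its functions, the initial size of 4, and the enum values are assumptions for illustration. The current font is tracked as an index, which stays valid when realloc(3) moves the array. Multiplication overflow in the size computation is not checked here.

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative values; the real enum may differ. */
enum termfont { TERMFONT_NONE, TERMFONT_BOLD, TERMFONT_UNDER };

struct fontstack {
	enum termfont	*fontq;	/* heap-allocated stack */
	size_t		 sz;	/* number of allocated entries */
	size_t		 top;	/* index of the current font */
};

/* Set up a stack holding only the default font; -1 on failure. */
int
fontstack_init(struct fontstack *fs)
{
	assert(fs != NULL);
	fs->sz = 4;
	fs->fontq = malloc(fs->sz * sizeof(*fs->fontq));
	if (fs->fontq == NULL)
		return -1;
	fs->top = 0;
	fs->fontq[0] = TERMFONT_NONE;
	return 0;
}

/* Push a font, doubling the allocation when full; -1 on failure. */
int
fontstack_push(struct fontstack *fs, enum termfont f)
{
	enum termfont	*nq;

	if (fs->top + 1 == fs->sz) {
		nq = realloc(fs->fontq, 2 * fs->sz * sizeof(*nq));
		if (nq == NULL)
			return -1;
		fs->fontq = nq;
		fs->sz *= 2;
	}
	fs->fontq[++fs->top] = f;
	return 0;
}

/* Pop one font; the bottom entry is never removed. */
void
fontstack_pop(struct fontstack *fs)
{
	if (fs->top > 0)
		fs->top--;
}

/* Return the font currently in effect. */
enum termfont
fontstack_cur(const struct fontstack *fs)
{
	return fs->fontq[fs->top];
}
```

Deeply nested font requests then cost memory rather than triggering an assertion.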
|
In particular, make it work in no-fill mode, too.
Reminded by Carsten dot Kunze at arcor dot de (Heirloom roff).
|
output handler because the high level terminal formatters could be
tricked into setting the left margin further to the right than the
right margin. Today, jsg@ found more of these with afl.
Change the internal interface between both levels, aiming for
simplicity and robustness of the code. Treat both margins as
*independent* settings: Now, termp.offset is the requested left
margin, and termp.rmargin is the available space. Let the lower
level cope with the case of insufficient space.
Obviously, high level code that does centering or flush right
still has to do careful checks, so i did a full audit of margin
settings in the terminal formatters.
Fixes crashes caused by excessively long title or date strings in
the man(7) footer, operating system or date strings in the mdoc(7)
footer, volume strings in the man(7) or mdoc(7) header, and a few
cases related to some non-prologue macros.
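One way the lower level might cope with independent margins, sketched under the assumption that both values are column positions counted from the left edge. The helper line_room() is hypothetical.

```c
#include <stddef.h>

/*
 * Hypothetical helper: number of columns available between the
 * requested left margin and the right margin.  If the left margin
 * is at or beyond the right margin, report zero room rather than
 * letting the unsigned subtraction wrap around.
 */
size_t
line_room(size_t offset, size_t rmargin)
{
	return rmargin > offset ? rmargin - offset : 0;
}
```

With unsigned arithmetic, an unchecked rmargin - offset would wrap around to a huge value, so the comparison has to come before the subtraction.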
|
the `vbl' variable includes the left margin, but `vis' does not.
Prevent a `vis' underflow that caused a bogus blank line.
Bug reported by Carsten Kunze, found in less(1): .Bl -tag ... .It " "
|
escape sequences just like it was earlier implemented for -Thtml.
Do not let control characters other than ASCII 9 (horizontal tab)
propagate to the output: groff allows them, but that
really doesn't look like a great idea.
Let mchars_num2char() return int such that we can distinguish invalid \N
syntax from \N'0'. This also reduces the danger of signed char issues
popping up.
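Why an int return value helps can be sketched as follows. The function num2char_sketch() is a hypothetical stand-in, not the real mchars_num2char(), and rejecting values above 255 is an assumption of this example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in: parse the sz decimal digits at p.
 * Return the character number, or -1 for invalid syntax.
 * With an int return type, -1 cannot be confused with a valid
 * result of 0, and no value is squeezed through a signed char.
 */
int
num2char_sketch(const char *p, size_t sz)
{
	size_t	 i;
	int	 val;

	assert(p != NULL);
	if (sz == 0)
		return -1;
	val = 0;
	for (i = 0; i < sz; i++) {
		if (p[i] < '0' || p[i] > '9')
			return -1;
		val = val * 10 + (p[i] - '0');
		if (val > 255)
			return -1;
	}
	return val;
}
```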
|
representation, not for character escapes with unknown names.
According to groff, the latter produce no output, and we now warn
about them.
|
validity of character escape names and warn about unknown ones.
This requires mchars_spec2cp() to report unknown names again.
Fortunately, that doesn't require changing the calling code because
according to groff, invalid character escapes should not produce
output anyway, and now that we warn about them, that's fine.
|
In UTF-8 output, do not print anything if mchars_spec2cp() returns 0.
In particular, this repairs handling of zero-width spaces (\&).
While here, let mchars_spec2cp() return 0xFFFD instead of -1
if the character is not found, simplifying the using code.
In HTML output, do not print obfuscated ASCII characters and
do not test for one-char escapes, mchars_spec2cp() already does that.
|
code points, provide ASCII approximations. This is already much better
than what groff does, which prints nothing for most code points.
A few minor fixes while here:
* Handle Unicode escape sequences in the ASCII range.
* In case of errors, use the REPLACEMENT CHARACTER U+FFFD for -Tutf8
and the string "<?>" for -Tascii output.
* Handle all one-character escape sequences in mchars_spec2{cp,str}()
and remove the workarounds on the higher level.
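The -Tutf8 side of the error handling can be sketched with a plain UTF-8 encoder that substitutes U+FFFD for invalid input. The function utf8_encode() is a hypothetical example, not the formatter's own code. An ASCII counterpart would simply emit the string "<?>" instead.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Encode cp as UTF-8 into buf, which must hold at least 5 bytes,
 * and NUL-terminate.  Negative values, surrogates, and values
 * above U+10FFFF are replaced by U+FFFD REPLACEMENT CHARACTER.
 * Return the number of bytes written, not counting the NUL.
 */
size_t
utf8_encode(int cp, unsigned char *buf)
{
	size_t	 n;

	assert(buf != NULL);
	if (cp < 0 || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
		cp = 0xFFFD;
	if (cp < 0x80) {
		buf[0] = (unsigned char)cp;
		n = 1;
	} else if (cp < 0x800) {
		buf[0] = (unsigned char)(0xC0 | (cp >> 6));
		buf[1] = (unsigned char)(0x80 | (cp & 0x3F));
		n = 2;
	} else if (cp < 0x10000) {
		buf[0] = (unsigned char)(0xE0 | (cp >> 12));
		buf[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		buf[2] = (unsigned char)(0x80 | (cp & 0x3F));
		n = 3;
	} else {
		buf[0] = (unsigned char)(0xF0 | (cp >> 18));
		buf[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
		buf[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		buf[3] = (unsigned char)(0x80 | (cp & 0x3F));
		n = 4;
	}
	buf[n] = '\0';
	return n;
}
```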
|
This happens in specific conditions (trailing whitespace in certain
terminal modes), but in practice, it happens quite often (as reported by
valgrind).
In short, "Nothing about term_flushln() is simple. Srsly!" (schwarze@)
Discussed on tech@, ok schwarze@.
|