| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
| |
escape sequences just like it was earlier implemented for -Thtml.
Do not let control characters other than ASCII 9 (horizontal tab)
propagate to the output, even though groff allows them; but that
really doesn't look like a great idea.
Let mchars_num2char() return int such that we can distinguish invalid \N
syntax from \N'0'. This also reduces the danger of signed char issues
popping up.
|
|
|
|
|
|
| |
representation, not for character escapes with unknown names.
According to groff, the latter produce no output, and we now warn
about them.
|
|
|
|
|
|
|
|
| |
validity of character escape names and warn about unknown ones.
This requires mchars_spec2cp() to report unknown names again.
Fortunately, that doesn't require changing the calling code because
according to groff, invalid character escapes should not produce
output anyway, and now that we warn about them, that's fine.
|
|
|
|
|
|
|
|
|
|
| |
In UTF-8 output, do not print anything if mchars_spec2cp() returns 0.
In particular, this repairs handling of zero-width spaces (\&).
While here, let mchars_spec2cp() return 0xFFFD instead of -1
if the character is not found, simplifying the using code.
In HTML output, do not print obfuscated ASCII characters and
do not test for one-char escapes, mchars_spec2cp() already does that.
|
|
|
|
|
|
|
|
|
|
|
|
| |
code points, provide ASCII approximations. This is already much better
than what groff does, which prints nothing for most code points.
A few minor fixes while here:
* Handle Unicode escape sequences in the ASCII range.
* In case of errors, use the REPLACEMENT CHARACTER U+FFFD for -Tutf8
and the string "<?>" for -Tascii output.
* Handle all one-character escape sequences in mchars_spec2{cp,str}()
and remove the workarounds on the higher level.
|
|
|
|
|
|
|
|
| |
This happens in specific conditions (trailing whitespace in certain
terminal modes), but in practise, it happens quite often (as reported by
valgrind).
In short, "Nothing about term_flushln() is simple. Srsly!" (schwarze@)
Discussed on tech@, ok schwarze@.
|
|
|
|
|
|
| |
Include <sys/types.h> where needed, it does not belong in config.h.
Remove <stdio.h> from config.h; if it is missing somewhere, it should
be added, but i cannot find a *.c file where it is missing.
|
|
|
|
|
|
|
|
|
| |
properly round to the nearest M (=0.001m), which is the smallest
available unit.
This avoids weirdness like (size_t)(0.6 * 10.0) == 5
by instead calculating (size_t)(0.6 * 10.0 + 0.0005) == 6,
and so it fixes the indentation of the readline(3) manual.
|
|
|
|
|
|
| |
Write double constants as double rather than integer literals.
Remove useless explicit (double) cast done at one place and nowhere else.
No functional change.
|
|
|
|
| |
do not throw away the rest of the string to be rendered.
|
|
|
|
|
|
|
| |
* Change eight reallocs to reallocarray to be safe from overflows.
* Change one malloc to reallocarray to be safe from overflows.
* Change one calloc to reallocarray, no zeroing needed.
* Change the order of arguments of three callocs (aesthetical).
|
|
|
|
|
| |
remove trailing whitespace and blanks before tabs, improve some indenting;
no functional change
|
|
|
|
|
|
|
|
|
|
| |
to control indentation of continuation lines in TERMP_NOBREAK mode.
In the past, this was always on; continue using it
for .Bl, .Nm, .Fn, .Fo, and .HP, but no longer for .IP and .TP.
I looked at this because sthen@ reported the issue in a manual
of a Perl module from ports, but it affects base, too: This patch
reduces groff-mandoc differences in base by more than 15%.
|
|
|
|
| |
when rendering .ll (line length) requests. oops.
|
| |
|
|
|
|
|
|
|
| |
functions used for multiple languages (mdoc, man, roff), for example
mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
functions. Split the auxiliaries out into their own file and header.
While here, do some #include cleanup.
|
|
|
|
|
|
|
| |
length even when they are breakable. Before this, a line containing N
breakable hyphens could get up to N characters wider than the right margin
in -Tutf8 output mode.
Issue reported by tedu@ on <bugs at OpenBSD>.
|
|
|
|
|
|
|
| |
documented in the Ossanna-Kernighan-Ritter troff manual
and also supported by groff.
Missing feature reported by Steffen Nurpmeso <sdaoden at gmail dot com>.
|
|
|
|
|
| |
and remove pointless local variables;
found in a clang output from Ulrich Spoerlein <uqs at FreeBSD>
|
|
|
|
|
| |
Following an idea from Franco Fichtner, but implemented more cleanly.
This reduces groff-mandoc-differences in OpenBSD base by a fantastic 7.5%.
|
|
|
|
|
| |
and the empty callback termp_igndelim_pre().
Sort the remaining termp flags.
|
|
|
|
|
|
| |
hanging indentation for .Fn in SYNOPSIS mode,
exploiting the new trailspace feature
by deliberately *NOT* using it.
|
|
|
|
|
|
|
|
|
|
|
| |
The TERMP_TWOSPACE flag i introduced in August 2009 was idiosyncratic
and served only a very narrow purpose. Replace it by a more intuitive
and more general termp attribute "trailspace", to be used together
with TERMP_NOBREAK, to request a minimum amount of whitespace at
the end of the current column. Adapt all code to the new interface.
No functional change intended;
code reviews to confirm that are welcome *eg*.
|
|
|
|
|
|
|
|
|
| |
from int to size_t, to match some existing ones (offset, *rmargin, viscol).
Move some related local variables from int to size_t as well.
Needed as a preparation to make a generalized adjbuf() function available
beyond the file term.c, i.e. in mandoc.c.
Also saves a couple of ugly casts.
|
|
|
|
|
| |
This improves the formatting of about 40 base manuals
and reduces groff-mandoc formatting differences in base by about 5%.
|
|
|
|
|
|
|
|
|
|
|
| |
against vend, causing a premature line break. Fix that bug by reverting
revision 1.93 which Kristaps committed four years ago. Kristaps patch is no
longer needed because the code below /* Write out the [remaining] word. */
now handles leading blanks correctly, probably already for a long time.
This avoids premature line breaks in about a dozen base system manuals,
for example as(1) and gdb(1), and alignment issues in another twenty,
for example mount(2), ip6(4), pfctl(8), and crypto(9).
|
|
|
|
|
|
|
| |
any text that follows must be kept on the same line.
I already found the issue and wrote the patch in April 2011,
but didn't come round to do proper testing and forgot about it.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The basic idea is to already pop the font at the end marker
instead of allowing it to linger until the final end of the block.
This requires a few preliminaries:
* For each block, save a pointer to the previous font
to be used in case the block breaks another and gets extended.
* That requires making node information writable during rendering.
* Now fonts may get popped in the wrong order; hence, after the stack
has already been rewound further by some block that began earlier,
ignore popping a font that was put on the stack later.
* To be able to exploit all this for font blocks, tie processing
to their body, not their block, which is more logical anyway.
Triggered by florian@ reporting vaguely similar issues with list blocks.
|
|
|
|
|
| |
at the position of a literal tab, the tab indents the following line.
Fixes the perl(1) SYNOPSIS; reminded by deraadt@; OpenBSD rev. 1.66.
|
|
|
|
|
|
|
|
| |
When the width of a tag in .Bl -hang was exactly one character
shorter than the maximum length that would fit, the following text
would have a negative hang of one character (i.e., hang to the left).
That bug is no longer present in groff-1.21, so relax mandoc, too.
OpenBSD rev. 1.65
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
character without advancing the cursor position; implement it to
simply skip the next character, as it will usually be overwritten.
With this change, the pod2man(1) preamble user-defined string \*:,
intended to render as a diaeresis or umlaut diacritic above the
preceding character, is rendered in a slightly less ugly way,
though still not correctly. It was rendered as "z.." and is now
rendered as ".".
Given that the definition of \*: uses elaborate manual \h positioning,
there is little chance for mandoc(1) to ever render it correctly,
but at least we can refrain from printing out a spurious "z", and
we can make the \z do something semi-reasonable for easier cases.
"just commit" kristaps@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Groff forces the document author to manually request sufficient spacing
after .TE - that is, at least .sp 1v after a table with the "box" option
and at least .sp 2v after a table with the "doublebox" option - or else
it clobbers the box. I consider that insane, so i'm not imitating groff
in that respect. Instead, i add at least as much vertical space as groff,
or more where required to avoid clobbering the box.
Consequently, output will be identical for input that looks sane with
groff, and mandoc will make output look better for input that looks bad
with groff.
"Please check them in and I'll look into them later!" kristaps@
|
|
|
|
|
|
|
| |
forget about pending whitespace (vbl), or the next line would
be misaligned and potentially too long; but i'm fixing this
in a simpler way than he proposed.
Also remove the kludges in .HP that compensated for this bug.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In columnated contexts (.Bl -column, .Bl -tag, .IP, .TP, .HP etc.), do not
pad after writing a column. Instead, always pad before writing content.
In itself, this change avoids:
- writing trailing whitespace in some situations
- with .fi/.nf in .HP, breaking lines that were already padded
It allows several bugfixes included in this patch:
- Do not count backspace as a character with positive width.
- Set up proper indentation when encountering .fi/.nf in .HP.
- Adjust the .HP indentation width to what groff does.
- Never unlimit the right margin unless in the final column.
ok kristaps@
|
|
|
|
|
| |
even a breakable hyphen may be bold or underlined
ok kristaps@
|
|
|
|
| |
found while syncing to OpenBSD
|
|
|
|
| |
the libroff point. This clears up a nice chunk of code.
|
|
|
|
|
| |
special characters, if possible. This is broken into a separate switch
statement for clarity.
|
|
|
|
|
| |
This makes sequences of \f[unknown] \fP not completely puke. From a
TODO by schwarze@.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
it looks pretty good. Basically, the -Tlocale option propogates into
term_ascii.c, where we set locale-specific console call-backs IFF (1)
setlocale() works; (2) locale support is compiled in (see Makefile for
-DUSE_WCHAR); (3) the internal structure of wchar_t maps directly to
Unicode codepoints as defined by __STDC_ISO_10646__; and (4) the console
supports multi-byte characters.
To date, this configuration only supports GNU/Linux. OpenBSD doesn't
export __STDC_ISO_10646__ although I'm told by stsp@openbsd.org that it
should (it has the correct map). Apparently FreeBSD is the same way.
NetBSD? Don't know. Apple also supports this, but doesn't define the
macro. Special-casing!
Benchmark: -Tlocale incurs less than 0.2 factor overhead when run
through several thousand manuals when UTF8 output is enabled. Native
mode (whether directly -Tascii or through no locale or whatever) is
UNCHANGED: the function callbacks are the same as before.
Note. If the underlying system does NOT support STDC_ISO_10646, there
is a "slow" version possible with iconv or other means of flipping from
a Unicode codepoint to a wchar_t.
|
|
|
|
|
|
|
| |
like -Tascii. While adding this, inline term_alloc() (was a one-liner),
remove some switches around the terminal encoding for the symbol table
(unnecessary), and split out ascii_alloc() into ascii_init(), which is
also called from locale_init().
|
|
|
|
|
|
| |
The reasoning behind printing SOMETHING at a Unicode codepoint is
because the input is not "wrong" (we suppress printing of "wrong"
things). It's just that ASCII can't handle it.
|
|
|
|
|
|
| |
only used once and simply bloated the binary. Also fix mchars_num2char
to correctly render the character instead of using atoi(). This makes
the conversation more strict, but it's more correct.
|
|
|
|
| |
(oops). Do the same for -Thtml (oops^2).
|
| |
|
|
|
|
| |
indexing into arrays, so this removes lots of casts from size_t to int.
|
|
|
|
|
|
| |
consist of type "int". This will take more work (especially in encode and
friends), but this is a strong start. This commit also consists of some
harmless lint fixes.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
change any code but for renaming functions and types to be consistent
with other mandoc.h stuff. The reason for moving into libmandoc is that
the rendering of special characters is part of mandoc itself---not an
external part. From mandoc(1)'s perspective, this changes nothing, but
for other utilities, it's important to have these part of libmandoc.
Note this isn't documented [yet] in mandoc.3 because there are some
parts I'd like to change around beforehand.
|