| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
not have mmap(), from what I can tell).
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
commit adds parsing of "File Variables" in the first two lines in order
to grok the encoding. This completes groff's recognition sequence (-e,
BOM, File variables, -D, default). I've also cleaned up the manual to
indicate this and for some general readability.
preconv is now compiled by default in the Makefile.
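
A minimal sketch of what recognising an Emacs-style "File Variables" coding might look like for one input line. The function name filevar_coding() and its interface are hypothetical and not taken from preconv; a caller would presumably try it on each of the first two lines. It also assumes the key is spelled exactly "coding:" and does not guard against longer keys that merely end in that string.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not preconv's actual code): look for an
 * Emacs-style "-*- ... coding: NAME ... -*-" block in one line and
 * copy NAME into enc.  Returns 1 if a coding was found, 0 otherwise.
 */
int
filevar_coding(const char *line, char *enc, size_t encsz)
{
	const char	*start, *end, *p;
	size_t		 n;

	assert(line != NULL && enc != NULL && encsz > 0);

	/* The variables live between two "-*-" markers. */
	if ((start = strstr(line, "-*-")) == NULL)
		return 0;
	start += 3;
	if ((end = strstr(start, "-*-")) == NULL)
		return 0;

	/* Find the "coding:" key inside the marked region. */
	p = strstr(start, "coding:");
	if (p == NULL || p >= end)
		return 0;
	p += strlen("coding:");
	while (p < end && isspace((unsigned char)*p))
		p++;

	/* The value runs up to whitespace, ';' or the closing marker. */
	for (n = 0; p < end && *p != ';' &&
	    !isspace((unsigned char)*p) && n + 1 < encsz; p++)
		enc[n++] = *p;
	enc[n] = '\0';
	return n > 0;
}
```

In the recognition sequence above, a result like this would only be consulted when neither -e nor a BOM has already settled the encoding.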
|
| |
|
|
|
|
| |
manual regarding its output and `Nd' sentence.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the \[uNNNN] strings (taking into account big-endian archs). Also allow
it to determine from the BOM whether it's a UTF-8 file. Also add the
initial manual. This has been tested over a random selection of UTF-8
documents, as
% preconv -e utf-8 foo.1 | ./mandoc -Tlocale
where -Tlocale is allowed (-DUSE_WCHAR).
Note that we're still missing the "type" indicator that preconv accepts.
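
A rough sketch of the conversion described here: skip a UTF-8 BOM if one is present, pass ASCII through, and turn everything else into \[uNNNN] escapes. The names utf8_decode() and utf8_to_roff() are hypothetical and this is not preconv's actual code. The decoder is deliberately simple and does not reject overlong encodings or surrogates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical decoder: read one UTF-8 sequence from p (len bytes
 * available).  Returns the number of bytes consumed and stores the
 * codepoint in *cp, or returns 0 if the sequence is malformed or
 * truncated.  Overlong forms are NOT rejected in this sketch.
 */
size_t
utf8_decode(const unsigned char *p, size_t len, unsigned int *cp)
{
	size_t	 i, need;

	assert(p != NULL && cp != NULL);
	if (len == 0)
		return 0;
	if (p[0] < 0x80) {
		*cp = p[0];
		return 1;
	} else if ((p[0] & 0xE0) == 0xC0) {
		*cp = p[0] & 0x1F;
		need = 1;
	} else if ((p[0] & 0xF0) == 0xE0) {
		*cp = p[0] & 0x0F;
		need = 2;
	} else if ((p[0] & 0xF8) == 0xF0) {
		*cp = p[0] & 0x07;
		need = 3;
	} else
		return 0;
	if (len < need + 1)
		return 0;
	/* Each continuation byte contributes six payload bits. */
	for (i = 1; i <= need; i++) {
		if ((p[i] & 0xC0) != 0x80)
			return 0;
		*cp = (*cp << 6) | (p[i] & 0x3F);
	}
	return need + 1;
}

/*
 * Hypothetical converter: write in[0..len) to out as ASCII plus
 * \[uNNNN] escapes.  Returns 0 on success, -1 on malformed input
 * or if out (outsz bytes) is too small.
 */
int
utf8_to_roff(const unsigned char *in, size_t len, char *out, size_t outsz)
{
	size_t		 used = 0, n;
	unsigned int	 cp;
	int		 w;

	assert(in != NULL && out != NULL);
	if (outsz == 0)
		return -1;
	out[0] = '\0';

	/* A UTF-8 byte-order mark is EF BB BF; skip it if present. */
	if (len >= 3 && memcmp(in, "\xEF\xBB\xBF", 3) == 0) {
		in += 3;
		len -= 3;
	}
	while (len > 0) {
		if ((n = utf8_decode(in, len, &cp)) == 0)
			return -1;
		if (cp < 0x80)
			w = snprintf(out + used, outsz - used, "%c", (int)cp);
		else
			w = snprintf(out + used, outsz - used,
			    "\\[u%04X]", cp);
		if (w < 0 || (size_t)w >= outsz - used)
			return -1;
		used += (size_t)w;
		in += n;
		len -= n;
	}
	return 0;
}
```

Because this version works byte by byte and builds the codepoint arithmetically, it does not depend on host byte order.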
|
|
|
|
|
|
| |
string instead of passing it along to libmdoc/libman (where it'll be
printed verbatim, now). This is what groff seems to do, too (of course
without a warning).
|
|
|
|
| |
some language for clarity.
|
|
|
|
|
|
|
|
| |
version and let it grow in-tree. Right now, this only supports the
Latin-1 and US-ASCII encodings. I'll do UTF-8 next. It's
call-compatible with GNU's preconv although I don't do fancy stuff like
BOM or header check. This will come. I used read.c's file-grokking
code.
|
| |
|
|
|
|
|
| |
spec2cp never needs to fall through to spec2str. Then clean out html.c
of its unnecessary print_res() function.
|
|
|
|
| |
the libroff point. This clears up a nice chunk of code.
|
|
|
|
|
| |
to predefs.in. This also makes "BOTH" entries directly into CHAR. The
res2str and spec2str are now effectively the same function.
|
|
|
|
|
|
|
|
|
|
|
| |
within roff.c. These are now grokked from a table in the roff
allocation routine and rest in the newly-created predefs.in (for
consistency with chars.in). This is a first implementation and will
likely be optimised along with the ds/de lookup table itself.
This allows mandoc-defined predefined strings to be correctly removed or
whatnot; earlier they couldn't. What will follow is the stripping-away
of all predefined-string crud in the other parts of the system.
|
|
|
|
|
|
| |
ccond(). Fix the text handler to behave like the macro handler
regarding escaped \}. Make \} actually become a zero-width space, too,
and clean up the documentation in this regard.
|
|
|
|
|
|
|
|
|
|
| |
the `\}' not being directly after the `.br'. Now we check for `\}' in
arbitrary parts of the line, and account for if it's escaped in funny
ways.
This behaviour diverges somewhat from groff in that the text at and
following the `\}' is lost, while groff keeps it (sort-of). I'll add a
COMPATIBILITY note to this effect.
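
A simplified sketch of scanning a line for a closing `\}' anywhere in it, while not being fooled by an escaped backslash. The function find_block_close() is hypothetical and is not the code in roff.c; it treats every backslash as escaping exactly one following character, which ignores multi-character escapes.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical scanner: return the offset of the first "\}" in line,
 * or -1 if there is none.  A backslash that is itself escaped ("\\")
 * does not introduce an escape, so the sequence "\\}" is not a match.
 */
long
find_block_close(const char *line)
{
	size_t	 i;

	assert(line != NULL);
	for (i = 0; line[i] != '\0'; i++) {
		if (line[i] != '\\')
			continue;
		if (line[i + 1] == '}')
			return (long)i;
		if (line[i + 1] == '\0')
			break;
		i++;	/* skip the character this backslash escapes */
	}
	return -1;
}
```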
|
| |
|
|
|
|
| |
from jmc@
|
| |
|
|
|
|
|
|
| |
little else. Also remove the check for __STDC_ISO_10646__. It turns
out that very few systems---even those that support it---actually
declare this and it's just causing problems instead of being useful.
|
|
|
|
|
| |
special characters, if possible. This is broken into a separate switch
statement for clarity.
|
|
|
|
| |
want to get some notes in).
|
|
|
|
|
|
|
|
| |
defined, so remove the check for it and leave it up to people compiling
the software (DOWNSTREAM) to take care of this. This will eventually
need to be fixed up with a proper non-10646 converter and so on, but
this is a simple start. While here, strengthen the language in the
Makefile to this effect.
|
|
|
|
|
| |
This makes sequences of \f[unknown] \fP not completely puke. From a
TODO by schwarze@.
|
|
|
|
| |
Pankov). Also fix typo in Makefile, same reporter. Thanks!
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
it looks pretty good. Basically, the -Tlocale option propagates into
term_ascii.c, where we set locale-specific console call-backs IFF (1)
setlocale() works; (2) locale support is compiled in (see Makefile for
-DUSE_WCHAR); (3) the internal structure of wchar_t maps directly to
Unicode codepoints as defined by __STDC_ISO_10646__; and (4) the console
supports multi-byte characters.
To date, this configuration only supports GNU/Linux. OpenBSD doesn't
export __STDC_ISO_10646__ although I'm told by stsp@openbsd.org that it
should (it has the correct map). Apparently FreeBSD is the same way.
NetBSD? Don't know. Apple also supports this, but doesn't define the
macro. Special-casing!
Benchmark: -Tlocale incurs less than 0.2 factor overhead when run
through several thousand manuals when UTF8 output is enabled. Native
mode (whether directly -Tascii or through no locale or whatever) is
UNCHANGED: the function callbacks are the same as before.
Note. If the underlying system does NOT support __STDC_ISO_10646__, there
is a "slow" version possible with iconv or other means of flipping from
a Unicode codepoint to a wchar_t.
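
A minimal sketch of how the four conditions in this commit could be tested, assuming -DUSE_WCHAR is the compile-time switch named in the Makefile. The function locale_usable() is hypothetical, and using MB_CUR_MAX as a stand-in for "the console supports multi-byte characters" is an assumption of this sketch, not something the commit states.

```c
#include <locale.h>
#include <stdlib.h>

/*
 * Hypothetical check, not the code in term_ascii.c.  Returns 1 if
 * wide-character output looks usable, 0 to fall back to the plain
 * ASCII call-backs.
 */
int
locale_usable(void)
{
#if defined(USE_WCHAR) && defined(__STDC_ISO_10646__)
	/* (1) setlocale() must work for the user's environment. */
	if (setlocale(LC_ALL, "") == NULL)
		return 0;
	/* (4) Assumed proxy: the locale uses a multi-byte encoding. */
	if (MB_CUR_MAX <= 1)
		return 0;
	return 1;
#else
	/*
	 * (2) locale support is not compiled in, or (3) wchar_t is not
	 * declared to hold Unicode codepoints.
	 */
	return 0;
#endif
}
```

Keeping the check in one place means the ASCII path stays untouched when any condition fails, which matches the note that native mode is unchanged.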
|
|
|
|
|
|
|
| |
like -Tascii. While adding this, inline term_alloc() (was a one-liner),
remove some switches around the terminal encoding for the symbol table
(unnecessary), and split out ascii_alloc() into ascii_init(), which is
also called from locale_init().
|
|
|
|
|
|
| |
(found by Yuri Pankov). This was due to looking for modifiers for the
vertical bar. This has been fixed, along with other special-key layout
types.
|
|
|
|
|
| |
and remove some long-fixed notes in the same section. Also, add an
`Lb' for the mandoc library to mandoc.3 (noted by Sascha Wildner).
|
|
|
|
|
|
| |
The reasoning behind printing SOMETHING at a Unicode codepoint is
because the input is not "wrong" (we suppress printing of "wrong"
things). It's just that ASCII can't handle it.
|
| |
|
|
|
|
| |
Make buffmt functions internally bufinit(), too.
|
|
|
|
|
|
| |
bufcat_id(), then collapse it into a little function without so much
crap. Next, make bufinit() only be called when we really need to do so,
and not simply before pre/post calls.
|
|
|
|
|
|
|
|
|
|
| |
conditional; same for print_xmltype() and print_doctype(), same reason;
make bufncat() be static, as it was only being called from html.c;
have bufcat() simply call through to strlcat(). Finally, assert()
whenever we truncate.
Also rename buffmt() -> bufcat_fmt() to differentiate from buffmt_man et
al., which do not concatenate.
|
|
|
|
|
| |
the roff units. Also remove a comment about CSS and number types (they
all accept decimal numbers).
|
|
|
|
|
|
| |
only used once and simply bloated the binary. Also fix mchars_num2char
to correctly render the character instead of using atoi(). This makes
the conversion more strict, but it's more correct.
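
To illustrate why dropping atoi() makes parsing stricter: atoi("65x") quietly yields 65, while a digit-by-digit parse can reject the whole string. The sketch below is hypothetical and is not the real mchars_num2char(); the limit of three digits and the restriction to printable ASCII are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical strict parser: accept exactly sz decimal digits that
 * name a printable ASCII character and return it; return '\0' for
 * anything else (empty input, stray characters, out-of-range codes).
 */
char
num2char_strict(const char *p, size_t sz)
{
	size_t	 i;
	long	 v = 0;

	assert(p != NULL);
	if (sz == 0 || sz > 3)
		return '\0';
	for (i = 0; i < sz; i++) {
		if (p[i] < '0' || p[i] > '9')
			return '\0';
		v = v * 10 + (p[i] - '0');
	}
	/* Assumed policy: only printable ASCII is rendered. */
	return (v >= 0x20 && v < 0x7F) ? (char)v : '\0';
}
```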
|
|
|
|
| |
(oops). Do the same for -Thtml (oops^2).
|
|
|
|
|
|
| |
http://mdocml.bsd.lv/archives/tech/0368.html
For the time being, we just throw it away.
|
| |
|
|
|
|
| |
it really need to be fixed, anyway).
|
|
|
|
|
|
| |
which only held one entry; finally (as per the first), make "ps" member into a
pointer managed by term_ps.c. This frees up a nice chunk of memory during
run-time and in the binary.
|
| |
|
| |
|
|
|
|
| |
and nicer.
|
| |
|
|
|
|
| |
indexing into arrays, so this removes lots of casts from size_t to int.
|
|
|
|
|
|
| |
consist of type "int". This will take more work (especially in encode and
friends), but this is a strong start. This commit also consists of some
harmless lint fixes.
|
|
|
|
|
| |
widths (e.g., `Bl -tag -width "\s[blahblah]bar"'). This has long since
been done for -Tascii but escaped notice with -T[x]html.
|