mandoc - UNIX manpage compiler toolset

	Commit message (Collapse)	Author	Age	Files	Lines
*	At least in theory, this patch lets us compile on Windows (which does	Kristaps Dzonsons	2011-05-26	3	-3/+23
\| \| \| \|	not have mmap(), from what I can tell).
*	More updates to www. A version's on its way...	Kristaps Dzonsons	2011-05-26	1	-14/+17
\|
*	Have preconv install with mandoc.	Kristaps Dzonsons	2011-05-26	1	-2/+2
\|
*	Slightly clean up verbiage in coding tags.	Kristaps Dzonsons	2011-05-26	1	-2/+2
\|
*	preconv is now on encoding-recognition parity with groff. This last	Kristaps Dzonsons	2011-05-26	3	-29/+150
\| \| \| \| \| \| \| \| \|	commit adds parsing of "File Variables" in the first two lines in order to grok the encoding. This completes groff's recognition sequence (-e, BOM, File variables, -D, default). I've also cleaned up the manual to indicate this and for some general readability. preconv is now compiled by default in the Makefile.
*	The \*q predef certainly doesn't map to \"! Fix this.	Kristaps Dzonsons	2011-05-26	1	-1/+1
\|
*	Add notes about preconv.1 in the www and change some wording in the	Kristaps Dzonsons	2011-05-26	2	-1/+22
\| \| \| \|	manual regarding its output and `Nd' sentence.
*	Significantly improve preconv. Allow it to recode UTF-8 characters into	Kristaps Dzonsons	2011-05-26	3	-12/+260
\| \| \| \| \| \| \| \| \| \| \| \| \|	the \[uNNNN] strings (taking into account big-endian archs). Also allow it to determine from the BOM whether it's a UTF-8 file. Also add the initial manual. This has been tested over a random selection of UTF-8 documents, as % preconv -e utf-8 foo.1 \| ./mandoc -Tlocale where -Tlocale is allowed (-DUSE_WCHAR). Note that we're still missing the "type" indicator that preconv accepts.
*	If a predefined string is missing, emit a warning and make it an empty	Kristaps Dzonsons	2011-05-26	1	-5/+6
\| \| \| \| \| \|	string instead of passing it along to libmdoc/libman (where it'll be printed verbatim, now). This is what groff seems to do, too (of course without a warning).
*	Noticed that our skeleton mdoc.7 had lower-case `Dt'. Fixed and added	Kristaps Dzonsons	2011-05-26	2	-8/+13
\| \| \| \|	some language for clarity.
*	It's annoying that we don't have preconv, so throw together a quick	Kristaps Dzonsons	2011-05-26	2	-3/+332
\| \| \| \| \| \| \| \|	version and let it grow in-tree. Right now, this only supports the Latin-1 and US-ASCII encoding. I'll do UTF-8 next. It's call-compatible with GNU's preconv although I don't do fancy stuff like BOM or header check. This will come. I used read.c's file-grokking code.
*	Document that spec2cp never returns 0.	Kristaps Dzonsons	2011-05-24	1	-3/+1
\|
*	Use the correct Unicode value for the zero-width space, which means that	Kristaps Dzonsons	2011-05-24	2	-28/+11
\| \| \| \| \|	spec2cp never needs to fall through to spec2str. Then clean out html.c of its unnecessary print_res() function.
*	Remove all references to ESCAPE_PREDEF, which is now not exposed past	Kristaps Dzonsons	2011-05-24	6	-109/+0
\| \| \| \|	the libroff point. This clears up a nice chunk of code.
*	Remove predefined strings from the chars.in file, as they're now local	Kristaps Dzonsons	2011-05-24	2	-70/+26
\| \| \| \| \|	to predefs.in. This also makes "BOTH" entries directly into CHAR. The res2str and spec2str are now effectively the same function.
*	Most important move in getting predefined strings entirely contained	Kristaps Dzonsons	2011-05-24	3	-2/+88
\| \| \| \| \| \| \| \| \| \| \|	within roff.c. These are now grokked from a table in the roff allocation routine and rest in the newly-created predefs.in (for consistency with chars.in). This is a first implementation and will likely be optimised along with the ds/de lookup table itself. This allows mandoc-defined predefined strings to be correctly removed or whatnot; earlier they couldn't. What will follow is the stripping-away of all predefined-string crud in the other parts of the system.
*	Have conditional closure for both text and macro lines call through to	Kristaps Dzonsons	2011-05-24	2	-32/+23
\| \| \| \| \| \|	ccond(). Fix the text handler to behave like the macro handler regarding escaped \}. Make \} actually become a zero-width space, too, and clean up the documentation in this regard.
*	Fix a TODO to the effect that `.if n \{\ foo .br \}' was failing due to	Kristaps Dzonsons	2011-05-24	2	-15/+23
\| \| \| \| \| \| \| \| \| \|	the `\}' not being directly after the `.br'. Now we check for `\}' in arbitrary parts of the line, and account for if it's escaped in funny ways. This behaviour diverges somewhat from groff in that the text at and following the `\}' is lost, while groff keeps it (sort-of). I'll add a COMPATIBILITY note to this effect.
*	nested .RS/.RE is becoming more important	Ingo Schwarze	2011-05-21	1	-0/+2
\|
*	remove a sentence which isn't true;	Ingo Schwarze	2011-05-21	1	-2/+0
\| \| \| \|	from jmc@
*	Turn on -Tutf8 in the frontend. Here we go!	Kristaps Dzonsons	2011-05-20	2	-3/+22
\|
*	Flip on -Tutf8 backend support. This forces the UTF-8 LC_CTYPE and does	Kristaps Dzonsons	2011-05-20	3	-4/+18
\| \| \| \| \| \|	little else. Also remove the check for __STDC_ISO_10646__. It turns out that very few systems---even those that support it---actually declare this and it's just causing problems instead of being useful.
*	Allow non-ASCII terminal encodings to accept unicode values for the	Kristaps Dzonsons	2011-05-20	1	-17/+61
\| \| \| \| \|	special characters, if possible. This is broken into a separate switch statement for clarity.
*	Some release notes (this isn't signalling an impending release; I just	Kristaps Dzonsons	2011-05-19	1	-0/+12
\| \| \| \|	want to get some notes in).
*	It seems that __STDC_ISO_10646__ isn't defined even where it can be	Kristaps Dzonsons	2011-05-19	2	-3/+12
\| \| \| \| \| \| \| \|	defined, so remove the check for it and leave it up to people compiling the software (DOWNSTREAM) to take care of this. This will eventually need to be fixed up with a proper non-10646 converter and so on, but this is a simple start. While here, strengthen the language in the Makefile to this effect.
*	Make any un-recognised font be considered a call for the Roman font.	Kristaps Dzonsons	2011-05-18	3	-5/+6
\| \| \| \| \|	This makes sequences of \f[unknown] \fP not completely puke. From a TODO by schwarze@.
*	Add TODO entry for standalone `.' in tbl pages (pointed out by Yuri	Kristaps Dzonsons	2011-05-18	2	-1/+5
\| \| \| \|	Pankov). Also fix typo in Makefile, same reporter. Thanks!
*	Locale support. I'm checking this in to clean up fall-out in-tree, but	Kristaps Dzonsons	2011-05-17	4	-20/+138
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	it looks pretty good. Basically, the -Tlocale option propagates into term_ascii.c, where we set locale-specific console call-backs IFF (1) setlocale() works; (2) locale support is compiled in (see Makefile for -DUSE_WCHAR); (3) the internal structure of wchar_t maps directly to Unicode codepoints as defined by __STDC_ISO_10646__; and (4) the console supports multi-byte characters. To date, this configuration only supports GNU/Linux. OpenBSD doesn't export __STDC_ISO_10646__ although I'm told by stsp@openbsd.org that it should (it has the correct map). Apparently FreeBSD is the same way. NetBSD? Don't know. Apple also supports this, but doesn't define the macro. Special-casing! Benchmark: -Tlocale incurs less than 0.2 factor overhead when run through several thousand manuals when UTF8 output is enabled. Native mode (whether directly -Tascii or through no locale or whatever) is UNCHANGED: the function callbacks are the same as before. Note. If the underlying system does NOT support __STDC_ISO_10646__, there is a "slow" version possible with iconv or other means of flipping from a Unicode codepoint to a wchar_t.
*	Add mode for -Tlocale. This mode, with this commit, behaves exactly	Kristaps Dzonsons	2011-05-17	8	-37/+37
\| \| \| \| \| \| \|	like -Tascii. While adding this, inline term_alloc() (was a one-liner), remove some switches around the terminal encoding for the symbol table (unnecessary), and split out ascii_alloc() into ascii_init(), which is also called from locale_init().
*	In tbl layouts, we puked if a space didn't follow a vertical bar	Kristaps Dzonsons	2011-05-17	1	-0/+17
\| \| \| \| \| \|	(found by Yuri Pankov). This was due to looking for modifiers for the vertical bar. This has been fixed, along with other special-key layout types.
*	Documentation: note COMPATIBILITY of -Tascii `?' printing in mandoc.1	Kristaps Dzonsons	2011-05-17	2	-3/+10
\| \| \| \| \|	and remove some long-fixed notes in the same section. Also, add an `Lb' for the mandoc library to mandoc.3 (noted by Sascha Wildner).
*	Flip on printing `?' at Unicode codepoints in -Tascii, -Tpdf, and -Tps.	Kristaps Dzonsons	2011-05-17	1	-1/+9
\| \| \| \| \| \|	The reasoning behind printing SOMETHING at a Unicode codepoint is because the input is not "wrong" (we suppress printing of "wrong" things). It's just that ASCII can't handle it.
*	Flip on unicode output (via \[uNNNN]) in -T[x]html. Here we go!	Kristaps Dzonsons	2011-05-17	4	-4/+37
\|
*	Clean-up fallout: differentiate ID's and HREF's (where to put the `#').	Kristaps Dzonsons	2011-05-17	2	-3/+3
\| \| \| \|	Make buffmt functions internally bufinit(), too.
*	Cleanups in -T[x]html: make html_idcat() use the buffer and be called	Kristaps Dzonsons	2011-05-17	4	-63/+41
\| \| \| \| \| \|	bufcat_id(), then collapse it into a little function without so much crap. Next, make bufinit() only be called when we really need to do so, and not simply before pre/post calls.
*	Clean-ups in -T[x]html: inline print_num(), as it was just a single	Kristaps Dzonsons	2011-05-17	4	-67/+27
\| \| \| \| \| \| \| \| \| \|	conditional; same for print_xmltype() and print_doctype(), same reason; make bufncat() be static, as it was only being called from html.c; have bufcat() simply call through to strlcat(). Finally, assert() whenever we truncate. Also rename buffmt() -> bufcat_fmt() to differentiate from buffmt_man et al., which do not concatenate.
*	Clean up -T[x]html by using a table instead of a switch statement for	Kristaps Dzonsons	2011-05-17	1	-41/+16
\| \| \| \| \|	the roff units. Also remove a comment about CSS and number types (they all accept decimal numbers).
*	Remove function calls to res() and so forth in term_word(). These were	Kristaps Dzonsons	2011-05-15	2	-55/+17
\| \| \| \| \| \|	only used once and simply bloated the binary. Also fix mchars_num2char to correctly render the character instead of using atoi(). This makes the conversion more strict, but it's more correct.
*	Fix missing support for \N'n' when calculating string widths in -Tascii	Kristaps Dzonsons	2011-05-15	3	-4/+11
\| \| \| \|	(oops). Do the same for -Thtml (oops^2).
*	Support groff's escape for Unicode input. See	Kristaps Dzonsons	2011-05-15	3	-0/+23
\| \| \| \| \| \|	http://mdocml.bsd.lv/archives/tech/0368.html For the time being, we just throw it away.
*	Use strcspn() in term_strlen(). Clarifies the code.	Kristaps Dzonsons	2011-05-15	1	-10/+10
\|
*	Get rid of an "#if 0" that I don't anticipate being fixed ever (nor does	Kristaps Dzonsons	2011-05-15	1	-42/+0
\| \| \| \|	it really need to be fixed, anyway).
*	Move struct termp_ps into term_ps.c; remove the engine union in struct termp,	Kristaps Dzonsons	2011-05-15	2	-200/+197
\| \| \| \| \| \|	which only held one entry; finally (as per the first), make "ps" member into a pointer managed by term_ps.c. This frees up a nice chunk of memory during run-time and in the binary.
*	Continuing last commit with the style-sheet change.	Kristaps Dzonsons	2011-05-14	1	-19/+21
\|
*	Fix makewhatis.1 to have the correct name (it was MANDOC-DB, oops).	Kristaps Dzonsons	2011-05-14	1	-1/+1
\|
*	Make index.sgml look more like mandoc-cgi, which I find looks much cleaner	Kristaps Dzonsons	2011-05-14	1	-58/+17
\| \| \| \|	and nicer.
*	Make www style.css link up to example.style.css much nicer.	Kristaps Dzonsons	2011-05-14	1	-39/+32
\|
*	Make some values "int" that were "size_t". These are primarily used for	Kristaps Dzonsons	2011-05-14	2	-24/+28
\| \| \| \|	indexing into arrays, so this removes lots of casts from size_t to int.
*	Make character engine (-Tascii, -Tpdf, -Tps) ready for Unicode: make buffer	Kristaps Dzonsons	2011-05-14	6	-18/+22
\| \| \| \| \| \|	consist of type "int". This will take more work (especially in encode and friends), but this is a strong start. This commit also consists of some harmless lint fixes.
*	Give -Thtml and -Txhtml the gift of recognising escapes when calculating	Kristaps Dzonsons	2011-05-14	3	-5/+42
\| \| \| \| \|	widths (e.g., `Bl -tag -width "\s[blahblah]bar"'). This has long since been done for -Tascii but escaped notice with -T[x]html.
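
The 2011-05-26 preconv commit above describes recoding UTF-8 characters into \[uNNNN] strings. As a rough illustration only, the sketch below shows one way such a recoding could look. It is not the code from that commit: the function names utf8_decode() and recode_utf8() are hypothetical, and the choices to pad to at least four uppercase hex digits and to silently skip malformed bytes are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not mandoc's preconv code): decode one UTF-8
 * sequence from s (at most n bytes) into *cp.
 * Returns the number of bytes consumed, or 0 if the input is malformed.
 */
static size_t
utf8_decode(const unsigned char *s, size_t n, unsigned int *cp)
{
	size_t len, i;
	unsigned int min;

	if (n == 0)
		return 0;
	if (s[0] < 0x80) {
		*cp = s[0];
		return 1;
	} else if ((s[0] & 0xE0) == 0xC0) {
		len = 2; min = 0x80; *cp = s[0] & 0x1F;
	} else if ((s[0] & 0xF0) == 0xE0) {
		len = 3; min = 0x800; *cp = s[0] & 0x0F;
	} else if ((s[0] & 0xF8) == 0xF0) {
		len = 4; min = 0x10000; *cp = s[0] & 0x07;
	} else
		return 0;

	if (n < len)
		return 0;
	for (i = 1; i < len; i++) {
		if ((s[i] & 0xC0) != 0x80)
			return 0;
		*cp = (*cp << 6) | (s[i] & 0x3F);
	}
	/* Reject overlong forms, surrogates, and out-of-range values. */
	if (*cp < min || *cp > 0x10FFFF || (*cp >= 0xD800 && *cp <= 0xDFFF))
		return 0;
	return len;
}

/*
 * Recode UTF-8 input into ASCII: code points below 0x80 are copied,
 * everything else becomes \[uXXXX]. Malformed bytes are skipped
 * (an assumption of this sketch).
 * Returns the length written (excluding the NUL), or -1 if out is
 * too small.
 */
int
recode_utf8(const char *in, size_t inlen, char *out, size_t outsz)
{
	const unsigned char *s = (const unsigned char *)in;
	size_t i = 0, o = 0, used;
	unsigned int cp;
	int w;

	assert(in != NULL && out != NULL);
	while (i < inlen) {
		used = utf8_decode(s + i, inlen - i, &cp);
		if (used == 0) {
			i++;	/* skip one malformed byte */
			continue;
		}
		i += used;
		if (cp < 0x80) {
			if (o + 1 >= outsz)
				return -1;
			out[o++] = (char)cp;
			continue;
		}
		w = snprintf(out + o, outsz - o, "\\[u%.4X]", cp);
		if (w < 0 || (size_t)w >= outsz - o)
			return -1;
		o += (size_t)w;
	}
	if (o >= outsz)
		return -1;
	out[o] = '\0';
	return (int)o;
}
```

Under these assumptions, calling recode_utf8() on the five bytes of "café" writes caf\[u00E9] into the output buffer and returns 11. Because the decoder works byte by byte, the host's endianness (which the commit message mentions) does not come into play here.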