mandoc - UNIX manpage compiler toolset

	Commit message	Author	Age	Files	Lines
*	Many people have been complaining for a long time that ``...'' looks	Ingo Schwarze	2017-02-17	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	ugly in -Tascii output. For that reason, bentley@ submitted patches to render "..." instead to groff in November 2014 (yes, more than two years ago). Carsten Kunze yesterday merged them for the upcoming groff-1.22.4 release. Yay! Consequently, do the same in mandoc: Render \(Lq and \(Rq (which are used for .Do, .Dq, .Lb, and .St) as '"' in -Tascii output. All other output modes including -Tutf8 remain unchanged.
*	Major character table cleanup:	Ingo Schwarze	2015-10-13	1	-80/+399
\| \| \| \| \| \| \| \| \| \| \| \| \|	* Use ohash(3) rather than a hand-rolled hash table. * Make the character table static in the chars.c module: There is no need to pass a pointer around, we most certainly never want to use two different character tables concurrently. * No need to keep the characters in a separate file chars.in; that merely encourages downstream porters to mess with them. * Sort the characters to agree with the mandoc_char(7) manual page. * Specify Unicode codepoints in hex, not decimal (that's the detail that originally triggered this patch). No functional change, minus 100 LOC, and i don't see a performance change.
*	modernize style: "return" is not a function	Ingo Schwarze	2015-10-06	1	-11/+11
*	Render \(lq and \(rq as '"' in -Tascii mode but leave the rendering	Ingo Schwarze	2015-02-17	1	-1/+1
\| \| \| \| \| \|	of .Do/.Dc, .Dq, .Lb, and .St untouched. Reduces groff-mandoc differences in OpenBSD base by about 7%. Reminded of the issue by naddy@.
*	In terminal output, unify handling of Unicode and numbered character	Ingo Schwarze	2014-10-29	1	-5/+3
\| \| \| \| \| \| \| \| \| \| \|	escape sequences just like it was earlier implemented for -Thtml. Do not let control characters other than ASCII 9 (horizontal tab) propagate to the output; groff allows them, but that really doesn't look like a great idea. Let mchars_num2char() return int such that we can distinguish invalid \N syntax from \N'0'. This also reduces the danger of signed char issues popping up.
*	Make the character table available to libroff so it can check the	Ingo Schwarze	2014-10-28	1	-1/+1
\| \| \| \| \| \| \| \|	validity of character escape names and warn about unknown ones. This requires mchars_spec2cp() to report unknown names again. Fortunately, that doesn't require changing the calling code because according to groff, invalid character escapes should not produce output anyway, and now that we warn about them, that's fine.
*	Tighten Unicode escape name parsing.	Ingo Schwarze	2014-10-28	1	-8/+3
\| \| \| \| \|	Accept only 0xXXXX, 0xYXXXX, 0x10XXXX with Y != 0. This simplifies mchars_num2uc().
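The three accepted shapes can be illustrated with a minimal sketch. The function name `unicode_name_to_cp` is hypothetical and this is not the code of mchars_num2uc(); it only assumes that the caller has already stripped the leading "u" of the escape name and passes the remaining hex digits. Accepting lower-case hex digits is also an assumption.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not mandoc's mchars_num2uc()): take the hex
 * digits that follow "u" in an escape name and return the code point,
 * or -1 if the digits do not have one of the three accepted shapes.
 * Accepting lower-case hex digits is an assumption of this sketch.
 */
int
unicode_name_to_cp(const char *hex)
{
	size_t	 len;

	len = strlen(hex);
	if (len != strspn(hex, "0123456789ABCDEFabcdef"))
		return -1;

	switch (len) {
	case 4:				/* XXXX */
		break;
	case 5:				/* YXXXX with Y != 0 */
		if (hex[0] == '0')
			return -1;
		break;
	case 6:				/* 10XXXX */
		if (hex[0] != '1' || hex[1] != '0')
			return -1;
		break;
	default:
		return -1;
	}
	return (int)strtol(hex, NULL, 16);
}
```

Because the length and the leading digits are checked before conversion, no separate range check against 0x10FFFF is needed in this sketch.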
*	Fix a regression in term.c rev. 1.229 reported by bentley@:	Ingo Schwarze	2014-10-27	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	In UTF-8 output, do not print anything if mchars_spec2cp() returns 0. In particular, this repairs handling of zero-width spaces (\&). While here, let mchars_spec2cp() return 0xFFFD instead of -1 if the character is not found, simplifying the using code. In HTML output, do not print obfuscated ASCII characters and do not test for one-char escapes, mchars_spec2cp() already does that.
*	In -Tascii mode, provide approximations even for some Unicode escape	Ingo Schwarze	2014-10-26	1	-0/+11
\| \| \| \| \| \| \| \|	sequences above codepoint 512 by doing a reverse lookup in the existing mandoc_char(7) character table. Again, groff isn't smart enough to do this and silently discards such escape sequences without printing anything.
*	Improve -Tascii output for Unicode escape sequences: For the first 512	Ingo Schwarze	2014-10-26	1	-15/+6
\| \| \| \| \| \| \| \| \| \| \| \|	code points, provide ASCII approximations. This is already much better than what groff does, which prints nothing for most code points. A few minor fixes while here: * Handle Unicode escape sequences in the ASCII range. * In case of errors, use the REPLACEMENT CHARACTER U+FFFD for -Tutf8 and the string "<?>" for -Tascii output. * Handle all one-character escape sequences in mchars_spec2{cp,str}() and remove the workarounds on the higher level.
*	Get rid of HAVE_CONFIG_H, it is always defined; idea from libnbcompat.	Ingo Schwarze	2014-08-10	1	-2/+2
\| \| \| \| \| \|	Include <sys/types.h> where needed, it does not belong in config.h. Remove <stdio.h> from config.h; if it is missing somewhere, it should be added, but i cannot find a *.c file where it is missing.
*	Security fix:	Ingo Schwarze	2014-07-23	1	-1/+12
\| \| \| \| \| \| \| \| \| \|	After decoding numeric (\N) and one-character (\<, \> etc.) character escape sequences, do not forget to HTML-encode the resulting ASCII character. Malicious manuals were able to smuggle XSS content by roff-escaping the HTML-special characters they need. That's a classic bug type in many web applications, actually... :-( Found myself while auditing the HTML formatter for safe output handling.
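The class of fix described above can be sketched as follows. Both function names are hypothetical and the code is not taken from html.c; it only shows the idea that a character obtained by decoding an escape sequence must go through the same HTML encoding as ordinary text before it is written.

```c
#include <stdio.h>

/*
 * Hypothetical helper: return the HTML entity for an HTML-special
 * ASCII character, or NULL if the character can be written verbatim.
 */
const char *
html_entity(int c)
{
	switch (c) {
	case '<':
		return "&lt;";
	case '>':
		return "&gt;";
	case '&':
		return "&amp;";
	case '"':
		return "&quot;";
	default:
		return NULL;
	}
}

/*
 * Hypothetical helper: write one already-decoded ASCII character,
 * encoding it first if it is HTML-special.
 */
void
print_decoded_char(FILE *out, int c)
{
	const char	*ent;

	if ((ent = html_entity(c)) != NULL)
		fputs(ent, out);
	else
		putc(c, out);
}
```

With this shape, an escape sequence that decodes to "<" ends up as "&lt;" in the output rather than opening a tag.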
*	KNF: case (FOO): -> case FOO:, remove /* LINTED */ and /* ARGSUSED */,	Ingo Schwarze	2014-04-20	1	-8/+9
\| \| \| \| \|	remove trailing whitespace and blanks before tabs, improve some indenting; no functional change
*	The files mandoc.c and mandoc.h contained both specialised low-level	Ingo Schwarze	2014-03-23	1	-0/+1
\| \| \| \| \| \| \|	functions used for multiple languages (mdoc, man, roff), for example mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary functions. Split the auxiliaries out into their own file and header. While here, do some #include cleanup.
*	Implement the \: (optional line break) escape sequence,	Ingo Schwarze	2014-01-22	1	-1/+1
\| \| \| \| \| \| \|	documented in the Ossanna-Kernighan-Ritter troff manual and also supported by groff. Missing feature reported by Steffen Nurpmeso <sdaoden at gmail dot com>.
*	Improve handling of the roff(7) "\t" escape sequence:	Ingo Schwarze	2013-06-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	* Parsing macro arguments has to be done in copy mode, which implies replacing "\t" by a literal tab character. * Otherwise, render "\t" as the empty string, not as a 't' character. This fixes formatting of the distfile example in the oldrdist(1) manual. This also shows up in the unzip(1) manual as one of several issues preventing the removal of USE_GROFF from the archivers/unzip port. Thanks to espie@ for attracting my attention to the unzip(1) manual.
*	Even though the size of a pointer should not depend on the type of the	Ingo Schwarze	2013-05-18	1	-1/+1
\| \| \| \| \| \| \|	data pointed to, pass the size of the right pointer type to calloc; cosmetic issue reported by Ulrich Spoerlein <uqs@spoerlein.net> found in Coverity Scan CID 978734. No binary change - ok cmp(1).
*	Const-ify some mchars arguments. I think these are non-const for historical	Kristaps Dzonsons	2011-11-08	1	-6/+9
\| \| \| \|	dumbness on my part.
*	forgotten Copyright bumps; no code change	Ingo Schwarze	2011-09-18	1	-1/+1
\| \| \| \|	found while syncing to OpenBSD
*	Regression fixes after merging 1.11.3 to OpenBSD (rev. 1.20):	Ingo Schwarze	2011-07-31	1	-2/+4
\| \| \| \| \| \| \|	* Do not pass integers outside the ASCII range to isprint(). * Make sure escaped characters are really printed verbatim when the escape sequence has no special meaning. ok kristaps@
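The first point reflects a rule of the C library: the argument of isprint() must be representable as unsigned char or equal EOF, otherwise the behavior is undefined. A minimal sketch of a guard, with the hypothetical name `ascii_isprint` (not a function from mandoc):

```c
#include <ctype.h>

/*
 * Hypothetical wrapper: report whether c is a printable ASCII
 * character.  The range check comes first, so isprint() is never
 * called with a value outside 0..127.  Returns 1 or 0.
 */
int
ascii_isprint(int c)
{
	return c >= 0 && c <= 127 && isprint(c);
}
```

In the default "C" locale this returns 1 for a space or a letter and 0 for a tab, a negative value, or a Unicode code point such as 0x2014.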
*	Add support for 1/2, 1/4, and 3/4 (needed by eqn).	Kristaps Dzonsons	2011-07-22	1	-1/+1
*	Support `size' constructs in eqn.7. Generalise mandoc_strontou to this	Kristaps Dzonsons	2011-07-21	1	-2/+2
\| \| \| \|	effect.
*	Simplify chars.c---there's really no need for extra code to reorder the	Kristaps Dzonsons	2011-07-07	1	-60/+7
\| \| \| \|	hash chain or an extra function for checking matches.
*	Remove all references to ESCAPE_PREDEF, which is now not exposed past	Kristaps Dzonsons	2011-05-24	1	-31/+0
\| \| \| \|	the libroff point. This clears up a nice chunk of code.
*	Remove predefined strings from the chars.in file, as they're now local	Kristaps Dzonsons	2011-05-24	1	-22/+11
\| \| \| \| \|	to predefs.in. This also makes "BOTH" entries directly into CHAR. The res2str and spec2str are now effectively the same function.
*	Flip on unicode output (via \[uNNNN]) in -T[x]html. Here we go!	Kristaps Dzonsons	2011-05-17	1	-2/+16
*	Remove function calls to res() and so forth in term_word(). These were	Kristaps Dzonsons	2011-05-15	1	-3/+2
\| \| \| \| \| \|	only used once and simply bloated the binary. Also fix mchars_num2char to correctly render the character instead of using atoi(). This makes the conversion more strict, but it's more correct.
*	Fix missing support for \N'n' when calculating string widths in -Tascii	Kristaps Dzonsons	2011-05-15	1	-0/+1
\| \| \| \|	(oops). Do the same for -Thtml (oops^2).
*	Make character engine (-Tascii, -Tpdf, -Tps) ready for Unicode: make buffer	Kristaps Dzonsons	2011-05-14	1	-1/+2
\| \| \| \| \| \|	consist of type "int". This will take more work (especially in encode and friends), but this is a strong start. This commit also consists of some harmless lint fixes.
*	Filter all \N'' values with isprint(). Ok schwarze@.	Kristaps Dzonsons	2011-05-01	1	-9/+5
*	Make mchars_num2char() return a char like it says.	Kristaps Dzonsons	2011-04-30	1	-10/+10
*	Rename mchars_init() -> mchars_alloc() for consistency.	Kristaps Dzonsons	2011-04-30	1	-1/+1
*	Remove enum mcharst, which hasn't been used in quite some time.	Kristaps Dzonsons	2011-04-30	1	-3/+1
*	Move "chars" interface out of out.h and into mandoc.h. This doesn't	Kristaps Dzonsons	2011-04-29	1	-28/+20
\| \| \| \| \| \| \| \| \| \|	change any code but for renaming functions and types to be consistent with other mandoc.h stuff. The reason for moving into libmandoc is that the rendering of special characters is part of mandoc itself---not an external part. From mandoc(1)'s perspective, this changes nothing, but for other utilities, it's important to have these be part of libmandoc. Note this isn't documented [yet] in mandoc.3 because there are some parts I'd like to change around beforehand.
*	Add \(Ai (ANSI) and \(Px (POSIX) predefined strings, which are part of	Kristaps Dzonsons	2011-04-20	1	-1/+1
\| \| \| \| \| \|	groff's tmac.doc package. Originally noted by Matthew Dempsky. Feedback by Jason McIntyre, joerg@, and schwarze@. Also add some documentation about predefined strings, tweaked by schwarze@.
*	Step 4: merge chars.h into out.h. The functions in this file are	Kristaps Dzonsons	2011-03-22	1	-1/+1
\| \| \| \| \|	necessary to all [real] front-ends, so stop pretending it's special. While here, add some documentation to the variable types.
*	Move mandoc_{realloc,malloc,calloc} out of libmandoc.h and into mandoc.h	Kristaps Dzonsons	2011-03-17	1	-11/+2
\| \| \| \| \| \| \| \|	so that everybody can use them. This follows the convention of libXXXX.h being internal to a library and XXXX.h being the external interface. Not only does this allow the removal of lots of redundant NULL-checking code, it also sets the tone for adding new mandoc-global routines.
*	Implement the \N'number' (numbered character) roff escape sequence.	Ingo Schwarze	2011-01-30	1	-0/+22
\| \| \| \| \| \| \|	Don't use it in new manuals, it is inherently non-portable, but we need it for backward-compatibility with existing manuals, for example in Xenocara driver pages. ok kristaps@ jmc@ and tested by Matthieu Herrb (matthieu at openbsd dot org)
*	Churn to get parts of 'struct tbl' visible from mandoc.h: rename the	Kristaps Dzonsons	2011-01-02	1	-11/+11
\| \| \| \| \| \| \|	existing 'struct tbl' as 'struct tbl_node', then move all option stuff into a 'struct tbl' in mandoc.h. This conflicted with a structure in chars.c, which was renamed.
*	Remove last pod2man escapes. These render ok, although \*(-- renders as	Kristaps Dzonsons	2010-09-15	1	-1/+1
\| \| \| \| \| \|	O- because the underlying macro depends on \(W, which a prior pod2man preamble `tr' macro rewrites as "-". This is an error in groff as this tramples on the real \(W, or Greek omega.
*	Churny commit to quiet lint. No functional changes.	Kristaps Dzonsons	2010-09-04	1	-2/+2
*	Remove the pod2man table entries. They can now be properly read and	Kristaps Dzonsons	2010-08-29	1	-1/+1
\| \| \| \|	assigned within the pod2man preamble.
*	Implement a simple, consistent user interface for error handling.	Ingo Schwarze	2010-08-20	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We now have sufficient practical experience to know what we want, so this is intended to be final: - provide -Wlevel (warning, error or fatal) to select what you care about - provide -Wstop to stop after parsing a file with warnings you care about - provide consistent exit status codes for those warnings you care about - fully document what warnings, errors and fatal errors mean - remove all other cruft from the user interface, less is more: - remove all -f knobs along with the whole -f option - remove the old -Werror because calling warnings "fatal" is silly - always finish parsing each file, unless fatal errors prevent that This commit also includes a couple of related simplifications behind the scenes regarding error handling. Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and Sascha Wildner (DragonFly BSD) agree with the general direction.
*	Remove \*(C+ from the pre-predefined strings. It is always `ds'-defined	Kristaps Dzonsons	2010-08-16	1	-1/+1
\| \| \| \| \| \|	when being used in manuals. Since we now support `ds', it's no longer necessary to account for it. From a bug report originally by Thomas Jeunet.
*	Sync to OpenBSD: add missing Copyright years.	Ingo Schwarze	2010-07-31	1	-1/+1
\| \| \| \| \|	I checked that substantial changes were committed to these files during these years.
*	Remove asciisz from chars.in. It frees up a nice chunk of memory at	Kristaps Dzonsons	2010-07-26	1	-9/+8
\| \| \| \| \| \|	the cost of running strlen() for ASCII strings (yes, I benchmarked this running mandoc_char(7) as input again and again with hundredth-second penalties... on my slow-ass alpha).
*	Clean up mandoc_special() (in order later to catch \m). It also flags	Kristaps Dzonsons	2010-07-18	1	-1/+2
\| \| \| \| \| \|	several syntactic errors that weren't caught before. Also un-puke chars.c on zero-length \[].
*	By letting strncmp() do its job and not helping it with a prior length	Kristaps Dzonsons	2010-07-17	1	-9/+8
\| \| \| \| \|	check, we can remove the hard-coded length of all escape patterns. This frees up a nice chunk of memory.
*	Change chars.in HTML encoding to be a Unicode codepoint (int), which is	Kristaps Dzonsons	2010-07-16	1	-23/+64
\| \| \| \|	later formatted in html.c.
*	Churn as I finish email address migration kth.se -> bsd.lv.	Kristaps Dzonsons	2010-06-19	1	-1/+1