path: root/roff.c
Commit message | Author | Age | Files | Lines
* move man(7) validation into the dedicated validation phase, too
  Author: Ingo Schwarze; Age: 2015-10-22; Files: 1; Lines: -2/+2
* Move all mdoc(7) node validation done before child parsing to the new separate validation pass, except for a tiny bit needed by the parser which goes to the new mdoc_state() module; cleaner, simpler, and surprisingly also shorter by 15 lines.
  Author: Ingo Schwarze; Age: 2015-10-21; Files: 1; Lines: -28/+12
* In order to become able to generate syntax tree nodes on the roff(7) level, validation must be separated from parsing and rewinding. This first big step moves calling of the mdoc(7) post_*() functions out of the parser loop into their own mdoc_validate() pass, while using a new mdoc_state() module to make syntax tree state handling available to both the parser loop and the validation pass.
  Author: Ingo Schwarze; Age: 2015-10-20; Files: 1; Lines: -3/+8
* Delete two preprocessor constants that are no longer used. Patch from Michael Reed <m dot reed at mykolab dot com>.
  Author: Ingo Schwarze; Age: 2015-10-15; Files: 1; Lines: -3/+0
* Major character table cleanup:
  * Use ohash(3) rather than a hand-rolled hash table.
  * Make the character table static in the chars.c module: There is no need to pass a pointer around, we most certainly never want to use two different character tables concurrently.
  * No need to keep the characters in a separate file chars.in; that merely encourages downstream porters to mess with them.
  * Sort the characters to agree with the mandoc_chars(7) manual page.
  * Specify Unicode codepoints in hex, not decimal (that's the detail that originally triggered this patch).
  No functional change, minus 100 LOC, and i don't see a performance change.
  Author: Ingo Schwarze; Age: 2015-10-13; Files: 1; Lines: -4/+2
* To make the code more readable, delete 283 /* FALLTHROUGH */ comments that were right between two adjacent case statements. Keep only those 24 where the first case actually executes some code before falling through to the next case.
  Author: Ingo Schwarze; Age: 2015-10-12; Files: 1; Lines: -24/+0
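The distinction this commit draws can be illustrated with a small switch. This is a hypothetical example, not code from roff.c; the function name `classify` and its return values are invented for illustration. Adjacent case labels carry no comment, while a case that runs code before falling through keeps one.

```c
#include <stddef.h>

/*
 * Hypothetical example, not taken from roff.c.
 * Returns 2 for a newline, 1 for a blank, tab, or carriage return,
 * and 0 otherwise; *lines is incremented for each newline seen.
 */
int
classify(int c, size_t *lines)
{
	int weight = 0;

	switch (c) {
	case ' ':
	case '\t':	/* adjacent labels: no FALLTHROUGH comment needed */
		weight = 1;
		break;
	case '\n':
		/* Code runs here before falling through, so the comment stays. */
		(*lines)++;
		weight++;
		/* FALLTHROUGH */
	case '\r':
		weight++;
		break;
	default:
		break;
	}
	return weight;
}
```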
* modernize style: "return" is not a function
  Author: Ingo Schwarze; Age: 2015-10-06; Files: 1; Lines: -128/+128
* /* NOTREACHED */ after abort() is silly, delete it
  Author: Ingo Schwarze; Age: 2015-09-26; Files: 1; Lines: -3/+2
* If we have to reparse the text line because we spring an input line trap, we must not escape breakable hyphens yet, or mparse_buf_r() in read.c will complain and replace the escaped hyphens with question marks. Bug found in ocserv(8) following a report from Kurt Jaeger <pi at FreeBSD>.
  Author: Ingo Schwarze; Age: 2015-08-29; Files: 1; Lines: -16/+17
* Implement the escape sequence \\$*, expanding to all arguments of the current user-defined macro. This is another missing feature required for ocserv(8). Problem reported by Kurt Jaeger <pi at FreeBSD>.
  Author: Ingo Schwarze; Age: 2015-08-29; Files: 1; Lines: -15/+26
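As a rough illustration of what expanding to all arguments means, the sketch below joins a macro's arguments with single blanks, which is how groff documents this escape. It is a minimal sketch, not the roff.c implementation: the helper name `join_args` is hypothetical, and it ignores quoting and escaping entirely.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not the roff.c code: join argc macro arguments
 * with single blanks.  Returns a malloc'ed string that the caller must
 * free, or NULL on allocation failure.
 */
char *
join_args(int argc, char *const argv[])
{
	size_t len = 1;		/* room for the terminating NUL */
	size_t n;
	char *out, *p;
	int i;

	for (i = 0; i < argc; i++)
		len += strlen(argv[i]) + (i > 0);	/* one blank before all but the first */

	if ((out = malloc(len)) == NULL)
		return NULL;

	p = out;
	for (i = 0; i < argc; i++) {
		if (i > 0)
			*p++ = ' ';
		n = strlen(argv[i]);
		memcpy(p, argv[i], n);
		p += n;
	}
	*p = '\0';
	return out;
}
```

With no arguments at all, the result is the empty string.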
* Minimal implementation of the read-only number register \n(.$ which returns the number of arguments of the current macro. This is one of the missing features required for ocserv(8). Problem reported by Kurt Jaeger <pi at FreeBSD>.
  Author: Ingo Schwarze; Age: 2015-08-29; Files: 1; Lines: -8/+18
* Ignore blank characters at the beginning of a conditional block, that is, after "\{". Issue found by Markus <Waldeck at gmx dot de> in bash(1).
  Author: Ingo Schwarze; Age: 2015-06-27; Files: 1; Lines: -0/+2
* Implement the roff(7) `r' (register exists) conditional. Missing feature found by Markus <Waldeck at gmx dot de> in Debian's bash(1) manual page.
  Author: Ingo Schwarze; Age: 2015-05-31; Files: 1; Lines: -5/+31
* Setting the "last" member of struct roff_node was done at an extremely weird place. Move it to the obviously correct place. Surprisingly, this didn't cause any misformatting in the test suite or in any base system manuals, but i cannot believe the code was really correct for all conceivable input, and it would be very hard to verify. At the very least, it cannot have worked for man(7).
  Author: Ingo Schwarze; Age: 2015-05-01; Files: 1; Lines: -0/+1
* Unify mdoc_deroff() and man_deroff() into a common function deroff(). No functional change except that for mdoc(7), it now skips leading escape sequences just like it already did for man(7). Escape sequences rarely occur in mdoc(7) code and if they do, skipping them is an improvement in this context. Minus 30 lines of code.
  Author: Ingo Schwarze; Age: 2015-04-23; Files: 1; Lines: -0/+46
* Unify trickier node handling functions.
  * man_elem_alloc() -> roff_elem_alloc()
  * man_block_alloc() -> roff_block_alloc()
  The functions mdoc_elem_alloc() and mdoc_block_alloc() remain for now because they need to do mdoc(7)-specific argument processing.
  Author: Ingo Schwarze; Age: 2015-04-19; Files: 1; Lines: -0/+21
* Unify some node handling functions that use TOKEN_NONE.
  * mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
  * mdoc_word_append(), man_word_append() -> roff_word_append()
  * mdoc_addspan(), man_addspan() -> roff_addtbl()
  * mdoc_addeqn(), man_addeqn() -> roff_addeqn()
  Minus 50 lines of code, no functional change.
  Author: Ingo Schwarze; Age: 2015-04-19; Files: 1; Lines: -2/+61
* Unify node handling functions:
  * node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
  * node_append() for mdoc and man_node_append() -> roff_node_append()
  * mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
  * mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
  * mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
  * mdoc_node_free() and man_node_free() -> roff_node_free()
  * mdoc_node_delete() and man_node_delete() -> roff_node_delete()
  Minus 130 lines of code, no functional change.
  Author: Ingo Schwarze; Age: 2015-04-19; Files: 1; Lines: -6/+204
* Unify {mdoc,man}_{alloc,reset,free}() into roff_man_{alloc,reset,free}(). Minus 80 lines of code, no functional change. Written on the train from Koeln to Wolfsburg returning from p2k15.
  Author: Ingo Schwarze; Age: 2015-04-18; Files: 1; Lines: -1/+69
* Don't allow breaking the output line after hyphens following escape sequences. Improves tic(1), sxpm(1), and a few Perl manuals. Quirk found by naddy@ in milter-greylist(8).
  Author: Ingo Schwarze; Age: 2015-04-04; Files: 1; Lines: -0/+2
* Escape quotes when expanding macro arguments. This fixes a bug naddy@ found in plan9/rc(1).
  Author: Ingo Schwarze; Age: 2015-02-21; Files: 1; Lines: -16/+76
* Cope with another one of the many kinds of DocBook stupidity: Instead of just using .br, DocBook sometimes fiddles with the utterly unportable internal register \n[an-break-flag] that is only available in the GNU implementation of man(7) and then arms an input line trap to call the equally unportable internal macro .an-trap that, in the GNU implementation, inspects that variable; all the world is GNU, isn't it? Since naddy@ reports that quite a few ports manuals suffer from this insanity, let's just translate it to the intended .br. Et ceterum censeo DocBookem esse delendam.
  Author: Ingo Schwarze; Age: 2015-02-17; Files: 1; Lines: -2/+11
* Let .it accept numerical expressions, not just numerical constants. For .it, ignore scaling units in roff_getnum(). Inside parentheses, skip whitespace after a sign in roff_getnum(). Parse and ignore unary plus in roff_getnum(). As a bonus, get rid of the only call to mandoc_strntoi() in roff.c.
  Author: Ingo Schwarze; Age: 2015-02-17; Files: 1; Lines: -36/+41
* replace the last legacy generic message type, "argument count wrong", by more specific messages, improving diagnostics for .cc .tr .Bl -column
  Author: Ingo Schwarze; Age: 2015-02-06; Files: 1; Lines: -4/+5
* correctly handle table layout lines starting with a dot
  Author: Ingo Schwarze; Age: 2015-01-30; Files: 1; Lines: -1/+1
* * Polish tbl(7) error reporting.
  * Do not print out macro names in tbl(7) data blocks.
  * Like with GNU tbl, let empty tables cause a blank line.
  * Avoid producing empty tables in -Tman.
  Author: Ingo Schwarze; Age: 2015-01-28; Files: 1; Lines: -3/+6
* For now, it can't be helped that mandoc tbl(7) ignores high-level macros, but stop throwing away their arguments. This fixes information loss in a handful of Xenocara manuals, at the price of a small amount of formatting noise creeping through.
  Author: Ingo Schwarze; Age: 2015-01-28; Files: 1; Lines: -1/+7
* Strangely, ignoring the roff(7) .na request was implemented in the man(7) parser. Simplify the code by moving it into the roff(7) parser, also making it work for mdoc(7).
  Author: Ingo Schwarze; Age: 2015-01-24; Files: 1; Lines: -1/+2
* While ignoring the .ta (set tab stops) and .ti (temp indent) requests is sometimes harmless, it often causes seriously ugly output, so flag these requests as unsupported rather than ignoring them. Discussed with naddy@.
  Author: Ingo Schwarze; Age: 2015-01-23; Files: 1; Lines: -3/+3
* Wonders of roff(7): Integer numbers in numerical expressions can carry scaling units, and some manuals (e.g. in devel/grcs) actually use that, so let's support it. Missing feature reported by naddy@.
  Author: Ingo Schwarze; Age: 2015-01-23; Files: 1; Lines: -2/+40
* Slightly improve \w width measurements: Count special characters with the same width as ASCII characters and treat all other escape sequences as if they had a width of 0. Certainly not perfect, but a bit better. For example, GNU RCS ci(1) needs this; reported by naddy@.
  Author: Ingo Schwarze; Age: 2015-01-22; Files: 1; Lines: -1/+19
* pass empty request lines through to tbl(7); sometimes, they end a layout
  Author: Ingo Schwarze; Age: 2015-01-21; Files: 1; Lines: -10/+9
* Split the -Werror message level into -Werror (broken manual, probably using mandoc is better than using groff) and -Wunsupp (manual using unsupported low-level roff(7) feature, probably using groff is better than using mandoc). Once this feature is complete, it is intended to help porting, making the decision whether to USE_GROFF easier. As a first step, distinguish four classes of roff(7) requests:
  1. Supported (currently 24 requests)
  2. Currently ignored because unimportant (120) -> no message
  3. Ignored for good because insecure (14) -> -Werror
  4. Currently unsupported (68) -> these trigger the new -Wunsupp messages
  Author: Ingo Schwarze; Age: 2015-01-20; Files: 1; Lines: -18/+456
* Parse and ignore .IX (generate index entry) macros because pod2man(1) emits them, by default without defining them, relying on the roff(7) quirk that undefined macros have no effect.
  Author: Ingo Schwarze; Age: 2015-01-16; Files: 1; Lines: -0/+2
* downgrade ".so with absolute path" from FATAL to ERROR; this allows getting rid of ROFF_ERR
  Author: Ingo Schwarze; Age: 2015-01-14; Files: 1; Lines: -2/+7
* Bugfix: When the invocation of a user-defined macro follows a roff conditional request on the same input line, don't skip the first few bytes of its content.
  Author: Ingo Schwarze; Age: 2015-01-07; Files: 1; Lines: -0/+1
* Fix a buffer overrun triggered by a trailing backslash at EOF in an unclosed conditional body. If the memory contained the byte sequence "\}" after the end of the buffer before the next NUL, this could even write beyond the end of the buffer, specifically '&' to the location of the '}'. Found by jsg@ with afl.
  Author: Ingo Schwarze; Age: 2015-01-01; Files: 1; Lines: -4/+6
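The class of bug described here arises when a scanner skips escape sequences two bytes at a time without checking whether the second byte is the terminating NUL. The following is a minimal sketch of a scan that stays inside the buffer. It is not the roff.c code: the function name `find_block_end` is hypothetical, and it assumes every escape is exactly two bytes long, which real roff(7) escapes are not.

```c
#include <stddef.h>

/*
 * Hypothetical sketch, not the roff.c code: return a pointer to the
 * first "\}" in the NUL-terminated buffer, or NULL if there is none.
 * Other escapes are skipped two bytes at a time (a simplification),
 * but never across the terminating NUL.
 */
const char *
find_block_end(const char *buf)
{
	const char *p = buf;

	while (*p != '\0') {
		if (*p != '\\') {
			p++;
			continue;
		}
		if (p[1] == '\0')	/* trailing backslash: stop at the end */
			return NULL;
		if (p[1] == '}')
			return p;
		p += 2;			/* skip the two-byte escape */
	}
	return NULL;
}
```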
* improve previous: do the size check up front to avoid leaking memory
  Author: Ingo Schwarze; Age: 2014-12-28; Files: 1; Lines: -6/+4
* Reduce memory and time consumption on certain malformed input files by limiting the length of expanded input lines during the (usually recursive) expansion of user defined strings. Resource hogging found by jsg@ with afl.
  Author: Ingo Schwarze; Age: 2014-12-25; Files: 1; Lines: -0/+7
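The idea of capping the expanded line length can be sketched as follows. This is a simplified illustration under stated assumptions, not the roff.c implementation: strings have single-character names and are referenced as `\*X`, and the names `expand_limited`, `lookup_fn`, and `demo_lookup` are hypothetical. The step counter is an extra safeguard added to this sketch so that a non-growing cycle also terminates.

```c
#include <stdlib.h>
#include <string.h>

typedef const char *(*lookup_fn)(char name);

/* Hypothetical string table; 'r' refers to itself and grows forever. */
const char *
demo_lookup(char name)
{
	switch (name) {
	case 'a':
		return "A";
	case 'b':
		return "\\*a\\*a";
	case 'r':
		return "x\\*r\\*r";
	default:
		return "";
	}
}

/*
 * Replace the first "\*X" reference by lookup(X) until none is left.
 * Returns a malloc'ed result, or NULL if the line would grow beyond
 * limit bytes, if more than limit substitutions are needed, or if
 * memory allocation fails.
 */
char *
expand_limited(const char *line, lookup_fn lookup, size_t limit)
{
	char *buf, *ref, *nbuf;
	const char *val;
	size_t len, pre, vlen, rest, steps = 0;

	len = strlen(line);
	if (len > limit || (buf = malloc(len + 1)) == NULL)
		return NULL;
	memcpy(buf, line, len + 1);

	while ((ref = strstr(buf, "\\*")) != NULL && ref[2] != '\0') {
		val = lookup(ref[2]);
		pre = (size_t)(ref - buf);
		vlen = strlen(val);
		rest = strlen(ref + 3);

		/* Check the size up front, before allocating anything. */
		if (++steps > limit || pre + vlen + rest > limit ||
		    (nbuf = malloc(pre + vlen + rest + 1)) == NULL) {
			free(buf);
			return NULL;
		}
		memcpy(nbuf, buf, pre);
		memcpy(nbuf + pre, val, vlen);
		memcpy(nbuf + pre + vlen, ref + 3, rest + 1);	/* copies the NUL too */
		free(buf);
		buf = nbuf;
	}
	return buf;
}
```

With `demo_lookup`, a well-behaved reference such as `\*b` expands fully, while the self-referencing `\*r` is rejected once it exceeds the limit instead of consuming memory without bound.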
* Don't let the modulo operator divide by zero. Found by jsg@ with afl.
  Author: Ingo Schwarze; Age: 2014-12-18; Files: 1; Lines: -1/+7
* Ignore mdoc(7) and man(7) macros inside tbl(7) code because they would abort the table in an unclean way, causing assertion failures found by jsg@.
  Author: Ingo Schwarze; Age: 2014-12-16; Files: 1; Lines: -4/+19
* When a string comparison condition contains no mismatching character but ends without the final delimiter, the parse point was advanced one character too far and the invalid pointer returned to the caller of roff_parseln(). Later use could potentially advance the pointer even further and maybe even write to it. Fixing a buffer overrun found by jsg@ with afl (the most severe so far).
  Author: Ingo Schwarze; Age: 2014-12-16; Files: 1; Lines: -1/+1
* When a numerical condition errors out after consuming at least one character of input, treat it as false, do not retry it as a string comparison condition. This also fixes a read buffer overrun that happened when the numerical condition advanced to the end of the input line before erroring out, found by jsg@ with afl.
  Author: Ingo Schwarze; Age: 2014-12-16; Files: 1; Lines: -2/+5
* Empty conditions count as false. When negated, they still count as false. Found when investigating crashes jsg@ found with afl. Not completely fixing the crashes yet.
  Author: Ingo Schwarze; Age: 2014-12-15; Files: 1; Lines: -0/+2
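The rule that an empty condition is false even when negated can be sketched as below. This is a deliberately tiny evaluator, not the roff.c code: the name `eval_cond` is hypothetical, only the one-letter conditions `n` (nroff mode, true), `t` (troff mode, false), and `v` (vroff mode, false) are handled, and treating everything else as false is an assumption of the sketch.

```c
/*
 * Hypothetical sketch, not the roff.c code: evaluate a condition that
 * is optionally negated with '!'.  Returns 1 for true and 0 for false.
 */
int
eval_cond(const char *cond)
{
	int negate = 0;
	int value;

	if (*cond == '!') {
		negate = 1;
		cond++;
	}

	/* An empty condition is false, whether negated or not. */
	if (*cond == '\0')
		return 0;

	switch (*cond) {
	case 'n':		/* nroff mode: true */
		value = 1;
		break;
	case 't':		/* troff mode: false */
	case 'v':		/* vroff mode: false */
		value = 0;
		break;
	default:		/* unsupported here: assume false, ignore negation */
		return 0;
	}
	return negate ? !value : value;
}
```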
* Support the ".if v" conditional operator (vroff mode, always false) for groff compatibility because pod2man(1) uses it that way. Weirdly, groff documents it as "for compatibility with other troff versions" but neither Heirloom nor Plan 9 have it. Issue reported by giovanni@ via sthen@.
  Author: Ingo Schwarze; Age: 2014-11-19; Files: 1; Lines: -0/+2
* Use struct buf in libroff, it is very natural there and reduces the number of arguments of many functions. While here, sprinkle some KNF. No functional change.
  Author: Ingo Schwarze; Age: 2014-11-01; Files: 1; Lines: -166/+164
* Make the character table available to libroff so it can check the validity of character escape names and warn about unknown ones. This requires mchars_spec2cp() to report unknown names again. Fortunately, that doesn't require changing the calling code because according to groff, invalid character escapes should not produce output anyway, and now that we warn about them, that's fine.
  Author: Ingo Schwarze; Age: 2014-10-28; Files: 1; Lines: -2/+9
* With the current architecture, we can't support inline equations inside tables, sorry. So don't even try to parse tbl(7) blocks for eqn(7) delimiters. Broken table layout found in glPixelMap(3) while investigating a bug report by Theo Buehler <theo at math dot ethz dot ch>.
  Author: Ingo Schwarze; Age: 2014-10-25; Files: 1; Lines: -1/+2
* Report arguments to .EQ as an error, and simplify the code:
  * drop trivial wrapper function roff_openeqn()
  * drop unused first arg of function eqn_alloc()
  * drop unused member "name" of struct eqn_node
  While here, sync to OpenBSD by killing some trailing blanks.
  Author: Ingo Schwarze; Age: 2014-10-25; Files: 1; Lines: -20/+9
* Protect the roff parser from dividing by zero. ok schwarze@
  Author: Kristaps Dzonsons; Age: 2014-10-20; Files: 1; Lines: -13/+24