path: root/roff.c
Commit message | Author | Age | Files | Lines
* Explicitly ignore .br, .ce, and .sp inside tbl(7) text blocks. | Ingo Schwarze | 2017-06-13 | 1 | -2/+3
  With the current code structure, they would appear at the wrong place in the syntax tree, so it is better to not insert them into the tree at all and issue an UNSUPP message instead.
* Properly reinitialize roffce_node between parses, | Ingo Schwarze | 2017-06-08 | 1 | -0/+4
  or this may crash with use-after-free in makewhatis(8); reported by jmc@, thanks!
* Implement the roff(7) .rn (rename macro or string) request. | Ingo Schwarze | 2017-06-07 | 1 | -5/+97
  Renaming a user-defined macro is very simple: just copy the definition to the new name and delete the old name.
  Renaming high-level macros is a bit tricky: use a dedicated key-value-table, with non-standard names as keys and standard names as values. When a macro is found that is not user-defined, look it up in the "renamed" table and translate it back to the standard name before passing it on to the high-level parsers.
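The two lookup paths described in the .rn entry above can be sketched roughly as follows. This is an illustration in Python, not mandoc's C code: the names `user_macros`, `renamed`, `STANDARD`, `rename` and `lookup` are hypothetical, and the sketch leaves the standard name usable after a rename, which may differ from the real behaviour.

```python
# Hypothetical model of the two tables; not mandoc's actual data layout.
user_macros = {"myhead": ".Sh \\$1"}   # user-defined name -> definition text
renamed = {}                           # non-standard name -> standard name
STANDARD = {"Sh", "Nm", "TH"}          # tiny stand-in for the mdoc(7)/man(7) macro sets


def rename(old, new):
    """Model `.rn old new`."""
    if old in user_macros:
        # User-defined macro: copy the definition to the new name, delete the old one.
        user_macros[new] = user_macros.pop(old)
    elif old in STANDARD:
        # High-level macro: only record the alias; the parsers keep the standard name.
        renamed[new] = old


def lookup(name):
    """Resolve a macro name: user-defined first, then the "renamed" table."""
    if name in user_macros:
        return ("user", user_macros[name])
    # Not user-defined: translate a renamed macro back to its standard name.
    name = renamed.get(name, name)
    if name in STANDARD:
        return ("standard", name)
    return None


rename("myhead", "hd")
rename("Sh", "Xs")
print(lookup("hd"))
print(lookup("Xs"))
```

Keeping the alias table separate means the high-level parsers only ever see standard names, which matches the intent stated in the entry.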
* Minimal implementation of the roff(7) .ce request (center a number | Ingo Schwarze | 2017-06-06 | 1 | -9/+60
  of input lines without filling). Contrary to groff, high-level macros abort .ce mode for now.
* Implement the roff(7) .mc (right margin character) request. | Ingo Schwarze | 2017-06-04 | 1 | -4/+4
  The Tcl/Tk manual pages use this extensively. Delete the TERM_MAXMARGIN hack, it breaks .mc inside .nf; instead, implement a proper TERMP_BRNEVER flag.
* Pure preprocessor implementation of the roff(7) .ec and .eo requests | Ingo Schwarze | 2017-06-04 | 1 | -13/+99
  (escape character control), touching nothing after the preprocessing stage and keeping even the state variable local to the preprocessor.
  Since the escape character is also used for line continuation, this requires pulling the implementation of line continuation from the input reader to the preprocessor, which also considerably shortens the code required for that.
  When the escape character is changed, simply let the preprocessor replace bare by escaped backslashes and instances of the non-standard escape character with bare backslashes - that's all we need.
  Oh, and if anybody dares to use these requests in OpenBSD manuals, sending a medium-sized pack of axe-murderers after them might be a worthwhile part of the punishment, but probably insufficient on its own.
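The substitution described in the .ec/.eo entry above can be sketched as a single pass over an input line. This is a Python illustration under stated assumptions, not mandoc's implementation: the function name `translate_escapes` is hypothetical, the choice of `\e` as the "escaped backslash" is an assumption, and doubled escape characters and line continuation are not modelled.

```python
def translate_escapes(line, escape_char, literal_backslash="\\e"):
    """Rewrite a line so that later stages only ever see backslash escapes.

    escape_char is the currently active escape character, or None when
    escapes are switched off (as after .eo).
    literal_backslash is what a bare backslash becomes; "\\e" is an assumption.
    """
    if escape_char == "\\":
        return line                          # default escape character: nothing to do
    out = []
    for ch in line:
        if ch == "\\":
            out.append(literal_backslash)    # bare backslash is now plain text
        elif escape_char is not None and ch == escape_char:
            out.append("\\")                 # non-standard escape char -> bare backslash
        else:
            out.append(ch)
    return "".join(out)


# With "@" as the escape character, "@fB" starts bold and "\" is literal.
print(translate_escapes("a\\b @fB", "@"))
```

Because the rewrite happens before anything else reads the line, no later stage needs to know that the escape character was ever changed.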
* Line-breaking roff(7) requests also break man(7) next-line scope. | Ingo Schwarze | 2017-05-08 | 1 | -1/+7
  Considering that real roff implements next-line scope using input line traps, that isn't all that surprising. Issue found in the games/xbattle port.
* Basic implementation of the roff(7) .ti (temporary indent) request. | Ingo Schwarze | 2017-05-08 | 1 | -3/+3
  Needed by about four dozen ports (thanks to naddy@ for the research).
* Basic implementation of the roff(7) .ta (define tab stops) request. | Ingo Schwarze | 2017-05-07 | 1 | -3/+27
  This is the first feature made possible by the parser reorganization. Improves the formatting of the SYNOPSIS in many Xenocara GL manuals. Also important for ports, as reported by many, including naddy@.
* Move .sp to the roff modules. Enough infrastructure is in place | Ingo Schwarze | 2017-05-05 | 1 | -4/+6
  now that this actually saves code: -70 LOC.
* move .ll to the roff modules | Ingo Schwarze | 2017-05-05 | 1 | -3/+4
* Move handling of the roff(7) .ft request from the man(7) | Ingo Schwarze | 2017-05-05 | 1 | -2/+32
  modules to the new roff(7) modules. As a side effect, mdoc(7) now handles .ft, too. Of course, do not use that.
* Parser reorg: | Ingo Schwarze | 2017-05-04 | 1 | -10/+21
  Generate the first node on the roff level: .br
  Fix some column numbers in diagnostic messages while here.
* Parser unification: use nice ohashes for all three request and macro tables; | Ingo Schwarze | 2017-04-29 | 1 | -303/+297
  no functional change, minus two source files, minus 200 lines of code.
* Continue parser unification: | Ingo Schwarze | 2017-04-24 | 1 | -280/+131
  * Make enum rofft an internal interface as enum roff_tok in "roff.h".
  * Represent mdoc and man macros in enum roff_tok.
  * Make TOKEN_NONE a proper enum value and use it throughout.
  * Put the prologue macros first in the macro tables.
  * Unify mdoc_macroname[] and man_macroname[] into roff_name[].
* Fix blunder in previous: we must keep the line parse buffer | Ingo Schwarze | 2017-03-09 | 1 | -0/+2
  consistent even when aborting the parsing of the line. That buffer is not our own, but owned and reused by mparse_buf_r(), read.c. Returning without cleanup leaked memory and caused write overruns of the old, typically much smaller buffer in mparse_buf_r(). Promptly noticed by tb@ with afl(1), using MALLOC_OPTIONS=C.
* prevent infinite recursion while expanding the arguments | Ingo Schwarze | 2017-03-08 | 1 | -2/+15
  of a user-defined macro; issue found by tb@ with afl(1)
* remove a few redundant conditions that jsg@ found with cppcheck | Ingo Schwarze | 2017-03-03 | 1 | -1/+1
* Fix previous: do not access the byte before the string if the string | Ingo Schwarze | 2017-03-03 | 1 | -1/+1
  is empty; found by jsg@ with afl(1).
* Fix a read buffer overrun that copied random data from memory into | Ingo Schwarze | 2017-02-17 | 1 | -3/+11
  text nodes when a string passed to deroff() ended in a backslash and the byte after the terminating NUL was non-NUL, found by tb@ with afl(1).
  Invalid bytes so copied with the high bit set could later sometimes trigger another out of bounds read access to static memory in roff_strdup(), so add an assertion there to abort safely in case of similar data corruption.
* Skipping all escape sequences at the beginning of strings in deroff() | Ingo Schwarze | 2017-01-12 | 1 | -8/+4
  was too aggressive. There are strings that legitimately begin with an escape sequence. Only skip leading escape sequences representing whitespace. Bug reported by martijn@.
* For the .Ux/.Ox family of macros, do text production at the validation | Ingo Schwarze | 2017-01-10 | 1 | -2/+6
  stage rather than in each and every individual formatter, using the new NODE_NOSRC flag. More rigorous and also ten lines less code.
* simplify; NODE_ENDED does no harm in man(7) | Ingo Schwarze | 2017-01-10 | 1 | -8/+2
* unify names of AST node flags; no change of cpp output | Ingo Schwarze | 2017-01-10 | 1 | -8/+8
* Delete the redundant "nchild" member of struct roff_node, replacing | Ingo Schwarze | 2016-01-08 | 1 | -3/+0
  most uses by one, a few by two pointer checks, and only one by a tiny loop - not only making data smaller, but code shorter as well. This gets rid of an implicit invariant that confused both static analysis tools and human auditors. No functional change.
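How a stored child count can be replaced by pointer checks, as the "nchild" entry above describes, might look roughly like this. The sketch is in Python rather than C; the `Node` class, the helper names, and the mapping of each check to a particular former use of `nchild` are assumptions for illustration, with the fields `child`, `last` and `next` modelled on a first-child / next-sibling tree.

```python
class Node:
    """Minimal stand-in for a syntax tree node (hypothetical, not struct roff_node)."""

    def __init__(self, name):
        self.name = name
        self.child = None   # first child, or None
        self.last = None    # last child, or None
        self.next = None    # next sibling, or None


def append_child(parent, node):
    """Append node as the last child of parent; no counter to keep in sync."""
    if parent.child is None:
        parent.child = node
    else:
        parent.last.next = node
    parent.last = node


def has_children(n):
    # One pointer check replaces a test like "nchild > 0".
    return n.child is not None


def has_exactly_one_child(n):
    # Two pointer checks replace a test like "nchild == 1".
    return n.child is not None and n.child.next is None


def count_children(n):
    # The rare case that needs the actual number: a tiny loop.
    count = 0
    c = n.child
    while c is not None:
        count += 1
        c = c.next
    return count


root = Node("root")
append_child(root, Node("a"))
append_child(root, Node("b"))
print(has_children(root), has_exactly_one_child(root), count_children(root))
```

Without a separate counter there is no invariant of the form "count equals list length" that could silently go stale.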
* move man(7) validation into the dedicated validation phase, too | Ingo Schwarze | 2015-10-22 | 1 | -2/+2
* Move all mdoc(7) node validation done before child parsing | Ingo Schwarze | 2015-10-21 | 1 | -28/+12
  to the new separate validation pass, except for a tiny bit needed by the parser which goes to the new mdoc_state() module; cleaner, simpler, and surprisingly also shorter by 15 lines.
* In order to become able to generate syntax tree nodes on the roff(7) | Ingo Schwarze | 2015-10-20 | 1 | -3/+8
  level, validation must be separated from parsing and rewinding. This first big step moves calling of the mdoc(7) post_*() functions out of the parser loop into their own mdoc_validate() pass, while using a new mdoc_state() module to make syntax tree state handling available to both the parser loop and the validation pass.
* Delete two preprocessor constants that are no longer used. | Ingo Schwarze | 2015-10-15 | 1 | -3/+0
  Patch from Michael Reed <m dot reed at mykolab dot com>.
* Major character table cleanup: | Ingo Schwarze | 2015-10-13 | 1 | -4/+2
  * Use ohash(3) rather than a hand-rolled hash table.
  * Make the character table static in the chars.c module: There is no need to pass a pointer around, we most certainly never want to use two different character tables concurrently.
  * No need to keep the characters in a separate file chars.in; that merely encourages downstream porters to mess with them.
  * Sort the characters to agree with the mandoc_chars(7) manual page.
  * Specify Unicode codepoints in hex, not decimal (that's the detail that originally triggered this patch).
  No functional change, minus 100 LOC, and i don't see a performance change.
* To make the code more readable, delete 283 /* FALLTHROUGH */ comments | Ingo Schwarze | 2015-10-12 | 1 | -24/+0
  that were right between two adjacent case statements. Keep only those 24 where the first case actually executes some code before falling through to the next case.
* modernize style: "return" is not a function | Ingo Schwarze | 2015-10-06 | 1 | -128/+128
* /* NOTREACHED */ after abort() is silly, delete it | Ingo Schwarze | 2015-09-26 | 1 | -3/+2
* If we have to reparse the text line because we spring an input line trap, | Ingo Schwarze | 2015-08-29 | 1 | -16/+17
  we must not escape breakable hyphens yet, or mparse_buf_r() in read.c will complain and replace the escaped hyphens with question marks. Bug found in ocserv(8) following a report from Kurt Jaeger <pi at FreeBSD>.
* Implement the escape sequence \\$*, expanding to all arguments | Ingo Schwarze | 2015-08-29 | 1 | -15/+26
  of the current user-defined macro. This is another missing feature required for ocserv(8). Problem reported by Kurt Jaeger <pi at FreeBSD>.
* Minimal implementation of the read-only number register \n(.$ | Ingo Schwarze | 2015-08-29 | 1 | -8/+18
  which returns the number of arguments of the current macro. This is one of the missing features required for ocserv(8). Problem reported by Kurt Jaeger <pi at FreeBSD>.
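The two macro-argument features in the entries above, \\$* and \n(.$, can be illustrated with a small expansion routine. This is a Python sketch, not mandoc's code: `expand_args` and `arg_count` are hypothetical names, joining the arguments of \\$* with single spaces is an assumption, and quote escaping during expansion is not modelled.

```python
import re


def expand_args(body, args):
    """Expand \\\\$1..\\\\$9 and \\\\$* (written with two backslashes, as in a
    macro definition) using the arguments of the current macro call."""

    def repl(match):
        which = match.group(1)
        if which == "*":
            return " ".join(args)        # all arguments; separator is an assumption
        index = int(which)
        # Arguments that were not supplied expand to nothing.
        return args[index - 1] if index <= len(args) else ""

    # Two literal backslashes, a dollar sign, then "*" or a digit 1-9.
    return re.sub(r"\\\\\$(\*|[1-9])", repl, body)


def arg_count(args):
    """Value a register like \\n(.$ would report for the current macro call."""
    return len(args)


print(expand_args(r".Nm \\$1 and \\$*", ["a", "b"]))
print(arg_count(["a", "b"]))
```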
* Ignore blank characters at the beginning of a conditional block, | Ingo Schwarze | 2015-06-27 | 1 | -0/+2
  that is, after "\{". Issue found by Markus <Waldeck at gmx dot de> in bash(1).
* Implement the roff(7) `r' (register exists) conditional. | Ingo Schwarze | 2015-05-31 | 1 | -5/+31
  Missing feature found by Markus <Waldeck at gmx dot de> in Debian's bash(1) manual page.
* Setting the "last" member of struct roff_node was done at an extremely | Ingo Schwarze | 2015-05-01 | 1 | -0/+1
  weird place. Move it to the obviously correct place. Surprisingly, this didn't cause any misformatting in the test suite or in any base system manuals, but i cannot believe the code was really correct for all conceivable input, and it would be very hard to verify. At the very least, it cannot have worked for man(7).
* Unify mdoc_deroff() and man_deroff() into a common function deroff(). | Ingo Schwarze | 2015-04-23 | 1 | -0/+46
  No functional change except that for mdoc(7), it now skips leading escape sequences just like it already did for man(7). Escape sequences rarely occur in mdoc(7) code and if they do, skipping them is an improvement in this context. Minus 30 lines of code.
* Unify trickier node handling functions. | Ingo Schwarze | 2015-04-19 | 1 | -0/+21
  * man_elem_alloc() -> roff_elem_alloc()
  * man_block_alloc() -> roff_block_alloc()
  The functions mdoc_elem_alloc() and mdoc_block_alloc() remain for now because they need to do mdoc(7)-specific argument processing.
* Unify some node handling functions that use TOKEN_NONE. | Ingo Schwarze | 2015-04-19 | 1 | -2/+61
  * mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc()
  * mdoc_word_append(), man_word_append() -> roff_word_append()
  * mdoc_addspan(), man_addspan() -> roff_addtbl()
  * mdoc_addeqn(), man_addeqn() -> roff_addeqn()
  Minus 50 lines of code, no functional change.
* Unify node handling functions: | Ingo Schwarze | 2015-04-19 | 1 | -6/+204
  * node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc()
  * node_append() for mdoc and man_node_append() -> roff_node_append()
  * mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc()
  * mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc()
  * mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink()
  * mdoc_node_free() and man_node_free() -> roff_node_free()
  * mdoc_node_delete() and man_node_delete() -> roff_node_delete()
  Minus 130 lines of code, no functional change.
* Unify {mdoc,man}_{alloc,reset,free}() into roff_man_{alloc,reset,free}(). | Ingo Schwarze | 2015-04-18 | 1 | -1/+69
  Minus 80 lines of code, no functional change. Written on the train from Koeln to Wolfsburg returning from p2k15.
* Don't allow breaking the output line after hyphens following escape | Ingo Schwarze | 2015-04-04 | 1 | -0/+2
  sequences. Improves tic(1), sxpm(1), and a few Perl manuals. Quirk found by naddy@ in milter-greylist(8).
* Escape quotes when expanding macro arguments. | Ingo Schwarze | 2015-02-21 | 1 | -16/+76
  This fixes a bug naddy@ found in plan9/rc(1).
* Cope with another one of the many kinds of DocBook stupidity: | Ingo Schwarze | 2015-02-17 | 1 | -2/+11
  Instead of just using .br, DocBook sometimes fiddles with the utterly unportable internal register \n[an-break-flag] that is only available in the GNU implementation of man(7) and then arms an input line trap to call the equally unportable internal macro .an-trap that, in the GNU implementation, inspects that variable; all the world is GNU, isn't it?
  Since naddy@ reports that quite a few ports manuals suffer from this insanity, let's just translate it to the intended .br.
  Et ceterum censeo DocBookem esse delendam.
* Let .it accept numerical expressions, not just numerical constants. | Ingo Schwarze | 2015-02-17 | 1 | -36/+41
  For .it, ignore scaling units in roff_getnum(). Inside parentheses, skip whitespace after a sign in roff_getnum(). Parse and ignore unary plus in roff_getnum(). As a bonus, get rid of the only call to mandoc_strntoi() in roff.c.
* replace the last legacy generic message type, "argument count wrong", | Ingo Schwarze | 2015-02-06 | 1 | -4/+5
  by more specific messages, improving diagnostics for .cc .tr .Bl -column
* correctly handle table layout lines starting with a dot | Ingo Schwarze | 2015-01-30 | 1 | -1/+1