path: root/mdoc.c
Commit message | Author | Age | Files | Lines
* Support manual tagging of .Pp, .Bd, .D1, .Dl, .Bl, and .It.Ingo Schwarze2020-04-061-5/+8
| | | | | | In HTML output, improve the logic for writing inside permalinks: skip them when there is no child content or when there is a risk that the children might contain flow content.
* Cleanup, no functional change:Ingo Schwarze2018-12-311-3/+3
| | | | | | Use the new parser flag ROFF_NOFILL in the mdoc(7) parser, too, instead of the old MDOC_LITERAL, which was an alias for the former MAN_LITERAL.
* Cleanup, minus 15 LOC, no functional change:Ingo Schwarze2018-12-311-9/+0
| | | | | | | | | Simplify the way the man(7) and mdoc(7) validators are called. Reset the parser state with a common function before calling them. There is no need to reset the parser state again afterwards because the parsers are no longer used after validation. This allows getting rid of man_node_validate() and mdoc_node_validate() as separate functions.
* Cleanup, no functional change:Ingo Schwarze2018-12-301-1/+1
| | | | | | | | | | | | | | The struct roff_man used to be a bad mixture of internal parser state and public parsing results. Move the public results to the parsing result struct roff_meta, which is already public. Move the rest of struct roff_man to the parser-internal header roff_int.h. Since the validators need access to the parser state, call them from the top level parser during mparse_result() rather than from the main programs, also reducing code duplication. This keeps parser internal state out of the main programs (five in mandoc portable) and out of eight formatters.
* Almost mechanical diff to remove the "struct mparse *" argumentIngo Schwarze2018-12-141-10/+5
| | | | | | | | from mandoc_msg(), where it is no longer used. While here, rename mandoc_vmsg() to mandoc_msg() and retire the old version: There is really no point in having another function merely to save "%s" in a few places. Minus 140 lines of code.
* Clean up the validation of .Pp, .PP, .sp, and .br. Make sure allIngo Schwarze2018-12-041-9/+0
| | | | | | | | | | | | | | combinations are handled, and are handled in a systematic manner. This resolves some erratic duplicate handling, handles a number of missing cases, and improves diagnostics in various respects. Move validation of .br and .sp to the roff validation module rather than doing that twice in the mdoc and man validation modules. Move the node relinking function to the roff library where it belongs. In validation functions, only look at the node itself, at previous nodes, and at descendants, not at following nodes or ancestors, such that only nodes are inspected which are already validated.
* Remove more pointer arithmetic passing via regions outside the arrayIngo Schwarze2018-08-171-12/+6
| | | | | that is undefined according to the C standard. Robert Elz <kre at munnari dot oz dot au> pointed out i wasn't quite done yet.
* Make the "new sentence, new line" check stricter, allowing digitsIngo Schwarze2017-08-111-2/+2
| | | | | | in the last two letters of the last word of the sentence. No false positives in base or Xenocara. Suggested by and OK jmc@.
* correct handling of blank lines after \cIngo Schwarze2017-06-171-6/+22
|
* Also catch "new sentence, new line" if there are three blanksIngo Schwarze2017-06-071-6/+12
| | | | | between the sentences. Thomas Klausner says he has seen some of these, and i don't see any false positives.
* Make "new sentence, new line" detection stricter:Ingo Schwarze2017-06-071-1/+1
| | | | | | | | | Also catch cases where the new sentence starts with a one-letter word and the input line is broken right after that word. Suggested by Thomas Klausner <wiz @ NetBSD>. It's merely a three-bit diff, changing one byte from 0x34 to 0x33, so what can possibly go wrong...
* Move .sp to the roff modules. Enough infrastructure is in placeIngo Schwarze2017-05-051-1/+1
| | | | now that this actually saves code: -70 LOC.
* Parser unification: use nice ohashes for all three request and macro tables;Ingo Schwarze2017-04-291-16/+9
| | | | no functional change, minus two source files, minus 200 lines of code.
* Continue parser unification:Ingo Schwarze2017-04-241-46/+8
| | | | | | | | * Make enum rofft an internal interface as enum roff_tok in "roff.h". * Represent mdoc and man macros in enum roff_tok. * Make TOKEN_NONE a proper enum value and use it throughout. * Put the prologue macros first in the macro tables. * Unify mdoc_macroname[] and man_macroname[] into roff_name[].
* remove a few redundant conditions that jsg@ found with cppcheckIngo Schwarze2017-03-031-1/+1
|
* Remove the ENDBODY_NOSPACE flag, simplifying the code.Ingo Schwarze2017-02-161-2/+2
| | | | | | | | Comparing to groff output, it appears that all cases where it was used and made a difference actually require the opposite, ENDBODY_SPACE. I have no idea why i added it back in 2010; maybe to compensate for some other bug that has long been fixed.
* Add a warning "new sentence, new line".Ingo Schwarze2017-01-281-1/+17
| | | | | | | | | This does not attempt to pinpoint each and every offender, but instead tries very hard to avoid false positives: Currently, there are only two false positives in the whole OpenBSD base system. Only do this in mdoc(7), not in man(7), because manuals written in man(7) typically have much worse problems than this. OK jmc@ on a previous version of the patch
* unify names of AST node flags; no change of cpp outputIngo Schwarze2017-01-101-4/+4
|
* If a column list starts with implicit rows (that is, rows without .It)Ingo Schwarze2016-08-201-42/+19
| | | | | | and roff-level nodes (e.g. tbl or eqn) follow, don't run into an assertion. Instead, wrap the roff-level nodes in their own row. Issue found by tb@ with afl(1).
* If a .Bd block has no arguments at all, drop the block and only keepIngo Schwarze2015-10-301-0/+1
| | | | | its contents. Removing a gratuitous difference to groff output found after a related bug report from krw@.
* In order to become able to generate syntax tree nodes on the roff(7)Ingo Schwarze2015-10-201-19/+10
| | | | | | | | level, validation must be separated from parsing and rewinding. This first big step moves calling of the mdoc(7) post_*() functions out of the parser loop into their own mdoc_validate() pass, while using a new mdoc_state() module to make syntax tree state handling available to both the parser loop and the validation pass.
* To make the code more readable, delete 283 /* FALLTHROUGH */ commentsIngo Schwarze2015-10-121-12/+0
| | | | | | that were right between two adjacent case statements. Keep only those 24 where the first case actually executes some code before falling through to the next case.
* modernize style: "return" is not a functionIngo Schwarze2015-10-061-24/+24
|
* Unify mdoc_deroff() and man_deroff() into a common function deroff().Ingo Schwarze2015-04-231-39/+0
| | | | | | | | No functional change except that for mdoc(7), it now skips leading escape sequences just like it already did for man(7). Escape sequences rarely occur in mdoc(7) code and if they do, skipping them is an improvement in this context. Minus 30 lines of code.
* Get rid of two empty wrapper functions. No functional change.Ingo Schwarze2015-04-231-7/+0
|
* Unify trickier node handling functions.Ingo Schwarze2015-04-191-1/+1
| | | | | | | * man_elem_alloc() -> roff_elem_alloc() * man_block_alloc() -> roff_block_alloc() The functions mdoc_elem_alloc() and mdoc_block_alloc() remain for now because they need to do mdoc(7)-specific argument processing.
* Unify some node handling functions that use TOKEN_NONE.Ingo Schwarze2015-04-191-53/+1
| | | | | | | | * mdoc_word_alloc(), man_word_alloc() -> roff_word_alloc() * mdoc_word_append(), man_word_append() -> roff_word_append() * mdoc_addspan(), man_addspan() -> roff_addtbl() * mdoc_addeqn(), man_addeqn() -> roff_addeqn() Minus 50 lines of code, no functional change.
* Decouple the token code for "no request or macro" from the individualIngo Schwarze2015-04-191-9/+10
| | | | | | high-level parsers to allow further unification of functions that only need to recognize this code, but that don't care about different high-level macrosets beyond that.
* Unify node handling functions:Ingo Schwarze2015-04-191-215/+19
| | | | | | | | | | | * node_alloc() for mdoc and man_node_alloc() -> roff_node_alloc() * node_append() for mdoc and man_node_append() -> roff_node_append() * mdoc_head_alloc() and man_head_alloc() -> roff_head_alloc() * mdoc_body_alloc() and man_body_alloc() -> roff_body_alloc() * mdoc_node_unlink() and man_node_unlink() -> roff_node_unlink() * mdoc_node_free() and man_node_free() -> roff_node_free() * mdoc_node_delete() and man_node_delete() -> roff_node_delete() Minus 130 lines of code, no functional change.
* Delete the wrapper functions mdoc_meta(), man_meta(), mdoc_node(),Ingo Schwarze2015-04-181-14/+0
| | | | | | man_node() from the mandoc(3) semi-public interface and the internal wrapper functions print_mdoc() and print_man() from the HTML formatters. Minus 60 lines of code, no functional change.
* Unify {mdoc,man}_{alloc,reset,free}() into roff_man_{alloc,reset,free}().Ingo Schwarze2015-04-181-84/+0
| | | | | Minus 80 lines of code, no functional change. Written on the train from Koeln to Wolfsburg returning from p2k15.
* Move mdoc_hash_init() and man_hash_init() to libmandoc.hIngo Schwarze2015-04-181-1/+0
| | | | | and call them from mparse_alloc() and choose_parser(), preparing unified allocation of struct roff_man.
* Profit from the unified struct roff_man and reduce the number ofIngo Schwarze2015-04-181-0/+1
| | | | | arguments of mparse_result() by one. No functional change. Written on the ICE Bruxelles-Koeln on the way back from p2k15.
* Replace the structs mdoc and man by a unified struct roff_man.Ingo Schwarze2015-04-181-51/+51
| | | | | Almost completely mechanical, no functional change. Written on the train from Exeter to London returning from p2k15.
* Third step towards parser unification:Ingo Schwarze2015-04-021-4/+4
| | | | | Replace struct mdoc_meta and struct man_meta by a unified struct roff_meta. Written on the train from London to Exeter on the way to p2k15.
* Second step towards parser unification:Ingo Schwarze2015-04-021-41/+41
| | | | | | | | | Replace struct mdoc_node and struct man_node by a unified struct roff_node. To be able to use the tok member for both mdoc(7) and man(7) without defining all the macros in roff.h, sacrifice a tiny bit of type safety and make tok an int rather than an enum. Almost mechanical, no functional change. Written on the Eurostar from Bruxelles to London on the way to p2k15.
* First step towards parser unification:Ingo Schwarze2015-04-021-36/+37
| | | | | | Replace enum mdoc_type and enum man_type by a unified enum roff_type. Almost mechanical, no functional change. Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.
* Do not confuse .Bl -column lists that just broke another blockIngo Schwarze2015-02-121-4/+4
| | | | | | with newly opened .Bl -column lists; fixing an assertion failure jsg@ found with afl: test case #481, Bl It Bl -column It Bd El text text El
* Delete the mdoc_node.pending pointer and the function calculatingIngo Schwarze2015-02-121-2/+4
| | | | | | | | | | | | | | | | | | | | | | | it, make_pending(), which was the most difficult function of the whole mdoc(7) parser. After almost five years of maintaining this hellhole, i just noticed the pointer isn't needed after all. Blocks are always rewound in the reverse order they were opened; that even holds for broken blocks. Consequently, it is sufficient to just mark broken blocks with the flag MDOC_BROKEN and breaking blocks with the flag MDOC_ENDED. When rewinding, instead of iterating the pending pointers, just iterate from each broken block to its parents, rewinding all that are MDOC_ENDED and stopping after processing the first ancestor that is not MDOC_BROKEN. For ENDBODY markers, use the mdoc_node.body pointer in place of the former mdoc_node.pending. This also fixes an assertion failure found by jsg@ with afl, test case #467 (Bo Bl It Bd Bc It), where (surprise surprise) the pending pointer got corrupted. Improved functionality, minus one function, minus one struct field, minus 50 lines of code.
* Simplify by deleting the "lastline" member of struct mdoc_node.Ingo Schwarze2015-02-051-1/+0
| | | | Minus one struct member, minus 17 lines of code, no functional change.
* Get rid of all calls to rew_sub() in blk_exp_close(); only ten callsIngo Schwarze2015-02-021-1/+2
| | | | | | remain in other functions. As a bonus, this fixes an assertion failure jsg@ found some time ago with afl (test case 982) and improves minor details in error reporting.
* Fatal errors no longer exist.Ingo Schwarze2015-01-151-2/+1
| | | | | | If a file can be opened, mandoc will produce some output; at worst, the output may be almost empty. Simplifies error handling and frees a message type for future use.
* Simplify by making the eqn and tbl steering functions void;Ingo Schwarze2014-11-281-4/+2
| | | | no functional change, minus 15 lines of code.
* Simplify by making the mdoc parser callbacks void, and some cleanup;Ingo Schwarze2014-11-281-14/+18
| | | | no functional change, minus 50 lines of code.
* Simplify the code by making various mdoc parser helper functions void.Ingo Schwarze2014-11-281-25/+15
| | | | No functional change, minus 130 lines of code.
* Simplify code by making mdoc validation handlers void.Ingo Schwarze2014-11-281-37/+17
| | | | No functional change, minus 90 lines of code.
* Escape sequences terminate high-level macro names, and when doing so,Ingo Schwarze2014-11-191-7/+17
| | | | | | they are ignored, just in the same way as for request names and for low-level macro names. This also cures a warning in the pod2man(1) preamble.
* correct the spacing after in-line equationsIngo Schwarze2014-10-201-1/+2
| | | | | that start at the beginning of an input line but end before the end of an input line
* correct spacing before inline equationsIngo Schwarze2014-10-201-0/+2
|
* Implement in-line equations, much needed by Xenocara manuals.Ingo Schwarze2014-10-161-57/+0
| | | | | | | | Put the steering into the roff parser rather than into the mdoc parser such that it works for all macro languages and on both text and macro lines. Line breaks and blank characters generated before and after in-line equations are not perfect yet, but let's do one thing at a time.
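
Note on the 2015-02-12 entry that deletes mdoc_node.pending: the ancestor walk it describes can be sketched roughly as follows. This is only one reading of the commit message, not the code from mdoc.c. The struct sk_node, the SK_* flag values, the SK_CLOSED marker, and the function sk_rewind_from_broken() are hypothetical stand-ins invented for illustration; the real parser uses struct roff_node (then struct mdoc_node) with MDOC_BROKEN and MDOC_ENDED and does considerably more work when rewinding a block.

```c
#include <stddef.h>

/* Hypothetical flags; the real names are MDOC_BROKEN and MDOC_ENDED. */
enum {
	SK_BROKEN = 1 << 0,	/* block was broken by another block */
	SK_ENDED  = 1 << 1,	/* block has seen its end and awaits rewinding */
	SK_CLOSED = 1 << 2	/* illustration only: block has been rewound */
};

/* Hypothetical, minimal stand-in for a syntax tree node. */
struct sk_node {
	struct sk_node	*parent;
	int		 flags;
};

/*
 * Starting from a broken block, walk up the parent chain.
 * Rewind (here: mark SK_CLOSED) every ancestor that is SK_ENDED,
 * and stop after processing the first ancestor that is not SK_BROKEN.
 * Returns the number of ancestors rewound.
 */
int
sk_rewind_from_broken(struct sk_node *broken)
{
	struct sk_node	*p;
	int		 nclosed = 0;

	for (p = broken->parent; p != NULL; p = p->parent) {
		if ((p->flags & SK_ENDED) && !(p->flags & SK_CLOSED)) {
			p->flags |= SK_CLOSED;
			nclosed++;
		}
		if (!(p->flags & SK_BROKEN))
			break;
	}
	return nclosed;
}
```

The point of the sketch is that no extra per-node pointer is needed: the two flags plus the existing parent links carry enough information to decide where rewinding stops.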