path: root/mdoc.c
Commit message | Author | Age | Files | Lines
* Third step towards parser unification:Ingo Schwarze2015-04-021-4/+4
| | | | | Replace struct mdoc_meta and struct man_meta by a unified struct roff_meta. Written on the train from London to Exeter on the way to p2k15.
* Second step towards parser unification:Ingo Schwarze2015-04-021-41/+41
| | | | | | | | | Replace struct mdoc_node and struct man_node by a unified struct roff_node. To be able to use the tok member for both mdoc(7) and man(7) without defining all the macros in roff.h, sacrifice a tiny bit of type safety and make tok an int rather than an enum. Almost mechanical, no functional change. Written on the Eurostar from Bruxelles to London on the way to p2k15.
* First step towards parser unification:Ingo Schwarze2015-04-021-36/+37
| | | | | | Replace enum mdoc_type and enum man_type by a unified enum roff_type. Almost mechanical, no functional change. Written on the ICE train from Frankfurt to Bruxelles on the way to p2k15.
* Do not confuse .Bl -column lists that just broke another blockIngo Schwarze2015-02-121-4/+4
| | | | | | with newly opened .Bl -column lists; fixing an assertion failure jsg@ found with afl: test case #481, Bl It Bl -column It Bd El text text El
* Delete the mdoc_node.pending pointer and the function calculatingIngo Schwarze2015-02-121-2/+4
| | | | | | | | | | | | | | | | | | | | | | | it, make_pending(), which was the most difficult function of the whole mdoc(7) parser. After almost five years of maintaining this hellhole, i just noticed the pointer isn't needed after all. Blocks are always rewound in the reverse order they were opened; that even holds for broken blocks. Consequently, it is sufficient to just mark broken blocks with the flag MDOC_BROKEN and breaking blocks with the flag MDOC_ENDED. When rewinding, instead of iterating the pending pointers, just iterate from each broken block to its parents, rewinding all that are MDOC_ENDED and stopping after processing the first ancestor that is not MDOC_BROKEN. For ENDBODY markers, use the mdoc_node.body pointer in place of the former mdoc_node.pending. This also fixes an assertion failure found by jsg@ with afl, test case #467 (Bo Bl It Bd Bc It), where (surprise surprise) the pending pointer got corrupted. Improved functionality, minus one function, minus one struct field, minus 50 lines of code.
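A minimal sketch of the rewinding rule described in this commit message, not the actual mdoc.c code. The struct layout, the flag values, the `SK_CLOSED` marker and the choice to start the walk at the parent of the broken block are all assumptions made for illustration; only the rule itself (rewind every ancestor that has ended, stop after the first ancestor that is not broken) comes from the message.

```c
#include <stddef.h>

/* Hypothetical flags, loosely modelled on MDOC_BROKEN and MDOC_ENDED. */
#define SK_BROKEN (1 << 0)	/* block was broken by another block */
#define SK_ENDED  (1 << 1)	/* block has ended, closing was postponed */
#define SK_CLOSED (1 << 2)	/* set by this sketch when a block is rewound */

/* Hypothetical node type; the real syntax tree node has many more fields. */
struct sk_node {
	struct sk_node	*parent;
	int		 flags;
};

/*
 * Walk from a broken block towards the root.  Rewind every ancestor
 * that has already ended, and stop after processing the first
 * ancestor that is not itself broken.
 * Returns the number of blocks rewound.
 */
int
sk_rewind_pending(struct sk_node *broken)
{
	struct sk_node	*n;
	int		 closed = 0;

	for (n = broken->parent; n != NULL; n = n->parent) {
		if (n->flags & SK_ENDED) {
			n->flags |= SK_CLOSED;
			closed++;
		}
		if ((n->flags & SK_BROKEN) == 0)
			break;
	}
	return closed;
}
```

Because each node only needs its parent pointer and two flag bits, no separate "pending" pointer has to be kept consistent while blocks are opened and closed.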
* Simplify by deleting the "lastline" member of struct mdoc_node.Ingo Schwarze2015-02-051-1/+0
| | | | Minus one struct member, minus 17 lines of code, no functional change.
* Get rid of all calls to rew_sub() in blk_exp_close(); only ten callsIngo Schwarze2015-02-021-1/+2
| | | | | | remain in other functions. As a bonus, this fixes an assertion failure jsg@ found some time ago with afl (test case 982) and improves minor details in error reporting.
* Fatal errors no longer exist.Ingo Schwarze2015-01-151-2/+1
| | | | | | If a file can be opened, mandoc will produce some output; at worst, the output may be almost empty. Simplifies error handling and frees a message type for future use.
* Simplify by making the eqn and tbl steering functions void;Ingo Schwarze2014-11-281-4/+2
| | | | no functional change, minus 15 lines of code.
* Simplify by making the mdoc parser callbacks void, and some cleanup;Ingo Schwarze2014-11-281-14/+18
| | | | no functional change, minus 50 lines of code.
* Simplify the code by making various mdoc parser helper functions void.Ingo Schwarze2014-11-281-25/+15
| | | | No functional change, minus 130 lines of code.
* Simplify code by making mdoc validation handlers void.Ingo Schwarze2014-11-281-37/+17
| | | | No functional change, minus 90 lines of code.
* Escape sequences terminate high-level macro names, and when doing so,Ingo Schwarze2014-11-191-7/+17
| | | | | | they are ignored, just in the same way as for request names and for low-level macro names. This also cures a warning in the pod2man(1) preamble.
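A small sketch of the name-termination rule, assuming that a macro name ends at the first blank, tab, or backslash. The function name `sk_macro_namelen` is hypothetical, and skipping over the escape sequence itself is not modelled here.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the length of the macro name at the start of p.
 * The name ends at a blank, a tab, the end of the string,
 * or the backslash that introduces an escape sequence.
 */
size_t
sk_macro_namelen(const char *p)
{
	return strcspn(p, " \t\\");
}
```

With this rule, an input such as `Pp\" comment` yields the two-character name `Pp`, the same as `Pp` followed by a blank.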
* correct the spacing after in-line equationsIngo Schwarze2014-10-201-1/+2
| | | | | that start at the beginning of an input line but end before the end of an input line
* correct spacing before inline equationsIngo Schwarze2014-10-201-0/+2
* Implement in-line equations, much needed by Xenocara manuals.Ingo Schwarze2014-10-161-57/+0
| | | | | | | | Put the steering into the roff parser rather than into the mdoc parser such that it works for all macro languages and on both text and macro lines. Line breaks and blank characters generated before and after in-line equations are not perfect yet, but let's do one thing at a time.
* Simplify by handling empty request lines at the one logical placeIngo Schwarze2014-09-061-9/+0
| | | | | in the roff parser instead of in three other places in other parsers. No functional change.
* Get rid of HAVE_CONFIG_H, it is always defined; idea from libnbcompat.Ingo Schwarze2014-08-101-2/+0
| | | | | | Include <sys/types.h> where needed, it does not belong in config.h. Remove <stdio.h> from config.h; if it is missing somewhere, it should be added, but i cannot find a *.c file where it is missing.
* Bring the handling of defective prologues even closer to groff,Ingo Schwarze2014-08-061-38/+21
| | | | | | | | | | | | in particular relaxing the distinction between prologue and body and further improving messages. * The last .Dd wins and the last .Os wins, even in the body. * The last .Dt before the first body macro wins. * Missing title in .Dt defaults to UNTITLED. Warn about it. * Missing section in .Dt does not default to 1. But warn about it. * Do not warn multiple times about the same mdoc(7) prologue macro. * Warn about missing .Os. * Incomplete .TH defaults to empty strings. Warn about it.
* mention requests and macros in more messagesIngo Schwarze2014-08-011-2/+2
* garbage collect three unused global flags; no functional changeIngo Schwarze2014-07-301-34/+6
* mark defos as const; nobody needs to change it,Ingo Schwarze2014-07-091-1/+1
| | | | and it is occasionally useful to be able to pass literal strings
* no need to skip content before first section headerIngo Schwarze2014-07-071-21/+0
* Clean up messages related to plain text and to escape sequences.Ingo Schwarze2014-07-061-4/+8
| | | | | * Mention invalid escape sequences and string names, and fallbacks. * Hierarchical naming.
* Implement the obsolete macros .En .Es .Fr .Ot for backward compatibility,Ingo Schwarze2014-07-021-0/+2
| | | | | since this is hardly more complicated than explicitly ignoring them as we did in the past. Of course, do not use them!
* Clean up the warnings related to document structure.Ingo Schwarze2014-07-011-2/+2
| | | | | | | | | * Hierarchical naming of the related enum mandocerr items. * Mention the offending macro, section title, or string. While here, improve some wordings: * Descriptive instead of imperative style. * Uniform style for "missing" and "skipping". * Where applicable, mention the fallback used.
* Start systematic improvements of error reporting.Ingo Schwarze2014-06-201-2/+4
| | | | | | | | | | | So far, this covers all WARNINGs related to the prologue. 1) hierarchical naming of MANDOCERR_* constants 2) mention the macro name in messages where that adds clarity 3) add one missing MANDOCERR_DATE_MISSING msg 4) fix the wording of one message related to the man(7) prologue Started on the plane back from Ottawa.
* Fix a minor optimization i broke in rev. 1.163 on August 20, 2010:Ingo Schwarze2014-04-251-1/+1
| | | | | | Do not bother looking into the hash table when the length of the macro already tells us it's invalid. No functional change. Noticed by jsg@, thanks!
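The optimization can be sketched as follows, assuming that valid macro names are two or three characters long. The table, the lookup counter and all `sk_` names are hypothetical stand-ins; a linear search replaces the real hash table to keep the example short.

```c
#include <stddef.h>
#include <string.h>

#define SK_TOK_NONE (-1)

/* Tiny stand-in for the macro table. */
static const char *const sk_names[] = { "Dd", "Dt", "Os", "Sh", "Brq" };

/* Counts table lookups so the effect of the length check is observable. */
int sk_lookups = 0;

/* Stand-in for the hash table lookup: a linear search over sk_names. */
int
sk_table_find(const char *name)
{
	size_t	 i;

	sk_lookups++;
	for (i = 0; i < sizeof(sk_names) / sizeof(sk_names[0]); i++)
		if (strcmp(sk_names[i], name) == 0)
			return (int)i;
	return SK_TOK_NONE;
}

/* Reject names of impossible length before touching the table. */
int
sk_macro_find(const char *name)
{
	size_t	 len = strlen(name);

	if (len < 2 || len > 3)
		return SK_TOK_NONE;
	return sk_table_find(name);
}
```

The result is the same with or without the length check; the check only avoids a lookup that cannot succeed.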
* KNF: case (FOO): -> case FOO:, remove /* LINTED */ and /* ARGSUSED */,Ingo Schwarze2014-04-201-89/+66
| | | | | remove trailing whitespace and blanks before tabs, improve some indenting; no functional change
* Implement the roff(7) .ll (line length) request.Ingo Schwarze2014-03-301-1/+1
| | | | | Found by naddy@ in the textproc/enchant(1) port. Of course, do not use this in new manuals.
* If an .Nd block contains macros, avoid fragmented entries in mandocdb(8),Ingo Schwarze2014-03-231-0/+40
| | | | instead use the .Nd content recursively.
* avoid repetitive code for asprintf error handlingIngo Schwarze2014-03-231-4/+1
* The files mandoc.c and mandoc.h contained both specialised low-levelIngo Schwarze2014-03-231-0/+1
| | | | | | | functions used for multiple languages (mdoc, man, roff), for example mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary functions. Split the auxiliaries out into their own file and header. While here, do some #include cleanup.
* Add an option -Q (quick) to mandocdb(8)Ingo Schwarze2014-01-051-2/+10
| | | | | | | | | | | | | | | for accelerated generation of reduced-size databases. Implement this by allowing the parsers to optionally abort the parse sequence after the NAME section. While here, garbage collect the unused void *arg attribute of struct mparse and mparse_alloc() and fix some errors in mandoc(3). This reduces the processing time of mandocdb(8) on /usr/share/man by a factor of 2 and the database size by a factor of 4. However, it still takes 5 times the time and 6 times the space of makewhatis(8), so more work is clearly needed.
* Simplify: Remove an unused argument from the mandoc_eos() function.Ingo Schwarze2013-12-311-1/+1
| | | | No functional change.
* When deciding whether two consecutive macros are on the same input line,Ingo Schwarze2013-12-241-0/+1
| | | | | | | | we have to compare the line where the first one *ends* (not where it begins) to the line where the second one starts. This fixes the bug that .Bk allowed output line breaks right after block macros spanning more than one input line, even when the next macro follows on the same line.
* There are three kinds of input lines: text lines, macros takingIngo Schwarze2013-10-211-1/+18
| | | | | | | | | | | | | | | positional arguments (like Dt Fn Xr) and macros taking text as arguments (like Nd Sh Em %T An). In the past, even the latter put each word of their arguments into its own MDOC_TEXT node; instead, concatenate arguments unless delimiters, keeps or spacing mode prevent that. Regarding mandoc(1), this is internal refactoring, no output change intended. Regarding mandocdb(8), this fixes yet another regression introduced when switching from DB to SQLite: The ability to search for strings crossing word boundaries was lost and is hereby restored. At the same time, database sizes and build times are both reduced by a bit more than 5% each.
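A minimal sketch of the concatenation idea: instead of creating one text node per word, each further word is appended to the existing node, separated by one blank. The `sk_text` type and `sk_text_append()` are hypothetical, and the conditions that prevent concatenation (delimiters, keeps, spacing mode) are deliberately left out.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical text node holding one heap-allocated string. */
struct sk_text {
	char	*string;
};

/*
 * Append word to node->string, separated by a single blank.
 * Returns 0 on success, -1 if memory allocation fails.
 */
int
sk_text_append(struct sk_text *node, const char *word)
{
	size_t	 oldlen = node->string == NULL ? 0 : strlen(node->string);
	size_t	 wordlen = strlen(word);
	/* old text + optional blank + word + terminating NUL */
	size_t	 newlen = oldlen + (oldlen > 0) + wordlen + 1;
	char	*p = realloc(node->string, newlen);

	if (p == NULL)
		return -1;
	if (oldlen > 0)
		p[oldlen++] = ' ';
	memcpy(p + oldlen, word, wordlen + 1);
	node->string = p;
	return 0;
}
```

Keeping the words in one string is what lets a search match a phrase that crosses word boundaries, since the phrase then exists as a contiguous string in a single node.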
* Support setting arbitrary roff(7) number registers,Ingo Schwarze2013-10-051-6/+4
| | | | | | | | | | | | | | preserving read support for the ".nr nS" SYNOPSIS state register. Inspired by NetBSD roff.c rev. 1.18 (Christos Zoulas, March 21, 2013), but implemented differently. I don't want to have yet another different implementation of a hash table in mandoc - it would be the second one in roff.c alone and the fifth one in mandoc grand total. Instead, i designed and implemented roff_setreg() and roff_getreg() to be similar to roff_setstrn() and roff_getstrn(). Once we feel the need to optimize, we can introduce one common hash table implementation for everything in mandoc.
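A sketch of a register store built on a simple linked list rather than a hash table, in the spirit of the design choice explained above. This is not the roff.c implementation: the `sk_` names, the struct layout and the `fallback` parameter are assumptions for illustration, and the actual signatures of roff_setreg() and roff_getreg() may differ.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical register entry in a singly linked list. */
struct sk_reg {
	struct sk_reg	*next;
	char		*name;
	int		 val;
};

/* Set a register, creating it if needed.  Returns 0, or -1 on failure. */
int
sk_setreg(struct sk_reg **head, const char *name, int val)
{
	struct sk_reg	*r;

	for (r = *head; r != NULL; r = r->next) {
		if (strcmp(r->name, name) == 0) {
			r->val = val;
			return 0;
		}
	}
	if ((r = malloc(sizeof(*r))) == NULL)
		return -1;
	if ((r->name = malloc(strlen(name) + 1)) == NULL) {
		free(r);
		return -1;
	}
	strcpy(r->name, name);
	r->val = val;
	r->next = *head;
	*head = r;
	return 0;
}

/* Look up a register; return fallback if it was never set. */
int
sk_getreg(const struct sk_reg *head, const char *name, int fallback)
{
	for (; head != NULL; head = head->next)
		if (strcmp(head->name, name) == 0)
			return head->val;
	return fallback;
}

/* Release all entries. */
void
sk_freeregs(struct sk_reg *head)
{
	struct sk_reg	*next;

	for (; head != NULL; head = next) {
		next = head->next;
		free(head->name);
		free(head);
	}
}
```

Lookups are linear in the number of registers, which is acceptable while documents define only a handful of them; a shared hash table could replace the list later without changing the callers.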
* Cleanup naming of local variables to make the code easier on the eye:Ingo Schwarze2012-11-171-140/+140
| | | | | | | | Settle for "struct man *man", "struct mdoc *mdoc", "struct meta *meta" and avoid the confusing "*m" which was sometimes this, sometimes that. No functional change. ok kristaps@ some time ago
* Fix a crash triggered by .Bl -tag .It Xo .El .Sh found by florian@.Ingo Schwarze2012-11-161-0/+3
| | | | | | | | | | | | | | | * When allocating a body end marker, copy the pointer to the normalized block information from the body block, avoiding the risk of subsequent null pointer dereference. * When inserting the body end marker into the syntax tree, do not try to copy that pointer from the parent block, because not being a direct child of the block it belongs to is the whole point of a body end marker. * Even non-callable blocks (like Bd and Bl) can break other blocks; when this happens, postpone closing them out in the usual way. Completed and tested at the OpenBSD impromptu Coimbra hackathon (c2k12). Thanks to Pedro Almeida and the Laboratório de Computação Avançada da Universidade de Coimbra (http://www.uc.pt/lca) for their hospitality!
* Fix handling of paragraph macros inside lists:Ingo Schwarze2012-07-181-1/+9
| | | | | | | * When they are trailing the last item, move them outside the list. * When they are trailing any other non-compact item, drop them. OpenBSD rev. mdoc_validate.c 1.107, mdoc.c 1.91
* The mdoc(7) \*(Ba predefined string actually forces roman font;Ingo Schwarze2012-07-181-1/+1
| | | | | | | | | | | | that's stupid because it may break enclosing font changes, but let's do the same for groff bug compatibility. --> Never use \*(Ba, use just plain "|"! <-- Also, predefined strings are already expanded by the roff(7) parser, so the mdoc(7) parser has to look for the expanded string. OpenBSD rev. mdoc.c 1.90 and predefs.in 1.3
* Several -mdoc parser improvements related to vertical spacing:Ingo Schwarze2012-07-161-1/+2
| | | | | | | | | | * So far, .Pp and .Lp were removed before paragraph type blocks. * Now also remove .br before paragraph type blocks. * Treat .Lp as a paragraph like .Pp, so remove .Pp, .Lp, .br before it. * Do not treat .sp as a paragraph, don't remove anything before it. * After .Sh, .Ss, .Pp, and .Lp, remove .Pp, .Lp, .sp, .br, and blank lines. * After .sp and .br, remove .br. OpenBSD rev. mdoc.c 1.89 and mdoc_validate.c 1.106
* Add `cc' support.Kristaps Dzonsons2012-06-121-1/+1
| | | | | | | | | | | | | | | This was reported by espie@ and in the TODO. Caveat: `cc' has buggy behaviour when invoked in groff(1) and followed by a line-breaking control character macro, e.g., in a -man doc, .cc | .B foo 'B foo |cc 'B foo will cause groff(1) to behave properly for `.B' but inline the macro definition for `B' when invoked with the line-breaking macro.
* Support -Ios='OpenBSD 5.1' to override uname(3) as the source of theIngo Schwarze2012-05-271-1/+2
| | | | | | | | | | default value for the mdoc(7) .Os macro. Needed for man.cgi on the OpenBSD website. Problem with man.cgi first noticed by deraadt@; beck@ and deraadt@ agree with the way to solve the issue. "Please check them in and I'll look into them later!" kristaps@
* implement .Ap .Bd .Bo .Bq .D1 .Ic .Lp .Oo .Pf .Po .Ss .Sx .Sy .br .spIngo Schwarze2011-09-301-0/+1
| | | | | | | | implement .Bl -bullet add more information to the .TH line escape dots at the beginnings of lines add trailing newline character at the end of the file do not misinterpret the ROOT block as .Ap
* An implementation of `tr'. This routes allocations of TEXT nodesKristaps Dzonsons2011-07-281-1/+1
| | | | | | | through libroff, which does the appropriate translations of `tr'. This is SLOW: it uses the backend of `ds' and `de', which is a simple linear list. However, unlike `ds' and `de', it iterates over EACH CHARACTER of the entire file looking for replacements.
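The cost described above can be seen in a small sketch: for every character of the text, the whole translation list is scanned. An array stands in for the linked list here, and `sk_tr` and `sk_translate()` are hypothetical names rather than libroff functions.

```c
#include <stddef.h>

/* Hypothetical translation pair, as set up by a `tr' request. */
struct sk_tr {
	char	 from;
	char	 to;
};

/*
 * Translate text in place.  Each character is compared against
 * every entry of the map until one matches, so the running time
 * is proportional to text length times map size.
 */
void
sk_translate(char *text, const struct sk_tr *map, size_t nmap)
{
	size_t	 i;

	for (; *text != '\0'; text++) {
		for (i = 0; i < nmap; i++) {
			if (*text == map[i].from) {
				*text = map[i].to;
				break;	/* translate each character once */
			}
		}
	}
}
```

The `break` matters: without it, a map containing both a-to-b and b-to-c would translate an input `a` twice.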
* Simplify word allocation in libmdoc and libman.Kristaps Dzonsons2011-07-271-8/+1
* Disable in-line eqn processing for a bit.Kristaps Dzonsons2011-07-271-1/+5
* First, roff_res() has no need to invoke ROFF_RERUN: since it's executedKristaps Dzonsons2011-07-271-5/+0
| | | | | | | | | | | before any other roff processing occurs, it's Ok to just let it do its thing and pass through. Also, make sure this function is ALWAYS called, not just when first_string is defined. Second, add a new function, roff_parsetext(), that post-processes non-macro lines. This, for the time being, amounts to detecting soft hyphens. This fixes a long-standing bug in that -man now has proper hyphen breaking!
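A rough sketch of what post-processing a text line for soft hyphens could look like. The detection rule used here (a hyphen directly between two letters) and the marker value `SK_HYPH` are assumptions for illustration; the commit message does not spell out the rule that roff_parsetext() applies, and escape sequences are ignored in this sketch.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical in-band marker for a permitted line break position. */
#define SK_HYPH 30

/*
 * Replace each '-' that sits directly between two letters with
 * SK_HYPH, so that a formatter can later break the line there.
 * Returns the number of hyphens marked.
 */
int
sk_mark_hyphens(char *text)
{
	size_t	 i;
	int	 marked = 0;

	if (text[0] == '\0')
		return 0;
	for (i = 1; text[i] != '\0'; i++) {
		if (text[i] == '-' &&
		    isalpha((unsigned char)text[i - 1]) &&
		    isalpha((unsigned char)text[i + 1])) {
			text[i] = SK_HYPH;
			marked++;
		}
	}
	return marked;
}
```

Under this rule the hyphen in `well-known` is marked, while a leading option dash, a dash surrounded by blanks, or a numeric range such as `10-20` is left alone.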