path: root/mdoc.c
Commit message [Author, Date, Files, Lines]
* Add an option -Q (quick) to mandocdb(8) [Ingo Schwarze, 2014-01-05, 1 file, -2/+10]
| for accelerated generation of reduced-size databases.
| Implement this by allowing the parsers to optionally abort the parse sequence after the NAME section.
| While here, garbage collect the unused void *arg attribute of struct mparse and mparse_alloc() and fix some errors in mandoc(3).
|
| This reduces the processing time of mandocdb(8) on /usr/share/man by a factor of 2 and the database size by a factor of 4.
| However, it still takes 5 times the time and 6 times the space of makewhatis(8), so more work is clearly needed.
* Simplify: Remove an unused argument from the mandoc_eos() function. [Ingo Schwarze, 2013-12-31, 1 file, -1/+1]
| No functional change.
* When deciding whether two consecutive macros are on the same input line, [Ingo Schwarze, 2013-12-24, 1 file, -0/+1]
| we have to compare the line where the first one *ends* (not where it begins) to the line where the second one starts.
| This fixes the bug that .Bk allowed output line breaks right after block macros spanning more than one input line, even when the next macro follows on the same line.
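The comparison this commit describes can be sketched as follows. This is a minimal illustration, not the code from mdoc.c: the struct `node_pos`, its fields, and the function `same_input_line()` are hypothetical names chosen for the example.

```c
/* Hypothetical position record: where a macro starts and where it ends. */
struct node_pos {
	int first_line;	/* input line where the macro begins */
	int last_line;	/* input line where the macro ends */
};

/*
 * Two consecutive macros share an input line if the second one starts
 * on the line where the first one ends.  Comparing against first_line
 * of the earlier macro would misjudge block macros that span more than
 * one input line.
 */
int
same_input_line(const struct node_pos *prev, const struct node_pos *next)
{
	return prev->last_line == next->first_line;
}
```

For a block macro that begins on line 3 and ends on line 5, a macro starting on line 5 counts as being on the same line, while one starting on line 6 does not.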
* There are three kinds of input lines: text lines, macros taking [Ingo Schwarze, 2013-10-21, 1 file, -1/+18]
| positional arguments (like Dt Fn Xr) and macros taking text as arguments (like Nd Sh Em %T An).
| In the past, even the latter put each word of their arguments into its own MDOC_TEXT node; instead, concatenate arguments unless delimiters, keeps or spacing mode prevent that.
|
| Regarding mandoc(1), this is internal refactoring, no output change intended.
|
| Regarding mandocdb(8), this fixes yet another regression introduced when switching from DB to SQLite: The ability to search for strings crossing word boundaries was lost and is hereby restored.
| At the same time, database sizes and build times are both reduced by a bit more than 5% each.
* Support setting arbitrary roff(7) number registers, [Ingo Schwarze, 2013-10-05, 1 file, -6/+4]
| preserving read support for the ".nr nS" SYNOPSIS state register.
|
| Inspired by NetBSD roff.c rev. 1.18 (Christos Zoulas, March 21, 2013), but implemented differently.
| I don't want to have yet another different implementation of a hash table in mandoc - it would be the second one in roff.c alone and the fifth one in mandoc grand total.
| Instead, I designed and implemented roff_setreg() and roff_getreg() to be similar to roff_setstrn() and roff_getstrn().
|
| Once we feel the need to optimize, we can introduce one common hash table implementation for everything in mandoc.
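A register store that avoids a hash table can be as small as a singly linked list with linear lookup. The sketch below illustrates that general idea only. It is not the roff.c implementation, and `struct reg`, `reg_set()`, `reg_get()` and `reg_free()` are hypothetical names rather than the signatures of roff_setreg() and roff_getreg().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical list node holding one named number register. */
struct reg {
	struct reg	*next;
	char		*name;
	int		 val;
};

/* Set a register, overwriting an existing entry.  Returns 0 on failure. */
int
reg_set(struct reg **head, const char *name, int val)
{
	struct reg	*r;
	size_t		 sz;

	for (r = *head; r != NULL; r = r->next)
		if (strcmp(r->name, name) == 0) {
			r->val = val;
			return 1;
		}

	if ((r = malloc(sizeof(*r))) == NULL)
		return 0;
	sz = strlen(name) + 1;
	if ((r->name = malloc(sz)) == NULL) {
		free(r);
		return 0;
	}
	memcpy(r->name, name, sz);
	r->val = val;
	r->next = *head;	/* prepend: order does not matter for lookup */
	*head = r;
	return 1;
}

/* Linear search; return dflt if the register was never set. */
int
reg_get(const struct reg *head, const char *name, int dflt)
{
	for (; head != NULL; head = head->next)
		if (strcmp(head->name, name) == 0)
			return head->val;
	return dflt;
}

/* Release the whole list. */
void
reg_free(struct reg *head)
{
	struct reg	*next;

	for (; head != NULL; head = next) {
		next = head->next;
		free(head->name);
		free(head);
	}
}
```

Lookup is O(n) in the number of registers, which is the trade-off the commit message accepts until a shared hash table is introduced.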
* Cleanup naming of local variables to make the code easier on the eye: [Ingo Schwarze, 2012-11-17, 1 file, -140/+140]
| Settle for "struct man *man", "struct mdoc *mdoc", "struct meta *meta" and avoid the confusing "*m" which was sometimes this, sometimes that.
| No functional change.
| ok kristaps@ some time ago
* Fix a crash triggered by .Bl -tag .It Xo .El .Sh found by florian@. [Ingo Schwarze, 2012-11-16, 1 file, -0/+3]
| * When allocating a body end marker, copy the pointer to the normalized block information from the body block, avoiding the risk of subsequent null pointer dereference.
| * When inserting the body end marker into the syntax tree, do not try to copy that pointer from the parent block, because not being a direct child of the block it belongs to is the whole point of a body end marker.
| * Even non-callable blocks (like Bd and Bl) can break other blocks; when this happens, postpone closing them out in the usual way.
|
| Completed and tested at the OpenBSD impromptu Coimbra hackathon (c2k12).
| Thanks to Pedro Almeida and the Laboratório de Computação Avançada da Universidade de Coimbra (http://www.uc.pt/lca) for their hospitality!
* Fix handling of paragraph macros inside lists: [Ingo Schwarze, 2012-07-18, 1 file, -1/+9]
| * When they are trailing the last item, move them outside the list.
| * When they are trailing any other non-compact item, drop them.
| OpenBSD rev. mdoc_validate.c 1.107, mdoc.c 1.91
* The mdoc(7) \*(Ba predefined string actually forces roman font; [Ingo Schwarze, 2012-07-18, 1 file, -1/+1]
| that's stupid because it may break enclosing font changes, but let's do the same for groff bug compatibility.
| --> Never use \*(Ba, use just plain "|"! <--
| Also, predefined strings are already expanded by the roff(7) parser, so the mdoc(7) parser has to look for the expanded string.
| OpenBSD rev. mdoc.c 1.90 and predefs.in 1.3
* Several -mdoc parser improvements related to vertical spacing: [Ingo Schwarze, 2012-07-16, 1 file, -1/+2]
| * So far, .Pp and .Lp were removed before paragraph type blocks.
| * Now also remove .br before paragraph type blocks.
| * Treat .Lp as a paragraph like .Pp, so remove .Pp, .Lp, .br before it.
| * Do not treat .sp as a paragraph, don't remove anything before it.
| * After .Sh, .Ss, .Pp, and .Lp, remove .Pp, .Lp, .sp, .br, and blank lines.
| * After .sp and .br, remove .br.
| OpenBSD rev. mdoc.c 1.89 and mdoc_validate.c 1.106
* Add `cc' support. [Kristaps Dzonsons, 2012-06-12, 1 file, -1/+1]
| This was reported by espie@ and in the TODO.
|
| Caveat: `cc' has buggy behaviour when invoked in groff(1) and followed by a line-breaking control character macro, e.g., in a -man doc,
|
|   .cc |
|   .B foo
|   'B foo
|   |cc
|   'B foo
|
| will cause groff(1) to behave properly for `.B' but inline the macro definition for `B' when invoked with the line-breaking macro.
* Support -Ios='OpenBSD 5.1' to override uname(3) as the source of the [Ingo Schwarze, 2012-05-27, 1 file, -1/+2]
| default value for the mdoc(7) .Os macro.
| Needed for man.cgi on the OpenBSD website.
|
| Problem with man.cgi first noticed by deraadt@; beck@ and deraadt@ agree with the way to solve the issue.
|
| "Please check them in and I'll look into them later!" kristaps@
* implement .Ap .Bd .Bo .Bq .D1 .Ic .Lp .Oo .Pf .Po .Ss .Sx .Sy .br .sp [Ingo Schwarze, 2011-09-30, 1 file, -0/+1]
| implement .Bl -bullet
| add more information to the .TH line
| escape dots at the beginnings of lines
| add trailing newline character at the end of the file
| do not misinterpret the ROOT block as .Ap
* An implementation of `tr'. This routes allocations of TEXT nodes [Kristaps Dzonsons, 2011-07-28, 1 file, -1/+1]
| through libroff, which does the appropriate translations of `tr'.
| This is SLOW: it uses the backend of `ds' and `de', which is a simple linear list.
| However, unlike `ds' and `de', it iterates over EACH CHARACTER of the entire file looking for replacements.
* Simplify word allocation in libmdoc and libman. [Kristaps Dzonsons, 2011-07-27, 1 file, -8/+1]
* Disable in-line eqn processing for a bit. [Kristaps Dzonsons, 2011-07-27, 1 file, -1/+5]
* First, roff_res() has no need to invoke ROFF_RERUN: since it's executed [Kristaps Dzonsons, 2011-07-27, 1 file, -5/+0]
| before any other roff processing occurs, it's Ok to just let it do its thing and pass through.
| Also, make sure this function is ALWAYS called, not just when first_string is defined.
|
| Second, add a new function, roff_parsetext(), that post-processes non-macro lines.
| This, for the time being, amounts to detecting soft hyphens.
| This fixes a long-standing bug in that -man now has proper hyphen breaking!
* Implement the first steps of equation parsing from within libmdoc. [Kristaps Dzonsons, 2011-07-25, 1 file, -1/+53]
| This consists of a shim around the text parser that calls out to libroff if equation components exist on the line.
| Right now this will do nothing, as the equation delimiter always returns nil.
* Finish the eqn syntactic parser. This correctly parses terms and does [Kristaps Dzonsons, 2011-07-21, 1 file, -2/+2]
| the proper `define' dance, which amounts to pure word-replace (you can, say, define `foo' as `define' then define `define' as something else).
| eqn.c is now ready for some semantic parsing of `box' and `eqn' productions as defined by the grammar.
* Make `struct roff' be passed into libmdoc and libman upon creation. [Kristaps Dzonsons, 2011-07-18, 1 file, -4/+4]
| This is required for supporting in-line equations.
| While here, push registers properly into roff and add a set/get/mod interface.
* Have libman and libmdoc use mandoc_getcontrol() to determine whether a [Kristaps Dzonsons, 2011-03-28, 1 file, -41/+24]
| macro has been invoked. libroff is next.
* libmdoc.h and libman.h were including mdoc.h and man.h, respectively. [Kristaps Dzonsons, 2011-03-22, 1 file, -0/+1]
| Don't have them do that (includes in header files = faugh), and have individual files directly include these files.
* Move mandoc_isdelim() back into libmdoc.h. This fixes an unreported [Kristaps Dzonsons, 2011-03-22, 1 file, -0/+44]
| error where (1) -man pages were punctuating delimiters (e.g., `.B a ;') and where (2) standalone punctuation in -mdoc or -man (e.g., ";" on its own line) would also be punctuated.
| This introduces a small amount of complexity: mdoc_{html,term}.c must manage their own spacing when running print_word() or print_text().
| The check for delimiting now happens in mdoc_macro.c's dword().
* Consolidate messages. Have all parse-time messages (in libmdoc, [Kristaps Dzonsons, 2011-03-20, 1 file, -22/+6]
| libroff, etc., etc.) route into mandoc_msg() and mandoc_vmsg(), for the time being in libmandoc.h.
| This requires struct mparse to be passed into the allocation routines instead of mandocmsg and a void pointer.
| Then, move some of the functionality of the old mmsg() into read.c's mparse_mmsg() (check against wlevel and setting of file_status) and use main.c's mmsg() as simply a printing tool.
* Clean-up in libmdoc: fix last checks for mdoc_*msg return value, then [Kristaps Dzonsons, 2011-03-17, 1 file, -2/+2]
| make mdoc_vmsg not return an int.
| libmdoc is now completely clean of return-value checks from the message subsystem.
* Plug memory leak of normalised-date field. [Kristaps Dzonsons, 2011-03-15, 1 file, -0/+2]
* Clean up date handling, [Ingo Schwarze, 2011-03-07, 1 file, -3/+4]
| as a first step to get rid of the frequent petty warnings in this area:
| - always store dates as strings, not as seconds since the Epoch
| - for input, try the three most common formats everywhere
| - for unrecognized format, just pass the date through verbatim
| - when there is no date at all, still use the current date
| Originally triggered by a one-line patch from Tim van der Molen, <tbvdm at xs4all dot nl>, which is included here.
| Feedback and OK on manual parts from jmc@.
| "please check this in" kristaps@
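The four rules in this commit message can be sketched as a single normalisation function. The sketch below is an illustration under stated assumptions, not mandoc's date code: the commit does not say which three input formats it means, so the ones tried here ("YYYY-MM-DD", "Month D, YYYY" and "Mon D YYYY") are guesses, the output format is arbitrary, and `normalise_date()` and its helpers are hypothetical names.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

static const char *const months[12] = {
	"January", "February", "March", "April", "May", "June",
	"July", "August", "September", "October", "November", "December"
};

/* Map a full or three-letter English month name to 1..12, or 0. */
static int
month_num(const char *s)
{
	int	 i;

	for (i = 0; i < 12; i++)
		if (strcmp(s, months[i]) == 0 ||
		    (strlen(s) == 3 && strncmp(s, months[i], 3) == 0))
			return i + 1;
	return 0;
}

/* Return a malloc'd copy of s, or NULL on allocation failure. */
static char *
dup_str(const char *s)
{
	size_t	 n = strlen(s) + 1;
	char	*p = malloc(n);

	return p == NULL ? NULL : memcpy(p, s, n);
}

/*
 * Always produce a string, never seconds since the Epoch.
 * Assumed input formats: "2011-03-07", "March 7, 2011", "Mar 7 2011".
 * Anything else is passed through verbatim; a missing date becomes
 * the current date.  The caller frees the result.
 */
char *
normalise_date(const char *in)
{
	char	 name[16], buf[64], extra;
	int	 y, m, d;

	if (in == NULL || *in == '\0') {
		time_t		 now = time(NULL);
		struct tm	*tm = localtime(&now);

		if (tm == NULL)
			return dup_str("");
		y = tm->tm_year + 1900;
		m = tm->tm_mon + 1;
		d = tm->tm_mday;
	} else if (sscanf(in, "%d-%d-%d%c", &y, &m, &d, &extra) == 3) {
		/* ISO-style date; y, m and d are already set. */
	} else if (sscanf(in, "%15[A-Za-z] %d, %d%c",
	    name, &d, &y, &extra) == 3 ||
	    sscanf(in, "%15[A-Za-z] %d %d%c",
	    name, &d, &y, &extra) == 3) {
		m = month_num(name);
	} else
		return dup_str(in);	/* unrecognized: verbatim */

	if (m < 1 || m > 12 || d < 1 || d > 31)
		return dup_str(in);	/* implausible: verbatim */

	snprintf(buf, sizeof(buf), "%s %d, %d", months[m - 1], d, y);
	return dup_str(buf);
}
```

The trailing `%c` in each format makes `sscanf` return 4 instead of 3 when extra characters follow the date, so input with trailing garbage falls through to the verbatim case.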
* Allow EQN data to be pushed down into libmdoc via mdoc_addeqn(). Only [Kristaps Dzonsons, 2011-02-09, 1 file, -0/+24]
| the adding itself is implemented; equation data is not yet shown.
* Put tbl_alloc function right into the addspan() one, as this is the only [Kristaps Dzonsons, 2011-02-08, 1 file, -20/+9]
| place that it's called.
* Use tbl_span line number for warnings/errors. [Kristaps Dzonsons, 2011-02-06, 1 file, -2/+1]
* Let the line-number of a tbl_span be remembered. [Kristaps Dzonsons, 2011-02-06, 1 file, -2/+1]
* Clarified the role of MDOC_HALT in libmdoc functions by having accessor [Kristaps Dzonsons, 2011-01-03, 1 file, -9/+8]
| functions assert() if they're called after MDOC_HALT is set.
| This makes more sense than returning 0 because this return value is used for parse errors, not programme-flow errors, and it's inconsistent to use the same value for both.
| Plus, prior to this, I'd return 0 without printing an error message, which would cause failure to go unreported to the operator.
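The distinction drawn here, between a parse error reported through the return value and a programme-flow error caught by assert(), can be sketched like this. The names `struct parser`, `PARSER_HALT` and `parser_feed_line()` are hypothetical stand-ins, not the libmdoc API, and treating an empty line as the fatal parse error is only a placeholder condition.

```c
#include <assert.h>
#include <stddef.h>

#define	PARSER_HALT	0x01	/* hypothetical: set after a fatal parse error */

/* Hypothetical parser state. */
struct parser {
	int	 flags;
	int	 nlines;	/* lines accepted so far */
};

/*
 * Returns 1 on success and 0 on a parse error.  Calling this again
 * after the parser has halted is a bug in the caller, so it trips
 * an assertion instead of quietly returning 0.
 */
int
parser_feed_line(struct parser *p, const char *line)
{
	assert(!(p->flags & PARSER_HALT));	/* programme-flow error */

	if (line == NULL || line[0] == '\0') {
		/* Placeholder for a real fatal parse error. */
		p->flags |= PARSER_HALT;
		return 0;
	}
	p->nlines++;
	return 1;
}
```

With this split, a return value of 0 always means "the input was bad", and misuse of the interface shows up immediately during development rather than going unreported.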
* Add -man support for tables. Like -mdoc, this consists of an [Kristaps Dzonsons, 2011-01-01, 1 file, -0/+2]
| external-facing function man_addspan() (this required shuffling around the descope routine) and hooks elsewhere.
| Also fixed mdoc.c's post-validation of tables.
* Add table processing structures to -mdoc. This consists of an [Kristaps Dzonsons, 2011-01-01, 1 file, -0/+36]
| external-facing function mdoc_addspan(), then various bits to prohibit printing and scanning (this requires some if's to be converted into switch's).
* Clean up {mdoc,man}_pmsg and vmsg invocations (ignore return values). [Kristaps Dzonsons, 2011-01-01, 1 file, -14/+16]
* Specifying both %T and %J in an `Rs' block causes the title to be quoted [Kristaps Dzonsons, 2010-12-25, 1 file, -0/+2]
| instead of underlined.
| This only happens in -Tascii, as -T[x]html both underlines and italicises.
* As per schwarze@'s suggestions, roll back the refcount structure in [Kristaps Dzonsons, 2010-12-24, 1 file, -6/+44]
| favour of a simpler shim for normalised data in the node allocation and free routines.
| This removes the need to bump and copy references within validator handlers, removes a pointer redirect, and also kills the refcount structure itself.
| Data is assumed to "live" either in a MDOC_BLOCK or MDOC_ELEM and is copied accordingly.
* Implement reference-counted version of original union mdoc_data. This [Kristaps Dzonsons, 2010-12-22, 1 file, -22/+6]
| simplifies clean-up and allows for more types without extra hassle.
| Also made in-line literal types in -T[x]html use CODE instead of SPAN to match how literal blocks use PRE.
* Migrate `An' to use a pointer in its data, like everybody else. This is [Kristaps Dzonsons, 2010-12-16, 1 file, -0/+3]
| the first step to having a simpler ref-counted system for "data" associated with a node.
* Merge schwarze@'s relaxation of scope-breaking rules: allow implicit [Kristaps Dzonsons, 2010-12-06, 1 file, -3/+1]
| ending of scopes and drop stray scope-endings.
* Make sure that the manual section defaults to `1' if it's unset. This [Kristaps Dzonsons, 2010-12-01, 1 file, -0/+2]
| behaviour only happens if `Dt' isn't specified, which can be exhibited by running mandoc -mdoc on a man manual.
* mdoc_action.c is no more. Attic it and remove it from the Makefile. [Kristaps Dzonsons, 2010-11-30, 1 file, -4/+0]
| Remove references to MDOC_ACTED (it was only assertions) and the pre- and post-action functions.
* Merge from OpenBSD right after 1.10.6; now back to full sync. [Ingo Schwarze, 2010-09-27, 1 file, -7/+4]
| * mdoc.c: blank lines outside literal mode are more similar to .sp than .Pp
| * backslashes do not terminate macros; partial revert of mdoc.c 1.164; the intention of that commit is fully achieved in roff.c
| * mdoc_term.c: no need to list the same prototype twice
| * mdoc_validate.c: drop .Pp before .sp just like .Pp before .Pp
| * fix off-by-one found by jsg@ with parfait, OpenBSD term_ps.c 1.12
| ok kristaps@
* Allow `.xx\}' where xx is a macro (e.g., `.br\}') to close scope. This is [Kristaps Dzonsons, 2010-08-29, 1 file, -2/+5]
| experimental and hasn't been rigorously tested.
| It's only implemented in -mdoc for the time being.
| This is absolutely required for pod2man.
| It does, however, make the pod2man preamble be processed in full.
* Implement a simple, consistent user interface for error handling. [Ingo Schwarze, 2010-08-20, 1 file, -29/+6]
| We now have sufficient practical experience to know what we want, so this is intended to be final:
| - provide -Wlevel (warning, error or fatal) to select what you care about
| - provide -Wstop to stop after parsing a file with warnings you care about
| - provide consistent exit status codes for those warnings you care about
| - fully document what warnings, errors and fatal errors mean
| - remove all other cruft from the user interface, less is more:
|   - remove all -f knobs along with the whole -f option
|   - remove the old -Werror because calling warnings "fatal" is silly
| - always finish parsing each file, unless fatal errors prevent that
|
| This commit also includes a couple of related simplifications behind the scenes regarding error handling.
|
| Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and Sascha Wildner (DragonFly BSD) agree with the general direction.
* simplify the code copying the macro name, and sync the [Ingo Schwarze, 2010-08-08, 1 file, -8/+5]
| accompanying comment between man_pmacro() and mdoc_pmacro();
| ok'd by kristaps@ together with main.c rev. 1.102
* Clean out the isgraph() checks in mdoc.c and man.c. These code paths [Kristaps Dzonsons, 2010-08-07, 1 file, -12/+1]
| were never taken since main.c began skipping over unrecognisable characters, so they were noops.
* "Groff allows the initial macro on a line to be delimited by a space or [Kristaps Dzonsons, 2010-08-07, 1 file, -6/+15]
| by a tab; so allow the tab in mandoc, too."
| Original problem noted by schwarze@. Sync with OpenBSD.
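Accepting a tab as well as a space after the macro name amounts to one extra condition in the loop that copies the name. The following is a minimal sketch of that idea, not the mdoc_pmacro() code: `copy_macro_name()` is a hypothetical helper, it assumes the control character is `.` or `'` in the first column, and it ignores any whitespace between the control character and the name.

```c
#include <stddef.h>

/*
 * Copy the macro name that follows the control character into mac
 * (at most macsz - 1 bytes, always NUL-terminated).  The name ends
 * at the first space, tab, or end of line.  Returns the number of
 * bytes copied, or 0 if the line is not a macro line.
 */
size_t
copy_macro_name(const char *line, char *mac, size_t macsz)
{
	size_t	 i = 0;

	if (macsz == 0)
		return 0;
	mac[0] = '\0';
	if (line[0] != '.' && line[0] != '\'')
		return 0;	/* text line, not a macro */

	line++;			/* skip the control character */
	while (line[i] != '\0' && line[i] != ' ' && line[i] != '\t' &&
	    i + 1 < macsz) {
		mac[i] = line[i];
		i++;
	}
	mac[i] = '\0';
	return i;
}
```

With this, both ".Sh NAME" and ".Sh<TAB>NAME" yield the macro name "Sh".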
* Text ending in a full stop, exclamation mark or question mark [Ingo Schwarze, 2010-07-18, 1 file, -1/+1]
| should not flag the end of a sentence if:
| 1) The punctuation is followed by closing delimiters and not preceded by alphanumeric characters, like in "There is no full stop (.) in this sentence"
| or
| 2) The punctuation is a child of a macro and not preceded by alphanumeric characters, like in "There is no full stop .Pq \&. in this sentence"
| "looks fine" to kristaps@; tested by jmc@ and sobrado@
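The two exceptions above can be read as one predicate over the trailing characters of a word. The sketch below is a simplified reading of the commit message, not the mandoc_eos() implementation: `ends_sentence()` is a hypothetical name, the set of closing delimiters is an assumption, and roff escapes such as `\&` are not handled.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Decide whether the word s[0..len-1] ends a sentence.
 * in_macro is non-zero if the word is a child of a macro.
 */
int
ends_sentence(const char *s, size_t len, int in_macro)
{
	size_t	 i = len;
	int	 closers = 0, after_alnum;

	/* Skip trailing closing delimiters (assumed set). */
	while (i > 0 && s[i - 1] != '\0' &&
	    strchr(")]\"'", s[i - 1]) != NULL) {
		i--;
		closers = 1;
	}

	/* What remains must end in sentence punctuation. */
	if (i == 0 || s[i - 1] == '\0' || strchr(".!?", s[i - 1]) == NULL)
		return 0;
	i--;	/* i is now the index of the punctuation */

	after_alnum = i > 0 && isalnum((unsigned char)s[i - 1]);

	/* Exceptions 1) and 2): no alphanumeric before the punctuation. */
	if ((closers || in_macro) && !after_alnum)
		return 0;
	return 1;
}
```

Under this reading, "(.)" on a text line and "." as a macro argument do not end a sentence, while "sentence." and "end.)" do.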