path: root/mandoc.c
Commit message | Author | Age | Files | Lines
...
* s/[Nn]ull/NUL/ in comments where appropriate; | Ingo Schwarze | 2013-12-25 | 1 | -3/+3
    suggested by Thomas Klausner <wiz @ NetBSD dot org>.
* Support the alternative syntax \C'uXXXX' for Unicode characters. | Ingo Schwarze | 2013-11-10 | 1 | -1/+4
    It is already documented in the Heirloom troff manual, and groff handles it as well.
    Bug reported by Bjarni Ingi Gislason <bjarniig at rhi dot hi dot is> on <bug-groff at gnu dot org>.
    Well, admittedly, that bug was reported against groff, but mandoc was even more broken than groff with respect to this syntax...
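The two spellings of the Unicode escape can be illustrated with a small decoder. This is only a sketch under stated assumptions: `unicode_escape()` is a hypothetical helper, not a function in mandoc, and it is more lenient than groff about the number and case of hex digits.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper (not mandoc's code): decode a Unicode escape
 * written either as \[uXXXX] or as \C'uXXXX'.
 * Returns the code point, or -1 if the input has neither form.
 */
static long
unicode_escape(const char *s)
{
	char	 closer;
	long	 cp;
	int	 n, c;

	assert(s != NULL);
	if (strncmp(s, "\\[u", 3) == 0) {
		closer = ']';
		s += 3;
	} else if (strncmp(s, "\\C'u", 4) == 0) {
		closer = '\'';
		s += 4;
	} else
		return -1;

	/* Accept at most six hex digits, so cp cannot overflow. */
	cp = 0;
	for (n = 0; n < 6 && isxdigit((unsigned char)s[n]); n++) {
		c = (unsigned char)s[n];
		cp = cp * 16 +
		    (isdigit(c) ? c - '0' : tolower(c) - 'a' + 10);
	}

	/* Need at least one digit and the matching closing delimiter. */
	if (n == 0 || s[n] != closer)
		return -1;
	return cp;
}
```

Both `unicode_escape("\\[u2014]")` and `unicode_escape("\\C'u2014'")` yield the same code point, which is the point of supporting the alternative syntax.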
* Cleanup suggested by gcc-4.8.1, following hints by Christos Zoulas: | Ingo Schwarze | 2013-10-05 | 1 | -1/+1
    - avoid bad qualifier casting in roff.c, roff_parsetext() by changing the mandoc_escape arguments to "const char const **"
    - avoid bad qualifier casting in mandocdb.c, index_merge()
    - do not complain about unused variables in test-*.c
    - garbage collect a few unused variables elsewhere
* Implement the roff(7) font-escape sequence \f(BI "bold+italic". | Ingo Schwarze | 2013-08-08 | 1 | -8/+14
    This improves the formatting of about 40 base manuals and reduces groff-mandoc formatting differences in base by about 5%.
* Improve handling of the roff(7) "\t" escape sequence: | Ingo Schwarze | 2013-06-20 | 1 | -5/+23
    * Parsing macro arguments has to be done in copy mode, which implies replacing "\t" by a literal tab character.
    * Otherwise, render "\t" as the empty string, not as a 't' character.
    This fixes formatting of the distfile example in the oldrdist(1) manual.
    This also shows up in the unzip(1) manual as one of several issues preventing the removal of USE_GROFF from the archivers/unzip port.
    Thanks to espie@ for attracting my attention to the unzip(1) manual.
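The copy-mode half of this change can be sketched as an in-place rewrite of a macro argument. This is a rough illustration, not mandoc's implementation: `expand_tab_escapes()` is a hypothetical name, and the sketch deliberately leaves every other escape untouched.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch: replace each "\t" escape in buf by a literal
 * tab character, in place.  Other escapes are copied as a pair, so
 * the 't' in "\\t" (escaped backslash, then 't') is left alone.
 */
static void
expand_tab_escapes(char *buf)
{
	char	*src, *dst;

	assert(buf != NULL);
	for (src = dst = buf; *src != '\0'; ) {
		if (src[0] == '\\' && src[1] == 't') {
			*dst++ = '\t';
			src += 2;
		} else if (src[0] == '\\' && src[1] != '\0') {
			*dst++ = *src++;
			*dst++ = *src++;
		} else
			*dst++ = *src++;
	}
	*dst = '\0';
}
```

Because the output is never longer than the input, the rewrite is safe to do in the caller's buffer.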
* Add `cc' support. | Kristaps Dzonsons | 2012-06-12 | 1 | -26/+0
    This was reported by espie@ and in the TODO.
    Caveat: `cc' has buggy behaviour when invoked in groff(1) and followed by a line-breaking control character macro, e.g., in a -man doc,
        .cc |
        .B foo
        'B foo
        |cc
        'B foo
    will cause groff(1) to behave properly for `.B' but inline the macro definition for `B' when invoked with the line-breaking macro.
* While i already got my fingers dirty on mandoc_escape(), | Ingo Schwarze | 2012-05-31 | 1 | -67/+64
    profit of the occasion to pull out some spaghetti, that is, three confusing variables and fourteen pointless assignments among them; instead, always operate on the official pointers **start, **end, and *sz, each of which conveys an obvious meaning.
    No functional change intended, and the new tests confirm that everything still (err...) "works", as far as that word can be applied to the kind of roff(7) mock-up code i'm polishing here.
    "just commit" kristaps@
* Make recursive parsing of roff(7) escapes actually work in the general case, | Ingo Schwarze | 2012-05-31 | 1 | -117/+36
    in particular when the inner escapes are preceded or followed by other terms.
    While doing so, remove lots of bogus code that was trying to make pointless distinctions between numeric and non-numeric escape sequences, while both actually share the same syntax and we ignore the semantics anyway.
    This prevents some of the strings defined in the pod2man(1) preamble from producing garbage output, in particular in Scandinavian words. Of course, proper rendering of Scandinavian national characters cannot be expected even with these fixes.
    "just commit" kristaps@
* Implement the roff \z escape sequence, intended to output the next | Ingo Schwarze | 2012-05-31 | 1 | -1/+11
    character without advancing the cursor position; implement it to simply skip the next character, as it will usually be overwritten.
    With this change, the pod2man(1) preamble user-defined string \*:, intended to render as a diaeresis or umlaut diacritic above the preceding character, is rendered in a slightly less ugly way, though still not correctly. It was rendered as "z.." and is now rendered as ".".
    Given that the definition of \*: uses elaborate manual \h positioning, there is little chance for mandoc(1) to ever render it correctly, but at least we can refrain from printing out a spurious "z", and we can make the \z do something semi-reasonable for easier cases.
    "just commit" kristaps@
* ISO style "%Y-%m-%d" dates are common in man(7) .TH. | Ingo Schwarze | 2011-12-03 | 1 | -2/+3
    They have been considered valid in the past, but were reformatted to the mdoc(7) "Month day, year" style. To make page footers more similar to groff, no longer reformat them, just print them as they are.
    This doesn't change anything with respect to what's considered valid or what is warned about.
    ok kristaps@
* Accommodate for \f(Cx formatting. Noted by Andreas Vogele, thanks! | Kristaps Dzonsons | 2011-11-06 | 1 | -1/+8
* Handle \N numbered character escapes the same way as groff: | Ingo Schwarze | 2011-10-24 | 1 | -6/+22
    If \N is followed by a digit, ignore \N and the digit. If \N is followed by a non-digit, the next non-digit ends the character number; the two delimiters need not match. Kristaps calls that "gross, but not our fault".
    For now, i'm fixing \N only. Other escapes taking numeric arguments may or may not need similar handling, but \N is by far the most important for practical purposes.
    ok kristaps@
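The \N rule described in this commit can be written down as a short scanner. This is a sketch of the rule as stated in the message, not the code that was committed: `numbered_char()` is a hypothetical name, and the cap on the accumulated value is an assumption added to keep the arithmetic safe.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical sketch of the \N rule.  s points just past "\N".
 * Returns how many bytes of s belong to the escape and stores the
 * character number in *num, or -1 if there is none.
 */
static size_t
numbered_char(const char *s, int *num)
{
	size_t	 i;
	int	 n;

	assert(s != NULL && num != NULL);
	*num = -1;
	if (*s == '\0')
		return 0;

	/* "\N" followed by a digit: drop the escape and that digit. */
	if (isdigit((unsigned char)*s))
		return 1;

	/* s[0] is the opening delimiter; digits follow it. */
	n = 0;
	for (i = 1; isdigit((unsigned char)s[i]); i++)
		if (n <= 100000)	/* assumed cap, avoids overflow */
			n = n * 10 + (s[i] - '0');
	if (i > 1)
		*num = n;

	/* The next non-digit closes the number, whatever it is. */
	if (s[i] != '\0')
		i++;
	return i;
}
```

Note that `'65'` and `'65x` consume the same number of bytes and give the same number, which is the "delimiters need not match" behaviour.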
* forgotten Copyright bumps; no code change | Ingo Schwarze | 2011-09-18 | 1 | -1/+1
    found while syncing to OpenBSD
* Move mandoc_hyph() into roff_parsetext() as a single conditional. While | Kristaps Dzonsons | 2011-07-27 | 1 | -38/+0
    here, do some function renames for clarity and make all function prototypes be in one place.
* Update mandoc_hyph() to the extent that numbers on either side of the | Kristaps Dzonsons | 2011-07-27 | 1 | -9/+20
    hyphen make for a non-breakable hyphen. Found by random testing.
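The hyphen rule can be sketched as a predicate on the two neighbouring characters. This is an assumption-laden illustration: `hyphen_breakable()` is a hypothetical name, and treating only letter-hyphen-letter as breakable is one reading of the commit message, not a copy of mandoc_hyph().

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical sketch: may a line be broken at the hyphen s[pos]?
 * Assumed rule: only when letters stand on both sides, so a digit
 * on either side (as in "1-2") makes the hyphen non-breakable.
 */
static int
hyphen_breakable(const char *s, size_t pos)
{
	assert(s != NULL && s[pos] == '-');

	/* A leading or trailing hyphen has no two neighbours. */
	if (pos == 0 || s[pos + 1] == '\0')
		return 0;

	return isalpha((unsigned char)s[pos - 1]) &&
	    isalpha((unsigned char)s[pos + 1]);
}
```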
* Scary-looking but otherwise harmless changes allow me to build for Windows. | Kristaps Dzonsons | 2011-07-24 | 1 | -5/+8
    That is to say, with mingw32. This amounts to the following:
    (1) break compat.c into compat_strlcpy.c and compat_strlcat.c
    (2) add compat_getsubopt.c (from OpenBSD) and test-getsubopt.c
    (3) add test-strptime.c for HAVE_STRPTIME
    (4) add ifdef bits here and there, where necessary
    (5) remove some harmless unportable stuff (u_char, localtime_r)
    I've added the appropriate mdocml.zip target to the Makefile, too.
* Complete eqn.7 parsing. Features all productions from the original 1975 | Kristaps Dzonsons | 2011-07-21 | 1 | -0/+10
    CACM paper in an LR(1) parse (1 -> eqn_rewind()). Right now the code is a little jungly, but will clear up as I consolidate parse components. The AST structure will also be cleaned up, as right now it's pretty ad hoc (this won't change the parse itself). I added the mandoc_strndup() function while here.
* Support `size' constructs in eqn.7. Generalise mandoc_strontou to this | Kristaps Dzonsons | 2011-07-21 | 1 | -6/+5
    effect.
* Remove all references to ESCAPE_PREDEF, which is now not exposed past | Kristaps Dzonsons | 2011-05-24 | 1 | -4/+0
    the libroff point. This clears up a nice chunk of code.
* Support groff's escape for Unicode input. See | Kristaps Dzonsons | 2011-05-15 | 1 | -0/+8
    http://mdocml.bsd.lv/archives/tech/0368.html
    For the time being, we just throw it away.
* Make character engine (-Tascii, -Tpdf, -Tps) ready for Unicode: make buffer | Kristaps Dzonsons | 2011-05-14 | 1 | -1/+1
    consist of type "int". This will take more work (especially in encode and friends), but this is a strong start. This commit also consists of some harmless lint fixes.
* Move roff.c's strtol into libmandoc.h for use by other parts of the code | Kristaps Dzonsons | 2011-05-14 | 1 | -0/+34
    (which will come).
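A shared strtol wrapper of the kind this commit moves around might look like the following. This is a guess at the general shape, offered as a sketch: `strn_to_int()` is a hypothetical name, and the 31-byte limit and the -1 error value are assumptions, not taken from the mandoc sources.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical sketch: convert the first sz bytes of p, which need
 * not be NUL-terminated, to an int in the given base.
 * Returns -1 if the bytes are not entirely one valid number.
 */
static int
strn_to_int(const char *p, size_t sz, int base)
{
	char	 buf[32];
	char	*ep;
	long	 v;

	assert(p != NULL);
	if (sz == 0 || sz >= sizeof(buf))
		return -1;

	/* strtol() needs a terminated string, so copy the slice. */
	memcpy(buf, p, sz);
	buf[sz] = '\0';

	errno = 0;
	v = strtol(buf, &ep, base);
	if (ep == buf || *ep != '\0')
		return -1;
	if (errno == ERANGE || v > INT_MAX || v < INT_MIN)
		return -1;
	return (int)v;
}
```

Taking a length instead of relying on a terminator lets escape parsers convert a number in the middle of a line without modifying the input.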
* No code change: fixing spelling errors. From a patch by uqs@. Thanks! | Kristaps Dzonsons | 2011-04-30 | 1 | -1/+1
* Clean up parsing of delimiters in -mdoc. First, remove the "dowarn" | Kristaps Dzonsons | 2011-04-19 | 1 | -13/+3
    variable from mandoc_getarg() so that it prints the warning every time. Then, remove the warning from args_checkpunct(). This way, warnings are being posted at the correct time. This makes the flag argument to mdoc_zargs() superfluous, so make it be zero when it's invoked. Finally, move the args() flags into mdoc_argv.c and make them enums.
* Get mdoc_argv.c ready to use [some of] mandoc_getarg() by giving said | Kristaps Dzonsons | 2011-04-17 | 1 | -5/+6
    function a parameter to suppress warnings.
* Lint catching some potential issues. | Kristaps Dzonsons | 2011-04-09 | 1 | -3/+3
* Remove a2roffdeco() and mandoc_special() functions and replace them with | Kristaps Dzonsons | 2011-04-09 | 1 | -140/+304
    a public (mandoc.h) function mandoc_escape(), which merges the functionality of both prior functions.
    Reason: code duplication. The a2roffdeco() and mandoc_special() functions were pretty much the same thing and both quite complex. This allows one function to receive improvements in (e.g.) subexpression handling and performance, instead of having to replicate functionality.
    As such, the mandoc_escape() function already handles a superset of the escapes handled in previous versions and has improvements in performance (using strcspn(), for example) and reliable handling of subexpressions.
    This code Works For Me, but may need work to catch any regressions. Since the benefits are great (leaner code, simpler API), I'd rather have it in-tree than floating as a patch.
* Have libman and libmdoc use mandoc_getcontrol() to determine whether a | Kristaps Dzonsons | 2011-03-28 | 1 | -0/+25
    macro has been invoked. libroff is next.
* Move mandoc_isdelim() back into libmdoc.h. This fixes an unreported | Kristaps Dzonsons | 2011-03-22 | 1 | -50/+0
    error where (1) -man pages were punctuating delimiters (e.g., `.B a ;') and where (2) standalone punctuation in -mdoc or -man (e.g., ";" on its own line) would also be punctuated. This introduces a small amount of complexity: mdoc_{html,term}.c must manage their own spacing when running print_word() or print_text(). The check for delimiting now happens in mdoc_macro.c's dword().
* Consolidate messages. Have all parse-time messages (in libmdoc, | Kristaps Dzonsons | 2011-03-20 | 1 | -8/+8
    libroff, etc., etc.) route into mandoc_msg() and mandoc_vmsg(), for the time being in libmandoc.h. This requires struct mparse to be passed into the allocation routines instead of mandocmsg and a void pointer. Then, move some of the functionality of the old mmsg() into read.c's mparse_mmsg() (check against wlevel and setting of file_status) and use main.c's mmsg() as simply a printing tool.
* Tiny optimisation in mandoc_isdelim() check. | Kristaps Dzonsons | 2011-03-17 | 1 | -2/+2
* Move mdoc_isdelim() into mandoc.h as mandoc_isdelim(). This allows the | Kristaps Dzonsons | 2011-03-17 | 1 | -5/+51
    removal of manual delimiter checks in html.c and term.c. Finally, add the escaped period as a closing delimiter, removing a TODO to this effect.
* Make lint shut up a little bit. | Kristaps Dzonsons | 2011-03-15 | 1 | -1/+1
* my $buf = "string"; return $string; is cool in Perl, but not in C; | Ingo Schwarze | 2011-03-15 | 1 | -16/+22
    found by Ulrich Spoerlein <uqs at freebsd> using the clang static analyzer; "ok, but please document the numbers" kristaps@
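The class of bug this subject line alludes to, handing back a pointer to a function-local buffer, is usually avoided in C by letting the caller own the storage. The following is a generic sketch of that pattern, not the actual fix: `format_version()` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/*
 * The broken pattern, shown only as a comment:
 *
 *	char buf[32];
 *	snprintf(buf, sizeof(buf), "%d.%d", major, minor);
 *	return buf;	(buf stops existing when the function returns)
 *
 * Hypothetical safe variant: the caller supplies the buffer and its
 * size.  Returns 1 on success, 0 if the result did not fit.
 */
static int
format_version(char *buf, size_t sz, int major, int minor)
{
	int	 n;

	assert(buf != NULL && sz > 0);
	n = snprintf(buf, sz, "%d.%d", major, minor);
	return n >= 0 && (size_t)n < sz;
}
```

A caller would declare `char v[16];` and pass `v, sizeof(v)`, which also documents where the buffer size comes from.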
* Clean up date handling, | Ingo Schwarze | 2011-03-07 | 1 | -27/+46
    as a first step to get rid of the frequent petty warnings in this area:
    - always store dates as strings, not as seconds since the Epoch
    - for input, try the three most common formats everywhere
    - for unrecognized format, just pass the date through verbatim
    - when there is no date at all, still use the current date
    Originally triggered by a one-line patch from Tim van der Molen, <tbvdm at xs4all dot nl>, which is included here.
    Feedback and OK on manual parts from jmc@.
    "please check this in" kristaps@
* Unify roff macro argument parsing (in roff.c, roff_userdef()) and man macro | Ingo Schwarze | 2011-01-03 | 1 | -2/+80
    argument parsing (in man_argv.c, man_args()), both having different bugs, to use one common macro argument parser (in mandoc.c, mandoc_getarg()), because from the point of view of roff, man macros are just roff macros, hence their arguments are parsed in exactly the same way.
    While doing so, fix these bugs:
    * Escaped blanks (i.e. those preceded by an odd number of backslashes) were mishandled as argument separators in unquoted arguments to user-defined roff macros.
    * Unescaped blanks preceded by an even number of backslashes were not recognized as argument separators in unquoted arguments to man macros.
    * Escaped backslashes (i.e. pairs of backslashes) were not reduced to single backslashes both in unquoted and quoted arguments both to user-defined roff macros and to man macros.
    * Escaped quotes (i.e. pairs of quotes inside quoted arguments) were not reduced to single quotes in man macros.
    OK kristaps@
    Note that mdoc macro argument parsing is yet another beast for no good reason and is probably afflicted by similar bugs. But i don't attempt to fix that right now because it is intricately entangled with lots of unrelated high-level mdoc(7) functionality, like delimiter handling and column list phrase handling. Disentangling that would waste too much time now.
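The four rules listed in this commit fit into one small splitter. What follows is a simplified sketch of such a parser, not the mandoc_getarg() that was committed: `next_arg()` is a hypothetical name, only blanks (not tabs) separate arguments here, and no warnings are issued for unterminated quotes.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical sketch: split off the next macro argument.
 * *cpp points at the unparsed input and is advanced past the
 * argument and the blanks after it.  The buffer is edited in place.
 */
static char *
next_arg(char **cpp)
{
	char	*start, *src, *dst;
	int	 quoted;

	assert(cpp != NULL && *cpp != NULL);
	src = *cpp;
	quoted = (*src == '"');
	if (quoted)
		src++;
	start = dst = src;

	while (*src != '\0') {
		if (src[0] == '\\' && src[1] == '\\') {
			/* A pair of backslashes becomes one. */
			*dst++ = '\\';
			src += 2;
		} else if (src[0] == '\\' && src[1] != '\0') {
			/* Other escapes, e.g. "\ ", are kept whole. */
			*dst++ = *src++;
			*dst++ = *src++;
		} else if (quoted && src[0] == '"' && src[1] == '"') {
			/* A pair of quotes becomes one. */
			*dst++ = '"';
			src += 2;
		} else if (quoted ? *src == '"' : *src == ' ') {
			src++;		/* end of this argument */
			break;
		} else
			*dst++ = *src++;
	}
	*dst = '\0';			/* dst never passes src */
	src += strspn(src, " ");	/* skip blanks before the next one */
	*cpp = src;
	return start;
}
```

Checking for the backslash pair first is what makes the odd/even distinction work: in `a\\ b` the pair is consumed as a unit, so the following blank is an ordinary separator, while in `a\ b` the blank is consumed together with its backslash.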
* Churny commit to quiet lint. No functional changes. | Kristaps Dzonsons | 2010-09-04 | 1 | -4/+4
* Remove overstrike `\o'. This isn't the best solution because we really | Kristaps Dzonsons | 2010-08-29 | 1 | -2/+2
    should be printing the contents, but for the time being, this is good enough.
* Handle nested, recursive mathematical subexpressions. This is | Kristaps Dzonsons | 2010-08-24 | 1 | -1/+24
    definitely not general, but it's good enough for pod2man definitions (after I clean up the roff, which will be addressed in later fixes).
* Strip out `\k' escape. | Kristaps Dzonsons | 2010-08-24 | 1 | -1/+1
* Stripping out of `\w' groff escape. Yet another for pod2man... | Kristaps Dzonsons | 2010-08-24 | 1 | -2/+6
* Strip out the `\z' escape. This is the first recursive sequence, | Kristaps Dzonsons | 2010-08-24 | 1 | -0/+7
    getting mandoc ready to handle pod2man's complex escapes.
* Implement a simple, consistent user interface for error handling. | Ingo Schwarze | 2010-08-20 | 1 | -4/+4
    We now have sufficient practical experience to know what we want, so this is intended to be final:
    - provide -Wlevel (warning, error or fatal) to select what you care about
    - provide -Wstop to stop after parsing a file with warnings you care about
    - provide consistent exit status codes for those warnings you care about
    - fully document what warnings, errors and fatal errors mean
    - remove all other cruft from the user interface, less is more:
      - remove all -f knobs along with the whole -f option
      - remove the old -Werror because calling warnings "fatal" is silly
    - always finish parsing each file, unless fatal errors prevent that
    This commit also includes a couple of related simplifications behind the scenes regarding error handling.
    Feedback and OK kristaps@; Joerg Sonnenberger (NetBSD) and Sascha Wildner (DragonFly BSD) agree with the general direction.
* Add \v and \h to ignored escapes. These are in the category of \s. | Kristaps Dzonsons | 2010-08-16 | 1 | -9/+11
    Also made sign-less \s-style escapes be ok (this is technically against what's in the groff.7 manual, but seems pretty widespread). Noted by Thomas Jeunet as uglifying the gcc.1 manual.
* Ensure that isalnum is called with unsigned char argument. | Joerg Sonnenberger | 2010-07-25 | 1 | -1/+1
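The reason for this one-line fix is that the <ctype.h> functions are only defined for values representable as unsigned char (or EOF), while plain char may be signed, so a byte such as 0xE9 can arrive as a negative int. A minimal sketch of the safe idiom, using a hypothetical `count_alnum()` helper:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: count the alphanumeric bytes in s.
 * The cast to unsigned char keeps bytes >= 0x80 from reaching
 * isalnum() as negative values, which would be undefined behaviour.
 */
static size_t
count_alnum(const char *s)
{
	size_t	 n;

	assert(s != NULL);
	for (n = 0; *s != '\0'; s++)
		if (isalnum((unsigned char)*s))
			n++;
	return n;
}
```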
* Accept "\s0" (i.e., properly ignore it). Found in the wild (e.g., gfdl.7). | Kristaps Dzonsons | 2010-07-22 | 1 | -0/+3
* Accommodate for groff's crappy behaviour wherein an unrecognised | Kristaps Dzonsons | 2010-07-21 | 1 | -1/+1
    single-character escape (and ONLY this type of escape) will map back into itself:
    "If a backslash is followed by a character that does not constitute a defined escape sequence the backslash is silently ignored and the character maps to itself." (From groff.7.)
    Found by Jason McIntyre.
* Throw out a2roffdeco() in out.c for a readable version. The prior one | Kristaps Dzonsons | 2010-07-18 | 1 | -2/+61
    was completely unmaintainable. The new one is both readable and quite similar to mandoc_special(), which in future versions will easily allow throwing-away of unsupported escapes (such as \m). It's also a fair bit smaller.
    DECO_SIZE has been removed: this crap, like colours, will not be supported.
    mandoc_special() also has #if 0'd switch branches for ALL groff.7 escapes and some lint fixes.
* Text ending in a full stop, exclamation mark or question mark | Ingo Schwarze | 2010-07-18 | 1 | -9/+12
    should not flag the end of a sentence if:
    1) The punctuation is followed by closing delimiters and not preceded by alphanumeric characters, like in "There is no full stop (.) in this sentence"
    or
    2) The punctuation is a child of a macro and not preceded by alphanumeric characters, like in "There is no full stop .Pq \&. in this sentence"
    "looks fine" to kristaps@; tested by jmc@ and sobrado@
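The two exceptions can be expressed as a check on the tail of a text word. This is a sketch of the rule as worded in the commit message, not the committed code: `ends_sentence()` is a hypothetical name, the set of closing delimiters is an assumption, and the caller is assumed to know whether the text is a macro child.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical sketch: does the text s end a sentence?
 * macro_child is non-zero if s is an argument of a macro.
 */
static int
ends_sentence(const char *s, int macro_child)
{
	size_t	 i;
	int	 had_delim;

	assert(s != NULL);
	i = strlen(s);

	/* Skip trailing closing delimiters (assumed set). */
	had_delim = 0;
	while (i > 0 && strchr("\"')]", s[i - 1]) != NULL) {
		i--;
		had_delim = 1;
	}

	/* What remains must end in sentence punctuation. */
	if (i == 0 || strchr(".!?", s[i - 1]) == NULL)
		return 0;
	i--;		/* i now indexes the punctuation */

	/* Plain text with nothing after the punctuation. */
	if (!had_delim && !macro_child)
		return 1;

	/* Exceptions 1) and 2): require a preceding alphanumeric. */
	return i > 0 && isalnum((unsigned char)s[i - 1]);
}
```

With this reading, "(.)" in running text and a bare "." passed to a macro do not end a sentence, while "stop.)" still does.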
* Clean up mandoc_special() (in order later to catch \m). It also flags | Kristaps Dzonsons | 2010-07-18 | 1 | -150/+62
    several syntactic errors that weren't caught before. Also un-puke chars.c on zero-length \[].