path: root/mandoc.c
Commit message | Author | Age | Files | Lines
* implement the roff(7) \p (break output line) escape sequence | Ingo Schwarze | 2017-06-14 | 1 | -0/+2
* Style message about legacy man(7) date format in mdoc(7) documents | Ingo Schwarze | 2017-06-11 | 1 | -6/+10
  and operating system dependent messages about missing or unexpected
  Mdocdate; inspired by mdoclint(1).
* Partial implementation of \l (horizontal line drawing function). | Ingo Schwarze | 2017-06-02 | 1 | -2/+12
  A full implementation would require access to output device properties
  and state variables (both only available after the main parser has
  finalized the parse tree) before numerical expansions in the roff
  preprocessor (i.e., before the main parser is even started). Not trying
  to pull that stunt right now because the static-width implementation
  committed here is sufficient for tcl-style manual pages and already
  more complicated than i would have suspected.
* Minimal implementation of the \h (horizontal motion) escape sequence. | Ingo Schwarze | 2017-06-01 | 1 | -1/+1
  Good enough to cope with the average DocBook insanity.
* Simplify the logic in mandoc_normdate() and add some comments. | Ingo Schwarze | 2015-11-12 | 1 | -15/+31
  Also add a comment in time2a() explaining why it isn't possible to use
  just one single call to strftime(). Do some style cleanup while here.
  No functional change.
  Triggered by a very different patch from des@FreeBSD.
* Delete two preprocessor constants that are no longer used. | Ingo Schwarze | 2015-10-15 | 1 | -2/+0
  Patch from Michael Reed <m dot reed at mykolab dot com>.
* Reject the escape sequences \[uD800] to \[uDFFF] in the parser. | Ingo Schwarze | 2015-10-13 | 1 | -0/+3
  These surrogates are not valid Unicode codepoints, so treat them just
  like any other undefined character escapes: Warn about them and do not
  produce output.
  Issue noticed while talking to stsp@, semarie@, and bentley@.
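The surrogate rule above can be illustrated with a minimal sketch. It is not the code from mandoc.c: the function name codepoint_is_renderable() is hypothetical, and the upper bound U+10FFFF and the six-digit limit are assumptions added so the example is self-contained.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from mandoc.c: given the hex digits
 * of a \[uXXXX] name (without the leading 'u'), report whether the
 * value is a code point that may be rendered.
 */
bool
codepoint_is_renderable(const char *hex)
{
	size_t len = strlen(hex);
	unsigned long cp;

	/* Require one to six characters, all of them hex digits. */
	if (len == 0 || len > 6 ||
	    strspn(hex, "0123456789ABCDEFabcdef") != len)
		return false;

	cp = strtoul(hex, NULL, 16);
	if (cp > 0x10FFFF)
		return false;		/* beyond the Unicode range */
	if (cp >= 0xD800 && cp <= 0xDFFF)
		return false;		/* surrogate: warn, no output */
	return true;
}
```

A caller would presumably emit the "undefined character escape" warning whenever this returns false and then skip the escape.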
* To make the code more readable, delete 283 /* FALLTHROUGH */ comments | Ingo Schwarze | 2015-10-12 | 1 | -31/+0
  that were right between two adjacent case statements. Keep only
  those 24 where the first case actually executes some code before
  falling through to the next case.
* modernize style: "return" is not a function | Ingo Schwarze | 2015-10-06 | 1 | -25/+26
* Parse and ignore the escape sequences \, and \/ (italic corrections). | Ingo Schwarze | 2015-08-29 | 1 | -0/+4
  Actually using these is very stupid because they are groff extensions
  and other roff(7) implementations typically print unintended characters
  at the places where they are used. Nevertheless, some manuals contain
  them, for example ocserv(8).
  Problem reported by Kurt Jaeger <pi at FreeBSD>.
* For selecting a two-digit font size, support the historic syntax \s12 | Ingo Schwarze | 2015-02-20 | 1 | -0/+8
  in addition to the classic syntax \s(12, the modern syntax \s[12],
  and the alternative syntax \s'12'. The historic syntax only works
  for the font sizes 10-39.
  Real-world usage found by naddy@ in plan9/rc.
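The four \s spellings named above can be told apart with a small parser. The following is only a sketch under stated assumptions, not the mandoc implementation: parse_font_size() is a hypothetical name, signs (\s+ and \s-) are left out, a lone digit is treated as a one-digit size, and the four-digit cap for the delimited forms is arbitrary.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: parse the unsigned size argument following
 * "\s".  On success, store the size in *size and return the number
 * of bytes consumed; return 0 if the argument is malformed.
 */
size_t
parse_font_size(const char *arg, int *size)
{
	const char *p = arg;
	char closer = '\0';	/* closing delimiter, if any */
	int want = 0;		/* exact digit count; 0 = up to closer */
	int val = 0, n = 0;

	if (*p == '(') {		/* classic: \s(12 */
		want = 2;
		p++;
	} else if (*p == '[') {		/* modern: \s[12] */
		closer = ']';
		p++;
	} else if (*p == '\'') {	/* alternative: \s'12' */
		closer = '\'';
		p++;
	} else if (*p >= '1' && *p <= '3' &&
	    isdigit((unsigned char)p[1])) {
		want = 2;		/* historic: \s10 to \s39 */
	} else {
		want = 1;		/* assumed: single digit */
	}

	/* Delimited forms take at most four digits in this sketch. */
	while (isdigit((unsigned char)*p) && n < (want ? want : 4)) {
		val = val * 10 + (*p++ - '0');
		n++;
	}
	if (n == 0 || (want != 0 && n != want))
		return 0;
	if (closer != '\0') {
		if (*p != closer)
			return 0;
		p++;
	}
	*size = val;
	return (size_t)(p - arg);
}
```

With this reading, "\s40" selects size 4 and leaves the "0" as ordinary text, which is why the historic form cannot express sizes outside 10-39.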
* Rudimentary implementation of the roff(7) \o escape sequence (overstrike). | Ingo Schwarze | 2015-01-21 | 1 | -4/+6
  This is of some relevance because the pod2man(1) preamble abuses it
  for the Icelandic letter Thorn, instead of simply using \(TP and \(Tp.
  Missing feature found by sthen@ in DateTime::Locale::is_IS(3p).
* Fix a read buffer overrun triggered by trailing \s- or trailing \s+ | Ingo Schwarze | 2015-01-01 | 1 | -3/+3
  without the required subsequent argument; found by jsg@ with afl.
* Catch localtime() failure for additional safety; | Ingo Schwarze | 2014-12-15 | 1 | -0/+2
  patch from Jan Stary <hans at stare dot cz> some time ago.
* Tighten Unicode escape name parsing. | Ingo Schwarze | 2014-10-28 | 1 | -4/+9
  Accept only 0xXXXX, 0xYXXXX, 0x10XXXX with Y != 0.
  This simplifies mchars_num2uc().
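The three accepted shapes (four hex digits, five with a nonzero first digit, six starting with "10") can be checked as sketched below. This is an illustration only: is_unicode_name() is a hypothetical name, and accepting lowercase hex digits is an assumption that the commit message does not confirm.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: decide whether an escape name such as
 * "u0041" has the shape of a Unicode character name.
 */
bool
is_unicode_name(const char *name)
{
	size_t ndig;

	if (name[0] != 'u')
		return false;
	ndig = strlen(name) - 1;

	/* Everything after the 'u' must be a hex digit. */
	if (strspn(name + 1, "0123456789ABCDEFabcdef") != ndig)
		return false;

	switch (ndig) {
	case 4:			/* uXXXX */
		return true;
	case 5:			/* uYXXXX with Y != 0 */
		return name[1] != '0';
	case 6:			/* u10XXXX */
		return name[1] == '1' && name[2] == '0';
	default:
		return false;
	}
}
```

Names such as "ua", "uA" and "ul" fail the length test, so they fall through to ordinary special-character lookup.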
* Stricter syntax checking of Unicode character names: | Ingo Schwarze | 2014-10-13 | 1 | -12/+11
  Require exactly 4, 5 or 6 hex digits and allow nothing else.
  This avoids mishandling stuff like \[ua] and \C'uA' as Unicode
  and also fixes underlining in eqn(7) -Thtml output which uses \[ul].
  Problem found and semantics suggested by kristaps@.
* Fix a corner case where \H<nil> (where <nil> is the \0 character) would | Kristaps Dzonsons | 2014-08-18 | 1 | -1/+2
  cause mandoc_escape() to read past the end of an allocated string.
  Found when a script scanning all Mac OS X manuals accidentally also
  scanned binary (gzip'd) files, discussed with schwarze@ on tech@.
* Improve build system and autodetection. | Ingo Schwarze | 2014-08-16 | 1 | -1/+1
  * Make ./configure standalone, that's what people expect.
  * Let people write a ./configure.local from scratch, not edit existing files.
  * Autodetect wchar, sqlite3, and manpath and act accordingly.
  * Autodetect the need for -L/usr/local/lib and -lutil.
  * Get rid of config.h.p{re,ost}, let ./configure only write what's needed.
  * Let ./configure write a Makefile.local snippet, that's quite flexible.
* Get rid of HAVE_CONFIG_H, it is always defined; idea from libnbcompat. | Ingo Schwarze | 2014-08-10 | 1 | -2/+0
  Include <sys/types.h> where needed, it does not belong in config.h.
  Remove <stdio.h> from config.h; if it is missing somewhere, it should
  be added, but i cannot find a *.c file where it is missing.
* Clean up messages related to plain text and to escape sequences. | Ingo Schwarze | 2014-07-06 | 1 | -2/+2
  * Mention invalid escape sequences and string names, and fallbacks.
  * Hierarchical naming.
* Fix handling of escape sequences taking numeric arguments. | Ingo Schwarze | 2014-07-06 | 1 | -1/+3
  * Repair detection of invalid delimiters.
  * Discard the invalid delimiter together with the invalid sequence.
  Note to self: In general, strchr("\0...", c) is a thoroughly bad idea.
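One plausible reading of that note, offered as an assumption since the commit does not spell out the failure: strchr() treats its first argument as a NUL-terminated string, so a set literal beginning with "\0" looks empty to it, and searching any string for '\0' always succeeds by matching the terminator. A sketch of a length-based alternative, with the hypothetical name in_byte_set():

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: test whether byte c occurs in a set that may
 * itself contain NUL.  The explicit length makes memchr() look at
 * every byte of the set instead of stopping at the first '\0'.
 */
bool
in_byte_set(int c, const char *set, size_t setlen)
{
	return memchr(set, c, setlen) != NULL;
}
```

For example, strchr("\0abc", 'a') returns NULL even though 'a' is in the literal, whereas in_byte_set('a', "\0abc", 4) is true.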
* Clean up the warnings related to document structure. | Ingo Schwarze | 2014-07-01 | 1 | -1/+1
  * Hierarchical naming of the related enum mandocerr items.
  * Mention the offending macro, section title, or string.
  While here, improve some wordings:
  * Descriptive instead of imperative style.
  * Uniform style for "missing" and "skipping".
  * Where applicable, mention the fallback used.
* Start systematic improvements of error reporting. | Ingo Schwarze | 2014-06-20 | 1 | -2/+2
  So far, this covers all WARNINGs related to the prologue.
  1) hierarchical naming of MANDOCERR_* constants
  2) mention the macro name in messages where that adds clarity
  3) add one missing MANDOCERR_DATE_MISSING msg
  4) fix the wording of one message related to the man(7) prologue
  Started on the plane back from Ottawa.
* KNF: case (FOO): -> case FOO:, remove /* LINTED */ and /* ARGSUSED */, | Ingo Schwarze | 2014-04-20 | 1 | -61/+61
  remove trailing whitespace and blanks before tabs, improve some indenting;
  no functional change
* Fully implement the \B (validate numerical expression) and | Ingo Schwarze | 2014-04-08 | 1 | -5/+2
  partially implement the \w (measure text width) escape sequence
  in a way that makes them usable in numerical expressions and in
  conditional requests, similar to how \n (interpolate number register)
  and \* (expand user-defined string) are implemented.
  This lets mandoc(1) handle the baroque low-level roff code
  found at the beginning of the ggrep(1) manual.
  Thanks to pascal@ for the report.
* Accept arbitrary argument delimiters for various roff(7) escape sequences. | Ingo Schwarze | 2014-04-07 | 1 | -4/+4
  Needed for example by the new Perl pod2man(1) preamble.
* The files mandoc.c and mandoc.h contained both specialised low-level | Ingo Schwarze | 2014-03-23 | 1 | -68/+1
  functions used for multiple languages (mdoc, man, roff), for example
  mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary
  functions. Split the auxiliaries out into their own file and header.
  While here, do some #include cleanup.
* Simplify: Remove an unused argument from the mandoc_eos() function. | Ingo Schwarze | 2013-12-31 | 1 | -4/+4
  No functional change.
* Remove duplicate const specifiers from the declaration of mandoc_escape(). | Ingo Schwarze | 2013-12-30 | 1 | -1/+1
  Found by Thomas Klausner <wiz at NetBSD dot org> using clang.
  No functional change.
* I have no idea how it happened that \B, \H, \h, \L, and \l got | Ingo Schwarze | 2013-12-26 | 1 | -7/+5
  mapped to ESCAPE_NUMBERED (which is for \N and only for \N), that
  made no sense at all. Properly remap them to ESCAPE_IGNORE.
  While here, move \B and \w from the group taking number arguments
  to the group taking string arguments; right now, that doesn't imply
  any functional change, but if we ever go ahead and implement a parser
  for roff(7) numerical expressions, it will suddenly start to matter,
  and cause confusion.
* Parse and ignore the roff(7) escape sequences \d (move half line down) | Ingo Schwarze | 2013-12-25 | 1 | -0/+8
  and \u (move half line up). Found by bentley@ in some DocBook crap.
* s/[Nn]ull/NUL/ in comments where appropriate; | Ingo Schwarze | 2013-12-25 | 1 | -3/+3
  suggested by Thomas Klausner <wiz @ NetBSD dot org>.
* Support the alternative syntax \C'uXXXX' for Unicode characters. | Ingo Schwarze | 2013-11-10 | 1 | -1/+4
  It is already documented in the Heirloom troff manual,
  and groff handles it as well.
  Bug reported by Bjarni Ingi Gislason <bjarniig at rhi dot hi dot is>
  on <bug-groff at gnu dot org>. Well, admittedly, that bug was reported
  against groff, but mandoc was even more broken than groff with respect
  to this syntax...
* Cleanup suggested by gcc-4.8.1, following hints by Christos Zoulas: | Ingo Schwarze | 2013-10-05 | 1 | -1/+1
  - avoid bad qualifier casting in roff.c, roff_parsetext()
    by changing the mandoc_escape arguments to "const char const **"
  - avoid bad qualifier casting in mandocdb.c, index_merge()
  - do not complain about unused variables in test-*.c
  - garbage collect a few unused variables elsewhere
* Implement the roff(7) font-escape sequence \f(BI "bold+italic". | Ingo Schwarze | 2013-08-08 | 1 | -8/+14
  This improves the formatting of about 40 base manuals
  and reduces groff-mandoc formatting differences in base by about 5%.
* Improve handling of the roff(7) "\t" escape sequence: | Ingo Schwarze | 2013-06-20 | 1 | -5/+23
  * Parsing macro arguments has to be done in copy mode,
    which implies replacing "\t" by a literal tab character.
  * Otherwise, render "\t" as the empty string, not as a 't' character.
  This fixes formatting of the distfile example in the oldrdist(1) manual.
  This also shows up in the unzip(1) manual as one of several issues
  preventing the removal of USE_GROFF from the archivers/unzip port.
  Thanks to espie@ for attracting my attention to the unzip(1) manual.
* Add `cc' support. | Kristaps Dzonsons | 2012-06-12 | 1 | -26/+0
  This was reported by espie@ and in the TODO. Caveat: `cc' has buggy
  behaviour when invoked in groff(1) and followed by a line-breaking
  control character macro, e.g., in a -man doc,

    .cc |
    .B foo
    'B foo
    |cc
    'B foo

  will cause groff(1) to behave properly for `.B' but inline the macro
  definition for `B' when invoked with the line-breaking macro.
* While i already got my fingers dirty on mandoc_escape(), | Ingo Schwarze | 2012-05-31 | 1 | -67/+64
  profit from the occasion to pull out some spaghetti, that is, three
  confusing variables and fourteen pointless assignments among them;
  instead, always operate on the official pointers **start, **end,
  and *sz, each of which conveys an obvious meaning.
  No functional change intended, and the new tests confirm that
  everything still (err...) "works", as far as that word can be applied
  to the kind of roff(7) mock-up code i'm polishing here.
  "just commit" kristaps@
* Make recursive parsing of roff(7) escapes actually work in the general case, | Ingo Schwarze | 2012-05-31 | 1 | -117/+36
  in particular when the inner escapes are preceded or followed by other
  terms. While doing so, remove lots of bogus code that was trying to
  make pointless distinctions between numeric and non-numeric escape
  sequences, while both actually share the same syntax and we ignore
  the semantics anyway.
  This prevents some of the strings defined in the pod2man(1) preamble
  from producing garbage output, in particular in Scandinavian words.
  Of course, proper rendering of Scandinavian national characters
  cannot be expected even with these fixes.
  "just commit" kristaps@
* Implement the roff \z escape sequence, intended to output the next | Ingo Schwarze | 2012-05-31 | 1 | -1/+11
  character without advancing the cursor position; implement it to simply
  skip the next character, as it will usually be overwritten.
  With this change, the pod2man(1) preamble user-defined string \*:,
  intended to render as a diaeresis or umlaut diacritic above the
  preceding character, is rendered in a slightly less ugly way, though
  still not correctly. It was rendered as "z.." and is now rendered
  as ".". Given that the definition of \*: uses elaborate manual \h
  positioning, there is little chance for mandoc(1) to ever render it
  correctly, but at least we can refrain from printing out a spurious
  "z", and we can make the \z do something semi-reasonable for easier
  cases.
  "just commit" kristaps@
* ISO style "%Y-%m-%d" dates are common in man(7) .TH. | Ingo Schwarze | 2011-12-03 | 1 | -2/+3
  They have been considered valid in the past, but were reformatted
  to the mdoc(7) "Month day, year" style. To make page footers more
  similar to groff, no longer reformat them, just print them as they are.
  This doesn't change anything with respect to what's considered valid
  or what is warned about.
  ok kristaps@
* Accommodate \f(Cx formatting. Noted by Andreas Vogele, thanks! | Kristaps Dzonsons | 2011-11-06 | 1 | -1/+8
* Handle \N numbered character escapes the same way as groff: | Ingo Schwarze | 2011-10-24 | 1 | -6/+22
  If \N is followed by a digit, ignore \N and the digit.
  If \N is followed by a non-digit, the next non-digit ends the
  character number; the two delimiters need not match.
  Kristaps calls that "gross, but not our fault".
  For now, i'm fixing \N only. Other escapes taking numeric arguments
  may or may not need similar handling, but \N is by far the most
  important for practical purposes.
  ok kristaps@
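The two \N rules above can be sketched as follows. This is an illustration under assumptions, not the mandoc code: the names parse_numbered() and enum numbered_result are hypothetical, an empty number such as \N'' is reported as 0, reaching the end of the string before a closing delimiter is tolerated, and the overflow guard is specific to this sketch.

```c
#include <ctype.h>
#include <stddef.h>

enum numbered_result {
	NUMBERED_IGNORE,	/* \N followed by a digit: skip both */
	NUMBERED_CHAR,		/* delimited character number found */
	NUMBERED_ERROR		/* nothing usable follows \N */
};

/*
 * Hypothetical helper: arg points just past "\N".  Store the number
 * of bytes consumed in *consumed and, for NUMBERED_CHAR, the parsed
 * character number in *num.
 */
enum numbered_result
parse_numbered(const char *arg, int *num, size_t *consumed)
{
	const char *p = arg;
	int val = 0;

	if (*p == '\0')
		return NUMBERED_ERROR;
	if (isdigit((unsigned char)*p)) {
		*consumed = 1;		/* ignore \N and this digit */
		return NUMBERED_IGNORE;
	}
	p++;				/* opening delimiter */
	while (isdigit((unsigned char)*p)) {
		if (val > 100000)
			return NUMBERED_ERROR;	/* sketch-only guard */
		val = val * 10 + (*p++ - '0');
	}
	if (*p != '\0')
		p++;			/* closer: any non-digit */
	*num = val;
	*consumed = (size_t)(p - arg);
	return NUMBERED_CHAR;
}
```

Under these assumptions, both \N'65' and \N'65x yield character number 65 and consume four bytes after the \N, since the closing delimiter need not match the opening one.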
* forgotten Copyright bumps; no code change | Ingo Schwarze | 2011-09-18 | 1 | -1/+1
  found while syncing to OpenBSD
* Move mandoc_hyph() into roff_parsetext() as a single conditional. While | Kristaps Dzonsons | 2011-07-27 | 1 | -38/+0
  here, do some function renames for clarity and make all function
  prototypes be in one place.
* Update mandoc_hyph() to the extent that numbers on either side of the | Kristaps Dzonsons | 2011-07-27 | 1 | -9/+20
  hyphen make for a non-breakable hyphen. Found by random testing.
* Scary-looking but otherwise harmless changes allow me to build for Windows. | Kristaps Dzonsons | 2011-07-24 | 1 | -5/+8
  That is to say, with mingw32. This amounts to the following:
  (1) break compat.c into compat_strlcpy.c and compat_strlcat.c
  (2) add compat_getsubopt.c (from OpenBSD) and test-getsubopt.c
  (3) add test-strptime.c for HAVE_STRPTIME
  (4) add ifdef bits here and there, where necessary
  (5) remove some harmless unportable stuff (u_char, localtime_r)
  I've added the appropriate mdocml.zip target to the Makefile, too.
* Complete eqn.7 parsing. Features all productions from the original 1975 | Kristaps Dzonsons | 2011-07-21 | 1 | -0/+10
  CACM paper in an LR(1) parse (1 -> eqn_rewind()). Right now the code is
  a little jungly, but will clear up as I consolidate parse components.
  The AST structure will also be cleaned up, as right now it's pretty ad
  hoc (this won't change the parse itself). I added the mandoc_strndup()
  function while here.
* Support `size' constructs in eqn.7. Generalise mandoc_strontou to this | Kristaps Dzonsons | 2011-07-21 | 1 | -6/+5
  effect.
* Remove all references to ESCAPE_PREDEF, which is now not exposed past | Kristaps Dzonsons | 2011-05-24 | 1 | -4/+0
  the libroff point. This clears up a nice chunk of code.