summaryrefslogtreecommitdiffstats
path: root/read.c
Commit message (Collapse)AuthorAgeFilesLines
* provide a STYLE message when mandoc knows the file name and the extensionIngo Schwarze2020-04-241-2/+8
| | | | | disagrees with the section number given in the .Dt or .TH macro; feature suggested and patch tested by jmc@
* When a .Tg is attached to a paragraph, attach the permalinkIngo Schwarze2020-04-181-3/+3
| | | | to the first word, or the first few words if they are short.
* Separate the place to put the <a href> permalink (now markedIngo Schwarze2020-04-071-0/+1
| | | | | | | with NODE_HREF) from the target element of the link (still marked with NODE_ID). In many cases, use this to move the target to the beginning of the paragraph, such that readers don't get dropped into the middle of a sentence.
* Properly reset the validation part of the tagging module between files.Ingo Schwarze2020-03-131-0/+2
| | | | This fixes a crash in makewhatis(8) encountered by naddy@.
* Split tagging into a validation part including prioritizationIngo Schwarze2020-03-131-2/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | in tag.{h,c} and {mdoc,man}_validate.c and into a formatting part including command line argument checking in term_tag.{h,c}, html.c, and {mdoc|man}_{term|html}.c. Immediate functional benefits include: * Improved prioritization of automatic tags for .Em and .Sy. * Avoiding bogus automatic tags when .Em, .Fn, or .Sy are explicitly tagged. * Explicit tagging of .Er and .Fl now works in HTML output. * Automatic tagging of .IP and .TP now works in HTML output. But mainly, this patch provides clean earth to build further improvements on. Technical changes: * Main program: Write a tag file for ASCII and UTF-8 output only. * All formatters: There is no more need to delay writing the tags. * mdoc(7)+man(7) formatters: No more need for elaborate syntax tree inspection. * HTML formatter: If available, use the "string" attribute as the tag. * HTML formatter: New function to write permalinks, to reduce code duplication. Style cleanup in the vicinity while here: * mdoc(7) terminal formatter: To set up bold font for children, defer to termp_bold_pre() rather than calling term_fontpush() manually. * mdoc(7) terminal formatter: Garbage collect some duplicate functions. * mdoc(7) HTML formatter: Unify <code> handling, delete redundant functions. * Where possible, use switch statements rather than if cascades. * Get rid of some more Yoda notation. The necessity for such changes was first discussed with kn@, but i didn't bother him with a request to review the resulting -673/+782 line patch.
* Some time ago, i simplified mandoc_msg() such that it can be usedIngo Schwarze2019-07-101-17/+16
| | | | | | | | | everywhere and not only in the parsers. For more uniform messages, use it at more places instead of err(3), in particular in the main program. While here, integrate a few trivial functions called at exactly one place into the main option parser, and let a few more functions use the normal convention of returning 0 for success and -1 for error.
* Initialize the local variable "lastln" in mparse_buf_r().Ingo Schwarze2019-06-031-1/+1
| | | | | | | | While there is no bug, it logically makes sense given the meaning of the variable that lastln is NULL as long as firstln is NULL. Michal Nowak <mnowak at startmail dot com> reported that gcc 4.4.4 and 7.4.0 on illumos throw -Wuninitialized false positives.
* When the last line of the input is empty and the previous line reducedIngo Schwarze2019-03-191-0/+2
| | | | | | | the line input buffer to a length of one byte, do not write one byte past the end of the line input buffer. Minimal code to show the bug: printf ".ds X\n.X\n\n" | MALLOC_OPTIONS=C mandoc Bug found by bentley@ in the sysutils/rancid par(1) manual page.
* Improve error reporting when a file given on the command lineIngo Schwarze2019-01-111-2/+4
| | | | | | cannot be opened: * Mention the filename. * Report the errno for the file itself, not the one with .gz appended.
* Cleanup, minus 15 LOC, no functional change:Ingo Schwarze2018-12-311-0/+1
| | | | | | | | | Simplify the way the man(7) and mdoc(7) validators are called. Reset the parser state with a common function before calling them. There is no need to again reset the parser state afterwards, the parsers are no longer used after validation. This allows getting rid of man_node_validate() and mdoc_node_validate() as separate functions.
* Cleanup, no functional change:Ingo Schwarze2018-12-301-25/+21
| | | | | | | | | | | | | | The struct roff_man used to be a bad mixture of internal parser state and public parsing results. Move the public results to the parsing result struct roff_meta, which is already public. Move the rest of struct roff_man to the parser-internal header roff_int.h. Since the validators need access to the parser state, call them from the top level parser during mparse_result() rather than from the main programs, also reducing code duplication. This keeps parser internal state out of thee main programs (five in mandoc portable) and out of eight formatters.
* Move the full responsibility for reporting open(2) errors fromIngo Schwarze2018-12-201-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | mparse_open() to the caller. That is better because only the caller knows its preferred reporting method and format and only the caller has access to all the data that should be included - like the column number in .so processing or the current manpath in makewhatis(8). Moving the mandoc_msg() call out is possible because the caller can call strerror(3) just as easily as mparse_open() can. Move mandoc_msg_setinfilename() closer to the parsing of the file contents, to avoid problems *with* the file (like non-existence, lack of permissions, etc.) getting misreported as problems *in* the file. Fix the column number reported for .so failure: let it point to the beginning of the filename. Taken together, this prevents makewhatis(8) from spewing confusing messages about .so failures to stderr, a bug reported by Raf Czlonka <rczlonka at gmail dot com> on ports@. It also prevents mandoc(1) from issuing *two* messages for every single .so failure.
* Cleanup, no functional change:Ingo Schwarze2018-12-141-2/+2
| | | | | | | | | | Now that message handling is properly encapsulated, remove struct mparse pointers from four structs (roff, roff_man, tbl_node, eqn_node) and from the argument lists of five functions (roff_alloc, roff_man_alloc, mandoc_getarg, tbl_alloc, eqn_alloc). Except for being passed to the main program as an opaque object, it now only occurs in read.c, as it should, and not across 15 files like in the past.
* Almost mechanical diff to remove the "struct mparse *" argumentIngo Schwarze2018-12-141-21/+19
| | | | | | | | from mandoc_msg(), where it is no longer used. While here, rename mandoc_vmsg() to mandoc_msg() and retire the old version: There is really no point in having another function merely to save "%s" in a few places. Minus 140 lines of code.
* Fold mparse_parse_buffer() into mparse_readfd(), making the codeIngo Schwarze2018-12-141-40/+41
| | | | | | | | | | considerably more readable. This is possible now that i finally deleted mparse_readmem() from mandoc portable - an unused function that never existed in OpenBSD. This cleanup already made me find a minor bug: after a recursive parse, restoring the line number of the parent file was forgotten. This is fixed now.
* Delete the function mparse_readmem() that has been unused for almost aIngo Schwarze2018-12-141-13/+0
| | | | | | decade but regularly makes maintenance harder. Mandoc is not a general-purpose library, and being as pluggable as possible is not among the goals of the project.
* Major cleanup; may imply minor changes in edge cases of error reporting.Ingo Schwarze2018-12-141-298/+10
| | | | | | | | | | | Finally, drop support for the run-time configurable mandocmsg() callback. It was over-engineered from the start, never used for anything in a decade, and repeatedly caused maintenance headaches. Consolidate reporting infrastructure into two files, mandoc.h and mandoc_msg.c, mopping up the bits and pieces that were scattered around main.c, read.c, mandoc_parse.h, libmandoc.h, the prototypes of four parsing-related functions, and both parser structs.
* Cleanup, no functional change:Ingo Schwarze2018-12-131-0/+1
| | | | | | | | | | Split the top level parser interface out of the utility header mandoc.h, into a new header mandoc_parse.h, for use in the main program and in the main parser only. Move enum mandoc_os into roff.h because struct roff_man is the place where it is stored. This allows removal of mandoc.h from seven files in low-level parsers and in formatters.
* Cleanup, no functional change:Ingo Schwarze2018-12-131-0/+1
| | | | | | Move the roffhash_*() functions from roff.h to roff_int.h because they are only intended for use by parsers, neither by main programs nor by formatters.
* Rudimentary implementation of the roff(7) .char (output glyphIngo Schwarze2018-08-251-0/+2
| | | | | | | | | definition) request, used for example by groff_hdtbl(7). This simplistic implementation may interact incorrectly with the .tr (input character translation) request. But come on, you are not only using .char *and* .tr, but you do so with respect to the same character in the same manual page?
* Rudimentary implementation of the roff(7) .while request.Ingo Schwarze2018-08-241-54/+118
| | | | | | | | | | | Needed for example by groff_hdtbl(7). There are two limitations: It does not support nested .while requests yet, and each .while loop must start and end in the same scope. The roff_parseln() return codes are now more flexible and allow OR'ing options.
* The upcoming .while request will have to re-execute roff(7) linesIngo Schwarze2018-08-231-70/+67
| | | | | | | parsed earlier, so they will have to be saved for reuse - but the read.c preparser does not know yet whether a line contains a .while request before passing it to the roff parser. To cope with that, save all parsed lines for now. Even shortens the code by 20 lines.
* Implement the roff(7) .shift and .return requests,Ingo Schwarze2018-08-231-11/+31
| | | | | | | | | | | | | | for example used by groff_hdtbl(7) and groff_mom(7). Also correctly interpolate arguments during nested macro execution even after .shift and .return, implemented using a stack of argument arrays. Note that only read.c, but not roff.c can detect the end of a macro execution, and the existence of .shift implies that arguments cannot be interpolated up front, so unfortunately, this includes a partial revert of roff.c rev. 1.337, moving argument interpolation back into the function roff_res().
* Issue a STYLE message when normalizing the date format in .Dd/.TH.Ingo Schwarze2018-07-281-1/+2
| | | | | | | | Leah Neukirchen pointed out that mdoclint(1) used to warn about a leading zero before the day number, so we know that both NetBSD and Void Linux want the message. It does no harm on OpenBSD because Mdocdate always does the right thing anyway. jmc@ agrees that it makes sense in contexts not using Mdocdate.
* Style message about bad input encoding of em-dashes as -- instead of \(em.Ingo Schwarze2018-03-161-0/+1
| | | | Suggested by Thomas Klausner <wiz at NetBSD>; discussed with jmc@.
* After opening a file with gzdopen(3), we have to call gzclose(3) orIngo Schwarze2018-02-231-6/+31
| | | | | | we leak memory internally used by zlib to keep compression state. Bug reported by Wolfgang Mueller <vehk at vehk dot de> who also provided an incomplete patch, part of which i'm using in this commit.
* be less assertive when warning about a possible typo;Ingo Schwarze2017-11-101-1/+1
| | | | from jca@, ok jmc@
* Do not call err(3) from the parser. Call mandoc_vmsg() andIngo Schwarze2017-07-201-9/+15
| | | | return failure such that we can continue with the next file.
* Simplify by creating struct roff_node syntax tree nodes for tbl(7)Ingo Schwarze2017-07-081-14/+1
| | | | | | | | | | | | right from roff_parseln() rather than delegating to read.c, similar to what i just did for eqn(7). The interface function roff_span() becomes obsolete and is deleted, the former interface function roff_addtbl() becomes static, the interface functions tbl_read() and tbl_cdata() become void, and minus twelve linus of code. No functional change.
* 1. Eliminate struct eqn, instead use the existing membersIngo Schwarze2017-07-081-3/+0
| | | | | | of struct roff_node which is allocated for each equation anyway. 2. Do not keep a list of equation parsers, one parser is enough. Minus fifty lines of code, no functional change.
* Now that we have the -Wstyle message level, downgrade six warningsIngo Schwarze2017-07-061-8/+8
| | | | | | that are not syntax mistakes and that do not cause wrong formatting or content to style suggestions. Also upgrade two warnings that may cause information loss to errors.
* Printing "BASE:" in messages about violations of base system conventionsIngo Schwarze2017-07-041-1/+1
| | | | | is confusing, simply print "STYLE:", which is intuitive and does not sound excessively alarming; suggested by jmc@, OK tedu@ jmc@.
* report trailing delimiters after macros where they are usually a mistake;Ingo Schwarze2017-07-031-1/+1
| | | | the idea came up in a discussion with Thomas Klausner <wiz at NetBSD>
* warn about time machines; suggested by Thomas Klausner <wiz @ NetBSD>Ingo Schwarze2017-07-031-0/+1
|
* add warning "cross reference to self"; inspired by mdoclintIngo Schwarze2017-07-021-0/+1
|
* Basic reporting of .Xrs to manual pages that don't existIngo Schwarze2017-07-011-0/+1
| | | | | | | | | | | | in the base system, inspired by mdoclint(1). We are able to do this because (1) the -mdoc parser, the -Tlint validator, and the man(1) manual page lookup code are all in the same program and (2) the mandoc.db(5) database format allows fast lookup. Feedback from, previous versions tested by, and OK jmc@. A few features will be added to this in the tree, step by step.
* warn about some non-portable idioms in .Bl -column;Ingo Schwarze2017-06-291-0/+2
| | | | triggered by a question from Yuri Pankov (illumos)
* Catch typos in .Sh names; suggested by jmc@.Ingo Schwarze2017-06-251-0/+1
| | | | | | I'm using a very simple, linear time / zero space fuzzy string matching heuristic rather than a full Levenshtein metric, to keep the code both simple and fast.
* operating system dependent message about unknown architecture;Ingo Schwarze2017-06-241-0/+1
| | | | inspired by mdoclint
* in the base system, suggest leaving .Os blank; inspired by mdoclintIngo Schwarze2017-06-241-0/+1
|
* Split -Wstyle into -Wstyle and the even lower -Wbase, and addIngo Schwarze2017-06-241-12/+16
| | | | | | | | | | | | | | | -Wopenbsd and -Wnetbsd to check conventions for the base system of a specific operating system. Mark operating system specific messages with "(OpenBSD)" at the end. Please use just "-Tlint" to check base system manuals (defaulting to -Wall, which is now -Wbase), but prefer "-Tlint -Wstyle" for the manuals of portable software projects you maintain that are not part of OpenBSD base, to avoid bogus recommendations about base system conventions that do not apply. Issue originally reported by semarie@, solution using an idea from tedu@, discussed with jmc@ and jca@.
* style message about duplicate RCS ids; inspired by mdoclintIngo Schwarze2017-06-171-0/+1
|
* style message about missing RCS ids; inspired by mdoclintIngo Schwarze2017-06-171-0/+1
|
* Style message about legacy man(7) date format in mdoc(7) documentsIngo Schwarze2017-06-111-0/+3
| | | | | and operating system dependent messages about missing or unexpected Mdocdate; inspired by mdoclint(1).
* style message about missing .Fn markup; inspired by mdoclintIngo Schwarze2017-06-111-0/+1
|
* style message about missing blank before trailing delimiter;Ingo Schwarze2017-06-101-0/+1
| | | | inspired by mdoclint(1), and jmc@ considers it useful
* warning about unknown .Lb arguments; inspired by mdoclint(1)Ingo Schwarze2017-06-081-0/+1
|
* style checks related to .Er; inspired by mdoclint(1)Ingo Schwarze2017-06-071-0/+2
|
* Minimal implementation of the roff(7) .ce request (center a numberIngo Schwarze2017-06-061-0/+1
| | | | | of input lines without filling). Contrary to groff, high-level macros abort .ce mode for now.
* Pure preprocessor implementation of the roff(7) .ec and .eo requestsIngo Schwarze2017-06-041-69/+3
| | | | | | | | | | | | | | | | | (escape character control), touching nothing after the preprocessing stage and keeping even the state variable local to the preprocessor. Since the escape character is also used for line continuation, this requires pulling the implementation of line continuation from the input reader to the preprocessor, which also considerably shortens the code required for that. When the escape character is changed, simply let the preprocessor replace bare by escaped backslashes and instances of the non-standard escape character with bare backslashes - that's all we need. Oh, and if anybody dares to use these requests in OpenBSD manuals, sending a medium-sized pack of axe-murderers after them might be a worthwhile part of the punishment, but probably insuffient on its own.