Commit message | Author | Age | Files | Lines
* At least in theory, this patch lets us compile on Windows (which does not have mmap(), from what I can tell).
  Author: Kristaps Dzonsons | Age: 2011-05-26 | Files: 3 | Lines: -3/+23
* More updates to www. A version's on its way...
  Author: Kristaps Dzonsons | Age: 2011-05-26 | Files: 1 | Lines: -14/+17
* Have preconv install with mandoc.
  Author: Kristaps Dzonsons | Age: 2011-05-26 | Files: 1 | Lines: -2/+2
* Slightly clean up verbiage in coding tags.
  Author: Kristaps Dzonsons | Age: 2011-05-26 | Files: 1 | Lines: -2/+2
* preconv is now on encoding-recognition parity with groff. This last commit adds parsing of "File Variables" in the first two lines in order to grok the encoding. This completes groff's recognition sequence (-e, BOM, File variables, -D, default). I've also cleaned up the manual to indicate this and for some general readability. preconv is now compiled by default in the Makefile.
  Author: Kristaps Dzonsons | Age: 2011-05-26 | Files: 3 | Lines: -29/+150
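The BOM and "File Variables" steps of that recognition sequence can be sketched as below. This is a minimal illustration, not the code from preconv: the names `has_utf8_bom` and `filevar_coding` are hypothetical, and the parser assumes the Emacs-style `-*- ... coding: NAME ... -*-` form on a single line.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if buf starts with the three-byte UTF-8 byte-order mark. */
static int
has_utf8_bom(const unsigned char *buf, size_t sz)
{
	return sz >= 3 && buf[0] == 0xef && buf[1] == 0xbb && buf[2] == 0xbf;
}

/*
 * Hypothetical helper: look for "-*- ... coding: NAME ... -*-" in one
 * line and copy NAME into enc.  Returns 1 if a name was found and fits
 * into enc, 0 otherwise.
 */
static int
filevar_coding(const char *line, char *enc, size_t encsz)
{
	const char *start, *end, *p;
	size_t len;

	assert(line != NULL && enc != NULL);

	/* The variables sit between two "-*-" markers. */
	if ((start = strstr(line, "-*-")) == NULL)
		return 0;
	start += 3;
	if ((end = strstr(start, "-*-")) == NULL)
		return 0;

	/* Only accept a "coding:" key that lies before the closing marker. */
	if ((p = strstr(start, "coding:")) == NULL || p >= end)
		return 0;
	p += 7;
	while (p < end && (*p == ' ' || *p == '\t'))
		p++;

	/* The value ends at whitespace, a semicolon, or the closing marker. */
	len = strcspn(p, " \t;");
	if (len > (size_t)(end - p))
		len = (size_t)(end - p);
	if (len == 0 || len >= encsz)
		return 0;

	memcpy(enc, p, len);
	enc[len] = '\0';
	return 1;
}
```

A caller following the order given in the commit message would presumably honour -e first, then try the BOM check, then run `filevar_coding()` over the first two lines, and only then fall back to -D or the default.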
* The \*q predef certainly doesn't map to \"! Fix this.
  Author: Kristaps Dzonsons | Age: 2011-05-26 | Files: 1 | Lines: -1/+1
* Add notes about preconv.1 in the www and change some wording in the manual regarding its output and `Nd' sentence.
  Author: Kristaps Dzonsons | Age: 2011-05-26 | Files: 2 | Lines: -1/+22
* Significantly improve preconv. Allow it to recode UTF-8 characters into the \[uNNNN] strings (taking into account big-endian archs). Also allow it to determine from the BOM whether it's a UTF-8 file. Also add the initial manual. This has been tested over a random selection of UTF-8 documents, as
      % preconv -e utf-8 foo.1 | ./mandoc -Tlocale
  where -Tlocale is allowed (-DUSE_WCHAR). Note that we're still missing the "type" indicator that preconv accepts.
  Author: Kristaps Dzonsons | Age: 2011-05-26 | Files: 3 | Lines: -12/+260
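The recoding step described above can be sketched roughly as follows. This is not preconv's implementation: `utf8_decode` and `recode_utf8` are hypothetical names, the decoder works byte by byte (so host endianness does not come into it), and it does not reject overlong encodings or surrogates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Decode one UTF-8 sequence from s (n bytes available) into *cp.
 * Returns the number of bytes consumed, or 0 if the sequence is
 * malformed or truncated.  Hypothetical helper, simplified.
 */
static size_t
utf8_decode(const unsigned char *s, size_t n, unsigned int *cp)
{
	size_t len, i;

	if (n == 0)
		return 0;
	if (s[0] < 0x80) {
		*cp = s[0];
		return 1;
	} else if ((s[0] & 0xe0) == 0xc0) {
		*cp = s[0] & 0x1f;
		len = 2;
	} else if ((s[0] & 0xf0) == 0xe0) {
		*cp = s[0] & 0x0f;
		len = 3;
	} else if ((s[0] & 0xf8) == 0xf0) {
		*cp = s[0] & 0x07;
		len = 4;
	} else
		return 0;

	if (len > n)
		return 0;
	for (i = 1; i < len; i++) {
		if ((s[i] & 0xc0) != 0x80)
			return 0;
		*cp = (*cp << 6) | (s[i] & 0x3f);
	}
	return len;
}

/*
 * Copy ASCII bytes through unchanged and rewrite every other codepoint
 * as \[uXXXX].  Returns 0 on success, -1 on malformed input or if out
 * is too small.
 */
static int
recode_utf8(const char *in, char *out, size_t outsz)
{
	const unsigned char *p = (const unsigned char *)in;
	size_t left, used = 0, len;
	unsigned int cp;
	int w;

	assert(in != NULL && out != NULL);
	if (outsz == 0)
		return -1;
	out[0] = '\0';

	for (left = strlen(in); left > 0; p += len, left -= len) {
		if ((len = utf8_decode(p, left, &cp)) == 0)
			return -1;
		if (cp < 0x80)
			w = snprintf(out + used, outsz - used, "%c", (int)cp);
		else
			w = snprintf(out + used, outsz - used, "\\[u%.4X]", cp);
		if (w < 0 || (size_t)w >= outsz - used)
			return -1;
		used += (size_t)w;
	}
	return 0;
}
```

For instance, the UTF-8 bytes for "café" come out as `caf\[u00E9]`, which a formatter that understands the \[uNNNN] escape can then render.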
* If a predefined string is missing, emit a warning and make it an empty string instead of passing it along to libmdoc/libman (where it'll be printed verbatim, now). This is what groff seems to do, too (of course without a warning).
  Author: Kristaps Dzonsons | Age: 2011-05-26 | Files: 1 | Lines: -5/+6
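The behaviour described here (warn, then substitute an empty string) might look like the sketch below. The table entries, the `predef_get` name, and the warning text are illustrative assumptions, not the contents of roff.c or predefs.in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct predef {
	const char *name;	/* string name as it follows \* */
	const char *str;	/* replacement text */
};

/* Illustrative entries only; the real table is not reproduced here. */
static const struct predef predefs[] = {
	{ "Am", "&" },
	{ "Ba", "|" },
};

/*
 * Look up a predefined string.  On a miss, warn on stderr, set
 * *missing, and return "" so that nothing is passed on verbatim.
 */
static const char *
predef_get(const char *name, int *missing)
{
	size_t i;

	assert(name != NULL && missing != NULL);

	*missing = 0;
	for (i = 0; i < sizeof(predefs) / sizeof(predefs[0]); i++)
		if (strcmp(predefs[i].name, name) == 0)
			return predefs[i].str;

	*missing = 1;
	fprintf(stderr, "warning: undefined string: %s\n", name);
	return "";
}
```

Returning an empty string rather than NULL keeps callers simple: they can splice the result into the output line without a separate error path.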
* Noticed that our skeleton mdoc.7 had lower-case `Dt'. Fixed and added some language for clarity.
  Author: Kristaps Dzonsons | Age: 2011-05-26 | Files: 2 | Lines: -8/+13
* It's annoying that we don't have preconv, so throw together a quick version and let it grow in-tree. Right now, this only supports the Latin-1 and US-ASCII encodings. I'll do UTF-8 next. It's call-compatible with GNU's preconv although I don't do fancy stuff like BOM or header checks. This will come. I used read.c's file-grokking code.
  Author: Kristaps Dzonsons | Age: 2011-05-26 | Files: 2 | Lines: -3/+332
* Document that spec2cp never returns 0.
  Author: Kristaps Dzonsons | Age: 2011-05-24 | Files: 1 | Lines: -3/+1
* Use the correct Unicode value for the zero-width space, which means that spec2cp never needs to fall through to spec2str. Then clean out html.c of its unnecessary print_res() function.
  Author: Kristaps Dzonsons | Age: 2011-05-24 | Files: 2 | Lines: -28/+11
* Remove all references to ESCAPE_PREDEF, which is no longer exposed past the libroff point. This clears up a nice chunk of code.
  Author: Kristaps Dzonsons | Age: 2011-05-24 | Files: 6 | Lines: -109/+0
* Remove predefined strings from the chars.in file, as they're now local to predefs.in. This also turns "BOTH" entries directly into CHAR entries. The res2str and spec2str functions are now effectively the same function.
  Author: Kristaps Dzonsons | Age: 2011-05-24 | Files: 2 | Lines: -70/+26
* Most important move in getting predefined strings entirely contained within roff.c. These are now grokked from a table in the roff allocation routine and rest in the newly-created predefs.in (for consistency with chars.in). This is a first implementation and will likely be optimised along with the ds/de lookup table itself. This allows mandoc-defined predefined strings to be correctly removed or whatnot; earlier they couldn't. What will follow is the stripping-away of all predefined-string crud in the other parts of the system.
  Author: Kristaps Dzonsons | Age: 2011-05-24 | Files: 3 | Lines: -2/+88
* Have conditional closure for both text and macro lines call through to ccond(). Fix the text handler to behave like the macro handler regarding escaped \}. Make \} actually become a zero-width space, too, and clean up the documentation in this regard.
  Author: Kristaps Dzonsons | Age: 2011-05-24 | Files: 2 | Lines: -32/+23
* Fix a TODO to the effect that `.if n \{\ foo .br \}' was failing due to the `\}' not being directly after the `.br'. Now we check for `\}' in arbitrary parts of the line, and account for if it's escaped in funny ways. This behaviour diverges somewhat from groff in that the text at and following the `\}' is lost, while groff keeps it (sort-of). I'll add a COMPATIBILITY note to this effect.
  Author: Kristaps Dzonsons | Age: 2011-05-24 | Files: 2 | Lines: -15/+23
* nested .RS/.RE is becoming more important
  Author: Ingo Schwarze | Age: 2011-05-21 | Files: 1 | Lines: -0/+2
* remove a sentence which isn't true; from jmc@
  Author: Ingo Schwarze | Age: 2011-05-21 | Files: 1 | Lines: -2/+0
* Turn on -Tutf8 in the frontend. Here we go!
  Author: Kristaps Dzonsons | Age: 2011-05-20 | Files: 2 | Lines: -3/+22
* Flip on -Tutf8 backend support. This forces the UTF-8 LC_CTYPE and does little else. Also remove the check for __STDC_ISO_10646__. It turns out that very few systems---even those that support it---actually declare this and it's just causing problems instead of being useful.
  Author: Kristaps Dzonsons | Age: 2011-05-20 | Files: 3 | Lines: -4/+18
* Allow non-ASCII terminal encodings to accept Unicode values for the special characters, if possible. This is broken into a separate switch statement for clarity.
  Author: Kristaps Dzonsons | Age: 2011-05-20 | Files: 1 | Lines: -17/+61
* Some release notes (this isn't signalling an impending release; I just want to get some notes in).
  Author: Kristaps Dzonsons | Age: 2011-05-19 | Files: 1 | Lines: -0/+12
* It seems that __STDC_ISO_10646__ isn't defined even where it can be defined, so remove the check for it and leave it up to people compiling the software (DOWNSTREAM) to take care of this. This will eventually need to be fixed up with a proper non-10646 converter and so on, but this is a simple start. While here, strengthen the language in the Makefile to this effect.
  Author: Kristaps Dzonsons | Age: 2011-05-19 | Files: 2 | Lines: -3/+12
* Make any un-recognised font be considered a call for the Roman font. This makes sequences of \f[unknown] \fP not completely puke. From a TODO by schwarze@.
  Author: Kristaps Dzonsons | Age: 2011-05-18 | Files: 3 | Lines: -5/+6
* Add TODO entry for standalone `.' in tbl pages (pointed out by Yuri Pankov). Also fix typo in Makefile, same reporter. Thanks!
  Author: Kristaps Dzonsons | Age: 2011-05-18 | Files: 2 | Lines: -1/+5
* Locale support. I'm checking this in to clean up fall-out in-tree, but it looks pretty good. Basically, the -Tlocale option propagates into term_ascii.c, where we set locale-specific console call-backs IFF (1) setlocale() works; (2) locale support is compiled in (see Makefile for -DUSE_WCHAR); (3) the internal structure of wchar_t maps directly to Unicode codepoints as defined by __STDC_ISO_10646__; and (4) the console supports multi-byte characters.
  To date, this configuration only supports GNU/Linux. OpenBSD doesn't export __STDC_ISO_10646__ although I'm told by stsp@openbsd.org that it should (it has the correct map). Apparently FreeBSD is the same way. NetBSD? Don't know. Apple also supports this, but doesn't define the macro. Special-casing!
  Benchmark: -Tlocale incurs less than 0.2 factor overhead when run through several thousand manuals when UTF8 output is enabled. Native mode (whether directly -Tascii or through no locale or whatever) is UNCHANGED: the function callbacks are the same as before.
  Note. If the underlying system does NOT support STDC_ISO_10646, there is a "slow" version possible with iconv or other means of flipping from a Unicode codepoint to a wchar_t.
  Author: Kristaps Dzonsons | Age: 2011-05-17 | Files: 4 | Lines: -20/+138
* Add mode for -Tlocale. This mode, with this commit, behaves exactly like -Tascii. While adding this, inline term_alloc() (was a one-liner), remove some switches around the terminal encoding for the symbol table (unnecessary), and split out ascii_alloc() into ascii_init(), which is also called from locale_init().
  Author: Kristaps Dzonsons | Age: 2011-05-17 | Files: 8 | Lines: -37/+37
* In tbl layouts, we puked if a space didn't follow a vertical bar (found by Yuri Pankov). This was due to looking for modifiers for the vertical bar. This has been fixed, along with other special-key layout types.
  Author: Kristaps Dzonsons | Age: 2011-05-17 | Files: 1 | Lines: -0/+17
* Documentation: note COMPATIBILITY of -Tascii `?' printing in mandoc.1 and remove some long-fixed notes in the same section. Also, add an `Lb' for the mandoc library to mandoc.3 (noted by Sascha Wildner).
  Author: Kristaps Dzonsons | Age: 2011-05-17 | Files: 2 | Lines: -3/+10
* Flip on printing `?' at Unicode codepoints in -Tascii, -Tpdf, and -Tps. The reasoning behind printing SOMETHING at a Unicode codepoint is that the input is not "wrong" (we suppress printing of "wrong" things). It's just that ASCII can't handle it.
  Author: Kristaps Dzonsons | Age: 2011-05-17 | Files: 1 | Lines: -1/+9
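As a rough illustration of that fallback, assuming the output buffer holds codepoints as "int" values, an ASCII-only encoder could substitute `?' like this. The function name `ascii_encode` is hypothetical and the sketch ignores everything else a real terminal backend has to do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Write n codepoints to out as ASCII, printing '?' for anything that
 * ASCII cannot represent.  out must have room for n + 1 bytes.
 */
static void
ascii_encode(const int *cps, size_t n, char *out)
{
	size_t i;

	assert(cps != NULL && out != NULL);

	for (i = 0; i < n; i++)
		out[i] = (cps[i] > 0 && cps[i] < 128) ? (char)cps[i] : '?';
	out[n] = '\0';
}
```

The point of the placeholder is that the reader sees that a character was there, rather than silently losing valid input.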
* Flip on Unicode output (via \[uNNNN]) in -T[x]html. Here we go!
  Author: Kristaps Dzonsons | Age: 2011-05-17 | Files: 4 | Lines: -4/+37
* Clean-up fallout: differentiate ID's and HREF's (where to put the `#'). Make buffmt functions internally bufinit(), too.
  Author: Kristaps Dzonsons | Age: 2011-05-17 | Files: 2 | Lines: -3/+3
* Cleanups in -T[x]html: make html_idcat() use the buffer and be called bufcat_id(), then collapse it into a little function without so much crap. Next, make bufinit() only be called when we really need to do so, and not simply before pre/post calls.
  Author: Kristaps Dzonsons | Age: 2011-05-17 | Files: 4 | Lines: -63/+41
* Clean-ups in -T[x]html: inline print_num(), as it was just a single conditional; same for print_xmltype() and print_doctype(), same reason; make bufncat() be static, as it was only being called from html.c; have bufcat() simply call through to strlcat(). Finally, assert() whenever we truncate. Also rename buffmt() -> bufcat_fmt() to differentiate from buffmt_man et al., which do not concatenate.
  Author: Kristaps Dzonsons | Age: 2011-05-17 | Files: 4 | Lines: -67/+27
* Clean up -T[x]html by using a table instead of a switch statement for the roff units. Also remove a comment about CSS and number types (they all accept decimal numbers).
  Author: Kristaps Dzonsons | Age: 2011-05-17 | Files: 1 | Lines: -41/+16
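The switch-to-table change follows a common pattern, sketched below. The enum, the unit strings, and the `css_length` helper are assumptions made for illustration and do not reproduce the actual table in html.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical subset of scaling units. */
enum scale {
	SCALE_CM,
	SCALE_IN,
	SCALE_PT,
	SCALE_EM,
	SCALE_MAX
};

/* One CSS suffix per unit, indexed by the enum instead of a switch. */
static const char *const css_units[SCALE_MAX] = {
	[SCALE_CM] = "cm",
	[SCALE_IN] = "in",
	[SCALE_PT] = "pt",
	[SCALE_EM] = "em",
};

/*
 * Format a length as a decimal number plus its CSS unit.
 * Returns 0 on success, -1 if buf is too small.
 */
static int
css_length(char *buf, size_t sz, double v, enum scale unit)
{
	int w;

	assert(buf != NULL && (int)unit < SCALE_MAX);

	w = snprintf(buf, sz, "%.2f%s", v, css_units[unit]);
	return (w < 0 || (size_t)w >= sz) ? -1 : 0;
}
```

With a table, adding a unit means adding one enum member and one initializer, and the formatting code itself does not change.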
* Remove function calls to res() and so forth in term_word(). These were only used once and simply bloated the binary. Also fix mchars_num2char to correctly render the character instead of using atoi(). This makes the conversion more strict, but it's more correct.
  Author: Kristaps Dzonsons | Age: 2011-05-15 | Files: 2 | Lines: -55/+17
* Fix missing support for \N'n' when calculating string widths in -Tascii (oops). Do the same for -Thtml (oops^2).
  Author: Kristaps Dzonsons | Age: 2011-05-15 | Files: 3 | Lines: -4/+11
* Support groff's escape for Unicode input (see http://mdocml.bsd.lv/archives/tech/0368.html). For the time being, we just throw it away.
  Author: Kristaps Dzonsons | Age: 2011-05-15 | Files: 3 | Lines: -0/+23
* Use strcspn() in term_strlen(). Clarifies the code.
  Author: Kristaps Dzonsons | Age: 2011-05-15 | Files: 1 | Lines: -10/+10
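To show why strcspn() reads well for this kind of loop, here is a deliberately simplified sketch. It is not term_strlen(): the `visible_len` name is hypothetical, and it assumes every escape is a backslash plus exactly one character with zero width, which real roff escapes are not.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Count the characters of s that are not part of an escape.
 * Simplifying assumption: an escape is a backslash followed by one
 * character, and contributes no width.
 */
static size_t
visible_len(const char *s)
{
	size_t total = 0, run;

	assert(s != NULL);

	while (*s != '\0') {
		/* Length of the literal run up to the next backslash. */
		run = strcspn(s, "\\");
		total += run;
		s += run;
		if (*s == '\\') {
			s++;		/* skip the backslash */
			if (*s != '\0')
				s++;	/* skip the escape character */
		}
	}
	return total;
}
```

Each call to strcspn() replaces a hand-written inner loop that scans for the next backslash, so the outer loop only has to deal with the escapes themselves.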
* Get rid of an "#if 0" that I don't anticipate being fixed ever (nor does it really need to be fixed, anyway).
  Author: Kristaps Dzonsons | Age: 2011-05-15 | Files: 1 | Lines: -42/+0
* Move struct termp_ps into term_ps.c; remove the engine union in struct termp, which only held one entry; finally (as per the first), make "ps" member into a pointer managed by term_ps.c. This frees up a nice chunk of memory during run-time and in the binary.
  Author: Kristaps Dzonsons | Age: 2011-05-15 | Files: 2 | Lines: -200/+197
* Continuing last commit with the style-sheet change.
  Author: Kristaps Dzonsons | Age: 2011-05-14 | Files: 1 | Lines: -19/+21
* Fix makewhatis.1 to have the correct name (it was MANDOC-DB, oops).
  Author: Kristaps Dzonsons | Age: 2011-05-14 | Files: 1 | Lines: -1/+1
* Make index.sgml look more like mandoc-cgi, which I find looks much cleaner and nicer.
  Author: Kristaps Dzonsons | Age: 2011-05-14 | Files: 1 | Lines: -58/+17
* Make www style.css link up to example.style.css much more nicely.
  Author: Kristaps Dzonsons | Age: 2011-05-14 | Files: 1 | Lines: -39/+32
* Make some values "int" that were "size_t". These are primarily used for indexing into arrays, so this removes lots of casts from size_t to int.
  Author: Kristaps Dzonsons | Age: 2011-05-14 | Files: 2 | Lines: -24/+28
* Make character engine (-Tascii, -Tpdf, -Tps) ready for Unicode: make buffer consist of type "int". This will take more work (especially in encode and friends), but this is a strong start. This commit also consists of some harmless lint fixes.
  Author: Kristaps Dzonsons | Age: 2011-05-14 | Files: 6 | Lines: -18/+22
* Give -Thtml and -Txhtml the gift of recognising escapes when calculating widths (e.g., `Bl -tag -width "\s[blahblah]bar"'). This has long since been done for -Tascii but escaped notice with -T[x]html.
  Author: Kristaps Dzonsons | Age: 2011-05-14 | Files: 3 | Lines: -5/+42