Several improvements to escape sequence handling.

* Add the missing special character \_ (underscore).
* Partial implementations of \a (leader character) and \E (uninterpreted escape character).
* Parse and ignore \r (reverse line feed).
* Add a WARNING message about undefined escape sequences.
* Add an UNSUPP message about unsupported escape sequences.
* Mark \! and \? (transparent throughput) and \O (suppress output) as unsupported.
* Treat the various variants of zero-width spaces as one-byte escape sequences rather than as special characters, to avoid defining bogus forms with square brackets.
* For special characters with one-byte names, do not define bogus forms with square brackets, except for \[-], which is valid.
* In the form with square brackets, undefined special characters do not fall back to printing the name verbatim, not even for one-byte names.
* Starting a special character name with a blank is an error.
* Undefined escape sequences never abort formatting of the input string, not even in HTML output mode.
* Document the newly handled escapes, and a few that were missing.
* Regression tests for most of the above.
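
The square-bracket rules above can be illustrated with a small sketch. This is not mandoc's implementation (mandoc is written in C and uses its own tables); the function name render, the two lookup tables, and the handling of unterminated brackets are assumptions made for illustration only. It shows that a one-byte escape such as \' is valid while the bracketed form \['] is not, that an invalid name prints nothing instead of falling back to the name, and that formatting continues after the warning.

```python
# Hypothetical, deliberately tiny tables; mandoc's real tables are much larger.
ONE_BYTE = {"'": "'", "`": "`", "-": "-"}   # \'  \`  \-  (ASCII renderings)
NAMED = {"aa": "'", "ga": "`", "-": "-"}    # \(aa  \[aa]  \[ga]  \[-]


def render(text):
    """Return (output, warnings) for a string with simple roff-style escapes."""
    out = []
    warnings = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != "\\" or i + 1 >= len(text):
            out.append(ch)              # ordinary character or trailing backslash
            i += 1
            continue
        nxt = text[i + 1]
        if nxt == "[":
            end = text.find("]", i + 2)
            if end == -1:
                # Assumption: an unterminated bracket swallows the rest of the input.
                warnings.append("incomplete escape sequence: " + text[i:])
                break
            name = text[i + 2:end]
            if name.startswith(" ") or name not in NAMED:
                # No fallback to printing the name, not even for one-byte names.
                warnings.append("invalid escape sequence: " + text[i:end + 1])
            else:
                out.append(NAMED[name])
            i = end + 1                 # keep formatting after the warning
        elif nxt == "(":
            name = text[i + 2:i + 4]
            if len(name) == 2 and name in NAMED:
                out.append(NAMED[name])
            else:
                warnings.append("invalid escape sequence: " + text[i:i + 4])
            i += 4
        else:
            if nxt in ONE_BYTE:
                out.append(ONE_BYTE[nxt])
            else:
                # Simplification: warn and print nothing for unknown escapes.
                warnings.append("undefined escape sequence: \\" + nxt)
            i += 2
    return "".join(out), warnings


if __name__ == "__main__":
    print(render("e\\'e\\[']e"))
```

For the input e\'e\[']e used in the regression test below, the sketch yields the text e'ee plus one warning about \['], which matches the expected ASCII output and lint message in this commit.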
author: Ingo Schwarze <schwarze@openbsd.org> 2018-12-15 19:30:25 +0000
committer: Ingo Schwarze <schwarze@openbsd.org> 2018-12-15 19:30:25 +0000
commit: 0e3f0b740ea18224c3b2c07114be601dd8be97bb (patch)
tree: c930c6fd7e739e926a7fad1c372897af5ea601fb /regress/char/accent
parent: 2b0b19a54638a1b40d908611acc8498a911df29c (diff)
download: mandoc-0e3f0b740ea18224c3b2c07114be601dd8be97bb.tar.gz
5 files changed, 15 insertions, 12 deletions
diff --git a/regress/char/accent/Makefile b/regress/char/accent/Makefile
index 4bc149a7..3712217e 100644
--- a/regress/char/accent/Makefile
+++ b/regress/char/accent/Makefile
@@ -3,5 +3,6 @@
 REGRESS_TARGETS = nocombine utf8only combine
 SKIP_ASCII = utf8only combine
 UTF8_TARGETS = nocombine utf8only combine
+LINT_TARGETS = nocombine
 
 .include <bsd.regress.mk>
diff --git a/regress/char/accent/nocombine.in b/regress/char/accent/nocombine.in
index a81d446b..637d337e 100644
--- a/regress/char/accent/nocombine.in
+++ b/regress/char/accent/nocombine.in
@@ -1,17 +1,17 @@
 .\" $OpenBSD: nocombine.in,v 1.2 2017/07/04 14:53:23 schwarze Exp $
-.TH CHAR-ACCENT-NOCOMBINE 1 "March 8, 2014"
+.TH CHAR-ACCENT-NOCOMBINE 1 "December 15, 2018"
 .SH NAME
 \fBchar-accent-nocombine\fR - non-combining accents
 .SH DESCRIPTION
 bare acute accent: e'e
 .br
-escaped acute accent: e\'e
+escaped acute accent: e\'e\[']e
 .br
 acute accent sequence: e\(aae
 .br
 bare grave accent: e`e
 .br
-escaped grave accent: e\`e
+escaped grave accent: e\`e\[`]e
 .br
 acute grave sequence: e\(gae
 .br
@@ -20,15 +20,15 @@ hungarian umlaut: e\(a"e
 .\" XXX This is ridiculous.
 .\" XXX groff prints the macron as an underscore in the previous line.
 .\" macron: e\(a-e
-.br
+.\" .br
 .\" XXX groff doesn't have a dot in ASCII mode, only in UTF-8 mode.
 .\" dotted: e\(a.e
-.br
+.\" .br
 circumflex: e\(a^e
 .br
 .\" XXX groff uses a backspace for this one in ASCII mode.
 .\" breve: e\(abe
-.br
+.\" .br
 cedilla: e\(ace
 .br
 dieresis: e\(ade
diff --git a/regress/char/accent/nocombine.out_ascii b/regress/char/accent/nocombine.out_ascii
index bc1cce15..0f18ac4a 100644
--- a/regress/char/accent/nocombine.out_ascii
+++ b/regress/char/accent/nocombine.out_ascii
@@ -7,10 +7,10 @@ NNAAMMEE
 
 DDEESSCCRRIIPPTTIIOONN
        bare acute accent: e'e
-       escaped acute accent: e'e
+       escaped acute accent: e'ee
        acute accent sequence: e'e
        bare grave accent: e`e
-       escaped grave accent: e`e
+       escaped grave accent: e`ee
        acute grave sequence: e`e
        hungarian umlaut: e"e
        circumflex: e^e
@@ -25,4 +25,4 @@ DDEESSCCRRIIPPTTIIOONN
 
 
 
-OpenBSD                          March 8, 2014        CHAR-ACCENT-NOCOMBINE(1)
+OpenBSD                        December 15, 2018      CHAR-ACCENT-NOCOMBINE(1)
diff --git a/regress/char/accent/nocombine.out_lint b/regress/char/accent/nocombine.out_lint
new file mode 100644
index 00000000..c9de4162
--- /dev/null
+++ b/regress/char/accent/nocombine.out_lint
@@ -0,0 +1,2 @@
+mandoc: nocombine.in:8:27: WARNING: invalid escape sequence: \[']
+mandoc: nocombine.in:14:27: WARNING: invalid escape sequence: \[`]
diff --git a/regress/char/accent/nocombine.out_utf8 b/regress/char/accent/nocombine.out_utf8
index 3aa441a2..497bf6fd 100644
--- a/regress/char/accent/nocombine.out_utf8
+++ b/regress/char/accent/nocombine.out_utf8
@@ -7,10 +7,10 @@ NNAAMMEE
 
 DDEESSCCRRIIPPTTIIOONN
        bare acute accent: e'e
-       escaped acute accent: e´e
+       escaped acute accent: e´ee
        acute accent sequence: e´e
        bare grave accent: e`e
-       escaped grave accent: e`e
+       escaped grave accent: e`ee
        acute grave sequence: e`e
        hungarian umlaut: e˝e
        circumflex: e^e
@@ -25,4 +25,4 @@ DDEESSCCRRIIPPTTIIOONN
 
 
 
-OpenBSD                          March 8, 2014        CHAR-ACCENT-NOCOMBINE(1)
+OpenBSD                        December 15, 2018      CHAR-ACCENT-NOCOMBINE(1)