Several improvements to escape sequence handling.

* Add the missing special character \_ (underscore).
* Partial implementations of \a (leader character) and \E (uninterpreted escape character).
* Parse and ignore \r (reverse line feed).
* Add a WARNING message about undefined escape sequences.
* Add an UNSUPP message about unsupported escape sequences.
* Mark \! and \? (transparent throughput) and \O (suppress output) as unsupported.
* Treat the various variants of zero-width spaces as one-byte escape sequences rather than as special characters, to avoid defining bogus forms with square brackets.
* For special characters with one-byte names, do not define bogus forms with square brackets, except for \[-], which is valid.
* In the form with square brackets, undefined special characters do not fall back to printing the name verbatim, not even for one-byte names.
* Starting a special character name with a blank is an error.
* Undefined escape sequences never abort formatting of the input string, not even in HTML output mode.
* Document the newly handled escapes, and a few that were missing.
* Regression tests for most of the above.
author: Ingo Schwarze <schwarze@openbsd.org> 2018-12-15 19:30:25 +0000
committer: Ingo Schwarze <schwarze@openbsd.org> 2018-12-15 19:30:25 +0000
commit: 0e3f0b740ea18224c3b2c07114be601dd8be97bb (patch)
tree: c930c6fd7e739e926a7fad1c372897af5ea601fb /term.c
parent: 2b0b19a54638a1b40d908611acc8498a911df29c (diff)
download: mandoc-0e3f0b740ea18224c3b2c07114be601dd8be97bb.tar.gz
1 files changed, 10 insertions, 8 deletions
diff --git a/term.c b/term.c
index 0ceb28d6..b0f6fb0d 100644
--- a/term.c
+++ b/term.c
@@ -477,9 +477,6 @@ term_word(struct termp *p, const char *word)
 
 		word++;
 		esc = mandoc_escape(&word, &seq, &sz);
-		if (ESCAPE_ERROR == esc)
-			continue;
-
 		switch (esc) {
 		case ESCAPE_UNICODE:
 			uc = mchars_num2uc(seq + 1, sz - 1);
@@ -500,6 +497,9 @@ term_word(struct termp *p, const char *word)
 					encode1(p, uc);
 			}
 			continue;
+		case ESCAPE_UNDEF:
+			uc = *seq;
+			break;
 		case ESCAPE_FONTBOLD:
 			term_fontrepl(p, TERMFONT_BOLD);
 			continue;
@@ -587,6 +587,9 @@ term_word(struct termp *p, const char *word)
 				case ESCAPE_SPECIAL:
 					uc = mchars_spec2cp(cp, sz);
 					break;
+				case ESCAPE_UNDEF:
+					uc = *seq;
+					break;
 				default:
 					uc = -1;
 					break;
@@ -845,12 +848,8 @@ term_strlen(const struct termp *p, const char *cp)
 		switch (*cp) {
 		case '\\':
 			cp++;
-			esc = mandoc_escape(&cp, &seq, &ssz);
-			if (ESCAPE_ERROR == esc)
-				continue;
-
 			rhs = NULL;
-
+			esc = mandoc_escape(&cp, &seq, &ssz);
 			switch (esc) {
 			case ESCAPE_UNICODE:
 				uc = mchars_num2uc(seq + 1, ssz - 1);
@@ -871,6 +870,9 @@ term_strlen(const struct termp *p, const char *cp)
 						sz += cond_width(p, uc, &skip);
 				}
 				continue;
+			case ESCAPE_UNDEF:
+				uc = *seq;
+				break;
 			case ESCAPE_DEVICE:
 				if (p->type == TERMTYPE_PDF) {
 					rhs = "pdf";
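
The hunks above drop the early `continue` on ESCAPE_ERROR and add an ESCAPE_UNDEF case that sets `uc = *seq`, so an undefined escape now prints its name character instead of vanishing. A minimal standalone sketch of that fallback follows. It is not mandoc code: `toy_escape`, `toy_render`, and the `TOY_ESCAPE_*` constants are hypothetical names invented for illustration, and the toy parser only knows `\e` and `\&`.

```c
#include <stddef.h>

/* Hypothetical escape classes; only the idea of an "undefined" class mirrors the diff. */
enum toy_escape {
	TOY_ESCAPE_SPECIAL,	/* known escape with a replacement character */
	TOY_ESCAPE_IGNORE,	/* known escape that produces no output */
	TOY_ESCAPE_UNDEF	/* undefined escape: the caller prints its name */
};

/*
 * Toy stand-in for an escape parser (assumption: one-byte names only).
 * On entry *cpp points just past the backslash; on return *seq points
 * at the name character and *cpp has been advanced past it.
 */
static enum toy_escape
toy_escape(const char **cpp, const char **seq)
{
	*seq = *cpp;
	if (**cpp != '\0')
		(*cpp)++;
	switch (**seq) {
	case 'e':
		return TOY_ESCAPE_SPECIAL;
	case '&':
		return TOY_ESCAPE_IGNORE;
	default:
		return TOY_ESCAPE_UNDEF;
	}
}

/* Copy in to out, expanding escapes; returns the number of bytes written. */
size_t
toy_render(const char *in, char *out, size_t outsz)
{
	const char *seq;
	size_t n = 0;
	char uc;

	if (outsz == 0)
		return 0;
	while (*in != '\0' && n + 1 < outsz) {
		if (*in != '\\') {
			out[n++] = *in++;
			continue;
		}
		in++;			/* skip the backslash */
		if (*in == '\0')
			break;		/* trailing backslash: nothing to print */
		switch (toy_escape(&in, &seq)) {
		case TOY_ESCAPE_SPECIAL:
			uc = '\\';	/* \e renders as a backslash */
			break;
		case TOY_ESCAPE_UNDEF:
			uc = *seq;	/* fall back to the name character */
			break;
		default:
			continue;	/* known escape without output */
		}
		out[n++] = uc;
	}
	out[n] = '\0';
	return n;
}
```

With this sketch, rendering the input `a\qb` yields `aqb`: the undefined escape contributes its name character and the rest of the string is still formatted, rather than being skipped or aborted.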