path: root/filters
Commit message | Author | Age | Files | Lines
* colorize: style chunk function name | Jason Cox | 2023-05-16 | 2 | -2/+7
    It's nice to use a different style for the chunk's function name to make it clear that the name is not necessarily adjacent to the chunk's actual lines.

    Signed-off-by: Jason Cox <me@jasoncarloscox.com>
    Acked-by: Robin Jarry <robin@jarry.cc>
* filters: fix option parsing on arm cpus | Robin Jarry | 2023-05-11 | 2 | -36/+39
    When running colorize or wrap on ARM, both programs exit immediately and display their help message, whatever arguments are provided. This is caused by an implicit downcast of the getopt return value.

    On most architectures, char is signed by default. On ARM, char is unsigned for performance reasons. Since the signed int return value of getopt is forced into a char, the results differ on ARM compared to x86.

    * Add -Wconversion -Warith-conversion to CFLAGS in CI builds to catch such issues in the future.
    * Fix all issues reported by -Wconversion -Warith-conversion.
    * Wide char functions need to deal with wint_t and wchar_t, and it is guaranteed that a wchar_t can always fit into a wint_t. Add explicit casts to silence the reported warnings.

    Link: https://www.arm.linux.org.uk/docs/faqs/signedchar.php
    Link: https://lwn.net/Articles/911914/
    Fixes: https://todo.sr.ht/~rjarry/aerc/164
    Reported-by: Bence Ferdinandy <bence@ferdinandy.com>
    Suggested-by: Allen Sobot <chilledfrogs@disroot.org>
    Signed-off-by: Robin Jarry <robin@jarry.cc>
    Tested-by: Bence Ferdinandy <bence@ferdinandy.com>
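The signedness problem described in this commit can be illustrated without ARM hardware. The following is a minimal sketch, not the aerc code: the helper names are hypothetical, and `unsigned char` and `signed char` stand in for plain `char` on ARM and x86 respectively. Each helper stores a getopt-style return value in a different type and reports whether the usual `!= -1` loop test would still recognise the end-of-options marker.

```c
#include <stdbool.h>

/* Hypothetical helpers, for illustration only. */

/* Plain char on ARM behaves like unsigned char: -1 is stored as 255. */
static bool sees_end_unsigned_char(int getopt_ret)
{
	unsigned char c = (unsigned char)getopt_ret;
	int promoted = c; /* integer promotion applied before comparing */
	return promoted == -1;
}

/* Plain char on x86 behaves like signed char: -1 survives the round trip. */
static bool sees_end_signed_char(int getopt_ret)
{
	signed char c = (signed char)getopt_ret;
	int promoted = c;
	return promoted == -1;
}

/* Keeping the value in an int is correct on every architecture. */
static bool sees_end_int(int getopt_ret)
{
	int c = getopt_ret;
	return c == -1;
}
```

With an unsigned char the end marker is never seen, so an option loop would likely fall through to its default branch, which is consistent with the help message being printed on ARM.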
* colorize: support email domains that start/end with digits | Robin Jarry | 2023-04-22 | 1 | -1/+1
    According to RFC 1123:

    "... a host domain name is now allowed to begin with a digit and could legally be entirely numeric ..."

    Link: https://datatracker.ietf.org/doc/html/rfc1123#section-2
    Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
    Signed-off-by: Robin Jarry <robin@jarry.cc>
    Reviewed-by: Bence Ferdinandy <bence@ferdinandy.com>
* colorize: don't print an id in osc8 terminator | Tim Culverhouse | 2023-04-15 | 3 | -9/+13
    Printing an ID in the OSC8 terminator can cause issues in some pagers and/or terminals. The "spec" doesn't allow for an ID in the terminator, but most applications and terminals will ignore it if it's there. Prevent printing it in the first place for better compatibility.

    Signed-off-by: Tim Culverhouse <tim@timculverhouse.com>
    Signed-off-by: Robin Jarry <robin@jarry.cc>
    Acked-by: Robin Jarry <robin@jarry.cc>
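As a rough illustration of the terminator rule in this commit, the sketch below formats an OSC 8 opening sequence that carries an `id` parameter and a terminator that carries none. The function names and the buffer-based interface are assumptions made for this example and are not taken from colorize.

```c
#include <stdio.h>

/* Hypothetical helpers, for illustration only. */

/* Opening sequence: ESC ] 8 ; id=ID ; URL ESC \ */
static int osc8_open(char *buf, size_t len, const char *id, const char *url)
{
	return snprintf(buf, len, "\033]8;id=%s;%s\033\\", id, url);
}

/* Terminator: empty parameters and empty URL, so no id is printed. */
static int osc8_close(char *buf, size_t len)
{
	return snprintf(buf, len, "\033]8;;\033\\");
}
```

The visible link text would be written between the two sequences.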
* wrap: do not strip signature delimiter trailing space | Robin Jarry | 2023-04-10 | 3 | -5/+13
    Some tools expect this trailing space to be present to detect the start of an email signature.

    Reported-by: Jd <john1doe@ya.ru>
    Fixes: https://todo.sr.ht/~rjarry/aerc/131
    Signed-off-by: Robin Jarry <robin@jarry.cc>
    Tested-by: Jd <john1doe@ya.ru>
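A minimal sketch of the behaviour described in this commit, assuming the signature delimiter is the conventional line `-- ` (dash, dash, space). The helper name is hypothetical and the real wrap code may be organised differently.

```c
#include <string.h>

/* Hypothetical helper: strip trailing spaces and tabs in place, except on
 * the signature delimiter "-- ", whose trailing space is significant. */
static void rstrip_keep_signature(char *line)
{
	size_t n;

	if (strcmp(line, "-- ") == 0)
		return;
	n = strlen(line);
	while (n > 0 && (line[n - 1] == ' ' || line[n - 1] == '\t'))
		n--;
	line[n] = '\0';
}
```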
* colorize: make url parsing more robust | Robin Jarry | 2023-04-02 | 3 | -25/+84
    Reuse the URL parsing algorithm from foot. Basically, it involves recording each opening [, ( and < and taking their closing counterparts into account. If a closing character is encountered with no matching opening one, assume the URL ends. This allows handling markdown link syntax such as:

    [http://foobaz.org/xxx](http://foobaz.org/xxx)

    Avoid coloring bare URL protocols such as http:// or https://

    Update test vector to handle more corner cases.

    Link: https://codeberg.org/dnkl/foot/src/tag/1.13.1/url-mode.c#L331-L471
    Signed-off-by: Robin Jarry <robin@jarry.cc>
    Tested-by: Kirill Chibisov <contact@kchibisov.com>
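The bracket-matching idea in this commit can be sketched as follows. This is a heavily simplified illustration, not the algorithm from foot or colorize: the function name is hypothetical, only whitespace and unmatched closing characters end the URL, and no other character validation is done.

```c
#include <stddef.h>

/* Hypothetical helper: return the length of the URL that starts at s.
 * Opening (, [ and < are counted; a closing character with no matching
 * opening one is assumed to end the URL. */
static size_t url_length(const char *s)
{
	int paren = 0, bracket = 0, angle = 0;
	size_t i;

	for (i = 0; s[i] != '\0'; i++) {
		switch (s[i]) {
		case ' ': case '\t': case '\n': case '\r':
			return i; /* whitespace always ends the URL */
		case '(': paren++; break;
		case '[': bracket++; break;
		case '<': angle++; break;
		case ')':
			if (paren == 0)
				return i;
			paren--;
			break;
		case ']':
			if (bracket == 0)
				return i;
			bracket--;
			break;
		case '>':
			if (angle == 0)
				return i;
			angle--;
			break;
		default:
			break;
		}
	}
	return i;
}
```

Called on the text that follows the opening `[` of the markdown example, the function stops at the unmatched `]`, so the closing bracket is not treated as part of the URL.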
* colorize: emit OSC 8 for URLs and emails | Kirill Chibisov | 2023-03-26 | 3 | -7/+27
    Mark URLs with the OSC 8 escape sequence to help terminal emulators open multi-line URLs with the mouse, and attach the hyperlink to email addresses so that users can open them.

    Link: https://gist.github.com/egmontkob/eb114294efbcd5adb1944c9f3cb5feda
    Signed-off-by: Kirill Chibisov <contact@kchibisov.com>
    Acked-by: Robin Jarry <robin@jarry.cc>
* colorize: use bool for boolean variables | Robin Jarry | 2023-03-26 | 1 | -27/+28
    Do not use int for true/false values.

    Signed-off-by: Robin Jarry <robin@jarry.cc>
* colorize: stop parsing theme when other section starts | Robin Jarry | 2023-03-02 | 1 | -1/+6
    In order to allow multiple sections in a styleset, colorize must stop parsing the theme when it encounters a new section.

    Signed-off-by: Robin Jarry <robin@jarry.cc>
    Tested-by: Tim Culverhouse <tim@timculverhouse.com>
* colorize: restore previous default theme | Robin Jarry | 2023-02-02 | 3 | -81/+81
    Restore the default theme from the previous colorize awk script. It is more colorful and may be more appealing to new users out of the box.

    Since colorize is now configurable via stylesets, power users can do whatever they like.

    Requested-by: Andrea Pappacoda <andrea@pappacoda.it>
    Signed-off-by: Robin Jarry <robin@jarry.cc>
    Agreed-by: Bence Ferdinandy <bence@ferdinandy.com>
* wrap: be more robust with locale | Robin Jarry | 2023-01-28 | 1 | -7/+36
    On (some?) MacOS systems there is no C.UTF-8 locale available. Instead there is a non-standard "UTF-8" (encoding only) replacement. Running wrap on MacOS results in an error:

    error: failed to set locale: Bad file descriptor

    Instead of expecting that C.UTF-8 will always be available, try to use the locale set by the user (either from the $LC_ALL or $LANG environment variables). If these variables are unset or if they are set to an invalid/non-existent locale, fall back on C.UTF-8. If C.UTF-8 is not available, make one last desperate attempt with this non-standard UTF-8 locale (MacOS only).

    aerc will always send UTF-8 encoded text to the filter commands. If the locale that we managed to load does not use the UTF-8 character encoding, exit with an explicit error instead of risking undefined behaviour.

    Reported-by: Ben Cohen <ben@bencohen.net>
    Signed-off-by: Robin Jarry <robin@jarry.cc>
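The fallback order in this commit could look roughly like the sketch below. It is an approximation under stated assumptions: the function name is hypothetical, and unlike the commit it also moves on to the next candidate when a locale loads but does not report the UTF-8 codeset, returning -1 only when every candidate fails.

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 0 once a UTF-8 locale is loaded, -1 otherwise. */
static int set_utf8_locale(void)
{
	/* "" asks setlocale() to use the environment ($LC_ALL, $LANG, ...). */
	static const char *const candidates[] = {"", "C.UTF-8", "UTF-8"};
	size_t i;

	for (i = 0; i < sizeof(candidates) / sizeof(candidates[0]); i++) {
		if (setlocale(LC_CTYPE, candidates[i]) == NULL)
			continue; /* locale missing or invalid, try the next one */
		if (strcmp(nl_langinfo(CODESET), "UTF-8") == 0)
			return 0;
	}
	return -1;
}
```

A caller would print an explicit error and exit when the function returns -1, rather than processing text under an unknown encoding.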
* filters: make colorize URL regex more strict | Andrea Pappacoda | 2023-01-28 | 1 | -4/+4
    The previous URL regex was too lax, allowing all "[:graph:]" characters after the protocol:// part. This caused the script to also mark things like ">" as part of a URL, even though ">" is commonly used as a URL delimiter in plain text and Markdown. The url() function tried to account for this with a heuristic that removes trailing characters, but it didn't always work (see the screenshots below).

    As RFC 3986 specifies the list of allowed characters in URLs, we can simply make our regex stricter and only mark characters as part of a URL if they match the allowed set. As the number of allowed characters has been reduced, the aforementioned heuristic has been slightly simplified.

    I've also removed the backslash escapes from the bracket expressions, as POSIX regular expressions do not require them; the only characters that need special handling are ']' and '-', which need to be placed at the start and at the end of the expression, respectively.

    Signed-off-by: Andrea Pappacoda <andrea@pappacoda.it>
    Acked-by: Robin Jarry <robin@jarry.cc>
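The bracket expression rule mentioned in this commit can be checked with the POSIX regex API. The sketch below uses a made-up pattern, not the one from the colorize script: `]` is placed first and `-` last inside the brackets, so both are literals and no backslashes are needed. The helper and constant names are hypothetical.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true when text matches the POSIX extended regex. */
static bool posix_match(const char *pattern, const char *text)
{
	regex_t re;
	bool ok;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return false;
	ok = regexec(&re, text, 0, NULL, 0) == 0;
	regfree(&re);
	return ok;
}

/* Matches strings made only of ']', lower case letters and '-'. */
static const char BRACKETS_AND_DASH[] = "^[]a-z-]+$";
```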
* filters: rewrite colorize in c | Robin Jarry | 2023-01-26 | 6 | -177/+909
    Since its introduction, we had multiple issues with the colorize awk script with regard to non-GNU awk compatibility. Also, this script is standalone and the color theme must be hard coded into it. Reading from an external configuration file (aerc's styleset) from a non-GNU awk is close to impossible (and even far from trivial with GNU awk).

    Rewrite the builtin colorize filter in C to allow getting the color theme from aerc's active styleset. The theme is configured using the existing styleset syntax and attributes under a separate [viewer] section (see examples and man page).

    Export the active styleset file path to AERC_STYLESET env var when invoking the filter command so that colorize can access it and use it.

    I have tested compilation (with clang-analyzer and gcc -fanalyzer) and basic operation on FreeBSD, Fedora (glibc) and Alpine (musl libc). More tests would probably be required on MacOSX and older Linux distros.

    I also added test vectors to give some confidence that this works as expected. The execution with these vectors passed valgrind --leak-check=full without errors.

    NB: the default theme has changed to be more minimal. Sample stylesets have more colorful examples. The awk -v theme=xxx option is no longer supported.

    usage: colorize [-h] [-s FILE] [-f FILE]

    options:
      -h       show this help message
      -s FILE  use styleset file (default $AERC_STYLESET)
      -f FILE  read from filename (default stdin)

    Signed-off-by: Robin Jarry <robin@jarry.cc>
    Tested-by: Bence Ferdinandy <bence@ferdinandy.com>
    Acked-by: Moritz Poldrack <moritz@poldrack.dev>
* filters: rewrite wrap in c | Robin Jarry | 2023-01-26 | 16 | -479/+1432
    This utility, introduced in commit c9524d265793 ("filters: add wrap utility written in go"), reflows text so that emails with very long lines can be viewed without breaking quotes, lists and indentation.

    For such a simple task, go produces a binary that is 2.0M bytes on disk. After stripping debugging symbols, it can be reduced to 1.2M bytes. All of this for 267 lines of source code. This is a bit ridiculous, given that people may load this binary into memory multiple times per minute.

    This tool is a small side-project that does not seem suitable for golang. Rewrite it in C. It now only depends on a POSIX libc to run. It is safe to assume that there is one available on all *NIX systems in the world of 2023.

    The resulting binary is now 27K bytes (15K after stripping).

    To build it, a C compiler and libc headers are required. These should most likely be available since they are dependencies of the go compiler toolchain.

    I have tested compilation (with clang-analyzer and gcc -fanalyzer) and basic operation on FreeBSD, Fedora (glibc) and Alpine (musl libc). More tests would probably be required on MacOSX and older Linux distros.

    I also added test vectors to give some confidence that this works as expected.

    Update CI with aggressive gcc hardening flags and to run these tests with valgrind --leak-check=full.

    Command line options are unchanged:

    usage: wrap [-h] [-w INT] [-r] [-l INT] [-f FILE]

    Wrap text without messing up email quotes.

    options:
      -h       show this help message
      -w INT   preferred wrap margin (default 80)
      -r       reflow all paragraphs even if no trailing space
      -l INT   minimum percentage of letters in a line to be considered a paragraph
      -f FILE  read from filename (default stdin)

    Signed-off-by: Robin Jarry <robin@jarry.cc>
    Tested-by: Bence Ferdinandy <bence@ferdinandy.com>
    Tested-by: Maxwell G <gotmax@e.email>
* colorize: add 'terminal' theme which respects termcolors | Tim Culverhouse | 2022-12-21 | 1 | -0/+16
    Add a "terminal" theme to the colorize script. The "terminal" theme respects the user's configured terminal color scheme. Also, links are blue and underlined.

    Usage: colorize -v theme=terminal

    Signed-off-by: Tim Culverhouse <tim@timculverhouse.com>
    Acked-by: Robin Jarry <robin@jarry.cc>
* wrap: handle letters as list items | Robin Jarry | 2022-12-14 | 1 | -1/+1
    In addition to digits, handle lower case letters as list items:

    a) foo
    b) baz
    c) bar

    Signed-off-by: Robin Jarry <robin@jarry.cc>
    Tested-by: Bence Ferdinandy <bence@ferdinandy.com>
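A list item check along the lines of this commit might look like the sketch below. The exact pattern used by wrap is not shown in the log, so the rule here is an assumption: a run of digits or a single lower case letter, followed by `)` or `.`, followed by a space. The function name is hypothetical.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical check, not wrap's actual pattern. */
static bool is_list_item(const char *line)
{
	const unsigned char *p = (const unsigned char *)line;

	if (isdigit(*p)) {
		while (isdigit(*p))
			p++; /* numbered item: consume all digits */
	} else if (islower(*p)) {
		p++; /* lettered item: exactly one lower case letter */
	} else {
		return false;
	}
	if (*p != ')' && *p != '.')
		return false;
	return p[1] == ' ';
}
```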
* filters: add wrap utility written in go | Robin Jarry | 2022-12-05 | 2 | -0/+479
    I had started writing this as an awk script but quickly got stuck with obscure code which did not even work properly. I jumped the gun and redid it in go. Bonus, we will not have MacOS's 1987 BSD awk issues.

    On the other hand, instead of a 20.0K awk script, we now have a 2.2M static go binary. If this makes people scream, I challenge them to do that with BSD awk :)

    Basically, this takes text from stdin or from a file and wraps long lines on word boundaries. It takes care not to break up email quotes or list items (numbered ones as well). Also, it is conservative by default and only wraps long lines and lines that end with a space (indicating a format=flowed message).

    If the AERC_SUBJECT environment variable is defined and contains the word PATCH, the text is not modified at all (i.e. wrap behaves as cat(1)).

    There are a few command line options to control behavior:

    Usage of ./wrap:
      -f string
            read from file instead of stdin
      -l int
            minimum percentage of letters in a line to be considered a paragraph (default 50)
      -r    reflow all paragraphs even if no trailing space
      -w int
            preferred wrap margin (default 80)

    Update docs, makefile and default config file with examples. Add a torture test to ensure it works as expected.

    Signed-off-by: Robin Jarry <robin@jarry.cc>
    Tested-by: Bence Ferdinandy <bence@ferdinandy.com>
* colorize: add solarized theme | Jens Grassel | 2022-11-30 | 1 | -15/+42
    This modifies the colorize script to accept a command line parameter to change the colour theme. Currently only a solarized version is added.

    Due to the nature of awk the theme has to be defined via the `-v` flag. Since the `switch` statement is only available in GNU awk, we use an `if else` statement to ensure that the default colours are used if the `THEME` variable is either not set at all or set to `default`.

    Solarized colour scheme: https://ethanschoonover.com/solarized/

    Signed-off-by: Jens Grassel <jens@wegtam.com>
    Signed-off-by: Robin Jarry <robin@jarry.cc>
* colorize: make it compatible with BSD awk | Robin Jarry | 2022-10-23 | 1 | -6/+6
    Fix the following error seen on MacOS:

    /usr/bin/awk: syntax error at source line 22 source file
    header_pattern = >>> @ <<< /^[A-Z][[:alnum:]-]+:/

    The @ character in front of regular expressions to pre-compile them does not seem to be part of the POSIX specification. Replace them with regular strings and call match() instead of the ~ operator.

    Also, adjust the url_pattern expression for BSD awk, whose documentation explicitly states:

    The awk utility is compliant with the IEEE Std 1003.1-2008 (“POSIX.1”) specification, except awk does not support {n,m} pattern matching.

    Use [[:lower:]]+ instead of [a-z]{2,6}.

    Tested with:

    GNU Awk 5.1.1
    awk version 20121220 (FreeBSD)

    Link: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html
    Fixes: https://todo.sr.ht/~rjarry/aerc/96
    Signed-off-by: Robin Jarry <robin@jarry.cc>
    Acked-by: Koni Marti <koni.marti@gmail.com>
* filters/colorize: various fixes | Robin Jarry | 2022-10-20 | 1 | -27/+19
    Diff chunks can occur in the middle of email conversations followed by regular and/or quoted text. Handle that properly.

    Change diff meta lines inside quotes to bold.

    Update the meta lines with more combinations for renamed, copied and deleted files.

    Fix the invalid diff_chunk color code and only colorize the chunk characters, not the whole line.

    Remove redundant variable resets to 0.

    Signed-off-by: Robin Jarry <robin@jarry.cc>
    Acked-on-irc-by: Tim Culverhouse <tim@timculverhouse.com>
* filters: fix calendar filter parsing | Koni Marti | 2022-08-22 | 1 | -5/+16
    Fix parsing of colons. Since the field separator is also the colon, it could mess up the parsed fields, i.e. a subject line like "WG: dinner" could end up as "WG" instead of keeping the entire string.

    Fixes: ab941eb ("filters: posix compliant rewrite of calendar filter")
    Signed-off-by: Koni Marti <koni.marti@gmail.com>
    Acked-by: Robin Jarry <robin@jarry.cc>
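The colon problem in this commit comes down to splitting a field on its first separator only. The calendar filter itself is an awk script; the sketch below shows the same idea in C with a hypothetical helper and an assumed `NAME:value` input line.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the value part of "NAME:value". Only the
 * first colon is treated as a separator, so later colons stay in the
 * value. Returns NULL when the line has no colon. */
static const char *field_value(const char *line)
{
	const char *colon = strchr(line, ':');

	return colon != NULL ? colon + 1 : NULL;
}
```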
* filters: make it explicit that encoding is UTF-8 | q3cpma | 2022-08-03 | 2 | -0/+2
    Document filter input charset
    Add w3m filter example to default config
    Adapt html and html-unsafe filters

    Fixes: https://todo.sr.ht/~rjarry/aerc/65
    Signed-off-by: q3cpma <q3cpma@posteo.net>
    Acked-by: Robin Jarry <robin@jarry.cc>
* filters: posix compliant rewrite of calendar filter | Koni Marti | 2022-07-24 | 1 | -178/+109
    Rewrite of the awk calendar filter to make it posix compliant. Tested with awk --posix (awk -V = GNU Awk 5.1.1). Also added some improvements to readability and formatting.

    This complements commit 3ef4a3ca051a ("filters: try and make awk scripts posix compliant").

    Signed-off-by: Koni Marti <koni.marti@gmail.com>
    Acked-by: Robin Jarry <robin@jarry.cc>
* filters: try and make awk scripts posix compliant | Robin Jarry | 2022-07-18 | 2 | -16/+16
    \x escape sequences are GNU specific. Use the octal escape code instead.

    filters/calendar is beyond help. It would need a complete rewrite to make it work with POSIX awk.

    Signed-off-by: Robin Jarry <robin@jarry.cc>
    Acked-by: Moritz Poldrack <moritz@poldrack.dev>
* filters/colorize: use /usr/bin/awk shebang | Robin Jarry | 2022-07-11 | 1 | -1/+1
    /bin is reserved for essential commands that may be used when in single user mode.

    Link: https://refspecs.linuxfoundation.org/FHS_3.0/fhs-3.0.html#binEssentialUserCommandBinaries
    Signed-off-by: Robin Jarry <robin@jarry.cc>
* filters: Add missing shebangs | Tom Schwindl | 2022-07-11 | 2 | -2/+4
    The hldiff and plaintext filter scripts are missing their shebangs. Add those to be correct and consistent. Additionally, remove the vim comment, as it is unnecessary.

    Signed-off-by: Tom Schwindl <schwindl@posteo.de>
    Acked-by: Moritz Poldrack <moritz@poldrack.dev>
* filters: awk filter to parse text/calendar | Koni Marti | 2022-05-31 | 1 | -0/+328
    Implement a filter to read text/calendar (ics) data with awk. Parses multiple events and shows the date recurrence if available. Awk alternative to the python filter.

    Signed-off-by: Koni Marti <koni.marti@gmail.com>
    Acked-by: Robin Jarry <robin@jarry.cc>
* Add html "unsafe" filter to work also without dante | Jens Grassel | 2022-04-17 | 1 | -0/+16
    If socksify (from dante) is not installed then the filter uses w3m without it to render an html message part.

    Signed-off-by: Jens Grassel <jens@wegtam.com>
    Acked-by: Robin Jarry <robin@jarry.cc>
* show-ics-details.py: fix error with python < 3.9 | Jens Grassel | 2022-03-24 | 1 | -16/+16
    Change the pattern matching into an if/elif construct because pattern matching is not supported on python < 3.9.

    Signed-off-by: Jens Grassel <jens@wegtam.com>
    Acked-by: Robin Jarry <robin@jarry.cc>
* Add filter script for ics files. | Jens Grassel | 2022-03-23 | 1 | -0/+89
    This is a python script for python 3 using the vobject library to show details about an ics file (text/calendar attachment).

    Signed-off-by: Jens Grassel <jens@wegtam.com>
    Tested-by: Moritz Poldrack <moritz@poldrack.dev>
* colorize: handle mailto prefixes in urls | Robin Jarry | 2022-03-18 | 1 | -1/+1
    mailto:email@domain.tld is the only exception that does not use the <scheme>:// prefix.

    Requested-by: Moritz Poldrack <moritz@poldrack.dev>
    Signed-off-by: Robin Jarry <robin@jarry.cc>
    Tested-by: Moritz Poldrack <moritz@poldrack.dev>
* filters: fix colorize urls in signatures | Robin Jarry | 2022-03-03 | 1 | -1/+1
    When a signature contains a line that starts with a URL, the URL is not highlighted properly:

    --
    Foobar
    [38;2;255;255;175mmhttps://foobar.org

    The trailing m of the signature color start \x1b[38;2;175;135;255m is considered as part of the URL scheme (i.e. mhttps:// instead of https://).

    Colorize the URLs first before wrapping with the signature color.

    Fixes: df8c129235d9 ("filters: port colorize to awk")
    Signed-off-by: Robin Jarry <robin@jarry.cc>
* filters: port colorize to awk | Robin Jarry | 2022-02-21 | 1 | -168/+139
    Python is not available on all systems. Port the colorize filter to awk as it is a more widespread POSIX utility.

    Users are free to copy the filter into their home dir and tweak the colors to their needs. The highlighted items are:

    =============== ======= ======= ======= ===========
    Item            Red     Green   Blue    Color
    =============== ======= ======= ======= ===========
    quoted text 1   95      175     255     Blue
    quoted text 2   255     135     0       Orange
    quoted text 3   175     135     255     Purple
    quoted text 4   255     95      215     Pink
    quoted text *   128     128     128     Grey
    diff meta       255     255     255     White bold
    diff chunk      205     0       205     Cyan
    diff added      0       205     0       Green
    diff removed    205     0       0       Red
    signature       175     135     255     Purple
    header          175     135     255     Purple
    url             255     255     175     Yellow
    =============== ======= ======= ======= ===========

    This assumes a terminal emulator with true color support and with a dark/black background.

    Link: https://github.com/termstandard/colors
    Signed-off-by: Robin Jarry <robin@jarry.cc>
* filters: restore plaintext awk script | Robin Jarry | 2022-02-20 | 1 | -0/+16
    This script is referenced by some users' configurations. Restore it to avoid breaking existing setups.

    Fixes: bca93cd91536 ("filters: add a more complete plaintext filter")
    Signed-off-by: Robin Jarry <robin@jarry.cc>
* filters: rename plaintext to colorize | Robin Jarry | 2022-02-20 | 1 | -0/+0
    This filter script is not compatible with the previous one. Rename it to avoid issues with existing configs.

    Fixes: bca93cd91536 ("filters: add a more complete plaintext filter")
    Signed-off-by: Robin Jarry <robin@jarry.cc>
* filters: add a more complete plaintext filter | Robin Jarry | 2022-02-19 | 1 | -0/+171
    Colorize most plain text messages.

    Signed-off-by: Robin Jarry <robin@jarry.cc>
* config: set a default filter for text/plain | Robin Jarry | 2022-02-19 | 1 | -16/+0
    Avoid the following issue when running aerc with the default configuration:

    No filter configured for this mimetype ('text/plain')

    Use a very basic sed command to replace the default plaintext filter.

    Fixes: bb0f1801402e ("config: do not hardcode sharedir")
    Signed-off-by: Robin Jarry <robin@jarry.cc>
* Strip carriage returns (^M) when filtering emails | Daniel Xu | 2019-08-20 | 2 | -0/+9
    Presumably some email servers will transform newlines into carriage return plus newline pairs to better support Windows users. I can't prove this but that's the best explanation I have for my hosted email provider (fastmail).

    Without this patch, I was seeing annoying `^M`s at the end of every filtered line.

    Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
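Stripping carriage returns, as this commit does, is a small in-place filter. The log does not show how the patch implements it, so the sketch below is only one possible approach with a hypothetical helper name. It removes every `\r` byte, not just those directly before a newline.

```c
/* Hypothetical helper: remove every '\r' from s in place. */
static void strip_carriage_returns(char *s)
{
	char *dst = s;

	for (; *s != '\0'; s++) {
		if (*s != '\r')
			*dst++ = *s; /* keep everything except carriage returns */
	}
	*dst = '\0';
}
```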
* Escape plus symbol in hldiff filter. | EdOverflow | 2019-07-13 | 1 | -1/+1
    I was getting errors when using the hldiff filter with aerc because the plus symbol on line 28 wasn't escaped. This commit escapes the plus symbol in the regex on line 28.
* Move contrib -> filters | Drew DeVault | 2019-06-27 | 3 | -0/+62