bogofilter TODO list
**** Solaris: /opt/csw - fix system iconv vs. csw iconv. This fails if, for
instance, libsqlite3 is taken from /opt/csw via --with-libsqlite3-prefix,
because iconv isn't taken from there.
This needs a *general* cleanup of prefixes, because those don't
apply to includes (which probably causes these issues).
**** Documentation: Berkeley DB 4.7, check options (versions for
existing ones, new options), use a list of versions that require a log
format upgrade (or DB format or whatever) for simplicity
**** Database (Berkeley DB): Use auto-recover features of Berkeley DB
4.4+ and give up on our own recovery locking and crash detection.
**** Database (Berkeley DB): Can we use the bulk load feature to our advantage?
**** If insufficient data is present and the default "undecided"
bogosity is added in -p mode, also add a comment stating that
bogofilter needs more training first.
**** Add a "reservation lock" (fcntl style on separate file) so that a
writer can prevent new readers from starting, so that busy scoring
systems don't starve registration processes. (Figure out the
details to avoid deadlock.)
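One possible shape for such a reservation lock, sketched in Python for brevity rather than in bogofilter's C. The file name and both function names are hypothetical. A writer holds an exclusive fcntl-style lock on the separate file for the whole registration; a reader takes a shared lock on it only at startup, so readers that are already scoring are unaffected while new ones are held back. The sketch does not address the deadlock details mentioned above.

```python
import fcntl
import os
from contextlib import contextmanager

@contextmanager
def writer_reservation(path):
    """Hold an exclusive lock on the reservation file while registering."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Blocks until no reader is passing through the gate.
        fcntl.lockf(fd, fcntl.LOCK_EX)
        yield fd
    finally:
        # Closing the descriptor also drops the lock.
        os.close(fd)

def try_reader_gate(path):
    """Return True if a new reader may start, False if a writer has reserved.

    Non-blocking variant; drop LOCK_NB to make the reader wait instead.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.lockf(fd, fcntl.LOCK_SH | fcntl.LOCK_NB)
    except OSError:
        return False
    finally:
        # A real reader would take its database read lock before
        # giving up the gate; this sketch only probes it.
        os.close(fd)
    return True
```

Note that fcntl-style locks belong to the process, not the descriptor, so the gate must not be probed from inside the process that holds the reservation.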
**** Drop/fix MAXTOKENLEN: where it is an allocation, it must die.
Where it is a character limit, count characters, not octets, to
support UTF-8.
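A minimal sketch of counting characters instead of octets, in Python for illustration only. It assumes the token is valid UTF-8, where every character has exactly one byte that is not a continuation byte (10xxxxxx). The function names and the max_chars parameter are hypothetical.

```python
def utf8_char_count(token: bytes) -> int:
    """Count UTF-8 characters by skipping continuation bytes (10xxxxxx)."""
    return sum(1 for b in token if (b & 0xC0) != 0x80)

def within_token_limit(token: bytes, max_chars: int) -> bool:
    """Apply the limit to characters, not octets."""
    return utf8_char_count(token) <= max_chars
```

With a limit of 5, the token "naïve" (6 octets, 5 characters) passes a character limit but would fail an octet limit.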
**** Database (Berkeley DB): Implement Concurrent Data Store, quite
similar to Transactional.
**** MIME: Make sure that RFC-2047 decoder runs only once, not recursively.
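To illustrate what "only once" means here, a sketch using Python's standard email.header module (not bogofilter's own decoder). The header below is a constructed example whose decoded text itself looks like an encoded word; a single pass must leave that text literal. The function name is hypothetical, and an unknown charset label would raise LookupError in this sketch.

```python
from email.header import decode_header

def decode_rfc2047_once(raw: str) -> str:
    """Decode RFC 2047 encoded words in a single pass, never re-scanning output."""
    parts = []
    for chunk, charset in decode_header(raw):
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or "ascii", errors="replace")
        parts.append(chunk)
    return "".join(parts)

# Encoded word whose payload is "=?utf-8?q?hi?=" (another encoded word).
nested = "=?utf-8?q?=3D=3Futf-8=3Fq=3Fhi=3F=3D?="
once = decode_rfc2047_once(nested)
```

A recursive decoder would decode the result a second time and reduce it to "hi", which is the behaviour this item wants to rule out.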
**** MIME: Implement RFC-2046 section 5.2.2 (message/partial reassembly
rules: take most headers from the enclosing message, except Content-*,
Subject, Message-ID, Encrypted and MIME-Version, which are taken from
the enclosed message).
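The header rule can be sketched as follows, in Python for illustration. Headers are modelled as (name, value) pairs in message order, which is an assumption of this sketch; the function names are hypothetical. Body reassembly across the fragments is not shown.

```python
# Fields that are taken from the enclosed message, besides Content-*.
ENCLOSED_FIELDS = {"subject", "message-id", "encrypted", "mime-version"}

def from_enclosed(name: str) -> bool:
    """True if this field must come from the enclosed message."""
    lowered = name.lower()
    return lowered.startswith("content-") or lowered in ENCLOSED_FIELDS

def merge_partial_headers(enclosing, enclosed):
    """Build the header list of the reassembled message.

    Copies, in order, the enclosing message's fields that are not
    overridden, then appends the overriding fields of the enclosed
    message. Other fields of the enclosed message are dropped.
    """
    merged = [(n, v) for n, v in enclosing if not from_enclosed(n)]
    merged += [(n, v) for n, v in enclosed if from_enclosed(n)]
    return merged
```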
**** Reimplement seeking passthrough mode that got dropped on 2003-08-23
with the switch to bogoreader.*
http://article.gmane.org/gmane.mail.bogofilter.general/9035 and
followups. (MID <20041222105734.GA30574@sela.f4n.org>, by "John"
Subject "Size limit?" on 2004-12-22)
The fseek() code to determine if the input is seekable got removed
when the reader moved out of main.c between 1.66 and 1.67 (CVS) and
has never been in bogoreader.c.
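The removed fseek() code is not reproduced here. As a reminder of the general technique, a seekability probe can look like the sketch below, written in Python with os.lseek instead of C's fseek(); the function name is hypothetical.

```python
import os

def is_seekable(fd: int) -> bool:
    """Probe whether a descriptor supports seeking (files do, pipes do not)."""
    try:
        # A no-op seek: stay at the current position.
        os.lseek(fd, 0, os.SEEK_CUR)
    except OSError:
        return False
    return True
```

Presumably a passthrough implementation could then rewind seekable input and read it a second time instead of buffering the whole message.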
**** New Feature: Token aging. Support for struct data in the wordlists is
already present.
**** New Feature: Token merging, based on delta tokens (Andras Salamon,
andras@dns.net on bogofilter-dev, 2005-01-25)
**** Two deletes for kmail? This wouldn't be a patch for bogofilter
itself, but a change to give kmail delete-as-spam and delete-as-
nonspam buttons. Similarly for other MUAs.
**** New Feature: Make it a milter?
**** New Feature: Multiple list file support with weights and rules. Wordlist verification.
Eric Seppanen:
> Allow use of a variable number of list files, each with their
> own weights and rules.
> Possible uses:
> - hand-maintained "whitelist" or "blacklist" files, with massive
> weighting to override everything else.
> - allow users to use system-wide list files and their own files.
>
Shared-database version based on the autodaemon code.
The shared-database version (which doesn't exist yet) needs wordlist
verification to avoid attacks on posters (thanks, Barry!).
Emulate the Vipul's Razor reputation scheme for people reporting tokens?
http://razor.sourceforge.net/
**** What this software is probably heading towards is a scheme in which
there's a general notion of tagged categories (spam being one), with
cluster analysis applied to determine which categories a message
belongs to at a confidence level above 0.9.
**** New Feature: Web-based tool for wordlist management. Allow message
registration and whitelist management. Use HTML templates for easy
integration with existing web mail systems.
**** New Feature: Add support for a user-configurable list of headers to
ignore. Headers named in the list (single- or multi-line) should be
skipped during both message registration and evaluation.
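A rough sketch of such filtering, in Python for illustration. It assumes header lines arrive as a list of strings and that a multi-line header continues on lines starting with a space or tab; the function and parameter names are hypothetical.

```python
def drop_ignored_headers(lines, ignored):
    """Remove headers named in `ignored`, including their continuation lines."""
    ignored_lower = {name.lower() for name in ignored}
    kept = []
    skipping = False
    for line in lines:
        if line[:1] in (" ", "\t"):
            # Continuation line: shares the fate of the header it belongs to.
            if not skipping:
                kept.append(line)
            continue
        name = line.split(":", 1)[0].strip().lower()
        skipping = name in ignored_lower
        if not skipping:
            kept.append(line)
    return kept
```

The same function would be applied on both the registration and the evaluation path, so that ignored headers never contribute tokens in either.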