| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
only sync to disk one single time when all data is ready.
Rebuild times for /usr/share/man/mandoc.db shrink on my notebook:
In standard mode from 45 seconds to 11 seconds (75% reduction)
In -Q mode from 25 seconds to 3.1 seconds (87% reduction)
For comparison: makewhatis(8): 4.2 seconds
That is, in -Q mode, we are now *faster* than the existing makewhatis(8),
and careful profiling shows there is still a lot of room for improvement.
|
|
|
|
| |
It was broken by recent optimizations.
|
|
|
|
|
| |
The concept of an index file is gone since the switch to SQLite.
No functional change.
|
|
|
|
|
| |
The contents can easily be reconstructed from sec, arch, name, form.
Shrinks the database by another 3% in standard mode and 9% in -Q mode.
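
To illustrate why such a column is redundant, here is a minimal sketch of how a relative file name could be rebuilt from sec, arch, name and form. The helper build_path(), its is_source parameter (standing in for form) and the assumed layout (man* directories for source manuals, cat* directories with suffix "0" for preformatted ones) are illustrative assumptions, not code from mandocdb(8).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from mandocdb(8): rebuild a relative
 * file name from the four fields named in the commit message.
 * Assumed layout: <dir><sec>/[<arch>/]<name>.<suffix>
 * Source manuals: dir "man", suffix = section.
 * Preformatted manuals: dir "cat", suffix "0".
 * Returns 0 on success, -1 if the result does not fit into buf.
 */
static int
build_path(char *buf, size_t sz, const char *sec, const char *arch,
    const char *name, int is_source)
{
	int has_arch, n;

	assert(buf != NULL && sec != NULL && name != NULL);
	has_arch = arch != NULL && strlen(arch) > 0;
	n = snprintf(buf, sz, "%s%s/%s%s%s.%s",
	    is_source ? "man" : "cat", sec,
	    has_arch ? arch : "", has_arch ? "/" : "",
	    name, is_source ? sec : "0");
	return (n >= 0 && (size_t)n < sz) ? 0 : -1;
}
```

Under these assumptions, build_path(buf, sizeof(buf), "4", "i386", "apm", 1) yields man4/i386/apm.4.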
|
|
|
|
|
| |
This shrinks the database in standard mode by 3%, in -Q mode by 9%,
without loss of functionality.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
for accelerated generation of reduced-size databases.
Implement this by allowing the parsers to optionally
abort the parse sequence after the NAME section.
While here, garbage collect the unused void *arg attribute of
struct mparse and mparse_alloc() and fix some errors in mandoc(3).
This reduces the processing time of mandocdb(8) on /usr/share/man
by a factor of 2 and the database size by a factor of 4.
However, it still takes 5 times the time and 6 times the space
of makewhatis(8), so more work is clearly needed.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It's a pity i spent time during t2k13 writing this; however,
when an entire concept is busted, let us not look back,
There is no such thing as an unreachable page. Even if you are crazy
enough to put a page starting with ".Dt NAMEI 9" into a file man1/cat.1,
we now make sure that it can be found by all of the following:
Nm=namei Nm=cat sec=1 sec=9
It will always be displayed as:
cat(1) - pathname lookup
So you know that you have to type `man cat` to get at it.
That obsoletes the concept of "unreachable manuals" for good.
|
|
|
|
|
| |
This column wasn't helpful because one manpage can have multiple MLINKS.
Use the file name column in the mlinks table, instead.
|
|
|
|
|
| |
They were confusing because a manpage can have MLINKS in different
sections and architectures.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
apropos \( EXPR \) -a 'sec~^NUM$' -a 'arch~^(ARCH|any)$'
in preparation for removal of sec and arch from the mpage table.
Almost no functional change except for the following bonus:
This also makes sure that for cross-section and cross-arch MLINKs,
all of the following work:
apropos -s 1 encrypt
apropos -s 8 encrypt
apropos -s 1 makekey
apropos -s 8 makekey
While here, print error messages about invalid regexps to stderr.
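
The shape of the rewritten expression can be sketched as follows. This is only a string-level illustration: the function wrap_expr() and its parameters are hypothetical, and the real apropos(1) code presumably works on its parsed expression rather than on a flat string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: wrap a search expression in parentheses and
 * append the section and architecture constraints in the form quoted
 * in the commit message.  Either sec or arch may be NULL to omit that
 * constraint.  Returns 0 on success, -1 if buf is too small.
 */
static int
wrap_expr(char *buf, size_t sz, const char *expr, const char *sec,
    const char *arch)
{
	size_t len;
	int n;

	assert(buf != NULL && expr != NULL);
	n = snprintf(buf, sz, "\\( %s \\)", expr);
	if (n < 0 || (size_t)n >= sz)
		return -1;
	len = (size_t)n;
	if (sec != NULL) {
		n = snprintf(buf + len, sz - len, " -a 'sec~^%s$'", sec);
		if (n < 0 || (size_t)n >= sz - len)
			return -1;
		len += (size_t)n;
	}
	if (arch != NULL) {
		n = snprintf(buf + len, sz - len,
		    " -a 'arch~^(%s|any)$'", arch);
		if (n < 0 || (size_t)n >= sz - len)
			return -1;
	}
	return 0;
}
```

Anchoring both regular expressions with ^ and $ keeps a section such as 1 from also matching 1p or 11.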
|
|
|
|
|
|
|
|
|
|
| |
in preparation for removing them from the mpages table,
aiming for cleaner and more uniform interfaces.
Database growth is below 4%, part of which will be reclaimed.
As a bonus, this allows searches like:
./obj/apropos An=kettenis -a arch=ppc
./obj/apropos An=kettenis -a sec~[^4]
|
|
|
|
|
| |
that don't necessarily have anything to do with UTF-8.
Just renaming, no functional change.
|
|
|
|
| |
Just like for mandoc(1), provide a -Tutf8 option for people who want that.
|
|
|
|
|
| |
Allocate memory inside, not in the callers.
No functional change.
|
|
|
|
|
| |
not just the first one. This doesn't change how the check is done,
but just which MLINKS are checked.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Berkeley DB to SQLite3: In the .In parser, the logic got inverted.
The resulting NULL pointer access was found by clang;
scan log provided by Ulrich Spoerlein <uqs at FreeBSD>.
The best fix is to simply remove the whole, pointless custom
handler function for .In and let the framework do its work.
Now searching for included header files actually works.
While here, remove the similarly pointless custom .St handler,
fix the return value of the .Fd handler and disentangle the
spaghetti in the .Nm handler.
|
|
|
|
|
| |
and remove pointless local variables;
found in a clang output from Ulrich Spoerlein <uqs at FreeBSD>
|
|
|
|
| |
Fix the loop logic in mlinks_undupe().
|
|
|
|
|
| |
such that the check for source manuals of the same name
can be done for multiple mlinks pointing to the same preformatted mpage.
|
|
|
|
| |
apropos(1) will need it to display its results.
|
|
|
|
| |
Not yet used by apropos(1).
|
|
|
|
|
|
| |
We are still only using one of them for now.
Actually, we are now using a different one,
but the order the mlinks are found is random anyway.
|
|
|
|
| |
Not used yet.
|
|
|
|
|
| |
Consistently use "fsec" and "fform" for info derived from the file name.
No functional change.
|
|
|
|
|
|
|
|
| |
* rename global ohash filenames to mlinks
* rename ofadd() to mlink_add()
* fold fileadd() and inoadd() into mlink_add()
* fold filecheck() into mpages_merge()
Still no functional change.
|
|
|
|
| |
Still a 1:1 relation, no functional change yet.
|
|
|
|
|
|
| |
both because it contains nothing but a subset of the data of the
existing mpages table and because the relationship of mpage and mlink
entries is still 1:1. But all that will eventually change.
|
|
|
|
|
| |
No functional change except that the order of database entries changes,
which doesn't matter anyway.
|
|
|
|
|
| |
Make this more searchable by calling it "inodev".
No functional change.
|
|
|
|
|
|
|
|
| |
table into two tables, one for actual files on disk, one for (often
multiple) directory entries pointing to them. That implies splitting
struct of into two structs, to be called "mpage" and "mlink",
respectively. As a preparation, globally rename "of" and "inos"
to "mpage". No functional change.
|
|
|
|
| |
such that we don't trigger an assertion on a duplicate NAME section.
|
|
|
|
|
|
|
|
|
|
| |
can still be used to write architecture-specific manuals, of course.
So just derive the architecture a man(7) manual belongs to from the
directory where it is located and refrain from warning about each and
every architecture-specific man(7) manual found.
While here, delete some trailing whitespace in the neighbourhood.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
of the path (/man1/ .. /man9/) or the file name suffix (*.1 .. *.9)
indicated a source manual. That missed source manuals with unusual
names in unusual locations.
Instead, as the existing comment right above already suggests, try
the source parsers unless both the path and the file name suffix
unambiguously indicate a preformatted manual (/cat*/*.0).
This change is not expensive in practice because no real-world
system will have large numbers of preformatted pages outside
/cat*/*.0. The only way to make information loss even less probable
would be to try the source parsers on all files, even /cat*/*.0,
which wouldn't buy us much because no real-world system will call
source manuals /cat*/*.0, but it will be expensive in practice,
because many real-world systems have large numbers of preformatted
pages called /cat*/*.0.
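
The decision rule described above can be sketched as a small predicate. The function name try_source_parsers() and the plain string matching are assumptions made for illustration; the actual mandocdb(8) code derives this information while walking the tree.

```c
#include <assert.h>
#include <string.h>

/*
 * Return 1 if the source parsers should be tried on this path.
 * Sketch only: a file is treated as preformatted, and the parsers are
 * skipped, only if its parent directory name starts with "cat" AND
 * its file name suffix is ".0".  Everything else gets a parse attempt.
 */
static int
try_source_parsers(const char *path)
{
	const char *slash, *dir, *dot;
	int in_cat, suffix0;

	assert(path != NULL);
	slash = strrchr(path, '/');
	if (slash == NULL)
		return 1;	/* no directory part: not unambiguous */

	/* Walk back to the start of the parent directory component. */
	dir = slash;
	while (dir > path && dir[-1] != '/')
		dir--;

	in_cat = (size_t)(slash - dir) >= 3 && strncmp(dir, "cat", 3) == 0;
	dot = strrchr(slash + 1, '.');
	suffix0 = dot != NULL && strcmp(dot, ".0") == 0;
	return !(in_cat && suffix0);
}
```

With this rule, cat1/ls.0 is skipped, while man1/ls.1, cat1/ls.1 and misc/odd.0 are all handed to the source parsers.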
|
|
|
|
| |
no functional change
|
|
|
|
| |
string table. Fortunately, they never need UTF-8 translation either.
|
|
|
|
| |
directory or one of its subdirectories.
|
|
|
|
|
| |
so move the str_info structure into that function.
No functional change.
|
|
|
|
| |
so move the statement into the function dbopen().
|
|
|
|
|
| |
is actually reachable by man(1). This check got lost when
the database backend was changed from Berkeley to sqlite.
|
|
|
|
|
| |
Having a mask is sufficient to trigger putmdockey.
Simplify by dropping the flags; no functional change.
|
|
|
|
|
| |
the section was dropped when switching from db to sqlite.
Use the customary format foo(N).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
of the same string within the same manual, so initialize and purge
it for each manual in ofmerge() instead of one single time in main().
There is no point in saving manual names and descriptions in that
table because each of them occurs only once, or very few times.
There is no point in saving section numbers there because they are
so much shorter than the descriptions.
Testing with the complete tree /usr/share/man/ on my notebook shows
that this change reduces memory consumption by about 20%
while there is no measurable difference in execution time.
As a bonus, this allows deleting the functions stradd() and stradds(),
the "next" member from struct str, and the global struct str *words.
While adapting the places in the code using stradd(), i noticed that
parsing of the mdoc(7) .Nd macro was completely broken and that for
formatted manual pages with unusable NAME section, the description
was never set in the struct of. This commit fixes both bugs as well.
|
|
|
|
|
|
|
| |
and ohash_find() twice. As a bonus, this allows dropping hashget().
While here, rename index to slot to match the terminology in the ohash
manual; it also prevents potential clashes with index(3).
Drop the slot variable altogether where it is used only once.
|
|
|
|
|
| |
Also rename straddbuf() to stradds() to be more similar to putkeys().
Just cleanup, no functional change.
|
|
|
|
|
| |
While here, simplify dbopen() and dbclose(): No need for strlcpy()
and strlcat() when dealing with constant strings only.
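
A minimal sketch of the idea: when every piece is a string literal, the compiler can concatenate adjacent literals, so no run-time copying or truncation check is needed. The value of MANDOC_DB is the one quoted elsewhere in this log; the "~" suffix and the function temp_db_name() are assumptions for illustration.

```c
#include <assert.h>

#define MANDOC_DB "mandocdb.db"	/* value as quoted elsewhere in this log */

/*
 * Adjacent string literals are joined at compile time, so the size of
 * the result is known statically: one byte more than MANDOC_DB.
 */
static_assert(sizeof(MANDOC_DB "~") == sizeof(MANDOC_DB) + 1,
    "literal concatenation adds exactly one byte");

/* Hypothetical helper returning an assumed temporary file name. */
static const char *
temp_db_name(void)
{
	return MANDOC_DB "~";
}
```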
|
|
|
|
|
|
| |
used as a default page description if no usable NAME section was found
was preserved when moving from db to sqlite, but the code line actually
doing that was removed without replacement. So, put it back.
|
|
|
|
| |
so make the function void; no functional change.
|
|
|
|
|
|
|
|
|
| |
during the switch from db to sqlite; restore these:
* Warn and skip when directory and file name mismatch.
* Warn and skip when finding special files.
* Warning about "mandocdb.db" is useless, it is always present.
* While here, do not hardcode "mandocdb.db", use MANDOC_DB.
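
The special-file check from the list above might look roughly like this. The function name usable_file() and the wording of the warning are assumptions; only the idea of warning and skipping comes from the commit message.

```c
#include <sys/stat.h>

#include <assert.h>
#include <stdio.h>

/*
 * Return 1 if path names a regular file worth indexing.
 * Otherwise print a warning to stderr and return 0 so that the
 * caller can skip the entry.
 */
static int
usable_file(const char *path)
{
	struct stat st;

	assert(path != NULL);
	if (stat(path, &st) == -1) {
		perror(path);
		return 0;
	}
	if (!S_ISREG(st.st_mode)) {
		fprintf(stderr, "%s: not a regular file\n", path);
		return 0;
	}
	return 1;
}
```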
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
db to sqlite; they are needed to prevent corruption of the database
when paths containing dot, dotdot, or symlinks are given on the
command line. Also make sure the exit code is really non-zero on
system errors and use mandoc(1) exit codes.
To make all this simpler,
* Drop the "basedir" argument from almost every function and make it
global because it is really state info used all over the place.
* Move "startdir" and "fd" as local vars into set_basedir() because they
are only used for this one purpose, i.e. to move out of basedir again.
While here,
* Clarify the name of path_arg in the main program; in the -C case,
it is not a dir, and anyway there are lots of different dirs around.
* Include missing <stdio.h> needed for perror().
|
|
|
|
|
|
| |
Consistently use the PATH_MAX since it is specified by POSIX,
while MAXPATHLEN is not.
In preparation for using this at a few more places.
|