summaryrefslogtreecommitdiffstats
path: root/mansearch.c
Commit message (Collapse)AuthorAgeFilesLines
* Make the man(1) and apropos(1) options -s and -S much less expensive:Ingo Schwarze2015-01-201-48/+22
| | | | | | | | | | | | | | Do not append an SQL clause looking into the large "keys" table. Instead, filter the result of the SQL query in buildnames() where equivalent data from the much smaller "mlinks" table is already available for free. This is relevant because man(1) uses the equivalent of "-S ${MACHINE}" by default since main.c rev. 1.216, to make sure that manuals for the current architecture are shown. With many ports installed, this patch can speed up man(1) by a factor of more than a hundred. Slowness reported by Theo Buehler <theo at math dot ethz dot ch>, thanks!
* When opening mandoc.db fails, tell the user in which directory.Ingo Schwarze2014-12-061-1/+3
| | | | Improving an unhelpful error message reported by millert@.
* Make makewhatis(8) understand .so links to .gz pages.Ingo Schwarze2014-11-271-12/+7
| | | | | | | | Drop the FORM_GZ annotation in the mpages table; it is conceptually wrong because it ought to be in the mlinks table: An uncompressed .so link file can point to a compressed manual page file and vice versa. Besides, it is no longer needed because mparse_open() handles it all. Sprinkle some KNF while here.
* In man(1) mode, prefer file name matches over .Dt name matches overIngo Schwarze2014-11-181-6/+11
| | | | | | first .Nm entries over other NAME .Nm entries over SYNOPSIS .Nm entries. For example, this makes sure "man ypbind" does not return yp(8). Re-run "makewhatis" to profit from this change.
* In man(1) mode without -a, stop searching after the first manual treeIngo Schwarze2014-11-111-0/+8
| | | | | that contained at least one match in order to not prefer mdoc(1) from ports over mdoc(7). As a bonus, this results in a speedup.
* If a manual page is installed gzip(1)ed, let makewhatis(8) takeIngo Schwarze2014-09-031-4/+9
| | | | | | | note in mandoc.db(5), such that man(1) -w and apropos(1) -w can report the correct filename. This is a prerequisite for letting apropos -a and man support gzip'ed manuals in the future, which doesn't work yet.
* In man(1) mode, change to the right directory before starting the parser,Ingo Schwarze2014-09-011-0/+1
| | | | | | | just like traditional man(1) does, such that .so links have a chance to work. After this point, we don't need the current directory for anything else before exit, so we don't need to worry about getting back and we can safely ignore failure.
* Bugfix: make whatis(1) case-insensitive again.Ingo Schwarze2014-08-211-0/+1
| | | | | The traditional whatis(1) was case-insensitve and it's still documented that way, but that apparently got broken with or after the switch.
* Fully integrate apropos(1) into mandoc(1).Ingo Schwarze2014-08-171-27/+28
| | | | | | | | | Switch the argmode on the progname, including man(1). Provide -f and -k options to switch the argmode. Store the argmode inside struct search, generalizing the flags. Derive the deftype from the argmode when needed instead of storing it. Store the outkey inside struct search instead of passing it alone. While here, get rid of the trailing blanks in Makefile.depend.
* Improve build system and autodetection.Ingo Schwarze2014-08-161-1/+1
| | | | | | | | | * Make ./configure standalone, that's what people expect. * Let people write a ./configure.local from scratch, not edit existing files. * Autodetect wchar, sqlite3, and manpath and act accordingly. * Autodetect the need for -L/usr/local/lib and -lutil. * Get rid of config.h.p{re,ost}, let ./configure only write what's needed. * Let ./configure write a Makefile.local snippet, that's quite flexible.
* Get rid of HAVE_CONFIG_H, it is always defined; idea from libnbcompat.Ingo Schwarze2014-08-101-2/+2
| | | | | | Include <sys/types.h> where needed, it does not belong in config.h. Remove <stdio.h> from config.h; if it is missing somewhere, it should be added, but i cannot find a *.c file where it is missing.
* mmap(2) requires MAP_PRIVATE ^ MAP_SHARED for flags;Ingo Schwarze2014-08-091-1/+2
| | | | found by kristaps@ on Mac OS X
* Absurdly, the return value of sqlite3_column_text()Ingo Schwarze2014-08-051-5/+5
| | | | | | is "const unsigned char *", which causes warnings with GCC on Linux. Explicitly cast to "const char *" to avoid this. Issue noticed by kristaps@.
* If an old SQLite version doesn't provide SQLITE_DETERMINISTIC,Ingo Schwarze2014-08-051-0/+3
| | | | | simply ignore it, as using it is merely an optimization. Issue noticed by kristaps@.
* Sort result pages first by section number, then by name.Ingo Schwarze2014-07-241-0/+20
| | | | | | | | By moving the sort from cgi.c to mansearch.c, we get two advantages: Easier access to the data needed for sorting, in particular the section number, and the apropos(1) command line utility profits as well. Feature requested by deraadt@.
* Fix whatis(1) to correctly match words instead of any substrings.Ingo Schwarze2014-07-121-22/+50
| | | | | While here, also provide an internal mode (MANSEARCH_MAN) to match complete names, to be used by man.cgi(8).
* Merge from OpenBSD - Marc Espie improved the ohash interface:Ingo Schwarze2014-06-201-9/+7
| | | | | | * rename the halloc callback to calloc, provide overflow protection * rename the hfree callback to free, drop the useless size argument * prevent integer overflows in ohash_resize
* Audit malloc(3)/calloc(3)/realloc(3) usage.Ingo Schwarze2014-04-231-3/+3
| | | | | | | * Change eight reallocs to reallocarray to be safe from overflows. * Change one malloc to reallocarray to be safe from overflows. * Change one calloc to reallocarray, no zeroing needed. * Change the order of arguments of three callocs (aesthetical).
* improve SQL style: avoid "SELECT *", be explicit in what columns we want;Ingo Schwarze2014-04-231-4/+6
| | | | suggested by espie@.
* KNF: case (FOO): -> case FOO:, remove /* LINTED */ and /* ARGSUSED */,Ingo Schwarze2014-04-201-12/+12
| | | | | remove trailing whitespace and blanks before tabs, improve some indenting; no functional change
* Garbage collect one pair of needless parentheses in SQL code generation;Ingo Schwarze2014-04-171-7/+7
| | | | | note this doesn't affect performance, SQLite generates the same byte code. While here, make the calls to exprspec() easier to understand.
* Rename the mpages.id column to mpages.pageid. There is no good reasonIngo Schwarze2014-04-161-17/+17
| | | | to call this kid by a different name here than in all other tables.
* Pass the function flags SQLITE_UTF8 (because SQLITE_ANY is deprecated)Ingo Schwarze2014-04-161-2/+4
| | | | | | and SQLITE_DETERMINISTIC when creating deterministic functions; best practice measure suggested by espie@ and jeremy@; as expected by jeremy@, no measurable effect on performance.
* Oops, sorry, revert previous and commit the correct patch:Ingo Schwarze2014-04-151-4/+6
| | | | At the end of mansearch(), fchdir() back to where we started from.
* At the end of mansearch(), fchdir() back to where we started from;Ingo Schwarze2014-04-151-0/+1
| | | | | this is cleaner and helps to not scatter gmon.out files all over the place when profiling.
* Further apropos(1) speed optimization was trickier than anticipated.Ingo Schwarze2014-04-111-0/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Contrary to what i initially thought, almost all time is now spent inside sqlite3(3) routines, and i found no easy way calling less of them. However, sqlite(3) spends substantial time in malloc(3), and even more (twice that) in its immediate malloc wrapper, sqlite3MemMalloc(), keeping track of all individual malloc chunk sizes. Typically about 90% of the malloced memory is used for purposes of the pagecache. By providing an mmap(3) MAP_ANON SQLITE_CONFIG_PAGECACHE, execution time decreases by 20-25% for simple (Nd and/or Nm) queries, 10-20% for non-NAME queries, and even apropos(1) resident memory size as reported by top(1) decreases by 20% for simple and by 60% for non-NAME queries. The new function, mansearch_setup(), spends no measurable time. The pagesize chosen is optimal: * Substantially smaller pages yield no gain at all. * Larger pages provide no additional benefit and just waste memory. The chosen number of pages in the cache is a compromise: * For simple queries, a handful of pages would suffice to get the full speed effect, at an apropos(1) resident memory size of about 2.0 MB. * For non-NAME queries, a large pagecache with 2k pages (2.5 MB) might gain a few more percent in speed, but at the expense of doubling the apropos(1) resident memory size for *all* queries. * The chosen number of 256 pages (330 kB) allows nearly full speed gain for all queries at the price of a 15% resident memory size increase.
* Next speed optimization step for the new apropos(1).Ingo Schwarze2014-04-101-13/+26
| | | | | | | | | | | | | | Split manual names out of the common "keys" table into their own "names" table. This reduces standard apropos(1) search times (i.e. searching for names and descriptions only) by typically about 70% for the full /usr/share/man database. (Yes, that multiplies with the previous optimization step, so both together have reduced search times by a factor of more than six. I'm not done yet, expect more to come.) Even with the minimal databases built with makewhatis(8) -Q, this step still reduces search times by 15-20%. For both cases, database sizes and build times hardly change (+/-2%).
* After careful gprof(1)ing of the new apropos(1), move the descriptionsIngo Schwarze2014-04-091-15/+42
| | | | | | | | | | | | back from the keys table to the mpages table: I found a good way to still use them in searches, without complication of the code. On my notebook, this reduces typical apropos(1) search times by about 40%, it reduces /usr/share/man database size by 6% in makewhatis(8) -Q mode and by 2% in standard mode (less overhead storing pointers to mpages), and it doesn't measurably change database build times (may even be going down by a percent or so because less data is being copied around in ohashes).
* Properly initialize malloc(3)ed memory.Ingo Schwarze2014-03-281-0/+1
| | | | | With this bug fix, partly unitialized memory could sometimes be returned, sometimes causing crashes by bogus free(3)s in apropos(1).
* avoid repetitive code for asprintf error handlingIngo Schwarze2014-03-231-29/+11
|
* The files mandoc.c and mandoc.h contained both specialised low-levelIngo Schwarze2014-03-231-0/+1
| | | | | | | functions used for multiple languages (mdoc, man, roff), for example mandoc_escape(), mandoc_getarg(), mandoc_eos(), and generic auxiliary functions. Split the auxiliaries out into their own file and header. While here, do some #include cleanup.
* in apropos(1) output, sort names and avoid multiple section numbersIngo Schwarze2014-03-171-6/+54
|
* Always compare arch case-insensitively.Ingo Schwarze2014-01-191-0/+2
|
* Get rid of the local keys table, use the new mansearch_const.c.Ingo Schwarze2014-01-191-66/+28
| | | | No functional change.
* Remove the redundant "file" column from the "mlinks" table.Ingo Schwarze2014-01-061-9/+18
| | | | | The contents can easily be reconstructed from sec, arch, name, form. Shrinks the database by another 3% in standard mode and 9% in -Q mode.
* Drop Nd from the mpages table, it is still in the keys table.Ingo Schwarze2014-01-061-6/+2
| | | | | This shrinks the database in standard mode by 3%, in -Q mode by 9%, without loss of functionality.
* Remove the obsolete file name column from the mpages table.Ingo Schwarze2014-01-051-23/+29
| | | | | This column wasn't helpful because one manpage can have multiple MLINKS. Use the file name column in the mlinks table, instead.
* Remove the obsolete sec and arch columns from the mpages table.Ingo Schwarze2014-01-051-3/+3
| | | | | They were confusing because a manpage can have MLINKS in different sections and architectures.
* Reimplement apropos -s NUM -S ARCH EXPR by internally converting it toIngo Schwarze2014-01-051-21/+51
| | | | | | | | | | | | | | | apropos \( EXPR \) -a 'sec~^NUM$' -a 'arch~^(ARCH|any)$' in preparation for removal of sec and arch from the mpage table. Almost no functional change except for the following bonus: This also makes sure that for cross-section and cross-arch MLINKs, all of the following work: apropos -s 1 encrypt apropos -s 8 encrypt apropos -s 1 makekey apropos -s 8 makekey While here, print error messages about invalid regexps to stderr.
* Put section and architecture info into the keys table,Ingo Schwarze2014-01-051-0/+2
| | | | | | | | | | in preparation for removing them from the mpages table, aiming for cleaner and more uniform interfaces. Database growth is below 4%, part of which will be reclaimed. As a bonus, this allows searches like: ./obj/apropos An=kettenis -a arch=ppc ./obj/apropos An=kettenis -a sec~[^4]
* New implementation of complex search criteria using \(, \), -a becauseIngo Schwarze2014-01-041-55/+91
| | | | | | the old implementation got lost in the Berkeley to SQLite switch. Note that this is not just feature creep, but required for upcoming database format cleanup and simplification.
* Experimental feature to let apropos(1) show different keys than .Nd.Ingo Schwarze2013-12-311-4/+62
| | | | | | | | This really takes us beyond what grep -R /usr/*/man/ can do because now you can search for pages by *one* criterion and then display the contents of *another* macro from those pages, like in $ apropos -O Ox Fa~wchar to get an impression how long wide character handling is available.
* Split buildnames() out of mansearch(); the latter function is gettingIngo Schwarze2013-12-311-30/+40
| | | | too long and unwieldy, but will grow more code soon. No functional change.
* Change the mansearch() interface to use the mlinks table in the databaseIngo Schwarze2013-12-271-8/+49
| | | | | | and return a list of names with sections, used by apropos(1) for display. While here, improve uniformity of the interface by allocating the file name dynamically, just like the names list and the description.
* Add an additional mlinks table to the database, redundant for now,Ingo Schwarze2013-12-271-2/+2
| | | | | | both because it contains nothing but a subset of the data of the existing mpages table and because the relationship of mpage and mlink entries is still 1:1. But all that will eventually change.
* Fix another regression introduced when switching from DB to SQLite:Ingo Schwarze2013-10-201-28/+71
| | | | The ~ operator has to do regular expression search, not globbing.
* Fix a regression introduced when switching from DB to SQLite:Ingo Schwarze2013-10-191-5/+33
| | | | The = operator has to do substring search, not word search.
* Some places used PATH_MAX from <limits.h>, some MAXPATHLEN from <sys/param.h>.Ingo Schwarze2013-06-051-7/+6
| | | | | | Consistently use the PATH_MAX since it is specified by POSIX, while MAXPATHLEN is not. In preparation for using this at a few more places.
* Merge whatis.1 into apropos.1 (and remove), add whatis bits to aproposKristaps Dzonsons2012-06-091-16/+24
| | | | (via mansearch), and merge mandocdb.h into mansearch.h (and remove).
* Add a compatibility interface for ohash.Kristaps Dzonsons2012-06-091-0/+4
| | | | | | | | | | This include's espie@'s wholesale src/lib/libc/ohash directory from OpenBSD into compat_ohash.c (with a single copyright/license notice at the top) and src/include/ohash.h as compat_ohash.h. The ohash_int.h part of compat_ohash.c has been changed only in that ohash.h points to compat_ohash.h. Added HAVE_OHASH test (test-ohash.c) to Makefile. In mandocdb.c and mansearch.c, check HAVE_OHASH test for inclusion.