author | Ingo Schwarze <schwarze@openbsd.org> | 2016-08-01 12:27:15 +0000 |
---|---|---|
committer | Ingo Schwarze <schwarze@openbsd.org> | 2016-08-01 12:27:15 +0000 |
commit | 650e61d7ec09f05839e8744e003a7dc376facf73 (patch) | |
tree | c4439cc21465b7d2b4e171f4cc8c7e83aeef1aaf | |
parent | 5ce556afa6b71ea3e8cb0f4bd06ab4cbee499ef9 (diff) | |
document the new file format
-rw-r--r-- | mandoc.db.5 | 230 |
1 files changed, 151 insertions, 79 deletions
diff --git a/mandoc.db.5 b/mandoc.db.5
index d0ccfb6c..2f688193 100644
--- a/mandoc.db.5
+++ b/mandoc.db.5
@@ -1,6 +1,6 @@
 .\" $Id$
 .\"
-.\" Copyright (c) 2014 Ingo Schwarze <schwarze@openbsd.org>
+.\" Copyright (c) 2014, 2016 Ingo Schwarze <schwarze@openbsd.org>
 .\"
 .\" Permission to use, copy, modify, and distribute this software for any
 .\" purpose with or without fee is hereby granted, provided that the above
@@ -23,7 +23,7 @@
 .Sh DESCRIPTION
 The
 .Nm
-SQLite3 file format is used to store information about installed manual
+file format is used to store information about installed manual
 pages to facilitate semantic searching for manuals.
 Each manual page tree contains its own
 .Nm
@@ -34,87 +34,156 @@ for examples.
 Such database files are generated by
 .Xr makewhatis 8
 and used by
+.Xr man 1 ,
 .Xr apropos 1
 and
 .Xr whatis 1 .
 .Pp
-One line in the following tables describes:
-.Bl -tag -width Ds
-.It Sy mpages
-One physical manual page file, no matter how many times and under which
-names it may appear in the file system.
-.It Sy mlinks
-One entry in the file system, no matter which content it points to.
-.It Sy names
-One manual page name, no matter whether it appears in a page header,
-in a NAME or SYNOPSIS section, or as a file name.
-.It Sy keys
-One chunk of text from some macro invocation.
+The file format uses three datatypes:
+.Pp
+.Bl -dash -compact -offset 2n -width 1n
+.It
+32-bit signed integer numbers in big endian (network) byte ordering
+.It
+NUL-terminated strings
+.It
+lists of NUL-terminated strings, terminated by a second NUL character
 .El
 .Pp
-Each record in the latter three tables uses its
-.Va pageid
-column to point to a record in the
-.Sy mpages
-table.
+Numbers are aligned to four-byte boundaries; where they follow
+strings or lists of strings, padding with additional NUL characters
+occurs.
+Some, but not all, numbers point to positions in the file.
+These pointers are measured in bytes, and the first byte of the
+file is considered to be byte 0.
+.Pp
+Each file consists of:
+.Pp
+.Bl -dash -compact -offset 2n -width 1n
+.It
+One magic number, 0x3a7d0cdb.
+.It
+One version number, currently 1.
+.It
+One pointer to the macros table.
+.It
+One pointer to the final magic number.
+.It
+The pages table (variable length).
+.It
+The macros table (variable length).
+.It
+The magic number once again, 0x3a7d0cdb.
+.El
 .Pp
-The other columns are as follows; unless stated otherwise, they are
-of type
-.Vt TEXT .
-.Bl -tag -width mpages.desc
-.It Sy mpages.desc
-The description line
-.Pq Sq \&Nd
-of the page.
-.It Sy mpages.form
-An
-.Vt INTEGER
-bit field.
-If bit
-.Dv FORM_GZ
-is set, the page is compressed and requires
-.Xr gunzip 1
-for display.
-If bit
-.Dv FORM_SRC
-is set, the page is unformatted, that is in
+The pages table contains one entry for each physical manual page
+file, no matter how many hard and soft links it may have in the
+file system.
+The pages table consists of:
+.Pp
+.Bl -dash -compact -offset 2n -width 1n
+.It
+The number of pages in the database.
+.It
+For each page:
+.Bl -dash -compact -offset 2n -width 1n
+.It
+One pointer to the list of names.
+.It
+One pointer to the list of sections.
+.It
+One pointer to the list of architectures
+or 0 if the page is machine-independent.
+.It
+One pointer to the one-line description string.
+.It
+One pointer to the list of filenames.
+.El
+.It
+For each page, the list of names.
+Each name is preceded by a single byte indicating the sources of the name.
+The meaning of the bits is:
+.Bl -dash -compact -offset 2n -width 1n
+.It
+0x10: The name appears in a filename.
+.It
+0x08: The name appears in a header line, i.e. in a .Dt or .TH macro.
+.It
+0x04: The name is the first one in the title line, i.e. it appears
+in the first .Nm macro in the NAME section.
+.It
+0x02: The name appears in any .Nm macro in the NAME section.
+.It
+0x01: The name appears in an .Nm block in the SYNOPSIS section.
+.El
+.It
+For each page, the list of sections.
+Each section is given as a string, not as a number.
+.It
+For each architecture-dependent page, the list of architectures.
+.It
+For each page, the one-line description string taken from the .Nd macro.
+.It
+For each page, the list of filenames relative to the root of the
+respective manpath.
+This list includes hard links, soft links, and links simulated
+with .so
+.Xr roff 7
+requests.
+The first filename is preceded by a single byte
+having the following significance:
+.Bl -dash -compact -offset 2n -width 1n
+.It
+.Dv FORM_SRC No = 0x01 :
+The file format is
 .Xr mdoc 7
 or
-.Xr man 7
-format, and requires
-.Xr mandoc 1
-for display.
-If bit
-.Dv FORM_SRC
-is not set, the page is formatted, i.e. a
-.Sq cat
-page.
-.It Sy mlinks.sec
-The manual section as found in the subdirectory name.
-.It Sy mlinks.arch
-The manual architecture as found in the subdirectory name, or
-.Qq any .
-.It Sy mlinks.name
-The manual name as found in the file name.
-.It Sy names.bits
-An
-.Vt INTEGER
-bit mask telling whether the name came from a header line, from the
-NAME or SYNOPSIS section, or from a file name.
-Bits are defined in
-.In mansearch.h .
-.It Sy names.name
-The name itself.
-.It Sy keys.bits
-An
-.Vt INTEGER
-bit mask telling which semantic contexts the key was found in;
-defined in
-.In mansearch.h ,
-documented in
+.Xr man 7 .
+.It
+.Dv FORM_CAT No = 0x02 :
+The manual page is preformatted.
+.El
+.It
+Zero to three NUL bytes for padding.
+.El
+.Pp
+The macros table consists of:
+.Pp
+.Bl -dash -compact -offset 2n -width 1n
+.It
+The number of different macro keys, currently 36.
+The ordering of macros is defined in
+.In mansearch.h
+and the significance of the macro keys is documented in
 .Xr apropos 1 .
-.It Sy keys.key
-The string found in those contexts.
+.It
+For each macro key, one pointer to the respective macro table.
+.It
+For each macro key, the macro table (variable length).
+.El
+.Pp
+Each macro table consists of:
+.Pp
+.Bl -dash -compact -offset 2n -width 1n
+.It
+The number of entries in the table.
+.It
+For each entry:
+.Bl -dash -compact -offset 2n -width 1n
+.It
+One pointer to the value of the macro key.
+Each value is a string of text taken from some macro invocation.
+.It
+One pointer to the list of pages.
+.El
+.It
+For each entry, the value of the macro key.
+.It
+Zero to three NUL bytes for padding.
+.It
+For each entry, one or more pointers to pages in the pages table,
+pointing to the pointer to the list of names,
+followed by the number 0.
 .El
 .Sh FILES
 .Bl -tag -width /usr/share/man/mandoc.db -compact
@@ -128,10 +197,16 @@ Window System.
 The same for
 .Xr packages 7 .
 .El
+.Pp
+A program to dump
+.Nm
+files in a human-readable format suitable for
+.Xr diff 1
+is provided in the directory
+.Pa /usr/src/regress/usr.bin/mandoc/db/dbm_dump/ .
 .Sh SEE ALSO
 .Xr apropos 1 ,
 .Xr man 1 ,
-.Xr sqlite3 1 ,
 .Xr whatis 1 ,
 .Xr makewhatis 8
 .Sh HISTORY
@@ -140,7 +215,7 @@ A manual page database
 first appeared in
 .Bx 2 .
 The present format first appeared in
-.Ox 5.6 .
+.Ox 6.1 .
 .Sh AUTHORS
 .An -nosplit
 The original version of
@@ -148,9 +223,6 @@ The original version of
 was written by
 .An Bill Joy
 in 1979.
-An SQLite3 version was first implemented by
-.An Kristaps Dzonsons Aq Mt kristaps@bsd.lv
-in 2012.
 The present database format was designed by
 .An Ingo Schwarze Aq Mt schwarze@openbsd.org
-in 2014.
+in 2016.
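
The layout documented in the new DESCRIPTION text can be exercised with a small reader. The following is only a sketch in Python, not the mandoc implementation (which is written in C). It assumes the reading of the manual given here: big-endian signed 32-bit numbers, a header of four numbers, and a pages table starting at byte 16. The function names (`parse_header`, `parse_pages`, `build_sample`) and the sample page are hypothetical. The synthetic file uses an empty macros table for brevity, whereas the manual says real files currently have 36 macro keys.

```python
import struct

MAGIC = 0x3A7D0CDB   # magic number stated in the manual text
VERSION = 1          # "One version number, currently 1."


def read_i32(buf, pos):
    """Read one big-endian signed 32-bit number at byte offset pos."""
    return struct.unpack_from(">i", buf, pos)[0]


def read_str(buf, pos):
    """Read a NUL-terminated string; return (text, offset after the NUL)."""
    end = buf.index(b"\0", pos)
    return buf[pos:end].decode("utf-8", "replace"), end + 1


def read_list(buf, pos):
    """Read NUL-terminated strings until a second NUL ends the list."""
    items = []
    while buf[pos] != 0:
        text, pos = read_str(buf, pos)
        items.append(text)
    return items


def read_names(buf, pos):
    """Read the names list; each name is preceded by a source-bits byte.

    Assumption: the list ends where a NUL appears in place of that byte.
    """
    names = []
    while buf[pos] != 0:
        bits = buf[pos]
        name, pos = read_str(buf, pos + 1)
        names.append((bits, name))
    return names


def parse_header(buf):
    """Check both magic numbers and the version; return the two pointers."""
    magic, version, macros_ptr, end_ptr = struct.unpack_from(">4i", buf, 0)
    if magic != MAGIC or version != VERSION:
        raise ValueError("bad magic or unsupported version")
    if read_i32(buf, end_ptr) != MAGIC:
        raise ValueError("final magic number missing")
    return macros_ptr, end_ptr


def parse_pages(buf):
    """Decode every entry of the pages table, which follows the header."""
    pages = []
    for i in range(read_i32(buf, 16)):
        names, sects, archs, desc, files = struct.unpack_from(
            ">5i", buf, 20 + 20 * i)
        pages.append({
            "names": read_names(buf, names),
            "sects": read_list(buf, sects),
            "archs": read_list(buf, archs) if archs else [],  # 0: any arch
            "desc": read_str(buf, desc)[0],
            "form": buf[files],              # FORM_SRC = 0x01, FORM_CAT = 0x02
            "files": read_list(buf, files + 1),
        })
    return pages


def build_sample():
    """Assemble a synthetic one-page file (hypothetical data, no macros)."""
    names = b"\x1ecat\0\0"            # bits 0x10|0x08|0x04|0x02, then "cat"
    sects = b"1\0\0"
    desc = b"concatenate files\0"
    files = b"\x01man1/cat.1\0\0"     # FORM_SRC byte before the first filename
    base = 16 + 4 + 5 * 4             # header, page count, one page entry
    p_names = base
    p_sects = p_names + len(names)
    p_desc = p_sects + len(sects)
    p_files = p_desc + len(desc)
    body = names + sects + desc + files
    body += b"\0" * (-(base + len(body)) % 4)   # pad to a four-byte boundary
    macros_ptr = base + len(body)
    end_ptr = macros_ptr + 4
    out = struct.pack(">4i", MAGIC, VERSION, macros_ptr, end_ptr)
    out += struct.pack(">6i", 1, p_names, p_sects, 0, p_desc, p_files)
    out += body
    out += struct.pack(">i", 0)       # macro key count; real files use 36
    out += struct.pack(">i", MAGIC)   # the magic number once again
    return out


if __name__ == "__main__":
    sample = build_sample()
    print(parse_header(sample))
    for page in parse_pages(sample):
        print(page)
```

Building the sample in memory keeps the sketch self-contained. For a real database, the dump program mentioned in the FILES section is the authoritative way to inspect the contents.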