SWISH++(4)		  File Formats		       SWISH++(4)



NAME
     SWISH++ index file	format

SYNOPSIS
     long      num_words;
     off_t     word_offset[ num_words ];
     long      num_files;
     off_t     file_offset[ num_files ];
	       word index
	       file index

DESCRIPTION
     The index file format used	by SWISH++  is	as  shown  above.
     Every  word_offset	is an offset into the word index pointing
     at	the first character of a  word	entry;	similarly,  every
     file_offset is an offset into the file index pointing at the
     first character of	a file entry.
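
     The layout above can be sketched in C as follows. This is
     only an illustration, not SWISH++ source: it assumes the four
     header fields are packed back to back with no padding, in the
     native sizes and byte order of the machine that built the
     index. The names index_view, table_get and index_view_init
     are hypothetical.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical view of the header of an index image held in memory. */
struct index_view {
    long        num_words;
    const char *word_offset;  /* start of word_offset[], num_words entries */
    long        num_files;
    const char *file_offset;  /* start of file_offset[], num_files entries */
    const char *entries;      /* first byte after both offset tables */
};

/* Read the i-th off_t of a packed table without assuming alignment. */
static off_t table_get(const char *table, long i)
{
    off_t v;
    memcpy(&v, table + (size_t)i * sizeof v, sizeof v);
    return v;
}

/* Assumption: fields appear in SYNOPSIS order with no padding.
 * Returns 0 on success, -1 if the image is too small for its tables. */
static int index_view_init(struct index_view *iv, const char *base, size_t size)
{
    size_t pos = 0;

    if (size - pos < sizeof(long)) return -1;
    memcpy(&iv->num_words, base + pos, sizeof(long));
    pos += sizeof(long);
    if (iv->num_words < 0 ||
        (size - pos) / sizeof(off_t) < (size_t)iv->num_words) return -1;
    iv->word_offset = base + pos;
    pos += (size_t)iv->num_words * sizeof(off_t);

    if (size - pos < sizeof(long)) return -1;
    memcpy(&iv->num_files, base + pos, sizeof(long));
    pos += sizeof(long);
    if (iv->num_files < 0 ||
        (size - pos) / sizeof(off_t) < (size_t)iv->num_files) return -1;
    iv->file_offset = base + pos;
    pos += (size_t)iv->num_files * sizeof(off_t);

    iv->entries = base + pos;
    return 0;
}
```

     The sketch only locates the tables. What each offset is
     measured from is left to the caller, since that detail should
     be checked against the program that wrote the index.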

     The index file is laid out this way so that it can be mapped
     into memory via the mmap(2) Unix system call, enabling
     ``instantaneous'' access.
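
     A minimal sketch of mapping an index file read-only with
     mmap(2). The function names map_index and unmap_index are
     hypothetical, and the choice of MAP_SHARED is an assumption;
     this is not necessarily how the search program does it.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a whole file read-only. Returns NULL on failure or if the file
 * is empty; on success stores the mapped length in *size_out. */
static const char *map_index(const char *path, size_t *size_out)
{
    struct stat st;
    void *p;
    int fd = open(path, O_RDONLY);

    if (fd == -1) return NULL;
    if (fstat(fd, &st) == -1 || st.st_size == 0) {
        close(fd);
        return NULL;
    }
    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping remains valid after the descriptor is closed */
    if (p == MAP_FAILED) return NULL;
    *size_out = (size_t)st.st_size;
    return p;
}

/* Release a mapping obtained from map_index(). */
static void unmap_index(const char *base, size_t size)
{
    munmap((void *)base, size);
}
```

     Because nothing is read until a page is touched, a lookup
     pays only for the parts of the index it actually visits.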

  Word Entries
     Every word	entry in the word index	is of the form:

	  word0{file-index rank	}...0

     that is: a null-terminated word followed by one or more
     file-index/rank integer pairs followed by a null byte. The
     file-index/rank integers are in ASCII, not binary, since
     this typically takes less space than fixed-width binary
     integers. Every integer is followed by a single space
     character (ASCII 32 decimal). The file-index is an index
     into the file_offset table.
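
     A word entry can be decoded along these lines. This is a
     sketch that follows the description above; struct posting and
     parse_word_entry are hypothetical names, and the code trusts
     that the entry is properly terminated.

```c
#include <stdlib.h>
#include <string.h>

/* One file-index/rank pair (hypothetical type for illustration). */
struct posting {
    long file_index;
    long rank;
};

/* Decode the word entry starting at `entry`.
 * *word_out is set to the null-terminated word. Up to `max` pairs are
 * stored in `out`. Returns the number of pairs present in the entry,
 * or -1 if an integer is missing or not followed by a single space.
 * No bounds checking: the entry must end with its closing null byte. */
static long parse_word_entry(const char *entry, const char **word_out,
                             struct posting *out, long max)
{
    const char *p = entry + strlen(entry) + 1;  /* skip word and its null */
    long n = 0;

    *word_out = entry;
    while (*p != '\0') {
        char *end;
        long file_index, rank;

        file_index = strtol(p, &end, 10);
        if (end == p || *end != ' ') return -1;
        p = end + 1;

        rank = strtol(p, &end, 10);
        if (end == p || *end != ' ') return -1;
        p = end + 1;

        if (n < max) {
            out[n].file_index = file_index;
            out[n].rank = rank;
        }
        n++;
    }
    return n;
}
```

     Each file_index returned can then be used to select an
     element of the file_offset table.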

  File Entries
     Every file	entry in the file index	is of the form:

	  path-name file-size file-title0

     that is: the pathname for a file relative to where the
     indexing was performed (unless absolute paths were used)
     followed by the file's size in bytes followed by the
     file's title followed by a null byte. All the information
     is in ASCII.
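
     A file entry can be split into its three fields roughly as
     follows. The sketch assumes that the fields are separated by
     single spaces, as the form above suggests, and that the
     pathname itself contains no space; neither is stated
     explicitly here. struct file_entry and parse_file_entry are
     hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* Fields of one file entry (hypothetical type for illustration). */
struct file_entry {
    const char *path;      /* NOT null-terminated: path_len bytes long */
    size_t      path_len;
    long        size;      /* file size in bytes */
    const char *title;     /* null-terminated, may contain spaces */
};

/* Split "path-name file-size file-title\0" into its fields.
 * Assumption: single-space separators and no space in the pathname.
 * Returns 0 on success, -1 if the entry does not match that shape. */
static int parse_file_entry(const char *entry, struct file_entry *fe)
{
    const char *sp = strchr(entry, ' ');
    char *end;
    long size;

    if (sp == NULL || sp == entry) return -1;     /* missing pathname */
    size = strtol(sp + 1, &end, 10);
    if (end == sp + 1 || *end != ' ' || size < 0) return -1;

    fe->path = entry;
    fe->path_len = (size_t)(sp - entry);
    fe->size = size;
    fe->title = end + 1;   /* everything up to the closing null byte */
    return 0;
}
```

     The pathname is returned as a pointer and length so that the
     entry can be read in place without being copied or modified.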

     For an HTML file, the title is the text between the <TITLE>
     and </TITLE> tags. If a file is not an HTML file, or is one
     but does not have a title, the title is simply the file
     (not path) name.
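
     The fallback title described above amounts to taking the
     last component of the pathname. A minimal sketch, assuming
     `/' is the path separator; default_title is a hypothetical
     name:

```c
#include <string.h>

/* Return the file (not path) name: the part after the last '/'.
 * The result points into `path`; nothing is copied. */
static const char *default_title(const char *path)
{
    const char *slash = strrchr(path, '/');
    return slash != NULL ? slash + 1 : path;
}
```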

CAVEAT
     Generated index files are machine-dependent (size of data
     types and byte-order).
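
     The properties in question can be inspected with a few lines
     of C. This sketch only reports them for the machine it runs
     on; it is not a SWISH++ facility, and the function names are
     hypothetical.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Return 1 if the least significant byte of a long is stored first. */
static int is_little_endian(void)
{
    unsigned long one = 1;
    unsigned char first;
    memcpy(&first, &one, 1);
    return first == 1;
}

/* Write a short description of the properties an index depends on. */
static void describe_platform(char *buf, size_t n)
{
    snprintf(buf, n, "long=%lu off_t=%lu %s-endian",
             (unsigned long)sizeof(long),
             (unsigned long)sizeof(off_t),
             is_little_endian() ? "little" : "big");
}
```

     Two machines that report different values here should not be
     expected to share an index file.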

SEE ALSO
     index(1), search(1)

AUTHOR
     Paul J. Lucas <pjl@best.com>
SWISH++		 Last change: February 27, 1998



