.	\" $Id: http-analyze.man,v 2.15 1999/11/01 18:31:46 stefan Exp $
.	\"
.	\" manpage for http-analyze
.	\" Copyright  1996-1999 by Stefan Stapelberg/RENT-A-GURU, <stefan@rent-a-guru.de>
.	\"
.if n \{\
.	nr LL 78n
.	nr )O 0
.	po \n()Ou
.\}
.de (E
.if t \{\
.fp 9 SC
.fp 10 SI
.fp 11 SB
.fp 12 SX
.ev 1
.fp 9 SC
.fp 10 SI
.fp 11 SB
.fp 12 SX
.ev
.sp .3v
.\}
.RS 10
.ta 17n 25n
\.ta 17n 25n 33n 41n 49n
.ft SC
.ps -2p
.vs -2p
.nf
..
.de CW
.ft 9
.it 1 }N
.if \\n(.$ \&\\$1 \\$2 \\$3 \\$4 \\$5 \\$6
..
.de CR
.}S 9 1 \& "\\$1" "\\$2" "\\$3" "\\$4" "\\$5" "\\$6"
..
.de )E
'br
'fi
'vs
'ps
.ft 1
.RE
..
.de NE
.br
.ie !"\\$1"" .ne \\$1
.el .ne 20v
..
.de Ex
.if !"\\$4"1" \{\
\&Example:
.(E
.\}
\&\\$1\t\\$2\t\\$3
.if "\\$4"" .)E
..
.de (P
.sp .7v
.RS \\$1
.ft SC
.ps -2p
.vs -2p
.nf
..
.de )P
'br
'fi
'vs
'ps
.ft 1
.RE
.if !"\\$1"0" \{\
.	sp\\n(PDu
.	ne1.1v
.	}E
.\}
..
.TH http-analyze 1 "" "Version 2.4"
.SH NAME
.B http-analyze
\- a fast log analyzer for web servers
.SH SYNOPSIS
.B http-analyze
.RB [\| \-{hdmBVX} \|]
.RB [\| \-3aefgnqvxyM \|]
[\|\f3\-b\fP \f2bufsize\fP\|]
[\|\f3\-c\fP \f2cfgfile\fP\|]
.br
.ie t .ti +\w'\f3http-analyze\fP\ 'u
.el .ti +2n
[\|\f3\-i\fP \f2newcfg\fP\|]
[\|\f3\-l\fP \f2libdir\fP\|]
[\|\f3\-o\fP \f2outdir\fP\|]
[\|\f3\-p\fP \f2prvdir\fP\|]
[\|\f3\-s\fP \f2subopt,...\fP\|]
.br
.ie t .ti +\w'\f3http-analyze\fP\ 'u
.el .ti +2n
[\|\f3\-t\fP \f2num,...\fP\|]
[\|\f3\-u\fP \f2time\fP\|]
[\|\f3\-w\fP \f2hits\fP\|]
[\|\f3\-F\fP \f2logfmt\fP\|]
[\|\f3\-L\fP \f2lang\fP\|]
[\|\f3\-C\fP \f2chrset\fP\|]
.br
.ie t .ti +\w'\f3http-analyze\fP\ 'u
.el .ti +2n
[\|\f3\-I\fP \f2date\fP\|]
[\|\f3\-E\fP \f2date\fP\|]
[\|\f3\-G\fP \f2suffix,...\fP\|]
[\|\f3\-H\fP \f2idxfile,...\fP\|]
[\|\f3\-O\fP \f2vname,...\fP\|]
.br
.ie t .ti +\w'\f3http-analyze\fP\ 'u
.el .ti +2n
[\|\f3\-P\fP \f2prolog\fP\|]
[\|\f3\-R\fP \f2docroot\fP\|]
[\|\f3\-S\fP \f2srvname\fP\|]
[\|\f3\-T\fP \f2TLDfile\fP\|]
[\|\f3\-U\fP \f2srvurl\fP\|]
.br
.ie t .ti +\w'\f3http-analyze\fP\ 'u
.el .ti +2n
[\|\f3\-W\fP \f2\&3Dwin\fP\|]
[\|\f3\-Z\fP \f2showdom\fP\|]
.RI [\| logfile \|[...]]
.SH DESCRIPTION
.B http-analyze
analyzes the logfile of a web server and creates a detailed summary of the
server's access load in graphical, tabular, and three-dimensional form.
The analyzer does this by
.PD 0
.RS
.IP \(bu 3 0
reading all logfiles specified on the command line,
.IP \(bu 3 0
saving all unique (different) URLs, hostnames, referrer URLs and user agents,
.IP \(bu 3 0
accounting for hits (successful requests), files sent, files cached,
data sent, etc.,
.IP \(bu 3 0
and finally creating a statistics report for the period detected in the
logfile(s).
.RE
.PD
.P
The resulting statistics report is a comprehensive view of the server's logfile.
The server writes a logfile entry for every response on behalf of a request
from a browser or a forwarding system such as a proxy server.
To understand the meaning of the terms in the report, you need a
little knowledge about the type of data your web server records
in its logfile.
.SS "LOGFILE FORMATS"
.P
.B "NCSA Common Logfile Format (CLF)"
.P
The basic logfile format supported by almost all servers is the
.IR "NCSA Common Logfile Format" .
It \%contains the following information for each request (hit):
.(P
dns-name - auth-user [date] "clf-request" clf-status ct-length
.)P
where the fields have the following meaning:
.TP 14
\f(SCdns-name\f1
The IP number of the system accessing the web server.
If there is an entry in the
.I "Domain Name System (DNS)"
for this IP number and the web server is configured to do DNS lookups,
the corresponding hostname is logged instead.
.TP 14
\f(SC\-\f1
Unused.
.TP 14
\f(SCauth-user\f1
The username provided by the client if authentication was required.
.TP 14
\f(SC[date]\f1
The date of the access in the format \s-1\f(SC[DD/MMM/YYYY:HH:MM:SS\ \(+-ZZZZ]\f1\s0.
.TP 14
\f(SC"clf-request"\f1
The request in the format \s-1\f(SC"method\ URI\ proto"\f1\s0, where
.I method
is one of
.BR GET ,
.BR HEAD ,
.BR POST ,
.BR PUT ,
.BR BROWSE ,
.BR OPTIONS ,
.BR DELETE " or"
.BR TRACE ;
.I URI
is the
.IR "Uniform Resource Identifier" ", and"
.I proto
is the HTTP version number.
.TP 14
\f(SCclf-status\f1
The (numerical) response code from the server.
.TP 14
\f(SCct-length\f1
This is either the size of the document or the data
actually sent over the wire.
.sp 1v
.P
The following is an example of an entry in
.IR "NCSA Common Logfile Format" :
.(P
car.4rent.de - - [01/Aug/1999:00:00:02 +0100] "GET /doc.html HTTP/1.1" 200 393
.)P
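.P
The field layout can be illustrated with a short Python sketch.
It is not part of
.BR http-analyze ;
the function name
.I parse_clf
and the keys of the returned dictionary are hypothetical, and the code
assumes a well-formed entry such as the example above:
.(P
def parse_clf(line):
    """Split one well-formed CLF entry into its fields."""
    host, unused, user, rest = line.rstrip().split(" ", 3)
    date, rest = rest.split('] "', 1)        # date keeps its leading "["
    request, rest = rest.rsplit('" ', 1)     # cut at the closing quote
    status, length = rest.split()
    method, uri, proto = request.split(" ", 2)
    return {
        "dns-name": host,
        "auth-user": user,
        "date": date.lstrip("["),
        "method": method,
        "uri": uri,
        "proto": proto,
        "clf-status": int(status),
        # assumption: a "-" in the length field is treated as 0 bytes
        "ct-length": 0 if length == "-" else int(length),
    }
.)P
.P
Applied to the example entry, the sketch returns
\f(SCcar.4rent.de\f1 as the host, 200 as the status and 393 as the length.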
.NE 10v
.sp 1v
.B "W3C Extended Logfile Format (ELF)"
.P
The
.I "W3C Extended Logfile Format (ELF)"
is basically
.I "NCSA CLF"
plus user-agent and referrer URL information.
.B http-analyze
supports two variants of this extended format:
.IR DLF " and " ELF .
.P
The
.I DLF
format adds the referrer URL and the user-agent in this order
with or without surrounding double quotes:
.(P
CLF "referrer_URL" "user_agent"
CLF referrer_URL user_agent
.)P 0
.P
This is an example of an entry in
.I DLF
format (wrapped on two lines for readability):
.(P
car.4rent.de - - [01/Aug/1999:00:00:02 +0100] "GET /doc.html HTTP/1.1" 200 393
"http://inet-tv.net/hot.html" "Mozilla/4.05 (X11; I; IRIX64 6.4 IP30)"
.)P
.P
The
.I ELF
format also adds the referrer URL and the user-agent,
but in the opposite order and without the double quotes:
.(P
CLF user_agent referrer_URL
.)P
.P
This is an example of an entry in
.I ELF
format (wrapped on two lines for readability):
.(P
car.4rent.de - - [01/Aug/1999:00:00:02 +0100] "GET /doc.html HTTP/1.1" 200 393
Mozilla/4.05 (X11; I; IRIX64 6.4 IP30) http://inet-tv.net/index.html
.)P
.P
The
.I ELF
variant is the preferred method to pass referrer URL and user-agent
information.  When this format is used,
.B http-analyze
searches backwards for the protocol specification of the referrer URL
(to be precise, it looks for the colon in
.BR http: )
and then for the preceding blank.
This ensures that broken referrer URLs which contain
blanks or double quotes are handled correctly.
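.P
The backward search can be sketched in Python as follows.
This is only an illustration of the idea described above, not the code
used by
.BR http-analyze ;
the function name
.I split_elf_tail
is hypothetical, and the fallback for entries without an
.B http:
referrer is an assumption:
.(P
def split_elf_tail(tail):
    """Split "user_agent referrer_URL" into its two parts."""
    proto = tail.rfind("http:")             # last protocol specification
    if proto < 0:
        # assumption: without a protocol, the last blank separates
        # the user agent from an empty ("-") referrer field
        agent, _, referrer = tail.rpartition(" ")
        return agent, referrer
    blank = tail.rfind(" ", 0, proto)       # blank preceding the URL
    if blank < 0:
        return "", tail                     # no user agent present
    return tail[:blank], tail[blank + 1:]
.)P
.P
Because the search starts at the end of the entry, blanks inside the
user agent do not disturb the split, and blanks following the protocol
specification stay part of the referrer URL.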
.P
To select either logfile format, edit the configuration file of your
web server and define the fields to be logged. See the web server's
documentation for information on how to customize logging.
.sp 1v
.P
.B "Automatic detection of the logfile format"
.P
.B http-analyze
tries to automatically detect the correct logfile format by analyzing the
first few entries of a logfile (this works only if your server records a
hyphen (`\-') for empty referrer URL or user-agent fields).
If
.B http-analyze
detects referrer URL and user-agent information, it assumes the
.I ELF
variant of the
.IR "W3C Extended Logfile Format" .
To process the
.I DLF
variant, specify the logfile format explicitly using the option
.BR \-F .
.sp 1v
.B "Logfile data used by http-analyze"
.P
The statistics report shows a summary of the information which has
been recorded into the logfile by the web server.
For each logfile entry
.B http-analyze
processes the origin (sitename) and date of the request, the request method,
the URL of the requested object, the server's response on behalf of the
request, the size of the requested object and \%optionally the user-agent
and the referrer URL if sent by the client.
.P
Note that
.B http-analyze
does not recognize visitors, email addresses of users visiting your server,
the path a user took through your web site, the last page visited by a user
before leaving your site nor anything else not recorded in the server's logfile.
Although hostnames are recorded for each request, they do not necessarily
correspond to the real system actually used by a visitor \- the request
could be forwarded through a dialup service for example.
Furthermore, no request may get logged by your server at all while someone
is surfing through cached copies of parts of your site depending on the
configuration of his/her browser ...
.bp
.SS "BASIC OPERATION"
.P
By default,
.B http-analyze
creates a
.I "full statistics report"
for a whole month, which contains \%complete details for the
period determined by the timestamps of the first and last
logfile entry processed.
It is therefore extremely important to always feed all logfiles
for a whole month into
.BR http-analyze ,
no matter how frequently you rotate (save) the logfiles.
.P
The recommended way of providing an up-to-date statistics report
for a web server is to have a script running
.B http-analyze
automatically on a regular basis, say twice per day, and have it
process the current logfile of the web server from the beginning
of the current month until today.
At the first of a new month, the logfile should be saved elsewhere
and the web server should be \%restarted to create a new logfile for
the new month. Then run
.B http-analyze
on the old (saved) logfile to create a final statistics report for
the previous month.
A history file is used to produce a summary for the last 12\ months
on the main page of the statistics report without having to analyze
logfiles for those older periods again.
.P
If you rotate the logfiles more often so that you can compress them
(for example, once per day), you must uncompress and concatenate all
separate logfiles for the whole month into one chronologically ordered
data stream, which can then be processed by
.BR http-analyze .
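.P
One way to build such a data stream is sketched below in Python.
The helper
.I merge_logs
and any file names passed to it are hypothetical; the sketch assumes
that the rotated logfiles are given in chronological order and that
compressed files carry the suffix
.BR .gz :
.(P
import gzip
import shutil

def merge_logs(paths, out_path):
    """Concatenate logfiles, in the order given, into one file."""
    with open(out_path, "wb") as out:
        for path in paths:
            # gzip-compressed files are uncompressed on the fly
            opener = gzip.open if path.endswith(".gz") else open
            with opener(path, "rb") as src:
                shutil.copyfileobj(src, out)
.)P
.P
The merged file can then be passed to
.B http-analyze
as the
.I logfile
argument.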
.NE 10v
.sp 1v
.B "Full statistics report"
.P
For technical reasons, a full statistics report will not be created
before the second day of a new month, although the totals for the first
day of the new month on the summary main page of the report will be updated.
A full statistics report contains a detailed summary including the \%following
items (see the section
.I "Interpretation of the results"
for an explanation of the terms):
.RS 2
.IP \(bu 3
the number of hits, files sent/cached, pageviews, sessions and
the amount of data sent
.PD 0
.IP \(bu 3 "" 0
the total amount of data requested, transferred, and saved by
caching mechanisms
.IP \(bu 3 "" 0
the total number of unique URLs, sites, sessions, browser types
and referrer URLs
.IP \(bu 3 "" 0
the total number of all response codes other than Code 200 (\f2OK\fP)
.IP \(bu 3 "" 0
the total number of requests which required authentication
.IP \(bu 3 0
the average load per week, day, hour, minute and second
.IP \(bu 3 0
the Top 7 days, 24 hours, 5 minutes and 5 seconds
.IP \(bu 3 0
the Top 30 most commonly accessed URLs (hits, files, pageviews,
sessions, data sent)
.IP \(bu 3 0
the 10 least frequently accessed URLs (hits, files, pageviews,
sessions, data sent)
.IP \(bu 3 0
the Top 30 client domains, browser types, and referrer hosts
.IP \(bu 3 0
an overview and a detailed list of all files, sitenames,
browser types and referrer URLs
.IP \(bu 3 0
a list of all Code 404 (\f2Not Found\fP) responses
.PD
.RE
.sp 1v
.P
.B "Short statistics report"
.P
Since analyzing the complete logfile for a whole month increases
processing time on heavily accessed web servers, you can instruct
.B http-analyze
to create a
.I "short statistics report"
for the current day only.  In this mode,
.B http-analyze
updates only the daily totals for the current month in the
.I "Hits by Day"
section of the report and saves the results in a history file.
If the analyzer is then run a second time to update the
.IR "short statistics report" ,
it skips all logfile entries from the beginning of the month
until it detects any entries for the current day, which are
then processed to produce an up-to-date
.I "Hits by Day"
section in the statistics report.
.P
In
.IR "short statistics mode" ,
.B http-analyze
needs only a fraction of processing time required for a
.IR "full statistics report" ,
but it updates only a very small part of the statistics report
so that this should be considered an additional feature rather
than a replacement for the
.IR "full statistics mode" .
The recommended way for using this feature is to have
.B http-analyze
generate a
.I "full statistics report"
once per day or week, while generating an up-to-date
.I "short statistics report"
as often as once per hour or day.
.NE 10v
.SS "USER INTERFACES"
.P
Two user interfaces exist for access to the statistics report:
a conventional interface suitable for any browser and a
frames-based interface which requires JavaScript.
.sp .7v
.B "The conventional interface"
.P
The conventional interface appears as in version\ 1.9 if
JavaScript is disabled in your browser or the option
.B \-g
was specified at invocation of
.BR http-analyze .
If JavaScript is enabled, the following separate windows are used
for different parts of the report to allow for easy navigation:
.TP
.I "The Main window"
This window is used for most parts of the report such as the
yearly, monthly, daily and weekly summaries, the
.I "Top N"
lists and the overviews.
Hotlinks in the
.I "Top N"
lists most often point to the corresponding page,
which is then displayed in the
.I "Viewer window"
if the link is followed, while hotlinks in the overviews point
to the detailed lists, which will show up in the
.IR "List window" .
.TP
.I "The Navigation window"
If JavaScript is enabled in your browser and a summary for a year or
a month is loaded into the main window, a small window containing a
navigation panel will pop up.
If JavaScript is disabled, the navigation links appear at the
bottom of the monthly summary pages.
In the latter case, use the
.I Back
button of your browser for navigation.
.TP
.I "The List window"
This window is used for the detailed lists of URLs, sites, browser types
and referrer URLs.
A separate window for those (often large) lists causes them to be
loaded only once if you follow any link in the
.I "Main window"
while the
.I "List window"
is still open.
.TP
.I "The Viewer window"
This window is used for external pages which are loaded by following
the hotlinks in the statistics report. This way, you can visit the pages
referred to in the report without \%having to go back and forth between
the report itself and the pages listed there.
.TP
.I "The 3D window"
This window is used for the 3D (VRML) model of the statistics.
If you have JavaScript enabled, the window's size will be set to
the smallest possible size so that the 3D model fits onto the screen
or to the dimensions specified with the
.B 3DWinSize
directive.
.P
.ie "\*(.T"nps" \{
.PI ../Snapshot/gui01.eps 5.5i 0 wz "\f2Conventional Interface (JavaScript-enabled)\fP"
.\}
.el .bp
.NE 10v
.P
.B "The frames-based interface"
.P
The frames-based interface requires a JavaScript-enabled browser.
It contains the following frames and windows:
.TP
.I "The Navigation frame"
This frame contains navigation buttons and text.
You can specify its width using the
.B NavigFrame
directive in the configuration file.
.TP
.I "The Main frame"
This frame is used for most parts of the report such as the yearly,
monthly, daily and weekly summaries, the
.I "Top N"
lists and the overviews.
Hotlinks in the
.I "Top N"
lists point most often to the corresponding page,
which is displayed in the
.I "Viewer window"
if the link is followed, while hotlinks in the overviews
point to the detailed lists, which show up in the
.IR "List window" .
.TP
.I "The List window"
This window is used for the detailed lists of URLs, sites,
browser types and referrer URLs.
A separate window for those (often large) lists causes them to be
loaded only once if the links in the
.I "Main window"
are followed and the
.I "List window"
is still open.
.TP
.I "The Viewer window"
This (separate) window is used for external pages which are loaded by
following the hotlinks in the statistics report. This way, you can visit
the pages referred to in the report without \%having to go back and forth
between the report and the pages listed there.
.TP
.I "The 3D window"
This window is used for the 3D (VRML) model of the statistics.
Depending on the \%setting of the
.B 3DWindow
directive in the configuration file, this is either a separate
window (\f2external\fP) or a new frame (\f2internal\fP) inside the
.I "Main frame"
(actually, two frames are created which replace the former
.I "Main frame"
when the 3D model is being displayed).
In case of a separate (external)
.IR "3D window" ,
you can specify its \%dimensions using the
.B 3DWinSize
directive.
.if "\*(.T"nps" \{
.sp 1v
.P
.PI ../Snapshot/gui02.eps 5.5i 0 wz "\f2Frames-based interface\fP"
.\}
.NE 10v
.P
.B "The 3D model"
.P
The 3D model requires a VRML 2.0 plug-in such as CosmoPlayer
from Cosmo Software (\f(SChttp://cosmosoftware.com/\f1).
Using this plug-in, which is available for Silicon Graphics,
Windows and Macintosh platforms, you can \(d>walk\(d< or \(d>fly\(d<
through the model and view the scene from all sides.
If you look at the models, don't forget to touch the buddha appearing in
our 3D logo on top of the statistics report in the yearly summary pages!
.P
The 3D model contains two
.I scenes
(models): one shows the hits, files, cached files, sites and the amount
of data sent by day and the other one shows the server's access load by
weekday and hour.  To view the second scene click on the
.I "scene switch"
at the top right of the model.  To navigate through the 3D space, use the
pre-defined
.I Viewpoints
(camera positions) and CosmoPlayer's
.IR "Navigation panel" .
For customization use the CosmoPlayer pop-up menu.
.P
.if "\*(.T"nps" \{
.PI ../Snapshot/gui03.eps 5.5i 0 wz "\f2The 3D model (first scene)\fP"
.\}
.NE 7v
.P
The 3D representation of hits by weekday and hour in the second scene
allows easy identification of the time your server has been most busy
serving requests.
.if "\*(.T"nps" \{\
In the figure below, most hits occurred on Friday between 16:00 and 17:00.
.PI ../Snapshot/gui04.eps 5.5i 0 wz "\f2The 3D model (second scene)\fP"
.\}
.NE 10v
.SS "INTERPRETATION OF THE RESULTS"
.B http-analyze
creates a summary of the information found in the server's logfile.
The analyzer counts the requests, saves the unique URLs, sitenames,
browser types and referrer-URLs and creates a comprehensive
statistics report.  The following terms are used in this report:
.TP
.B Hits
(color: green) A hit is any response from the web server on behalf
of a request sent by a browser, such as text (HTML) files, images,
applets, audio/movie clips and even error messages.
For example, if a page is requested which contains two inline images,
the server would generate three hits: one hit for the HTML page itself
and two hits for the images.
If an invalid URL is requested, the server would respond with a Code 404
(\f2Not Found\fP) status code, which is also a response accounted for as
a hit.
.TP
.B Files
(color: blue) If the server sends back a file for this request,
this is accounted for as a Code 200 (\f2OK\fP) response.
Such a response is classified as a
.IR "file sent" .
Again, file here means any kind of a file, no matter whether it contains
text (HTML documents) or binary data (images, applets, movies, etc.).
Note that if you configured the web server to log only accesses
to HTML files, but not images or other binary data, the number
of files would directly correspond to the number of documents served.
.TP
.B "Cached"
(color: yellow) A
.I "cached file"
is a Code 304 (\f2Not Modified\fP) response.
This response is generated by the server if a document hasn't changed
since the last time it was transferred to the site requesting it.
If the browser has access to a local copy of a document requested by
a user \- either through its local disk cache or through a caching
server \-, it sends out a
.IR "conditional request" ,
which asks for the document to be sent only if it has been changed
since it was requested the last time.
If the document hasn't been changed since then, the server sends back a
.I "Code 304"
response to inform the browser that it can use its local copy.
.sp .7v
While this caching mechanism can significantly reduce network traffic,
it causes an inaccuracy in the statistics report regarding the number
of times a file is requested, for two reasons:
First, the browser can be configured to send conditional requests
.IR "every time" ,
.IR "once per session" " or " never
if a cached file is requested.
Second, online services, ISPs, companies and many other organizations
use so-called caching servers or proxies, which themselves fulfill requests
if the file is found in the cache.
Since proxies can serve hundreds to thousands of users, requests from
certain sites could be caused by thousands of users requesting a cached
file or by just one person with his/her browser configured to not cache
anything at all.
.sp .7v
The ratio between
.IR "files sent" " and " "cached files"
therefore reflects the efficiency of caching mechanisms \- but only
for those requests which were handled by your web server.
.TP
.B "Pageviews"
(color: magenta) The
.I pageview
mechanism can be used to separate requests for text or HTML files from
all other types of requests.  If a filename pattern has been defined,
.B http-analyze
classifies all URLs matching this pattern as pageviews (text files),
which allows you to estimate the number of \(d>real\(d< text documents
transmitted by your web server.
Filename patterns may be defined using the option
.B \-G
or the
.B PageView
directive in the configuration file.  The suffix
.B \.html
is pre-defined already.
.TP
.B "KBytes transferred"
(color: orange) This is the amount of data sent during the whole
summary period as reported by the server. Note that some servers
record the size of a document instead of the actual number of bytes
transferred.
While in most cases this is the same, if a user interrupts the
transmission by pressing the browser's stop button before the
page has been received completely, some servers (for example all
Netscape web servers) log the size of the file instead of the amount
of data actually transmitted.
.TP
.B "KBytes requested"
This is the amount of data requested during the whole summary period.
.B http-analyze
computes this number by summing up the values of
.IR "KBytes transferred" " and " "KBytes saved by cache"
(see below).
.TP
.B "KBytes saved by cache"
The amount of data saved by various caching mechanisms.
This value is computed by multiplying the number of
.I "cached files (Code 304)"
responses with the size of the corresponding file.  Because
.B http-analyze
can determine the size of a file only if the file has been transmitted
at least once in the same summary period, the values for
.IR "KBytes saved by cache" " and " "KBytes requested"
are just approximations of the real values.
.TP
.B "Unique URLs"
The total number of
.I "unique URLs"
is the sum of all different URLs (files) on your web server,
which have been requested at least once in the corresponding
summary period.
.TP
.B "Referrer URLs"
If a user follows a link to your web site and his/her browser sends the URL
of the page containing the link to the server, this URL is logged as the
.I "referrer URL"
(the location referring to your document).
Note that the browser does not necessarily send a referrer URL and
even if it does, a proxy server may alter or delete it before
forwarding the request to a web server.
Such requests appear under
.I Unknown
in the referrer URL list.
.TP
.B "Self-referrer URLs"
As soon as the browser detects any inline objects (images, applets, etc.)
in a page just loaded, it sends out separate requests for those objects.
If the objects reside on the same server as the page referring to them,
the corresponding referrer URLs contain the URL of the page on your server.
Such requests are called
.IR "self-referrer URLs" .
If configured correctly,
.B http-analyze
separates all self-referrer URLs from the rest of the referrer URLs
in the report.
This makes it possible to separate accesses caused by inline objects
in a text page from the remaining (external) accesses.
.TP
.B "Unique sites"
This is the number of all different hostnames or IP addresses found
in the logfile.
Each different hostname is counted only once per period, so this
number shows how many systems sent requests to your server.
.NE 5v
.TP
.B "Sessions"
(color: red) Similar to unique sites, this is the number of
different hostnames or IP addresses accessing the server during
a certain
.IR time-window ,
which defaults to one day for backward compatibility.
Accesses from a known hostname outside this time-\%window get accounted
for as a new
.IR session .
You can increase or decrease the time-\%window for sessions using the option
.BR \-u " or the " Session
directive in the configuration file.
For example, if you set the time-window to 2\ hours, all accesses from
the same host in less than 2\ hours are accounted for as the same session,
while any access more than 2\ hours apart from the first one is accounted
for as a new session.
.TP
.B "Request Method"
The browser uses a certain method to request a document from a web server.
For example, documents, images, applets, etc. are usually requested using the
.B GET
method.
Other often used methods are the
.B HEAD
method to request more information about a document such as its size
without having the server send its actual content, and the
.B POST
method, a special way to transfer user input from forms into CGI scripts.
.sp .7v
Although all logfile entries with a valid request method are accounted for
as hits, only URLs requested using either the
.BR GET " or the " POST
method are processed further.
The remaining hits are summarized under
.IR "Request Methods other than GET/POST" .
.TP
.B "Response Codes"
In reply to a request from a browser, the server sends back a status code
such as a Code 200 (\f2OK\fP) or Code 404 (\f2Not Found\fP) response.
Similar to the request methods, the analyzer will count any valid
response code as a hit, but it will only process those URLs which
caused a Code 200 (\f2OK\fP), Code 304 (\f2Not Modified\fP), or Code 404
(\f2Not Found\fP) response from the server.
All other responses are summarized in the monthly summary page under
.IR "Other Response Codes" .
See the current HTTP specification at \f(SChttp://www.w3.org/\f1
for information about all valid response codes and their \%meaning.
.B http-analyze
recognizes HTTP/1.1 responses according to RFC\|2616.
.TP
.B Unresolved
A system identifies itself to a web server using an
.IR "IP number" .
Depending on the configuration, the web server might perform a
DNS lookup to resolve the IP number into a hostname.
If no hostname has been assigned to this IP number, only the
IP number is logged.
Such requests are accounted for under
.I Unresolved
in the country list of the statistics report.
Since some systems intentionally have no hostname, a percentage
of up to 35% for unresolved IP numbers is absolutely normal.
.sp .7v
If the country list shows only 100% unresolved IP numbers,
either enable the DNS lookup in your web server or have a DNS resolver
utility preprocess the logfile before feeding the data into
.BR http-analyze .
For our Commercial Service Licensees, we offer a fast DNS resolver
utility with negative caching and a history mechanism.
Visit the \%support site at \f(SChttp://support.netstore.de/\f1
for more information.
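.P
The relationship between some of the terms above can be sketched
in Python.
This is a simplified illustration, not the algorithm implemented in
.BR http-analyze ;
the function names and the input tuples are hypothetical.
The sketch assumes that the data sent is the sum of the logged sizes
of all entries, and that an access starts a new session when it lies
more than the time-window apart from the first access of the current
session of that host:
.(P
def summarize(entries):
    """Return totals for (url, status, size) tuples, sizes in bytes."""
    hits = files = cached = sent = 0
    size_of = {}                  # size of each URL sent with Code 200
    not_modified = {}             # number of Code 304 responses per URL
    for url, status, size in entries:
        hits += 1                 # every response counts as a hit
        sent += size              # assumption: sum of logged sizes
        if status == 200:
            files += 1
            size_of[url] = size
        elif status == 304:
            cached += 1
            not_modified[url] = not_modified.get(url, 0) + 1
    saved = 0
    for url, n in not_modified.items():
        saved += n * size_of.get(url, 0)    # 0 if the size is unknown
    return {"hits": hits, "files": files, "cached": cached,
            "sent": sent, "saved": saved, "requested": sent + saved}

def count_sessions(accesses, window):
    """Count sessions for (host, seconds) pairs in time order."""
    first = {}                    # start of the current session per host
    sessions = 0
    for host, when in accesses:
        if host not in first or when - first[host] > window:
            first[host] = when
            sessions += 1
    return sessions
.)P
.P
For a page with two inline images, all three sent with a Code 200
response, the first function reports three hits and three files sent.
A Code 304 response for a URL that was never sent in the same period
adds nothing to the data saved by cache, which is why that value is
only an approximation.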
.sp 1v
.P
.B "What the report does NOT show ..."
.P
Due to the nature of the HTTP protocol used for communication
between the browser and the server and due to the type of information
available in the server's logfile, the analyzer cannot:
.RS
.IP \(bu 3
.PD 0.2v
identify a person as a visitor of your server,
.IP \(bu 3
count the number of visitors of your server,
.IP \(bu 3
find out the email address of a visitor,
.IP \(bu 3
track the path a visitor takes through your site,
.IP \(bu 3
measure the time a visitor sees a page of your server,
.IP \(bu 3
determine the last page someone saw before leaving your site,
.IP \(bu 3
inform you about the sudden death of the visitor while looking at your homepage,
.IP \(bu 3
nor show any other information not recorded in the server's logfile.
.PD
.RE
.sp 1v
.P
Even if you classify certain URLs as
.I pageviews
or use a specific time-window to count
.IR sessions ,
this in no way tells you anything about the number of real
visitors of your server.
.P
However, if you use an appropriate server structure with files
grouped by their content or if you use the
.B HideURL
directive to group unstructured files together, the statistics report does
show you at least a trend or a tendency.
Following the numbers for some time, you soon get a feeling which documents
are most interesting for the visitors of your site.
.bp
.SS "OUTPUT FILES"
.P
A statistics report is created in the current directory
or in the output directory specified at invo\%cation of
.BR http-analyze .
All output files are placed into separate subdirectories to reduce the
number of directory entries per report.  Those subdirectories are named
.BI www YYYY,
where
.I YYYY
is the year of the period covered by the report.
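.P
The naming scheme can be sketched in Python.
The helper
.I report_paths
is hypothetical, and the zero-padded two-digit month and year in
.I MMYY
are an assumption; the file name prefixes are those of the monthly
output files listed further below:
.(P
def report_paths(year, month):
    """Return the report directory and the monthly file names."""
    tag = "%02d%02d" % (month, year % 100)      # MMYY (assumed padding)
    prefixes = ["stats", "totals", "days", "avload", "country",
                "topurl", "topdom", "topuag", "topref"]
    return "www%04d" % year, [p + tag + ".html" for p in prefixes]
.)P
.P
Under these assumptions, the report for August 1999 would be placed in
.B www1999
and would contain files such as
.BR stats0899.html .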
.P
The analyzer can be instructed to place files with \(d>private\(d< data such as
overviews and detailed lists of files, hosts, browser types, and referrer URLs
in a separate (\(d>private\(d<) subdirectory.
The web server then can be configured to request authentication for access
of files in this directory.  See also the option
.BR \-p " and the " PrivateDir
directive in the configuration file.
.P
.B Note:
for protection of the whole report, you would configure your web server
to request authentication for any file in the statistics output directory.
A separate private area is needed only if you want to secure certain lists
while granting access to the rest of the statistics report.
.P
The following list shows all output files of a full statistics report in a
.BI www YYYY
directory:
.TP
.B index.html
is the main page for the year and contains the total numbers of
.IR hits ", " "files sent" ,
.IR "cached files" ", " pageviews ,
.IR sessions " and " "data sent"
per month in tabular and graphical form for the last 12 months.
At the end of the year, this file contains the values for the whole year,
while the values for the last 12 months then will be continued in the
index file for the new year.  This page is displayed in the
.IR "Main window" .
.TP
\f3stats\fP\f2MMYY\fP\f3\.html\fP and \f3totals\fP\f2MMYY\fP\f3\.html\fP
contain the total summary for the month
.IR MM " of year " YY
in tabular form.
The file \f3totals\fP\f2MMYY\fP\f3\.html\fP is the frames version of the
report in \f3stats\fP\f2MMYY\fP\f3\.html\fP.
In the conventional interface, this page is displayed in the
.IR "Main window" .
.TP
\f3jsnav.html\fP and \f3nav\fP\f2MMYY\fP\f3\.html\fP
navigation panels for JavaScript-enabled browsers, shown in the
.IR "Navigation window" .
.TP
.BI days MMYY \.html
contains the numbers of
.IR hits ", " "files sent" ,
.IR "cached files" ", " pageviews ,
.IR sessions " and " "data sent"
per day for the month
.IR MM " of year " YY .
This report is displayed in the
.IR "Main window" .
.TP
.BI avload MMYY \.html
contains a graphical representation of the
.I "average hits"
per weekday/hour and the
.IR "top seconds" ", " minutes ,
.IR hours ", and " days
of the period.  This list appears in the
.IR "Main window" .
.TP
.BI country MMYY \.html
contains the list of all countries the visitors of your web server
came from.  This information is determined by analyzing the
.I "top-level domain (TLD)"
of the hostname assigned to a system in the
.IR "Domain Name System (DNS)" .
The country report is displayed in the
.IR "Main window" .
.sp .4v
.NE 5v
Note that the country list is meaningful only for hostnames with
ISO two-letter domains.
All other domains 
.RB ( .com ,
.BR .org ", " .net ", etc.)"
are used by organizations world-wide, so they are not assigned a country,
but listed literally in the charts.  The ISO country code for the U.S. is
.BR \.us ,
by the way, not
.BR \.com "\ \.\.\."
.TP
\f3\&3Dstats\fP\f2MMYY\fP\f3\.html\fP, \f3\&3Dstats\fP\f2MMYY\fP\f3\.wrl.gz\fP, \f3\&3Dstats\fP\f2YYYY\fP\f3\.html\fP, \f3\&3Dstats\fP\f2YYYY\fP\f3\.wrl.gz\fP
are pre-requisite files for the 3D statistics models in the
.IR "Virtual Reality Modeling Language (VRML)" .
Those models are created if the option
.B \-3
is given at invocation of
.BR http-analyze .
To view those models, you need a VRML\|2.0 compatible plug-in such as the free
.I CosmoPlayer
from Cosmo Software, which is currently available for Silicon Graphics,
Windows and Macintosh systems.  See \f(SChttp://cosmosoftware.com/\f1
for more information.  All 3D models are displayed in the
.IR "3D window" ,
so that you can compare them with the graphs in the conventional report.
.NE 5v
.TP
\f3topurl\fP\f2MMYY\fP\f3\.html\fP, \f3topdom\fP\f2MMYY\fP\f3\.html\fP, \f3topuag\fP\f2MMYY\fP\f3\.html\fP, \f3topref\fP\f2MMYY\fP\f3\.html\fP
These files contain the
.I "Top Ten"
lists (actually it's
.IR "Top N" ", where " N
is a configurable number) of the
.IR files ", " sites ,
.IR "browser types" " and " "referrer URLs" .
The URLs shown in
.BI topurl MMYY \.html
are either the real URLs requested by the visitor or an
.I item
(arbitrary text) you chose to collect certain file names under (see the
.B HideURL
directive in the configuration file).
.sp .5v
The domain names shown in
.BI topdom MMYY \.html
are either the second-level domains of the hosts accessing your server
if the DNS name is available or an item you chose to collect certain
hostnames under (see the
.B HideSys
directive in the configuration file). Unresolved IP numbers show up as
.IR Unresolved .
.sp .5v
The file
.BI topuag MMYY \.html
contains a list of all the different user agents that visitors used
to access your web site.
The user agent information is an identification sent by the browser and
logged by the web server. Although the format for this identification
is well-defined, it isn't obeyed by any browser.  If possible,
.B http-analyze
reduces the name of the user agent in the
.I "Top lists"
to the browser type including the first digit of its version number.
If it is not possible to isolate the browser type from the user agent,
the full identification string as sent by the browser is stored.
.sp .5v
A referrer URL is the URL of the page containing a link to your web site,
which has been followed by someone to reach your site.
Note that for manually entered URLs no referrer URL gets logged.
Also, some browsers do not send a referrer URL or send a faked one.
Entries without a referrer URL are collected under
.I Unknown
in the referrer list.
The list of referrer URLs is displayed in the
.IR "Main window" .
.TP
\f3files\fP\f2MMYY\fP\f3.html\fP, \f3sites\fP\f2MMYY\fP\f3.html\fP, \f3agents\fP\f2MMYY\fP\f3.html\fP, \f3refers\fP\f2MMYY\fP\f3.html\fP
Those files contain a complete overview of the
.IR files ", " sites ,
.IR "browser types" " and " "referrer URLs" ,
similar to the
.I "Top\ N"
lists.
.TP
\f3lfiles\fP\f2MMYY\fP\f3.html\fP, \f3lsites\fP\f2MMYY\fP\f3.html\fP, \f3lagents\fP\f2MMYY\fP\f3.html\fP, \f3lrefers\fP\f2MMYY\fP\f3.html\fP
Those files contain the detailed lists of all
.IR files ", " sites ,
.IR "browser types" " and " "referrer URLs" ,
similar to the previous lists, but sorted by item (if any) and hits.
On frequently accessed sites those lists can become rather large,
so they are shown in the separate
.IR "List window" .
.TP
.BI rfiles MMYY \.html
contains all invalid URLs which caused the server to respond with a
.I "Code 404 (Not found)"
status.  If there is a large number of hits for certain files the
server couldn't find, it's probably due to missing inline images
or other HTML objects embedded in other pages.
This report is displayed in the
.IR "Main window" .
.TP
.BI rsites MMYY \.html
contains the list of reverse domains.
This report is displayed in the
.IR "Main window" .
.TP
.BR frames.html ", " header.html
These two files are required for the frames-based user interface.
All other files are shared with the ones for the non-frames UI.
In the frames-based UI, the
.I Main
window is inside the frame, while the
.I List
window is still an external window.
The
.I "3D window"
may be inside the frame or an external window (see the
.B 3DWindow
directive).
.TP
.B gr-icon.png
This small icon showing the graph from the main page is displayed
on the main page under the base directory for each statistics report.
.bp
.SH OPTIONS
.TP
.B \-h
print a short help list explaining the meaning of the options.
Use
.B \-hh
to print an even more detailed help.
.TP
.B \-d
.I "(daily mode)"
generate a short statistics report for the current day only.
If a history file exists, the values for the previous days will be read
from this history file and the corresponding logfile entries are skipped.
If no history exists, the whole logfile will be processed and a history
file will be created (unless
.B \-n
is also given).
.TP
.B \-m
.I "(monthly mode)"
generate a full statistics report for a whole month.
In this mode, the values from the history file are used only
to create a summary page for the last 12 months.
The timestamps from the logfile entries fed into
.B http-analyze
always take precedence over any records in the history unless the option
.B \-e
is specified.
.TP
.B \-B
create buttons only and exit.
The analyzer copies or links the required files and buttons
from the central directory
.B HA_LIBDIR
into the output directory specified by
.BR \-o .
.TP
.B \-V
.I "(version)"
print the version number of
.B http-analyze
and exit.
.TP
.B \-X
print the URL to file a bug report.
Use command substitution or cut & paste to pass this URL to your
favourite browser, complete the form and submit it.
.TP
.B \-3
create a VRML\|2.0-compliant 3D model of the statistics in addition
to the regular statistics report.
You need a VRML\|2.0 compliant plug-in such as
.I CosmoPlayer
from Cosmo Software to view the resulting model.
.TP
.B \-a
ignore all requests for URLs which required authentication.
If your statistics report is publicly available, you probably
do not want to have \(d>secret URLs\(d< listed in the report.
See also the
.B AuthURL
directive in the configuration file.
.TP
.B \-e
use the history file even in full statistics (\fB\-m\fP) mode.
If this option is specified and you analyze the logfiles for
several months in one run,
.B http-analyze
uses the results recorded in the history file for previous months
and skips all logfile entries up to the first day of a new month
not recorded in the history (usually the current month).
This option is useful if you rotate your logfile once per quarter and want
.B http-analyze
to skip all entries for previous months which have been processed already.
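.sp .5v
As a sketch of such a quarterly run (the output directory and the logfile
name are assumptions chosen for illustration), the following invocation
analyzes a logfile spanning several months, but skips the months already
recorded in the history:
.Ex "$ http-analyze \-m \-e \-o /usr/web/htdocs/stats access.log"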
.TP
.B \-f
create an additional frames-based user interface for the statistics report.
This interface requires JavaScript.
.TP
.B \-g
.I "(generic interface)"
create a conventional (non-frames) user interface for the statistics
report without the optional JavaScript-based navigation window.
By default, the conventional interface includes JavaScript enhancements
for window control, which only become active if the user has enabled
JavaScript in his/her browser.
Use this option only to completely disable JavaScript enhancements in
the report even if the user has enabled JavaScript in the browser.
.TP
.B \-n
.I "(no update)"
do not update the history file.
Since the history is used in the statistics report to create the
main summary page with the results of the last 12 months, this option
must be used to avoid corrupting the statistics report when analyzing
logfiles for previous months (before the last one).
.TP
.B \-q
do not strip arguments to CGI scripts.
By default,
.B http-analyze
strips arguments from CGI URLs to be able to lump them together.
If your server creates dynamic HTML files through a CGI script,
they are reduced to the URL of the script.
If
.B \-q
is specified, those argument lists are left intact and CGI URLs
with different arguments are treated as different URLs.
Note that this only works for requests to scripts,
which receive their arguments using the
.BR GET ,
but not the
.B POST
method.  See the section
.I "Interpretation of the results"
for an explanation of the request methods and the
.B StripCGI
directive.
.NE 5v
.TP
.B \-v
.I "(verbose)"
comment ongoing processing.
Warnings are printed only in verbose mode.
Use this option to see how
.B http-analyze
processes the logfile.  If
.B \-v
is doubled, a dot is printed for each new day in the logfile.
.TP
.B \-x
list each image URL literally rather than lumping them together
under the item
.IR "All images" .
Without this option,
.B http-analyze
collects all requests for images
.I "(*.gif, *.png, *.jpg, *.ief, *.pcd, *.rgb, *.xbm, *.xpm, *.xwd, *.tif)"
under the item
.I "All images"
to avoid cluttering up the lists with lots of image URLs.
If
.B \-x
is given, each image URL is listed literally unless matched by an explicit
.B HideURL
directive in the configuration file.
.TP
.B \-M
MS\ IIS-Mode: use case-insensitive matching for URLs.
This violates the standard, but is necessary for logfiles
produced by IIS servers to correctly identify unique URLs.
.TP
.BI \-b " bufsize"
defines the size of the I/O buffer for reading the logfile (default: 64KB).
Usually, the best size for I/O buffers is reported on a per-file basis
by the operating system, but some systems report the logical blocksize instead.
If
.B "http-analyze\ \-v"
reports a \(d>Best buffer size for I/O\(d< less than or equal to 8\|KB,
you should specify a size of 16\|KB for pipes and up to 64\|KB
for disk files to increase the processing speed.
.TP
.BI \-c " cfgfile"
use
.I cfgfile
as the configuration file.
A configuration file allows you to define the behaviour of
.B http-analyze
and to define the \(d>look & feel\(d< of the statistics report.
See the section
.I "Configuration File"
for a description of possible settings, which are called
.I directives
in the following text.
.TP
.BI \-i " newcfg"
create a new configuration file named
.IR newcfg .
If an old configuration file was also specified using the
.B \-c
option, older settings are retained in the new file.
Any command line options take precedence over old configuration file
entries and will be transformed into the corresponding directive if
possible.  For example, specifying the output directory using the option
.BI \-o " outdir"
will produce an entry
.BI OutputDir " outdir"
in the new configuration file.
.TP
.BI \-l " libdir"
use
.I libdir
as the central library directory where
.B http-analyze
looks for the pre-requisite files, buttons, and license information
(usually \f(SC/usr/local/lib/http-analyze\f1).
This location can also be specified using the environment variable
.BR HA_LIBDIR .
.TP
.BI \-o " outdir"
use
.I outdir
instead of the current directory as the output directory for the
statistics report.
.B http-analyze
checks automatically for the required files and buttons in
.IR outdir .
If they are missing or out of date, the analyzer copies them from
.B HA_LIBDIR
into the output directory.  See also the
.BR OutputDir " and the " BtnSymlink
directives.
.TP
.BI \-p " prvdir"
defines the name of a \(d>private\(d< directory for the detailed lists of
.IR files ", " sites ,
.IR browsers " and " "referrer URLs" .
Because
.I prvdir
must reside directly under the output directory,
its name may not contain any slashes ('\f(SC/\f1').
A private directory for detailed lists may be useful
to restrict access to those lists if the rest of the
statistics report is publicly available.
Note that for restricting access to the complete statistics report,
you do \fBnot\fP need to place the detailed lists in a private directory.
See also the
.B PrivateDir
directive.
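.sp .5v
For instance, assuming an output directory
\f(SC/usr/web/htdocs/stats\f1 and a private directory named
\f(SCprivate\f1 (both names are illustrative only), the detailed lists
would be placed in \f(SC/usr/web/htdocs/stats/private\f1.
Access to that directory then has to be restricted by means of the
web server's own access control.
.Ex "$ http-analyze \-m \-o /usr/web/htdocs/stats \-p private access.log"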
.bp
.TP
.BI \-s " subopt,..."
suppress certain lists in the report.  See also the
.B Suppress
directive.
.I subopt
may be:
.sp .2v
.RS 10
.ta 12n
.vs +1p
.nf
\f(SCAVLoad\f1	to suppress the average load report (top seconds/minutes/hours),
\f(SCURLs\f1	to suppress the overview and list of URLs/items,
\f(SCURLList\f1	to suppress the list of URLs/items only,
\f(SCCode404\f1	to suppress the list of Code 404 (\f2Not Found\f1) responses,
\f(SCSites\f1	to suppress the overview and list of client domains,
\f(SCRSites\f1	to suppress the overview of reverse client domains,
\f(SCSiteList\f1	to suppress the list of all client domains/hostnames,
\f(SCAgents\f1	to suppress the overview and list of browser types,
\f(SCReferrer\f1	to suppress the overview and list of referrer URLs,
\f(SCCountry\f1	to suppress the list of countries,
\f(SCPageviews\f1	to suppress pageview rating (cached files are shown instead),
\f(SCAuthReq\f1	to suppress requests which required authentication,
\f(SCGraphics\f1	to suppress images such as graphs and pie charts,
\f(SCHotlinks\f1	to suppress hotlinks in the list of all URLs,
\f(SCInterpol\f1	to suppress interpolation of values in graphs.
.fi
.vs
.RE
.NE 5v
.TP
.BI \-t " num"
defines the size of certain lists.
.I num
is either a positive number or the value 0 to suppress the corresponding list.
You specify the list by appending one of the following characters to the
number shown here as '\f2#\fP' (note that the characters are case-sensitive):
.sp .5v
.in +3n
.ta 12n
.nf
\f2#\f1\|\f(SCU\f1	\f2#\f1 is the number of entries in the Top URL list (default: 30),
\f2#\f1\|\f(SCL\f1	\f2#\f1 is the number of entries in the Least URL list (default: 10),
\f2#\f1\|\f(SCS\f1	\f2#\f1 is the number of entries in the Top domain list (default: 30),
\f2#\f1\|\f(SCA\f1	\f2#\f1 is the number of entries in the Top agent/browser list (default: 30),
\f2#\f1\|\f(SCR\f1	\f2#\f1 is the number of entries in the Top referrer URL list (default: 30),
\f2#\f1\|\f(SCd\f1	\f2#\f1 is the number of entries in the Top days table (default: 7),
\f2#\f1\|\f(SCh\f1	\f2#\f1 is the number of entries in the Top hours table (default: 24),
\f2#\f1\|\f(SCm\f1	\f2#\f1 is the number of entries in the Top minutes table (default: 5),
\f2#\f1\|\f(SCs\f1	\f2#\f1 is the number of entries in the Top seconds table (default: 5),
\f2#\f1\|\f(SCN\f1	\f2#\f1 is the size of the navigation frame (default: 120 pixels).
.fi
.in -3n
.sp .5v
You can specify more than one
.I num
with a single
.B \-t
option by separating them with commas as in
\&\s-1\f(SC\-t\ 20U,0L,20S\f1\s0.
See also the
.B Top*
directives in the configuration file.
.TP
.BI \-u " time"
defines the time-window for counting
.IR sessions ".  See"
.IR Sessions " in the section " "Inter\%pre\%tation of the results"
for an explanation of this term.
.TP
.BI \-w " hits"
sets the noise-level to
.IR hits .
If a noise-level is defined, all URLs, sites, agents and referrer URLs
with hits below this level are collected under the item
.I Noise
in the
.I "Top N"
lists and overviews to avoid cluttering up those lists.
See also the
.B NoiseLevel
directive.
.TP
.BI \-I " date"
skip all logfile entries until this day (exclusive).
The date may be specified as
.IR DD/MM/YYYY " or " MM/YYYY ,
where
.I MM
is the number or the name of a month. Note that in full statistics mode,
.I DD
defaults to the first day of the month if absent. If you specify any
other day in this mode, unpredictable results may occur.
For example, \&\s-1\f(SC\-I\ Feb\f1\s0 restricts the analysis to
February of the current year.
.TP
.BI \-E " date"
skip all logfile entries starting from this day on (inclusive).
The date format is the same as in
.BR \-I .
To restrict analysis to a certain period, specify the starting date using
.B \-I
and the first date to be ignored using
.BR \-E .
For example, \&\s-1\f(SC\-I\ Jan/99\ \-E\ Feb/99\f1\s0
restricts the analysis to January\ 1999.
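.sp .5v
In the same way, a whole quarter could be selected.
The following sketch (the output directory and the logfile name are
assumptions for illustration) analyzes the entries from January through
March\ 1999 and uses
.B \-n
to leave the history file untouched, as recommended for logfiles of
previous months:
.Ex "$ http-analyze \-m \-n \-I Jan/99 \-E Apr/99 \-o /tmp/stats access.log"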
.TP
.BI \-F " logfmt"
the logfile format to use. Valid keywords for
.I logfmt
are
.B auto
for auto-sensing the logfile format,
.B clf
for the
.IR "Common Logfile Format" ,
or
.BR dlf " and " elf
for the two variants of the
.IR "W3C Extended Logfile Format" .
See also the section
.I "Logfile Formats"
above.
.TP
.BI \-L " lang"
use the language
.I lang
for warning messages and for the statistics report.
See also the directive
.B Language
and the section
.I "Multi-National Language Support"
for more information about localization of
.BR http-analyze .
.TP
.BI \-C " chrset"
force use of
.I chrset
for the browser's encoding when displaying the statistics report.
This is needed for languages which require special character
sets such as Chinese.  See also
.B HTMLCharSet
and the section about
.IR "Multi-National Language Support" .
.TP
.BI \-G " pattern,..."
defines additional pageview patterns.
All URLs matching one of the
.I patterns
are classified as pageviews (text files).  If
.I pattern
starts with a slash (`\f(SC/\f1'), it is treated as a prefix each URL
is compared with; otherwise it is treated as a suffix.
The suffix
.B \.html
is pre-defined by default.
You can add up to 9 more patterns here, for example
.BR \.shtml ", " \.text " and " /cgi-bin/ .
To specify more than one pattern with a single
.B \-G
option, use commas to separate them.  See also the
.B PageView
directive.
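.sp .5v
A minimal sketch using the sample patterns from above (the logfile name
is an assumption): the two suffixes and the prefix are passed with a single
.B \-G
option, so that URLs ending in
.BR \.shtml " or " \.text
and URLs starting with
.B /cgi-bin/
are classified as pageviews in addition to those ending in
.BR \.html .
.Ex "$ http-analyze \-m \-G .shtml,.text,/cgi-bin/ access.log"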
.TP
.BI \-H " idxfile,..."
defines additional directory index filenames.
The name
.B index.html
is pre-defined by default.
.B http-analyze
truncates URLs containing an index filename so that they merge with `/'
(their \(d>base URL\(d<).  For example,
.IR /dir/index.html " is truncated to " /dir/ .
You can add up to 9 more names for directory index files, for example
.BR Welcome.html " or " home.html .
To specify more than one name with a single
.B \-H
option, use commas to separate them.  See also the
.B IndexFiles
directive.
.TP
.BI \-O " vname,..."
defines additional (virtual) names for this server to be classified as
.IR "self-referrer URLs" .
The server's primary name (from \f3-S\fP or \f3-U\fP) is pre-defined already.
If
.I vname
doesn't include a protocol specifier, two URLs with the
\&\s-1\f(SChttp\f1\s0 and the \&\s-1\f(SChttps\f1\s0 protocol
specifier are added for each name.
To specify more than one server name with a single
.B \-O
option, use commas to separate them.  See also the
.B VirtualNames
directive.
.TP
.BI \-P " prolog"
use
.I prolog
as the prolog file for a yearly VRML model (optional).  The file
.B 3Dprolog.wrl
is included in the distribution as an example. Note that the resulting
VRML model for a whole year may be suitable only for viewing on a fast
system such as a workstation.
The monthly VRML models do not need a prolog file and can be
viewed on any platform without problems.
See also the
.B VRMLProlog
directive.
.TP
.BI \-R " docroot"
restrict logfile analysis to the given Document Root.  If
.I docroot
is prefixed by a `\f(SC!\f1', analysis takes place for all directories except
.IR docroot .
If
.I docroot
does not start with a slash (`\f(SC/\f1'), it is interpreted as the name of a
virtual server, which is matched against the normally unused second
field of a logfile entry.
Intended for use with virtual servers with a separate Document Root
or for which the hostname is recorded in the second field of a
logfile entry.  See also the
.B DocRoot
directive.
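.sp .5v
The three forms might be used as follows (the directory, server and
logfile names are assumptions for illustration).
The `\f(SC!\f1' is quoted in this sketch because several shells would
otherwise treat it as a special character:
.Ex "$ http-analyze \-R /customer/ access.log" "" "" 0
.Ex "$ http-analyze \-R '!/customer/' access.log" "" "" 1
.Ex "$ http-analyze \-R www.customer.com access.log" "" "" 1
.)E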
.TP
.BI \-S " srvname"
use
.I srvname
for the server name. If no server name is defined,
.B http-analyze
uses the hostname of the system it is running on.
The server name must be a fully qualified domain name, not a URL.
See also the
.B ServerName
directive.
.TP
.BI \-T " TLDfile"
use
.I TLDfile
for the list of valid top-level domains (TLDs).
This list currently includes all ISO two-letter country domains,
the well-known domains
.BR \.net ", " \.int ,
.BR \.org ", " \.com ,
.BR \.edu ", " \.gov ,
.BR \.mil ", " \.arpa ,
.BR \.nato ,
and the new
.I CORE
top-level domains
.BR \.firm ", " \.info ,
.BR \.shop ", " \.arts ,
.BR \.web ", "
.BR \.rec ", and " \.nom .
The length of a top-level domain in the TLD file may not exceed 6\ characters.
Since
.B http-analyze
uses its built-in defaults if no TLD file is specified,
you will rarely need this option.  See also the
.B TLDFile
directive and the \%sample TLD file included in the distribution.
.TP
.BI \-U " srvurl"
defines
.I srvurl
as the URL of the server to be used for hotlinks in URL lists.
Useful if the report for your web server is published on another server.
Also necessary for virtual servers to have
.B http-analyze
generate correct hypertext links in the report.
See also the
.B ServerURL
directive.
.TP
.BI \-W " 3Dwin"
defines the window for the VRML model.
The keyword
.I 3Dwin
may be either
.BR extern " or " intern
for display of the VRML model in a new, external window or in the
lower half of the main frame respectively (meaningful only in the
frames-based interface).
.TP
.BI \-Z " showdom"
defines
.I showdom
as the number of components in a domain name which make up
the organizational part.
This is usually the
.IR "second-level domain" ,
so that the last two components of the domain name (for example,
\f(SCcompany.com\f1) are used as the organizational part.
However, some countries prefer to use
.IR "third-level domains" ,
so that the hostnames use 4 or more components, where the last 3
are used for the organizational part (as in \f(SCcompany.co.uk\f1).
To recognize such third-level domains,
.I showdom
can be set to the value 3.
Hostnames with exactly 3 components will still be reduced
to their second-level domain if
.I showdom
is set to 3.
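.sp .5v
Following the rules above, the hypothetical hostnames below would be
reduced as shown.
The hostnames are made up for illustration, and the results are derived
from the description above rather than taken from actual program output:
.Ex "www.company.com    with \-Z 2:  company.com" "" "" 0
.Ex "www.company.com    with \-Z 3:  company.com" "" "" 1
.Ex "www.company.co.uk  with \-Z 3:  company.co.uk" "" "" 1
.)E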
.TP
.I logfile(s)
These are the name(s) of the logfile(s) to process.
If more than one file is given, they are processed in the order
in which their names appear on the command line.
.B http-analyze
checks for the existence of all files before processing them.
If a `\-' is specified as the filename, standard input is read.
If no file is given, the analyzer either processes the default
logfile specified in the configuration file or the standard input.
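.sp .5v
As an illustration of reading from standard input (the name of the
compressed logfile and the output directory are assumptions), a compressed
logfile can be decompressed on the fly and piped into the analyzer:
.Ex "$ gzip \-dc access.log.gz | http-analyze \-m \-o /usr/web/htdocs/stats \-"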
.sp 1v
.P
.B "Typical Usage"
.P
This is an example for the typical use of
.B http-analyze
on Unix systems:
.(P
$ http-analyze -v3f -o /usr/web/htdocs/stats /usr/ns-home/logs/access.log
.)P
.sp 1v
.P
On Windows systems, open a DOS window, change into the directory where
you installed
.B http-analyze
and run a command similar to the following:
.(P
C:> http-analyze -v3f -o c:\eweb\ehtdocs\estats c:\eprograms\emsiis\eaccess.log
.)P
.P
Note that on Windows systems,
.B http-analyze
searches for the required buttons and files in the subdirectory
\f(SCfiles\f1 of the current directory it is running in.
Therefore, if you get error messages about missing buttons,
make sure you changed into the directory the analyzer is
installed in (by default, the installation directory is
\f(SCC:\ePrograms\eRENT-A-GURU\ehttp-analyze2.4\f1).
.bp
.SS "CONFIGURATION FILE"
You can define server-specific configuration settings for
.B http-analyze
in an
.IR "analyzer configuration file" .
To have the analyzer use such a configuration file, specify its name
with the option
.BI \-c " cfgfile"
or the environment variable
.BR HA_CONFIG .
Note that command line options always take precedence over
settings in a configuration file.
.P
If the option
.BI \-i " newcfg"
is specified,
.B http-analyze
creates a configuration template in the file
.IR newcfg .
Any other command line options present will be transformed into
the corresponding directives in the new configuration file.
The settings then can be customized further by manually editing
the configuration definitions using a standard text editor.
.P
To update an old configuration file into a new format,
specify its name using the option
.B \-c
in addition to
.BR \-i .
This will instruct the analyzer to retain any settings from the old file.
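.P
The following sketch shows both uses (the file and directory names are
assumptions for illustration).
The first command creates a fresh template, in which the
.B \-o
option is expected to show up as an
.B OutputDir
directive; the second one converts an existing file while retaining
its settings:
.(P
$ http-analyze -i new.conf -o /usr/web/htdocs/stats
$ http-analyze -c old.conf -i new.conf
.)P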
.P
The configuration file contains a single directive per line.
Except for
.BR IndexFiles ", " PageView ,
.BR AddDomain ", " VirtualNames ,
.BR Ign* ", and " Hide* ,
each directive may appear only once in the configuration file.
Following a directive field there are one or two value fields, which
must be separated from the directive and each other by one or more tab characters.
Blanks are considered part of the string in an optional third field only.
All directive names are case-insensitive.
\%Comment lines starting with a hash character (\f(SC#\f1) are ignored.
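.P
A short configuration file following these rules might look like this.
This is a sketch only: the values are borrowed from the examples in this
section and are not recommendations, and the fields are separated by tabs:
.(P
# sample analyzer configuration
DefaultMode	daily
BtnSymlink	Yes
HideURL	/robots.txt	[Robot control file]
HideSys	*.mycompany.com	MY COMPANY
.)P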
.sp .7v
.TP 4
.BI 3DWinSize " width\|\(mu\|height"
Defines the size of the 3D window (default: 520\|\(mu\|420 pixels).
.Ex 3DWinSize 540x450
.TP 4
.BI 3DWindow " keyword"
Defines the 3D window the VRML model is displayed in (same as option
.BR \-W ).
The
.I keyword
may be either
.BR extern " (default) or " intern
for display of the VRML model in a new, external window or in the
lower half of the main frame respectively.
.Ex 3DWindow intern
.TP 4
.BI AddDomain " domain\ string"
Add entries to the domain table causing certain
.I domains
to be allocated to the mock domain
.IR string .
Wildcards in
.I domain
are ignored.
This directive is useful to collect certain hostnames (for example,
the hosts of online services operating world-wide) under some
.I string
(item) instead of the country associated with the top-level-domain.
.Ex AddDomain .compuserve.com CompuServe 0
.Ex AddDomain .aol.com\0\0\0\0\0 AOL 1
.)E
.TP 4
.BI AuthURL " boolean value"
Defines whether accesses which required authentication should be skipped.
By default, such URLs appear in the report just like ordinary URLs.
If
.B AuthURL
is set to
.IR Off ", " No ,
.IR None ", " False ", or " 0
the analyzer skips authenticated requests in the logfile,
so that they will be suppressed from the statistics report.
.Ex AuthURL No
.TP 4
.BI BtnSymlink " boolean value"
Creates symbolic links to the required buttons and files in
.B HA_LIBDIR
instead of copying them into the output directory.
If you are going to analyze a large number of virtual servers which
reside on the same host, you can probably save disk space by avoiding
copies of all buttons and files into each output directory.
Note that this directive can be used only on systems which support
symbolic links.
.Ex BtnSymlink Yes
.NE 10v
.TP 4
\f3CustLogoW\fP\ \f2image\ srvurl\fP and \f3CustLogoB\fP\ \f2image\ srvurl\fP
Defines images for use as customer logos in the statistics report.
This feature is available only in the commercial version of the analyzer.
.I image
is the name of the image file relative to the output directory
.B OutputDir
and
.I srvurl
is the URL to be followed if the user clicks on the image.
To use your own logos create two images \- one for use on
white backgrounds (\f3CustLogoW\fP) and another one for use
on black backgrounds (\f3CustLogoB\fP).
The images should be approximately 72\|\(mu\|72 pixels in size
and must be placed into the buttons subdirectory of the central
libdir (\f3HA_LIBDIR/btn\fP).
Next time a report is generated, the analyzer copies those logos
into the output directory and includes them in the report.
.Ex "CustLogoW\0\0btn/mycompany_sw.png" http://www.mycompany.com/ "" 0
.Ex "CustLogoB\0\0btn/mycompany_sb.png" http://www.mycompany.com/ "" 1
.)E
.TP 4
.BI DefaultMode " mode"
The default operation mode of
.BR http-analyze .
The value field contains either the keyword
.B daily
for short statistics mode or
.B monthly
for full statistics mode (see also options
.BR \-d " and " \-m ).
If left undefined, the default is full statistics mode (\f3monthly\fP).
.Ex DefaultMode daily
.TP 4
.BI DocRoot " docroot"
Restricts logfile analysis to the given Document Root (same as option
.BR \-R ).
If
.I docroot
is prefixed by a `\f(SC!\f1', analysis takes place for all directories except
.IR docroot .
If
.I docroot
does not start with a slash (`\f(SC/\f1'), it is interpreted as the name
of a virtual server, which is matched against the normally unused second
field of a logfile entry.
Useful for virtual servers with a separate Document Root.
.B Note:
Do not define this directive to analyze the whole server.
Explicitly setting
.B DocRoot
to `/' (the default) only increases processing time.
.Ex DocRoot /customer/ "" 0
.Ex DocRoot www.customer.com "" 1
.)E
.TP 4
.BI HTMLCharSet " chrset"
Force use of
.I chrset
for the browser's encoding when displaying the statistics report
(same as option
.BR \-C ).
This is needed for languages which require special character
sets such as Chinese.  See also the section about
.IR "Multi-National Language Support" .
.Ex HTMLCharSet iso-8859-1
.TP 4
\f3HTMLPrefix\fP\ \f2prefix\fP and \f3HTMLTrailer\fP\ \f2trailer\fP
The HTML
.IR prefix " and " trailer
to be inserted into the statistics output files at the top and bottom
of the page.  If defined, the
.B HTMLPrefix
string must include the \f(SC<BODY>\f1 tag.
To read the HTML code from a file, specify its name as the
.IR prefix " or " trailer .
.Ex HTMLPrefix "<BODY BGCOLOR=""#FF0000"">" "" 0
.Ex HTMLTrailer "<A HREF=""/intern/"">Back</A> to the internal page." "" 1
.)E
.TP 4
\f3HeadFont\fP\ \f2fontlist\fP, \f3TextFont\fP\ \f2fontlist\fP and \f3ListFont\fP\ \f2fontlist\fP
The fonts to use for headers, for regular text, and for the detailed lists.
If unset, the analyzer uses a list of common sans-serif fonts for headers
and regular text and a monospaced (fixed) font for the detailed lists.
To force the navigator's default for fonts, use the keyword
.B default
as the fontname.
.Ex HeadFont "Helvetica,Arial,Geneva,sans-serif" "" 0
.Ex TextFont "Helvetica,Arial,Geneva,sans-serif" "" 1
.Ex ListFont "Courier,monospaced" "" 1
.)E
.TP 4
\f3HeadSize\fP\ \f2size\fP, \f3TextSize\fP\ \f2size\fP, \f3SmallSize\fP\ \f2size\fP and \f3ListSize\fP\ \f2size\fP
The font sizes for headings (navigator default, usually 3),
regular text (default: 2), small text (default: 1) and
lists (default: 2).
.B TextSize
replaces the former
.BR FontSize ,
which is still recognized for backward compatibility with
older configuration files.
.Ex HeadSize 4 "" 0
.Ex SmallSize 2 "" 1
.)E
.TP 4
.BI HideAgent " agent\ string"
Hide a browser type under an arbitrary
.I string
(item).
Needed only for a certain browser whose vendor still can't spell
its name correctly.
Only the leading part of the browser type is compared against
.IR agent ,
so no wildcards are needed in the second field.
.Ex "HideAgent" "Mozilla/4.0 (compatible; MSIE 4.\0\0\0" "MSIE 4.*" 0
.Ex "HideAgent" "Mozilla/3.0 (compatible; MSIE 3.\0\0\0" "MSIE 3.*" 1
.)E
.TP 4
.BI HideRefer " referrer\ string"
Hide certain referrer URLs under an arbitrary
.I string
(item).
Useful to map different referrer URLs for a given host to a common name.
Since only the leading string of the referrer URL is compared against
.IR referrer ,
there is no need to specify wildcards.
As in
.BR HideAgent ,
a wildcard suffix is removed from the string, while a wildcard prefix is
taken literally.
.sp .7v
If the second argument contains a string in square brackets, this defines
the CGI parameter which specifies the search key for search engines.
In this case, the search key will be extracted from the argument list
and prominently displayed after the name of the search engine/web server.
See also the configuration file template produced by
.B "http-analyze\ \-i"
for more examples of how to use the
.B HideRefer
directive.
.Ex "HideRefer" "http://www.altavista.com/" "AltaVista [q=]" 0
.Ex "HideRefer" "http://lycospro.lycos.com/" "Lycos [query=]" 1
.Ex "HideRefer" "http://www.excite.com/\0\0\0" "Excite [search=]" 1
.Ex "HideRefer" "http://www.dino-online.de/" "Dino Online [query=]" 1
.)E
.TP 4
.BI HideSys " hostname\ string"
Hide a
.I hostname
under an arbitrary
.I string
(item).
The string may contain blanks. If the first character of
.I string
is a `\f(SC[\f1', this item is suppressed in the
.I "Top N"
lists.
Hidden items are accounted for separately, but in the summary they
are collected under the description defined with this directive.
You may use the wildcard character `*' as either a prefix
or as a suffix of the
.I hostname
(as in
.BR *\.host\.com " and " 192\.168\.12\.* ),
but not as both.
Hostnames are case-insensitive.
.sp .7v
When building the list of countries,
.B http-analyze
determines the country from the top-level domain given in
.IR hostname .
If
.I hostname
is an IP number, you can optionally define the top-level domain
to be accounted for by appending it in square brackets to the
.I string
as shown in the last example below.
.Ex HideSys *\.mycompany.com "MY COMPANY" 0
.Ex HideSys 192\.168\.12\.* "MY COMPANY [US]" 1
.)E
.TP 4
.BI HideURL " url\ string"
Hide an
.I URL
under an arbitrary
.I string
(item).
The string may contain blanks. If the first character of
.I string
is a `\f(SC[\f1', this item is suppressed in the
.I "Top N"
lists.
Hidden items are accounted for separately, but in the summary they
are collected under the description defined with this directive.
You may use the wildcard character `*' as either a prefix
or as a suffix of the
.I URL
(as in
.BR *\.map " and " /subdir/* ),
but not as both.
URLs are case-sensitive as required by the HTTP standard.
If the option
.B \-M
is specified, URLs will become case-insensitive for
compatibility with non-compliant web servers.
Note that images are hidden automatically under
.I "All images"
by default unless
.B \-x
is specified.
.Ex HideURL "*.map\t" "[All image maps]" 0
.Ex HideURL /robots.txt "[Robot control file]" 1
.Ex HideURL /newsletter/* "MyCompany Monthly Newsletter" 1
.Ex HideURL /products/* "MyCompany Products" 1
.Ex HideURL /~delta-t/ "DELTA-t Homepage" 1
.Ex HideURL /~delta-t/* "DELTA-t more pages" 1
.)E
.NE 5v
.TP 4
\f3IgnURL\fP\ \f2url\fP and \f3IgnSys\fP\ \f2hostname\fP
Ignore entries with a specific URL or accesses from a certain system.
You may use the wildcard character `*' as either a prefix or as a suffix
of the URL or the hostname (as in
.BR *\.png ", " /subdir/file*
and
.BR *\.host\.com ),
but not as both.
Note that all logfile entries are compared against this list while
.B http-analyze
reads the logfile, as opposed to the
.BR HideURL " and " HideSys
directives, which are looked up only after all entries have been
reduced to the set of unique URLs and hostnames, respectively.
Therefore, many
.BR IgnURL "/" IgnSys
definitions will significantly increase processing time of
.BR http-analyze .
.Ex IgnURL *\.gif,*\.png,*\.jpg,*\.jpeg "" 0
.Ex IgnURL /stats/ "" 1
.)E
.TP 4
.BI IndexFiles " idxfile\|[,idxfile\|...\|]"
Defines additional directory index filenames (same as option
.BR \-H ).
The name
.B index.html
is pre-defined by default.
.B http-analyze
truncates URLs containing an index filename so that they merge with `/'
(their \(d>base URL\(d<).  For example,
.IR /dir/index.html " is truncated to " /dir/ .
You can add up to 9 more names for directory index files.
Note that each name requires another table lookup, which may
significantly increase processing time.
.Ex IndexFiles Welcome.html,home.html,index.htm
.TP 4
.BI Language " lang"
Use the language
.I lang
for warning messages and for the statistics report (same as option
.BR \-L ).
See the section
.I "Multi-National Language Support"
for more information about localization of
.BR http-analyze .
.Ex Language de
.TP 4
.BI LogFile " filename"
The name of the server's logfile.
If you define a default name for the logfile, this file is processed
if no other filenames are explicitly specified on the command line.
If no logfile is specified,
.B http-analyze
always reads
.IR stdin .
.Ex LogFile /usr/ns-home/logs/access
.TP 4
.BI LogFormat " logfmt"
Use this logfile format. Valid values for
.I logfmt
are
.B auto
for auto-sensing the logfile format,
.B clf
for the
.IR "NCSA Common Logfile Format" ,
or
.BR dlf " and " elf
for the two supported variants of the
.IR "W3C Extended Logfile Format" .
See the section
.I "Logfile Formats"
for a detailed description of those formats.
.Ex LogFormat clf
.TP 4
.BI MSIISmode " boolean value"
Use case-insensitive string comparison for URLs.
Needed for MS\ IIS, which does not distinguish between upper- and
lower-case characters.
MS users may regard this as an enhancement, while for the rest of
the world this is just a violation of the RFC\|2616 HTTP standard
and should be ignored.
.Ex MSIISmode Yes
.TP 4
.BI NavWinSize " width\|\(mu\|height"
Defines the size of the navigation window which pops up in the
conventional interface if JavaScript is enabled.
Useful if the browser displays scrollbars when using the default size
of 420\|\(mu\|190 pixels.
.Ex NavWinSize 440x200
.TP 4
.BI NavigFrame " size"
Defines the size of the navigation frame in pixels.
Useful if the browser displays scrollbars when using the default size
of 120 pixels.
.Ex NavigFrame 140
.TP 4
.BI NoiseLevel " hits"
Sets the noise-level to
.IR hits .
If a noise-level is defined, all URLs, sites, agents and referrer URLs
with hits below this level are collected under the item
.I Noise
in the
.I "Top N"
lists and overviews to avoid cluttering up those lists.
.Ex NoiseLevel 7
.TP 4
.BI OutputDir " directory"
The name of the directory where the output files of the statistics
report should be created (same as option
.BR \-o ).
By default, the output directory is the current directory.
.Ex OutputDir /usr/web/htdocs/stats
.NE 5v
.TP 4
.BI PageView " pattern\|[,pattern\|...\|]"
Defines additional pageview patterns (same as option
.BR \-G ).
All URLs matching one of the
.I patterns
are classified as pageviews (text files).  If
.I pattern
starts (doesn't start) with a slash (`\f(SC/\f1'), it is treated
as a prefix (suffix) each URL is compared with.  The suffix
.B \.html
is pre-defined by default. You can add 9 more patterns here, for example
.BR \.shtml ", " \.text " and " /cgi-bin/ .
Note that each pattern requires another table lookup, which may
significantly increase processing time.
.Ex PageView \.shtml,\.text,/cgi-bin/
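.sp .7v
As a sketch of the matching rule, using the patterns from the example
above and some hypothetical URLs:
.(E
/news/today.shtml    pageview (suffix \&.shtml)
/cgi-bin/search      pageview (prefix /cgi-bin/)
/docs/intro.html     pageview (pre-defined suffix \&.html)
/images/logo.gif     not a pageview
.)E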
.TP 4
.BI PrivateDir " prvdir"
Defines the name of a \(d>private\(d< directory for the detailed lists of
.IR files ", " sites ,
.IR browsers " and " "referrer URLs"
(same as option
.BR \-p ).
Because
.I prvdir
must reside directly under the output directory,
its name may not contain any slashes (`\f(SC/\f1').
A private directory for detailed lists may be useful
to restrict access to those lists if the rest of the
statistics report is publicly available.
Note that for restricting access to the complete statistics report,
you do \fBnot\fP need to place the detailed lists in a private directory.
.Ex PrivateDir lists
.TP 4
.BI RegInfo " customer_name registration_ID"
Defines the customer's name and the registration ID, which are both
shown on the main page in the summary report.
.Ex RegInfo "MyCompany\0\0" 3745JMJZ00000311300000682344
.TP 4
.BI ReportTitle " title"
The document title to use in the statistics report.
.Ex ReportTitle "Access Statistics for MyCompany"
.TP 4
.BI ServerName " srvname"
The official name of the server (same as option
.BR \-S ).
If no server name is defined,
.B http-analyze
uses the hostname of the system it is running on.
The server name must be a fully qualified domain name, not a URL.
.Ex ServerName www.mycompany.com
.TP 4
.BI ServerURL " srvurl"
The URL of the server to be used for hotlinks in URL lists (same as option
.BR \-U ).
Useful if the report for your web server is published on another server.
Also necessary for virtual servers to have
.B http-analyze
generate correct hypertext links in the report.
.Ex ServerURL http://www.mycompany.com
.TP
.BI Session " time"
The time-window for counting
.IR sessions .
All unique hosts accessing your server more than once inside
this time-window are accounted for as the same session.
If the distance between two adjacent accesses from the same
host is greater than the time-window, the accesses from this
host are accounted for as different sessions.
.Ex Session "4 hours"
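.sp .7v
As a worked illustration with a hypothetical host and a time-window
of 4\ hours, three accesses would be counted as two sessions:
.(E
10:00  first access                     session 1
11:00  1 hour after previous access     session 1
16:00  5 hours after previous access    session 2
.)E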
.TP 4
.BI ShowDomain " number"
Defines the number of components in a domain name which make up
the organizational part (same as option
.BR \-Z ).
This is usually the
.IR "second-level domain" ,
so that the last two components of the domain name (for example,
\f(SCcompany.com\f1) are used as the organizational part.
However, some countries prefer to use
.IR "third-level domains" ,
so that the hostnames use 4 or more components, where the last 3
are used for the organizational part (as in \f(SCcompany.co.uk\f1).
To recognize such third-level domains,
.I ShowDomain
can be set to the value 3.
Hostnames with exactly 3 components will still be reduced
to their second-level domain if
.I ShowDomain
is set to 3.
.Ex ShowDomain 3
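.sp .7v
For illustration, with
.I ShowDomain
set to 3, two hypothetical hostnames would be reduced as follows:
.(E
www.company.co.uk    company.co.uk    (4 components, last 3 used)
www.company.com      company.com      (3 components, last 2 used)
.)E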
.TP 4
.BI StripCGI " boolean value"
Controls whether arguments to CGI scripts are stripped
(setting it to a false value is the same as option
.BR \-q ).
By default,
.B http-analyze
strips arguments from CGI URLs to be able to lump them together.
If your server creates dynamic HTML files through a CGI script,
they are reduced to the URL of the script.
If
.B StripCGI
is set to
.IR Off ", " No ,
.IR None ", " False " or "
.IR 0 ,
those argument lists are left intact and CGI URLs
with different arguments are treated as different URLs.
Note that this only works for requests to scripts,
which receive their arguments using the
.BR GET ,
but not the
.B POST
method.  See the section
.I "Interpretation of the results"
for an explanation of the request methods.
.Ex StripCGI No
.TP 4
.BI Suppress " subopt,..."
Suppress certain lists in the report (same as
.BR \-s ).
.I subopt
may be one of:
.sp .2v
.in +5n
.ta 12n
.vs +1p
.nf
\f(SCAVLoad\f1	to suppress the average load report (top seconds/minutes/hours),
\f(SCURLs\f1	to suppress the overview and list of URLs/items,
\f(SCURLList\f1	to suppress the list of URLs/items only,
\f(SCCode404\f1	to suppress the list of Code 404 (\f2Not Found\f1) responses,
\f(SCSites\f1	to suppress the overview and list of client domains,
\f(SCRSites\f1	to suppress the overview of reverse client domains,
\f(SCSiteList\f1	to suppress the list of all client domains/hostnames,
\f(SCAgents\f1	to suppress the overview and list of browser types,
\f(SCReferrer\f1	to suppress the overview and list of referrer URLs,
\f(SCCountry\f1	to suppress the list of countries,
\f(SCPageviews\f1	to suppress pageview rating (cached files are shown instead),
\f(SCAuthReq\f1	to suppress requests which required authentication,
\f(SCGraphics\f1	to suppress images such as graphs and pie charts,
\f(SCHotlinks\f1	to suppress hotlinks in the list of all URLs,
\f(SCInterpol\f1	to suppress interpolation of values in graphs.
.fi
.vs
.in -5n
.sp .2v
.Ex Suppress Country,Interpol "" 0
.)E
.TP 4
.BI TLDFile " filename"
Use
.I filename
for the list of top-level domains (same as option
.BR \-T ).
This list includes all ISO two-letter country domains,
the well-known domains
.BR \.net ", " \.int ,
.BR \.org ", " \.com ,
.BR \.edu ", " \.gov ,
.BR \.mil ", " \.arpa ,
.BR \.nato ,
and the new
.I CORE
top-level domains
.BR \.firm ", " \.info ,
.BR \.shop ", " \.arts ,
.BR \.web ", "
.BR \.rec ", and " \.nom .
The length of a domain in the TLD file may not exceed 6\ characters.
Since
.B http-analyze
uses its built-in defaults if no TLD file is specified,
you rarely will need this directive.
.Ex TLDFile /usr/local/lib/http-analyze/TLD
.NE 7v
.TP 4
.BI TblFormat " tblname specifier"
Defines the layout of tables in the statistics report.
The argument
.I tblname
may be one of:
.sp .2v
.in +5n
.ta 12n
.vs +1p
.nf
\f(SCMonth\f1	for the statistics of the last 12 months (main page)
\f(SCDay\f1	for the daily statistics in the short and full summaries
\f(SCLoad\f1	for the average load by weekday, hour, minute, second
\f(SCCountry\f1	for the list of countries
\f(SCTopTen\f1	for all \f2Top N\f1 lists
\f(SCOverview\f1	for all overviews
\f(SCLists\f1	for all detailed lists (preformatted text)
\f(SCNotFound\f1	for the list of \f2NotFound\f1 responses
.fi
.vs
.in -5n
.sp .5v
The
.I specifier
string defines the items to be shown in the table:
.sp .2v
.in +5n
.ta 12n
.vs +1p
.nf
\f(SCn\f1, \f(SCN\f1	an index number or label (don't touch!)
\f(SCh\f1, \f(SCH\f1	the number of \f2hits\f1
\f(SCf\f1, \f(SCF\f1	the number of \f2files sent\f1
\f(SCc\f1, \f(SCC\f1	the number of \f2cached files\f1
\f(SCp\f1, \f(SCP\f1	the number of \f2pageviews\f1
\f(SCs\f1, \f(SCS\f1	the number of \f2sessions\f1
\f(SCk\f1, \f(SCK\f1	the amount of \f2data sent\f1 in Kbytes (integer value)
\f(SCB\f1	the amount of \f2data sent\f1 in bytes (float value)
\f(SCL\f1	a dynamically created label (don't touch!)
.fi
.vs
.in -5n
.sp .5v
If a format specifier is used in upper-case, the value displayed
in the report will include the percentage for this number.
.Ex TblFormat Month "n h f c p s k" 0
.Ex TblFormat Day "n H F C P S k" 1
.Ex TblFormat Country "N H F P S k L" 1
.)E
.TP 4
\f3Top\fP{\f3Days,Hours,Minutes,Seconds,URLs,Sites,Agents,Refers\fP}, \f3LeastURLs\fP
Defines the size of certain
.I "Top N"
tables and lists.
If set to zero, the corresponding list will be suppressed.
.Ex TopURLs 20 "" 0
.Ex LeastURLs 0 "" 1
.Ex TopDays 14 "" 1
.)E
.TP 4
.BI VirtualNames " vname,..."
The list of additional (\(d>virtual\(d<) names for this server
to be classified as
.IR "self-referrer URLs" .
The server's primary name (from
.BR ServerName " or " ServerURL )
is pre-defined already. If
.I vname
doesn't include a protocol specifier, two URLs with the \f(SChttp\f1
and the \f(SChttps\f1 \%protocol specifier will be added for each name.
Since self-referrers are suppressed from the list of referrer URLs,
the remaining entries give a good impression about external pages
referring to some document on your site.
.Ex VirtualNames www2.mycompany.com,mycompany.com "" 0
.Ex VirtualNames www.customer.com,customer.com "" 1
.Ex VirtualNames http://www.other.com,https://secure.other.com "" 1
.)E
.TP 4
.BI VRMLProlog " file"
The name of a prolog file for a yearly VRML model (same as option
.BR \-P ).
Pathnames not beginning with a `/' are relative to 
.BR OutputDir .
If a prolog file is given, an additional yearly model with all
12\ monthly models embedded as inlines is created.
See the section
.I "Output files"
for further information about this yearly model.
.Ex VRMLProlog 3Dprolog.wrl
.NE 10v
.SH "MULTI-NATIONAL LANGUAGE SUPPORT"
.P
.B http-analyze
supports
.I "Multi-National-Language-Support (MNLS)"
according to the
.I "X/Open Porta\%bility Guide (XPG4)"
and the
.IR "System V Interface Definition (SVR4)" .
For systems without MNLS support, a simple native implementation is used.
See the file \f(SCINSTALL\f1 included in the distribution for information
about installation of the appropriate MNLS support for your system.
The option
.B \-V
displays the type of MNLS support compiled into a binary.
.P
All text strings and messages of
.B http-analyze
are contained in a separate message catalog, which is read
at start-up of the program.
If a message catalog is installed in the system, you can select the
language to be used for warning messages and for the statistics
report by \%setting the appropriate
.IR locale .
This can be done by defining the \f(SCLANG\f1 (\f2XPG4/SVR4 MNLS\fP)
or the \f(SCHA_LANG\f1 (\f2native MNLS\fP) environment variable or
by using the option
.BR \-L .
When using
.BR \-L ,
the analyzer switches to the specified language when it has
recognized the option.
If no message catalog exists for the specified locale,
.B http-analyze
uses built-in messages in English.
.P
Certain languages require a specific character set to be used
by the browser when displaying the statistics report.
This can be defined using the option
.BR \-C " or the " CharSet
directive.
The following table summarizes the most common combinations
of languages and character sets.
Note that the name of the locale is system-specific (for example,
.B de
could be
.B de-iso8859
on some systems).
.P
.sp .2v
.in +5n
.ta 2i 3i
.nf
\f2Country\f1	\f2Locale\f1	\f2Encoding\f1
Standard C	\f(SCC\f1	\f(SCus-ascii\f1
Arabic Countries	\f(SCar\f1	\f(SCiso-8859-6\f1
Belarus	\f(SCbe\f1	\f(SCiso-8859-5\f1
Bulgaria	\f(SCbg\f1	\f(SCiso-8859-5\f1
Czech Republic	\f(SCcs\f1	\f(SCiso-8859-2\f1
Denmark	\f(SCda\f1	\f(SCiso-8859-1\f1
Germany	\f(SCde\f1	\f(SCiso-8859-1\f1
Greece	\f(SCel\f1	\f(SCiso-8859-7\f1
Spain	\f(SCes\f1	\f(SCiso-8859-1\f1
Mexico	\f(SCes_MX\f1	\f(SCiso-8859-1\f1
Finland	\f(SCfi\f1	\f(SCiso-8859-1\f1
France	\f(SCfr\f1	\f(SCiso-8859-1\f1
Switzerland	\f(SCfr_CH\f1	\f(SCiso-8859-1\f1
Croatia	\f(SChr\f1	\f(SCiso-8859-2\f1
Hungary	\f(SChu\f1	\f(SCiso-8859-2\f1
Iceland	\f(SCis\f1	\f(SCiso-8859-1\f1
Italy	\f(SCit\f1	\f(SCiso-8859-1\f1
Israel	\f(SCiw\f1	\f(SCiso-8859-8\f1
Japan	\f(SCja\f1	\f(SCShift_JIS\f1 or \f(SCiso-2022-jp\f1
Korea	\f(SCko\f1	\f(SCEUC-kr\f1 or \f(SCiso-2022-kr\f1
Netherlands	\f(SCnl\f1	\f(SCiso-8859-1\f1
Belgium	\f(SCnl_BE\f1	\f(SCiso-8859-1\f1
Norway	\f(SCno\f1	\f(SCiso-8859-1\f1
Poland	\f(SCpl\f1	\f(SCiso-8859-2\f1
Portugal	\f(SCpt\f1	\f(SCiso-8859-1\f1
Russia	\f(SCru\f1	\f(SCKOI8-R\f1 or \f(SCiso-8859-5\f1
Sweden	\f(SCsv\f1	\f(SCiso-8859-1\f1
Chinese	\f(SCzh\f1	\f(SCbig5\f1
.fi
.in -5n
.sp .7v
.P
Since the message catalogs are independent from the base software,
more languages may become available without having to re-compile
or re-install the software.
Please visit the homepage of
.B http-analyze
for up-to-date information about the available languages.
For more information about localization, see
.IR environ(5) " and " setlocale(3)
in the online manual.
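.P
For illustration, the following commands sketch the ways of selecting
the language described above, using German as an example.
The locale name
.BR de ,
the output directory and the logfile name are assumptions;
the actual locale name depends on your system.
.(P 4
# XPG4/SVR4 MNLS: select the locale through the environment
$ LANG=de http-analyze -m3f -o /usr/web/htdocs/stats access.01
.sp .5v
# native MNLS: use HA_LANG instead
$ HA_LANG=de http-analyze -m3f -o /usr/web/htdocs/stats access.01
.sp .5v
# or select language and character set with options
$ http-analyze -L de -C iso-8859-1 -m3f -o /usr/web/htdocs/stats access.01
.)P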
.NE 10v
.SH EXAMPLES
After successful compilation of
.B http-analyze
you can test-run the analyzer before installing it permanently.
Just create a subdirectory for the output files and run
.B http-analyze
on either one of the sample logfiles included in the distribution
(as shown below) or use your web server's logfile.
For example, to create a full statistics report including a frames-based
interface and a 3D VRML model in the subdirectory
.BR testd ,
use the following commands:
.(P 4
$ cd http-analyze2.4
$ mkdir testd
$ http-analyze -vm3f -o testd files/logfmt.elf
http-analyze 2.4 (IP22; IRIX 6.2; XPG4 MNLS; PNG)
Copyright 1999 by RENT-A-GURU(TM)
Generating full statistics in output directory `testd'
Reading data from `files/logfmt.elf'
Best blocksize for I/O is set to 64 KB
Hmm, looks like Extended Logfile Format (ELF)
Start new period at 01/Jan/1999
Creating VRML model for January 1999
Creating full statistics for January 1999
\&\.\.\. processing URLs
\&\.\.\. processing hostnames
\&\.\.\. processing user agents
\&\.\.\. processing referrer URLs
Total entries read: 8, processed: 8
Clear almost all counters at 03/Jan/1999
Start new period at 01/Feb/1999
No more hits since 02/Feb/1999
Creating VRML model for February 1999
Creating full statistics for February 1999
\&\.\.\. processing URLs
\&\.\.\. processing hostnames
\&\.\.\. processing user agents
\&\.\.\. processing referrer URLs
\&\.\.\. updating `www1999/index.html': last report is for February 1999
Total entries read: 3, processed: 3
Statistics complete until 28/Feb/1999
$ 
.)P
To view the statistics report, start your browser and open the file
.BR testd/index.html .
.P
For permanent installation of
.BR http-analyze ,
issue a \&\f(SCmake\ install\f1 to copy the required files into
the appropriate directory.
The executable is usually installed in \f(SC/usr/local/bin\f1,
while the required buttons and files are placed under
\f(SC/usr/local/lib/http-analyze\f1 unless this has been
changed by defining the
.B HA_LIBDIR
make macro during installation.
.P
Note that you do not need to install files in a new statistics output
directory anymore if they have been installed in
.BR HA_LIBDIR ;
this is now done automatically by
.B http-analyze
if it runs the first time on this output directory.
.P
Following are some more examples, which assume that the analyzer has
been installed permanently.
The first command processes an archived logfile
.I logYYYY/access.MM
from the server's log directory to create a report for January\ 1999
in the directory
.BR /usr/web/htdocs/stats :
.(P 4
$ cd /usr/ns-home/logs
$ http-analyze -vm3f -o /usr/web/htdocs/stats log1999/access.01
.)P
.P
The next command uncompresses the logfiles for a whole year and
feeds the data via a pipe into the analyzer, which then creates a
statistics report for this period.
All options are passed to the analyzer through a customized
configuration file specified with
.BR \-c :
.(P 4
$ gzcat log1998/access.[01]?.gz | http-analyze -c /usr/httpd/analyze.conf -
.)P
.NE 10v
.P
The following command creates a configuration file template with the name
.BR sample.conf .
Any additional options will be transformed into the appropriate directives
in the new configuration file.
In this example, the server's name specified with
.B \-S
is transformed into a
.B ServerName
directive and the output directory specified with
.B \-o
is transformed into a
.B OutputDir
directive.
All other directives are set to their respective default value.
To further customize any settings, use a standard text editor.
.(P 4
$ http-analyze -i sample.conf -S www.myserver.com -o /usr/web/htdocs/stats
.)P
.sp .7v
.P
To update an old configuration file into the new format while retaining
any old settings, specify its name when creating the new file.
Again, command line options may be used to alter certain settings;
they take precedence over definitions in the old configuration file.
The following command reads the file
.B oldfile.conf
and transforms its content into a new file named
.BR newfile.conf :
.(P 4
$ http-analyze -c oldfile.conf -i newfile.conf 
.)P
.NE 10v
.SS "REGULAR INVOCATION VIA CRON"
.P
Although
.B http-analyze
can be run manually to process logfiles, it usually is executed
automatically on a regular basis.  On Unix systems you use the
.I cron(1)
utility, while Windows systems provide a similar functionality with the
.I AT
command.
To have your statistics report updated automatically, use the following scheme:
.RS 4
.IP 1) 4
Install a cron job which calls
.B "http-analyze\ \-m3f"
to create a full statistics report once per hour or twice per day
depending on the processing load caused by analyzing the logfile.
Note that the full statistics report is created for the first time
on the second day of a new month.
.IP 2) 4
Optionally install a cron job which calls
.B "http-analyze\ \-d"
more often to create a short statistics report.
Although this will only update the
.I "Hits by day"
section of the report, the advantage of the short
statistics mode is that
.B http-analyze
needs only a fraction of the time required to create a full statistics report.
However, this is only needed if creating a full statistics report
takes more than 15\ minutes.
.IP 3) 4
Install a shell script which rotates (saves) the server's logfile,
restarts the web server, and then creates the final summary for
this period.  Have
.I cron
execute this script at 00:00 on the \f3first day\f1 of a new month.
See the script
.B rotate-httpd
for an example of how to do this for several virtual web servers at once.
.IP 4) 4
Because of delays in execution of the script which rotates the logfile,
heavily used servers sometimes write a few entries for the new month in
the old logfile.
.B http-analyze
usually detects and ignores such \(d>noise\(d< appearing at the end of
a logfile.
However, to initialize the files for the new month, you should run
.B "http-analyze \-m3f"
on the logfile for the current month immediately after the statistics
for the previous month have been generated.
.RE
.P
Note that all cron jobs must run with the user ID of the owner
of the output directory except for
.BR rotate-httpd ,
which must run with the user ID of the server user.
This is a sample
.IR crontab (1)
for the scheme described above:
.(P 4
# Generate a full statistics report twice per day at 01:17 and 13:17
17  1,13 * * *  /usr/local/bin/http-analyze -m3f -c /usr/httpd/analyze.conf
.sp .5v
# Generate a short statistics report hourly from 02:17 to 12:17 and from 14:17 to 23:17
17  2-12 * * *  /usr/local/bin/http-analyze -d -c /usr/httpd/analyze.conf
17 14-23 * * *  /usr/local/bin/http-analyze -d -c /usr/httpd/analyze.conf
.sp .5v
# Rotate the logfiles at the first day of a new month at 00:00
0 0 1 * *       /usr/local/bin/rotate-httpd
.)P
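.P
As a rough sketch of steps 3 and 4 above, a rotation script for a
single server might look like the following.
This is not the
.B rotate-httpd
script from the distribution; the logfile paths, the archive naming,
the hand-set period and the restart command are assumptions which
must be adapted to your web server.
.(P 4
#!/bin/sh
# Hypothetical paths; adapt them to your installation
LOGDIR=/usr/ns-home/logs
CONF=/usr/httpd/analyze.conf
# Year and month of the period which has just ended (set by hand here)
YEAR=1999
MONTH=01
.sp .5v
cd $LOGDIR || exit 1
mkdir -p log$YEAR
mv access log$YEAR/access.$MONTH
# Assumed restart method: have the server reopen its logfile
kill -HUP `cat httpd.pid`
.sp .5v
# Final summary for the old period, then initialize the new month
http-analyze -m3f -c $CONF log$YEAR/access.$MONTH
http-analyze -m3f -c $CONF access
.)P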
.SH "PERFORMANCE CONSIDERATIONS"
.P
The processing time needed to create full statistics reports depends
on many factors:
.RS 6
.IP \(bu 2
The size of the I/O buffer (reported by
.BR http-analyze " when " \-v
is given) should be as big as possible.
For example, a buffer size of 64\|KB can significantly reduce
disk activity when reading the logfile.
.PD 0
.IP \(bu 2
If many
.B Ign*
directives are defined, the analyzer must compare each logfile entry
against each entry in the corresponding
.B Ign*
list.
The recommended way to suppress certain parts of the web server in
the statistics report is to have the server not record any accesses
to those areas in the logfile.  Similarly, many
.B Hide*
directives may also require additional table lookups,
although this will happen only once for each unique
(different) URL, sitename, browser type or referrer URL.
.IP \(bu 2
If
.B StripCGI
is set to
.BR No ,
this will require more memory.
.IP \(bu 2
Some systems impose a memory limit on a per-process basis (see
.IR ulimit(1) " and " setrlimit(3) ).
There are no unusual requirements regarding main memory needed by
.B http-analyze
\- to be precise that means \(d>the bigger, the better\(d< \-,
but you should make sure that about 5-10\|MB is available for
processing of a medium-size logfile.
.PD
.RE
.SH "TROUBLESHOOTING"
.P
If you discover any problems using the analyzer you may find the verbose
mode helpful.  Each
.B \-v
option increases the verbosity level. In verbosity level 1,
.B http-analyze
comments on ongoing processing; in level 2 it indicates progress by
printing a dot for each new day discovered in the logfile.
In level 3, a debug message for each logfile entry parsed
successfully is printed and in level 4 an even more detailed
message appears on standard error.
Furthermore, compiling
.B http-analyze
without the macro
.I NDEBUG
includes various assertion checks in the executable.
.(P 2
$ http-analyze -vvvm3f -o testd files/logfmt.elf
http-analyze 2.4 (IP22; IRIX 6.2; XPG4 MNLS; PNG)
Copyright 1999 by RENT-A-GURU(TM)
Generating full statistics in output directory `testd'
Reading data from `files/logfmt.elf'
Best blocksize for I/O is set to 64 KB
Hmm, looks like Extended Logfile Format (ELF)
      1 01/Jan/1999:16:37:25 [298971279], req="GET /", sz=280 <- OK (Code 200), PAGEVIEW
Start new period at 01/Jan/1999
      2 01/Jan/1999:16:38:39 [298971355], req="GET /def/", sz=910 <- OK (Code 200), PAGEVIEW
      3 02/Jan/1999:16:39:39 [299060697], req="GET /abc/", sz=910 <- OK (Code 200), PAGEVIEW
\&\.\.\.
.)P
.P
.B "Filing bug-reports"
.P
If you want to file a bug report, use the option
.B \-X
to have
.B http-analyze
generate a URL of a bug reporting form with some information
already filled in.
You can pass this URL to your favourite browser using cut\|&\|paste
or \- on Unix systems \- using command substitution as in:
.(P 2
$ netscape `http-analyze -X`
.)P
.P
This addresses a bug report form on \f(SChttp://support.netstore.de/\f1
with the following information filled in already:
.RS 6
.IP \(bu 2
the customer's name as specified in the registration
.PD 0
.IP \(bu 2
the registration ID with licensing information (Personal/Commercial License)
.IP \(bu 2
the version number of
.B http-analyze
.IP \(bu 2
the platform the program was compiled for.
.PD
.RE
.P
Using this interface to submit bug reports will ensure proper handling
and timely response.
Please note that although we gladly accept bug reports from everyone,
only Commercial Service Licensees are entitled to request technical
assistance or open a support call.
.NE 10v
.SH "REGISTRATION"
.P
.B http-analyze
is available through our web site for evaluation purposes.
In the evaluation version an \(d>unregistered version\(d<
button will show up in the statistics report.
To replace this button with the Netstore\*R logo of the free version
for personal and educational use, just click on the \(d>unregistered
version\(d< button to follow the link to our online registration form
on our web site and register for a free, non-commercial version.
.SS "NON-COMMERCIAL VERSION"
.P
After registration you will receive a registration ID and two registration
images as replacements for the \(d>unregistered version\(d< button by email.
In the free version, the Netstore\*R logo, a copyright note and a link to
the homepage of
.B http-analyze
appear in the statistics report, which must be left intact according
to the license under which this software is made available to you.
.SS "COMMERCIAL VERSION"
.P
If you use
.B http-analyze
for commercial purposes such as providing statistics services for your
customers, you must buy a
.I "Commercial Service License"
available from RENT-A-GURU\*R and its authorized resellers.
You will receive a registration ID and two registration images as
replacements for the \(d>unregistered version\(d< button by email
from our office.
.P
In the commercial version, the Netstore\*R logo, the copyright note and
the link to the homepage of
.B http-analyze
are suppressed from the statistics report \- except for the logo and
copyright note, which appear only once on the main page and inside
the navigation frame. On all other pages, your company's name is shown.
Additionally, you can add your company's logo to the report using the
.BR Cust\%LogoW " and " Cust\%LogoB
directives in the configuration file, which are enabled in the
commercial version only.
Except for this feature and the individual support for Commercial Service
Licensees, both versions of the software have \%identical functionality.
.SS "BRANDING THE SOFTWARE"
.P
For all license types, you have to brand your copy of
.B http-analyze
with the registration ID and the registration images.
The registration ID may be set either in a system-wide file (usually
\f(SC/usr/local/lib/http-analyze/REGID\f1) or via the
.B RegInfo
directive in an analyzer configuration file.
The latter method requires specification of the configuration
file each time
.B http-analyze
is invoked.
If you create a system-wide registration file, the registration
information applies to all virtual servers being analyzed.
.P
To brand the software, detach the registration images
we sent to you from the email.
After detaching them, there should be two files
\f(SCfree-netstore_s[bw].png\f1 for the free version and
\f(SCcomm-netstore_s[bw].png\f1 for the commercial version.
Next, define the
.B HA_LIBDIR
environment variable if you chose another directory for the central
libdir rather than the default (\f(SC/usr/local/lib/http-analyze\f1).
For example, if you can't become
.IR root ,
you would choose a directory for which you have write permissions,
install the analyzer files there and then use the
.B HA_LIBDIR
variable to pass its name to
.BR http-analyze .
Finally, brand the software by executing the following command as root:
.(P 4
# http-analyze -r "\f2Customer Name\fP" \f2regID\fP \f2type\fP
Registration information saved in file `/usr/local/lib/http-analyze/REGID'
# 
.)P
where
.I "Customer Name"
is the name of the organization this license is registered for,
.I regID
is the registration ID of the license and
.I type
is either the keyword \f(SCfree\f1 or \f(SCcomm\f1 according to
the type of the license.
Now run the analyzer to have the new buttons appear in the statistics report.
.P
Note that running the analyzer the first time will install or update
any older buttons and files in the statistics output directory automatically;
there is no need to run some helper application as it was the case in
previous versions of
.BR http-analyze .
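.P
As a sketch of the non-root case described above, assuming the
analyzer files were installed in the hypothetical directory
\f(SC$HOME/lib/http-analyze\f1 and a free license is used,
the software could be branded as follows:
.(P 4
$ HA_LIBDIR=$HOME/lib/http-analyze
$ export HA_LIBDIR
$ http-analyze -r "\f2Customer Name\fP" \f2regID\fP free
.)P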
.NE 10v
.SH "YEAR 2000 COMPLIANCE"
.P
All versions 2.X and above of
.B http-analyze
are fully Year 2000 compliant.
There will be no problems with date-related functions after the
year 1999 as long as the operating system itself is Year\ 2000
compliant also.
Year 2000 compliant means that the software does not produce
errors in date-related data or calculations or experience loss of
functionality as a result of the transition to the year 2000.
This Year 2000 compliance statement is not a product warranty.
.B http-analyze
is provided under the terms of the license agreement included
in each distribution.
.P
Please see \f(SChttp://www.netstore.de/Supply/http-analyze/year2000.html\f1 for
more information about the Year 2000 compliance real-time tests we ran with
.BR http-analyze .
.sp .7v
.SS "DATE USAGE IN HTTP-ANALYZE"
.P
The analyzer depends on the timestamp found in the logfile entries
produced by a web server.
For the
.I "NCSA Common Logfile Format" " and the " "W3C Extended Logfile Format"
a Year 2000 compliant date format was chosen from the beginning.
This unique date format is \- and always was \- required by
.B http-analyze
to be able to generate a statistics report, so there are no problems
other than those caused by your Operating System (see below).
.P
To retain compatibility with previous versions of the log analyzer,
.B http-analyze
generates two-digit years in some output filenames.
However, those files are placed in a subdirectory \%containing the year in
four digits, which makes all output filenames fully Year 2000 compliant.
.P
The date format in the
.BR \-I " and " \-E
options allows specification of a year using only two digits.
.B http-analyze
interprets values greater than or equal to 69 as years from 1969 to 1999
and values lower than 69 as years from 2000 to 2068.
This way, the analyzer covers the whole range of the time
representation in modern Operating Systems.
However, any year can always be specified unambiguously by using four digits.
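.P
As a worked illustration of the rule above (the sample values are chosen
here for illustration only), two-digit years are interpreted as follows:
.(P 4
two digits    interpreted as
    00            2000
    68            2068
    69            1969
    99            1999
.)P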
.sp .7v
.SS "DATE USAGE IN THE OPERATING SYSTEM"
.P
Rumor has it that some systems don't recognize the Year\ 2000 as a
leap year.  Although
.B http-analyze
computes leap years for itself correctly, it maps dates into weekdays
using the
.I localtime(3)
function, which might fail if the OS doesn't recognize the Year 2000
as a leap year.
.P
Actually, there is a date-related function in modern operating systems
that may cause \%problems after the year 2037. For those interested in
the technical details, here's why:
.P
In operating systems the date is often represented in seconds since
a certain date. For example, in Unix systems the date is represented
as seconds since the birth of the OS on January 1st, 1970. This value
is stored in a
.I "signed long"
(4-byte) data object, so it can represent at most 2147483647 seconds,
which is roughly 35791394 minutes = 596523 hours = 24855 days = 68 years.
Therefore, most clocks in traditional Unix systems will overflow on
January 19th, 2038 if the OS is not updated before this date.
Since
.B http-analyze
uses several data structures depending on the operating system's idea
of the time (for example, the
.I tm_year
variable contains the years since 1900), the software also has to be
updated before the year 2038 in order to take advantage of the time
representation in future OS versions.
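.P
As a worked check of the overflow date, assuming a 32-bit signed counter
of seconds that starts at 00:00:00 UTC on January 1st, 1970:
.(P 4
largest value         2147483647 seconds
whole days            24855  (24855 * 86400 = 2147472000)
remainder             11647 seconds = 3 h 14 min 7 s
days 1970 to 2037     68 * 365 + 17 leap days = 24837
24855 - 24837         18 days after January 1st, 2038
.)P
More precisely, the last representable moment is therefore
03:14:07 UTC on January 19th, 2038.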
.NE 10v
.SH "ENVIRONMENT VARIABLES"
Environment variables might work only in the Unix version of
.BR http-analyze .
.P
.nf
.ie n \{\
.	ta 18n
\.	ta 18n\}
.el \{\
.	ta |1.4i
\.	ta |1.4i\}
\f(SBHA_LIBDIR\f1	name of the library directory (default: \&\s-1\f(SC/usr/local/lib/http-analyze\f1\s0)
\f(SBHA_CONFIG\f1	name of the configuration file for \f3http-analyze\f1 (no default)
\f(SBLANG\f1	language to use if XPG4 MNLS support is compiled in (see \f3\-V\fP)
\f(SBHA_LANG\f1	language to use if native MNLS support is compiled in (see \f3\-V\fP)
.br
.fi
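.P
A minimal sketch of setting these variables in a Bourne-compatible shell
before running the analyzer; both path names are hypothetical
placeholders, not defaults of the program:
.(P 4
$ HA_LIBDIR=/opt/http-analyze/lib
$ HA_CONFIG=/opt/http-analyze/analyze.conf
$ export HA_LIBDIR HA_CONFIG
.)P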
.SH FILES
.P
The following required files are installed in the library directory
as defined by the environment variable
.B HA_LIBDIR
or the hard-coded default defined at compile-time.
See also the section
.I "Statistics Report"
above for the names of the HTML output files.
.P
.nf
.ie n \{\
.	ta 18n
\.	ta 18n\}
.el \{\
.	ta |1.4i
\.	ta |1.4i\}
\f2btn/*.png\f1	button files used in the statistics report
.br
\f2TLD\f1	list of all top-level-domains
.br
\f2ha2.0_*.png\f1	\f3http-analyze\f1 logos for your web site (for black and white bg)
\f2logfmt.[cde]lf\f1	sample logfiles in CLF, DLF and ELF format
.br
\f2\&3D*\f1	required files for VRML model
.fi
.SH "SEE ALSO"
.nf
.ie n \{\
.	ta 18n
\.	ta 18n\}
.el \{\
.	ta |3i
\.	ta |3i\}
\f2rotate-httpd\f1	shell script to rotate the web server's logfiles
.br
\f2http://www.netstore.de/Supply/http-analyze/\f1	homepage of \f3http-analyze\f1
.br
\f2http://support.netstore.de/\f1	support site of \f3http-analyze\f1
.fi
.SH NOTES
.P
Logfile entries must be sorted in chronological order (ascending date)
when fed into the analyzer.
If
.B http-analyze
detects logfile entries from an older month between newer ones,
it prints a warning and skips all entries up to the date of the
last entry processed.
To sort the data from several different logfiles into a chronologically
sorted data stream, we provide a utility \f(SCha-sort\f1 to our Commercial
Service Licensees.
.P
To improve the response time of web servers, DNS lookups are often disabled.
In this case
.B http-analyze
does not see any hostname, but only numerical IP addresses.
To resolve the IP addresses into hostnames, we provide a very fast
DNS resolver \f(SCipresolve\f1 to our Commercial Service Licensees,
which does negative caching and saves all data in a history file.
.P
Please visit our support site at \f(SChttp://support.netstore.de/\f1
for more information about the available helper applications.
.SH COPYRIGHT
Copyright \(co 1996-1999 by Stefan Stapelberg, RENT-A-GURU\*R,
<stefan@rent-a-guru.de>
.P
Please see the file
.B LICENSE
included in the distribution for the license terms under which this
program is made available to you in the free, non-commercial version.
.P
.ps -2p
.vs -2p
RENT-A-GURU\*R is a registered trademark of Martin Weitzel,
Stefan Stapelberg, and Walter Mecky.
.br
Netstore\*R is a registered trademark of Stefan Stapelberg.
.vs
.ps
.SH CREDITS
.P
Thanks to the numerous users of
.B http-analyze
for their valuable feedback.
Special thanks to Lars-Owe Ivarsson for his suggestions to optimize
the parser algorithm and for the code he provided as an example.
Many thanks also to Thomas Boutell (\f(SChttp://www.boutell.com/\f1)
for his great GD library for fast image creation, without which
.B http-analyze
couldn't produce such fancy \%graphics in the statistics report.
