# $Id: README,v 1.38 2001/04/23 15:21:32 jerome Exp $
ScanErrLog v2.00 - February 20th 2001
(C) Jerome Alet <alet@unice.fr> 2000-2001
You're welcome to redistribute this software under the
terms of the GNU General Public License version 2
or, at your option, any later version.

You can read the complete GNU GPL in the file COPYING
which should come along with this software, or visit
the Free Software Foundation's web site http://www.fsf.org

WARNINGS:
=========

Sample reports are not distributed anymore, because it's easy to test
ScanErrLog online at:

	http://cortex.unice.fr/~jerome/scanerrlog/TRY-ME.html

Since version 1.5, you now have to download the jaxml XML generation Python
module. You can download its latest version freely from:

	http://cortex.unice.fr/~jerome/jaxml/

You need at least jaxml-2.22.

To be able to produce the report in PDF format, you have to install
the ReportLab Python module. You can download it freely from:

	http://www.reportlab.com/

The latest official release of ReportLab, 1.06 at this time, works just fine,
but any older version may only work partially or not at all.

---------------------------------------------------------------
Nota Bene: Since 2.00 you don't need the jahtml module anymore.
---------------------------------------------------------------

COMPLETE INSTALL:
=================

WARNING: 1, 2, and 4 are now mandatory. 3 is optional.

1 - If you don't have Distutils installed (e.g. python version <= 1.5.2)
then first download it from:

	http://www.python.org/sigs/distutils-sig/

then follow the installation instructions for Distutils and install
it on your system.

2 - If you don't have the jaxml module installed, then download its
latest version from:

	http://cortex.unice.fr/~jerome/jaxml/

then follow the installation instructions for jaxml and install
it on your system.

3 - If you don't have the ReportLab module installed, and you want
to produce reports in PDF format, then download its latest version from:

	http://www.reportlab.com/

then follow the installation instructions for ReportLab and install
it on your system.

4 - Download the latest ScanErrLog version from:

	http://cortex.unice.fr/~jerome/scanerrlog/

Extract it:

	gzip -dc scanerrlog-x.xx.tar.gz | tar -xf -

	where x.xx is scanerrlog's latest version number.

Go to scanerrlog's directory:

	cd scanerrlog-x.xx

Just type:

	python setup.py install

You may need to be logged in with sufficient privileges (e.g. as root).

This will generally install scanerrlog.py in /usr/local/bin or
an equivalent path depending on your system.

If you want to launch ScanErrLog as a CGI script, please consider
looking at the ScanErrLog.html file included in this package to
see a sample HTML form to do it. Then you may want to copy
scanerrlog.py to your web server's cgi-bin directory and allow the
execution of python CGI scripts. Refer to your web server's
documentation for details.

You can launch scanerrlog.py directly from the command line, run it
as a CGI script, or import it in your own Python program and use
(or subclass) the ApacheErrorLog class it defines. In the latter case
make sure that scanerrlog.py is in your Python path before
importing it (e.g. do a sys.path.append('/usr/local/bin') before the
import scanerrlog).
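
As a minimal sketch of the import route, assuming scanerrlog.py was
installed in /usr/local/bin and that your Python interpreter is able to
load this release of the module. The constructor arguments of
ApacheErrorLog are not described in this README, so the sketch only
looks the class up and does not instantiate it:

```python
import sys

# Assumption: setup.py installed scanerrlog.py in /usr/local/bin.
INSTALL_DIR = "/usr/local/bin"
if INSTALL_DIR not in sys.path:
    sys.path.append(INSTALL_DIR)

try:
    import scanerrlog
except (ImportError, SyntaxError):
    # Module not found, or not loadable by this interpreter.
    scanerrlog = None

# ApacheErrorLog is the class named in this README; its constructor
# signature is not shown here, so we only fetch the class object.
ApacheErrorLog = getattr(scanerrlog, "ApacheErrorLog", None)
print("ApacheErrorLog available:", ApacheErrorLog is not None)
```

Read the class docstrings in scanerrlog.py itself before subclassing.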

You can test ScanErrLog online at:

	http://cortex.unice.fr/~jerome/scanerrlog/TRY-ME.html

Voilà!

HINTS:
======

Producing the same report in different formats is now quicker than
before, thanks to the --continue option:

    * launch ScanErrLog on your error_log file with the --continue option.
    * then for each new format you want of the same report, just
      launch ScanErrLog with the --continue option on an empty file
      in the same directory as the error_log file.

This makes ScanErrLog parse the error_log file only once while producing
the same report in as many formats as you want, which saves processing
time and CPU. Note however that, because the QuickSort algorithm is used,
messages with the same number of occurrences may be ordered differently
from one pass to another.
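
The two steps above could be scripted roughly as follows. This is only
a sketch that builds the command lines without running them: the
--continue, --format and --outputfile options are the ones documented
below, while the script location, the name of the empty file and the
report.* output names are assumptions chosen for illustration:

```python
def continue_commands(error_log, empty_file, formats,
                      script="scanerrlog.py"):
    """Build one ScanErrLog command line per report format.

    The first command parses error_log; the later ones are pointed at
    an empty file in the same directory, so that only the statistics
    saved by --continue are reused.
    """
    commands = []
    for index, fmt in enumerate(formats):
        source = error_log if index == 0 else empty_file
        commands.append([script, "--continue", "--format", fmt,
                         "--outputfile", "report." + fmt, source])
    return commands


# Hypothetical paths, for illustration only.
cmds = continue_commands("/var/log/httpd/error_log",
                         "/var/log/httpd/empty",
                         ["html", "text"])
for cmd in cmds:
    print(" ".join(cmd))
```

Each list can then be handed to subprocess.call() or pasted into a
shell script.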

DOCUMENTATION:
==============

ScanErrLog v2.00 (C) 2000 Free Software Foundation

This Python module allows people to parse Apache error_log files from
one of several possible sources (filename, stdin, Python file object),
and present their data in decreasing order of the number of occurrences
of each error message.
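
To illustrate what "decreasing number of occurrences" means, here is a
small sketch that is not taken from ScanErrLog's own code. The sample
messages are made up, and the alphabetical tie-break is an addition
that keeps the order reproducible between runs:

```python
from collections import Counter


def rank_messages(messages):
    """Return (message, count) pairs, most frequent first.

    Messages with the same count are ordered alphabetically.
    """
    counts = Counter(messages)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Made-up sample messages.
sample = ["File does not exist", "Premature end of script headers",
          "File does not exist", "Invalid URI in request"]
ranking = rank_messages(sample)
for message, count in ranking:
    print(count, message)
```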

This is particularly useful if you want to quickly solve the most
annoying problems web surfers encounter visiting your site.

If you run this module directly, it will parse each file whose name was
passed on the command line.

If you don't pass any argument on the command line, then scanerrlog will
read an error_log from stdin if you've piped some file or command to its
standard input, or it will print its documentation if you've not.
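
The choice between these three behaviours can be pictured with the
sketch below. It is an assumption about how such a decision could be
made (testing whether standard input is a terminal); the actual test
used inside scanerrlog.py may differ, and choose_source is a
hypothetical helper name:

```python
def choose_source(filenames, stdin_is_terminal):
    """Decide where the error_log data comes from.

    filenames is the list of file names left on the command line once
    the options have been removed.
    """
    if filenames:
        return ("files", list(filenames))
    if not stdin_is_terminal:
        # Something was piped or redirected to standard input.
        return ("stdin", None)
    return ("help", None)


# In a real script one would pass sys.stdin.isatty() as second argument.
print(choose_source(["/var/log/httpd/error_log"], True))
print(choose_source([], False))
print(choose_source([], True))
```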

You can also use it as a CGI script, but you'll not be able to
modify the pattern and outputfile used, and the input filename
should not begin with / or contain .. in its name, all for
security reasons. The names you may use for your CGI variables
are: continue, date, withoutheader, title, limit, exclude, format and
inputfile.
If continue, date or withoutheader exist in your form, these options
will be set to TRUE whatever value they have. See ScanErrLog.html for
a sample form to launch ScanErrLog as a CGI script.
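
The CGI rules above can be summarised by the following sketch. It is
not ScanErrLog's own code: read_form and is_safe_inputfile are
hypothetical helpers, and only the variable names and the two filename
restrictions come from this documentation:

```python
from urllib.parse import parse_qs

# CGI variable names listed in this documentation.
FLAG_NAMES = ("continue", "date", "withoutheader")
VALUE_NAMES = ("title", "limit", "exclude", "format", "inputfile")


def is_safe_inputfile(name):
    """Reject names that begin with / or contain .. anywhere."""
    return not name.startswith("/") and ".." not in name


def read_form(query_string):
    """Turn a query string into a dictionary of options."""
    fields = parse_qs(query_string, keep_blank_values=True)
    # A flag is TRUE as soon as it is present, whatever its value.
    options = {name: name in fields for name in FLAG_NAMES}
    for name in VALUE_NAMES:
        if name in fields:
            options[name] = fields[name][0]
    inputfile = options.get("inputfile")
    if inputfile is not None and not is_safe_inputfile(inputfile):
        raise ValueError("unsafe inputfile: %r" % inputfile)
    return options


form = read_form("continue=0&format=text&inputfile=logs/error_log")
print(form)
```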


e.g.:

    ./scanerrlog.py

prints scanerrlog's documentation (what you are reading now)


    ./scanerrlog.py /var/log/httpd/error_log /var/log/httpd/error_log.1

will read data from the specified files.


    ./scanerrlog.py </var/log/httpd/error_log

will read data from standard input

You can pass some options on the command line:

options:
	-c | --continue		useful if you want to parse the same file
				many times (e.g. every week): the current
				state and statistics of the file are saved
				in a file named ScanErrLog.stats in the
				same directory, so you don't have to reparse
				the beginning of the file each time. You
				should use this option either to tell
				ScanErrLog to save the statistics or to reuse
				the saved ones.
				Without this option the file is completely
				parsed again, even if you've got an old
				statistics file saved in the same directory.
				WARNING: this option is incompatible with
				the parsing of multiple files.
	-d | --date		include in the final report the date when
				each message appeared for the last time.
				this option is mutually exclusive with
				the --pattern option.
	-e | --exclude e	e is a slash-separated list of
				message severities. All messages with
				a severity listed in e are excluded
				from the final report. By default all
				messages are included. For example,
				e can be: info/debug to exclude all
				messages whose severity is info or
				debug.
	-f | --format f		output format for the report, f can be
				any of:
				    'html', 'pdf', 'text', 'xml'
				the default format is 'html'.
	-h | --help		displays this help screen.
	-l | --limit lim	selects messages only if their number of
				occurrences equals or exceeds lim.
				lim's default value is 1, meaning all
				messages are included in the final report.
	-n | --nocumulate	don't accumulate counts across all the files
				passed on the command line. the behaviour of
				the old -c | --cumulate option is now the
				default. if the -o option described below is
				not used, then -n implies -w because all
				reports will be written to the same file
				(stdout).
	-o | --outputfile f	save the report in the file f.
				if -n is used, then the filename will
				be n.f where n is an integer incremented
				for each new file and starting at 1.
	-p | --pattern regexp	select only the lines which match regexp.
				the default regexp is:
	^(httpd: |\B)\[([^\[\]]+)\] \[([^\[\]]+)\] (?:\[([^\[\]]+)\] )?
				which selects all Apache logged messages,
				but not errors from CGI scripts for example.
				to work correctly, your regexp should consume
				all characters from the beginning of the
				error line up to the beginning of the real
				error message.
				this option is mutually exclusive with
				the --date option.
	-t | --title t		sets the report title.
	-v | --version		displays ScanErrLog's version number.
	-w | --withoutheader	suppress the header of the HTML report.
				useful if you want to include the report
				directly into another HTML document.
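
To see how the default --pattern, --exclude and --limit fit together,
here is a rough sketch. It is not ScanErrLog's implementation: the
sample log lines are made up, select_messages is a hypothetical helper,
and reading the second bracketed field as the severity is an assumption
based on the usual Apache error_log layout. The text that follows the
part consumed by the regexp is taken as the real error message:

```python
import re
from collections import Counter

# Default pattern quoted in the option list above.
DEFAULT_PATTERN = re.compile(
    r"^(httpd: |\B)\[([^\[\]]+)\] \[([^\[\]]+)\] (?:\[([^\[\]]+)\] )?")


def select_messages(lines, exclude="", limit=1):
    """Count messages, applying --exclude and --limit style values."""
    excluded = set(filter(None, exclude.split("/")))
    counts = Counter()
    for line in lines:
        match = DEFAULT_PATTERN.match(line)
        if match is None:
            continue  # e.g. raw output from a CGI script
        severity = match.group(3)  # assumed: second bracketed field
        if severity in excluded:
            continue
        counts[line[match.end():].strip()] += 1
    return {msg: n for msg, n in counts.items() if n >= limit}


# Made-up sample lines.
log = [
    "[Mon Apr 23 15:21:32 2001] [error] [client 127.0.0.1] "
    "File does not exist: /var/www/favicon.ico",
    "[Mon Apr 23 15:22:10 2001] [error] [client 127.0.0.1] "
    "File does not exist: /var/www/favicon.ico",
    "[Mon Apr 23 15:23:00 2001] [info] Server built: Apr 1 2001",
    "some text printed by a CGI script",
]
print(select_messages(log))
print(select_messages(log, exclude="info/debug", limit=2))
```

If you supply your own regexp with --pattern, the same rule applies: it
has to consume everything up to the beginning of the real message.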


Warning: some options may not work with all report formats.

A fifth possibility is to import this module into another python
program and use the ApacheErrorLog class it defines.

ScanErrLog comes with ABSOLUTELY NO WARRANTY.
This is free software, and you are welcome to redistribute it under
certain conditions; refer to the GNU General Public License for details.
You'll find the GNU GPL in the file COPYING which should have come along
with this software or at http://www.gnu.org

Please e-mail bugs to: alet@unice.fr (Jerome Alet)
