# $Id: README,v 2002/06/12 11:59:42 doug Exp $

httpdstats is a Perl program that generates simple statistical summaries of Apache access log files. It is intended for small, infrequently visited web servers, where a daily summary of the access log is all that is needed.


As shipped in the tarball, httpdstats assumes that perl is in /usr/bin/perl. You may need to modify the first line of the program if perl is elsewhere. httpdstats also assumes that the Getopt::Long module is available.
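
Before installing, it may be worth checking that the perl you intend to use can actually load Getopt::Long. The following is a minimal sketch, not part of httpdstats; module_available() is a hypothetical helper name introduced only for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return 1 if the named module can be loaded, 0 otherwise.
# module_available() is a hypothetical helper, not part of httpdstats.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Getopt::Long -> Getopt/Long.pm
    return eval { require $file; 1 } ? 1 : 0;
}

print "Getopt::Long is ",
    (module_available('Getopt::Long') ? "available" : "missing"), "\n";
```

If perl lives somewhere other than /usr/bin/perl, run the check with that interpreter so the result reflects the perl that httpdstats will use.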

To make httpdstats, type 'make'.

To install httpdstats, type 'make install'.

The cron job is placed in the /etc/cron.daily directory; i.e., anacron is assumed. Alternatively, you may wish to add an entry to root's crontab instead.

Using httpdstats with logrotate

The httpdstats cron job assumes that you are using logrotate to rotate the httpd access_log every night. This is not the default behavior of logrotate, which normally rotates weekly or monthly. To change this, add the line


to the /var/log/httpd/access_log entry in /etc/logrotate.d/apache.



        URLs for both requests and referrers can be presented
        without arguments and restricted to a maximum size.
        The agent can be restricted to a simple name.
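
The three options above can be pictured with the sketch below. It is an illustration only, written under stated assumptions: shorten_url() and simple_agent() are hypothetical names, "arguments" is taken to mean the query string after '?', and "simple name" is taken to mean the leading product token of the User-Agent string. The actual httpdstats code may work differently.

```perl
use strict;
use warnings;

# Drop the query string ("arguments") and cap the length of a URL.
# Hypothetical helper; $max is the maximum size in characters.
sub shorten_url {
    my ($url, $max) = @_;
    $url =~ s/\?.*//s;    # remove everything from the first '?'
    $url = substr($url, 0, $max) if defined $max && length($url) > $max;
    return $url;
}

# Reduce a User-Agent string to its leading product name.
# Hypothetical helper; falls back to the full string if nothing matches.
sub simple_agent {
    my ($agent) = @_;
    return $agent =~ m{^([^/\s(;]+)} ? $1 : $agent;
}

print shorten_url('/cgi-bin/search?q=perl&page=2', 20), "\n";
print simple_agent('Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)'), "\n";
```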


        Can now read the standard "combined" log format, as well as the 
                "common" log format
        Two new partitions: referrer and agent


        Reports are now done through a roll-your-own description format,
                rather than the restricted set allowed before.
        Time partitioning of requests added.
        Protocol partitioning of requests added.
        Host name lookup allowed.
        Host names sort in a natural domain order.
        Escape sequences in URLs are de-escaped.
        Ignore domains can handle IP addresses, as well as host names.
        Thanks to Jon Masters for pointing out
                a bug to do with spaces in URLs.
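
Two of the items above, de-escaping of URLs and domain-order sorting of host names, can be sketched as follows. This is one plausible reading and not the httpdstats implementation: unescape_url(), domain_key() and sort_hosts() are hypothetical names, and "natural domain order" is assumed to mean comparing names from the top-level domain leftwards.

```perl
use strict;
use warnings;

# Turn %XX escape sequences back into the characters they stand for.
sub unescape_url {
    my ($url) = @_;
    $url =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $url;
}

# Build a sort key by reversing the labels:
# www.example.com -> com.example.www
sub domain_key {
    my ($host) = @_;
    return join '.', reverse split /\./, lc $host;
}

# Sort host names so that hosts in the same domain end up together.
sub sort_hosts {
    return sort { domain_key($a) cmp domain_key($b) } @_;
}

print unescape_url('/my%20docs/a%2Bb.html'), "\n";
print join(' ', sort_hosts(qw(www.example.org mail.example.com www.example.com))), "\n";
```

Sorting on the reversed name keeps all hosts of one domain adjacent in a report, which a plain string sort on the full host name would not do.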


        Added requests-total and bytes-total summaries.
        Added help and subject command line switches.
        Added ignore-domains option for more accurate reporting.
        Fixed a bug where config file reading conflicted with the command line.
        All output options default to 'no', rather than the original odd mix.
        Thanks to Tim Gurney for suggestions.

