Operating W3Olista with Custom URLs or from the Command Line

Not everyone wants to browse their statistics interactively, and some sysadmins do not want to hand their users the full processing power that interactive browsing makes available. And of course you will want to link to up-to-date statistics from your pages, without your visitors having to enter all the search criteria into the Query Form themselves.

There are two ways to do this. First, you can link to a specially constructed URL, and second, you can have the reports created offline from the command line, writing W3Olista's output to a file where it can then be accessed as usual.

If you have not already read the Introduction to W3Olista, please do so before proceeding with this text. The Intro will tell you some details you must know before you can go on.

Custom URLs

A Custom URL is a specially constructed URL where the commands to W3Olista are embedded in the Extra Pathname Information. You simply select all needed commands, separate them with slashes, and add this information to the basic URL of W3Olista. See below for all the commands that the program accepts.
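
The construction described above can be sketched in a few lines of shell. This is only an illustration: the base URL is an assumption (replace it with the address under which W3Olista is reachable on your server), and build_url is a hypothetical helper, not part of W3Olista.

```shell
#!/bin/sh
# Assumed base URL of the program; replace with your own server and path.
base="http://www.example.org/cgi-bin/olista"

# Hypothetical helper: append every command to the base URL,
# separating the commands with slashes.
build_url() {
    url="$base"
    for cmd in "$@"; do
        url="$url/$cmd"
    done
    printf '%s\n' "$url"
}

url=$(build_url html Report=Yesterday HostList=Sum FileList=Yes)
echo "$url"
```

The resulting string is what you would put into a link on your pages.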

The Command Line

On the command line, you give all the commands as parameters to the program, separated by spaces. W3Olista will print its results to standard output, so you will have to redirect its output to a file using the '>' operator. See below for all the commands you can give W3Olista.

Considerations

You may ask yourself why there are two ways to create the same report. The answer is simple: Custom URLs burn much more CPU power than offline reports. If you put a Custom URL onto your pages, W3Olista will scan the logfiles every time the statistics are requested. But if you create the report once a day and write it to a file, the logs are scanned only once. In addition, Custom URLs take time, and many users won't want to wait a couple of minutes to access your statistics.

Even if you are on a fast machine, and you have daily logfiles, I recommend using Custom URLs only in rare cases, like limited statistics for a single day. For more sophisticated statistics, you're much better off with offline reports from the command line.
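
A once-a-day offline report might be produced by a small script like the following. It is a sketch under assumptions: the output path is made up, and writing to a scratch file before renaming it is my own addition, so that visitors never see a half-written report while the logs are being scanned.

```shell
#!/bin/sh
# Sketch of a once-a-day report job. Paths are assumptions; adjust them.
OLISTA=${OLISTA:-olista}        # program assumed to be in the search path
out=${OUT:-./Statistik.html}    # final report file that your pages link to
tmp="$out.tmp.$$"               # scratch file next to the final report

# Scan the logs once, writing to the scratch file first.
if "$OLISTA" html Report=Yesterday HostList=Sum FileList=Yes > "$tmp"; then
    mv "$tmp" "$out"            # publish only a complete report
else
    echo "report generation failed; previous report left untouched" >&2
    rm -f "$tmp"
fi
```

Run from cron, this scans the logs once per day no matter how often the report is viewed.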

Command Reference

This is a listing of all the commands you can give W3Olista, either as part of the Custom URL or on the command line. Some commands are only recognized in some contexts, so please read carefully. Commands are case-insensitive except where noted. Most commands are given as an equation of the form item=value. There are only a few commands that don't take a value; you already know some of these exceptions: cgi, html and form.

First Section: Required

At least one of these commands must appear in order to make the program do something (at least something different than producing an error message).

Listings

You can choose any combination of HostList, FileList and DateList with a single request, but the Entry Listing is exclusive.
HostList=<switch>
Selects whether or not to print a listing of the hosts that have accessed your server. <switch> can have the following values:
FileList=<switch>
Basically the same as the HostList. Selects whether or not to print a directory tree of the accessed files. <switch> can have the following values:
DateList=<switch>
Again similar to HostList. Selects whether or not to print a listing of the accesses by date. <switch> can take the following values:
EntryList=<switch>
This one's a little different from the other three above. This setting selects whether or not to print the individual logfile entries. <switch> can take the following values:

Timespan to Report

You should also tell the program which timespan to report on. If you don't, the report covers today only. This is done with three commands:
Report=<date>
Selects the date(s) on which to produce the report. <date> can have the following values:
FromDate=<startdate>
This command only has a meaning if Report=Dates. It sets the starting date of a report (inclusive). <startdate> is accepted in a variety of formats: If you don't specify a final date at which to stop reporting with the following command, only this single day is reported.
ToDate=<enddate>
Only has a meaning if Report=Dates. It sets the date at which to stop reporting (inclusive, the given date is reported). If you don't supply this command, then FromDate is copied, and just the single day is reported. <enddate> accepts the same date formats as above.
Example: Report=Last7Days is equivalent to Report=Dates FromDate=-7 ToDate=-1.

Second Section: Useful

Well, there are defaults for these values, but you really should set at least some of them.

Third Section: Advanced Settings

Fourth Section: Hostname Resolution

ResolveAddr=<switch>
If set to Yes, the program will try to resolve all numerical IP addresses found in the log file (though if you're analyzing proxy log files, numerical addresses in URLs are not resolved). This feature must be enabled at compilation with the RESOLVEADDR setting in the Makefile.
All following directives in this section are only effective if this parameter is Yes.
ResolveCacheFile=<file>
Gives the full path name of a file featuring address/hostname pairs in a format similar to /etc/hosts. This file will be scanned for unknown IP addresses before bothering your DNS server (being much faster, of course). The idea is to keep resolved addresses (and also addresses we know to be unresolvable) across program runs. This file must exist at program start, but may be empty (to start, create an empty file with touch filename).
ResolveCacheFileReadOnly=<switch>
Usually the program will add the results of its lookups to the given cache file. This is prevented if you set this parameter to No; then the cache file will be opened read-only. Only one instance of W3Olista may write to the cache file, so if you want to run multiple instances simultaneously, all but one must have this parameter set to No.
ResolveCacheLogIP=<switch>
Many IP addresses that appear unresolved in log files aren't resolvable at all, and we can save much time if we also remember our lookup failures. So the default behaviour is to cache unresolvable addresses in the cache file, too, unless you prevent this by setting this parameter to No.
The trouble with caching unresolvable addresses is that this may only be a temporary state, or that the DNS registration simply hasn't found its way through the net yet (not to mention temporarily unreachable servers which know the right name). So a utility is provided with which you can delete all unresolved addresses from the cache from time to time.
ResolveCacheLookUp=<switch>
This parameter defaults to Yes. If you set it to No, the program will only try to look up the IP address in its cache file, but if that fails, it doesn't do the 'real' lookup. So with this parameter being No, you could run the program on a machine which is not connected to the net, using a cache file created somewhere else.
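
As a rough sketch of how the cache settings fit together, the following prepares a cache file and assembles the arguments for a run that consults only the cache. The cache location is an assumption, and the script prints the command line instead of running it, so it can be tried on a machine where the program is not installed.

```shell
#!/bin/sh
# Hypothetical cache location; the program expects a full path name.
cache=${CACHE:-$PWD/olista-resolve.cache}

# The cache file must exist at program start. touch creates it empty
# if it is missing and leaves existing contents alone.
touch "$cache"

# Arguments for a run that resolves addresses through the cache only,
# without asking the DNS server (ResolveCacheLookUp=No).
set -- html Report=Yesterday HostList=Yes \
       ResolveAddr=Yes "ResolveCacheFile=$cache" ResolveCacheLookUp=No

# Print the command line rather than executing it.
echo olista "$@"
```

To run the report for real, replace the final echo with the actual call and redirect its output to a file.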

Examples

These are a few examples that should help you digest the dry explanations above.

The Command Line

This is a simple invocation of W3Olista from the command line. We assume that the program is globally accessible (somewhere in your search path).
  olista html Report=Yesterday HostList=Sum FileList=Yes > Statistik.html
This command creates statistics on yesterday, including a host summary and the full directory tree. The results are then written to Statistik.html in the current directory. A little more complex is
  olista html Report=Dates FromDate=-10 ToDate=-1 FileList=Yes Link=FileLink Server=www.uni-frankfurt.de > MoreStats.html
This produces statistics on the last ten days (up to yesterday). A full directory tree is printed, where each item is a link to the real page on the given HTTP server.
You still need something more complex? Well, then try to cope with two entries from my crontab file (if you don't know about cron, just ignore the first few columns).
15  2   *   *   *   rm -f $HOME/WWW/StatLastWeek.html ; $HOME/c/olista/olista html Report=Last7Days infile001=/~fp exhost001=rbi.informatik.uni-frankfurt.de Sort=by-alpha Link=FileLink HostList=Yes HostDetail=1 FileList=Yes DateList=Summary Server=www.uni-frankfurt.de > $HOME/WWW/StatLastWeek.html
30  *   *   *   0   rm -f $HOME/WWW/EverythingFile.html ; $HOME/c/olista/olista html Report=Everything infile001=/~fp exhost001=rbi.informatik.uni-frankfurt.de Sort=by-alpha Link=FileLink HostList=Yes HostDetail=1 FileList=Yes DateList=No Server=www.uni-frankfurt.de > $HOME/WWW/EverythingFile.html

Examples of Custom URLs

These are some examples of Custom URLs. We assume that /~fp/cgi/olista is the redirection rule for the program, and that your server is www.uni-frankfurt.de, which runs on Port 83. What you see here are real URLs that point to our server, but they're linked to documents prepared in advance. Please don't try to access the live document, since our server is quite heavily loaded.
http://www.informatik.uni-frankfurt.de/~fp/cgi/olista/html/Report=Yesterday/HostList=Sum/FileList=Yes/Link=NoLink/CountUnique=Yes/exhost001=uni-frankfurt.de/infile001=$~fp
This produces a host summary listing and a complete directory tree of the accesses to my own pages (/~fp), excluding all requests from our own domain (uni-frankfurt.de).
http://www.uni-frankfurt.de/~fp/cgi/olista/html/Report=Yesterday/HostList=Sum/FileList=Yes/Link=FileLink/ServerPort=80/CountUnique=Yes/exhost001=uni-frankfurt.de/infile001=$~fp
This is nearly the same as above, but with each entry linking to the real page. Our document server runs on a different port than the script server, hence I have to give the port number manually.


Frank Pilhofer <fp -AT- fpx.de>
Last modified: Tue Nov 7 16:45:24 1995