Friday, October 23, 2009

18.14 Web Page Access Logging




I l@ve RuBoard










18.14 Web Page Access Logging




18.14.1 Problem



You want
to know more about a page than just the number of times
it's been accessed, such as the time of access and
the host from which the request originated.





18.14.2 Solution



Maintain a hit log rather than a simple counter.





18.14.3 Discussion



The hitcount table records only the count for each
page registered in it. If you want to record other information about
page access, use a different approach. Suppose you want to track the
client host and time of
access for each request. In this case, you need a log for each page
rather than just a count. But you can still maintain the counts by
using a multiple-column index that combines the page path and an
AUTO_INCREMENT sequence column:



CREATE TABLE hitlog
(
path VARCHAR(255) BINARY NOT NULL,
hits BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
t TIMESTAMP,
host VARCHAR(64),
PRIMARY KEY (path,hits)
);


To insert new records, use this query:



INSERT INTO hitlog (path, host) VALUES(path_val,host_val);


For example, in a
JSP page, hits can be logged like
this:



<c:set var="host">
<%= request.getRemoteHost ( ) %>
</c:set>
<c:if test="${empty host}">
<c:set var="host">
<%= request.getRemoteAddr ( ) %>
</c:set>
</c:if>
<c:if test="${empty host}">
<c:set var="host">
UNKNOWN
</c:set>
</c:if>

<sql:update dataSource="${conn}">
INSERT INTO hitlog (path, host) VALUES(?,?)
<sql:param><%= request.getRequestURI ( ) %></sql:param>
<sql:param value="${host}" />
</sql:update>


The hitlog table has the following useful
properties:




  • Access times are recorded automatically in the
    TIMESTAMP column t when you
    insert new records.


  • By linking the path column to an
    AUTO_INCREMENT column hits, the
    counter values for a given page path increment automatically whenever
    you insert a new record for that path. The counters are maintained
    separately for each distinct path value. (For more
    information on how multiple-column sequences work, see Recipe 11.15.)


  • There's no need to check whether the counter for a
    page already exists, because you insert a new row each time you
    record a hit for a page, not just for the first hit.


  • If you want to determine the current counters for each page, select
    the record for each distinct path value that has
    the largest hits value:

    SELECT path, MAX(hits) FROM hitlog GROUP BY path;









    I l@ve RuBoard



    No comments: