Understanding Hits, Page Views and User Sessions
When analyzing web site traffic, there are a number of measures
that are used to report on activity and volume of visitors.
Because of this, there is often confusion as to what the various
measures actually mean, and how they are calculated. This
document explains the key methods for measuring traffic to
a web site, what the differences are, and the associated statistics
that appear in WebTrends reports.
When reading this document, it is important to have a high-level
understanding of how web sites deliver information to
a browser. When a user clicks on a link, or types in a URL
in the address line in their browser, they are actually sending
a request to a server to send specific information, contained
on a page. Part of this request is the IP address (or return
address) of the user's computer, so the server knows where
to send the page. The page may contain various elements, or
files, such as HTML text, graphic images (.gif, .jpg, .bmp),
and audio or video files. As the server responds to
this request, it writes a summary of the action into a log
file. WebTrends products read these log files, and analyze,
summarize and report on the contents in an easy to understand
manner.
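As a rough illustration of what one such log entry can hold, the sketch below parses a single line in the NCSA Common Log Format. The sample line, the pattern and the parse_line name are assumptions made for this example; the real layout depends on how the server is configured, and this is not how WebTrends itself reads logs.

```python
import re
from datetime import datetime

# Assumed layout: NCSA Common Log Format
# host ident user [time] "method path protocol" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    fields["time"] = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    return fields

# Hypothetical sample entry, invented for illustration.
sample = '192.0.2.10 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'
record = parse_line(sample)
print(record["host"], record["path"], record["status"])
```

The host field is the requesting IP address, the path is the file that was requested, and the status code records whether the request succeeded.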
Methods for Measuring Web Site Activity
The three most common measurements of web site activity are
hits, page views and user sessions. Following is a description
of each.
Total Hits is the total number of files that are requested
from the server. This includes all graphics, audio/video files,
and other supporting files, as well as the actual html page
itself. Total Hits includes all requests in the count whether
or not the files were successfully retrieved. Total Successful
Hits, on the other hand, are only those files that were successfully
served.
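The difference between the two counts can be sketched in a few lines. The request data below is invented, and treating 2xx and 3xx status codes as "successfully served" is an assumption for this example, not a documented WebTrends rule.

```python
# Each tuple is (requested file, HTTP status code); invented for illustration.
log_requests = [
    ("/index.html", 200),
    ("/logo.gif", 200),
    ("/missing.jpg", 404),
    ("/banner.gif", 304),
]

def is_successful(status):
    # Assumption: 2xx and 3xx responses count as "successfully served".
    return 200 <= status < 400

# Total Hits counts every request; Total Successful Hits only the served ones.
total_hits = len(log_requests)
successful_hits = sum(1 for _, status in log_requests if is_successful(status))
print(total_hits, successful_hits)
```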
Page Views, or Page Impressions, is the number of pages
viewed. Pages are files with extensions such as .htm, .html,
.asp and a few others. (With WebTrends, you can see and
edit the full list by clicking Options | Web Log Analysis
| File Types, and then Document File Extensions.) Impressions,
therefore, are a count of the number of pages viewed and do
not include the supporting graphic files. Thus, by definition,
you should have more total hits than page views. For instance,
if a site has 1 web page with 5 graphics on it, every time
a user visited that page, it would be reported that 6 hits
and 1 page view or impression occurred.
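The example above, one page with 5 graphics, can be worked through in a short sketch. The extension list is only the subset named in this document, and the file names and the count_hits_and_page_views helper are hypothetical.

```python
import os

# Assumed subset of "document" extensions; this document names .htm, .html and .asp.
PAGE_EXTENSIONS = {".htm", ".html", ".asp"}

def count_hits_and_page_views(paths):
    """Every requested file is a hit; only files with a page extension are page views."""
    hits = len(paths)
    page_views = sum(
        1 for path in paths
        if os.path.splitext(path)[1].lower() in PAGE_EXTENSIONS
    )
    return hits, page_views

# One visit to a page that carries 5 graphics (hypothetical file names).
one_visit = ["/home.html"] + [f"/img/pic{i}.gif" for i in range(1, 6)]
hits, page_views = count_hits_and_page_views(one_visit)
print(hits, page_views)
```

For this single visit the sketch reports 6 hits and 1 page view, matching the figures in the text.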
User Sessions is the number of unique users who visited
a web site during a certain time period. Measuring user sessions
is more complicated than measuring hits or page views. The
user session statistic can be seen as equivalent to "Unique
Visits," which, unless every visitor sees only one page,
will be lower than the number of page views/impressions.
Methods for Counting User Sessions
The most accurate way to count user sessions is for the site
to require that every visitor use a unique username/password
combination before entering the site. This would ensure that
the log file contained information that uniquely identified
every user. WebTrends uses this information in the "authenticated
user" tables.
Obviously, requiring every visitor to have a username and
password is not going to be viable in every situation. Therefore,
many web sites use cookies to uniquely identify their visitors.
Cookies are small pieces of data stored on the hard
drive of the client (or requesting) computer that contain
information that identifies the computer to the server. There
are problems with using cookies, however, when trying to track
unique user sessions. First, some people may refuse to accept
cookies. Second, cookies can be erased from the client hard
drive. This could result in double counting unique visitors
during a period if the visitor deleted her cookie between
visits. Finally, there is no way to know if the client computer
is shared by several different visitors.
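A minimal sketch of cookie-based identification follows, using Python's standard http.cookies module. The cookie name visitor_id and the read_or_assign_visitor_id helper are hypothetical; a real site would also set an expiry date and other attributes.

```python
import uuid
from http.cookies import SimpleCookie

COOKIE_NAME = "visitor_id"  # hypothetical cookie name

def read_or_assign_visitor_id(cookie_header):
    """Return (visitor_id, set_cookie_value_or_None) for one request."""
    jar = SimpleCookie()
    jar.load(cookie_header or "")
    if COOKIE_NAME in jar:
        # Returning visitor: the browser sent the identifier back.
        return jar[COOKIE_NAME].value, None
    # No cookie was sent, so the server sees a brand-new visitor.
    visitor_id = uuid.uuid4().hex
    jar[COOKIE_NAME] = visitor_id
    jar[COOKIE_NAME]["path"] = "/"
    return visitor_id, jar[COOKIE_NAME].OutputString()

first_id, set_cookie = read_or_assign_visitor_id("")
second_id, nothing = read_or_assign_visitor_id(f"{COOKIE_NAME}={first_id}")
erased_id, _ = read_or_assign_visitor_id("")  # cookie deleted between visits
print(first_id == second_id, first_id == erased_id)
```

The last call shows the double-counting problem: once the cookie is erased, the same person receives a fresh identifier and looks like a second visitor.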
The final way to track user sessions is through the IP address
of the visitor. Every record in the log file contains an IP
address, as this is how the server knows where to send the
information that has been requested. The limitation to counting
unique IP addresses, however, is that many Internet Service
Providers and companies use various methods that skew the
analysis. Some organizations use dynamic IP addressing, where
an IP address is assigned when a user logs in, or route traffic
through firewalls or load-balancing devices. Others, such as
AOL, route all traffic through an intermediate proxy server.
In this case, the web server sends its responses not to the
individual requestor, but to the proxy server of the ISP. The
proxy then passes the information on to the actual visitor,
so the log file records the address of the proxy server rather
than that of the visitor.
Calculating user session information, therefore, involves
a number of assumptions. User identification is based on authentication,
cookies, or IP address. Those users that either have a unique
cookie to identify themselves or authenticate on your server
reflect an accurate count of visitors, as they are independent
of IP address or proxy server use. But as noted before, it
isn't necessarily viable in all situations to make users authenticate
through username/password, and not every user will accept
an identifying cookie in their browser, so the third option
of basing user counts on IP addresses is used.
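One way to picture this order of preference is a function that picks the best identifier available on each log record. This is only a sketch: the field names auth_user, cookie_id and ip are hypothetical, and the fallback order is an assumption drawn from the paragraph above.

```python
def visitor_key(record):
    """Pick the most reliable identifier available for one log record.

    The field names ("auth_user", "cookie_id", "ip") are hypothetical.
    """
    if record.get("auth_user"):
        return ("user", record["auth_user"])
    if record.get("cookie_id"):
        return ("cookie", record["cookie_id"])
    # Last resort: the IP address, with the limitations described in the text.
    return ("ip", record["ip"])

# Three requests from the same IP address, with different information available.
records = [
    {"ip": "192.0.2.10", "auth_user": "alice", "cookie_id": "c1"},
    {"ip": "192.0.2.10", "auth_user": None, "cookie_id": "c2"},
    {"ip": "192.0.2.10", "auth_user": None, "cookie_id": None},
]
keys = [visitor_key(r) for r in records]
print(keys)
```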
To count a user from an IP address, a number of assumptions
must be made. The first assumption is to count a user for
a particular IP as new/unique if the server has no record
of activity from that IP for a certain amount of time (30
minutes is the default in WebTrends products, but this can
be modified).
Remember that the Internet functions as a series of requests.
The server sees each of these requests as separate and distinct.
Analysis software, such as WebTrends products, analyzes and
reports on these distinct requests in a meaningful manner.
So the first assumption used is that if we detect a series
of requests coming from a particular IP address within a
defined time frame, we count these requests as a single user
session. If there are no requests within that time frame,
the next time a request comes in from that particular IP
address, we count that request as the start of a new user
session.
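The timeout rule can be sketched as follows. This is a simplified illustration, not the WebTrends implementation: the count_sessions name and the sample requests are invented, and only the 30-minute default comes from the text.

```python
from datetime import datetime, timedelta

def count_sessions(requests, timeout=timedelta(minutes=30)):
    """Count user sessions from (ip, timestamp) pairs using an inactivity timeout."""
    last_seen = {}  # most recent request time per IP address
    sessions = 0
    for ip, when in sorted(requests, key=lambda r: r[1]):
        previous = last_seen.get(ip)
        # A new session starts on the first request from an IP, or after a
        # gap longer than the timeout.
        if previous is None or when - previous > timeout:
            sessions += 1
        last_seen[ip] = when
    return sessions

start = datetime(2000, 1, 1, 9, 0)
sample = [
    ("192.0.2.10", start),
    ("192.0.2.10", start + timedelta(minutes=5)),
    ("192.0.2.10", start + timedelta(minutes=45)),
    ("198.51.100.7", start + timedelta(minutes=10)),
]
print(count_sessions(sample))
```

With the 30-minute default, the request at 9:45 arrives 40 minutes after the previous one from the same address and therefore opens a second session for that IP, giving three sessions in total. Raising the timeout to 60 minutes would merge it back into the first session.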
This scheme has two limitations. The first lies with the use
of dynamic IPs or proxy servers at ISPs, as discussed above.
If User A visits a site and immediately leaves, and User B
comes to the site within the defined time frame using the same
source IP address, the two visitors will be counted as one
user session. The second is the timeout itself. If User A visits
the site, then goes to get a cup of coffee or attends a meeting
that exceeds the defined time frame (e.g., 40 minutes), and
then returns to the site and pulls up a second page, she will
be counted as two user sessions.
All log file analysis software that counts users has to work
under some set of assumptions similar to those described above.
User Sessions do, however, give a good idea of how many people
are visiting the site and are the only successful way to track
individual visits using current technology.