Web Characterization Activity
Characterization Metrics
Editor:
Jim Pitkow, Xerox PARC
Last Updated:
Friday, December 18, 1998
This document contains a taxonomy of WWW-specific metrics that can be characterized. The metrics are divided into the following areas:
Metrics for Client and Proxy Characterizations
Realm |
Metric |
|
|
Basic Facts |
|
Classification of users (educational, home, ISP, or corporate) |
|
Access method of users (LAN, modem, mobile, or wireless) |
|
Number of users, response rate, and attrition rate |
|
Date and duration of study |
|
Description and rationale for any cleaning, filtering, etc. of data |
|
Sampling methodology |
|
|
|
Distributions |
|
Entire Study - User Centric |
|
Files transferred per user |
|
Unique files transferred per user |
|
Pages transferred per user |
|
Unique pages transferred per user |
|
Sites visited per user |
|
Unique sites visited per user |
|
Reoccurrence rates for files, pages, and sites per user |
|
Entire Study - Web Centric |
|
Embedded images per page |
|
MIME type percentage breakdown (e.g., html, jpg, ps) |
|
Protocol percentage breakdown (e.g., http, shttp, gopher) |
|
Hyperlinks per page |
|
Sessions - General |
|
Sessions per user |
|
Sessions - Temporal |
|
Length of sessions per user |
|
Inter-session time per user (session to session time) |
|
Sessions - Paths |
|
Length of sessions per user |
|
Stack distance per user |
|
Per Session - Temporal |
|
Inter-request time per user (request to request time) |
|
Intra-request time per user (request to render time) |
|
Length of visit per site per user |
|
Per Session - Paths |
|
Length of visit per site per user |
|
|
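The session metrics in the table above depend on how a session is delimited and how stack distance is defined, neither of which this taxonomy specifies. The following is a minimal sketch under stated assumptions: requests are available per user as timestamps in seconds, a session ends after a hypothetical 30-minute inactivity gap (SESSION_TIMEOUT is an illustrative value, not one given in this document), and stack distance is taken to be the depth of the requested item in an LRU stack. All function names are illustrative.

```python
from typing import Hashable, List, Optional, Sequence

SESSION_TIMEOUT = 30 * 60  # assumed inactivity cutoff in seconds (hypothetical value)


def split_sessions(timestamps: Sequence[float],
                   timeout: float = SESSION_TIMEOUT) -> List[List[float]]:
    """Group one user's request timestamps into sessions.

    A new session starts whenever the gap since the previous request
    exceeds `timeout`.
    """
    sessions: List[List[float]] = []
    for t in sorted(timestamps):
        if sessions and t - sessions[-1][-1] <= timeout:
            sessions[-1].append(t)
        else:
            sessions.append([t])
    return sessions


def session_lengths(sessions: List[List[float]]) -> List[float]:
    """Temporal length of each session: last request minus first request."""
    return [s[-1] - s[0] for s in sessions]


def inter_session_times(sessions: List[List[float]]) -> List[float]:
    """Time from the last request of one session to the first of the next."""
    return [b[0] - a[-1] for a, b in zip(sessions, sessions[1:])]


def inter_request_times(session: List[float]) -> List[float]:
    """Request-to-request gaps within a single session."""
    return [b - a for a, b in zip(session, session[1:])]


def stack_distances(requests: Sequence[Hashable]) -> List[Optional[int]]:
    """LRU stack distance of each request (assumed definition).

    1 means the item was also the most recently requested one;
    None marks the first reference to an item.
    """
    stack: List[Hashable] = []  # most recently used item is at the end
    distances: List[Optional[int]] = []
    for item in requests:
        if item in stack:
            idx = stack.index(item)
            distances.append(len(stack) - idx)
            stack.pop(idx)
        else:
            distances.append(None)
        stack.append(item)  # the requested item becomes most recent
    return distances


# Example: one user's request times (seconds) and requested pages
sessions = split_sessions([0, 20, 65, 4000, 4030])
print(len(sessions), session_lengths(sessions), inter_session_times(sessions))
print(stack_distances(["a", "b", "a", "c", "b", "b"]))
```

Sessions per user is then the number of lists returned by split_sessions, and the per-user distributions follow from collecting these values across all users in the study. A different timeout or session definition will change every session metric, so the choice should be reported alongside the results.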
Metrics for Server Characterizations
Please refer to the following paper for the source of many of the metrics:
Realm |
Metric |
|
|
Basic Facts |
|
Domain classification of server (.com, .edu, etc.) |
|
Description of content contained on site |
|
Cost to access material on site (free, pay-for-view, etc.) |
|
Type of service provider (single server, virtual hosting, etc.) |
|
Birth and modification history of server (major revisions of content) |
|
Date and duration of study |
|
Description and rationale for any cleaning, filtering, etc. of data |
|
|
|
Site Composition (one month) |
|
Number of users |
|
Number of files and page requests per user |
|
Number of search engine hits |
|
Number of files serviced |
|
Number of pages serviced |
|
Number of CGI/dynamic content serviced |
|
Bytes transferred |
|
Byte latency |
|
Total number of files on server |
|
Total number of pages on server |
|
Documents by Traffic graph (x% of documents account for y% of traffic) |
|
|
|
Growth Rates |
|
Number of users |
|
Number of files and page requests per user |
|
Number of files serviced |
|
Number of pages serviced |
|
Number of CGI/dynamic content serviced |
|
Bytes transferred |
|
Byte latency |
|
Number of files on server |
|
Number of bytes on server |
|
Doubling period for all of the above metrics |
|
|
|
Distributions |
|
Entire Server - User Centric |
|
Files transferred per user |
|
Unique files transferred per user |
|
Pages transferred per user |
|
Unique pages transferred per user |
|
Reoccurrence rates for files and pages per user (assumes longitudinal tracking capabilities) |
|
Entire Server - Web Centric |
|
Embedded images per page |
|
MIME type percentage breakdown (e.g., html, jpg, ps) |
|
Hyperlinks per page |
|
Longitudinal Sessions - General |
|
Sessions per user |
|
Longitudinal Sessions - Temporal |
|
Length of sessions per user |
|
Inter-session time per user (session to session time) |
|
Longitudinal Sessions - Paths |
|
Length of sessions per user |
|
Per Session - Temporal |
|
Inter-request time per user (request to request time) |
|
Intra-request time per user (request to render time) |
|
Length of visit at site per user |
|
Per Session - Paths |
|
Stack distance per user |
|
Length of visit at site per user |
|
|
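Two of the server metrics above, the doubling period and the Documents by Traffic graph, can be illustrated with a short calculation. The sketch below is not a prescribed method: it assumes growth is exponential between two measurements, so that the doubling period is T = elapsed * ln 2 / ln(v1 / v0), and it assumes traffic is available as a per-document count (requests or bytes). The function names and sample figures are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def doubling_period(v0: float, v1: float, elapsed: float) -> float:
    """Doubling period, assuming exponential growth from v0 to v1.

    `elapsed` is the time between the two measurements; the result is
    expressed in the same unit (for example, months).
    """
    if v0 <= 0 or v1 <= v0:
        raise ValueError("requires 0 < v0 < v1 (the metric must be growing)")
    return elapsed * math.log(2) / math.log(v1 / v0)


def documents_by_traffic(traffic: Dict[str, float]) -> List[Tuple[float, float]]:
    """Points (x, y) meaning: the top x% of documents carry y% of traffic.

    Documents are ranked from most to least traffic and the traffic
    share is accumulated down the ranking.
    """
    counts = sorted(traffic.values(), reverse=True)
    total = sum(counts)
    if not counts or total == 0:
        return []
    points: List[Tuple[float, float]] = []
    running = 0.0
    for rank, count in enumerate(counts, start=1):
        running += count
        points.append((100.0 * rank / len(counts), 100.0 * running / total))
    return points


# Hypothetical figures: users grew from 1000 to 4000 over 6 months,
# which corresponds to a doubling period of about 3 months.
print(doubling_period(1000, 4000, 6))

# Hypothetical request counts per document over the study period.
for x, y in documents_by_traffic({"/": 70, "/a": 20, "/b": 5, "/c": 5}):
    print(f"top {x:.0f}% of documents -> {y:.0f}% of traffic")
```

The same doubling_period calculation applies to any of the growth-rate metrics (files serviced, bytes transferred, and so on), provided the two measurements cover comparable periods.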
Metrics for WWW Characterizations
To be completed