This Briefing Package proposes a phase II Activity as a continuation of the work performed by the Web Characterization Group (WCG), which is nearing the end of its term, as part of the W3C HTTP-Next Generation (NG) Project.
Since the WCG was chartered in July 1997, it has successfully completed its role within the NG Project, providing data to the Protocol Design Group (PDG) and developing representative testbed scenarios. The intent of phase II is to build on the experience gathered by the WCG to broaden the scope of Web characterization and to provide the W3C Membership and the Web community in general with information and test scenarios about the Web and how it is being used now and in the near future. By better understanding the Web, we believe that W3C and its Membership are better placed to evolve the Web and to ensure its long-term interoperability and robustness.
An important result of WCG is the identification of the three key groups in the characterization work and how they interact:
The format for this Activity is to let the interaction between the reduced data consumers and bulk data providers take place through an Interest Group, with a new Web Characterization Working Group (WCG-II) functioning as the mediator, provider of analysis tools and disseminator of characterization information.
Specifically, we propose to:
The Web Characterization Group (WCG), chaired by Jim Pitkow, Xerox PARC, was originally chartered in August of 1997 as part of the HTTP-NG Project with the intent of providing a set of realistic user scenarios to be used within the HTTP-NG testbed. Specifically, the WCG aimed to fulfill four primary goals for the HTTP-NG Project:
The WCG has completed the first three goals and begun working on the fourth. The group currently consists of members from academia, industry, and W3C member organizations. The principal members are Boston University's Oceans Group, Harvard College's Vino Group, INRIA, Microsoft, Netscape, Virginia Tech's Network Research Group, and Xerox PARC's Webology Group.
While working within the scope of the HTTP-NG Project, it became clear that there is a need for, and interest in, Web characterization information within the W3C Membership and in the Web Community in general. The purpose of this briefing package is to propose a framework in which the Web characterization work can continue while broadening the scope to include a larger group of bulk data providers as well as reduced data consumers.
The scope of the Web Characterization Activity is to gain further understanding of how the Web is evolving and how fast changes can propagate in a globally distributed environment.
Efficient techniques for establishing and maintaining trust and privacy of individuals and groups of people are essential for the long-term stability of the Web. However, we do not consider providing technical solutions for establishing privacy policies to be within the scope of this Activity; this is better provided by activities like P3P and DSig. As technical solutions evolve, this Activity will deploy them as appropriate.
The scope of this Activity is to characterize the Web as a distributed system, not the individuals using the Web. In particular, the Activity will make no effort to identify individual users or to disclose data that can lead to the identification of individual users. Nor will it make any effort to identify groups of people according to race, religion, national or ethnic classification, or political or sexual orientation. See also the section on IPR.
The results of the Activity are expected to be of interest to a relatively large set of groups including but not limited to:
Groups within W3C as well as technology designers and ISPs are expected to be able to draw immediate benefit from the results produced by the Activity. Advances in these markets clearly translate to benefits for the Web Community. The advertising and market research groups will benefit, as the diverse methods and tools for measuring Web usage now can be brought into more focus. Academia also stands to profit by obtaining representative data (something that is typically very difficult and done on an ad-hoc basis).
The output of the Activity will provide the W3C Membership and the Web Community in general with an important feedback mechanism that provides information about how new techniques and solutions propagate on the Web and how they affect the way the Web is being used. It may also provide information about existing performance bottlenecks, usability problems, and similar issues which, once identified, can lead to more focused solutions with a better chance of rapid deployment.
The true value of this proposed Activity relies on being able to take regular, representative samples over a relatively long period so that the dynamics of the Web can be modelled and reflected back into the evolution of the Web.
This Activity is intended to last 12 months from November 5, 1998.
The Activity will be kicked off by the Web Characterization Workshop, November 5, 1998 in Boston, MA, with the intent of bringing together both W3C Members and Web characterization experts. The results of the Workshop will be the identification of organizations that wish to and could participate in the Working Group, and the formation of the Interest Group.
More information can be found in the Workshop invitation and program.
The WCG-II is intended to work using a request/response based model similar to the one used between the HTTP-NG PDG and the WCG. Requests will be formally issued by the Interest Group and by W3C Activities, and the WCG-II will respond with realistic timelines for when and how results can be made available.
The WCG-II will start its work by formally soliciting requests for characterization data needed by other W3C Working Groups and Activities. The solicitation process is intended to occur at six-month intervals, enough time for the Working Group to understand and respond to the requests of the other W3C Groups. Requests from the Interest Group will be dealt with on a case by case basis.
The focus of the WCG-II is expected to include the following tasks:
WCG-II participation is defined in the Member Resources section.
The role of the Interest Group is to be a public discussion forum for bulk data providers and reduced data consumers, and to provide requests and feedback to the Working Group. It is expected that the tools and dissemination mechanism produced by the Working Group will benefit from a feedback mechanism with its immediate users, as well as their continuous review.
Furthermore, the intent of the IG is to establish connections to the Web Community as well as the Internet Community and to provide recommendations and guidelines for Web specifications and implementations. Developments like SURGE can have a direct impact on products like caches, Web servers, and proxies, as well as on measurement tools and benchmarks like Webstone, SpecWeb96, and WebBench2.0.
Participation in the Interest Group will be open to W3C members as well as non-W3C members. The role of the Interest Group will be to help focus the Working Group, monitor the progress of the Working Group, provide critical review of the work, and help filter questions and issues presented to the Working Group.
The Working Group will communicate using an archived mailing list and a set of Web pages. The mailing list archives will be made available to the Membership. The group will have a teleconference twice a month, in which members, editors, the Chair and W3C staff representatives will take part. The frequency of the meetings can be increased at the discretion of the Chair, subject to the availability of resources. Minutes of phone conferences will be posted either on the mailing list or on a Web page.
The Interest Group will maintain its communications using an archived, public mailing list and a set of Web pages. All communication between the Interest Group and the Working Group should take place on the Interest Group's Web pages and/or mailing list.
|October 7, 1998||Advisory Committee ballot closes|
|October 21, 1998||Director's Decision|
|November 5, 1998||W3C Workshop (member and non-member) on Web characterizations, testbed simulators, and automatic characterization methodologies||Abstract-reviewed Workshop with attendance limited to 50 participants|
|November 15, 1998||Identification and inclusion of new members into the Working Group||Milestone|
|November 15, 1998||Workshop minutes||Report to the Membership|
|December 5, 1998||Summary of Workshop outlining research opportunities and proposed solutions||W3C Note|
|December 5, 1998||Solicitation to other W3C Working Groups||Memo to other W3C Working Groups|
|December 5, 1998||Initial repository containing data sets, tools, and bibliography of Web characterization research||Initial online repository available to the public|
|January 5, 1999||Initial proposal for automatic Web characterization||W3C Note|
|January 5, 1999||Closing date for contracting with other W3C Working Groups||Milestone|
|February 5, 1999||Initial proposal for refined testbed||W3C Note|
|March 25, 1999||Fulfillment of contracts with other W3C Working Groups||Report to the Membership|
|April 5, 1999||Six month status report||Report to the Membership|
|April 15, 1999||Solicitation to other W3C Working Groups||Memo to other W3C Working Groups|
|May 15, 1999||Closing date for contracting with other W3C Working Groups||Milestone|
|August 5, 1999||Refined testbed||W3C Note w/ Software|
|September 5, 1999||Completed repository containing data sets, tools, and bibliography of Web characterization research||Online repository available to the public|
|October 25, 1999||Fulfillment of contracts with other W3C Working Groups||Report to the Membership|
|November 5, 1999||Twelve month status report||Report to the Membership|
The privacy concerns surrounding log files should be taken into account during this Activity. Specifically, the tools developed should respect individual privacy.
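As an illustration only, the following minimal sketch shows one way a log-processing tool might avoid exposing individual users before log data is shared. It assumes input in the Common Log Format and replaces the client host with a keyed hash while dropping the identity and user fields. The function name `pseudonymize_line`, the pattern `CLF_PATTERN`, and the 16-character token length are hypothetical choices made for this example and are not part of any W3C or WCG tool.

```python
import hashlib
import hmac
import re

# Assumed input: Common Log Format lines of the form
#   host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(r'^(\S+) (\S+) (\S+) (\[[^\]]*\] ".*" \d{3} \S+)$')


def pseudonymize_line(line: str, secret_key: bytes) -> str:
    """Return the log line with the client host replaced by a keyed hash.

    The ident and authuser fields are replaced by "-" so that user names
    never leave the originating site. The same host always maps to the
    same token for a given key, so per-client request sequences can still
    be analysed without revealing the host itself.
    """
    match = CLF_PATTERN.match(line.rstrip("\n"))
    if match is None:
        raise ValueError("not a Common Log Format line")
    host, _ident, _authuser, rest = match.groups()
    # HMAC-SHA256 keyed with a secret known only to the data provider;
    # the token is truncated to 16 hex characters (an arbitrary choice).
    token = hmac.new(secret_key, host.encode("utf-8"), hashlib.sha256).hexdigest()[:16]
    return f"{token} - - {rest}"
```

A data provider would keep `secret_key` private and apply the function to each line before making a log available. Replacing the host in this way does not by itself guarantee anonymity, since requested URLs and timing patterns may still carry identifying information.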
The intellectual property rights, IPR (e.g. copyright, patents, and trademarks) for existing software (as well as the log files as such, which count as databases in some legislations, and therefore are subject to protection) should be respected, but all software and other materials produced within the scope of this Activity should be subject to existing W3C IPR policy.
Participants in this Activity are required to inform the Chair and the other participants, before joining the group, of any IPR concerns, claims, and limitations that they might raise.
Participants in the Interest Group, as well as the Working Group, are expected to make log files and relevant information available for analysis on their sites, and share generated data sets free of charge and on an equitable basis.