[Maritza is drafting this section. Input from others welcome.]

Verification

Given the usability focus of the Working Group's goals, it is useful to gather data from representative user groups to validate the recommendations made. The following are suggested methods for evaluating the usability of the security context information as it is presented to the user. The results of these methods can be used to support the claims made, to guide the design of future recommendations, or to indicate areas where more attention is necessary.

The suggested methods are modeled after previously conducted user studies; references to these are included where applicable.

Preparation for Verification: Identifying the Target User for Relevant Use Cases

Relevant use cases should be chosen from the list written for the working group. The chosen use cases should be representative of the tasks where the usability of the security context information is in question. Once the set is decided on, personas or profiles should be created for the target user in each.

The process for identifying the target user should include taking a use case and noting what security information is relevant to different levels of users for the given task. Depending on the use case, there may be several levels of relevant information, reflecting the user's final goal and level of expertise; not all users require the same amount of context information.

Knowing who the target user is will help evaluate whether the amount of effort required to extract the information is appropriate for the user's level of experience.
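As an illustration, the mapping from a use case to its target users could be captured in a structure like the Python sketch below. The use case, personas, and information items are hypothetical placeholders, not entries from the working group's actual list.

    # Hypothetical sketch: one use case mapped to personas at different
    # expertise levels, each with the security context information that
    # is relevant to them. All names and items below are illustrative.
    from dataclasses import dataclass

    @dataclass
    class Persona:
        """A target user profile for one use case."""
        name: str
        expertise: str            # e.g. "novice" or "expert"
        goal: str                 # the user's final goal for the task
        relevant_info: list[str]  # security context info this user needs

    use_cases = {
        "online_banking_login": [
            Persona(
                name="casual user",
                expertise="novice",
                goal="log in and check a balance",
                relevant_info=["site identity", "connection is secure"],
            ),
            Persona(
                name="security-conscious user",
                expertise="expert",
                goal="confirm the site before entering credentials",
                # the certificate issuer is a detail that need only be
                # available on demand, not shown to every user
                relevant_info=["site identity", "connection is secure",
                               "certificate issuer"],
            ),
        ],
    }

A structure like this makes explicit, per use case, which information each level of user actually needs, which is the comparison the verification methods below rely on.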

An example where this distinction is useful is a user who needs to know who issued a certificate, or some other detail that is not relevant to all users but should be available when necessary.

Aspects to Verify

The user test verification should evaluate whether the meaning of the security context information is clear to the user, whether the necessary information is available with a reasonable amount of effort given the task and the user's level of experience, and whether the meaning can be determined with little to no training.

Suggested Methods

In each of these methods, the ideal participants are users who most closely represent the user groups defined for the use cases.

One method of user test verification involves having participants perform the tasks in the use cases, then asking them questions afterwards to gather feedback on how they felt about the security information they were provided with. The format would be similar to an in-lab study followed by a questionnaire.

Another method to gather feedback would be to present participants with the security information in the context of the tasks, without asking them to actually carry out any actions. The format would be more like an interview or a focus group. The participant would be shown screenshots or something similar, and asked whether the information presented is sufficient and clear enough for them to make a security decision.

With both of these methods, the questions to ask the participants would be similar to the following: Do you understand the information the cue portrays? What does the cue suggest? (This helps indicate sources of confusion.) Do you feel this information is easily accessible? Do you think enough relevant information is displayed?
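As a rough illustration, these questions could be administered and recorded with a small script like the Python sketch below; the free-text prompt loop and the storage format are assumptions made for this example, not a prescribed instrument.

    # Minimal sketch of a post-task questionnaire. The question wording
    # comes from the section above; everything else is illustrative.
    QUESTIONS = [
        "Do you understand the information the cue portrays?",
        "What does the cue suggest?",  # open-ended; surfaces confusion
        "Do you feel this information is easily accessible?",
        "Do you think enough relevant information is displayed?",
    ]

    def record_responses(participant_id):
        """Ask each question in turn and collect free-text answers."""
        answers = {question: input(question + "\n> ") for question in QUESTIONS}
        return {"participant": participant_id, "answers": answers}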

The task-based method is slightly more flexible and can be used to gather feedback that is more representative of how users react in a natural setting. It is possible to structure the study so that the participant's focus is not purely on browsing securely, which better reflects conditions when the user is browsing at home. Similar to the user study on phishing toolbars by Wu et al., the task-based study can also be adjusted so that training is given halfway through the session, allowing feedback to be gathered on the effect of training on the results.
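The sketch below illustrates one way such a session could be structured, with a training step inserted at the midpoint of a shuffled task list, in the spirit of the Wu et al. design; the task names and the training step are placeholders.

    # Sketch of a session plan: the participant completes half of the
    # tasks, receives training on the security cues, then completes the
    # rest, so pre- and post-training behaviour can be compared.
    import random

    def build_session(tasks, seed=None):
        """Shuffle the tasks and insert a training step at the midpoint."""
        rng = random.Random(seed)
        order = list(tasks)
        rng.shuffle(order)
        midpoint = len(order) // 2
        return (order[:midpoint]
                + ["TRAINING: explain the security cues"]
                + order[midpoint:])

    session = build_session(
        ["check a bank balance", "buy a book", "update an account email",
         "read webmail"],
        seed=42,  # fixed seed so the same ordering can be reproduced
    )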

The task-based method can also be modified to gauge whether it is possible to trick the user into trusting a spoofed security cue. Possible spoofs of, or changes to, the security cues could be presented to participants during the tasks to see whether they fall for the fake cues. The study would then be similar to the study "Why Phishing Works" by Dhamija et al.
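As a sketch of how such spoof trials could be scored, the following example computes the fraction of spoofed-cue trials in which the participant trusted the fake cue; the trial records shown are placeholders, not study data.

    # Each trial records whether the cue shown was genuine or spoofed,
    # and whether the participant trusted it. Placeholder records only.
    trials = [
        {"participant": "P1", "cue_spoofed": True,  "trusted": True},
        {"participant": "P1", "cue_spoofed": False, "trusted": True},
        {"participant": "P2", "cue_spoofed": True,  "trusted": False},
    ]

    def spoof_success_rate(trials):
        """Fraction of spoofed-cue trials where the spoof succeeded."""
        spoofed = [t for t in trials if t["cue_spoofed"]]
        if not spoofed:
            return 0.0
        return sum(t["trusted"] for t in spoofed) / len(spoofed)

    print(f"spoof success rate: {spoof_success_rate(trials):.0%}")  # 50% here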