Last Call Comment: Implications of Audio and Visual Logotypes at the same Time from Sam Hartman on 2008-08-13 (public-usable-authentication@w3.org from August 2008)

From: Sam Hartman <hartmans-ietf@mit.edu>
Date: Wed, 13 Aug 2008 09:53:03 -0400 (EDT)
To: public-usable-authentication@w3.org
CC: hartmans-ietf@mit.edu
Message-Id: <20080813135303.CF9DE41EF@carter-zimmerman.suchdamage.org>

I have finished reviewing the July 24 draft of the user interface
guidelines.  This is excellent work.  However I do have a couple of
comments, the first of which I'll raise in this message.

Section 5.1.4 requires that if a user agent is going to render both an
audio and a visual logotype, the rendering be time synchronized so
that they both start at the same time.

Outside of accessibility contexts, this is a fine requirement.
However, I'm familiar with a number of Screen Readers (software
designed to make resources accessible to blind users) where this
requirement would be problematic.  In particular, on Windows (products
such as Window-Eyes, JAWS and Microsoft's Narrator), Unix (products
such as Orca) and Mac (products such as VoiceOver), rendering of the
audio interface is handled by a separate software component from the
one that renders the visual interface.  The audio component uses
special accessibility APIs to get access to the DOM, security context
and other information.

So, it would be very difficult in accessibility contexts to
synchronize the rendering of these two components.  I suspect that if
people implement this requirement they will do so by moving the audio
rendering into the main browser rather than the accessibility
component.  That seems highly problematic, because it separates
rendering of the logotype from rendering of other security context
information.  The information is synchronized visually, but for those
who use the audio user interface, this will mean that logotypes will
be rendered at some random time while the page loads, out of
synchronization with any rendering of signals in section 6.  As a
result, it seems like techniques such as using a different voice for
security context information would be ineffective at preventing
spoofing of the logotype.  An attacker could simply play the logotype
sound at any point in order to spoof an audio user.

To provide a good security experience for audio users, I think it is
important that the logotype be rendered along with other audio
security context information, regardless of when that happens or
whether it is synchronized with visual indicators.

I think the easiest fix is to remove the requirement for
synchronization.  If that is problematic, then scope the requirement
to cases where both logotypes are rendered outside of accessibility
contexts and suggest that accessibility APIs for browsers provide a
mechanism for screen readers to suppress the browser's own rendering
of the audio logotype.  Screen readers are already security sensitive;
allowing them to configure chrome is no worse than any other trusted
component.
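
To make the second option concrete, here is a rough sketch of how the
suppression mechanism might behave.  Everything in it is hypothetical:
the class names, the attach_screen_reader call and the
suppress_audio_logotype flag are invented for illustration and do not
come from the draft or from any existing accessibility API.

```python
from dataclasses import dataclass


@dataclass
class Browser:
    """Hypothetical user agent; no name here comes from the draft."""
    suppress_audio_logotype: bool = False

    def attach_screen_reader(self, suppress_audio_logotype: bool) -> None:
        # Hypothetical accessibility-API call made by the screen reader
        # to turn off the browser's own audio logotype rendering.
        self.suppress_audio_logotype = suppress_audio_logotype

    def render_logotypes(self) -> list:
        rendered = ["visual logotype"]
        if not self.suppress_audio_logotype:
            # Outside accessibility contexts the synchronization
            # requirement applies: both start at the same time.
            rendered.append("audio logotype (synchronized)")
        return rendered


@dataclass
class ScreenReader:
    """Hypothetical screen reader that owns all audio rendering."""
    browser: Browser

    def attach(self) -> None:
        self.browser.attach_screen_reader(suppress_audio_logotype=True)

    def announce_security_context(self) -> list:
        # The logotype is spoken together with the other security
        # signals, whenever the screen reader chooses to present them.
        return ["security indicators", "audio logotype"]
```

With no screen reader attached the browser renders both logotypes
together; once a screen reader attaches, the browser renders only the
visual logotype and the audio logotype is presented alongside the
other audio security context information.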


Thanks for your consideration,

Sam Hartman
Principal, Painless Security LLC

Received on Thursday, 14 August 2008 04:37:47 UTC