Enhancing Web Application Security with Secure Hardware Tokens

Abstract

The W3C WebCrypto API enables web applications to utilize cryptographic capabilities and key handling natively supported by a user agent, such as a web browser. This API, however, does not consider secure hardware tokens, such as smart cards and SIM cards. These tokens have provided crypto services for several decades, supporting a wide range of applications, but there is no common standard for them to serve web applications. To fill this gap, we propose that W3C standardize how web applications interact with secure hardware tokens. This will enable interoperability and allow web applications to take advantage of the additional security provided by the hardware tokens.

Use Cases

Secure hardware tokens, especially smart cards, are not new to the Web. When a web application uses SSL/TLS mutual authentication for user login from a web browser, the user's private key often resides in her smart card. The web browser can communicate with the card through middleware provided by the platform. The web application itself does not have such access at present.

Many use cases can benefit from web applications being able to communicate with and use secure hardware tokens. Some of them are listed below.

Strong authentication. This applies to web applications that need strong user authentication. In this case, the secure hardware token stores the user's login credentials and performs the required crypto operations.
Digital signature. The applications include online banking, government services, e-health, and enterprise operations. In this case, the secure hardware token stores the user's private keys for digital signature. When needed, the token performs the signature operation.
Data encryption. With the increasing use of cloud storage, data security and privacy are major concerns. One way to protect data is client-side encryption. The secure hardware token can generate data encryption keys at an application's request. It also encrypts these keys for external storage.
API key protection. Web APIs often require an API key (e.g. a secret key, private key, access token, or password) for each API call. For applications with higher security needs, the secure hardware token can store the API key and perform the needed crypto operation, e.g. signing the API request.
Remote provisioning. A secure hardware token may come with pre-provisioned keys and applications, or may be provisioned dynamically. Remote token activation, key rotation, certificate renewal, application update, and token deactivation may also be necessary. A web application can perform these tasks remotely when it can communicate with and access the hardware tokens.
WebRTC. WebRTC provides a JavaScript API that enables peer-to-peer communication between web browsers. The browser uses an Identity Provider (IdP) proxy to obtain an identity assertion about the user. It leaves user authentication to the IdP, which may require strong authentication using secure hardware tokens.

Proposal

In order to use the functionalities of a secure hardware token, a web application needs to communicate with it. This requires an Application Programming Interface (API) and an access control mechanism.

1. Application Programming Interface

A web application will communicate with a secure hardware token through a JavaScript API. We can define the API at different levels, which provide different degrees of flexibility and have different security implications. The following outlines three such levels: low, medium, and high level APIs. We present these levels as potential work for W3C standardization and to stimulate discussion.

1.1 Low level API

The smart card industry has standardized smart cards and their communication protocols [7816-4]. The standard unit of communication with a card is the APDU (Application Protocol Data Unit), and exchanging APDUs is the card's communication mechanism. ISO 7816-3 and 7816-4 define the APDU standard, which specifies how an off-card application communicates with a smart card. The host computer (e.g. a PC, a mobile phone, or a banking terminal) usually has middleware that lets the off-card application transmit APDUs to the card and receive responses from it. A web application running in a browser cannot access such middleware because it is a local resource. What is needed, then, is a standard interface for web applications to communicate with and use secure hardware tokens.
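As a rough illustration of what working at the APDU level involves, the sketch below encodes a short-form ISO 7816-4 command APDU (CLA, INS, P1, P2, optional Lc and data, optional Le) and uses it to build a SELECT-by-AID command. The function names and the example AID are hypothetical and chosen only for illustration.

```typescript
// Encode a short-form ISO 7816-4 command APDU: CLA INS P1 P2 [Lc data] [Le].
function buildCommandApdu(
  cla: number,
  ins: number,
  p1: number,
  p2: number,
  data: Uint8Array = new Uint8Array(0),
  le?: number,
): Uint8Array {
  if (data.length > 255) {
    throw new RangeError("short-form APDUs carry at most 255 data bytes");
  }
  const bytes: number[] = [cla & 0xff, ins & 0xff, p1 & 0xff, p2 & 0xff];
  if (data.length > 0) {
    bytes.push(data.length); // Lc: number of data bytes that follow
    bytes.push(...Array.from(data));
  }
  if (le !== undefined) {
    bytes.push(le & 0xff); // Le: 0x00 requests up to 256 response bytes
  }
  return Uint8Array.from(bytes);
}

// SELECT by application identifier: CLA=0x00, INS=0xA4, P1=0x04, P2=0x00.
function buildSelectByAid(aid: Uint8Array): Uint8Array {
  return buildCommandApdu(0x00, 0xa4, 0x04, 0x00, aid, 0x00);
}

// Render bytes as an upper-case hex string for logging.
function toHex(bytes: Uint8Array): string {
  return Array.from(bytes, (b) => (b < 16 ? "0" : "") + b.toString(16))
    .join("")
    .toUpperCase();
}

// Hypothetical AID, for illustration only.
const exampleAid = Uint8Array.from([0xa0, 0x00, 0x00, 0x00, 0x01]);
console.log(toHex(buildSelectByAid(exampleAid))); // 00A4040005A00000000100
```

Even this small example shows why the low level is demanding: the developer must know the byte layout and the instruction codes of the card application.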

The Secure Element API [SEAPI] is a draft specification proposed at the W3C SysApps working group. This API enables web applications to communicate with secure hardware tokens at the APDU level: connecting to a token reader, establishing a communication channel to the token, sending APDUs to the token, and receiving its responses.
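The connect, open-channel, transmit sequence described above might look roughly like the following sketch. The TokenReader and TokenChannel interfaces are hypothetical stand-ins and are not the interfaces defined in the [SEAPI] draft; only the response layout (optional data followed by the status bytes SW1 SW2, with 0x9000 meaning success) comes from ISO 7816-4.

```typescript
// Hypothetical shapes for illustration; NOT the interfaces of the [SEAPI] draft.
interface TokenChannel {
  transmit(command: Uint8Array): Promise<Uint8Array>;
  close(): Promise<void>;
}

interface TokenReader {
  openChannel(aid: Uint8Array): Promise<TokenChannel>;
}

interface ApduResponse {
  data: Uint8Array;
  sw: number; // status word: (SW1 << 8) | SW2
}

// A response APDU is optional data followed by the two status bytes SW1 SW2.
function parseResponseApdu(raw: Uint8Array): ApduResponse {
  if (raw.length < 2) {
    throw new RangeError("a response APDU must contain SW1 and SW2");
  }
  const sw = (raw[raw.length - 2] << 8) | raw[raw.length - 1];
  return { data: raw.slice(0, raw.length - 2), sw };
}

// Open a channel to the on-token application, send one command, and return
// the response data. The channel is closed even if the exchange fails.
async function exchange(
  reader: TokenReader,
  aid: Uint8Array,
  command: Uint8Array,
): Promise<Uint8Array> {
  const channel = await reader.openChannel(aid);
  try {
    const response = parseResponseApdu(await channel.transmit(command));
    if (response.sw !== 0x9000) {
      throw new Error("token returned status word 0x" + response.sw.toString(16));
    }
    return response.data;
  } finally {
    await channel.close();
  }
}
```

A real API would also need reader discovery and error reporting for card removal, which this sketch leaves out.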

Communication at the APDU level gives web applications the flexibility to use the functionalities provided by all smart-card-based secure hardware tokens. This allows building a rich variety of applications and supporting new features or standards without changing the API. For example, it will enable a web server to authenticate to a secure hardware token using a standard Terminal Authentication protocol for ePassport and eID applications [GAP]. The downside is that web developers need to know the APDU mechanism and commands, with which most of them are not familiar.

1.2 Medium level API

The medium level API allows web developers to use cryptographic services of secure hardware tokens without having to deal with APDUs. It leverages the current W3C WebCrypto API for operations such as key generation, encryption, decryption, signature, verification, hash, and so on. This requires the user agent to discover and interact with the secure hardware token that the user intends to use. For this purpose, we need to standardize the token discovery interface. Once identified, the token should perform the crypto operations when requested instead of the user agent doing them. More specifically, the token generates keys and stores them inside its secure memory; the token performs the requested crypto operations using the keys inside the token.
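From the application's point of view, the medium level would look like ordinary WebCrypto code. The sketch below uses only standard WebCrypto calls (generateKey, sign, verify) with a non-extractable key; as written it runs against the user agent's own crypto implementation. The assumption, taken from the proposal above, is that a token-aware user agent would route the same calls to the selected token, which is not something today's WebCrypto API does. The function name is hypothetical.

```typescript
// Standard WebCrypto calls. In the proposal, a token-aware user agent would
// delegate these operations to the hardware token instead of doing them itself.
async function signAndVerify(message: string): Promise<boolean> {
  const subtle = globalThis.crypto.subtle;

  // extractable = false: the private key can be used but never exported,
  // which mirrors a key that never leaves the token's secure memory.
  const keyPair = await subtle.generateKey(
    { name: "ECDSA", namedCurve: "P-256" },
    false,
    ["sign", "verify"],
  );

  const data = new TextEncoder().encode(message);
  const signature = await subtle.sign(
    { name: "ECDSA", hash: "SHA-256" },
    keyPair.privateKey,
    data,
  );

  return subtle.verify(
    { name: "ECDSA", hash: "SHA-256" },
    keyPair.publicKey,
    signature,
    data,
  );
}

signAndVerify("transfer 100 EUR").then((ok) => console.log("signature valid:", ok));
```

The application code contains no APDUs; which token is used, and how it is discovered, is left to the interface that still has to be standardized.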

Many secure hardware tokens have pre-provisioned keys. The current W3C Key Discovery API is designed for named origin-specific pre-provisioned keys and has one method, getKeyByName(). We propose to extend the API to allow applications to discover keys through attributes. With the access control (to be discussed below), a web application can only discover keys that it has access to.
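Attribute-based discovery could be sketched as a filter over key descriptors. Everything here is hypothetical: the TokenKeyInfo and KeyQuery shapes, the attribute names, and the use of an origin list as the access rule are assumptions made for illustration and are not part of the W3C Key Discovery API.

```typescript
// Hypothetical key descriptor; the attribute names are illustrative only.
interface TokenKeyInfo {
  name: string;
  algorithm: string;
  usages: string[];
  allowedOrigins: string[]; // assumed access-control attribute
}

// Attributes an application may search by; omitted attributes match any key.
interface KeyQuery {
  name?: string;
  algorithm?: string;
  usage?: string;
}

// Return the keys that match every attribute in the query and that the
// calling origin is allowed to access. Other keys stay invisible.
function discoverKeys(
  keys: TokenKeyInfo[],
  origin: string,
  query: KeyQuery,
): TokenKeyInfo[] {
  return keys.filter(
    (key) =>
      key.allowedOrigins.indexOf(origin) !== -1 &&
      (query.name === undefined || key.name === query.name) &&
      (query.algorithm === undefined || key.algorithm === query.algorithm) &&
      (query.usage === undefined || key.usages.indexOf(query.usage) !== -1),
  );
}

// Example data with made-up names and reserved example origins.
const tokenKeys: TokenKeyInfo[] = [
  { name: "login", algorithm: "ECDSA", usages: ["sign"], allowedOrigins: ["https://bank.example"] },
  { name: "archive", algorithm: "AES-GCM", usages: ["encrypt", "decrypt"], allowedOrigins: ["https://bank.example"] },
  { name: "citizen-signature", algorithm: "ECDSA", usages: ["sign"], allowedOrigins: ["https://tax.example"] },
];

const signingKeys = discoverKeys(tokenKeys, "https://bank.example", { usage: "sign" });
console.log(signingKeys.map((k) => k.name)); // [ 'login' ]
```

Applying the origin check inside the discovery step means an application cannot even learn that keys belonging to other origins exist.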

1.3 High Level API

A high level API offers specific services to applications associated with a market or a domain. Examples of applications include online payment (e.g. using the EMV standard), authentication (e.g. using the FIDO Alliance standard), and application management (e.g. using the GlobalPlatform standard). Developing such an API in W3C will require maintenance whenever the underlying application standard evolves. For this reason, a high level API is more appropriate when a specific market or application standard is mature.

1.4 Comparison

The low level API will be based on mature smart card standards [7816-4]. The medium level API is an extension to the existing W3C WebCrypto API. The high level APIs may be based on evolving standards, which requires more maintenance effort.

Both medium and high level APIs are more convenient than the low level API from the web application developers' perspective. They make it easier for web developers to use secure hardware tokens in their applications, which should help developer communities grow more quickly.

The low level API can leverage existing OS and device drivers that support the PC/SC standard. For the medium and high level APIs, the user agent communicates with the hardware token through another interface, which has to be standardized. Token manufacturers need to provide middleware, similar to that supporting the CAPI or PKCS#11 APIs. This makes deployment more complex.

The low level API enables web developers to utilize all functionalities that secure hardware tokens can offer. When the token has a new functionality, the developers can use it immediately, while the medium and high level APIs cannot provide the access until the specifications are updated.

1.5 Recommended action plan

We recommend that W3C consider working on all three levels, giving priority to the low and medium levels because their underlying standards are mature. The low level API may fill the gap left by the high level APIs while these application-specific APIs are discussed and developed in W3C. To fill this gap, application developers can write a JavaScript library for a specific application domain. They can develop, deploy, and evolve such libraries easily, independently of W3C.

2. Access control

Secure hardware tokens contain secret or private keys for their users. A token has many built-in security mechanisms to protect the keys and the operations, such as physical sensors, secure algorithms, user authentication, and try-counters. Access control is another important mechanism: it manages which applications can use which functionalities of the token. In the following, we discuss two access control options. Each of them should also require user consent for each website as an additional control.

2.1 Application-based Access Control

GlobalPlatform [GP] provides standards for managing applications on secure chip technology (e.g. secure element and trusted execution environment). One of its specifications, "Secure Element Access Control" [SEAC], defines a mechanism that prevents malicious applications from accessing a secure element. This method requires collaboration between the secure hardware token and a trusted entity, called the Access Control (AC) enforcer, which resides on the host side. For a mobile device, the AC enforcer belongs to the OS. The token maintains a list of access control rules. When an application requests to communicate with a secure hardware token, the AC enforcer fetches the access control rules from the token and checks the application against the rules to decide whether to allow the access. The GP AC mechanism has been extended to the Trusted Execution Environment (TEE). It controls how a Trusted Application running in a TEE may interact with a secure element that is attached to the device where the TEE resides.

We propose to use GlobalPlatform's access control mechanism for web applications that use secure hardware tokens [SEAPI]. The user agent (e.g. web browser) may host the AC enforcer. The origin of the web application can be a part of the access control rules. The rest remains the same as described above. More details can be found in [SEAC, SEAPI].
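The enforcer's decision step can be sketched as follows, assuming the enforcer lives in the user agent and the rules are keyed by web origin and by the identifier (AID) of the on-token application. The AccessRule shape and all names are hypothetical and do not reflect the rule encoding defined in [SEAC].

```typescript
// Hypothetical rule shape; NOT the encoding defined in [SEAC].
interface AccessRule {
  origin: string; // origin of the web application
  appletAid: string; // hex AID of the on-token application
}

// The token is the source of truth for the rules.
interface RuleProvider {
  fetchAccessRules(): Promise<AccessRule[]>;
}

// AC enforcer check, run by the user agent before it opens a channel.
// Access is denied unless a rule explicitly allows this origin/applet pair.
async function isAccessAllowed(
  token: RuleProvider,
  origin: string,
  appletAid: string,
): Promise<boolean> {
  const rules = await token.fetchAccessRules();
  return rules.some(
    (rule) =>
      rule.origin === origin &&
      rule.appletAid.toUpperCase() === appletAid.toUpperCase(),
  );
}

// Stand-in for a real token, holding one made-up rule.
const demoToken: RuleProvider = {
  fetchAccessRules: async () => [
    { origin: "https://bank.example", appletAid: "A000000001" },
  ],
};

isAccessAllowed(demoToken, "https://bank.example", "a000000001").then((ok) =>
  console.log("bank.example allowed:", ok),
);
```

Because the rules are fetched from the token, the token issuer controls who may connect, while the user agent is trusted to apply the rules faithfully.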

2.2 Origin-based Access Control

The web application security model includes an important concept called the same-origin policy, which web browsers implement. This policy allows scripts running in a browser to access resources (e.g. DOM) of the same origin as the scripts, and prevents the scripts from accessing resources of a different origin. The current W3C WebCrypto API uses the browser's same-origin policy to control key access. Through the API, a script running in a web browser can only use keys created by the same origin that the script came from. We could extend the origin-based access control to serve secure hardware tokens.

A secure hardware token may hold multiple keys, each of which may serve more than one origin. For example, a government-issued signature key for a citizen may be used for signing a tax return, a driver license renewal, and a government service receipt. One way of using the origin for access control is to add an origin list as another attribute to the access control rule of a key. We can consider two options. When a web application communicates with the hardware token: (1) the application presents its origin in the API call and the browser verifies it before sending the request to the token; or (2) the web browser sends the application's origin to the token in addition to the other parameters in the application's API call. The token performs the access control to decide whether this origin is allowed. FIDO's U2F is an example of using origin-based AC [U2F].
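The two options and the token-side decision can be sketched as below. The types and function names are hypothetical, the key and origins are made-up examples, and the sketch assumes that in both options the token makes the final decision against the origin list of the requested key.

```typescript
// Hypothetical shapes for illustration only.
interface KeyAccessRule {
  keyId: string;
  allowedOrigins: string[]; // origin list attribute of the key's rule
}

interface TokenRequest {
  keyId: string;
  origin: string;
  operation: string;
}

// Option (1): the application states its origin in the API call and the
// browser checks the claim against the page that made the call.
function browserVerifiesClaim(claimedOrigin: string, pageUrl: string): boolean {
  return new URL(claimedOrigin).origin === new URL(pageUrl).origin;
}

// Option (2): the browser itself attaches the origin it derived from the
// page, so the application has no way to supply a different value.
function attachOrigin(pageUrl: string, keyId: string, operation: string): TokenRequest {
  return { keyId, origin: new URL(pageUrl).origin, operation };
}

// Token side: allow the operation only if the requesting origin is on the
// origin list of the requested key. Unknown keys are denied.
function tokenAllows(rules: KeyAccessRule[], request: TokenRequest): boolean {
  const rule = rules.filter((r) => r.keyId === request.keyId)[0];
  return rule !== undefined && rule.allowedOrigins.indexOf(request.origin) !== -1;
}

// One government-issued signature key serving several origins.
const keyRules: KeyAccessRule[] = [
  {
    keyId: "citizen-signature",
    allowedOrigins: ["https://tax.example", "https://dmv.example"],
  },
];

const request = attachOrigin("https://tax.example/returns/2024", "citizen-signature", "sign");
console.log(tokenAllows(keyRules, request)); // true
```

In both options the token relies on the browser to report the origin truthfully, since the token cannot observe the web page itself.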

2.3 Comparison

The main difference between the two AC mechanisms is who performs the access control task at runtime: the user agent or the hardware token.

With the application-based AC, the user agent controls the token access through the AC enforcer. The secure hardware token trusts the user agent to guard the door. This scheme requires more changes in the user agent than in the token.

With the origin-based AC, the secure hardware token controls the access itself. It trusts the user agent to provide correct origin information. This scheme requires more changes in the token than in the user agent.

2.4 Recommended action plan

We recommend that W3C consider working on access control for secure hardware tokens in parallel with defining the API. The first step could be to agree on a security model and establish the security boundary. W3C could explore both access control mechanisms described above and possibly others. The application-based AC is preferred from the secure hardware token perspective because it is an established standard, an update for web applications has already been proposed [SEAPI], and it is generic (i.e. not specific to a particular application domain).