Web Payment APIs

This is a discussion of the requirements for a payment-solution-neutral API that web applications can invoke to initiate a payment. Further background is available from the main page of the Payments Task Force.

Use Cases

In principle, Web APIs could be used for three major classes of payments:

Proximity payments are where both parties are present at the same location. An example is paying for goods or services at a point of sales terminal. Web technologies could be used within the point of sales terminal itself or within a mobile device that is presented to make the payment. This may involve presenting a user interface to select/confirm the use of prepaid vouchers or discount coupons, or for the selection of the means to use for the payment, e.g. selecting one of the virtual cards in your wallet. The process would in turn provide the user with a receipt for the transaction, and optionally discount coupons for use in future purchases. These could be stored in a device or cloud based wallet. Proximity payments could work offline, which matters because being online may not be practical in all circumstances.

Remote payments cover the case where the user makes a payment to a website or service. This could be backed by a virtual credit card, by a mobile operator billing service, or by drawing down the value of an account that has been set up for this purpose. The user will be presented with a way to select the means of payment, and to confirm the payment. As in the proximity case, there could be a decision whether to use prepaid vouchers or discount coupons. Likewise, there would be a need to provide the user with a receipt and, optionally, discount coupons. Remote payments assume an online connection, but in principle the account could be stored in a local secure element rather than remotely.

Peer to Peer payments would allow the user to pay another person. You could, for instance, launch a payment app on your mobile device, select the amount you want to transfer, and then tap the recipient's mobile device to make the transfer. The payment app on the recipient's device is invoked via NFC and displays a confirmation of receipt of payment. The payment could be made as value in a shared system that can be spent without having to go online, or it could involve some form of virtual cheque that needs to be cleared. Peer to peer payments can also be made as a form of remote payment when the recipient isn't present at the same location as the person making the payment. Such payments could be settled live, or require some form of deferred settlement.

The above scenarios show that there is a need for several kinds of APIs.

Initiating a payment request

A common scenario is a web app that requests a payment which is then paid via a wallet integrated into the same device that the app is running on. Another scenario is where the payment will be made by a separate device, for example, a smart phone. The app requesting the payment could be a regular web app running on a user's device, or it could be acting as a dedicated point of sales terminal.

A further scenario is where the user wishing to make a payment is using a public device, e.g. a computer in an Internet Cafe. Users would need to authenticate themselves with cloud based payment solutions, potentially using their phone as part of the authentication process. A related scenario involves browser sync where the user logs into their cloud based settings to initiate a browser session, and the computer temporarily acts as a personal device, purging all local records when the user logs out of the session.

Responding to a payment request

In this case, a web app is invoked to handle a payment request made by another device. The app could be running already, or it could be launched automatically, e.g. as a result of an NFC exchange, or perhaps even by presenting a QR code that the user's device then scans. The payment app could provide the proof of payment directly, or it could be passed indirectly to a server identified in the request.

Initiating a payment

This covers the case where one person wants to pay another in a peer to peer transaction. The recipient may or may not be physically present.

Responding to a payment

In this case, the web app is responsible for handling the receipt of the payment and displaying it to the user.

Authenticating the User

On the desktop, a variety of approaches have been adopted, e.g.

User id and password plus a shared secret, e.g. LloydsTSB asks for the characters in this secret at a couple of randomly selected positions.
Displaying a number on screen, then calling the user's phone (as designated in their profile) and asking them to key in the number, as proof that they are physically adjacent to their phone (a mobile or landline).
Hardware generated security codes. Barclays Bank's PINsentry requires users to insert their card and type their regular PIN on the device's keypad. An integrated display then shows an eight digit number that the user has to type into the bank's website. For visually-impaired users, a larger card reader will be available that includes a loudspeaker and a headphone jack.
Some computers have fingerprint readers that could in principle be used as a second factor for authentication.
Please add to this list if you know of other techniques.
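The first technique in the list, asking for characters at randomly selected positions of a shared secret, can be sketched as follows. This is only an illustration: the function names `pick_positions` and `check_answer` are hypothetical, positions are counted from 1, and the sketch assumes the service can read the secret in clear text, which a real deployment would need to protect carefully.

```python
import hmac
import secrets


def pick_positions(secret_length: int, count: int = 2) -> list:
    """Choose `count` distinct 1-based positions to ask the user about."""
    rng = secrets.SystemRandom()
    return sorted(rng.sample(range(1, secret_length + 1), count))


def check_answer(shared_secret: str, positions: list, answer: str) -> bool:
    """Compare the user's answer with the characters at the asked positions."""
    expected = "".join(shared_secret[p - 1] for p in positions)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), answer.encode("utf-8"))
```

For the secret "memorable" and positions 2 and 5, the expected answer is "er".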

On mobile devices, the physical presence of the phone provides some assurance, as the device will in many cases be owned and used by a single person. When using a web application from the phone, in principle, it should be possible to provide a proof of the phone's identity. This can be facilitated by services provided by the SIM card, or other secure elements integrated as part of the phone's hardware. The user could also be asked to type in a PIN in the same manner as for conventional point of sales card readers. Biometric techniques are applicable, for example, voice authentication where the user is asked to speak a pass phrase or a short digit sequence. Some phones (e.g. Motorola Atrix 4G, and allegedly Apple's iPhone 5S) integrate fingerprint scanners as a secure means to unlock the phone, and users could be asked to swipe their finger during the payment process. In principle, the phone's integrated camera could be used for face authentication.

Authenticating the Payment Provider

Websites/apps will want to ensure that providers are who they say they are. One technique is to use a whitelist of trusted providers along with each provider's public key. If the user's wallet doesn't contain a payment solution that directly matches the ones supported by the website, then in principle, a mutually trusted third party could be used as a go-between. This is likely to be needed to allow payment solutions to scale globally. It raises the challenge of enabling the website and wallet to negotiate to find a satisfactory solution. A simple approach is for the web application to list the accepted solutions as part of the payment request that is passed to the Web payment API. The wallet could then search for a mutually trusted third party. The proof of payment sent to the website could either be generated by the third party, or it could be countersigned by the third party as a guarantee that the payment will be honoured.
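The simple negotiation approach described above might look like the following sketch. It assumes the payment request carries a preference-ordered list of accepted solutions, and that the wallet has some way to look up which third parties bridge which solutions. The `brokers` map, the function name `choose_route`, and the shape of the result are all hypothetical.

```python
from typing import Optional


def choose_route(accepted: list, wallet: list, brokers: dict) -> Optional[dict]:
    """Pick a payment route for a request.

    accepted: solutions listed by the website, in order of preference.
    wallet:   solutions available in the user's wallet.
    brokers:  hypothetical map of trusted third party -> set of solutions it bridges.
    """
    # Prefer a solution that both sides support directly.
    for solution in accepted:
        if solution in wallet:
            return {"via": None, "pay_with": solution, "settle_as": solution}
    # Otherwise look for a third party that both sides can use.
    for broker, bridged in brokers.items():
        pay_with = next((s for s in wallet if s in bridged), None)
        settle_as = next((s for s in accepted if s in bridged), None)
        if pay_with is not None and settle_as is not None:
            return {"via": broker, "pay_with": pay_with, "settle_as": settle_as}
    return None  # no mutually acceptable route was found
```

A real wallet would also have to verify that the chosen third party is trusted by both sides, for instance against the whitelist of public keys mentioned above.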

Information Flows

The following section looks at the kinds of information flows needed to support payments, as well as pre-paid vouchers and discount coupons.

Your help is invited to correct and expand the following:

The following analysis considers proximity and remote purchases.

Mozilla Payment Flow

This is included here to provide a concrete example:

The application initiates the process by signing a payment request and calling navigator.mozPay()
The browser initiates a trusted dialog in a chrome generated iframe
The Payment Provider presents a purchasing flow within this trusted dialog
The buyer is authenticated by the Payment Provider using the mechanisms chosen by the provider
The buyer completes or cancels the purchase
The application receives a JavaScript callback when the buyer completes or cancels the purchase
The application server receives a signed POST request with a transaction identifier and an indication that the purchase was successfully completed, or that it failed

Here is an example of the payment token:

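 # Assumes a Python JWT library such as PyJWT is installed.
 import jwt

 # APPLICATION_KEY and APPLICATION_SECRET identify the application to the
 # payment provider and must be defined before this code runs.
 paymentJWT = jwt.encode({
   "iss": APPLICATION_KEY,                 # issuer: the requesting application
   "aud": "marketplace.firefox.com",       # audience: the payment provider
   "typ": "mozilla/payments/pay/v1",
   "iat": 1337357297,                      # issued at (Unix time)
   "exp": 1337360897,                      # expires one hour after "iat"
   "request": {
     "id": "915c07fc-87df-46e5-9513-45cb6e504e39",
     "pricePoint": 1,
     "name": "Magical Unicorn",
     "description": "Adventure Game item",
     "productData": "user_id=1234&my_session_id=XYZ",
     "postbackURL": "https://yourapp.com/payments/postback",
     "chargebackURL": "https://yourapp.com/payments/chargeback"
   }
 }, APPLICATION_SECRET)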

For more details, see: https://wiki.mozilla.org/WebAPI/WebPayment
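To make the signing step less opaque, here is a minimal sketch of how a token of this kind can be signed and later checked using only the Python standard library. It assumes HMAC-SHA256 (the JWT "HS256" algorithm) with a shared secret. The helper names `sign_jwt` and `verify_jwt` are made up for this page, and the sketch deliberately omits checks a real server would need, such as rejecting expired tokens via "exp".

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used in compact JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url, restoring the padding that JWTs strip."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_jwt(claims: dict, secret: str) -> str:
    """Serialise the claims as a compact JWT signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    mac = hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256)
    return signing_input + "." + _b64url(mac.digest())


def verify_jwt(token: str, secret: str) -> dict:
    """Return the claims if the signature matches, else raise ValueError."""
    try:
        header_b64, claims_b64, signature_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token")
    expected = hmac.new(
        secret.encode("utf-8"),
        f"{header_b64}.{claims_b64}".encode("ascii"),
        hashlib.sha256,
    ).digest()
    # Constant-time comparison of the expected and presented signatures.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    return json.loads(_b64url_decode(claims_b64))
```

The same check could be applied by the application server to the signed POST request it receives at the end of the flow, assuming that request is signed with the same secret.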

More Generally

The payment request needs to contain information describing:

The name and details for the recipient of the payment
The goods or services being purchased, and their prices
The net price and currency for the payment
The tax involved (if any)
Details for how the response is to be provided
Additional constraints on payment providers

To ensure sufficient flexibility, a tagged structured data format like JSON or XML seems appropriate. Since the request is in essence a contract, it will generally speaking require some form of digital signature. However, there could be exceptions, e.g. for really small payments.
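As a rough illustration of the items listed above, the following sketch models a payment request and serialises it to JSON. All field names are assumptions made for this page rather than a proposed format, and the signature step is left out.

```python
import json
from dataclasses import asdict, dataclass, field
from decimal import Decimal


@dataclass
class LineItem:
    description: str  # the goods or services being purchased
    price: Decimal    # price of this item, in the request currency


@dataclass
class PaymentRequest:
    payee_name: str    # recipient of the payment
    currency: str      # e.g. an ISO 4217 code such as "EUR"
    items: list        # list of LineItem
    tax: Decimal       # tax involved, zero if none
    response_url: str  # how the response is to be provided
    accepted_providers: list = field(default_factory=list)  # provider constraints

    @property
    def net_price(self) -> Decimal:
        """Sum of the item prices, before tax."""
        return sum((item.price for item in self.items), Decimal("0"))

    def to_json(self) -> str:
        """Serialise to JSON, writing decimal amounts as strings."""
        body = asdict(self)
        body["net_price"] = self.net_price
        return json.dumps(body, default=str, sort_keys=True)
```

Amounts are kept as decimal strings rather than floating point numbers so that prices survive serialisation without rounding errors.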

The information needs to be sufficient for the payment provider, and for the receipt. Users may require more human friendly descriptions for the request and receipt than are needed by the payment provider and/or tax authority.

To allow for the use of prepaid vouchers and discount coupons we need a multistage process where the purchasers are presented with the payment request, and offered the chance to use any vouchers or coupons they have available in their virtual wallet. The application requesting the payment can then adjust the request as appropriate. The purchaser can then complete the selection of the means of payment to be used for this transaction. At this point, the payment provider will need to authenticate the purchaser. The requirements for this will depend on the context. Authentication may not be needed for small payments from a device resident wallet in the physical possession of the purchaser.
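The adjustment step in this multistage process could be sketched as below. The order of application (percentage coupons first, then prepaid vouchers), the rounding to two decimal places, and the authentication threshold are all assumptions for the sake of the example.

```python
from decimal import Decimal


def adjust_total(total: Decimal, percent_coupons: list, vouchers: list) -> Decimal:
    """Return the amount still to be paid after coupons and vouchers."""
    # Apply each percentage discount to the running total.
    for percent in percent_coupons:
        total = total * (Decimal(100) - percent) / Decimal(100)
    total = total.quantize(Decimal("0.01"))
    # Deduct prepaid vouchers, never letting the total go below zero.
    for value in vouchers:
        total = max(Decimal("0.00"), total - value)
    return total


def needs_authentication(total: Decimal, threshold: Decimal = Decimal("5.00")) -> bool:
    """Small payments may skip authentication; the threshold is hypothetical."""
    return total > threshold
```

For example, a total of 20.00 with a 10 percent coupon and a 5.00 voucher leaves 13.00 to pay, which is above the assumed threshold and so would still require authentication.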

Once the user dialog with the payment provider completes, a response needs to be provided to the payment requester and separately to the purchaser for the receipt and optional coupons. When using their phone to pay for purchases at a point of sales terminal, users may expect to see (on their phone) an indication that the transaction completed, how much credit they have remaining, and any coupons they have been awarded. The receipts and coupons could be stored on the phone or in a personal service in the cloud.

Note that the process is similar for remote payments, where instead of a phone and a point of sales terminal, we have a browser with a trusted dialog and separately the application requesting the payment.

Payment Request API Responses

The web application invokes the payment request API, passing information as described above. The response to the application is asynchronous, and would be delivered via a callback to a handler defined by the application script. In principle, there could be several different callbacks:

to notify an error in the request
to pass vouchers/coupons for the app to consider and resubmit an updated request
to notify a successful payment and provide a proof of payment
to notify that the payment has been cancelled or failed in some way
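One possible shape for these four callbacks is sketched below. The outcome dictionary with "kind" and "detail" keys, and all of the names, are invented for illustration. An actual API would most likely be defined for JavaScript rather than Python.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PaymentCallbacks:
    on_request_error: Callable[[str], None]        # error in the request
    on_offers: Callable[[list], None]              # vouchers/coupons to consider
    on_success: Callable[[str], None]              # proof of payment
    on_cancelled_or_failed: Callable[[str], None]  # reason for cancellation/failure


def dispatch(outcome: dict, callbacks: PaymentCallbacks) -> None:
    """Route an outcome from the payment provider to the matching handler."""
    handlers = {
        "request_error": callbacks.on_request_error,
        "offers": callbacks.on_offers,
        "success": callbacks.on_success,
        "cancelled_or_failed": callbacks.on_cancelled_or_failed,
    }
    kind = outcome["kind"]
    if kind not in handlers:
        raise ValueError(f"unknown outcome kind: {kind!r}")
    handlers[kind](outcome["detail"])
```

After an "offers" outcome, the application would be expected to resubmit an updated request, which may trigger a further round of callbacks.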

For some payment solutions, the proof of payment may need to be delivered directly to a service in the cloud. The details for this would be outside the scope of the W3C API, but the payment request would need to include sufficient information for the payment provider to determine where and how to deliver the proof of payment. The callback to the web application would allow the application to verify that the proof was delivered, e.g. via an XMLHttpRequest to the application server.

How would offline payments work? In principle, the wallet could draw down a pre-filled account, keeping a record of the transactions, and synchronising these with the payment provider, when the device next goes online. In this approach, the proof of payment would be provided direct to the web application rather than indirectly as in the previous paragraph.
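The draw-down idea can be sketched as follows. This is a toy model under several assumptions: the class and method names are hypothetical, the record returned by `pay` stands in for a proof of payment that would in practice need to be signed, and `upload` is whatever function sends a record to the payment provider once the device is online.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Callable


@dataclass
class OfflineWallet:
    balance: Decimal                             # pre-filled account balance
    pending: list = field(default_factory=list)  # records not yet synchronised
    next_seq: int = 1                            # sequence number for the next record

    def pay(self, payee: str, amount: Decimal) -> dict:
        """Draw down the balance and keep a record for later synchronisation."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        if amount > self.balance:
            raise ValueError("insufficient pre-filled balance")
        self.balance -= amount
        record = {"seq": self.next_seq, "payee": payee, "amount": str(amount)}
        self.next_seq += 1
        self.pending.append(record)
        return record  # passed directly to the web application

    def synchronise(self, upload: Callable[[dict], None]) -> int:
        """Upload pending records in order; return how many were sent."""
        sent = 0
        while self.pending:
            upload(self.pending[0])  # if this raises, the record stays pending
            self.pending.pop(0)
            sent += 1
        return sent
```

A record is only removed from the pending list after its upload returns, so a dropped connection leaves the remaining records queued for the next attempt.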

If the user's wallet doesn't contain a payment provider that directly matches the ones indicated in the payment request, then a search will be needed to find a suitable third party to broker the payment. This would only be practical if the device is online. What requirements does the use of such third parties place on the web payment API?

Person to Person Payments

The W3C System Applications Working Group is developing the APIs and security and execution model for trusted system applications. This would allow for web applications that enable the user to make a payment to another person. The usual requirements of accountability and prevention of double spending apply. If the payer and the payee are face to face, the payer could open up his wallet and select the amount to be paid, then touch his phone to the payee's phone to make the transfer, holding the devices in contact until the confirmation of receipt of payment is registered on the payee's phone display.

One way for this to work is for both devices to use the same payment provider with a prefilled account held in a secure element on the device. The software in the secure element is trusted to debit the payer's account and credit the payee's account. Another mechanism would be to use some kind of digital cheque that has to be cleared before the recipient can spend the money conveyed in the cheque. The payer and payee may well have accounts with multiple payment providers. In this case, some form of negotiation may be needed to find a mutually agreed provider for the current transaction. In principle, this can be implemented with the NFC LLCP peer to peer protocol which permits bidirectional messaging between devices as long as the session lasts. The W3C NFC Working Group Charter envisages work on enabling web applications to make use of this protocol. For interoperability, W3C would need to define the peer to peer messaging involved for the negotiation process.
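The first mechanism, a prefilled account on each device, could be modelled roughly as below to show how double spending might be prevented. This is only a sketch: the names are hypothetical, and the transfer message is an unsigned dictionary, whereas trusted software in a secure element would need to sign it so that the payee's device can tell it is genuine.

```python
import uuid
from decimal import Decimal


class StoredValueAccount:
    """Toy model of a prefilled account held on one device."""

    def __init__(self, balance: Decimal) -> None:
        self.balance = balance
        self.seen_transfers = set()  # identifiers of transfers already credited

    def debit(self, amount: Decimal) -> dict:
        """Remove value from the payer's account and describe the transfer."""
        if amount <= 0 or amount > self.balance:
            raise ValueError("invalid amount")
        self.balance -= amount
        return {"transfer_id": uuid.uuid4().hex, "amount": amount}

    def credit(self, transfer: dict) -> bool:
        """Add value to the payee's account once per transfer identifier."""
        if transfer["transfer_id"] in self.seen_transfers:
            return False  # replayed message: refuse to credit twice
        self.seen_transfers.add(transfer["transfer_id"])
        self.balance += transfer["amount"]
        return True
```

The payer's device would call `debit` and send the result over the NFC link, and the payee's device would call `credit` and display the confirmation.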

What implications are there for the payment API? For person to person payments, we have the API used to request the payment in the payer's device, and the API used to register the receipt of payment in the payee's device. It could also be arranged the other way around, where the process starts with a request by an application on the payee's device. The main distinction is the difference between the intent to pay someone else versus the intent to ask for a payment from someone else. This should have minimal impact on the API design.