26295 – Page source not defined

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 26295 - Page source not defined

Summary: Page source not defined

Status:	RESOLVED WONTFIX

Alias:	None

Product:	Browser Test/Tools WG
Classification:	Unclassified
Component:	WebDriver
Version:	unspecified
Hardware:	PC All

Importance:	P2 normal
Target Milestone:	---
Assignee:	Browser Testing and Tools WG
QA Contact:	Browser Testing and Tools WG

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:	20860

Reported:	2014-07-09 08:37 UTC by Andreas Tolfsen
Modified:	2015-03-31 15:12 UTC
CC List:	3 users

See Also:

Attachments
Example (857 bytes, text/x-python-script) 2014-07-09 13:18 UTC, Andreas Tolfsen

Description Andreas Tolfsen 2014-07-09 08:37:08 UTC

The command to get page source is not yet defined in the specification.  The DOM Parsing specification mentions an XMLSerializer interface which barancev says FirefoxDriver is already using:

    http://domparsing.spec.whatwg.org/#xmlserializer

The algorithm they use for XML serialization is here:

    http://domparsing.spec.whatwg.org/#concept-serialize-xml

It should be made clear in a note or something that the stringified DOM we return doesn't necessarily correctly reflect the active DOM as it would've been seen if inspecting the DOM through a debugger.  With more tactful prose.

Comment 1 Andreas Tolfsen 2014-07-09 13:18:43 UTC

Created attachment 1491
Example

After examining this more closely, I think what we want is to call outerHTML on the documentElement instead, which “[…] represents the markup of the Element and its contents”:

    http://domparsing.spec.whatwg.org/#outerhtml

This seems to be closer to the intended behaviour of getting the current DOM and will work on both HTML and XHTML type documents.

So the spec should simply refer to the DOM Parsing spec's definition of outerHTML without making any special guarantees or promises.  We should however add a note saying something along the lines of this:

“Note: The details of the outerHTML serialization may be subject to user agent details.  For example the sequence order of attributes on DOM nodes is unspecified.”
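Because attribute order is unspecified, comparing serialized output byte for byte can be brittle. The following is a minimal Python sketch of an order-insensitive comparison. It assumes the markup is well-formed (XHTML-style) and uses the standard library's xml.etree.ElementTree as a stand-in parser; the function name same_markup and the sample strings are hypothetical and not part of any WebDriver API.

```python
import xml.etree.ElementTree as ET


def same_markup(a: str, b: str) -> bool:
    """Compare two well-formed markup strings, ignoring attribute order."""

    def equal(x: ET.Element, y: ET.Element) -> bool:
        return (
            x.tag == y.tag
            and x.attrib == y.attrib  # dict comparison ignores ordering
            and (x.text or "") == (y.text or "")
            and (x.tail or "") == (y.tail or "")
            and len(x) == len(y)
            and all(equal(c, d) for c, d in zip(x, y))
        )

    return equal(ET.fromstring(a), ET.fromstring(b))


# Hypothetical serializations from two user agents: same DOM,
# different attribute order.
first = '<html><body><p id="a" class="b">hi</p></body></html>'
second = '<html><body><p class="b" id="a">hi</p></body></html>'
```

Real-world HTML is often not well-formed XML, so a test suite would likely need an HTML-aware parser in place of ElementTree.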

There are also the following provisions in the spec about error handling:

  * Throws InvalidStateError if element in an XML document cannot be serialized to XML
  * Throws SyntaxError if the given string is not well-formed
  * Throws NoModificationAllowedError if the parent of the element is a document node

The first two are largely theoretical in our case, as few XHTML documents are both ill-formed /and/ modify the DOM to become invalid at runtime.

The third is irrelevant in our case as we will always be calling this on document.documentElement.

We could raise an "invalid element state" in this case.

Comment 2 David Burns :automatedtester 2015-03-31 15:12:37 UTC

It was decided not to add pageSource to the specification, so closing.

To get the same result, the local end can do:

> driver.executeScript("return document.querySelector(':root').outerHTML")
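
As an illustration, the same workaround could be wrapped in a small local-end helper. This is a sketch in Python, assuming a driver object that exposes an execute_script(script) method, as Selenium's Python bindings do; the helper name page_source_via_script is hypothetical.

```python
ROOT_OUTER_HTML = "return document.querySelector(':root').outerHTML"


def page_source_via_script(driver) -> str:
    """Return the serialized root element of the current document.

    `driver` is assumed to expose an `execute_script(script)` method,
    as Selenium's Python WebDriver does.
    """
    return driver.execute_script(ROOT_OUTER_HTML)
```

Note that outerHTML on the root element covers only that element and its contents, so anything outside it, such as the doctype, is not part of the result.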