<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>29530</bug_id>
          
          <creation_ts>2016-03-14 17:20:27 +0000</creation_ts>
          <short_desc>Proposal for Get Element Text</short_desc>
          <delta_ts>2016-09-19 23:10:35 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>Browser Test/Tools WG</product>
          <component>WebDriver</component>
          <version>unspecified</version>
          <rep_platform>PC</rep_platform>
          <op_sys>Windows NT</op_sys>
          <bug_status>REOPENED</bug_status>
          <resolution></resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          <blocked>20860</blocked>
          <everconfirmed>1</everconfirmed>
          <reporter name="clmartin@microsoft.com">clmartin</reporter>
          <assigned_to name="Browser Testing and Tools WG">public-browser-tools-testing</assigned_to>
          <cc>dawagner</cc>
    
    <cc>dburns</cc>
    
    <cc>johnjan</cc>
    
    <cc>juangj</cc>
    
    <cc>mike</cc>
    
    <cc>simon.m.stewart</cc>
          
          <qa_contact name="Browser Testing and Tools WG">public-browser-tools-testing</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>125484</commentid>
    <comment_count>0</comment_count>
    <who name="clmartin@microsoft.com">clmartin</who>
    <bug_when>2016-03-14 17:20:27 +0000</bug_when>
    <thetext>&quot;12.5 getElementText()&quot;

John Jansen and I were reading the specification for getElementText(), and he remembers discussing it at a face-to-face meeting but can&apos;t find the minutes. Below is what we think was agreed upon:

Get Element Text

GET /session/{session id}/element/{element id}/text

The Get Element Text command retrieves the textContent value of the given web element.

1. If the current top-level browsing context is no longer open, return error with error code no such window.
2. Handle any user prompts and return its value if it is an error.
3. Let element result be the result of getting a known element by UUID parameter element id.
4. If element result is a success, let element be element result&apos;s data.
Otherwise, return element result.
5. If element is stale, return error with error code stale element reference.
6. Let element text be the result of calling &lt;a href=&quot;https://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#Node3-textContent&quot;&gt;textContent&lt;/a&gt; on the specified element.
7. Let body be a JSON Object with the &quot;value&quot; member set to the element text.
8. Return success with data body.
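
For illustration only, the steps above could be sketched roughly as follows. The Session and Element classes are hypothetical stand-ins, not part of any WebDriver implementation, and the "no such element" error for a failed lookup is an assumption, since step 4 only says to return the element result.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Element:
    """Hypothetical element record holding its textContent value."""
    text_content: str
    stale: bool = False


@dataclass
class Session:
    """Hypothetical session state, just enough to walk the steps."""
    window_open: bool = True
    prompt_error: Optional[str] = None
    elements: Dict[str, Element] = field(default_factory=dict)


def get_element_text(session: Session, element_id: str) -> dict:
    # Step 1: the top-level browsing context must still be open.
    if not session.window_open:
        return {"error": "no such window"}
    # Step 2: a user prompt that yields an error ends the command.
    if session.prompt_error is not None:
        return {"error": session.prompt_error}
    # Steps 3-4: look up the known element (error code is an assumption).
    element = session.elements.get(element_id)
    if element is None:
        return {"error": "no such element"}
    # Step 5: reject stale elements.
    if element.stale:
        return {"error": "stale element reference"}
    # Steps 6-8: wrap the textContent value in the "value" member.
    return {"value": element.text_content}
```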

What do you think?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>125495</commentid>
    <comment_count>1</comment_count>
    <who name="Daniel Wagner-Hall">dawagner</who>
    <bug_when>2016-03-15 03:00:04 +0000</bug_when>
    <thetext>As I recall, the reasons we bothered to specify our own algorithm were:
 * textContent will include any text which is hidden by CSS/other styling; i.e. &lt;div&gt;foo&lt;span style=&quot;display: none;&quot;&gt;bar&lt;/span&gt;&lt;/div&gt;&apos;s textContent will return &quot;foobar&quot; where we specify it to return &quot;foo&quot;.
 * textContent doesn&apos;t perform any whitespace normalisation, so looks different to how the text will look in the browser; i.e. &lt;div&gt;foo  bar&lt;/div&gt;&apos;s textContent will return &quot;foo  bar&quot; where we specify it to return &quot;foo bar&quot;.
 * We forcibly insert newlines at the start of new block-level elements; i.e. &lt;div&gt;foo&lt;div&gt;bar&lt;/div&gt;&lt;/div&gt;&apos;s textContent will return &quot;foobar&quot; where we specify it to return &quot;foo\nbar&quot;.
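
As a rough illustration of those three differences, here is a minimal Python sketch. The Node class is a hypothetical stand-in for a DOM element, and rendered_text only approximates the three rules listed above; it is not the algorithm from the specification.

```python
import re
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    """Hypothetical stand-in for a DOM element (not a real DOM API)."""
    children: List[Union["Node", str]] = field(default_factory=list)
    hidden: bool = False  # models display: none
    block: bool = False   # models a block-level element such as div


def text_content(node: Node) -> str:
    """Concatenate every descendant text node, as textContent does."""
    return "".join(
        child if isinstance(child, str) else text_content(child)
        for child in node.children
    )


def rendered_text(node: Node) -> str:
    """Approximate the three rules: skip hidden, break on blocks, collapse spaces."""
    parts = []
    for child in node.children:
        if isinstance(child, str):
            parts.append(child)
        elif not child.hidden:
            inner = rendered_text(child)
            # Start nested block-level content on a new line.
            parts.append("\n" + inner if child.block and parts else inner)
    # Collapse runs of spaces and tabs into a single space.
    return re.sub(r"[ \t]+", " ", "".join(parts))
```

With this model, Node(["foo", Node(["bar"], block=True)], block=True) gives "foobar" from text_content and "foo\nbar" from rendered_text.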

The thing we really wanted to defer to was much closer to innerText, but innerText isn&apos;t standardised (and wasn&apos;t supported in Firefox until very recently).</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>125497</commentid>
    <comment_count>2</comment_count>
    <who name="David Burns :automatedtester">dburns</who>
    <bug_when>2016-03-15 10:08:55 +0000</bug_when>
    <thetext>We discussed in Sapporo that we want innerText[1]. I know this isn&apos;t an official specification, but it gives us most of the end state that we want.

There is an outstanding bug[2] for Roc&apos;s spec[1] to be incorporated into the HTML spec.

[1] http://rocallahan.github.io/innerText-spec/
[2] https://github.com/whatwg/html/issues/465</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>125510</commentid>
    <comment_count>3</comment_count>
    <who name="clmartin@microsoft.com">clmartin</who>
    <bug_when>2016-03-15 18:15:46 +0000</bug_when>
    <thetext>Just tested to verify: the major advantage of textContent is that all browsers support it and return the same value for it in most cases.

I tested some sample markup: https://jsfiddle.net/whqptr65/


textContent returned the exact same string for foo.textContent:
Edge:
&quot;\r\n            bar\r\n            \r\n                Foo bar\r\n            \r\n            foo\r\n        &quot;
Chrome:
&quot;\r\n            bar\r\n            \r\n                Foo bar\r\n            \r\n            foo\r\n        &quot;
Firefox:
&quot;\r\n            bar\r\n            \r\n                Foo bar\r\n            \r\n            foo\r\n        &quot;

innerText failed in IE (not supported) and returned a different string in each of Chrome, Firefox, and Edge, as seen below for foo.innerText:

Edge:
&quot;bar \r\n\r\nFoo bar\r\nfoo &quot;
Chrome:
&quot;bar\r\nFoo bar\r\n\r\nfoo&quot;
Firefox:
&quot;bar\r\n\r\nFoo bar\r\n\r\nfoo&quot;

I would argue that, in this case, something that works the same way in all browsers is more valuable than something that works differently in each, is unsupported in IE, and is not even a spec.
I would also argue that a tester would know what content they can ignore and what is valuable to them, so hidden elements can be circumvented.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>127420</commentid>
    <comment_count>4</comment_count>
    <who name="John Jansen">johnjan</who>
    <bug_when>2016-09-19 16:42:20 +0000</bug_when>
    <thetext>We should follow the HTML definition here. We need tests to make sure nothing is broken...</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>127421</commentid>
    <comment_count>5</comment_count>
    <who name="Simon Stewart">simon.m.stewart</who>
    <bug_when>2016-09-19 16:43:12 +0000</bug_when>
    <thetext>We should avoid breaking existing Selenium tests. This method is used extensively, and we can &quot;break the web tests&quot; of many users if we&apos;re not extremely cautious.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>127422</commentid>
    <comment_count>6</comment_count>
    <who name="">juangj</who>
    <bug_when>2016-09-19 23:08:01 +0000</bug_when>
    <thetext>Reopened pending further discussion of the test results.

In summary, the only atoms tests that fail are these two tests about &lt;title&gt; elements: https://github.com/SeleniumHQ/selenium/blob/c10e8a955883f004452cdde18096d70738397788/javascript/webdriver/test/atoms/element_test.html#L151-L161

36 of roughly 800 tests from the Selenium Java suite failed, largely because of extra leading or trailing whitespace, or differing numbers of internal newlines.

For example, TextHandlingTest#testShouldHandleNestedBlockLevelElements fails:
Expected: is &quot;Cheese\nSome text\nSome more text\nand also\nBrie&quot;
     but: was &quot;Cheese\n\nSome text\n\nSome more text\n\nand also\n\nBrie&quot;</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>127423</commentid>
    <comment_count>7</comment_count>
    <who name="">juangj</who>
    <bug_when>2016-09-19 23:10:35 +0000</bug_when>
    <thetext>We could also run this across a much broader suite of &quot;real&quot; tests if that seems helpful, though obviously Google isn&apos;t totally representative of WebDriver&apos;s user base.</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>