This is an archived snapshot of W3C's public Bugzilla bug tracker, decommissioned in April 2019.

Bug 15786 - "an idiomatic phrase from another language" doesn’t cover non-idiomatic transliterated foreign words
Status: RESOLVED WONTFIX
Alias: None
Product: WHATWG
Classification: Unclassified
Component: HTML
Version: unspecified
Hardware: Other other
Importance: P3 normal
Target Milestone: Unsorted
Assignee: Ian 'Hixie' Hickson
QA Contact: contributor
URL: http://www.whatwg.org/specs/web-apps/...
 
Reported: 2012-01-30 08:54 UTC by contributor
Modified: 2013-01-29 21:14 UTC

Description contributor 2012-01-30 08:54:19 UTC
Specification: http://www.whatwg.org/specs/web-apps/current-work/multipage/text-level-semantics.html
Multipage: http://www.whatwg.org/C#the-i-element
Complete: http://www.whatwg.org/c#the-i-element

Comment:
"an idiomatic phrase from another language" doesn’t cover non-idiomatic
transliterated foreign words in English prose

Posted from: 58.90.241.170 by w3.org@boblet.net
User agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.77 Safari/535.7
Comment 1 Oli Studholme 2012-01-30 14:34:54 UTC
The spec for the i element currently includes “an idiomatic phrase from another language” as an example usage, for idiomatic phrases like “de facto” that are commonly italicised in English prose. However, transliterated foreign words are also typically italicised in English prose, and now that the spec no longer includes the comment “(content whose typical typographic presentation is italicized)”, there’s no mention of this other foreign language-related use case.
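
To make the two cases concrete, here is a minimal markup sketch, not taken from the spec. The first paragraph marks up an idiomatic phrase; the second marks up a transliterated term of the kind this bug is about. The sentences, the term “wabi-sabi”, and the lang values “la” and “ja-Latn” are illustrative assumptions.

```html
<!-- Idiomatic phrase from another language, conventionally italicised in English prose -->
<p>The committee is the <i lang="la">de facto</i> owner of the process.</p>

<!-- Transliterated (romanised) foreign term: the case the current wording may not cover -->
<p>The tea bowl is a good example of <i lang="ja-Latn">wabi-sabi</i>.</p>
```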

My sloppy phrasing in an HTML5 Doctor article[1] has led to some confusion about whether the spec says foreign words should always use the i element. I’d like to correct the article, but the spec currently doesn’t seem to cover transliterated foreign words. Also, while I assumed “idiomatic phrase” referred to definition 1 in the American Heritage dictionary[2], definitions 2 and 3 are also possible interpretations, and definition 2 would make the current wording apply to transliterated foreign-language text.

1. “A speech form or an expression of a given language that is peculiar to itself grammatically or cannot be understood from the individual meanings of its elements, as in keep tabs on.”
2. “The specific grammatical, syntactic, and structural character of a given language.”
3. “Regional speech or dialect.”

I also suspect that this use case only applies to transliterated words in prose written in a European language, since Japanese, for example, doesn’t even have italics (the Japanese equivalent is katakana).


In The Chicago Manual of Style (15th Ed.) italicising foreign words is covered in Chapter 7 (“Spelling, Distinctive Treatment of Words, and Compounds”) 7.51-7.56, and Chapter 10 (“Foreign languages”), especially 10.93. Here are the relevant quotes…

# Chapter 7, under the subtitles “Italics, Capitals, and Quotation Marks” and “Foreign Words”:

7.51 “Italics. Italics are used for isolated words and phrases in a foreign language if they are likely to be unfamiliar to readers.”
“An entire sentence or a passage of two or more sentences in a foreign language is usually set in roman and enclosed in quotation marks.”

7.54 “Familiar foreign words. Foreign words and phrases familiar to most readers and listed in Webster are not italicized if used in an English context”
“If confusion might arise, however, foreign terms are best italicized and spelled as in the original language.”
“The decision to italicize should not be based solely on whether a term appears in Webster.”

7.55 “Italics at first occurrence. If a foreign word not listed in an English dictionary is used repeatedly throughout a work, it need be italicized only on its first occurrence. If it appears only rarely, however, italics may be retained.”

7.56 “Scholarly words and abbreviations. Commonly used Latin words and abbreviations should not be italicized. [Examples:] ibid., et al., ca., passim. Because of its peculiar use in quoted matter, sic is best italicized.”

# Chapter 10, under the subtitle “Languages Usually Transliterated (or Romanized)”:

10.93 “Italics versus roman. Transliterated terms (other than proper names) that have not become part of the English language are italicized (see 7.51-7.52). If used throughout a work, a transliterated term may be italicized on first appearance and then set in roman (see 7.55). Words listed in the dictionary are usually set in roman (see 7.54).”

Under the subtitle “Classical Greek”:

10.131 “Transliterated Greek words or phrases are usually italicized unless the same words occur frequently, in which case they may be italicized at first mention and then set in roman.”


# Suggested change

s/an idiomatic phrase from another language/an idiomatic phrase or short span of transliterated prose from another language/
…or something else conveying these two uses, since this wording is admittedly a little cumbersome.

[1] http://html5doctor.com/i-b-em-strong-element/#comment-21939
[2] http://www.thefreedictionary.com/Idiomatic+phrase

Aside: http://www.merriam-webster.com/dictionary/de%20facto ;)
Comment 2 contributor 2012-07-18 16:01:23 UTC
This bug was cloned to create bug 18032 as part of operation convergence.
Comment 3 Ian 'Hixie' Hickson 2012-07-20 04:08:24 UTC
I'm not sure I understand. Can you give an example of the kind of phrase you mean?
Comment 4 Ian 'Hixie' Hickson 2012-08-27 23:09:25 UTC
I'm not sure I understand what kind of thing you mean.

In any case, the term "an idiomatic phrase from another language" in the spec is non-normative, being as it is an example (introduced with "such as").
Comment 5 Ian 'Hixie' Hickson 2012-10-15 23:14:48 UTC
Reopening to minimise forking. Need to consider the change the W3C made to their version.
Comment 6 Ian 'Hixie' Hickson 2013-01-29 21:14:05 UTC
Looks like they just added "or short span of transliterated prose". That doesn't make much sense (why would you mark those up with an <i> but not non-transliterated prose, or non-prose, from another language?).

For the case of just text from another language, lang="" is the appropriate markup, not <i>.
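
As a rough illustration of that distinction (an assumed example, not markup from the spec or from any comment above): text that is simply in another language carries lang on whatever element already contains it, with no i element involved. The French sentences and the choice of the q and span elements are hypothetical.

```html
<!-- Text that is simply in another language: lang on the containing element, no <i> -->
<p>She shrugged and said, <q lang="fr">Je ne sais pas.</q></p>

<!-- With no suitable element already present, a span can carry the attribute -->
<p>The sign read <span lang="fr">Défense de fumer</span> in large letters.</p>
```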