Towards a best practice for marking up quotations in HTML+CSS

Introduction

There are some difficulties with the use of the <q> element in HTML. One is that typing <q> takes much longer than typing ‘, let alone ". But that can often be solved with tools, such as SGML shortrefs or wiki-like input syntaxes. Another difficulty is that the logical end of a quotation is often different from the typographical end, i.e., the place where the quote mark is inserted or the italics end. And the typographical rules are not even the same in all languages.

One may say this is a bug in HTML and CSS: One ought to be able to use logical (semantic) mark-up and still apply the typography that the language, tradition, or author demands.

This essay explores different ways of marking up a text containing quotations, first assuming HTML and CSS as they are today, and then after possible enhancements to HTML and CSS.

Open questions & further work

What are quotations?

Quote marks can be used for various things. Many of those are candidates for mark-up with <q> elements. Quote marks in general are a form of highlighting of text fragments and there are cases where emphasis (<em>) might also be a logical choice.

Quote marks come in various shapes and usually in pairs. (Dashes are also used and usually not paired.) The choice of quote marks is partly a question of taste and partly a question of tradition. In some languages (e.g., French) the tradition is stronger than in others (e.g., Dutch). The meaning depends on the usage, not on the shape.

Many modern typographers now consider that the single high quotes are the best ones for most cases: bigger marks interrupt the reading too much and smaller ones (or the absence of quotes) make people miss the start or end of the quotation. This is even the case in languages that traditionally used bigger marks, such as “…”, »…« or „…“.

There is less consensus on where to put the marks and when to repeat or suppress them: before or after punctuation, at the start of a line, after another quote mark… The different traditions still seem quite strong here.

Why use mark-up?

Quote marks are characters and one could type them directly rather than the HTML tags that generate them. Why may it be useful to mark up the text?

But there may also be reasons to put the quote marks in the text rather than in the style sheet:

At the end of this essay I explore ways to enhance CSS and HTML to allow both mark-up and author-supplied quote marks in the same document.

Simple case

The simple case is quoting a complete sentence in the same language as the surrounding text:

en-gb

John said: ‘The weather is nice in Nice.’

en-us

John said: ‘The weather is nice in Nice.’

nl-nl

John zei: ‘Het is mooi weer in Nice.’

fr-fr

John disait : « Il fait beau à Nice. »

Mark-up with the <q> element is straightforward:

en-gb

<p>John said: <q>The weather is nice in Nice.</q></p>

en-us

<p>John said: <q>The weather is nice in Nice.</q></p>

nl-nl

<p>John zei: <q>Het is mooi weer in Nice.</q></p>

fr-fr

<p>John disait : <q>Il fait beau à Nice.</q></p>

There is one thing to note, though: The full stop at the end belongs to the quotation and the outer sentence doesn't have one of its own.

All typographic traditions in fact seem to agree that the sequences .’. or .’? or ?’. must never occur. One full stop must be dropped. There is no CSS rule for that; it's the author's responsibility.

The CSS 'quotes' property works well for simple quotations. All that is needed is to specify the preferred quote marks (if HTML's default ones aren't good enough already). E.g., to specify the quote marks for the whole document if the document is in French:

:root {quotes: "«\202F" "\202F»"}

The \202F is a narrow no-break space, which looks better, in my opinion, than the normal no-break space (\A0), which is also often used. (E.g., the book Règles typographiques recommends the no-break space but itself uses the narrow no-break space.)
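
If one style sheet has to serve documents in several languages, the rule can be made conditional on the language of the root element. A minimal sketch, assuming the documents declare their language with a lang attribute on the root element, and taking the marks from the examples in this essay (other preferences are equally possible):

```css
/* Sketch only: the pairing of marks and languages follows the
   examples in this essay and is a matter of taste and tradition. */
:root:lang(en-gb), :root:lang(nl) {quotes: "‘" "’" "“" "”"}
:root:lang(en-us) {quotes: "“" "”" "‘" "’"}
:root:lang(fr) {quotes: "«\202F" "\202F»"}
```

Because 'quotes' is inherited, the <q> elements in each document pick up the marks set on the root.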

The result in your browser:

en-gb

John said: The weather is nice in Nice.

en-us

John said: The weather is nice in Nice.

nl-nl

John zei: Het is mooi weer in Nice.

fr-fr

John disait : Il fait beau à Nice.

Here, and in all examples below, we assume the default style rules for the <q> element are:

q::before {content: open-quote}
q::after {content: close-quote}

Thought vs speech

In Dutch, it is not recommended to put quotation marks around text that somebody thought rather than spoke out loud:

en-gb

John thought: ‘No.’ But he said: ‘Yes, of course.’

en-us

John thought: ‘No.’ But he said: ‘Yes, of course.’

nl-nl

John dacht: Nee. Maar hij zei ‘Ja, natuurlijk.’

fr-fr

John pensait : « Non. » Mais il disait : « Oui, bien sûr. »

We can mark up the thought with <q>, but then need a class=thought to allow the quote marks to be suppressed. (Cf. the mark-up recommended in TEI Lite: <q type="thought">.)

en-gb

<p>John thought: <q class=thought>No.</q> But he said: <q>Yes, of course.</q></p>

en-us

<p>John thought: <q class=thought>No.</q> But he said: <q>Yes, of course.</q></p>

nl-nl

<p>John dacht: <q class=thought>Nee.</q> Maar hij zei <q>Ja, natuurlijk.</q></p>

fr-fr

<p>John pensait : <q class=thought>Non.</q> Mais il disait : <q>Oui, bien sûr.</q></p>

Corresponding CSS rules can be like this:

:root:lang(nl) q.thought::before {content: none}
:root:lang(nl) q.thought::after {content: none}
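
These rules only take effect when the document as a whole is in Dutch. For a Dutch passage inside a document in another language, a variant that tests the language of the <q> element itself might look like this (a sketch; :lang() also matches an inherited language, so the <q> does not need a lang attribute of its own):

```css
/* Sketch: suppress the marks around thoughts wherever the text is Dutch. */
q.thought:lang(nl)::before,
q.thought:lang(nl)::after {content: none}
```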

Result:

en-gb

John thought: No. But he said: Yes, of course.

en-us

John thought: No. But he said: Yes, of course.

nl-nl

John dacht: Nee. Maar hij zei Ja, natuurlijk.

fr-fr

John pensait : Non. Mais il disait : Oui, bien sûr.

Punctuation and quotation marks

American English puts full stops and commas inside the quotation marks, even if they don't belong to the quotation:

en-gb

The label said: ‘For sale’.

en-us

The label said: ‘For sale.’

nl-nl

Het label zei: ‘Te koop’.

fr-fr

Le label disait : « À vendre ».

Currently, CSS cannot move the characters, so there is no choice but to put the closing tag according to the language:

en-gb

<p>The label said: <q>For sale</q>.</p>

en-us

<p>The label said: <q>For sale.</q></p>

nl-nl

<p>Het label zei: <q>Te koop</q>.</p>

fr-fr

<p>Le label disait : <q>À vendre</q>.</p>

There is a similar issue with putting the punctuation in italics if the preceding quotation is in italics. See Quotations in a foreign language below.

Split quotations

In French, if the quotation is interrupted by a short phrase, such as John said (called an incise), the guillemets aren't closed and reopened. Compare the following:

en-gb

‘The weather is nice’, John said, ‘in Nice.’

en-us

‘The weather is nice,’ John said, ‘in Nice.’

nl-nl

‘Het is mooi weer’, zei John, ‘in Nice.’

fr-fr

« Il fait beau, John disait, à Nice. »

There is no incise element in HTML and if we follow the semantics, the <q> element should be closed and reopened. But then we need some trickery to get the quote marks right. If we are sure there is never more than one quotation per paragraph, we could try:

q:first-child::after {display: none}
q:last-child::before {display: none}

But that is not a very safe assumption to make. Maybe some classes are needed:

en-gb

<p><q class=first>The weather is nice</q>, John said, <q class=last>in Nice.</q></p>

en-us

<p><q class=first>The weather is nice,</q> John said, <q class=last>in Nice.</q></p>

nl-nl

<p><q class=first>Het is mooi weer</q>, zei John, <q class=last>in Nice.</q></p>

fr-fr

<p><q class=first>Il fait beau</q>, John disait, <q class=last>à Nice.</q></p>

With style (for French only; the other languages keep both pairs of quote marks):

:lang(fr) q.first::after {display: none}
:lang(fr) q.last::before {display: none}

See if that works in your browser:

en-gb

The weather is nice, John said, in Nice.

en-us

The weather is nice, John said, in Nice.

nl-nl

Het is mooi weer, zei John, in Nice.

fr-fr

Il fait beau, John disait, à Nice.

Quotations of more than one paragraph

Here is a small fragment of a long monologue from Lettres de mon moulin (Letters from my windmill) by Alphonse Daudet, as quoted in Règles typographiques. The French rules require the opening guillemet to be repeated at the start of every paragraph.

Other traditions recommend setting off long quotations with white space, if possible. They can sometimes also be set in two columns to distinguish them from the surrounding text.

en

‘I saw Pascal Doigt-de-Poix, who made his olive oil--with monsieur Julien's olives!

I saw Babet the gleaner, who, as she gleaned, grabbed handfuls from the stacks to make up her quota!

I saw Master Grapasi, who oiled his wheelbarrow rather a lot, so as not to be heard!

And Dauphine, who greatly overcharged for water from her wells.’

(This English translation was made for Project Gutenberg by Mireille Harmelin & Keith Adams ©2009)

nl-nl

‘Ik zag Pascal Doigt-de-Poix, die zijn olie maakte met de olijven van meneer Julien.

Ik zag Babet, de arenleesster, die, terwijl ze aren las, handenvol van de stapel nam om sneller haar schoof klaar te hebben.

Ik zag Meester Grapasi, die het wiel van zijn kruiwagen zo goed smeerde.

En Dauphine, die het water van haar put zo duur verkocht.’

fr-fr

« Je vis Pascal Doigt-de-Poix, qui faisait son huile avec les olives de M. Julien.

« Je vis Babet la glaneuse, qui, en glanant, pour avoir plus vite noué sa gerbe, puisait à poignée aux gerbier.

« Je vis maître Grapasi, qui huilait si bien la roue de sa brouette.

« Et Dauphine, qui vendait si cher l'eau de son puits. »

The <q> element cannot contain <p> elements. We can use a blockquote with multiple p's inside, or multiple p's each with a q inside. Here is how it looks with a blockquote:

fr-fr

<blockquote>
<p>Je vis Pascal Doigt-de-Poix, qui faisait son huile avec les olives de M. Julien.</p>
<p>Je vis Babet la glaneuse, qui, en glanant, pour avoir plus vite noué sa gerbe, puisait à poignée aux gerbier.</p>
<p>Je vis maître Grapasi, qui huilait si bien la roue de sa brouette.</p>
<p>Et Dauphine, qui vendait si cher l'eau de son puits.</p>
</blockquote>

Note that there must be no white space between the last word (puits) and the next tag (</p> or </blockquote>). A CSS property 'text-space-trim' is proposed in CSS Text Module Level 4 to fix this and similar white space issues.

If instead we choose a mark-up with multiple <p> elements with <q> elements inside, we need to indicate somehow the start and the end of the quotation that spans several elements. The mark-up could look like this:

fr-fr

<p><q class=start>Je vis Pascal Doigt-de-Poix, qui faisait son huile avec les olives de M. Julien.</q></p>
<p><q class=cont>Je vis Babet la glaneuse, qui, en glanant, pour avoir plus vite noué sa gerbe, puisait à poignée aux gerbier.</q></p>
<p><q class=cont>Je vis maître Grapasi, qui huilait si bien la roue de sa brouette.</q></p>
<p><q class=end>Et Dauphine, qui vendait si cher l'eau de son puits.</q></p>

The style for the English and Dutch versions needs quote marks at the start and the end. The blockquote mark-up needs this:

blockquote p:first-child::before {content: open-quote}
blockquote p:last-child::after {content: close-quote}

The French style needs quote marks at the start of every paragraph and a closing quote mark after the last:

blockquote p::before {content: open-quote}
blockquote p:not(:last-child)::after {quotes: none; content: close-quote}
blockquote p:last-child::after {content: close-quote}

We add a closing quote after every paragraph and then hide it with 'quotes: none', otherwise every paragraph adds a new level of nesting and if we have nested quotations (see the next section), they would not use the quote marks for the second level, but for the n'th level.
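
The same effect can probably be obtained with the 'no-close-quote' keyword, which CSS defines as inserting nothing while still decrementing the nesting level. A sketch of the French blockquote rules written that way:

```css
/* Every paragraph opens with a guillemet... */
blockquote p::before {content: open-quote}
/* ...intermediate paragraphs close the level without printing a mark... */
blockquote p:not(:last-child)::after {content: no-close-quote}
/* ...and only the last paragraph prints the closing guillemet. */
blockquote p:last-child::after {content: close-quote}
```

This avoids changing the 'quotes' property on the pseudo-element, at the cost of one more keyword to remember.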

The solution with multiple <q> elements needs these CSS rules for Dutch and English:

q.start::after,
q.cont::before,
q.cont::after,
q.end::before {content: none}

And these rules for French:

q.start::after, q.cont::after {quotes: none}

Here is how it looks in your browser using a blockquote:

fr-fr

Je vis Pascal Doigt-de-Poix, qui faisait son huile avec les olives de M. Julien.

Je vis Babet la glaneuse, qui, en glanant, pour avoir plus vite noué sa gerbe, puisait à poignée aux gerbier.

Je vis maître Grapasi, qui huilait si bien la roue de sa brouette.

Et Dauphine, qui vendait si cher l'eau de son puits.

nl-nl

Ik zag Pascal Doigt-de-Poix, die zijn olie maakte met de olijven van meneer Julien.

Ik zag Babet, de arenleesster, die, terwijl ze aren las, handenvol van de stapel nam om sneller haar schoof klaar te hebben.

Ik zag Meester Grapasi, die het wiel van zijn kruiwagen zo goed smeerde.

En Dauphine, die het water van haar put zo duur verkocht.

And here using multiple <q> elements:

fr-fr

Je vis Pascal Doigt-de-Poix, qui faisait son huile avec les olives de M. Julien.

Je vis Babet la glaneuse, qui, en glanant, pour avoir plus vite noué sa gerbe, puisait à poignée aux gerbier.

Je vis maître Grapasi, qui huilait si bien la roue de sa brouette.

Et Dauphine, qui vendait si cher l'eau de son puits.

nl-nl

Ik zag Pascal Doigt-de-Poix, die zijn olie maakte met de olijven van meneer Julien.

Ik zag Babet, de arenleesster, die, terwijl ze aren las, handenvol van de stapel nam om sneller haar schoof klaar te hebben.

Ik zag Meester Grapasi, die het wiel van zijn kruiwagen zo goed smeerde.

En Dauphine, die het water van haar put zo duur verkocht.

Nested quotations

There may be a quotation inside another quotation. Compare the following:

en-gb

John continued: ‘I only said two words: “Go away!”’

en-us

John continued: “I only said two words: ‘Go away!’”

nl-nl

John ging verder: ‘Ik heb maar twee woorden gezegd: “Ga weg!”’

fr-fr

John continuait : « Je n'ai dit que trois mots : « Allez-vous en ! »

People may prefer different quote marks (these are just the most common), but whichever are used, the 'quotes' property in CSS handles the rules well for Dutch and English. E.g., for British English:

:root {quotes: "‘" "’"  "“" "”"}

French typically uses the same guillemets for nested quotations, so that line simplifies to:

:root {quotes: "«\202F" "\202F»"}

But French has an extra rule, as the example above shows: If the inner quotation ends at the same place as the outer one, the closing guillemets should not be doubled.

There is currently no way in CSS to achieve that automatically. One would have to add a class:

fr-fr

<p>John continuait : <q>Je n'ai dit que trois mots : <q class=start>Allez-vous en !</q></q></p>

And use that to apply the style rule:

q.start::after {quotes: none}

Here is how it looks in your browser:

en

John continued: I only said two words: Go away!

nl-nl

John ging verder: Ik heb maar twee woorden gezegd: Ga weg!

fr-fr

John continuait : Je n'ai dit que trois mots : Allez-vous en !

Dialog

French treats dialog differently from isolated quotations. (This applies to dialogs that occur in prose or poetry, not in plays.)

fr-fr

« Bonne nuit, fit le petit prince à tout hasard.

– Bonne nuit, fit le serpent.

– Sur quelle planète suis-je tombé ? demanda le petit prince.

– Sur la Terre, en Afrique, répondit le serpent.

– Ah !… Il n'y a donc personne sur la Terre ?

– Ici c'est le désert. Il n'y a personne dans les déserts. La Terre est grande », dit le serpent.

A dialog (a sequence of quotations from two or more people with no more than small phrases in between) starts and ends with guillemets and each quotation except the first starts a new paragraph and begins with an en dash.

If the quotations are very short, or cannot start a new paragraph because they are inside poetry, then the en-dashes can also be used inline.

There is no dialog element in HTML, although we could of course use a <div> or <blockquote> with a class attribute:

fr-fr

<div class=dialog>
<p><q>Bonne nuit</q>, fit le petit prince à tout hasard.</p>
<p><q>Bonne nuit</q>, fit le serpent.</p>
<p><q>Sur quelle planète suis-je tombé ?</q> demanda le petit prince.</p>
<p><q>Sur la Terre, en Afrique</q>, répondit le serpent.</p>
<p><q>Ah !… Il n'y a donc personne sur la Terre ?</q></p>
<p><q>Ici c'est le désert. Il n'y a personne dans les déserts. La Terre est grande</q>, dit le serpent.</p>
</div>

And then use these CSS rules:

div.dialog q::before {content: "\2013\A0"}
div.dialog q::after {content: none}
div.dialog p:first-child q:first-of-type::before {content: open-quote}
div.dialog p:last-child q:last-of-type::after {content: close-quote}

The result in your browser:

fr-fr

Bonne nuit, fit le petit prince à tout hasard.

Bonne nuit, fit le serpent.

Sur quelle planète suis-je tombé ? demanda le petit prince.

Sur la Terre, en Afrique, répondit le serpent.

Ah !… Il n'y a donc personne sur la Terre ?

Ici c'est le désert. Il n'y a personne dans les déserts. La Terre est grande, dit le serpent.

Without an element to enclose the dialog, we need to indicate that the quotations are part of a dialog and indicate where it starts and ends:

fr-fr

<p><q class="dialog start">Bonne nuit</q>, fit le petit prince à tout hasard.</p>
<p><q class=dialog>Bonne nuit</q>, fit le serpent.</p>
<p><q class=dialog>Sur quelle planète suis-je tombé ?</q> demanda le petit prince.</p>
<p><q class=dialog>Sur la Terre, en Afrique</q>, répondit le serpent.</p>
<p><q class=dialog>Ah !… Il n'y a donc personne sur la Terre ?</q></p>
<p><q class="dialog end">Ici c'est le désert. Il n'y a personne dans les déserts. La Terre est grande</q>, dit le serpent.</p>

The corresponding CSS rules are:

q.dialog.start::before {content: open-quote}
q.dialog::before {content: "\2013\A0"}
q.dialog::after {content: none}
q.dialog.end::after {content: close-quote}

The result in your browser:

fr-fr

Bonne nuit, fit le petit prince à tout hasard.

Bonne nuit, fit le serpent.

Sur quelle planète suis-je tombé ? demanda le petit prince.

Sur la Terre, en Afrique, répondit le serpent.

Ah !… Il n'y a donc personne sur la Terre ?

Ici c'est le désert. Il n'y a personne dans les déserts. La Terre est grande, dit le serpent.

Dashes as quotation marks

Some French authors use en-dashes at the start of the line for all quotations, not just to separate speakers in a dialog. It is rare in other languages. Here is an example from André Gide, who also uses guillemets for nested quotations:

fr-fr

– « Et je vous dis en vérité que Salomon même, dans toute sa gloire, n'était pas vêtu comme l'un d'eux », dit-elle, citant les paroles du Christ (…)

Translation: And I assure you that not even Solomon in all his royal robes was clothed like one of these, she says, quoting the words of Christ. (The quote is from Matthew 6.)

The mark-up is straightforward:

fr-fr

<p><q><q>Et je vous dis en vérité que Salomon même, dans toute sa gloire, n'était pas vêtu comme l'un d'eux</q></q>, dit-elle, citant les paroles du Christ (…)</p>

Only the 'quotes' property is different if this style is used:

:root {quotes: "\2013\A0" ""  "«\202F" "\202F»"}

The result in your browser:

fr-fr

Et je vous dis en vérité que Salomon même, dans toute sa gloire, n'était pas vêtu comme l'un d'eux, dit-elle, citant les paroles du Christ (…)

Quotations in a foreign language

Compare:

en-gb

The motto of Paris is: ‘Fluctuat nec mergitur’.

en-us

The motto of Paris is: “Fluctuat nec mergitur.”

nl-nl

De wapenspreuk van Parijs is: ‘Fluctuat nec mergitur’.

fr-fr

La devise de Paris est : Fluctuat nec mergitur.

The recommended way to render foreign-language quotations in French is to use italics and no guillemets. Writers in other languages sometimes prefer that, too. It's good practice in HTML to add lang attributes anyway, so the mark-up is almost straightforward:

en-gb

<p>The motto of Paris is: <q lang=la>Fluctuat nec mergitur</q>.</p>

en-us

<p>The motto of Paris is: <q lang=la>Fluctuat nec mergitur.</q></p>

nl-nl

<p>De wapenspreuk van Parijs is: <q lang=la>Fluctuat nec mergitur</q>.</p>

fr-fr

<p>La devise de Paris est : <q lang=la>Fluctuat nec mergitur.</q></p>

Note that the punctuation must be inside the quotation not only for American English (as already discussed above), but also for French, because periods and commas should always get the same style (bold or italic) as the preceding word.

For French, a new CSS rule is needed for foreign-language quotations to suppress the usual guillemets and use italics:

q[lang]::before {content: none}
q[lang]::after {content: none}
q[lang] {font-style: italic}

This assumes that lang attributes are only used on elements that have a different language than their parent. If there is a possibility they occur redundantly, i.e., there are <q> elements with lang=fr, then the selector needs an explicit check for a language other than fr:

q[lang]:not([lang=fr])::before {content: none}
q[lang]:not([lang=fr])::after {content: none}
q[lang]:not([lang=fr]) {font-style: italic}

If that is hard to read, you can also split the rules in two: rules for quotations with lang=fr and rules for quotations with any other lang= attribute:

/* Q elements with an explicit language... */
q[lang]::before {content: none}
q[lang]::after {content: none}
q[lang] {font-style: italic}
/* ... unless that language is actually French: */
q[lang=fr]::before {content: open-quote}
q[lang=fr]::after {content: close-quote}
q[lang=fr] {font-style: inherit}
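
A further alternative is the :lang() pseudo-class, which tests the language an element is actually in (whether declared on the element itself or inherited) and also matches subtags such as fr-CA. A sketch, assuming the root element of the document is marked as French:

```css
/* Sketch: in a French document, any quotation that is not in French
   loses its guillemets and is set in italics. */
:root:lang(fr) q:not(:lang(fr))::before,
:root:lang(fr) q:not(:lang(fr))::after {content: none}
:root:lang(fr) q:not(:lang(fr)) {font-style: italic}
```

Unlike the attribute selectors, this does not depend on which element carries the lang attribute, so a redundant lang=fr on a <q> needs no special rule.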

Possible enhancements to the <q> element

There may be cases where it is useful to mark up quotations but provide the quote marks in the content rather than in the style. That still makes it possible to disambiguate the text for translators or for rendering in speech, and to apply style to it. At the same time it may make the text more readable without the style, and it allows an author or publisher to indicate that the precise quote marks are important (e.g., because they appeared that way in a certain historical edition of the text, or because the author insists on a certain kind).

One idea is to simply set the CSS property 'quotes' to 'none'. However, that assumes that the software that reads the HTML also knows about CSS. The fact that the quote marks are explicit should ideally be flagged in the mark-up.
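
As a sketch of that first idea, assuming the author has typed the quote marks inside each <q> element (as in <q>‘Let's go!’</q>), a single rule switches off the generated marks:

```css
/* The quote marks are already in the text, so generate none. */
q {quotes: none}
```

With 'quotes: none', the open-quote and close-quote in the default rules for <q> produce nothing.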

TEI makes it possible to capture such aspects of a text precisely and we can maybe borrow some ideas from it for an improved version of HTML. In particular, TEI has an attribute next that links together the parts of a quotation that are separated by other text, as in:

<q next=q2>And now</q>, said Eve,
<q id=q2>let's have an apple!</q>

And it has an element <rendition> to declare, in the head part of a document, exactly how an element is expected to be rendered, including any generated quote marks. (It can use CSS fragments for that if appropriate.) It can be used to set a default for an element or elements can refer explicitly to a particular rendition.

If we add the next attribute to HTML (and use a proposed selector from Selectors level 4), we can write the French rules for omitting quote marks around an incise as follows in CSS.

q[next]::after {content: none}
q /next/ q::before {content: none}

The q /next/ q selector selects <q> elements that are referenced by the next attribute of another <q> element. The attribute must hold an IDREF mentioning the ID of the second element, or the fragment part of a URL (#foo) pointing to the ID of the second element.

A formal definition of the next attribute could be:

<!ATTLIST Q next IDREF #IMPLIED>

The question is whether a next attribute is easy enough for the average author, compared to using a class attribute or the method described below:

We don't need the full power of the <rendition> element. Probably a boolean attribute that says whether quote marks are omitted or present in the text is enough:

He said: <q marks>‘Let's go!’</q>

With this formal definition:

<!ATTLIST Q marks (marks) #IMPLIED>
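
If such an attribute existed, a style sheet could use it to suppress the generated marks for exactly those elements. A sketch, with the caveat that marks is only the hypothetical attribute proposed here and is not part of HTML:

```css
/* 'marks' is the hypothetical attribute proposed above, not part of HTML.
   The no-open-quote and no-close-quote keywords print nothing but still
   count as a level of nesting, so a nested <q> without the attribute
   would still get the second-level marks. */
q[marks]::before {content: no-open-quote}
q[marks]::after {content: no-close-quote}
```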

Or, like in TEI, declare the rendering of quote marks globally for the whole document, e.g., by means of a <meta> element:

<meta name=marks content=all>

The default, if the meta element is omitted, is none, which says that the <q> element has the traditional meaning as in HTML5 and earlier.

Possible enhancements to CSS

See Logical mark-up vs typographical mark-up in List of CSS features required for paged media.

References

Bert Bos
Created: 2016-04-27
Last modified: $Date: 2016/08/05 12:40:32 $