Warning:
This wiki has been archived and is now read-only.

PoeticSemantics

From HTML WG Wiki

Issue: Explicit Markup to Semantically Express Poetic Forms

Problem: HTML5 lacks explicit semantic markup to express poetic forms.

Semantic Markup for Poetry: A Proposal from Dr. Olaf Hoffmann

What I have missed for so many years in (X)HTML is some useful markup for poems.

The result we can see in "real web life" -- a lot of meaningless tag soup around, disoriented authors lost between silence and semantically meaningless markup...

Obviously poem markup is still not available in "HTML 5". Why not? Can this be added to the "HTML 5" draft?

It is pretty nice to have something like "section", "article", "header" in "HTML 5" (why not a generic heading element such as "h" from XHTML2, by the way? This would be pretty useful for poems as well as for larger projects such as anthologies, books, or general content fragments joined together, for example, with server-side scripts such as PHP).

Some useful and usable markup for poetry is still missing. If someone really tries to mark up a poem today, one ends up with div-class tag-soup nonsense. And there are many authors out there publishing poetry only on the web, currently without having any sufficient markup elements in (X)HTML for this.

According to my observation, readers of poetry and general literature have a wide range of capabilities (a lot of readers of poetry are search engine robots, for example). Therefore, it is quite useful to mark up this type of literature, to make elements with a semantic meaning available to authors in (X)HTML and to simplify the identification of poetry for readers.

I think it is the main purpose of a "Text Markup Language" such as (X)HTML to mark up text in a semantic way, isn't it? Poems are text -- let's mark it up now ;o)

Some useful elements (block elements):

poem

  container for a poem, similar to a section, may contain header, footer, div, p (maybe useful for modern poetry), strophe, line, h

strophe

  stanza or strophe of a classical poem, may contain either line or inline elements or CDATA

line

  a line or row of a poem, may contain inline elements or CDATA

h

  a heading of a poem

I think such a construction already covers many types of poems. For non-classical forms, such as concrete poetry, it may be sufficient too; div or p can still be used to realize non-conventional content.

source: Dr. Olaf Hoffmann, post to public-html, 5 October 2007
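
To make the proposed content model concrete, the following is a minimal sketch of how a short poem might look with these elements. None of poem, strophe, line, or h exists in HTML; they are the hypothetical elements from the proposal above. The title is a placeholder invented for illustration, and the verse is the sample used in the detailed discussion further down this page.

```html
<!-- Hypothetical markup: poem, strophe, line and h are proposed elements, not part of HTML -->
<poem>
  <h>A Dream</h> <!-- placeholder title, not taken from the source -->
  <strophe>
    <line>I dreamed I was a fly</line>
    <line>buzzing through the sky</line>
  </strophe>
  <strophe>
    <line>looking for some sweets</line>
    <line>or some spoiling meats</line>
  </strophe>
</poem>
```

Splitting the four lines into two strophes is done here only to show that a poem may hold several strophes, each of which holds only lines.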

Note: A fuller, detailed discussion of poetic markup alternatives is contained in this document.

Leif Halvard Silli's Proposed Solution: Introduce a TEXT Element Parallel to VIDEO and AUDIO

Precis: The central idea is to introduce into HTML a <TEXT> element - as a parallel to VIDEO and AUDIO - to be used when we want to "embed" an independent piece of text - such as a poem, a play, whatever.

Explanation: The very problems that Olaf raised can, as I see it, be split in two interrelated issues:

A "container issue", for which I propose <TEXT> with various possible attributes for classification of the kind of text (perhaps, actually, CLASS would be good enough.)
A "low-level text element issue" - e.g. the need for an <L> element for (semantic) lines.

I see those issues as interrelated because: If we want to reuse existing elements (such as

for lines), then that becomes much simpler if we have a semantic container. E.g. if an author has either <POEM> or <TEXT class=poem>, then I believe that he or she would find it easier to simply accept using e.g.

for lines in a stanza/paragraph, than he/she would do if one simply had to use a typical

as container. If we don't have a semantic container, then it becomes the more necessary to have specific low-level elements.

That said, there might be good reason for adding some more low-level elements:

The legacy "semantics"/restrictions of the legacy elements is one thing. But also, there are several text genres that need low-level list-like formats, less bound up with the semantics of DL, OL and UL. (The semantics of DL, OL and UL are that each
or each
is kind of independent from the other list items.) The DIALOG element is an example of reuse of an existing element which is more "literature like": Suddenly a list tells a story - i.e. the items are suddenly more directly interrelated - in a story-like way.
I am not so sure that we need to add a stanza format/paragraph list format (
) that can only be used in poems - I think there is good chance we can come up with something that is useful in other genres.

source: Leif Halvard Silli, post to public-html

Responses & Reactions

Peter Krantz: XHTML2 and RDFa Satisfy This Request

Do I understand you correctly that you want to include markup for a specific domain (poetry) in HTML5?

XHTML2 provides an extension mechanism through RDFa. RDFa will let you add semantic meaning (and parsing by others) to your specific domain. In fact you could semantically express poems of specific forms this way and create interesting possibilities for people who want to extract the poetry for e.g. research reasons.

A markup language should probably include as little as possible from specific domains and focus on the general things instead. Domain specifics should be handled via an extension mechanism that allows for unambiguous interpretation of the expressed information.

source: Peter Krantz, post to public-html, 5 October 2007

Doug Schepers (5 October 2007)

I'd like to note that in addition to poetry, the same solution could be applied to song lyrics, which are very widespread content on the Web. There are many sites devoted to nothing else, and sites like MySpace (and many blogs) have a lot of lyrical content.

I personally favor the idea of loosening up the definition of <p> into just that of a block of text (since the idea of a paragraph is not universal among natural-language orthographies), and using some other semantic system to annotate specialities of written language (where you could, for example, choose between a simple poetry markup and a more complex one that notates free verse or sonnets or even structural elements of iambic pentameter). This might be RDFa, or spans marked with microformats tags. You'll be able to get much more precision than with a blunt tool like HTML.

Including lyrics in the category of poetry does make explicit a couple of interesting technological/processing aspects, though:

guitar tabs (or other musical notation) could be integrated using ruby;
timed text (as for karaoke) could be used to add meter and rhythm to the presentation style (think SMIL or HTML+Time).

And, of course, as you point out, giving special consideration to particular types of content (such as poetic or lyrical) aids in its categorization or aggregation.

source: Doug Schepers, post to public-html, 5 October 2007

Ian Hickson (5 October 2007)

HTML5 actually defines how to mark up poems in HTML (the word "poem" is in the spec half a dozen times, in fact!).

Specifically:

the heading of a poem is marked up using <header> and the appropriate level of <hX> elements,
the stanzas of poems written in the classical form are given by <p> elements, with line breaks indicated by <br> elements (one of the few allowed uses of <br>).
the stanzas of freeform poems are given by <pre> elements.

There is an example of a part of a classical poem in the <img> element section (search for "On either side the river lie").

source: Ian Hickson, post to public-html, 5 October 2007
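
Put together, the three rules listed above would give markup along the following lines. This is only a sketch: the title is a placeholder invented for illustration, the verse is the sample used in the detailed discussion further down this page, and the spacing in the freeform variant is arbitrary.

```html
<!-- Classical form: heading inside <header>, one <p> per stanza, <br> between lines -->
<header>
  <h1>A Dream</h1> <!-- placeholder title, not taken from the source -->
</header>
<p>I dreamed I was a fly<br>
buzzing through the sky<br>
looking for some sweets<br>
or some spoiling meats</p>

<!-- Freeform: one <pre> per stanza, whitespace is preserved exactly as typed -->
<pre>I dreamed    I was          a fly
          buzzing through the sky</pre>
```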

Gregory J. Rosmaita: Response to Ian Hickson (5 October 2007)

a stanza isn't a paragraph, nor is a verse -- if they were, they'd be called paragraphs;

line breaks carry no semantic meaning -- why not a containing element that indicates a line of poetry, much as <li> and </li> indicate the beginning and end of a list item?

PRE does not express any meaningful semantics, nor does it lend structure -- other than the visual illusion of structure -- to the text contained in a PRE container...

the "classical poem" example you cited:

<h1>The Lady of Shalott</h1>
<p><img src="shalott.jpeg" alt=""></p>
<p>On either side the river lie<br>
Long fields of barley and of rye,<br>
That clothe the wold and meet the sky;<br>
And through the field the road run by<br>
To many-tower'd Camelot;<br>
And up and down the people go,<br>
Gazing where the lilies blow<br>
Round an island there below,<br>
The island of Shalott.</p>

is used to illustrate the contentious claim that:

Examples where the image is purely decorative despite being relevant would include things like a photo of the Black Rock City landscape in a blog post about an event at Burning Man, or an image of a painting inspired by a poem, on a page reciting that poem. The following snippet shows an example of the latter case (only the first verse is included in this snippet):

why should those processing the poem non-visually be bereft of a description of the accompanying illustration? obviously, the illustration captures an artist's conception of the "lady of shalott", which could aid an individual's understanding of the poem, and which could enhance the reader's understanding of the cross-fertilization of poetry and art in a particular era and a particular style...

i fail to comprehend why an illustration such as this should be null alt texted and why it should validate without a descriptor, in particular, a long description of the painting -- not only those who cannot see may need a description of the painting, but also those with color blindness and those with an extremely restricted viewport who may need guidance through the illustration...

if the illustration isn't worthy of description, then it isn't worthy of being included in the first place -- one cannot, as the draft currently does, classify this image as "A purely decorative image that doesn't add any information but is still specific to the surrounding content", as the example you cited is NOT a purely decorative image, but an interpretation of the poem it is being used to illustrate -- therefore, it demands both a terse and a long description...

source: Gregory J. Rosmaita, post to public-html, 5 October 2007

Examples of Poetry on the Web

Detailed Discussion

What is Poetry, Stanza, Strophe etc?

Words like stanza, strophe, verse, paragraph, section, and article, as used in several European languages, share the same origin in ancient Greek or Latin, but their meanings differ slightly in the languages as currently spoken (compare the related Wikipedia pages written in different languages). Element names that are too specific may therefore cause confusion and annoyance in different languages. The better approach may be to use more technical terms that describe the functionality of such elements, together with a careful description of their usage, so that all related use cases are covered for any author.

Poetry in Other Markup Languages

There are formats with a more extensive and meaningful collection of element names than HTML. For example, DocBook is pretty helpful for marking up prose properly, but unfortunately has no capabilities to mark up poetry. DAISY can be seen in some parts as an extension to HTML, and indeed it already has an element to indicate a poetry environment: the poem element in DAISY. Children of a poem element have a specific meaning to cover major parts of poetry content. FictionBook has elements to mark up poetry as well. Unfortunately, its specification consists mainly of an abstract schema, which is not comfortable for authors to read.

In reaction to the problems with HTML discussed here, and to some other semantic gaps in HTML's ability to provide meaningful markup for literature, LML was created to cover all these problems and to collect helpful element names for general literature applications. LML can be used either as ordinary XML, or together with the new profile XHTML+RDFa or with the upcoming XHTML role, to indicate the semantic meaning of arbitrary XHTML elements that lack sufficient semantic meaning of their own. The main disadvantage currently, with both XHTML+RDFa and the 'XHTML role', is that the CURIE approach, with its reuse of external wisdom, blows up the source code, and authors are always required to know at least two languages to mark up simple texts properly. The main disadvantage of a pure XML approach such as LML is that current browsers like the Geckos, Opera, or Konqueror do not completely support important auxiliary languages like XLink or XForms, which would be needed to cover all required technical functionality. This is the reason why authors still have to create compound documents like XHTML+LML, or XHTML+RDFa using property values from LML, to get both the semantic intention and the technical functionality in one document.

Obviously it would be still much more convenient for authors to have both the semantical requirements and the technical functionalities available in one format like (X)HTML.

How to Markup a Stanza or Strophe?

The fine structure of typical poetry is a stanza (or strophe or verse paragraph) with lines (or verse lines), example with pseudo code:

<stanza>
  <line>I dreamed I was a fly</line>
  <line>buzzing through the sky</line>
  <line>looking for some sweets</line>
  <line>or some spoiling meats</line>
</stanza>

To cover only the functionality and to avoid the impression, that this is very specific to a specific type of text, one can reduce this to more technical element names:

 
<ll>
  <l>I dreamed I was a fly</l>
  <l>buzzing through the sky</l>
  <l>looking for some sweets</l>
  <l>or some spoiling meats</l>
</ll>

Note: ll or gl equals "list of lines" or "group of lines"; l equals "line".

This can be very useful if it turns out that other content has closely related requirements for structure or presentation. Naming that is too specific could prevent authors whose content is merely similar in functionality from using these elements. It also avoids annoyances such as having a section about prose in the recommendation, but none for poetry ;o)

Requirements to the Functionality of the Markup

1. The structure of a stanza is similar to that of a list. But in addition, stanzas normally contain rhythmic content such as fragments of poems, lyrics, or songs. Therefore a list-item-like element is needed to mark up this substructure.
2. The lines are similar to list items. As with block elements, a line element starts on a new line. A stanza, too, starts on a new line, separated from possible previous and following content by some space perpendicular to the ordinary writing direction in graphical presentation, or by a pause in aural presentation. In aural presentation the end of a line will typically be marked with a shorter pause than the end of a stanza.
3. In contrast to prose lists such as a shopping list, a stanza is not prose, and lines are intended as lines without line breaks within them. This is easy to achieve in aural presentation. If it conflicts with limited horizontal (vertical) space in visual rendering, author and user expect to see some intuitive sign for the undesired line break, for example an indentation after such a forced line break (or maybe a specific warning symbol about the line break).
4. In contrast to prose lists, and apart from the problem described in 3, lines have no list item symbol or indentation in their default presentation. Numbering is not excluded, but it is a matter of styling for specific cases (to reference a line for interpretation, educational purposes, or a scientific approach) rather than of the default presentation.

Approach for a (Default) Styling Model for a Stanza

aural: The reader could switch to a speaker with more advanced abilities for rhythmic and metric text, if the parent type is not already a poetry type; otherwise the result will be painful for the audience. As a minimum requirement, a less advanced reader may give a warning note about the problem.

visual:

ll {display:block; padding:1ex}
l {display:block;padding-left: 2ex}
l:first-letter {margin-left: -2ex}
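
The rules above can be tried out in a complete test page. The following is a minimal sketch, assuming a current browser that parses the unknown elements ll and l (the hypothetical elements of this page) as inline elements and lets CSS turn them into blocks. As a labeled substitution, the hanging indent is produced here with text-indent instead of the negative first-letter margin used above; both aim at the same visual result, an indentation of the continuation lines when a verse line is forced to wrap.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Stanza styling sketch</title>
<style>
  /* ll and l are hypothetical elements; unknown elements render inline
     by default, so display:block is set explicitly */
  ll { display: block; padding: 1ex; }
  /* Hanging indent: the whole line box is shifted right by 2ex and the
     first rendered line is pulled back by 2ex, so only wrapped
     continuation lines appear indented */
  l { display: block; padding-left: 2ex; text-indent: -2ex; }
</style>
</head>
<body>
<ll>
  <l>I dreamed I was a fly</l>
  <l>buzzing through the sky</l>
  <l>looking for some sweets</l>
  <l>or some spoiling meats</l>
</ll>
</body>
</html>
```

Narrowing the browser window until a line wraps shows the indentation that signals a forced line break.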

Most other presentation and styling should be achievable with CSS; there is no more generic, typical styling. Some poems are centered, some are not, and most other choices of styling properties are a matter of taste, solvable with CSS. But authors may use inline elements (3.12 Phrase elements especially) inside the line elements. For simple free-form artworks it may be necessary to have fixed whitespace within the line, either with a pre element, with &nbsp;, or by leaving this to a poetry container. Other content directly within the stanza element is not to be expected.

The structure model above does not cover slightly more complex structures, such as theatre or opera with some dialog. Such literature may include prose as well (stage directions). Currently the heading of the section in the HTML5 draft reserves the dialog element for prose, so it is not usable for this type of poetry. One might extend the use case to poetry, putting the dialog into a poetry container and the remaining prose parts into an element such as aside or into a prose container.

Because this type of dialog poetry is normally performed, and the written text equivalent is not the primary artwork, the specific requirements for a line may be skipped within such a poetry dialog environment: they are less important, and the behaviour of the dialog element then does not depend on the parent container (a simpler implementation rule). However, the speaker in the dialog sometimes switches within one verse line, which may cause further minor problems for markup (see Shakespeare sample below).

Open question: Are there deviations or other requirements for rhythmic text notation in other cultures or for other writing directions (as horizontal?)

Some Specific, More Critical Use Cases

Alliteration - one line in one stanza

<ll xml:lang="de"><l>Fischers Fritze fischt frische Fische. Frische Fische fischt Fischers Fritze.</l></ll>

Haiku

A variation on Matsuo Basho's frog haiku (~ 1686); it might require a vertical writing direction in the original language, but traditionally appears in one line.

<ll><l>At an old calm pond - Suddenly a frog jumped in - A sound like a splash!</l></ll>

<ll xml:lang="de"><l>Uralter Weiher. Ein hineinspringender Frosch. Platschendes Wasser!</l></ll>

Blank Verse

blank verse

Either the author still interprets everything as one list of lines, or each line as a separate list of one line. If ll (list of lines) is used instead of stanza, this still fits the simple model; the author may want to use CSS to adjust the styling a little.

Free Verse

free verse

If the author still needs to express that a line is a line, ll is still pretty good for that. If the author only needs to express that it is poetry, a poetry container with paragraphs (p) is sufficient, see below.

Simple Free Form with Requirement for Defined Empty Space

<ll>
  <l><pre>I dreamed    I was          a fly</pre></l>
  <l><pre>          buzzing through the sky</pre></l>
  <l><pre>          looking for some sweets</pre></l>
  <l><pre>       or some spoiling meats</pre></l>
</ll>

Possible Suggested Methods Available with HTML4/XHTML1.x for "ordinary" Poetry

Method 1.1 (br only)

<br>
<br>
<br>I dreamed I was a fly
<br>buzzing through the sky
<br>looking for some sweets
<br>or some spoiling meats
<br>
<br>

Pro:

uses only the existing br to 'mark up' stanza and line
this model can be expanded, using b as a generic element for headings, a for links, and &nbsp; or img to realise empty space
for many authors there is no need for further elements; because the server normally already sends text/html, even elements like html or body become redundant
backwards compatibility back to the dawn of HTML

Con:

currently only available in the transitional profiles; otherwise at least a div containing all content has to be used as a block element
no semantic structure or meaning for anything at all
no list-like structure; none of the requirements is met

Method 1.2 (p+br)

 
<p>
I dreamed I was a fly<br />
buzzing through the sky<br />
looking for some sweets<br />
or some spoiling meats<br />
</p>

Pro:

source code not blown up with much markup
reuse of existing elements p and br
some authors/editors have this in use in HTML4 today, ignoring the structure gaps of this construction.

Con:

br does not mark up a line as a line; it is just a forced line break. br is an empty element and therefore cannot contain the line content itself.
a stanza has more of a list structure, while p has no specific structure. For historical reasons a paragraph can be interpreted as a degenerate stanza that has lost its line structure. Therefore p lacks the generic inner structure needed to represent a stanza.
requirements 2 and 3 are not met by p and br
the current use in HTML4 can be explained by the fact that elements with a structure more closely related to poetry, such as dl+dt+dd, are currently defined for other purposes; p+br is therefore left as the simplest method, because there is no other choice.

Method 1.3 (pre)

<pre>
I dreamed I was a fly
buzzing through the sky
looking for some sweets
or some spoiling meats
</pre>

Pro:

source code not blown up with much markup
reuse of existing element pre

Con:

no guarantee by the markup that the structure is really respected in visual or aural presentation.
HTML5 states that pre is intended for 'a block of preformatted text, in which structure is represented by typographic conventions rather than by elements' - a stanza and a verse line have the opposite requirements: they are specific structural elements for a poetry text type.
no list-like structure; none of the requirements is met.

Method 1.4 (dl+dd (or dt))

 
<dl>
  <dd>I dreamed I was a fly</dd>
  <dd>buzzing through the sky</dd>
  <dd>looking for some sweets</dd>
  <dd>or some spoiling meats</dd>
</dl>

 
<dl>
  <dt>I dreamed I was a fly</dt>
  <dt>buzzing through the sky</dt>
  <dt>looking for some sweets</dt>
  <dt>or some spoiling meats</dt>
</dl>

Pro:

fits to the fact, that a stanza is similar to a list
meets the structure of pseudo code above
reuse of existing elements dl, dd (or dt for more traditional types of poems)
the author may add additional markers using dt for educational purposes and scientific treatment of the content
already in use to mark up poetry by advanced authors. For example, several samples of poetry and lyrics are marked up with dl+dd in Wikipedia: a sample in the stanza article, samples in the rhyme article, some samples in the German article about 'Sonett'
using dl+dt avoids the indentation and list symbols that are not required anyway

Con:

This use of a definition list somehow implies that the lines define something - but what? The stanza? This is probably not exactly the intended use case for a definition list, and authors will have problems recognising that it is somehow related to poetry content.
requirement 3 for visual presentation is not met; minor problem with 4, since typically all dd are indented
In HTML5 this usage is currently excluded, because a minimum of one dt and one dd, respectively, is now required (authors can leave it empty, of course).

Method 1.5 (ul/ol+li)

<ul>
  <li>I dreamed I was a fly</li>
  <li>buzzing through the sky</li>
  <li>looking for some sweets</li>
  <li>or some spoiling meats</li>
</ul>

<ol>
  <li>I dreamed I was a fly</li>
  <li>buzzing through the sky</li>
  <li>looking for some sweets</li>
  <li>or some spoiling meats</li>
</ol>

Pros and cons are similar to those outlined for Method 1.4, except for Con 3. Additionally, ul and ol typically present li with list item symbols or numbers and with indentation. ol numbering is pretty useful for educational purposes and scientific treatment, but otherwise not in common use. Indicating the beginning of a line with a symbol is not in common use either. In HTML5 ul, ol, li are restricted to prose content.
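
The default list item symbols and indentation mentioned for this method are a matter of presentation and can be switched off by the author. A minimal sketch, assuming a hypothetical class name stanza; it addresses only the visual defaults, not the objection that the list elements carry no poetry semantics.

```html
<style>
  /* "stanza" is a hypothetical class name chosen for this sketch */
  ul.stanza { list-style: none; margin: 1ex 0; padding: 0; }
  /* hanging indent so that a forced line break stays recognisable */
  ul.stanza > li { padding-left: 2ex; text-indent: -2ex; }
</style>
<ul class="stanza">
  <li>I dreamed I was a fly</li>
  <li>buzzing through the sky</li>
  <li>looking for some sweets</li>
  <li>or some spoiling meats</li>
</ul>
```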

Method 1.6 (div+div)

 
<div>
  <div>I dreamed I was a fly</div>
  <div>buzzing through the sky</div>
  <div>looking for some sweets</div>
  <div>or some spoiling meats</div>
</div>

Pro:

reuses the existing element div.
div can "markup" almost any content, even very exotic poetry constructions can be represented somehow with enough divs and additional attributes like class or role to represent the semantical meaning of the construction (RDFa approach); but in the original HTML4 only class is available, only with values without a predefined meaning.

Con:

div is defined to have no semantical meaning at all. These are just containers for anything
attributes as class or role or kind have to be added to give some semantical meaning and detailed functionality to everything, needs predefined attribute values to meet the requirements 1, 2, and 3
this usage breaks the idea, that div is intended mainly for styling and for unspecific requirements as a generic grouping element without a semantic meaning.

Method 1.7 (p+span+br)

 
<p>
  <span>I dreamed I was a fly</span><br />
  <span>buzzing through the sky</span><br />
  <span>looking for some sweets</span><br />
  <span>or some spoiling meats</span><br />
</p>

Pro:

reuses the existing elements p, br, span
span can "markup" almost any inline content

Con:

span is defined to have no semantical meaning at all. These are just containers for anything
attributes as class or role or kind have to be added to give some semantical meaning and detailed functionality to everything, needs predefined attribute values to meet the requirements 1, 2, and 3

Method 1.8 (table+tr+td)

 
<table>
          <tr><td>I dreamed I was a fly
</td></tr><tr><td>buzzing through the sky
</td></tr><tr><td>looking for some sweets
</td></tr><tr><td>or some spoiling meats
</td></tr>
</table>

Pro:

reuses the existing table structure and can be expanded to a model for a complete poem (complete poetry artwork) including caption, thead, tfoot, tbody; even simple free form poems with specific requirements about empty space can then be marked up, either using more td with width in one tr or using img
backwards compatibility back to pre-CSS times - tables were very popular for styling and presentation in the last millennium, and still are for authors from that time.

Con:

neither in HTML4 nor in HTML5 are table elements intended to mark up list-like content or even unspecific prose or poetry; however, from a logical point of view a list, being data with one dimension, is a degenerate case of 'data with more than one dimension' - so it is not excluded to use a table having only one td in each tr
does not meet requirement 3.
source code blown up with a lot of elements that are meaningless in relation to list or poetry content

Resume About Existing Methods

The best approach in HTML4/XHTML1.x is either to use a definition list (currently excluded in HTML5, but in common use among authors with advanced semantic abilities - there is a lot of poetry marked up with definition lists in Wikipedia, for example) or to use ul/ol. ol is already a complete solution if numbering is required, for example for interpretation, educational purposes, or a scientific treatment of the content. Using list symbols to indicate lines with ul is not in common use, but the use of ul/ol at least ensures that a forced line break within a line can be identified by the reader immediately.

An indication of specific requirements for aural presentation is not available through element naming or an attribute. A screen reader is not able to identify poetry content in order to ensure a high-quality rhythmic or metric presentation, apart from possible additional styling with CSS by the author.

Suggested Methods Not Available in HTML4/XHTML1.x

Method 2.1 (p+li)

 
<p>
  <li>I dreamed I was a fly</li>
  <li>buzzing through the sky</li>
  <li>looking for some sweets</li>
  <li>or some spoiling meats</li>
</p>

Pro:

Extends the model for p to a more structured element. This fits the idea that, in the dawn of (written) literature, a paragraph became a degenerate case of a stanza without specific line structure. This model puts the structure back into the paragraph. p without a list structure can then be identified as some sort of degenerate type of text which the author was not able to structure.

Con:

The reuse of p and li in an extended usage model creates a backward incompatibility with older browsers - display results are not predictable, because this is an invalid structure for old browsers.
This structure requires specific rules for the behaviour of the li element depending on the parent element - is it a list item or a poetry line? This environment-dependent behaviour with more complex rules is more difficult to implement than simple rules for li and other simple rules for a line element. Using the same tool for different things often causes such problems.
The structure requires more complex rules for the p element - for example, if the p is the parent of an li element, there should be nothing else in the p element; if there is something different inside, it is the degenerate prose case and it cannot contain li elements.
Currently li can contain any block element including p itself, which is not really useful for a line of a stanza in general use.

Method 2.2 (p+br, not empty)

 
<p>
  <br>I dreamed I was a fly</br>
  <br>buzzing through the sky</br>
  <br>looking for some sweets</br>
  <br>or some spoiling meats</br>
</p>

Pro:

same as for Method 2.1

Con:

Consult the Cons for Method 2.1. Currently br is an empty element.
Currently both p and br are only defined for prose; this use case would have to be expanded.
Surprisingly, there is a backwards compatibility issue with Geckos (tested with 1.8) and Opera (tested with 9.23) using the XML parser - no content is displayed at all. With an SGML parser the content is present, but sometimes there are two line breaks instead of only one.

Method 2.3 (ll+l, new elements)

using new elements for rhythmic, ordered, list-like content

 
<ll>
  <l>I dreamed I was a fly</l>
  <l>buzzing through the sky</l>
  <l>looking for some sweets</l>
  <l>or some spoiling meats</l>
</ll>

Pro:

clean separation from less structured prose, from other structured content, from content with another functionality, and from possibly more complex structures such as ul or ol lists.
possible to have simple rules exactly fitting the requirements for functionality.
intended for, but not exclusive to, poems or songs; this is generic for simple, rhythmic, ordered text, and already fits most of the common use cases.

Con:

needs to introduce new elements.
backwards incompatibility, old browsers will ignore the element and display everything as one (prose) inline content.

Discussion:

If authors need backwards compatibility, one may extend the model, allowing the use of br not just for prose but in lines too; the author may use this to get a useful appearance:

 
<div>
 <ll>
  <l>I dreamed I was a fly<br /></l>
  <l>buzzing through the sky<br /></l>
  <l>looking for some sweets<br /></l>
  <l>or some spoiling meats<br /></l>
 </ll>
</div>

With some rules for the br inside l:

Approach 1:

a) if br (with optional whitespace) is directly followed by the end of the line, the br is collapsed with the end of the line

b) else the br is interpreted as a line break, required because there was not enough space to put everything in one line

Approach 2:

a) if br appears inside a line, this is only a suggestion from the author as to where to break the line if there is not sufficient space to put everything in one line. If there is more than one br in an l, the user agent is free to optimise by putting everything in as few lines as possible within the available space.

b) If everything fits in one line, the br inside l are ignored.

The advantage of Approach 1 is that authors are able to mark up experimental content. Maybe this will not be used often, because the achievable difference is not very big. The advantage of Approach 2 seems to be bigger - the author gets more control over the rendering in the user agent and can help to avoid line breaks between words that change the meaning of the content completely. Of course the author can do similar things using just "&nbsp;" instead of " " where required.
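
The closing remark about non-breaking spaces can be illustrated with the haiku from the use cases above. This is a sketch only: ll and l are the hypothetical elements proposed on this page, and the choice of which words to keep together is an arbitrary assumption made for illustration.

```html
<!-- ll and l are hypothetical elements, not part of HTML -->
<ll>
  <!-- &nbsp; is a non-breaking space: the user agent will not wrap the
       line between the words it joins, so each three-word group stays
       together if the line has to be broken; a break can still occur at
       the ordinary spaces around the hyphens and after "frog" -->
  <l>At&nbsp;an&nbsp;old calm&nbsp;pond - Suddenly a&nbsp;frog jumped&nbsp;in - A&nbsp;sound like&nbsp;a&nbsp;splash!</l>
</ll>
```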

Method 2.4 (p+l)

<p>
  <l>I dreamed I was a fly</l>
  <l>buzzing through the sky</l>
  <l>looking for some sweets</l>
  <l>or some spoiling meats</l>
</p>

Pro:

Extends the model for p with more structure. This fits the idea that, in the dawn of (written) literature, a paragraph became a degenerate case of a stanza without specific line structure. This model puts the structure back into the paragraph.

p elements without l identify the content as a degenerate type of text without further structure given by the author.

Con:

The structure requires more complex rules for the p element: if the p is the parent of an l element, there should be nothing else in the p element; if there is anything else inside, it is the degenerate prose case and the p cannot contain l elements.
backwards incompatibility: old browsers will ignore the l element and display everything as a single run of inline (prose) content. This can be solved in a similar way as discussed for method 2.3.
it is more difficult for robots to identify the p as a stanza/list of lines than with a specific element for this purpose. p becomes a hybrid of a list-like element and a paragraph without specific substructure.

Method 2.5 (ll+li)

<ll>
  <li>I dreamed I was a fly</li>
  <li>buzzing through the sky</li>
  <li>looking for some sweets</li>
  <li>or some spoiling meats</li>
</ll>

Pro:

reuses li for the list item aspect.
the new outer ll element provides the specific poetry functionality required for the lines represented by the li elements.
If better backwards compatibility is required, authors can add an additional div without semantic meaning around the ll to have a block element for older browsers (but this does not solve the problem of an li outside ul/ol for old browsers).

Con:

needs to introduce a new element.
backwards incompatibility: old browsers will ignore the ll element and display it as an inline element. It is not completely predictable what happens with li inside an unknown element.
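The div workaround mentioned under Pro could be sketched like this; it is only an illustration of the idea, built from the sample above:

```html
<!-- The div carries no semantic meaning; it only gives older
     browsers a block-level box around the unknown ll element -->
<div>
  <ll>
    <li>I dreamed I was a fly</li>
    <li>buzzing through the sky</li>
    <li>looking for some sweets</li>
    <li>or some spoiling meats</li>
  </ll>
</div>
```

As noted above, this does not settle how an old browser treats li outside of ul or ol.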

Method 2.6 (section+p)

<section>
  <p>I dreamed I was a fly</p>
  <p>buzzing through the sky</p>
  <p>looking for some sweets</p>
  <p>or some spoiling meats</p>
</section>

Pro:

A stanza is a section of a poetic artwork, so this fits and is applicable. Maybe the complete poem/song is then an article? But article is related to the prose domain by its name; a better name than 'article' could avoid this impression.
Easy to add additional information such as stage directions to a section, for example using aside to distinguish them from the p/line structure.
Simpler to use this model to include the pre element for free-form poetry than with a line element intended for inline elements.

Con:

A line within a stanza does not really correspond to a complete paragraph; often the line does not even contain a complete sentence, which is commonly the microstructure of a paragraph.
p does not meet the requirements for a line of a stanza.
The structure is not specific to poetry: there is no technical difference from prose and, for example, no indication for aural presentation. It always requires a poetry container for such an identification.

Method 2.7 (section+section)

<section>
  <section>I dreamed I was a fly</section>
  <section>buzzing through the sky</section>
  <section>looking for some sweets</section>
  <section>or some spoiling meats</section>
</section>

Pro:

A stanza is a section of a poetic artwork, so this fits and is applicable. Maybe the complete poem/song is then an article? But article is related to the prose domain by its name; a better name than 'article' could avoid this impression.
Easy to add additional information such as stage directions to a section, for example using aside to distinguish them from the section/line structure.
Simpler to use this model to include the pre element for free-form poetry than with a line element intended for inline elements.

Con:

A line within a stanza does not really correspond to a complete section; often the line does not even contain a complete sentence, which is commonly the microstructure of a paragraph in a section.
section does not meet the requirements for a line of a stanza.
The structure is not specific to poetry: there is no technical difference from prose and, for example, no indication for aural presentation. It always requires a poetry container for such an identification.

Method 2.8 (dl with attribute kind)

<dl kind="strophe">
  <dt>I dreamed I was a fly</dt>
  <dt>buzzing through the sky</dt>
  <dt>looking for some sweets</dt>
  <dt>or some spoiling meats</dt>
</dl>

<dl kind="strophe">
  <dd><pre>I dreamed    I was          a fly</pre></dd>
  <dd><pre>          buzzing through the sky</pre></dd>
  <dd><pre>          looking for some sweets</pre></dd>
  <dd><pre>       or some spoiling meats</pre></dd>
</dl>

Pro:

reuse of existing elements, already in common use by advanced authors to mark up poetry.
refines and extends the functionality and presentation of a definition list to cover a common use case of such elements. HTML5 already tries to redefine definition lists as description lists or dialog; this idea can be taken further to give extended functionality and semantic meaning to lists that are poorly supported in current HTML, avoiding the 'list domain specific markup' currently present in HTML4 and the HTML5 working draft.
good semantic and technical backwards compatibility for old browsers.

Con:

requires one new attribute for dl and some redefinition of the semantic meaning of a dl list.

Summary of all methods

Method 2.8 (dl with a new attribute kind) has the best backwards compatibility and the greatest flexibility. It has the potential to extend the functionality of definition lists to several other applications and to list content already in use on the internet that is not well specified in HTML4.

Methods 2.3 and 2.5 allow specific elements to be used for a specific list-like functionality, as other elements like ul, ol, dl, dialog and menu do. The disadvantage is minor backwards incompatibilities that authors have to take care of. If backwards compatibility is required, it can be ensured using additional, already existing elements like br (currently only available for prose), div and span.

It is a matter of taste whether to provide another element for each specific use case from the 'list domain', or to combine all these very similar list-like functionalities in one element. The latter cleans up HTML a little, using a single new attribute with different values to get the same effect with fewer elements; see the next section.

Currently, because the content model of most elements explicitly excludes their use for poetry, the best approach is in any case to introduce completely new elements, even if this creates some problems for older user agents. For some container elements, however, it should be possible to use them for both prose and poetry; otherwise it might get very difficult for authors to combine poetry and prose in one document.

How to Extend the Functionality and the Semantics of a Definition List for Different Use Cases Including Poetry

In HTML4 dl was defined as a definition list, in the current HTML5 draft it is redefined as a 'description list'.

In this approach it is redefined again as 'diverse lists' with advanced functionality and semantic meaning. This is accomplished with an additional attribute kind. List elements such as dl, ul, ol, menu, dir and dialog have a very similar structure and similar functionality; this can be covered by one element with an attribute kind that defines the specific use case, both for the already existing use cases and for some more that are currently not available in (X)HTML. This method mainly sanctions common use cases that can already be found for dl on the existing internet, such as markup for poetry, law text, conversation, dialog and menu, without the need for new elements and without the danger of backwards incompatibilities in older viewers.

This approach leaves the responsibility for a sensible use of dt and dd to the authors, following the old ideas of Kant and others: Enlightenment, "Dare to know". However, the new attribute kind suggests an interpretation of degenerate use cases to avoid confusion for the reader.

Technical Semantics

dl - diverse list(s), (manifold, miscellaneous lists)

Block-level element, and structured inline-level element.

Contexts in which this element may be used: Where block-level elements are expected. Where structured inline-level elements are allowed.

Content model: Zero or more dt or dd elements

Element-specific attributes: start (see the element ol, replaced with this one) and kind (details see below)

dt - diverse list topic

Contexts in which this element may be used: inside dl

Content model: Strictly inline-level content

Element-specific attributes: value (see the element li, replaced with this one)

dd - diverse list data

Contexts in which this element may be used: inside dl

Content model: Zero or more block-level elements, or inline-level content (but not both).

Element-specific attributes: value (see the element li, replaced with this one)

value is only used for the kinds 'ordered', 'bol', 'strophe' and 'dialog'; otherwise it is ignored. The same applies to the start attribute of dl.

Correlation to other elements

The combination dl/dt/dd with the attribute kind replaces ul, ol, li, menu, dir, dialog.

Functionality and use cases, values of kind

kind has predefined values, indicating the functionality and the semantics in detail.

Possible kind values and typical usage:

ordered - like the old ol; the dd is then interpreted as the old li. dt is an additional possibility for noting labels, presented without numbering and indentation.

unordered - like the old ul; the dd is then interpreted as the old li. dt is an additional possibility for noting labels, presented without list symbol and indentation.

def - like dl in HTML4: dt is interpreted as a definition term, dd as the description of the previous dt. If there is no previous dt at all in the dl, the dd describes the dl itself. If there is a dt but no dd at all in the dl, the dt describes the purpose of the dl itself. Other use cases are combinations of question (dt) and answer (dd), for example in a FAQ or a school book lesson, or the combination of tasks (dt) and activities (dd), or task topic (dt) and subtasks (dd). If kind is not specified, def is assumed for historical reasons (and backwards compatibility).

strophe - the dl has the semantic meaning of a stanza, strophe or verse paragraph. It is for poetry, for content with specific rhythmic or metric behaviour, or for any artwork the author wants to call poetry or lyric, for example poems and songs. dt and dd represent the (verse) lines of the dl; they can be mixed and combined as required by the author. Using only dt mainly corresponds to conventional poems and songs and the common use of a stanza. dd might be useful to include artwork with specific requirements, for example using the pre element to preserve whitespace within a line.

For song texts, the dd may contain additional information about the melody or in a compound document data from another XML format to represent the music/melody in a written form.

aural/oral presentation requires advanced rhythmic and metric capabilities.

visual presentation: dt and dd have no specific symbol for numbering or indentation. Only if the content of a (verse) line is broken into two or more lines are the second and following lines indented, to indicate that everything belongs to one (verse) line. Only if a dl has a start attribute (with any value) are the dt and dd additionally indented and the dt numbered as for 'ordered'. Authors may use the value attribute on dt to override the automatic numbering, and may write value="" to suppress numbering for specific lines; typically only every fifth or tenth line is numbered, not all. If the dl has no start attribute, the value attribute is ignored. This fits the common use of numbering to reference poetry lines for educational purposes, interpretation and scholarly treatment.
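A sketch of how line numbering for references might be written with start and value, assuming the attributes behave as described above. Numbering only the first and the fifth line is an illustrative choice:

```html
<!-- start switches numbering on; value="" suppresses the number
     for a single line, so only line 1 is shown in this stanza -->
<dl kind="strophe" start="1">
  <dt>I dreamed I was a fly</dt>
  <dt value="">buzzing through the sky</dt>
  <dt value="">looking for some sweets</dt>
  <dt value="">or some spoiling meats</dt>
</dl>
<!-- the second stanza continues the count at line 5 -->
<dl kind="strophe" start="5">
  <dt>I waked up in a cold sweat</dt>
  <dt value="">last reminiscence was a swat</dt>
</dl>
```

Without the start attribute on each dl, all value attributes here would be ignored and no line would be numbered.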

compact - some list-like texts, or texts with numbered fragments, do not require list items to be separated by new lines. This is the case, for example, for some religious texts, which need to reference specific text fragments labelled with numbers or other markers, but where fragments and markers form a list degenerated to a paragraph (the dl behaves as a p element, dt and dd as inline; the text is expected to be inside dd, an optional marker or number in dt). Automatic numbering is available if the start attribute is provided, and can be modified with the value attribute on the dd element. Alternatively, authors may use the dt element to provide an inline marker. This also occurs in compressed presentations of poetry: if such texts are cited, authors sometimes use only a marker like '/' to preserve the original line structure. This kind is related to the compact attribute for lists in HTML4.

HTML2 sample: Bible text using a compact(!) dl list
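A minimal sketch of the 'compact' kind with explicit dt markers, assuming the inline behaviour described above; the verses are quoted from the King James Bible as illustrative content:

```html
<!-- the whole dl is rendered like one paragraph; dt and dd are inline -->
<dl kind="compact">
  <dt>1</dt>
  <dd>In the beginning God created the heaven and the earth.</dd>
  <dt>2</dt>
  <dd>And the earth was without form, and void; and darkness was
      upon the face of the deep. And the Spirit of God moved upon
      the face of the waters.</dd>
</dl>
```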

conversation - a prose dialog or interview; the use case is similar to that described for the dialog element in the current HTML 5 working draft, which is replaced here with this kind of dl to avoid domain-specific elements. Additionally, if the first dd has no preceding dt, the dd is marked in a different way, for example with a list symbol or a font-style like italic, to indicate that it is an annotation related to the conversation at the current point. Authors are encouraged to additionally use the aside element inside such a single dd to indicate such an annotation. If two dd directly follow each other, this only indicates a new line, something like a paragraph: a closed fragment of content from the same speaker, separated from the previous line/content. Two or more dd at the beginning of the dl without a dt are interpreted as separate annotations. A single dt element not followed by a dd is interpreted to mean that the person is speechless at this moment.

dialog - the poetry equivalent of 'conversation' with the same usage, for example for a theatre or opera play, but oral/aural presentation requires advanced rhythmic and metric capabilities. If two dd directly follow each other, these are simply two verse lines within the dialog. For two or more dd at the beginning of a dl, see 'conversation'. If the first dd has no preceding dt, the dd is marked in a different way, for example with a list symbol or a font-style like italic, to indicate that it is a stage direction related to the dialog at the current point. Authors are encouraged to additionally use the aside element inside such a single dd to indicate such a stage direction. The attributes start and value are used as described for 'strophe'.
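The rules for 'conversation' can be illustrated with a short sketch; the speakers and their lines are invented for this example:

```html
<dl kind="conversation">
  <!-- first dd without preceding dt: an annotation -->
  <dd><aside>The interview took place by e-mail.</aside></dd>
  <dt>Interviewer</dt>
  <dd>Why does poetry need its own markup?</dd>
  <dt>Author</dt>
  <dd>Because a poem is text with a structure of its own.</dd>
  <!-- second dd in a row: a new paragraph from the same speaker -->
  <dd>And many readers of poetry are robots.</dd>
  <!-- dt without following dd: the speaker is speechless -->
  <dt>Interviewer</dt>
</dl>
```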

marker - some lists require a specific hard-coded numbering or choice of symbols, for example in a law text. The dl represents the equivalent of a paragraph or article of a law, and the dt/dd define the substructure. dt contains the 'numbering' or symbol required for the text, dd contains the law text itself. The widest dt content in the list determines the indentation for all dd. The first line of the dd begins beside the dt, just as the li content of an old ol list is beside the related number. If dt is missing, it is assumed that no numbering is required; if dd is missing, the content for the preceding dt is assumed to be empty. Other use cases are a bill, receipt, invoice, recipe, shopping list and related things. Here dt normally contains only a number with an optional unit, defining a quantity of the entity noted in dd. For a bill, receipt or invoice the dt contains the price with a currency as unit. For a recipe it contains the quantity of the ingredient used, for a shopping list the quantity of the products to be acquired. A missing dt has an obvious interpretation: for a shopping list and a recipe a quantity of 1, and for a bill, receipt or invoice 'free of charge', both without a specific notation.

Sample for law text using dl: law texts, German government

Discussion: An alternative approach could be to allow any CDATA as value of the value attribute. Then dt could be left more as a topic or label element and the marker/numbering is left to the value attribute.

To outline the difference from a simple table, which is also available for this purpose (as for any list application):

The list emphasises the strong correlation between the dt and the dd and only a weaker relation to other list items. In a recipe, for example, the units in the dt can be quite different for each list item, but are always strongly correlated with the entity mentioned in the dd. If such a construction has a dt but no related dd, the interpretation is the same as for an empty dd: empty, nothing, to be added. For more complex use cases authors are encouraged to use the table elements to mark up multidimensional correlated data.
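As a sketch of the 'marker' kind for a recipe, assuming the interpretation rules given above; the ingredients and quantities are invented for illustration:

```html
<dl kind="marker">
  <!-- dt: quantity with optional unit; dd: the ingredient -->
  <dt>250 g</dt>
  <dd>flour</dd>
  <dt>2</dt>
  <dd>eggs</dd>
  <!-- missing dt: a quantity of 1 is assumed -->
  <dd>pinch of salt</dd>
  <dt>500 ml</dt>
  <dd>milk</dd>
</dl>
```

Under the indentation rule, the widest dt content in this list would determine where all four dd begin.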

menu - for menus, similar to the old menu element or the one described in HTML5, which is to be replaced with this dl. dd contains the menu items and optional submenus, dt can contain a label. dd is not marked with a list symbol or a number; dt could have a default styling with a font-weight of bolder to indicate that it is a label.
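A possible shape of such a menu with one submenu, as a sketch only; the file names are hypothetical:

```html
<dl kind="menu">
  <dt>Poems</dt>
  <dd><a href="dream-to-fly.html">Dream to fly</a></dd>
  <dd><a href="ill-me.html">Ill me</a></dd>
  <dd>
    <!-- a nested dl serves as an optional submenu -->
    <dl kind="menu">
      <dt>Experiments</dt>
      <dd><a href="found-poetry.html">Found Poetry</a></dd>
    </dl>
  </dd>
</dl>
```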

link - this appears mainly in the head element of a document and is then not a direct part of the displayed body. It is used to create toolbars or panels in advanced viewers, or is added as clearly separated additional content after the body in simpler browsers (the logic being that the reader normally wants to read the content first, before a menu is used to switch to other content), or on demand for navigation in aural/oral presentation. This can be used for the complete navigation within a larger project, or to group together bookmarks/hotlists for toolbar display and direct export to the browser on demand by the user. dt is used for the label of a toolbar, a pull-down menu or a submenu. dd is used to create the menu item, normally with one link element per dd, using the title attribute to describe the purpose of the list item. A dd may contain more than one link if the links have different values for hreflang or type, or rel with alternate, all indicating alternative versions. Such a 'multi-link' dd creates a specific submenu for alternative access. Authors are encouraged to use this only to list alternative approaches to the same or similar content for accessibility reasons. Typical examples of 'multi-link' dd are versions in different languages, multimedia targets in different formats, and alternate stylesheets (not noted using XML stylesheet processing instructions).

Content other than link is not allowed in dd, but is gracefully ignored when such a navigation toolbar is created.

For backwards compatibility authors may also add the link kind within the body. Such a construction is expected to contain only an optional label in dt and one dd with a reference (a element) to a conventional index page of the project, representing the content of the menu navigation in the head that is not accessible. Viewers able to generate toolbars from a 'link' kind in the head will not display the content of a 'link' kind in the body. In other viewers this list is presented with a noticeable border or outline, as a warning that this is just a replacement for the intended navigation.

Authors are discouraged from using other kinds of dl in the head; no other kind is to be displayed as a toolbar.

Discussion: Maybe it is useful to put the dt label in a link element too, just to avoid confusion in the head element. Because link may or may not contain href, it can be used both as a reference (with href) and as a label only (without).
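A sketch of the 'link' kind in the head together with its fallback in the body, assuming the behaviour described for this kind; all file names and titles are hypothetical:

```html
<head>
  <title>Dream to fly</title>
  <dl kind="link">
    <dt>Poems</dt>
    <dd><link rel="next" href="ill-me.html" title="Next poem: Ill me" /></dd>
    <!-- 'multi-link' dd: alternative versions of the same content -->
    <dd>
      <link rel="alternate" hreflang="en" href="dream.en.html" title="English version" />
      <link rel="alternate" hreflang="de" href="dream.de.html" title="German version" />
    </dd>
  </dl>
</head>
<body>
  <!-- fallback: hidden by viewers that build a toolbar from the head -->
  <dl kind="link">
    <dt>Navigation</dt>
    <dd><a href="index.html">Index of all poems</a></dd>
  </dl>
</body>
```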

none - no requirement for a specific structure; this is mainly for content with other requirements or semantic meaning not covered by the predefined values for kind. Having this value avoids the abuse of the defined cases. dt and dd are presented as block elements without further specific structure; authors have to provide any further presentation themselves, for example using CSS.

dir - behaviour as intended for the old list element dir. dt are labels for directory lists in dd. The dt/dd groups are presented next to each other.

bol - backwards compatibility mode for the old ol. The dd are used as described for the 'ordered' case; the dt is not displayed at all. The effect is that authors can add numbering for old browsers manually, if this is required.

bul - backwards compatibility mode for the old ul. The dd are used as described for the 'unordered' case; the dt is not displayed at all. The effect is that authors can manually add symbol-like things (*, #, etc.) or img elements for display in old browsers, if this is required.

Discussion: Maybe the list is not yet exhaustive; are more values needed? If authors have other use cases not yet defined, they may use 'none', possibly with additional attributes like 'role'. If more important, already existing use cases are known, it would of course be useful to specify them here.

How to Markup Larger Structures of Poetry Containing Mainly Stanzas as Fine Structure?

Epic literature, for example Homer's Odyssey or Goethe's Faust, provides examples of more complex poetic text structures containing mainly stanzas and no prose. In most cases the macro structure from prose can be reused anyway, for example section, header and footer markup. If the complete document is poetry, the author will often add a descriptive sub-title like "tragedy" or "epic" and may add metadata such as RDF or Dublin Core (DC) to define this for search engines.

The main problem occurs if a piece of poetry appears in a prose context or vice versa. This is something of an "alien" scenario: even if prose can be interpreted as a degenerate, less structured derivation from poetry in the dawn of literature, these are quite different types of text now. A poetry container joins, for example, several stanzas into one poem, one artwork, and separates the poem from other content around it in the document, such as prose interpretation or discussion of the poem, navigation, advertisements or other artworks. Typically artworks appear on a pedestal; such a container is a kind of pedestal that separates the artwork from the surrounding content.

Another problem occurs with blank verse, free verse, or poetry with even less obviously rhythmic content. An author may leave this to paragraphs but, insisting that it is poetry, might want to use a poetry container to avoid misunderstandings in interpretation (even if the common user might not look at the source code, this can be quite effective for interpretation or a general scholarly approach). The poetry container still covers poetry types not covered by the simple stanza/line fine-structure elements and can already preserve some of the author's intentions.

HTML5 already introduces some elements specific to the prose text domain but none for poetry. Therefore, to make poetry easier to identify and to group stanzas together, for example into a complete poem or song, a container element would be helpful. Typically this construction will contain a heading and maybe some information about the author too (in HTML5 the header and footer constructs can be reused as well).

BLOCKQUOTE is not usable, if it is not a quote or not intended as quote. (The use of BLOCKQUOTE for formatting/presentational purposes was formally deprecated in HTML 4.01)

For prose, HTML5 already provides an article element to separate independent prose text from surrounding other prose or surrounding poetry.

The opposite direction is harder to accomplish.

Possible are elements like object or div, but both give no indication what the content is.

A solution could be to use the more generic "prose" element instead of "article", and the "poetry" element for poetry, as containers. Another approach would be a generic text element for all of them with an attribute like "kind" with the possible values "prose" and "poetry". Both types can have many subtypes; more detail would be useful, but it is hard to offer all subtypes as predefined values. For prose, for example: article, report, letter, short-story, fiction, novel.

For poetry some examples are: poem, lyric, lyrics, song, epic, tragedy, drama. Some of the subtypes can belong either to poetry or to prose, depending on the content.

In former times tragedy and drama were more related to poetry; today they move more towards prose in many cases. Maybe there is no need to be too specific about which type a subtype belongs to. This is perhaps the point where one can benefit from RDF scheme(s) covering definitions of lists of subtypes of prose and poetry.

What is the functionality of such a container, whether text or (article, prose) and poetry? It groups together closely related substructures and separates this group, as an independent piece of text, from the surrounding text. Reasons for this separation are: a completely different content structure, another author, or only a weak relation to the surrounding content. Typical HTML documents have become quite complex today, and for human readers or robots it is quite difficult to tell what is merely jammed together and what really belongs together.

The possible content of such a container is the same as described for "section" or the current "article". There is no specific exclusion for poetry, because artists tend to get creative about the question of what poetry can be if there is some unnecessary restriction ;o)

Such separated text containers often have their own heading, not related to the heading cascade outside of the container. Either a specific unnumbered new heading element can be used for such a container to indicate that it is independent from the text fragment outside, or the usual cascade h1-h6 is used. In almost every case the top heading in such a container then has to be an h1 heading representing the heading of the complete container. This also applies if the container itself is only a child of a section or area belonging to a heading of another rank, because the rank outside is not related to the rank of headings inside the text container.
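The heading rule can be sketched as follows, using the text container and dl kind="strophe" from this proposal; the surrounding section and its h2 are assumptions added for illustration:

```html
<section>
  <h2>Poems about insects</h2>
  <p>Some prose introducing the poem.</p>
  <!-- The container starts its own heading cascade: its top heading
       is an h1, although the parent section is headed by an h2 -->
  <text kind="poetry">
    <header>
      <h1>Dream to fly</h1>
    </header>
    <dl kind="strophe">
      <dt>I dreamed I was a fly</dt>
      <dt>buzzing through the sky</dt>
    </dl>
  </text>
</section>
```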

Approach for Text Container Default Presentation/Styling

For aural presentation a useful default presentation has a bigger benefit than for visual presentation. The reader could switch to a speaker with more advanced abilities for rhythmic and metric text types; otherwise listening will be painful for the audience. As a minimum requirement, a less advanced reader may issue a warning note about the problem. For visual presentation the functionality of a text container (joining text fragments into one piece of art/literature and separating this part from other fragments around it) can be emphasised, for example with a margin and padding of about one or two em for the text element and a thin border or outline around it or at its beginning and end.

Currently there is no really common behaviour for separating different content, but it is easy for authors to change the behaviour with CSS if they need no separation (why do they use containers then?) or a more advanced separation. For text-only viewers the separation may be less detailed or simpler than for viewers with advanced rendering and styling capabilities.

CSS sample for a suggested default visual presentation/styling within an advanced viewer:

text {display:block;margin:1ex;padding:1ex; border: none}
text[kind] {border-style: double; border-width: medium} /*   to indicate unspecified/unknown values   */
text[kind='prose'] {border-style: dotted; border-width: thin}
text[kind='poetry'] {border-style: dashed; border-width: thin}
text[kind='poetry'] text[kind='poetry'] {border-style: none; border-width: thin}
text[kind='prose'] text[kind='prose'] {border-style: none; border-width: thin}
text[kind='poetry'] text[kind='prose'] {border-style: outset; border-width: thin}
text[kind='prose'] text[kind='poetry'] {border-style: inset; border-width: thin}

If two elements such as prose and poetry are used instead of something like a kind attribute, the selectors become simpler, of course.
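As a sketch of that simplification, the rules above could be rewritten for two hypothetical elements named prose and poetry. The element names are taken from the alternative discussed earlier and are not agreed syntax:

```html
<style type="text/css">
  /* same suggested defaults as above, with type selectors
     instead of text[kind='...'] attribute selectors */
  prose, poetry {display:block; margin:1ex; padding:1ex}
  prose {border-style: dotted; border-width: thin}
  poetry {border-style: dashed; border-width: thin}
  poetry poetry, prose prose {border-style: none; border-width: thin}
  poetry prose {border-style: outset; border-width: thin}
  prose poetry {border-style: inset; border-width: thin}
</style>
```

The rule for unspecified kind values has no counterpart here, because there is no attribute whose value could be unknown.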

Complete Samples (Pseudo Code)

Example 1:

 
<text kind="poetry" role="poem:freeForm textTune:fun">
<style type="text/css"><![CDATA[ @import url("poem.css"); @import url("poemAural.css"), aural; ]]></style>
<header>
  <h1>Dream to fly</h1>
  <aside role="text:dedication">to my buzzing spring love</aside>
</header>

<dl kind="strophe">
  <dt>I dreamed I was a fly</dt>
  <dt>buzzing through the sky</dt>
  <dt>looking for some sweets</dt>
  <dt>or some spoiling meats</dt>
</dl>
<dl kind="strophe">
  <dt>I waked up in a cold sweat</dt>
  <dt>last reminiscence was a swat</dt>
</dl>

<footer>
  <address>Olaf, 2007-02-08, Hannover</address>
</footer>

</text>

Example 2. Using prose in poetry container as poetry:

 
<text kind="poetry" role="poem:freeForm poem:experimental textTune:fun">
<style type="text/css">
  <![CDATA[
  @import url("poem.css");
  @import url("proseInPoem.css");
  @import url("poemAural.css"), aural;
  @import url("proseAural.css"), aural;
  ]]></style>
<header>
  <h1>Ill me</h1>
  <aside role="text:dedication">to my beloved nag</aside>
</header>

<dl kind="strophe">
  <dt>I</dt>
  <dt>Lost my rhythm</dt>
  <dt>Lost my soul</dt>
</dl>
<p>
     Maybe I have to take some break ...<br />
     Else I ... will never retrieve ... what I lost
</p>

<footer>
  <address>Olaf, 2007-10-06, Hannover</address>
</footer>

</text>

Example 3. Dialog, quoting Shakespeare:

<text kind="poetry" role="poetry:drama">
<style type="text/css">
  <![CDATA[
  @import url("poem.css");
  @import url("poemAural.css"), aural;
  ]]></style>

<blockquote>
  <aside role="drama:stageDirection">
    An open place.<br />
    Thunder and lightning. Enter three Witches.
  </aside>
  <dl kind="dialog">
   <dt>1 Witch</dt>
    <dd>When shall we three meet again?</dd>
    <dd>In thunder, lightning, or in rain?</dd>
   <dt>2 Witch</dt>
    <dd>When the hurlyburly's done,</dd>
    <dd>When the battle's lost and won.</dd>
   <dt>3 Witch</dt>
    <dd>That will be ere the set of sun.</dd>
   <dt>1 Witch</dt>
    <dd>Where the place?</dd>
   <dt>2 Witch</dt>
    <dd>          
                        Upon the heath.</dd>
   <dt>3 Witch</dt>
    <dd>There to meet with Macbeth.</dd>
   <dt>1 Witch</dt>
    <dd>I come, Graymalkin!</dd>
   <dt>2 Witch</dt>
    <dd>Paddock calls.</dd>
   <dt>3 Witch</dt>
    <dd>Anon!</dd>
   <dt>All</dt>
    <dd>Fair is foul, and foul is fair:</dd>
    <dd>Hover through the fog and filthy air.</dd>
  </dl>
  <aside role="drama:stageDirection">Exeunt</aside>
</blockquote>
<footer>
  <address><cite>William Shakespeare, ~1605</cite></address>
  <p><em>Macbeth</em>, Act 1, Scene 1</p>
</footer>
</text>

As can be seen, a minor problem in 'dialog' is left, indicated here with a line of &nbsp; entities: how to preserve the rhythm between two different speakers? Maybe pre could be used, but normally this changes the font; using &nbsp; is not a completely nice solution either...

Example 4. Found Poetry:

<text kind="poetry" role="poem:foundPoetry textTune:philosophical">
<style type="text/css">
  <![CDATA[
  @import url("poetry.css");
  @import url("poetryAural.css"), aural;
  ]]></style>
<header>
  <h1>Found Poetry</h1>
  <aside>
    <p>created from the wikipedia article about 
      <a href="http://en.wikipedia.org/wiki/Found_poetry">found poetry</a>.
    </p>
  </aside>
</header>
<dl kind="strophe">
  <dt>Found poetry created,</dt>
  <dt>recycled or "untreated"</dt>
  <dt>makes a philosophical comment</dt>
  <dt>by altering the rearrangement</dt>
</dl>
<dl kind="strophe">
  <dt>words, phrases,</dt>
  <dt>and sometimes whole passages</dt>
  <dt>contain clever ironic contradictions</dt>
  <dt>or a visual collage of juxtapositions</dt>
</dl>
<footer>
  <address>Olaf, 2007-10-11, Hannover</address>
</footer>
</text>

Search Engine Benefits

Benefits from more specific search engine results and an "improved" understanding by robots (maybe artificial intelligence in the future) can be expected if such an RDFa subtype approach is detailed enough to provide the semantic information needed, for example, to build specialised search engine robots for poems, song texts, short stories, web-log entries, etc. Another method could be to add RDF data semantics or meta elements to such text containers as well as to the complete document.

A well-defined container already makes it simpler for more 'simple-minded' robots to extract such independent document fragments, either for more specific search results or (apart from copyright problems) for the generation of anthologies, if authors offer additional metadata about the theme of such a separated piece of literature in a second step, for example with RDF. But even without such detailed data, the pre-selection by the type of container can already be very helpful when searching for literature-related keywords.

Such a container element also reduces the danger of losing copyright-related data, if the author notes this data inside the container and the container is later extracted by a robot. The robot does not have to 'understand' the complete structure of the document; it only has to grab the complete container.
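The "grab the complete container" step can be sketched in a few lines of Python. This is only an illustration under assumptions: the element name text and the attribute kind are the proposals from the pseudo code on this page, not part of any HTML specification, and the names ContainerGrabber and grab_containers are hypothetical. The standard library parser does not need to know these elements; it simply reports every tag it sees.

```python
from html.parser import HTMLParser


class ContainerGrabber(HTMLParser):
    """Collect the text of every <text kind="..."> container of one kind.

    "text" and "kind" are the proposed names from this page (assumption).
    """

    def __init__(self, kind):
        super().__init__()
        self.kind = kind
        self.depth = 0         # nesting depth of <text> inside a match
        self.current = []      # text chunks of the container being read
        self.containers = []   # finished containers, whitespace-normalized

    def handle_starttag(self, tag, attrs):
        if tag != "text":
            return
        if self.depth > 0:
            self.depth += 1    # nested <text> stays part of the outer match
        elif dict(attrs).get("kind") == self.kind:
            self.depth = 1
            self.current = []

    def handle_endtag(self, tag):
        if tag == "text" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                joined = " ".join(self.current)
                self.containers.append(" ".join(joined.split()))

    def handle_data(self, data):
        if self.depth > 0:     # everything outside a match is ignored
            self.current.append(data)


def grab_containers(html, kind):
    """Return the text of all containers of the given kind in a document."""
    grabber = ContainerGrabber(kind)
    grabber.feed(html)
    grabber.close()
    return grabber.containers
```

Because the footer with the address element sits inside the container, the author's note travels with the extracted poem, while navigation and prose elsewhere in the document are never inspected.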

Because RDFa schemes can be improved independently from HTML5, HTML5 only needs to define such containers, either generic or for the types prose and poetry as a starting point.

The advantage of separating prose and poetry at the element level could be that simple robots are already able to perform a rough and fast preselection if they are looking specifically for poetry. Not just for poetry but for several types of literature, the behaviour of authors and readers is different: for fiction and poetry the number of readers per month may typically be low, but it adds up over many years if the content can be identified easily by robots and indexed. Such literature is expected to remain available for a long time, and a good container markup already identifies it as different from other content. Other content on the web often has a large number of users per month but is outdated or no longer available within months, and therefore does not need long-term identification for indexing by robots.

Let's think about a use case that shows how meaningful containers can improve the relevance of search engine results:

A query for a song text containing "sex and drugs and rock 'n' roll" is to be processed.

The search engine may already exclude the page of a rock band that has a link to "sexy pics from the last gig" in the navigation container or inside a menu, has a song against the usage of drugs inside a poetry container, and mentions rock'n'roll in a different section about its art concept/philosophy.

On another page a retired rock star talks in an interview (prose container, inside a dl of the kind conversation) about the old times of "sex and drugs and rock 'n' roll". This already fits the search keywords very well, and everything is in one container. But the search engine will be able to exclude this page, because the container is marked as prose (and uses the kind conversation for the dl).

On another page there really is a song text about "sex and drugs and rock 'n' roll" within a poetry container. Because it is a duet, the author used a dl of the kind dialog to mark up this song. But the search engine is able to distinguish this from the interview case, because the text is inside a poetry container, not a prose container, and because of the kind attribute.

The user of the search engine gets much better results as soon as such a rough structure is available and the author uses meaningful containers.

On another page there is another song text about "sex and drugs and rock 'n' roll". The author has less advanced semantic HTML abilities, but discovered that the elements ll and l (respectively the dl of the kind strophe) can be used to mark up poetry and songs. The search engine is still able to identify this fine structure and will also offer this page as a more relevant result than the first two pages. The user of the search engine benefits greatly from both the fine structure markup and the meaningful container elements.

The user has learned that this is a good search engine for finding song texts and the related artists, and will use it again for this purpose.
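The first three pages of this use case can be played through with a small Python sketch. It is not a description of how any real search engine works; it only assumes the proposed text element and kind attribute from this page, and the names KindIndexer and song_text_hits are hypothetical. A page counts as a hit only if the complete phrase occurs inside one single poetry container.

```python
from html.parser import HTMLParser


class KindIndexer(HTMLParser):
    """Index text by the kind of the nearest enclosing <text> container."""

    def __init__(self):
        super().__init__()
        self.stack = []    # one (kind, chunks) pair per open <text> element
        self.found = []    # (kind, normalized text) per closed container

    def handle_starttag(self, tag, attrs):
        if tag == "text":
            self.stack.append((dict(attrs).get("kind"), []))

    def handle_endtag(self, tag):
        if tag == "text" and self.stack:
            kind, chunks = self.stack.pop()
            self.found.append((kind, " ".join(" ".join(chunks).split())))

    def handle_data(self, data):
        if self.stack:     # text outside any container (menus etc.) is skipped
            self.stack[-1][1].append(data)


def song_text_hits(pages, phrase):
    """Return the names of pages with the phrase inside a poetry container."""
    hits = []
    for name, html in pages.items():
        indexer = KindIndexer()
        indexer.feed(html)
        indexer.close()
        if any(kind == "poetry" and phrase.lower() in text.lower()
               for kind, text in indexer.found):
            hits.append(name)
    return hits
```

The band page fails because its keywords are spread over a menu, a poetry container and a prose container; the interview fails because its container is of the kind prose; only the duet remains.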

Benefits for (CSS) Styling

In HTML4 it is almost impossible for user-agents or users to provide a style sheet for poetry content, because there are no specific element/attribute constructions that style sheet selectors could use to identify such content. However, if the author did not style the poetry with a specific aural style sheet, the users themselves, or especially a user-agent with oral/aural presentation capabilities, may want to change the presentation of rhythmic and metric content. Typically such user-agents and their users are more familiar with the presentation requirements than the average author is, so it is a big benefit if rhythmic and metric content can be identified inside a document for user (or user-agent) styling purposes.

Free Form, Visual and Concrete Poetry

Such poems often require precise positioning of single words, letters or glyphs. Currently HTML does not offer a method for precise presentational positioning. (This is out of the scope of the language, and because no non-visual equivalent is available, authors must be able to offer alternative text-only content.)

CSS positioning is possible, but because the positioning in such a case is not styling but part of the content, it is required in order to understand the intended information. Such types of text are currently not covered by HTML; simple problems can perhaps already be solved by using pre inside an otherwise sufficient structure. Only such simple examples may work with pre and monospace font types. But these types of art are not only text; because they have many graphical aspects too, authors may cover them better with SVG than with HTML. It is also possible to cover this with compound SVG+XHTML documents, but this typically causes problems in simpler user-agents without an XML parser or without any graphical capabilities.

However, with an object referring to SVG inside a poetry-specific container, at least the information that it is poetry can remain in HTML, and additionally an alternative description can be given using HTML. If the container is somehow specific either for prose or for poetry, object can be sufficient, and no specific element is required to embed external poetry content. Either the text element directly gets the functionality of object, or object is used directly inside the text element to indicate that the object content is text-like (prose or poetry). If text has object functionality, this means that text, audio and video are semantic equivalents of object. This is a construction similar to SMIL, with one generic container and some equivalents with a semantic meaning.

The advantage of using object inside text, compared with giving text object functionality, is better backwards compatibility in theory, which also works in practice with most current browsers. The disadvantage is that this is not true in practice for the current versions of MSIE 6 and 7: the support for object is obviously completely broken. For SVG documents, for example, neither is an installed plugin such as the one from Adobe used to display the SVG document, nor is the alternative content of the object displayed. Giving text object functionality avoids such problems in these old, flawed browsers, because the alternative content of the text element will always be displayed.

Presentation for text with object functionality: if text has no data attribute, its own content determines the size; otherwise, if the content is replaced with the resource named by the data attribute, the size is taken from that resource if available. Additionally, if the data attribute is used, authors may add width and height to specify a preferred size.

If text has no object functionality, such presentation requirements can be left to the object.
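The size rule for text with object functionality can be written down as a small decision function. The following Python sketch is one possible reading of the rule, not a normative algorithm. The function name resolve_size and its parameters are hypothetical, width and height are assumed to be plain integer values, and falling back to the element's own content when the external resource could not be loaded is an assumption of this sketch.

```python
from typing import Dict, Optional, Tuple

Size = Tuple[Optional[int], Optional[int]]   # (width, height), None = unknown


def resolve_size(attrs: Dict[str, str],
                 content_size: Size,
                 resource_loaded: bool = False,
                 resource_size: Size = (None, None)) -> Size:
    """Pick the presentation size of a <text> element with object functionality.

    attrs           -- attributes of the element (data, width, height, ...)
    content_size    -- size of the element's own (alternative) content
    resource_loaded -- True if the content was replaced by the data resource
    resource_size   -- intrinsic size of that resource, if available
    """
    # No data attribute, or the resource was not usable (assumption):
    # the element's own content is shown and determines the size.
    if "data" not in attrs or not resource_loaded:
        return content_size

    # Content was replaced: start from the resource's own size, if any.
    width, height = resource_size

    # Author-supplied width and height take preference.
    if "width" in attrs:
        width = int(attrs["width"])
    if "height" in attrs:
        height = int(attrs["height"])
    return (width, height)
```

For example, resolve_size({"data": "entropy.php", "width": "400"}, (300, 120), True, (640, 480)) keeps the height of the resource but uses the width given by the author.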

Concrete Poetry Sample Using a Text Container Like Object (Pseudo Code)

<text kind="poetry" role="poem:concretePoetry textTune:philosophical" data="entropy.php" type="image/svg+xml">
<style type="text/css">
  <![CDATA[
  @import url("prose.css");
  @import url("proseAural.css"), aural;
  ]]></style>
<header>
  <h1>Entropy</h1>
</header>

<aside role="wai:hint">
  <text kind="prose">
    <p>
Obviously the viewer is not able to present documents of the type image/svg+xml itself or with a plugin.
One may try it with an external viewer: <a href="entropy.php" type="image/svg+xml">Entropy</a>.<br />
The following is an alternative prose description, if this is not possible either.
    </p>
  </text>
</aside>

 <text kind="prose" role="wai:alternativeContent prose:conceptArt">
  <p>
In the universe there are mechanisms to increase the entropy or disorder as the kinetic energy of particles in
a gas and there are mechanisms to increase the order named forces or interactions like gravitation, electromagnetic
forces, strong and weak nuclear forces, resulting in a mixture of structures with decreasing order and structures
with increasing order.
  </p>
  <p>
The concrete poetry artwork, this is a replacement for, shows such a mixture, animating the glyphs of the word
'entropy' by moving the position and the rotation of each glyph. The change of x and y motion and rotation is
independent from each other with random values and random acceleration, but acceleration and timing is correlated
for the complete group of glyphs representing the hidden forces of order. Advanced audience may observe the
hidden rhythm of order in the feigned random noise.
  </p>

 </text>

</text>

Simple Free Form Sample Using pre in dd (Pseudo Code)

<text kind="poetry" role="poem:freeForm textTune:fun">
<style type="text/css"><![CDATA[ @import url("poem.css"); @import url("poemAural.css"), aural; ]]></style>
<header>
  <h1>Dream to fly</h1>
  <aside role="text:dedication">to my buzzing spring love</aside>
</header>
<dl kind="strophe">
  <dd><pre>I dreamed    I was          a fly</pre></dd>
  <dd><pre>          buzzing through the sky</pre></dd>
  <dd><pre>          looking for some sweets</pre></dd>
  <dd><pre>       or some     spoiling meats</pre></dd>
</dl>
<dl kind="strophe">
  <dd><pre>I waked up        in a cold sweat</pre></dd>
  <dd><pre>     last reminiscence was a swat</pre></dd>
</dl>
<footer>
  <address>Olaf, 2007-02-08, Hannover</address>
</footer>
</text>

Combination of Poetry (Rhythmic Text, Songs) and Animation (for Example Karaoke), Video, Audio

Declarative animation of text requires, for example, the SMIL animation and timing modules to be combined with HTML; alternatively, this will be directly available with SMIL 3, which has a text module. Another approach may be SVG Tiny 1.2, which has video, audio, declarative animation, graphics and text.

In most other approaches the text information gets lost, or client-side script animation has to be used to cover or move text fragments. As with concrete poetry, this is currently not possible using only HTML 5 without other formats or scripting.

Karaoke Sample (Pseudo Code)

<text kind="poetry" role="song:karaoke textTune:fun" data="myHat.smil" type="application/smil+xml">
<style type="text/css">
  <![CDATA[
  @import url("poetry.css");
  @import url("poetryAural.css"), aural;
  @import url("prose.css");
  @import url("proseAural.css"), aural;
  ]]></style>
<header>
  <h1>Karaoke surrogate: My hat, it has three corners</h1>
  <h2><em>folk song</em></h2>
</header>

<aside role="wai:hint">
  <text kind="prose">
    <p>Obviously the viewer is not able to present documents of the type application/smil+xml itself or with a plugin.
One may try it with an external viewer: <a href="myHat.smil" type="application/smil+xml">My hat</a>.<br />
A sample of aural presentation is available as application/ogg: <a href="myHat.ogg" type="application/ogg">'My hat' sample</a>.<br />
The related melody only is available as application/ogg: <a href="myHatM.ogg" type="application/ogg">'My hat' melody</a>.<br />
The following are alternative presentations only for the lyrics in English and German, available as a language switch 
in the Karaoke <abbr title="Synchronized Multimedia Integration Language">SMIL</abbr> document.
    </p>
  </text>
</aside>

<text kind="poetry" role="wai:alternativeContent poetry:lyrics lyrics:folkSong" xml:lang="en">
  <dl kind="strophe">
    <dt>My hat, it has three corners,</dt>
    <dt>Three corners has my hat,</dt>
    <dt>And had it not three corners,</dt>
    <dt>It would not be my hat.</dt>
  </dl>
</text>


<text kind="poetry" role="wai:alternativeContent poetry:lyrics lyrics:folkSong" xml:lang="de">
  <dl kind="strophe">
    <dt>Mein Hut, der hat drei Ecken,</dt>
    <dt>Drei Ecken hat mein Hut.</dt>
    <dt>Und hätt' er nicht drei Ecken,</dt>
    <dt>So wär's auch nicht mein Hut.</dt>
  </dl>
</text>


</text>

Email

Thread: 'HTML 5' and some poem markup?

Thread: Conformance of DL Groups Missing DT or DD

Thread: Marking Up Poetry

Retrieved from "https://www.w3.org/html/wg/wiki/index.php?title=PoeticSemantics&oldid=10164"
