Western Layout Requirements

From Print and Page Layout Community Group

The purpose of this document is to describe what is needed in a style language such as CSS in order to be able to produce commercial-quality books, forms, and other paginated items.

The background goal is to make sure that HTML 5 plus CSS can be a replacement for XSL-FO.

This does not necessarily mean the ability to produce a paged media view, whether on a screen or paper or projected onto a retina, from any HTML document. In an XML workflow one would typically generate the HTML document using XSLT or XQuery, so having to put elements in a particular place is not onerous. However, HTML + CSS users would probably prefer not to have to do this.

Many features described here are possible with existing CSS features; they are mentioned because they are important, or for rhetorical consistency, or possibly because the authors didn't realise they were already possible. Other features are possible using proposed mechanisms from CSS, but the maturity of those features may be unclear, and perhaps mentioning them here will help to prioritise them.


The Outside

For an ebook this includes packaging (e.g. a zip archive) as well as cover images; for a printed book it includes the format (hardback, leather-bound, paperback), choice of “stock” (paper) and inks, imposition, folding, gathering, binding and finishing: the physical delivery process.

A sufficient approach here might be to supply a link to instructions in JDF (Job Definition Format), whether from HTML, from CSS, or both, and to rely on the print driver to use this where necessary.

Prelims

The front matter of a book is often determined by legislation. Western books, for example, must generally have a title page and an imprint page.

For printed books the prelims, or preliminary material, or front matter, starts after the cover and end-papers that hold the book together and includes:

  1. half-title
  2. frontispiece, or series info, on the back of the half-title page (half-title verso)
  3. title page, on the first right-hand (recto) page that has a page to the left of it, and hence is numbered page iii in bibliographies and bookseller catalogues.
  4. imprint page, on the back of the title page, which includes copyright, printing, impression and library cataloguing data
  5. dedication (optional, with the verso usually blank)
  6. table of contents (special formatting required)
  7. list of illustrations, starting on the first recto after the table of contents (special formatting required as for table of contents)

The title page

This must contain the full title of the book (including any sub-title), the names of the authors, the publisher (including imprint if appropriate, e.g. Granta published by The Penguin Group) and the country of publication.

The page is usually numbered iii although a page number does not usually appear.

The imprint page

For CSS, generated content needs to be able to include the date of printing, and you probably also need to be able to format the date (e.g. 3rd Jan 2015 vs. Jan 03 2015 vs. 2015-01-03).

There may be boilerplate legal text required on this page (e.g. about moral rights and copyright) as well as cataloguing information. Case law will probably be needed to determine legislation about ebooks.

The dedication page

Nothing specific. Note that the dedication is often a fragment of a poem or play, and that drama and verse have their own formatting and layout needs, especially in areas such as line-breaking, continuation lines and line numbering.

The table of contents

The style language needs to be able to fetch all the chapter titles and their page numbers. If you use XSLT you can use it to pre-generate the list of chapter titles, so that only the page numbers need to be filled in.

Older books used dot leaders: rows of dots that connect chapter titles to page numbers. In this case you need to be able to specify the font, size, and spacing (e.g. with letter-spacing) of the dots, and also whether they are to be aligned or staggered in alternate lines. Sometimes the dot leader is used for every alternate or every third row in the table of contents.
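
As a rough sketch of the dot-leader case: the CSS generated-content drafts propose leader() and target-counter(), and some print formatters implement them. The nav.toc markup below is an assumption made for illustration, and support for both functions varies between formatters.

```css
/* Assumed markup: <nav class="toc"><a href="#ch1">Chapter title</a> ... </nav> */
nav.toc a {
  display: block;
  color: inherit;
  text-decoration: none;
}

/* Append a row of dots and the page number of the link target. */
nav.toc a::after {
  content: leader(".") target-counter(attr(href), page);
  /* These settings apply to the dots and to the page number alike;
     styling them separately would need extra markup. */
  font-size: 0.9em;
  letter-spacing: 0.2em;
}
```

Because the entries are ordinary links, the same markup also gives an ebook its clickable table of contents.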

More recent books often put the page number after the chapter title, run on; sometimes the chapter title is right-justified so that the page numbers can be in a column, perhaps for easier adding up.

There may be multiple levels of page number, and sometimes chapter page numbers are formatted differently (larger, or in bold, perhaps) from page numbers for sub-sections.

There may be a short synopsis of each chapter under the title.

For an ebook one would also want the chapter titles and page numbers to be links, of course.

The table of contents may be several pages long. Usually the table of contents does not itself appear in the table of contents.

Preface, synopsis, introduction

A book may have one or more of a preface, a synopsis, an introduction. Usually these are listed in the table of contents and are part of the book, but sometimes the introduction or preface may come before the table of contents. Pagination of the main body of the book usually begins with page 1 (one) at the start of chapter one, or at the start of the introduction, with preceding pages numbered in roman numerals.

The body of the book

By the body here I mean everything between the front matter and the back matter. This may include the introduction, but does not include any indexes or appendices.

Pages

Pages can be used in many ways. The most obvious is to print a document that doesn't fit on a single page, using a format more convenient to use than the scroll. Pages support cross-references by page number, as well as being more convenient to handle. Another way to use pages, especially in digital formats, is to present one logical part of a document or application at a time. Here, too, page numbers and navigation are important.

In a printed book the right-hand page is called the recto and the back of it, a left-hand page, is called the verso. A double page of a verso and a recto is sometimes called a spread, especially if it is designed as a single larger unit.

Running Headers

The start of a page (the top for left-to-right scripts) generally has a running header to help the reader navigate.

This typically contains the title of the current chapter or section on the recto (the front of the page) and the title of the book on the verso (the reverse side, the left page in a book). Generally the chapter title will be left-aligned or will be centered between the edges of the body area, and the page number will be right-aligned at the edge of the body area; the book title on the verso will be right-aligned or centered and the page number left-aligned, so that the page numbers are nearest the outside margin where they can easily be seen by flipping through the book. The page numbers are often formatted differently from the rest of the running headers.
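
The arrangement described above might be expressed with named strings and page margin boxes, as proposed in the CSS Paged Media and Generated Content for Paged Media drafts. This is only a sketch: the class names are invented for illustration, and it assumes a formatter that supports string-set and margin boxes.

```css
/* Capture heading text as named strings (class names are hypothetical). */
h1.book-title    { string-set: book-title content(); }
h2.chapter-title { string-set: chapter-title content(); }

/* Recto: chapter title left-aligned, page number at the outside (right) edge. */
@page :right {
  @top-left  { content: string(chapter-title); }
  @top-right { content: counter(page); font-weight: bold; }
}

/* Verso: page number at the outside (left) edge, book title right-aligned. */
@page :left {
  @top-left  { content: counter(page); font-weight: bold; }
  @top-right { content: string(book-title); }
}
```

The bold page numbers stand in for whatever distinct formatting the design calls for.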

More complex publications, such as aircraft manuals, have a multi-line page header including an “effectivity table” on each page; government publications may include security clearance information in the header.

The running header is usually omitted on the first page of a chapter; this is almost always a right-hand page, and if necessary a blank left-hand (verso) page is left to make sure the chapter starts on an odd-numbered page. The otherwise blank page at the end of the previous chapter will usually still have the running header. When there’s no running header there’s usually a footer containing the page number.

Note: in XSL-FO one can define a page template to be used for the first page of chapters, so as to omit or move the page number.
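
A comparable effect in CSS could be sketched with a named page and the :first page selector. The selector names are assumptions. Formatters also appear to differ on whether chapter:first matches the first page of every chapter or only the first page of the document, so this would need testing against the processor in use.

```css
/* Each chapter starts on a right-hand page and uses the "chapter" page type. */
section.chapter {
  break-before: right;
  page: chapter;
}
section.chapter > h1 { string-set: chapter-title content(); }

/* Ordinary chapter pages carry the running header. */
@page chapter {
  @top-center { content: string(chapter-title); }
}

/* Chapter opening: no running header, page number moved to the foot. */
@page chapter:first {
  @top-center    { content: none; }
  @bottom-center { content: counter(page); }
}
```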

Sometimes there are three parts to a running header, with one part left-aligned, one centered in the whole width, and one right-aligned. If the centered part overflows it should not overwrite the left and right items, but can use all of a second line, with phrase-level breaking preferred.

Running Footers

These are the same basic idea as running headers except near the bottom of the page instead of the top. In a dictionary the running footer on the recto page often contains the last entry defined on the double-page spread, and the running header on the verso the first.
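
For the dictionary case, the Generated Content for Paged Media draft lets string() pick the first or last value assigned on a page. The following is a sketch under the assumption that each entry's headword is marked up as dt.headword (a made-up class name) and that the formatter supports the second argument of string().

```css
/* Every headword updates the named string as it is laid out. */
dt.headword { string-set: headword content(); }

/* Verso header: the first headword on the left-hand page. */
@page :left {
  @top-left { content: string(headword, first); }
}

/* Recto footer: the last headword on the right-hand page,
   and so the last one on the spread. */
@page :right {
  @bottom-right { content: string(headword, last); }
}
```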

Page numbers

There are often multiple series of page numbers, especially in printed publications where they are essential for navigation.

In Western books the front matter pages are conventionally numbered in lower-case Roman numerals; the sequence starts again at one (usually in Arabic numerals) for the Introduction or for chapter one.
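
One way this convention might be written, assuming the front matter and main text are wrapped in sections with the hypothetical class names shown, and assuming a formatter that allows the page counter to be reset from an element:

```css
/* Front matter pages are numbered i, ii, iii, ... */
section.front-matter { page: front; }
@page front {
  @bottom-center { content: counter(page, lower-roman); }
}

/* The main text switches page type (forcing a page break)
   and restarts the numbering at 1 in Arabic numerals. */
section.main-text {
  page: main;
  counter-reset: page 1;
}
@page main {
  @bottom-center { content: counter(page); }
}
```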

Note: In some Japanese and Hebrew publications the book or magazine can be opened at either end; the page numbers start at one at each end and meet in the middle.

In loose-leaf publications and documents that are updated it's common to see page numbers of the form 12-35, where 12 is a chapter or section number and 35 the page number within that chapter or section. In equipment manuals, replacement pages are sometimes issued to be inserted between pages 12-37 and 12-38, and might be numbered 12-37.1, 12-37.2 and so on. Such a page is called a Point Page; it is probably out of scope for CSS Print.

Similarly in large publications there's often a financial desire to avoid reprinting existing pages for a new edition unless there is new text on them, so page breaks are locked.

For CSS, note that people may indeed need some control over pagination: force a page break, possibly with a page number taken from the document markup itself, e.g.

It’s common to need to put “Page 3 of 20” in a running footer or even at the bottom of a long table. Some publications even have government regulations requiring this. An example is documentation accompanying anything sold or supplied to the US military.
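
In a running footer this can be sketched with the page and pages counters defined in the CSS Paged Media draft, assuming a formatter that supports page margin boxes. The "bottom of a long table" case is not covered by this fragment.

```css
@page {
  @bottom-center {
    /* counter(page) is the current page; counter(pages) is the total. */
    content: "Page " counter(page) " of " counter(pages);
  }
}
```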

Page numbers are often placed or formatted differently from the rest of the running header. Sometimes they are placed part-way down the page in the outside margin. Sometimes they are printed in a different colour or font or size, or reversed (e.g. white numbers on a black circle).

Watermarks, Classification and Effectivity

Some print applications require per-page annotations about the status of the information on the page. A watermark in this context is a large word, phrase and/or symbol printed "underneath" the text, as if it is part of the paper, so that the content of the page overprints the watermark. Common phrases include DRAFT, CLASSIFIED, UNCLASSIFIED, PRIVATE, NOT FOR RESALE, REVIEW COPY and so forth.

The watermark is often printed diagonally and is sometimes omitted on blank pages.
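
There is no dedicated watermark feature in CSS, but a diagonal watermark can be approximated with a fixed-position element, since CSS 2.1 says fixed boxes are repeated on every page in paged media. This is a sketch only: the .watermark element is assumed to be added to the document, and the fragment does not address omitting the watermark on blank pages.

```css
/* Assumed markup: <div class="watermark">DRAFT</div> near the start of <body>. */
.watermark {
  position: fixed;             /* repeated on each page when printing */
  top: 45%;
  left: 0;
  width: 100%;
  text-align: center;
  transform: rotate(-45deg);   /* set diagonally across the page */
  font-size: 96pt;
  color: rgba(0, 0, 0, 0.12);  /* pale, so the page content stays legible */
  z-index: -1;                 /* painted underneath the page content */
}
```

If the page content has an opaque background, the watermark would be hidden behind it, so the approach depends on the rest of the stylesheet.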

Some documents will need something in the page header extracted from metadata in the page content, for example to say "this page of the repair manual only applies to cars with the GX engine." This is called effectivity.
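
As an illustration, if the effectivity were carried in an attribute (the data-effectivity name below is invented for this sketch), a named string could copy it into the page header in formatters that support string-set with attr().

```css
/* Assumed markup: <section data-effectivity="GX engine">...</section> */
section[data-effectivity] {
  string-set: effectivity attr(data-effectivity);
}

@page {
  @top-right { content: "Applies to: " string(effectivity); }
}
```

A named string keeps its value until it is set again, so sections without the attribute would need to reset it explicitly to avoid inheriting the previous section's effectivity.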

Similarly there may be a need to indicate in the page header whether the page has changed since the last printing; this is very specialized and out of scope for CSS and XSL-FO, although actual implementations exist that provide this functionality through extensions.

Paper Sizes

A designer may choose to use different stylesheets for different paper sizes (possibly with CSS this could be done with media queries, or even by JavaScript). Some documents may mix paper sizes, for example using larger paper for pages with images or tables. Many printers, both commodity-level and commercial, support this.

Even where a single page size is used, some pages may be presented rotated (e.g. landscape), possibly with no running headers or footers. In XSL-FO this is done by selecting a different page master.
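
The CSS counterpart of a page master is a named page. A minimal sketch, assuming a formatter that supports the size descriptor and named pages, with table.wide as a hypothetical selector for content that needs the larger, rotated sheet:

```css
/* Default page size for the publication. */
@page { size: A5; }

/* A page type for wide tables: larger paper, landscape, no running header. */
@page wide {
  size: A4 landscape;
  @top-center { content: none; }
}

/* Any table with this class is placed on pages of the "wide" type. */
table.wide { page: wide; }
```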

The body area, or main content area

The page body has the actual book, rather than the apparatus of the page which provides navigation and a place for the book to live. It is the inside of the building and, when we are reading, the walls are no longer the focus. In a digital book the running headers and footers might fade away, but they are still needed from time to time for navigation and for grounding: to know where we wish to go we must know where we are.

Within the page body may appear any manner of writing or graphic. That is, the content within the page body may contain elements of the photographic, lithographic, typographic, pictographic, cartographic, orthographic, calligraphic, collagraphic and even cryptographic marking or writings.

Within the body of the page are one or more compositional elements which may include type flowed into a region or blocks laid out in a grid. These regions and blocks may be thought of as "typographic devices." A paragraph, a list, a table, a heading; each is a device that typographers and orthographers have come to depend upon to convey meaning intended by the author.

In the following, we will explore use cases which may lead to specific requirements for Print and Page Layout. Each use case should include its own title, synopsis, detailed description, examples, link to image of an actual printed or paged representation, etc. Details TBD.

We will present each use case as a challenge to the community to demonstrate the CSS fragment and processor which satisfies the use case. Any use case that remains unsatisfied is a candidate to be considered as input to the creation of a requirement.

Paragraphs

A paragraph is a written unit of discourse that presents an idea or makes a point. A paragraph typically consists of one or more sentences. A paragraph typically begins on a new line and the first line may be indented, although this custom has fallen into disfavour in the past few decades.

A paragraph may begin with a pilcrow symbol. The leading letter, word or phrase in a sentence may be presented in display type, such as large capitals, capitals, bold lettering or even italic lettering. Paragraphs are sometimes set off from surrounding text by inter-paragraph space.
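
These conventions are largely expressible in existing CSS. The class names below are made up for illustration, and the specific measurements are arbitrary.

```css
/* Traditional style: first-line indent, no space between paragraphs. */
p { margin: 0; text-indent: 1.5em; }

/* Alternative style: no indent, paragraphs set off by space. */
p.spaced { text-indent: 0; margin: 0.75em 0; }

/* A paragraph that begins with a pilcrow. */
p.marked::before { content: "¶ "; }

/* Display type for the leading phrase: small capitals on the first line. */
p.lead::first-line { font-variant: small-caps; }
```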

Paragraph numbering is discussed elsewhere.

Please submit unsatisfied use cases.

Lists

A list presents a series of items. Lists may be enumerated or not. Enumerated lists are discussed elsewhere.

The content of items in a list may range from a single character or symbol, to paragraphs, to graphic images and even subordinate lists.

Please submit unsatisfied use cases.

Numbering, Lettering, Marking

Most people are familiar with the use of enumerated lists, chapter and section headings, tables, figures, illustrations and even paragraphs. Numbering, lettering and marking are typically used as navigational aids, especially in cross-references. Marking, using symbols instead of numbers or letters, is more frequently used for legends and callouts.
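
Much of this is already covered by CSS counters. The fragment below is a sketch that numbers chapters, sections and figures; the selectors and label text are assumptions, and it presumes the headings and figures of a chapter share a common parent.

```css
body { counter-reset: chapter; }

h1.chapter {
  counter-increment: chapter;
  counter-reset: section figure;   /* restart both within each chapter */
}
h1.chapter::before { content: "Chapter " counter(chapter) ". "; }

h2 { counter-increment: section; }
h2::before { content: counter(chapter) "." counter(section) " "; }

figcaption { counter-increment: figure; }
figcaption::before {
  content: "Figure " counter(chapter) "-" counter(figure) ": ";
}
```

Lettering follows the same pattern by passing a counter style, for example counter(section, lower-alpha).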

Please submit unsatisfied use cases.

Tables

Tables are a familiar presentation format both for purposes of communicating information in columns and rows and for arranging design elements in a grid. The distinction is often captured by the use of the French words "table" and "tableaux".

The "table" is an organized collection of information, or data. It is typically composed of a heading section used to identify the contents of the columns, a body, and sometimes a tail section in which notes and a legend may be presented. Some tables contain multiple heading and body sections. In many tables, the initial column may present headings to identify the content of each row.

Borders around the entire table and between rows and columns are common design features.

Please submit unsatisfied use cases.

Extracts and Indention

As a matter of orthographic and typographic agreement, extracts (block quotes, poetry, examples, etc.) are represented with indention and vertical spacing. Please submit unsatisfied use cases.

Verse

Please submit unsatisfied use cases.

Code

Please submit unsatisfied use cases.

Quotations

Run-in, set-off, block, marks and " alignment, multi-paragraph, with speech, with display type, use of ellipses, omissions, interpolations and alterations, sic, added italics, citations.

Illustrations, Captions and Legends

Please submit unsatisfied use cases.

Spacing

Please submit unsatisfied use cases. Letter spacing, word spacing, sentence spacing, line spacing, kerning.

Alignment and Justification

Please submit unsatisfied use cases.

Hyphenation

Please submit unsatisfied use cases.

Superscripts and subscripts

Please submit unsatisfied use cases.

Numbers

Numbers are used in a variety of contexts. The numbering of headings, paragraphs, lists, etc. is discussed elsewhere.

Here we are seeking unsatisfied use cases for typographic devices for the representation of specific types of numbers, so to speak.

Figures, words, ordinals, rounded numbers, scientific usage, datatypes, quantities, percentages, fractions, currency values, date and time, identifiers, roman, etc.

Composing Classical Greek

Ya, we know, you don't see a lot of classical Greek on the Web. Well, you just don't know where to look. Please submit unsatisfied use cases.

Chapters

A chapter usually starts with a heading. Often there is an initial cap or drop cap on the first paragraph, and the baseline of this must align with the n-th (often third) line of text exactly. The top of the drop cap will align exactly with the cap-height of the text on the first line, or will stick up above the text a little. The cap may protrude into the margin partly or entirely, and if it forms part of the first word the first line is kerned close against it.
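
The CSS Inline Layout draft proposes an initial-letter property aimed at exactly this alignment. The following is a sketch, assuming a formatter or browser that implements it (support is uneven) and a hypothetical section.chapter wrapper.

```css
/* A drop cap three lines tall, sunk three lines: its baseline sits on
   the third line and its top aligns with the cap-height of the first. */
section.chapter > p:first-of-type::first-letter {
  initial-letter: 3;
  margin-right: 0.1em;
}

/* Variant: three lines tall but sunk only two lines,
   so the cap sticks up above the first line of text. */
section.chapter.raised > p:first-of-type::first-letter {
  initial-letter: 3 2;
}
```

Protrusion into the margin and kerning of the first line against the cap are not addressed by this fragment.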

The text will have a line spacing and line length that depends on the design of the font in use. A high x-height allows tighter line spacing; a stronger contrast between thicks and thins is harsher on the eyes and requires shorter lines. The font will generally have a companion typeface used for headings: if either is not available, a different pair of fonts may be preferred.

For periodical publications or for books in a series there is usually a Modular Grid, a set of guidelines to say where things go, a set of alignment points and lines and a set of slots that can be filled with different items. The grid is a large part of the brand identity of the publication.

One then needs to be able to say, "this article goes in the main copy area, uses two grid columns, and is itself formatted in three columns."
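
On screen, a statement like this can be approximated by combining CSS Grid with multi-column layout. Whether a paged formatter lays out grids across pages in the way a print designer expects is a separate question, so treat this as a sketch with invented selectors and an arbitrary four-column grid.

```css
/* A hypothetical four-column grid for the main copy area. */
.copy-area {
  display: grid;
  grid-template-columns: repeat(4, 1fr);
  column-gap: 1pc;
}

/* This article uses two grid columns and is itself set in three columns. */
article.feature {
  grid-column: span 2;
  column-count: 3;
  column-gap: 1pc;
}
```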

Images and tables may "float" to specific grid positions, with the constraint that they must be on the same page (or, in print, the same double-page spread) as the reference to them. They can generally float backwards through the text as well as forwards.

If images do not float they are formatted where they occur in the text, and may span multiple columns. The text might say, “The following table shows the average price of Welsh socks by region:” and obviously the table must follow that remark.

Marginalia and Footnotes

Figures are sometimes shown in a separate column reserved for the purpose. If there is only one figure it is next to the reference in the text, or perhaps, if there is room, starts a line or two higher to avoid the gestalt illusion of falling. Multiple figures each go next to their corresponding callouts, sometimes on alternating sides of the page in two separate columns. If there is not room the figures stack one above the other. If necessary the whole stack can be pushed up to make room for all the figures. The figures may also intrude into the text column.

Marginalia work the same as the margin-figures just described, and some publications mix the two.

Footnotes have a callout and live in an area with a line separating them from the body that springs into being only if there are footnotes. In the input the footnotes are generally placed where the reference would go.

Short footnotes may be formatted as inline blocks, to save space.

Remember that the purpose of a footnote is to clarify the text, and the reason to use a footnote rather than an end-note or a pop-up note is so you can see it at a glance.

Footnotes may be per-column or per-page, and may be numbered starting at 1 (or * † ‡ | ||) on each page or column or article.
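
The Generated Content for Paged Media draft sketches a footnote model along these lines. The span.footnote markup is an assumption, only per-page footnotes are shown, and formatters differ on whether the footnote counter increments automatically or needs an explicit counter-increment.

```css
/* Assumed markup: <span class="footnote">Note text</span> at the point of reference. */
span.footnote {
  float: footnote;                /* move the note to the page's footnote area */
}

/* The callout left behind in the running text. */
span.footnote::footnote-call {
  content: counter(footnote);
  font-size: 0.7em;
  vertical-align: super;
}

/* The number shown beside the note itself. */
span.footnote::footnote-marker {
  content: counter(footnote) ". ";
}

@page {
  counter-reset: footnote;        /* restart numbering on every page */
  @footnote {
    border-top: 0.5pt solid black;  /* separator line above the notes */
    padding-top: 0.5em;
  }
}
```

Removing the counter-reset from the page rule would number the notes continuously instead of per page.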

A footnote may be repeated on multiple pages if there are multiple references to it. If footnote numbering starts at 1 on each page, the footnote would not necessarily have the same number when it was repeated.

Long footnotes sometimes require more than one page. Most publications want at least a couple of lines of actual text at the top of a page that continues footnotes, so readers can easily see which is book and which is note. If a new footnote reference occurs there, however, this requirement may be relaxed, unless the new footnote will fit entirely on the page even if the long note continues.

There may be multiple streams of footnotes, each with their own numbering sequence. Sometimes these are intermingled in the order the call-outs occur, and sometimes they have separate areas, e.g. one per-column and one spanning the page.

Blank Pages

Most publications will start each new chapter on a recto, which sometimes leaves a blank verso on the previous page, opposite the chapter opening. The otherwise blank page may say "this page intentionally left blank" so that people don't make support calls or request replacement copies.
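
The CSS Paged Media draft defines a :blank page selector for pages left empty by a forced break. A sketch, assuming a formatter that supports it and a hypothetical section.chapter wrapper:

```css
/* Each chapter starts on a right-hand page. */
section.chapter { break-before: right; }

/* Pages left empty by that forced break carry a notice. */
@page :blank {
  @top-center {
    content: "This page intentionally left blank";
    font-style: italic;
  }
}
```

This places the notice in the top margin box; centring it in the body of the page would need a different mechanism.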

Body size

The page body area should usually be filled all the way to the bottom on every page except the last in a section. Copy-fitting is used to do this.

In addition, if you hold printed pages to the light, the printed lines should align exactly, so that ugly and distracting show-through is minimized.

Headings

A heading in print should be nearer to the text to which it applies than to other text. For Western formatting for example this means the default margin above a heading should be larger than the space beneath it. This is more important in print than on screen.
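
In CSS terms this might look like the following, where the measurements are arbitrary and the break-after declaration is an addition not discussed above, included because a heading stranded at the foot of a page defeats the purpose:

```css
h2 {
  margin-top: 2em;       /* more space above the heading ... */
  margin-bottom: 0.5em;  /* ... than below, tying it to the text it introduces */
  break-after: avoid;    /* keep the heading on the same page as that text */
}
```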

Back Matter

Not written yet. Note that the plural of index in a book is indexes, not indices (the latter term is, however, used in mathematics, where the word index derives directly from the Latin).

Topics here need to include sorting and collating a multi-level index, multiple indexes, figure acknowledgements, colophon.

Appendixes

Notes

References

Glossary

Bibliography

Index(es)

Sorting and collating multi-level indexes

Acknowledgements

Colophon

Non-Book Paginated Documents

Forms, flyers, posters, folded brochures and other ephemera have very diverse needs. Folding can be specified with JDF. Two-colour commercial printing is common for posters, and so handling of spot colours is important. You may need to specify the ink mix, or give a colour with more precision than 8-bit RGBA, to ensure the correct solid ink is chosen. [@@Fabio? Chris?]