Copyright © 2015-2016 W3C® (MIT, ERCIM, Keio, Beihang). W3C liability, trademark and document use rules apply.
This document describes requirements for the layout and presentation of text in languages that use the Ethiopic script when they are used by Web standards and technologies, such as HTML, CSS, Mobile Web, Digital Publications, and Unicode.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.
This document describes the basic requirements for Ethiopic script layout and text support on the Web and in eBooks. These requirements provide information for Web technologies such as CSS, HTML and digital publications about how to support users of Ethiopic scripts. Currently the document focuses on Amharic and Tigrinya.
The editor’s draft of this document is being developed by the Ethiopic Layout Task Force, part of the W3C Internationalization Interest Group. It is published by the Internationalization Working Group. The end target for this document is a Working Group Note.
Sending comments on this document
If you wish to make comments regarding this document, please raise them as GitHub issues. Only send comments by email if you are unable to raise issues on GitHub (see links below). All comments are welcome.
To make it easier to track comments, please raise separate issues or emails for each comment, and point to the section you are commenting on using a URL for the dated version of the document.
This document was published by the Internationalization Working Group as a Working Draft. If you wish to make comments regarding this document, please send them to public-i18n-ethiopic@w3.org (subscribe, archives). All comments are welcome.
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. The group does not expect this document to become a W3C Recommendation. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This document is governed by the 1 September 2015 W3C Process Document.
This document describes requirements for the layout and presentation of text in the Ethiopic script for use with Web standards and technologies, such as HTML, CSS, Mobile Web and Digital Publications (e.g. eBooks). In addition to the Ethiopian and Eritrean homelands, the script is widely used throughout the diaspora of these two nations. Accordingly, requirements are gathered from stakeholders engaged in Ethiopic publishing from all regions.
The document does not describe implementations or issues related to specific technologies, such as CSS. Instead it describes the typographic requirements of Ethiopic in a technology-agnostic manner, so that the content remains evergreen and is equally relevant to all technologies that aim to represent Ethiopic text on the Web.
This document was created by the W3C Ethiopic Layout Task Force. The Task Force will discuss many issues and harmonize the requirements from user communities and solutions from technological experts.
The following types of experts will be involved in the creation of this document:
The Task Force will conduct a survey of the publishing industry to solicit input and identify the set of in-use layout styles. This document will then represent the normalized results of the industry survey, which in turn becomes a basis for its validity and suitability to purpose. In the interim, before the survey results have been compiled and applied to this document, tentative specifications will be given based on the most probable survey results anticipated by participating experts. Survey Pending notes will appear alongside specification sections to denote their status.
Growing out of the 182-element Ge’ez language syllabary over two millennia, Ethiopic in its present-day form is a multilingual and multinational script comprising 494 symbols that represent syllables, numerals, punctuation and tonal marks. Numerous linguistic, cultural, literary, historical and political issues surround the script and its utilization, all of which the authors strive to avoid discussing unless directly relevant to clarifying a given layout use case. The following principles are applied in the development of this document:
While Ethiopic documents can be characterized under a number of time periods, we discern only two gross eras herein. Classical Ethiopic encompasses the layout requirements found in the documents of the first printing presses and coming into fruition under the reign of Emperor Haile Selassie. While not the focus of this document, “Classical Ethiopic” also encompasses handwritten manuscripts whose practices are present in early publishing. This era is characterized more by the influences of the Ge’ez tradition as embodied by the Ethiopian Orthodox Church with respect to spelling conventions, syntax and punctuation use, Ethiopic Wordspace and numeral system preference. It also features less variation in layout practice, which is likely the result of having fewer publishing houses in operation.
The Classical Ethiopic era is followed by Modern Ethiopic spanning from the post-Imperial period up until the present day. Modern Ethiopic practices are characterized by looser spelling conventions, the preference change toward whitespace and western numerals, more variation in layout styles and in some cases limitations imposed by desktop publishing software designed for Western markets.
The focus of this document is on the Modern Ethiopic layout conventions, with distinctions pertaining to Classical Ethiopic noted when known. An exception is a complication, found in Classical era documents, that the authors hope to resolve: Ethiopic Wordspace interplays with whitespace in a number of contexts.
In multilingual documents, differences between the heights of letters in Ethiopic script and its companion foreign script are often found. The difference is likely an artifact of the typesetting technology in use and does not represent the intent of the author or publisher. In the classic typeface style of Ethiopic script the letters will be of variable heights. Fixed height styles are more generally used for advertisement and not publishing. The nature of variable height Ethiopic letters is a factor that complicates how to best align letter height with a foreign script.
At a given point size, letter heights within a script may vary widely between typefaces. This adds another level of difficulty to aligning heights between scripts, as an alignment will only be optimal between a specific typeface pair. Within a script featuring variable (not fixed) height letters, the relative heights of letters are subject to change between typefaces. This phenomenon reinforces the previous assertion on typeface pair optimization, but also introduces the possibility that alignment optimization can be language sensitive. This happens when an alignment pair designed for the letter inventory of one language is applied to another language that includes letters that exceed the heights of the optimized set.
The relative heights of letters used in different languages may also change with typeface as this next figure illustrates.
With these caveats considered, “Zen” alignment is a means to optimize an Ethiopic-Latin typeface pair that is suitable for a general use case when there is no a priori knowledge of the document language. Its basis is reviewed here. The Latin letter “Z” and Ethiopic letter “ን” are chosen as pairing symbols representative of the mean height. They both feature broad horizontal strokes that are easy for the eye to follow as a nearly continuous stroke. Ethiopic letters that were introduced as an extension to the Ge’ez core will typically feature a macron or other modifier at the top of a base letter in order to form the extension letter. The macron necessarily extends the height of a letter. Using the top of the macron for the reference height of a letter leads to height alignment that makes the majority of Ethiopic letters appear too short against Latin letters.
A better approach is to align Z with caron (Ž) against ን with macron (ኝ), while also aligning Z with ን, and to find typefaces with a good tuple of aligned pairings.
Going further, we may assume that the ኝ and Ž will align satisfactorily (optionally check any irregularities) and simply align the Z-ን pair. Phonetically the sequence of these two letters would sound like “zen”, hence the name.
Issues/Questions:
A common practice in Ethiopic literature is the change of typeface weight in one script to appear more visually similar to the other. Most typically a Latin typeface will be made heavier to better match its Ethiopic counterpart. This weight increase is demonstrated in many Ethiopic fonts that include Latin letters. The font designer may have increased the weight of the Latin range primarily to provide heavier weight punctuation to use with Ethiopic script (see Ethiopicized Punctuation).
Literature produced with a heavier Latin typeface may represent the author’s stylistic sensibilities, but in some cases may only be a pragmatic outcome when an author finds manually changing between fonts too burdensome. The view of professional publishers is unknown here and should be determined.
Issues/Questions:
It is not uncommon to observe mid-sentence baseline changes in interlingual documents produced with pre-digital typesetting systems where Ethiopic and Latin text, for example, would appear to be laid out along different baselines in a line of text. The most common example of this appears in documents produced with a typewriter where a sheet of paper had to be moved between typewriters to produce a line in two scripts. An apparent baseline difference here would be the result of mechanical misalignment.
The only point to make here, it seems, is that Ethiopic and foreign scripts should share the same baseline. This may already be the case with computer typography. If so, this section should be removed.
The Sebatbeit (aka Sebat Bet) language features a greater number and frequency of labiovelarized letter forms in comparison to the larger language communities utilizing Ethiopic script. In Sebatbeit publishing a number of modifications to diacritical marks are regularly applied to aid glyph clarity. These modified glyphs will sometimes appear within a font as a Stylistic Alternate, or an entirely separate font may be used in publishing where these letter shapes appear as the default forms. The glyphs are enumerated here and are recommended for Sebatbeit literature.
In Classical era Ethiopic up to nine punctuation marks can be found, though rarely, if ever, would all nine be found in a single document. The Classical Ethiopic punctuation inventory may appear to be larger in number as a result of bi-chromatic rendering, which can be applied to any punctuation and in several different styles. However, the bi-chromatic forms do not change the syntactic role of a given punctuation. In the Modern era, a third of the punctuation marks: ፧, ፨, and ፠ have largely fallen into disuse while a number of punctuation marks from Western practices have been adopted. Bi-chromatic punctuation is now reserved for spiritual materials and remains a calligraphic practice.
Ethiopic punctuation segments text and is non-enclosing. The Ethiopic word separator, labeled “Ethiopic Wordspace” in the Unicode standard, is given special attention in this section as it follows more complex rules of interaction with other punctuation as well as justification. Ethiopic punctuation is often aligned with similar English punctuation, though these associations must be understood as approximate. Ethiopic fullstop and wordspace are highly regular in their application; others, particularly ፣, ፤, and ፥, will be consistently used within a document but their roles may change between authors or institutes. A detailed review of punctuation semantics is beyond the scope of this document; however, Ethiopic comma is given special attention in the following section.
Symbol | Address | Names | Usage |
---|---|---|---|
፡ | U+1361 | Ge’ez: ንዑስ ነጥብ Amharic: ሁለት ነጥብ Tigrinya: ክልተ-ነጥቢ English: Ethiopic Wordspace | TBD |
፦ | U+1366 | Ge’ez: አስተአምሮ Amharic: አስረጂ ሰረዝ Tigrinya: English: Ethiopic Preface Colon | TBD |
፥ | U+1365 | Ge’ez: ንዑስ ሠረዝ Amharic: ነጠላ ሰረዝ Tigrinya: English: Ethiopic Colon | TBD |
፣ | U+1363 | Ge’ez: ነጠላ ሠረዝ Amharic: ነጠላ ሰረዝ Tigrinya: ንጽል ጭሕጋር (ንጽል ሰረዝ) English: Ethiopic Comma | TBD |
፤ | U+1364 | Ge’ez: ዐቢይ ሠረዝ Amharic: ድርብ ሰረዝ Tigrinya: ድርብ ጭሕጋር English: Ethiopic Semicolon | TBD |
፧ | U+1367 | Ge’ez: ሠለስተ ነጥብ Amharic: ሦስት ነጥብ Tigrinya: ምልክት ሕቶ (ትእምርተ ሕቶ) English: Ethiopic Question Mark | TBD |
። | U+1362 | Ge’ez: ዐቢይ ነጥብ Amharic: አራት ነጥብ Tigrinya: ኣርባዕተ ነጥቢ English: Ethiopic Fullstop | TBD |
፠ | U+1360 | Ge’ez: Amharic: Tigrinya: English: Ethiopic Section Mark | TBD |
፨ | U+1368 | Ge’ez: Amharic: Tigrinya: English: Ethiopic Paragraph Separator English: Ethiopic Seven Dot Section Mark | TBD |
Adopted into Ethiopic writing practices are enclosing punctuation such as parentheses, brackets, single and double quotation marks and guillemets. Expressive punctuation such as question mark, exclamation point, inverted exclamation mark, and ellipsis are also incorporated into Ethiopic practices. Additional foreign symbols that denote currency, time, mathematics, or communicate with Internet protocols (e.g. "@" , "://") have also been adopted over the last century as international communication grew.
The ES-781:2002 standard identifies the following inventory of western symbols to be used with Ethiopic:
1234567890 ? ! ¡ . / () [] {} < = > \ # % & _ - + ± × ÷ ‘ ’ “ ” ‹ › « »
Additionally the following punctuation is observed to be used with Ethiopic writing:
$ : , € @ …
Inverted exclamation mark is repurposed and utilized differently than in its Western usage. In Ethiopic writing the inverted exclamation mark is known as “Timirte Slaq” (ትእምርተ፡ሥላቅ); it appears at the end of a sentence and denotes sarcasm. All borrowed punctuation is subject to typeface alignment with Ethiopic weights and shapes, an aesthetic enhancement discussed in this section as "Ethiopicized Punctuation".
In Ethiopic writing practices three encoded symbols will be used in the context of comma, however they are generally not used together. Looked at another way, the Ethiopic comma may appear with three different glyphs. The western comma also has an important role in Ethiopic writing. Usage rules are as follows:
Issues/Questions:
In recent decades some communities have adopted a practice of employing the wordspace symbol as a comma when U+0020 SPACE [ ] is used as the word separator. The interpretation of the symbol is then dependent on the context of the writing convention in use by the author. Accordingly, an application user setting could be offered to set the symbol context.
An alternative viewpoint on this practice is that U+1363 ETHIOPIC COMMA [፣] is in fact in use by these user communities; however, its glyph has decayed whereby the line segment is lost and so it visually coincides with U+1361 ETHIOPIC WORDSPACE [፡]. Under this perspective, a simple solution would be to modify an Ethiopic font for these users (perhaps adding an alternative glyph in an OpenType stylistic set) so that the Ethiopic comma character address and semantics remain intact though the visual form has been tailored to meet aesthetic needs.
Issues/Questions:
The shape and weight of adopted symbols are often changed for a better visual fit with an accompanying Ethiopic typeface. Enhanced foreign symbols are referred to here as “Ethiopicized”. While many symbols are borrowed from western writing, not all necessarily benefit from Ethiopicization. Those that do will primarily be used in a context where the foreign symbol directly abuts some Ethiopic symbol. Common Ethiopic symbols are demonstrated in the following figure:
Issues/Questions:
Foreign language words or phrases are regularly found inline within a paragraph of Ethiopic text, often bounded within enclosing punctuation such as brackets and quotation marks (e.g. []()""''«»‹›). This practice is most often observed in news articles on international topics. The weight of the enclosing punctuation may be found matching either the Ethiopic or the Latin weight. The preference of stakeholders must be determined here. Comparative samples follow:
As a rule within the embedded foreign script, the weight of punctuation and other symbols (numbers, etc) should be in keeping with the weight of the foreign text and not that of the surrounding Ethiopic.
Issues/Questions:
In modern literature, where punctuation may be borrowed from Western writing, inconsistent formatting practices are found with respect to the presence of Ethiopic Wordspace (፡) alongside borrowed punctuation. It is helpful to establish rules for Ethiopic Wordspace in the presence of other symbols so that software grammar and formatting checkers can offer corrections leading to better quality and more consistent literature. The following rules are proposed:
Issues/Questions:
Rules are presented here to aid layout software that would offer the functionality of space symbol conversion to and from Ethiopic wordspaces. This functionality is desirable in a viewer application (e.g. web browser, eBook reader) to make the same substitution as per a user preference. Thus the user would be able to read a document with Ethiopic wordspaces that was composed and delivered with white space. Likewise in the reverse, a user who preferred white space could have their preference supported in a document that encodes Ethiopic wordspaces only. Similarly, this functionality would be useful to users of an editor application.
Develop rules for space-wordspace substitution (e.g. context for when not to substitute a space for wordspace).
This could lead to a CSS word-separator property that JavaScript could manipulate to toggle the wordspace.
TBD: Test cases should be developed to validate these rules.
Space to Wordspace Transformation Rules
Wordspace to Space Transformation Rules
An additional wordspace conversion rule that is independent of the above space-wordspace substitution rules: very commonly in Ethiopic documents a sequence of two wordspaces is found in place of an Ethiopic fullstop; such a sequence may be replaced with the Ethiopic fullstop. This may be considered a defect correction rule.
Note that hyphenation is not required when wordspace is present, since words may split anywhere. Would stakeholders still desire the hyphen nonetheless?
The Ethiopic gemination mark, ጥበቅ, is almost universally found at a fixed height above the baseline in typeset literature. The mark’s position must then be fixed so that it remains above the tallest Ethiopic letter symbol; this produces a variable height gap between the top of the letter and the mark. Conversely, when the symbol is handwritten (often above typeset text) the mark will be found at a variable height above the baseline, demonstrating a fixed height above the letter symbol. Quite possibly the former style is an artifact of a limitation of the layout technology employed, and the latter representative of an author’s desired rendering.
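The defect correction described above can be sketched in a few lines of Python. This is a minimal illustration rather than a normative rule: the function name `fix_double_wordspace` is hypothetical, and leaving runs of three or more wordspaces untouched is an assumption made here out of caution.

```python
import re

WORDSPACE = "\u1361"  # ETHIOPIC WORDSPACE
FULL_STOP = "\u1362"  # ETHIOPIC FULL STOP

# Match exactly two adjacent wordspaces. Longer runs are left alone
# (an assumption of this sketch, not a rule stated in the document).
_DOUBLE = re.compile(f"(?<!{WORDSPACE}){WORDSPACE}{{2}}(?!{WORDSPACE})")


def fix_double_wordspace(text: str) -> str:
    """Replace a pair of wordspaces with a single Ethiopic full stop."""
    return _DOUBLE.sub(FULL_STOP, text)
```

A checker could offer this as a suggested correction rather than applying it silently, since the document leaves the stakeholder preference open.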
Issues/Questions:
Parenthetical expressions are found regularly in modern Ethiopic writing and will apply any of the enclosing symbol pairs: // , () and [].
Issues/Questions:
Classical Ethiopic literature applying quotation marks will employ double guillemets (« ») in a primary style and single guillemets (‹ ›) in a secondary style. Single guillemets will be used for inner-quotation and single word quotation. Modern Ethiopic writing will additionally utilize Latin quotation marks similarly (“ ” ‘ ’, U+201C, U+201D, U+2018, U+2019). The choice of Latin script quotation may represent either an author preference or a software limitation that made guillemets unavailable or difficult to access.
Issues/Questions:
Both pre-composed and mid-line ellipses are found in Ethiopic literature. The presence of one over the other may simply be an artifact of the publishing technology and not necessarily in line with the publisher’s preference.
Issues/Questions:
Discuss: Do regular rules apply? One consideration for Ethiopic would be that a newline is not a dependable word boundary in the common case where words are split across lines without a hyphen symbol and white space is the default wordspace. Word boundaries are known here only by context.
Discuss: The special case of wordspace sticking to a word, leading to the mouse text selection rule that a following wordspace should be automatically selected with the text, analogous to the rule applied to white space. MS Word does this.
When Ethiopic Wordspace (፡) is used to separate words, there may still be some valid application for white space “ ”. White space is permissible to support the following formatting needs:
[TBD: Image samples needed]
Issues/Questions:
Sequences of Ethiopic numerals, such as years and page numbers, may be written in one of two styles. In the most common style in modern literature the numerals are written as discrete, independent, symbols. In a second “joining” style of writing, primarily found in calligraphic text and handwriting, the numerals may share a common upper and lower bar. Conceivably the joining style went out of favor as it proved more difficult to support in publishing technology. Modern preferences should be determined from stakeholders.
Issues/Questions:
In the Ethiopic numeral system a single symbol may represent a number whose order of magnitude is 10 raised to the power 0, 1, 2 or 4 (units, tens, one hundred and ten thousand). This feature of the numeral system leads to several potential layout possibilities when numerals are arranged vertically.
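To make the four orders of magnitude concrete, the following Python sketch converts a positive integer to Ethiopic numerals. It follows the editor's reading of the ethiopic-numeric algorithm in CSS Counter Styles Level 3 and should be checked against that specification; the function name `to_ethiopic` is hypothetical.

```python
# Units 1-9 occupy U+1369..U+1371 and tens 10-90 occupy U+1372..U+137A.
UNITS = [""] + [chr(0x1369 + i) for i in range(9)]
TENS = [""] + [chr(0x1372 + i) for i in range(9)]
HUNDRED = "\u137b"       # ETHIOPIC NUMBER HUNDRED
TEN_THOUSAND = "\u137c"  # ETHIOPIC NUMBER TEN THOUSAND


def to_ethiopic(n: int) -> str:
    """Convert a positive integer to a string of Ethiopic numerals."""
    if n < 1:
        raise ValueError("Ethiopic numerals have no zero or negative forms")
    if n == 1:
        return UNITS[1]
    # Split into two-digit groups, least significant group first.
    groups = []
    while n:
        groups.append(n % 100)
        n //= 100
    top = len(groups) - 1
    parts = []
    for index, value in enumerate(groups):
        odd = index % 2 == 1
        # A group's digits are dropped when it is zero, when it is a leading
        # 1, or when it is a 1 in a hundreds position.
        drop = value == 0 or (index == top and value == 1) or (odd and value == 1)
        text = "" if drop else TENS[value // 10] + UNITS[value % 10]
        if odd:
            if value != 0:
                text += HUNDRED
        elif index != 0:
            text += TEN_THOUSAND
        parts.append(text)
    return "".join(reversed(parts))
```

For example, `to_ethiopic(2016)` yields ፳፻፲፮, read as twenty, hundred, ten, six.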
Issues/Questions:
An ordinal is formed in Amharic when "ኛ", and in Tigrinya when "ይ", follows a cardinal number. The ordinal marker is often, but not always, rendered in superscript form. The superscript practice is most prevalent with ordinals in western numerals, but is also applied with Ethiopic numerals.
Issues/Questions:
Classical Ethiopic writing does not feature a letter shape stylistic change to communicate word emphasis. In religious works, the color red will be used to emphasize a spiritual aspect of a word in a passage.
Emphasis in modern Ethiopic writing will employ every emphasis device offered by the available publishing technology (e.g. underline, slant, embolden, letter size, letter outline, background shapes, etc.). The practice, however, is idiosyncratic and inconsistently applied, leading to debate and disagreement within the publishing community.
The following subsections present a proposed best practice of the authors.
Within a paragraph, a reference to a book title, or section title is given in italic. Words may also be emboldened to emphasize importance.
[TBD: Examples Needed]
Issues/Questions:
In religious literature, certain words or phrases with a spiritual or more holy aspect may be colored in red. The practice is context dependent and a word that will appear in red in one sentence may not be red in the next (or elsewhere in the same sentence).
Issues/Questions:
As a rule, a wordspace following a word that is emphasized in any way (color, bold, italic, underline, etc.) shall receive the same emphasis. This is in keeping with the Ge’ez literature tradition.
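A formatter could apply this rule by widening each emphasized range to take in a trailing wordspace. The Python sketch below assumes emphasis is tracked as half-open `(start, end)` character offsets into the text; that representation and the name `extend_emphasis` are illustrative assumptions, not part of this document.

```python
WORDSPACE = "\u1361"  # ETHIOPIC WORDSPACE


def extend_emphasis(text, spans):
    """Grow each (start, end) span by one character when the character
    immediately after the span is an Ethiopic wordspace."""
    extended = []
    for start, end in spans:
        if end < len(text) and text[end] == WORDSPACE:
            end += 1
        extended.append((start, end))
    return extended
```

The same check would apply regardless of the emphasis device in use (color, bold, italic or underline), since only the range boundaries change.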
Issues/Questions:
Abbreviations in Ethiopic languages will apply an abbreviation marker ("/" or ".") placed between the first letters of each word in a phrase. A single-word abbreviation that applies a right side truncation (vs mid) will apply a "." as the abbreviation marker. In multi-word abbreviations the last word may remain whole. Two letter words are only abbreviated in a multi-word abbreviation.
Examples:
Single Word
ሚኒስትር ⇒ ሚ/ር
ሆስፒታል ⇒ ሆ/ል
Multi Word
ጠቅላይ ሚኒስትር ⇒ ጠ/ሚ/ር
ኢትዮጵያ ኦርቶዶክስ ተዋሕዶ ቤተ ክርስቲያን ⇒ ኢ/ኦ/ተ/ቤ/ክ
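The examples above can be reproduced with a small Python sketch. It covers only the mid-truncation form (first and last letter) and the initial-letter form; the function names and the `truncate_last` flag are hypothetical, and the sketch assumes each Ethiopic syllable is a single code point with no combining marks.

```python
def abbreviate_word(word: str, marker: str = "/") -> str:
    """Mid truncation of a single word: first letter, marker, last letter."""
    return word[0] + marker + word[-1]


def abbreviate_phrase(phrase: str, marker: str = "/",
                      truncate_last: bool = False) -> str:
    """Join the first letters of each word with the marker. When
    truncate_last is set, the final word takes its mid-truncated form."""
    words = phrase.split()
    parts = [w[0] for w in words[:-1]]
    last = words[-1]
    parts.append(abbreviate_word(last, marker) if truncate_last else last[0])
    return marker.join(parts)
```

With these definitions, `abbreviate_phrase("ጠቅላይ ሚኒስትር", truncate_last=True)` gives ጠ/ሚ/ር. Which form an author intends cannot be inferred from the words alone, so the choice is left to the caller.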
Issues/Questions:
Any number of kerning pairs and ligatures are possible for Ethiopic typography that would lead to better visual quality of printed literature. While beyond the present scope of this task force, raising the topic with stakeholders to gauge the level of interest would be beneficial to help set the direction of future work.
An assumption here is that ligatures are only relevant to the reproduction of calligraphical manuscripts and not a requirement of modern literature.
Issues/Questions:
Word processors and text readers such as web browsers, eBook devices, etc. will automatically format the sentences of a paragraph over a number of lines as allowable by the available width of the viewing area. These software systems apply formatting rules that govern where a line may end and a new line begin. Line breaking rules for Ethiopic are expressed with rules for how a line may start. A line may start with:
Issues/Questions:
In classical Ethiopic writing the wordspace separator symbol (፡) negated the need for word hyphenation across a line of text. When wordspace fell out of favor in modern writing the practice of splitting a word across lines of text continued. The reader would know to mentally reconstruct the word by relying on the knowledge of lexicon and context.
Issues/Questions:
When space (U+0020) is used as the word separator in Ethiopic text, the line spacing rules applicable to western text may be applied to meet user expectations.
Issues/Questions:
Since the arrival of the printing press in Ethiopia in 1863 (Pankhurst, 1998), full justification of Ethiopic has been a common typesetting practice in Ethiopian, and later Eritrean, publishing houses. Earlier, Ethiopic justification rules are a feature of Hiob Ludolf’s Historia Æthiopica, which is noted as the first use of movable type for Ethiopic script (Ludolf, 1681). Prior to letterpress typography, calligraphic manuscripts rendered on parchment also featured full, or approximately full, justification, though this likely reflects the scribe’s desire not to waste a millimeter of available lateral writing space.
The placement of Ethiopic wordspace presents a complication to the justification of Ethiopic text. Two placement styles developed in typeset literature which will be referred to here as “word bound” and “centered” styles. Additionally, the word spacing following an Ethiopic fullstop may (or may not) be governed by a special rule and in combination with the two wordspace spacing styles. These spacing rules are discussed in the following sections.
In keeping with line justification for Latin script, the non-printed or “blank space” (space and gaps) between words is treated as stretchable. The width of the space symbol itself will be elongated to some aesthetic width value that may vary from space symbol to space symbol across a printed line. In Ethiopic justification, the blank space between the Ethiopic word separator and the words it separates is likewise allowed to stretch. This stretching of blank space may be either symmetrical (“centered”) or asymmetrical, but in the latter case space stretching is always between the right side of the separator and the following word, referred to here as “word bound”.
In “word bound” justification the word separator, which may be either a punctuation symbol or U+1361 ETHIOPIC WORDSPACE [፡], appears to adhere to the word to the left as if it were its final character. Fig. 21, Ethiopic justification in word bound style (Erikson, 1921 (1913 EC)), and Fig. 20, Ethiopic Justification in Historia Æthiopica (Ludolf, 1681), both illustrate the word bound style.
In the second major form of Ethiopic justification the blank space around word separators is stretched equally on both the left and right sides, giving the appearance of the separator being centered between the words it divides. Fig. 22, Ethiopic justification in centered style (Gubenya, 1973 (1966 EC)), depicts the blank space stretching in both forms of justification using Ethiopic Wordspace as an example, though the stretching rules apply equally to Ethiopic punctuation as well.
To further illustrate the justification spacing applied to both Ethiopic punctuation and wordspace, Fig. 23, Depiction of blank space around Ethiopic wordspace for three modes of text justification, presents blank space stretching from the point of view of the symbol’s typographic bounding box. Here the “design blank space”, the space between the visible symbol and the box border, is itself stretched as needed to meet line justification.
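The difference between the two styles can also be expressed as a small calculation. The Python sketch below distributes a line's leftover width across its separators and returns the blank space to add on the left and right of each one. It is a simplification under stated assumptions: the surplus is shared evenly (in practice the width may vary from separator to separator), and the name `separator_padding` and its style keywords are hypothetical.

```python
def separator_padding(extra, count, style="word_bound"):
    """Return a list of (left, right) padding amounts, one per separator.

    extra -- leftover line width to absorb, in any unit
    count -- number of word separators on the line
    style -- "word_bound" stretches only to the right of each separator;
             "centered" stretches equally on both sides
    """
    if count <= 0:
        return []
    share = extra / count
    if style == "word_bound":
        return [(0.0, share)] * count
    if style == "centered":
        return [(share / 2, share / 2)] * count
    raise ValueError("unknown justification style: " + style)
```

With 12 units of surplus over three separators, the word bound style gives each separator 4 units on its right, while the centered style gives 2 units on each side.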
Issues/Questions:
In the regular mode of Ethiopic justification (both forms) U+1362 ETHIOPIC FULL STOP [።] will be treated equally with all other punctuation symbols. In a second mode, the Ethiopic full stop will have special spacing rules applied to it whereby more separation space is allowed following the symbol and the start of the next word. In a sense, the right side space of the full stop is “more elastic” than in the regular mode. The elasticity rule and the visual effect are similar to that of the final line of a fully justified paragraph in Western text. When the final line of a paragraph of Latin script crosses a certain horizontal threshold, the line will become fully justified. Below that threshold the line will appear left aligned. The same rule appears to be applied to the Ethiopic full stop but on any line of the paragraph. An illustration of this sub-mode is depicted in the following:
Issues/Questions:
To date, computer software that typesets text has applied justification rules for blank space stretching that were designed to meet publishing requirements in the Western world. When the same rules are applied to Ethiopic text, the results are unsatisfactory as they do not meet user expectations. Largely responsible for the formatting dissonance when Western justification is applied to Ethiopic text is the absence of a white space symbol in the writing system. There is no explicit white space symbol (in classic Ethiopic writing) to be “stretched”.
Formatting algorithms will then process U+1361 ETHIOPIC WORDSPACE [፡] as a punctuation symbol where word enclosing rules, rather than word spacing rules, will be applied. While still stretchable, “white space” in the Ethiopic wordspace is implicit rather than explicit. For a complete solution, software will ultimately need to be enhanced to stretch implicit space as required. Reclassifying the Ethiopic wordspace as a “Zs” symbol is expected to help alleviate justification issues and clear the way for software firms to implement comprehensive support for Ethiopic justification. Since the Ethiopic wordspace interferes with justification in present day software, authors may opt not to use it or may “pad” wordspace and Ethiopic punctuation with explicit white space to produce the desired justification style (i.e. Word Bound or Centered). To properly render text formatted in this way, future “wordspace aware” software should elide spaces bordering Ethiopic wordspace and punctuation when producing justified text.
The following samples depict formatting of Kidane Wolde Kifle’s seminal work Maṣḥafa Sawāsew with a popular word processor (Kifle, 1955 (1948 EC)) under the limitations of Western justification spacing rules.
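The elision step suggested above might look like the following Python sketch. It assumes that only U+0020 SPACE is used for padding and that the affected symbols are U+1361 through U+1368; both choices, and the name `elide_padding`, are assumptions of this illustration.

```python
import re

# Spaces on either side of Ethiopic wordspace or punctuation
# (U+1361..U+1368, an assumed range for this sketch).
_PADDED = re.compile(" *([\u1361-\u1368]) *")


def elide_padding(text: str) -> str:
    """Remove explicit spaces that border Ethiopic wordspace or punctuation."""
    return _PADDED.sub(r"\1", text)
```

A justifier would run this before measuring the line, then reintroduce blank space according to the chosen style.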
In digital documents such as in web pages and eBooks, it is recommended that the appearance of either U+0020 SPACE [ ] or U+1361 ETHIOPIC WORDSPACE [፡] be configurable as a user preference. An easy to access “space” toggle button would enhance a viewing application’s usability.
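A viewer could back such a toggle with a pair of substitutions. The Python sketch below is deliberately naive: it converts a separator only when it sits directly between two Ethiopic letters, which is an assumption of this example and not one of the transformation rules this document has yet to define. The function names and the letter ranges chosen are likewise illustrative.

```python
import re

# Assumed letter ranges: syllables of the Ethiopic block plus the
# Supplement, Extended and Extended-A blocks.
_LETTER = "[\u1200-\u135a\u1380-\u138f\u2d80-\u2ddf\uab00-\uab2f]"
_SPACE_BETWEEN = re.compile(f"(?<={_LETTER}) (?={_LETTER})")
_WORDSPACE_BETWEEN = re.compile(f"(?<={_LETTER})\u1361(?={_LETTER})")


def show_wordspace(text: str) -> str:
    """Display U+0020 between Ethiopic letters as Ethiopic wordspace."""
    return _SPACE_BETWEEN.sub("\u1361", text)


def show_space(text: str) -> str:
    """Display Ethiopic wordspace between Ethiopic letters as U+0020."""
    return _WORDSPACE_BETWEEN.sub(" ", text)
```

Because separators next to punctuation or foreign text are left untouched, the two functions are not guaranteed to be exact inverses of each other on arbitrary input.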
Paragraph indentation is a modern Ethiopic practice. The initial paragraph of a section is sometimes not indented. This practice may be idiosyncratic to an author, or it may represent a convention in use by a publishing house. A best practice supported by stakeholders should be established here.
Issues/Questions:
Bullet lists are utilized regularly in Ethiopic literature. Authors using a computer or typewriter will work with the list marker symbols made available by their software or machine. Many marker, or “bullet”, symbols are accepted for Ethiopic literature though not all will be considered optimal.
Issues/Questions:
⬩ (U+2B29)
◆ (U+25C6)
⬥ (U+2B25)
❖ (U+2756)
♦ (U+2666)
In Ethiopic ordered lists a number of symbols are used for the counter suffix. For example: "/" , "፦" , "." , ")" and even "፡" (Ethiopic Wordspace).
Issues/Questions:
The Ethiopic corpus presents lists with two styles of alignment: left alignment at the list counter, or alignment along the counter suffix. Layout software will align a list at the suffix, in keeping with the latter style. The former style (left justified at the counter) may reflect a limitation of the layout technology employed and not a preference of the author, copy editor or typesetter. A depiction of these two alignment styles is presented in the following figures:
Issues/Questions:
Inlined enumerated lists are commonly found in Ethiopic documents. Inline lists will follow the same sequences as regular lists. However, the spacing after the counter suffix may be different. Typically a regular keyboard space is observed following the suffix, if any. This is most likely a matter of convenience for the author and not necessarily representative of good formatting.
Issues/Questions:
Ethiopic literature uses ordered numbered lists with both the Ethiopic and Western numeral systems. Ethiopic numeral lists are addressed in the Ethiopic Numeric Counter Style section of the CSS Counter Styles Level 3 specification.
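For readers unfamiliar with how Ethiopic numerals are composed, the following Python sketch follows one reading of the `ethiopic-numeric` algorithm in CSS Counter Styles Level 3: digits are grouped in pairs from the least significant end, ፻ (U+137B) closes odd-indexed groups and ፼ (U+137C) closes even-indexed groups above the lowest. The function name `to_ethiopic_numeral` is hypothetical, and the specification remains the authoritative definition.

```python
# Ethiopic digits one..nine (U+1369..U+1371) and tens ten..ninety (U+1372..U+137A).
ONES = [""] + [chr(0x1369 + i) for i in range(9)]
TENS = [""] + [chr(0x1372 + i) for i in range(9)]
HUNDRED = "\u137B"        # ETHIOPIC NUMBER HUNDRED
TEN_THOUSAND = "\u137C"   # ETHIOPIC NUMBER TEN THOUSAND


def to_ethiopic_numeral(n: int) -> str:
    """Convert a positive integer to Ethiopic numerals (hypothetical helper)."""
    if n < 1:
        raise ValueError("Ethiopic numerals are defined for positive integers")
    if n == 1:
        return ONES[1]

    # Split into two-digit groups, least significant group first (index 0).
    groups = []
    while n > 0:
        groups.append(n % 100)
        n //= 100
    top = len(groups) - 1

    parts = []
    for index, value in enumerate(groups):
        # Digits are dropped for zero groups, for a leading group equal to 1,
        # and for odd-indexed groups equal to 1.
        drop = value == 0 or (value == 1 and (index == top or index % 2 == 1))
        digits = "" if drop else TENS[value // 10] + ONES[value % 10]
        if index % 2 == 1:
            separator = HUNDRED if value != 0 else ""
        elif index > 0:
            separator = TEN_THOUSAND
        else:
            separator = ""
        parts.append(digits + separator)

    # Most significant group comes first in the output.
    return "".join(reversed(parts))
```

For example, `to_ethiopic_numeral(1958)` yields ፲፱፻፶፰, the form in which that year appears in the source citations of this document.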
Issues/Questions:
The Unicode standard encodes Ethiopic syllables for many languages, past and present, that use the Ethiopic script. Alphabetic lists are commonplace in Ethiopic literature, but will conform to the letter inventory of the language of the surrounding content. The W3C’s Internationalization Working Group publishes alphabetical counter style code snippets for a large number of languages using Ethiopic script. Many of these lists are believed to be only hypothetical: they are based upon the letter inventory of the identified languages but may not have been used in practice. The alphabetical counter styles specified here encompass a smaller collection of languages with a demonstrated requirement, either found by example in corpus or reported through stakeholder input.
Ge’ez | Amharic | Blin | Tigrinya (Eritrean) | Tigrinya (Ethiopian) |
---|---|---|---|---|
 | | | | |
Issues/Questions:
The አበገደ ordering of the Ethiopic syllabary is an alignment with the Coptic and Greek alphabets, possibly to facilitate interdenominational communication or the transfer of gematria practices. The ordering is used today largely for pedagogical purposes and has been used by some authors for the collation of entire works such as dictionaries. More often, authors apply the ordering to list counters.
The አበገደ ordering is potentially desirable to any language using the Ethiopic syllabary. The ordering is less likely to be found in the writing practices of languages that have a written tradition of under a hundred years. The language specific orders shown here are only those found utilized in corpus.
Ge’ez | Amharic | Tigrinya (Eritrean) | Tigrinya (Ethiopian) |
---|---|---|---|
 | | | |
Issues/Questions:
An observed formatting practice is the start of a following paragraph inline with the last list item. The paragraph may flow immediately from the last item, or some indentation may be applied. This practice is illustrated in the following figure:
Issues/Questions:
Discuss: Basic common templates, Page elements, Page-level directionality, Bidirectional characteristics, Arrangement of elements, Text columns, Header, Footer, Illustrations, Tables, Page numbers, Margins, Positioning and arrangement of content, Pagination rules, Specimens and examples.
Page and section numbering can be discussed here. The user (author) should be able to set the numeral system with a CSS property. A tooltip that appears when hovering over an Ethiopic page number and presents the Western equivalent would be useful. Are there any rules on page number position?
Proper document layout is very important for religious works in the Ge’ez traditions. Certain works like homiliaries (such as ድርሳነ፡ሚካኤል) are consistently formatted in two columns and the Synaxarium (መጽሐፈ፡ስንክሳር) in three. Margins in this class of literature most commonly exhibit a 1x2x4 ratio, where the top and fore edge margins are twice that of the gutter and half that of the bottom (depicted in the following figure).
These practices are not well understood by the authors of this document, and comprehensive input is sought from experts.
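The 1x2x4 margin ratio can be made concrete with a small Python sketch. It assumes the gutter is the base unit, as the description above implies; the `Margins` class and the `margins_from_gutter` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Margins:
    """Page margins in any consistent unit (hypothetical container)."""
    gutter: float
    top: float
    fore_edge: float
    bottom: float


def margins_from_gutter(gutter: float) -> Margins:
    """Derive margins in a 1x2x4 ratio from the gutter width.

    Top and fore edge are twice the gutter; the bottom is four times the
    gutter, which makes the top and fore edge half of the bottom.
    """
    return Margins(
        gutter=gutter,
        top=2 * gutter,
        fore_edge=2 * gutter,
        bottom=4 * gutter,
    )
```

With a gutter of 10 units, this gives top and fore edge margins of 20 units and a bottom margin of 40 units.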
Issues/Questions:
TBD: Note the practice of using two numeral systems for page numbering in books. For example, Ethiopic numerals may be used for preface paging in documents whose main page numbers use Western numerals. This is analogous to the practice in Western literature of using Roman numerals for preface numbering and Western numerals afterwards. A general statement should be made in this document that Ethiopic numerals should replace Roman numerals where the latter is a default. Additionally, note Desta Tekle Wold's አበገደ based preface numbering in "ዐዲስ፡ያማርኛ፡መዝገበ፡ቃላት።".
መጽሐፍ
አርእስት
መታሰቢያ
መስታወሻ
ምስጋና
ማውጫ
ቅድመ መቅድም
መቅድም
መግቢያ
ክፍል ፩
ምዕራፍ ፩
ምዕራፍ ፪
ምዕራፍ ፫
ክፍል ፪
ምዕራፍ ፬
ምዕራፍ ፭
ምዕራፍ ፮
ክፍል ፫
⋮
ሙዳዬ ቃላት
ዋቢ መጻሕፍት
መጠቍ ም
Issues/Questions:
[Consider if all or part of this section should be integrated with B. Alignment with HTML5 Layout & Formatting]
The default stylistic changes (size, weight) that word processors and web browsers apply to section headings in Roman script are generally applicable to Ethiopic literature. In classic and modern layout practices applied to books and magazines, the document and chapter titles are centered.
The use of underlining and color changes is not recommended for Ethiopic headings, as they are not a traditional practice.
The following presents the six heading levels defined in the HTML standard and their applicable context in Ethiopic literature. Samples are provided here for review and consideration of the default settings for letter sizing and line spacing.
Relative sizes for comparison: አርእስት አርእስት አርእስት አርእስት አርእስት አርእስት
The following illustrates vertical spacing of the heading sizes. Heading and paragraph blocks are highlighted to illuminate spacing boundaries.
ይህን መጽሐፍ ለመጻፍ ያሰብኩበት ምክንያት የሮማ ልዑካን ባጼ ልብነ ድንግል በሺ፭፻፲፯ ዓ.ም ...
Issues/Questions:
A formally recognized standard for bibliographic citation of Ethiopic publications is not found in the Ethiopian publishing community, and bibliographic convention is left to the discretion of individual authors. Establishing a standard is recommended by the present authors and will aid in document consistency and in the machine processing of reference citations. To address a book citation convention, a strong starting point is available from the work of a recognized subject matter expert, Dereje Gebre of the AAU Amharic Language Department, and past Vice President of the Ethiopian Writer’s Association. Professor Dereje employs the following convention:
<Citation> ::=
<Author Full Name> "፤"
<Publication Date> "፤"
<i><Title></i>
<City> "፤"
<Publisher> "።"
where
<Title> ::= <Terminated Title> | ( <Unterminated Title> "።" )
<Terminated Title> ::= <Unterminated Title> [፡፤።?]
<Unterminated Title> ::= <Text> [:Letter:]
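The grammar above can be exercised with a short Python sketch. The function names `terminate_title` and `format_citation` are hypothetical. The grammar does not state how tokens are spaced, so the single space after each separator is an assumption, and the italic styling of the title is omitted because the output is plain text.

```python
SEMICOLON = "\u1364"   # ፤ ETHIOPIC SEMICOLON, separates citation fields
FULL_STOP = "\u1362"   # ። ETHIOPIC FULL STOP, ends the citation
# Characters that already terminate a title: ፡ ፤ ። and "?".
TITLE_TERMINATORS = ("\u1361", "\u1364", "\u1362", "?")


def terminate_title(title: str) -> str:
    """Return the title unchanged if already terminated, else append ።."""
    title = title.strip()
    if title.endswith(TITLE_TERMINATORS):
        return title
    return title + FULL_STOP


def format_citation(author: str, date: str, title: str,
                    city: str, publisher: str) -> str:
    """Assemble a book citation in the field order given by the grammar.

    Assumption: one space follows each separator; the grammar is silent
    on whitespace.
    """
    return (
        f"{author}{SEMICOLON} {date}{SEMICOLON} "
        f"{terminate_title(title)} "
        f"{city}{SEMICOLON} {publisher}{FULL_STOP}"
    )
```

Because a title may carry its own closing punctuation, no field separator is written after it; the terminator alone marks the boundary between title and city.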
Issues/Questions:
Term | Amharic | Tigrinya | Definition |
---|---|---|---|
Ge’ez | ግዕዝ | ግዕዝ | The name of both the ancient Semitic language of northern Ethiopia and Eritrea and the corresponding syllabic writing system. Also known as “Ethiopic”. It survives today as the liturgical language of the Eritrean and Ethiopian Orthodox Churches. |
text block | TBD | TBD | The part of the page normally occupied by text. |
justify | TBD | TBD | To adjust the length of the line so that it is flush left and right on the measure. |
measure | TBD | TBD | The standard length of the line, i.e. the column width or the width of the overall text block. |
Ethiopic Wordspace | ሁለት ነጥብ | ክልተ ነጥቢ | The printed word separator in Ethiopic literature depicted by two vertical dots (U+1361). |
Ethiopicized | TBD | TBD | The stylization of Western symbols (usually punctuation and numerals) to match the strokes and weight of an Ethiopic typeface. |
Classical Ethiopic | TBD | TBD | In the scope of this specification, Classical Ethiopic refers to the set of practices observed in the mechanical printing of Ethiopian and Eritrean literature through the end of the imperial era. Literature at the start of this era begins as an outgrowth of scribal practices, and in the latter half it is more uniform in layout and editorial quality, which is likely the result of state control over publishing. |
Modern Ethiopic | TBD | TBD | Ethiopic manuscripts published primarily after the reign of Emperor Haile Selassie, where writing practices have become less adherent to the Ge’ez tradition and more pragmatic, so as to accommodate the constraints and limitations imposed by mass media. |
Yaredic Zaima Notation | ያሬዳዊ ዜማ ምልክቶቻ | TBD | The system of marking intonation in Ge’ez hymnody devised by the 6th century Saint Yared of Axum. |
Acknowledgment | ምስጋና | ምስጋና | |
Author’s Note | መስታወሻ | Also የአሳታሚው ማስታወሻ | |
Bibliography | ዋቢ መጻሕፍት | TBD | |
Chapter | ምዕራፍ | TBD | |
Dedication | መታሰቢያ | መታሓሳሰቢ | |
Foreword | ቅድመ መቅድም | TBD | |
Glossary | ሙዳዬ ቃላት | TBD | |
Index | መጠቍ ም | ኃባሪ ኣርእስቲ ገጽ | |
Introduction | መግቢያ | TBD | |
ISBN | መዓመቍ | TBD | የመጽሐፉ ዓለምአቀፍ መለያ ቍጥር |
Part | ክፍል | TBD | |
Preface | መቅድም | TBD | Sometimes “መግለጫ ” in older books |
Table of Contents | ማውጫ | TBD | Same as “መክሥተ አርእስት”? |
Title | አርእስት | TBD | |
TBD | መሳሰቢያ | Same as “ማሳሰቢያ”? | |
TBD | ማስታወቂያ | "Advertisement" resolve this with “Author’s Note”. Sometimes seen as “ማስተዋወቂያ”. |
This appendix is introduced to help ensure that the Ethiopic layout requirements in this recommendation have sufficient and practical coverage. HTML5 is used here for comparison because it is anticipated to be the document language to which the recommendation will most frequently be applied. HTML5 elements will be reviewed in this appendix, with remarks indicating one of the following: no recommendation is needed (Western defaults are applicable), a document section is identified that covers the element in the Ethiopic context, or a gap in coverage is identified and will be addressed.
Special thanks to the following people who contributed to this document (contributors’ names listed in alphabetical order).
This Person, That Person, etc
Please find the latest info of the contributors at the GitHub contributors list.
Transcribed Citation:
Fəqər əskä Mäqabər, H. Alemayehu. Berhanenna Selam Printing Enterprise, 1965. Addis Ababa.
Source Citation:
ፍቅር፡እስከ፡መቃብር፣ ሀዲስ አለማየሁ። ብርሃንና ሰላም ማተሚያ ድርጀት፣ ፲፱፻፶፰። አዲስ አበባ።
Transcribed Citation:
Tegbarawi Yetsihifet Memariya, D. Gebre. Commercial Printing Enterprises, 2004. Addis Ababa.
Source Citation:
ተግባራዊ፡የጽህፈት፡መማሪያ፣ ደረጀ ገብሬ። ንግድ ማተሚያ ድርጅት፣ ሚያዝያ 1996። አዲስ አበባ።
[TBD: Correlate the image references with Figure numbers]
Transcribed Citation:
Yeografi LeEthiopia Lijoch Tiqim, O. Erikson. Page 35. Swedish Mission, 1921 (1913 EC). Asmara.
Source Citation:
የኦግራፊ። ለኢትዮጵያ፡ልጆች፡ጥቅም፣ ኤሪክሶን። ገጽ ፴፭። የሚስዮንግ፡ስዌዱኣማህተም፡ታተመች፣ ፲፱፻፲፫። አስመራ።
Transcribed Citation:
Alweledem, A. Gubenya. Page 87. Brana Publisher, 1973 (1966 EC). Addis Ababa.
Source Citation:
አልወለደም፣ አቤ ጉበኛ። ገጽ ፹፯። ብራና ማተሚያ ደርጅት ታተመ፣ ፲፱፻፷፮። አዲስ አበባ።
Transcribed Citation:
Ḥaṣir Tarix Nebiy Muḥamed, J.H.A. Itedalewe. Page 35. Selam Printing House, 1987 (1979 EC). Asmara.
Source Citation:
ሐጺር ታሪኽ ነቢይ ሙሐመድ (ሰለላሁ ዓለይሂ ወሰለም) ብጅብሪል፣ ጅብሪል ሐጂ አቡበከር እተዳለወ። ገጽ 35። ቤት ማኅተም ሰላም፣ ፲፱፻፸፰። አሥመራ።
Transcribed Citation:
Maṣḥafa Sawāsew Wages Wamazgaba Qālāt Hadis, K.W. Kifle. Pages 65 & 159. Artistic Printers, 1955 (1948 EC). Addis Ababa.
Source Citation:
መጽሐፈ፡ሰዋስው፡ወግስ፡ወመዝገበ፡ቃላት፡ሐዲስ፣ ኪዳነ፡ወልድ፡ክፍሌ። ገጾች ፷፭ እና ፻፶፱። አርቲስቲክ፡ማተሚያ፡ቤት፣ ፲፱፻፵፰። አዲስ አበባ።
Transcribed Citation:
Maṣḥafa Ṣalot Mes Ser’ate Kiddase Betegreññā, B. Weldemariam. Page 32. Mahbere Haawaryat F-Ha Bet Mahtem Tehatmet, 1995 (1988 EC). Asmara.
Source Citation:
መጽሐፈ ጸሎት ምስ ሥርዓተ ቅዳሴ ብትግርኛ፣ በርሀ ወልደማርያም። ገጽ ፴፪። ማኅበረ ሐዋርያት ፍ-ሃ ቤት ማኅተም ተኀትመት፣ ፲፱፻፹፰። አሥመራ።