This document is also available in this non-normative format: Korean version
Copyright © 2013 W3C® (MIT, ERCIM, Keio, Beihang), All Rights Reserved. W3C liability, trademark and document use rules apply.
This document describes requirements for general Korean language/Hangul text layout and typography realized with technologies like CSS, SVG and XSL-FO. The document is mainly based on a project to develop the international standard for Korean text layout.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This document describes requirements for general Korean language/Hangul text layout realized with technologies like CSS, SVG and XSL-FO. The document is mainly based on a project to develop the international standard for Korean text layout, and was originally developed by Korean typographic experts and standardization experts in Korean, then translated to English under the guidance of the authors. The Korean version of this document is also available, but the English version is the authoritative version. The Working Group expects this document to become a Working Group Note. This document was published by the Internationalization Working Group as a First Public Working Draft. We are looking for comments on the document before final publication. If you wish to make comments regarding this document, please send them to public-i18n-cjk@w3.org (subscribe, archives) before 14 June. All comments are welcome.
Publication as a First Public Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. The group does not expect this document to become a W3C Recommendation. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
Every cultural group has its own language and writing system, and the differences are especially great between East and West. Digitizing a writing system therefore requires data and technology that accurately represent the language and its writing system.
The purpose of this document is to explain the important differences between the Korean writing system and others, so that it can be digitized accurately. However, this document does not provide solutions for actual implementations or problems; it only describes the basic information that should be treated as important issues.
This document was created by discussing the Korean related issues through Korea's Standard Infrastructure Enhancement Program and surveying needs from actual users and technical experts.
The following types of experts were involved in the creation of this document:
The contents of this document are related to the Korean writing system, so all discussions took place in Korean, and were then translated into English once the document had been finished.
The technical terms for discussing and describing the Korean writing system were carefully selected after considering potential differences in nuance even if a direct translation exists, and in some parts they were described in both languages so the discussion can be continued in the future. Also many figures are included to promote understanding for certain parts that are hard to describe in English.
This document describes the characteristics of the Korean language system along the lines of the following principles.
It does not cover every issue of Korean typography, but only the important differences from the Western language systems.
All text in the figures is in Korean, but the technical aspects of actual implementation are not covered by this document.
In order to help readers' understanding of how Korean is used, typical real life examples are provided to explain the core combination properties.
Text layout rules and recommendations for readable design are different matters, but it is hard to discuss these two aspects separately. In this document, these two issues are separated carefully.
This document consists of six parts as below:
The characters used in Hangul Fonts consist of the following: Hangul characters, punctuation marks, Latin alphabetic characters, numbers, and special characters.
The Hangul code ranges in Unicode consist of precomposed Hangul syllables and the Hangul 'Jamo' alphabet. (Refer to Appendix A for the code table.)
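The relationship between the two ranges can be illustrated with a short Python sketch. It decomposes a precomposed Hangul syllable into conjoining Jamo using the arithmetic defined by the Unicode Standard; the function names are chosen for this example only and are not part of any requirement in this document.

```python
# Constants from the Unicode Hangul syllable decomposition algorithm.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
S_COUNT = L_COUNT * V_COUNT * T_COUNT  # number of precomposed syllables


def is_hangul_syllable(ch: str) -> bool:
    """True if ch is a single precomposed Hangul syllable (U+AC00..U+D7A3)."""
    return S_BASE <= ord(ch) < S_BASE + S_COUNT


def decompose(ch: str) -> str:
    """Return the conjoining Jamo sequence for a precomposed syllable."""
    if not is_hangul_syllable(ch):
        return ch  # other characters are returned unchanged
    index = ord(ch) - S_BASE
    lead = L_BASE + index // (V_COUNT * T_COUNT)
    vowel = V_BASE + (index % (V_COUNT * T_COUNT)) // T_COUNT
    tail = T_BASE + index % T_COUNT
    jamo = chr(lead) + chr(vowel)
    if tail != T_BASE:  # T_BASE itself means "no final consonant"
        jamo += chr(tail)
    return jamo
```

For example, `decompose("한")` yields the three Jamo U+1112, U+1161 and U+11AB.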
The following punctuation marks are used in a Hangul environment. (Refer to Appendix A for the code table.)
CJK Symbols and Punctuation (U+3008~U+300F) are used by default. However, in the case of punctuation, half-width punctuation (pause (comma) U+002C, full stop (period) U+002E) is used for horizontal writing, and full-width punctuation (ideographic comma U+3001, ideographic full stop U+3002) is used for vertical writing.
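The choice of comma and full stop by writing direction could be expressed as a simple lookup. The following is a minimal sketch; the table layout and the function name `punctuation_for` are assumptions made for illustration.

```python
# Comma and full stop code points named in the text, keyed by writing direction.
PUNCTUATION = {
    "horizontal": {"comma": "\u002C", "full_stop": "\u002E"},
    "vertical": {"comma": "\u3001", "full_stop": "\u3002"},
}


def punctuation_for(direction: str, mark: str) -> str:
    """Return the character to use for the given mark in the given direction."""
    return PUNCTUATION[direction][mark]
```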
Hangul uses both fixed width and proportional width fonts.
In this mode, Hangul (Hangul Syllables U+AC00 ~ U+D7A3) glyphs use 'character frame width proportional to each letter face width'.
In this mode, Hangul (Hangul Syllables U+AC00 ~ U+D7A3) glyphs use fixed values for character frame width.
Standardization of 'letter face position in character frame' of fixed width Hangul fonts improves the compatibility of the space between Hangul font characters. (The relation between each side's spaces remains even when the Hangul font is changed. It prevents a paragraph's left outline being scattered when the opening quotation mark or parenthesis at the line head has an unexpected space).
In horizontal writing, the letter face of a full width opening parenthesis is placed on the right end of the character frame, and the left space is considered a user controlled area. In vertical writing, the letter face of a full width opening parenthesis is placed on the bottom end of character frame, and the space is considered a user controlled area.
In horizontal writing, the letter face of a full width closing parenthesis is placed on the left end of the character frame, and the space is considered a user controlled area. In vertical writing, the letter face of a full width closing parenthesis is placed on the top end of the character frame, and the space is considered a user controlled area.
The letter face of punctuation marks is placed at the left end of the character frame. The remaining space is available for use as a controllable margin by the font user.
Other glyphs are designed with either proportional or fixed width.
This is a typographic option for adjusting the kerning between all characters or character classes used in a Hangul environment.
Hangul font kerning is adjusted by considering the inner space and contour region of Hangul syllables that are composed of Hangul Jamos.
Group kerning can be applied for efficient adjustment of kerning for the 11,172 Hangul syllables. In order to apply Hangul group kerning, kerning groups are defined first, then groups are paired.
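One possible way to picture group kerning is sketched below in Python. The grouping scheme (the first syllable keyed by its vowel and whether it has a final consonant, the second syllable keyed by its initial consonant) and the kerning value are hypothetical; actual groups and values are decided by the font designer.

```python
S_BASE = 0xAC00
V_COUNT, T_COUNT = 21, 28  # vowels and finals (including "no final") per initial


def leading_group(ch: str):
    """Group of the first syllable in a pair, keyed by its right-hand shape."""
    index = ord(ch) - S_BASE
    vowel = (index % (V_COUNT * T_COUNT)) // T_COUNT
    has_final = index % T_COUNT != 0
    return (vowel, has_final)


def trailing_group(ch: str) -> int:
    """Group of the second syllable in a pair, keyed by its initial consonant."""
    return (ord(ch) - S_BASE) // (V_COUNT * T_COUNT)


# Hypothetical group-pair table, in 1/1000 em.
GROUP_KERNING = {((0, False), 0): -30}


def kerning(first: str, second: str) -> int:
    """Look up the kerning for a syllable pair through its group pair."""
    return GROUP_KERNING.get((leading_group(first), trailing_group(second)), 0)
```

With this scheme a single table entry covers pairs such as 가가 and 나가, rather than each pair being listed separately.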
Typographic options include the control of characters and the control of paragraphs. In this chapter, the font itself and options in the system for controlling the font are described.
Characters or symbols with same characteristics in the typographic environment are classified, then writing options are applied for each character class.
In a Hangul environment, characters and symbols are classified by typographic characteristics, into 32 classes.
Characters like Hangul, Hanja, or Kana have zero space between characters by default. Additionally, there are inter-character space settings, such as narrower or wider, to support double-sided justification.
Mixed writing is a Hangul sentence with Latin alphabetic characters, numbers, or symbols inside.
Hangul syllables, proportional width basic Latin, and fixed width or proportional width Arabic numerals are used.
Hangul syllables, proportional width basic Latin, and fixed width or proportional width Arabic numerals are used.
Each Latin character is placed vertically.
Latin characters are placed 90 degrees rotated.
Partial horizontal writing in vertical writing; two-digit numbers are rotated 90 degrees as a group, then aligned by the center of the line. Used mainly for two-digit numbers.
Superscripts and subscripts are placed next to a base character. Superscripts and subscripts are used for SI units, numeric and chemical formulae, footnote numbers, etc. The space between the base character and the superscript or subscript is zero by default.
Usually the size of the superscript or subscript is 60~70% of the size of the base character.
Typographic options include the control of characters and the control of paragraphs. This chapter describes the writing/layout/editing system settings that control the font, not the font itself.
Hangul is written both horizontally and vertically.
In vertical writing:
Characters progress from top to bottom, and lines progress from right to left.
Columns progress from top to bottom, pages progress from right to left, and pages are turned from left to right.
In horizontal writing:
Characters progress from left to right, and lines progress from top to bottom.
Columns progress from left to right, and pages progress from left to right, and pages are turned from right to left.
Regular direction is used in horizontal writing, and there are three ways of arranging text in vertical writing.
Each character is arranged in the same direction as the Hangul characters.
Characters are rotated 90 degrees, mainly for Latin alphabet words.
Arrangement the same as Hangul characters. Two digit numbers rotated 90 degrees as a group, then aligned by the center of a line.
Captions or titles of tables, figures, etc. are rotated clockwise or counter-clockwise.
In vertical writing, the caption top of tables, figures, etc. is positioned to the right side of the page. If the caption is in a horizontal direction, the caption top is positioned to the top side of the page.
In horizontal writing, the caption top of tables, figures, etc. is positioned to the left side of the page. If the caption is in a horizontal direction, the caption top is positioned to the top side of the page.
Line adjustment; it is possible to do line adjustment regardless of columns in multi-column layout.
Vertical writing, align vertical lines in each column.
Horizontal writing, align horizontal lines in each column.
Indentation, by emptying the beginning of a line when a new paragraph starts, is applied so that the division into paragraphs is shown clearly. The value of the character width in the specific paragraph is used as the default unit for indentation.
Line head indentations on every paragraph. The most common writing mode.
No indentation on all paragraphs. More suitable for horizontal writing where the length of lines is shorter, relatively, than vertical writing.
Indentation on the first line of every paragraph, but not applied to the first paragraph of a page or the paragraph right after a title.
Indentation is applied to all lines besides the first line of a paragraph for bulleted list, numbered list, etc.
Inward paragraph indentation is writing placed inside of the page body by a certain number of characters, and outward paragraph indentation is writing placed outside of the page body.
Line alignment means aligning a line to the position of a certain character.
Center Alignment: apply zero or a specified value to the space between adjacent characters, and equally apply the same amount of space at both sides of the line. Align to the center of the line.
Line Head Alignment (left side alignment in horizontal writing): apply zero or a specified value to the space between adjacent characters, and align to the line head. If the number of characters in the last line is not enough to be a full line, fill from the line head and empty the line end.
Line End Alignment (right side alignment in horizontal writing): apply zero or a specified value to the space between adjacent characters, and align to the line end. If the number of characters in the last line is not enough to be a full line, fill from the line end and empty the line head.
Even Space Alignment: apply zero or a specified value to the space between adjacent characters, and align by the line head and the line end. If the number of characters in the last line is not enough to be a full line, fill from the line head and empty the line end.
Even Space Alignment with Forced End: apply zero or a specified value to the space between adjacent characters, then align by the line head and the line end. If the number of characters in the last line is not enough to be a full line, fill from the line head to the line end by force-adjusting the space between characters.
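The five alignment modes can be compared with a small Python sketch that returns, for one line, the offset of the first character from the line head and the space placed between adjacent characters. The mode names and the function `align_line` are assumptions for illustration; widths are in arbitrary units.

```python
def align_line(glyph_widths, line_width, mode, is_last_line=False, gap=0.0):
    """Return (offset_from_line_head, inter_character_space) for one line.

    gap is the "zero or specified value" applied between adjacent characters.
    """
    n = len(glyph_widths)
    natural = sum(glyph_widths) + gap * max(n - 1, 0)
    free = line_width - natural  # space left over on the line
    if mode == "center":
        return free / 2, gap  # same amount of space at both sides
    if mode == "head":
        return 0.0, gap  # line end is left empty
    if mode == "end":
        return free, gap  # line head is left empty
    if mode in ("even", "even_forced"):
        # Plain even-space alignment leaves a short last line unadjusted.
        if n < 2 or (is_last_line and mode == "even"):
            return 0.0, gap
        return 0.0, gap + free / (n - 1)
    raise ValueError("unknown alignment mode: " + mode)
```

For nine characters of width 10 on a line of width 100, the even-space modes distribute the remaining 10 units over the eight gaps, giving 1.25 units per gap.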
This is a process to avoid a situation when the number of characters in the last line of a paragraph is lower than the recommended minimum number. This is also called the widow process.
Periods and commas.
Brackets, quotation marks.
The following settings are recommended for automatic space placement (up to 4.3.6).
Opening parentheses, closing parentheses, and middle dots are half-width by default, but a half-width space is inserted before/after if these punctuation marks are used with full-width characters like Hangul, Hanja, or Kana.
Hangul opening parentheses (cl1) must have a half-width space inserted before.
Hangul closing parentheses (cl2) must have a half-width space inserted after.
Middle dots (cl7) must have quarter-width spaces inserted before and after.
When an opening parenthesis is followed by an opening parenthesis, no space is added between them, and a half-width space is added in front of the first opening parenthesis.
When a closing parenthesis is followed by a closing parenthesis, no space is added between them, and a half-width space is added after the last closing parenthesis.
When a closing parenthesis is followed by a middle dot, a quarter-width space is added between them.
When a middle dot is followed by an opening parenthesis, a quarter-width space is added after the middle dot.
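The spacing rules above can be summarized in a lookup, sketched here in Python with widths in em. The class labels follow the text (cl1, cl2, cl7); the label "other" for full-width characters such as Hangul, and the fallback of summing the two individual requests for pairs the text does not mention, are assumptions of this sketch.

```python
# Space (in em) that each class requests on its own.
SPACE_BEFORE = {"cl1": 0.5, "cl7": 0.25}
SPACE_AFTER = {"cl2": 0.5, "cl7": 0.25}

# Adjacent pairs for which the text states an explicit result.
PAIR_OVERRIDES = {
    ("cl1", "cl1"): 0.0,   # opening followed by opening
    ("cl2", "cl2"): 0.0,   # closing followed by closing
    ("cl2", "cl7"): 0.25,  # closing followed by middle dot
    ("cl7", "cl1"): 0.25,  # middle dot followed by opening
}


def space_between(prev_class: str, next_class: str) -> float:
    """Return the automatic space inserted between two adjacent characters."""
    pair = (prev_class, next_class)
    if pair in PAIR_OVERRIDES:
        return PAIR_OVERRIDES[pair]
    # Assumption: unlisted pairs get the sum of the individual requests.
    return SPACE_AFTER.get(prev_class, 0.0) + SPACE_BEFORE.get(next_class, 0.0)
```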
Lines cannot start with closing parenthesis (cl2), hyphen (cl5), dividing punctuation marks (cl6), middle dots (cl7), periods and commas (cl8~9), iteration marks (cl11), or prolonged sound marks (cl12).
Opening parenthesis (cl1) cannot be placed at the end of a line.
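A line breaker could test these two restrictions with a check like the following Python sketch. The class labels come from the text; the set and function names are assumptions for illustration.

```python
# Classes that must not start a line, and classes that must not end one.
NOT_AT_LINE_START = {"cl2", "cl5", "cl6", "cl7", "cl8", "cl9", "cl11", "cl12"}
NOT_AT_LINE_END = {"cl1"}


def break_allowed(before_class: str, after_class: str) -> bool:
    """True if a line may end after before_class and start with after_class."""
    return (before_class not in NOT_AT_LINE_END
            and after_class not in NOT_AT_LINE_START)
```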
No line break is allowed within the following sequences of characters or symbols.
Between a sequence of triple dot ellipses (……) or double dot ellipses (‥‥).
Between a sequence of Arabic numerals.
Between prefix symbols (cl13) and Arabic or Hanja numerals.
Between Arabic or Hanja numbers and suffix symbols (cl14).
Between a base character and a superscript or subscript.
Between strings of footnote numbers.
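Some of these unbreakable sequences can be detected from the characters alone, as in the Python sketch below. The members listed for the prefix symbols (cl13), the suffix symbols (cl14) and the Hanja numerals are hypothetical samples, not the definitions of those classes; superscripts, subscripts and footnote numbers are omitted because they depend on markup rather than on the characters themselves.

```python
ELLIPSES = {"\u2026", "\u2025"}  # triple dot and double dot ellipses
PREFIX_SYMBOLS = {"$", "₩"}      # hypothetical sample of cl13
SUFFIX_SYMBOLS = {"%", "℃"}      # hypothetical sample of cl14
HANJA_NUMERALS = set("一二三四五六七八九十百千")  # partial, for illustration


def is_arabic_digit(ch: str) -> bool:
    return "0" <= ch <= "9"


def is_numeral(ch: str) -> bool:
    return is_arabic_digit(ch) or ch in HANJA_NUMERALS


def no_break_between(a: str, b: str) -> bool:
    """True if no line break is allowed between characters a and b."""
    if a in ELLIPSES and a == b:
        return True  # run of the same ellipsis character
    if is_arabic_digit(a) and is_arabic_digit(b):
        return True  # run of Arabic numerals
    if a in PREFIX_SYMBOLS and is_numeral(b):
        return True  # prefix symbol followed by a numeral
    if is_numeral(a) and b in SUFFIX_SYMBOLS:
        return True  # numeral followed by a suffix symbol
    return False
```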
In cases like the following, inter-character width cannot be expanded during line adjustment. This is done in order to group characters and symbols as one word.
All the cases mentioned in 4.3.6.
Before and after Hangul opening parenthesis (cl1) and Hangul closing parenthesis (cl2).
Before and after periods, commas (cl8~9) and middle dots (cl7).
Before and after dividing punctuation marks (cl6).
Before and after hyphens (cl5).
Lines are expressed by information such as line spacing or paragraph spacing.
There are four approaches to line spacing: character size, fixed value, space, and minimum line spacing.
'Line height' is applied using a fixed value, using units like point (pt), millimeter (mm), centimeter (cm), pica (pi), pixel (px), character (ch), 'Geup' (gp), inch ("), etc.
'Fixed value interline space' is applied by only specifying a value for the space, using units like point (pt), millimeter (mm), centimeter (cm), pica (pi), pixel (px), character (ch), 'Geup' (gp), inch ("), etc.
'Minimum line height' is applied by specifying the minimum value for line spacing, using units like point (pt), millimeter (mm), centimeter (cm), pica (pi), pixel (px), character (ch), 'Geup' (gp), inch ("), etc.
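Because line spacing can be given in many units, a layout system typically converts them to one internal unit. The Python sketch below converts to points under several stated assumptions that are not taken from this document: 1 point is 1/72 inch, 1 pica is 12 points, 1 'gp' is 0.25 mm, pixels are resolved at a configurable resolution (96 per inch by default), and the character unit equals the current character size.

```python
MM_PER_INCH = 25.4
POINTS_PER_INCH = 72.0  # assumption: 1 pt = 1/72 inch
GEUP_IN_MM = 0.25       # assumption: 1 gp = 0.25 mm


def to_points(value, unit, px_per_inch=96.0, char_size_pt=10.0):
    """Convert a length in the given unit to points."""
    if unit == "pt":
        return float(value)
    if unit == "pi":
        return value * 12.0
    if unit == "in":
        return value * POINTS_PER_INCH
    if unit == "mm":
        return value * POINTS_PER_INCH / MM_PER_INCH
    if unit == "cm":
        return value * 10 * POINTS_PER_INCH / MM_PER_INCH
    if unit == "gp":
        return value * GEUP_IN_MM * POINTS_PER_INCH / MM_PER_INCH
    if unit == "px":
        return value * POINTS_PER_INCH / px_per_inch
    if unit == "ch":
        return value * char_size_pt
    raise ValueError("unknown unit: " + unit)
```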
Line breaking is done by setting the rules for dividing the end of each line, and by adjusting the space between words in the line before.
Line Breaking Rules in Hangul.
If a line ends in Hangul, line breaking is done on character or word basis. The user can decide which approach to use on a paragraph-by-paragraph basis, or for the whole document.
Line Breaking Rules in English.
If a line ends in English, line breaking is done on character basis or word basis, or by using a hyphen.
Minimum Space for Breaking Line.
By specifying a minimum value for inter-word spaces, inter-word spaces in a line can be reduced in order to keep the word at the end of the line from breaking onto the next line.
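The effect of a minimum inter-word space can be sketched as follows in Python. The function name and its return convention (the space width to use, or None when the last word must move to the next line) are assumptions for illustration.

```python
def fit_by_shrinking(word_widths, line_width, normal_space, min_space):
    """Return the inter-word space that lets all words fit, or None."""
    gaps = len(word_widths) - 1
    total = sum(word_widths)
    if total + gaps * normal_space <= line_width:
        return normal_space  # fits without any adjustment
    if gaps > 0 and total + gaps * min_space <= line_width:
        # Shrink the spaces just enough to fill the line exactly.
        return (line_width - total) / gaps
    return None  # even the minimum space is not enough
```

For three words of widths 30, 30 and 34 on a line of width 100, with a normal space of 5 and a minimum of 2, the spaces shrink to 3 and the last word stays on the line.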
Even space alignment is the default setting in Hangul writing. Lines are aligned using even spaces if fixed-width Hangul characters are arranged without spaces, but in cases like those below, line adjustment (adjusting inter-character spaces) is applied to align any dislocated line ends.
When proportional width glyphs like Latin alphabet text, Latin alphabet punctuation marks, numerals, etc. are included (Hangul-Latin mixed writing).
When the character sizes in a line differ from each other.
When restrictions, such as line breaking restrictions, are applied.
The position of a tab and the alignment method (tab type) are specified, then the tab code is entered before the character or the word in the specified position.
Adjustment mode (tab type) is specified at the tab position.
(Upper) left corner adjustment tab: in horizontal writing, the left end of characters/words is aligned to the tab position, and in vertical writing, the top end is aligned.
(Bottom) right corner adjustment tab: in horizontal writing, the right end of characters/words is aligned to the tab position, and in vertical writing, the bottom end is aligned.
Center adjustment tab: the center of characters/words is aligned to the tab position.
Specific character adjustment tab: the front end of a specific character in characters/words is aligned to the tab position.
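The four tab types differ only in which point of the text is placed at the tab position. The Python sketch below computes where the text starts along the line (left to right in horizontal writing, top to bottom in vertical writing). It assumes, for simplicity, that every character has the same width; the type names and the fallback used when the alignment character is absent are also assumptions.

```python
def tab_start(tab_pos, text, char_width, tab_type, align_char="."):
    """Return the start position of text for the given tab type."""
    width = len(text) * char_width
    if tab_type == "start":   # (upper) left corner adjustment tab
        return tab_pos
    if tab_type == "end":     # (bottom) right corner adjustment tab
        return tab_pos - width
    if tab_type == "center":  # center adjustment tab
        return tab_pos - width / 2
    if tab_type == "char":    # specific character adjustment tab
        index = text.find(align_char)
        if index < 0:
            index = len(text)  # assumption: behave like an end tab
        return tab_pos - index * char_width
    raise ValueError("unknown tab type: " + tab_type)
```

With a tab at position 100 and characters 10 units wide, the text "12.5" starts at 80 for the "char" type, so that the front end of the decimal point sits exactly on the tab position.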
A note is used for presenting supplementary information about the main content, or presenting the source of cited information.
When footnotes are used in multi-column, there are three processing modes.
Showing footnotes at the bottom of the column with the corresponding text.
Showing footnotes in a single-column, aligned to the page width.
Showing all the footnotes in a column on the right side.
The position of footnotes can be specified when the page is not full.
Placing the footnote right above the specified footer area. If the footnote is increased, it goes toward the main text and the main text area is decreased.
Placing the footnote right below the main text. If the main text content is increased, the footnote content goes downward.
Numbers used in footnotes and endnotes use many types of symbols and characters, such as Arabic numerals, Latin alphabet characters, Hangul characters, Hanja characters, etc.
In order to divide the note and the main text, a dividing line or a space is inserted. Spaces are added above and below the dividing line, and spaces between each note are specified.
A note has many different positions due to the characteristics of footnotes and endnotes.
Footnote Position
A footnote can be presented below the annotated main text, right above the footer of the page including the annotated main text (refer to the figure in 5.1.2.).
Endnote Position
Endnotes can be presented all together at the end of the document. If the document is separated into many chapters and each chapter needs different endnotes presented, endnotes can be gathered at a specified location.
Note Restrictions
Notes can be used in any part of the text body including paragraphs with tables or boxes, but cannot be used in footers or headers.
Page numbers can be presented in many forms, and can also be located on any side of the corresponding page.
Page numbers can be at 10 different positions as follows. The appearance of numbers can also be different in many ways, sometimes used with a dash ('-') on each side.
Elements beside paragraphs (objects) are processed the same as characters, or as objects.
The object is positioned in a certain location between characters, and the size affects the line spacing.
Objects have 4 processing modes, as below.
Text Wrap mode
Top and Bottom mode
Behind text mode
The object is placed like a background.
Front of text mode
The object is placed on top of the text, covering it.
The object's placement in text and vertical/horizontal position is changed, depending on where the base is.
Paragraph base position
The vertical and horizontal position is calculated from the paragraph base.
Column base position
The horizontal position is calculated from the column base.
Page base position
The vertical and horizontal position is calculated from the page base.
Sheet base position
The vertical and horizontal position is calculated from the sheet base.
When objects are set to 'text wrap' or 'top and bottom' modes, objects do not overlap by default. However, such objects can be made to overlap by using an 'overlap' setting.
The outer margin of elements beside a paragraph means the space between the object and the paragraph elements around it. If an outer margin is set for an object, the caption for the object starts from the original object area, not the outer margin area.
Left: A space between the text and the left side of the object.
Right: A space between the text and the right side of the object.
Top: A space between the text and the upper side of the object.
Bottom: A space between the text and the lower side of the object.
Objects can have borders with various types of line.
Tables are composed of cells, and each cell can be configured into various forms as shown below.
Cells can be selected individually and have various settings.
Font and paragraph format of the cell
Border and background style
Merging and dividing cells
Size and margin adjusting
Space between cells
If inner padding is specified for both the table and a cell, the cell's value takes precedence over the table's. The size of the table does not change even if the inner padding value is large.
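The padding rule can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes the same padding on both sides of a cell; because the table size is fixed, a larger padding only reduces the space left for the cell content.

```python
def effective_padding(table_padding, cell_padding=None):
    """Use the cell's padding when it is set, otherwise the table's."""
    return table_padding if cell_padding is None else cell_padding


def content_width(cell_width, padding):
    """Width left for content; the cell width itself does not change."""
    return max(cell_width - 2 * padding, 0)
```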
The background of a table or cell can be a color, a gradient, or a background image.
The Hangul text layout system is approached from two perspectives: overall page layout design and page body design.
Page size (size of selected page or screen), page direction.
Text direction (horizontal/vertical writing).
Page body (the area excluding top/bottom/left/right margin from the page size).
Running head and page numbers.
In the case of pages in books, the elements are as below:
Size and name of fonts.
Writing direction (vertical/horizontal).
Number and spacing of columns.
Width of line (text box width=page body width).
Number of lines per page (per columns in multi-column format).
Line height value.
Note that the position of the page body is linked to the margin values. The following are examples of positioning and setting size of page body.
Vertical position, center of page; Horizontal position, center of page.
Vertical position, upper or lower margin specified; Horizontal position, center of page.
Vertical position, center of page; Horizontal position, inner margin specified.
Vertical position, upper or lower margin specified; Horizontal position, inner margin specified.
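The four positioning examples differ only in whether a margin is specified or the page body is centered. A minimal Python sketch, assuming that the horizontal position is measured from the inner edge of the page and the vertical position from the top edge (the function and parameter names are hypothetical):

```python
def body_origin(page_w, page_h, body_w, body_h,
                inner_margin=None, top_margin=None, bottom_margin=None):
    """Return (x, y) of the page body's inner-top corner."""
    if inner_margin is None:
        x = (page_w - body_w) / 2  # horizontal position: center of page
    else:
        x = inner_margin           # horizontal position: inner margin specified
    if top_margin is not None:
        y = top_margin             # vertical position: upper margin specified
    elif bottom_margin is not None:
        y = page_h - body_h - bottom_margin  # lower margin specified
    else:
        y = (page_h - body_h) / 2  # vertical position: center of page
    return x, y
```

For a 148 by 210 page with a 100 by 160 page body, centering in both directions gives margins of 24 and 25 respectively.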
Running heads and page numbers are placed outside of the page body, and typical positions are as shown below.
The following Unicode code ranges are used in a Hangul environment.
Basic Latin (U+0020~U+007F): Latin alphabet and numerals
General Punctuation (U+2010~)
Superscripts and Subscripts (U+2070~)
Currency Symbols (U+20A0~)
Letterlike Symbols (U+2100~)
Number Forms (U+2150~)
Arrows (U+2190~)
Mathematical Operators (U+2200~)
Enclosed Alphanumerics (U+2460~)
Box Drawing (U+2500~)
Block Elements (U+2580~)
Geometric Shapes (U+25A0~)
Miscellaneous Symbols (U+2600~)
Dingbats (U+2700~)
CJK Symbols and Punctuation (U+3000~)
Hangul Compatibility Jamo (U+3130~)
Enclosed CJK Letters and Months (U+3200~)
Hangul Jamo Extended-A (U+A960~)
Hangul Syllables (U+AC00~U+D7A3)
Hangul Jamo Extended-B (U+D7B0~)
CJK Compatibility Ideographs (U+F900~)
CJK Compatibility Forms (U+FE30~U+FE48)
This is the first publication of this document as a Working Group Draft.
This document has been developed with contributions from members of Korean standardization committees and the Korean Society of Typography. The project to develop this document was supported by Korea's Standard Infrastructure Enhancement Program from KATS (Korean Agency for Technology and Standards).