PANOSE 2.0 White Paper

Hewlett-Packard Document EWC-92-0015h

December 16, 1993

Author:
Michael S. De Laurentis

Contributors:
Benjamin P. Bauermeister
Raymond G. Beausoleil
N. Gregg Brown
Clyde D. McQueen III

Hewlett-Packard Corporation
101 Stewart, Suite 700
Seattle, WA 98101

Copyright ©1993, Hewlett-Packard Co. All rights reserved.

Executive Summary

DISCLAIMER: This document describes PANOSE version 2.0. PANOSE 2.0 is not an actual technology -- this White Paper is it! However, we feel this White Paper has value as a starting point for a discussion on a general-purpose method of driving synthetic typeface engines such as our own Infinifont or Adobe's Chameleon. It also offers a starting point for a successor to PANOSE 1.0, which is an actual technology owned by Hewlett-Packard Company and which we are working toward making a non-proprietary, public standard.

PANOSE is a classification system for typefaces that categorizes them based upon their visual characteristics. A PANOSE number is a data structure containing PANOSE digits. Each digit represents a visual characteristic like weight (heaviness of the strokes), contrast (ratio of thick-to-thin strokes), and serif style (sans serif, cove serif, square serif).

PANOSE is technology-independent and is designed to be used in open systems containing multiple font formats. It is a general-purpose mechanism for representing and communicating with current static and distortable font technologies such as Adobe Multiple Masters and Apple TrueType 2.0. It is ideal for describing fonts in documents designed to be transferred among multiple platforms and different software packages.

The PANOSE matching heuristic compares the digits between two PANOSE numbers. It sums the differences between each digit, applies a bias to each difference, and produces a match value. The match value represents the visual distance between the two fonts. The smaller the number, the closer the two fonts are visually.

Given a font request and an environment containing many static and distortable typefaces in a variety of different formats, the PANOSE matching heuristic selects the font that will best represent the request. If a distortable font is selected, the match heuristic will also produce the distortion parameters.

This document is ordered from the general to the specific. It starts with a description of the existing PANOSE classification system (PANOSE 1.0), describes the open system font architecture for which PANOSE is designed, and describes the logical model for the PANOSE number. Finally, it presents the data structures and mechanism used for defining PANOSE digits.

Executive Summary

1. Introduction

1.1. History

1.2. Features of PANOSE 2.0

2. Mission and Goals

3. Rich Font Environment

4. PANOSE Model

4.1. PANOSE Space

4.2. Static Font Substitution

4.3. Distortable Fonts

4.3.1. Substitution

4.3.2. Instantiation

4.3.3. Piece-wise Linear Representation

4.3.4. Many-to-one Representation

5. Implementation

5.1. Algorithms

5.1.1. Assumptions

5.1.2. PANOSE Matching Heuristic

5.1.3. Instantiation

5.1.4. Normalization

5.2. Data Structures

5.2.1. Static Digits

5.2.2. Dynamic Digits

5.3. PANOSE in Fonts

5.4. PANOSE Software

6. Classification

6.1. Digit Characteristics

6.2. Digit Composition

6.3. Typeface Measurement

6.4. Cross-Class Digits

6.5. PANOSE 2.0 Digits

1. Introduction

This document describes PANOSE 2.0, the successor to the PANOSE font matching system.

This document is intended for anyone interested in fonts. Font vendors, application vendors, and system vendors stand to gain a great deal from the simplicity of this new version of PANOSE and from its multiple-platform and multiple-language support. It is primarily a technical document intended for engineers, but it is organized so that it becomes more technical the further one reads. The material through Section 4, PANOSE Model, is presented in as non-technical a manner as possible.

It is assumed the reader is familiar with conventional font architectures used in most operating systems and applications. Familiarity with TrueType 2.0 by Apple Computer, and Multiple Masters by Adobe Systems Incorporated, is required for the sections that describe distortable typeface support.

All questions should be directed to:

Robert Stevahn
Hewlett-Packard Co.
11311 Chinden Blvd.
Boise, ID 83714
Phone: (208) 396-4787
Fax: (208) 396-3457

1.1. History

A thorough introduction to PANOSE, abundant with pictures and examples, is at the beginning of A Manual of Comparative Typography by Benjamin P. Bauermeister. Readers completely unfamiliar with type would find such a thorough introduction very useful. Copies of this book are available from Hewlett-Packard.

PANOSE is a font classification and matching system. A PANOSE number is a group of digits that represent the visual characteristics of a typeface. It is used to assess the similarity of two typefaces. This allows for font selection and substitution that is based upon visual characteristics, instead of naming or historical origin conventions.

The PANOSE 1.0 number consists of an array of 10 digits (10 BYTEs). The first digit is the family digit. Its value determines the meaning of the remaining 9 digits. For example, for the Latin Text family, the digits that follow are serif, weight, proportion, contrast, stroke variation, arm style, letterform, midline, and x-height. For the Latin Script family, the digits that follow are tool kind, weight, monospace, aspect ratio, contrast, topology, form, finials, and x-ascent. The meanings of these digits are defined in the PANOSE Classification Guide.

The PANOSE font mapper provided by Hewlett-Packard accepts a list of PANOSE numbers and compares them against the number representing the 'desired' typeface. It calculates the distances between the typefaces and selects the one closest to the request. To compute the distance, the mapper has built-in 'penalty' tables for each digit. When the same digit holds different values in the two numbers, the mapper looks up the penalty for that pair of values in the digit's table. It multiplies this by a weight and adds it to the weighted penalties of all the other digits. The result is called the distance for the typeface. The smaller the distance, the more visual characteristics the two typefaces have in common.
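The weighted-penalty calculation described above can be sketched as follows. This is a minimal illustration and not the Hewlett-Packard mapper: the penalty table, the weights, and the digit values are invented, and the function name `panose1_distance` is hypothetical.

```python
from typing import Sequence

def panose1_distance(requested: Sequence[int],
                     available: Sequence[int],
                     penalty_tables: Sequence[Sequence[Sequence[int]]],
                     weights: Sequence[int]) -> int:
    """Sum of weighted table penalties over all digits (smaller is closer)."""
    total = 0
    for i, (r, a) in enumerate(zip(requested, available)):
        # Look up the penalty for this pair of values in digit i's table,
        # then scale it by the weight assigned to digit i.
        total += weights[i] * penalty_tables[i][r][a]
    return total

# Hypothetical table for a digit that takes the values 0..2.
# Entry [r][a] is the penalty for substituting value a for value r.
TABLE = [[0, 1, 4],
         [1, 0, 1],
         [4, 1, 0]]
tables = [TABLE, TABLE]   # two digits sharing the same made-up table
weights = [3, 1]          # the first digit counts three times as much

print(panose1_distance([0, 2], [1, 2], tables, weights))  # 3*1 + 1*0 = 3
```

Because the tables are compiled into the mapper in this scheme, adding a new classification means shipping new tables, which is the limitation discussed next.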

The mapper contains within it tables for every digit. When a new classification is released by Hewlett-Packard (i.e., a new family digit is assigned thus defining a new set of digits that follow), the PANOSE 1.0 mapper must be updated with new penalty tables for the new digits.

In general the PANOSE 1.0 software operates best in closed systems; that is, where the storage requirements are strict and the available fonts are well known. It is compact and fast, and well-suited for Latin systems, or Latin-to-Kanji systems. PANOSE 1.0 is the correct solution for many customers, and will continue to be used by these customers.

But PANOSE 1.0 requires updates to handle new classifications, and was not designed to handle distortable fonts.

1.2. Features of PANOSE 2.0

PANOSE 2.0 is designed for extensible systems where fonts are constantly being added and removed, using mixed languages and mixed font formats, including distortable fonts.

It has the following primary features:

The digits are self-typing, and the PANOSE 2.0 number uses a flexible storage format, so the structure may be easily extended.
The meaning of the digits has been modified so the PANOSE 2.0 mapping software may operate without any knowledge of what the digit represents. Thus, new digits and new classification systems may be added without requiring updates to the embedded software.
The PANOSE 2.0 number can describe a distortable font such that the software may select the best distortable font to simulate a request, and provide the necessary inputs to the distortion software to create the font.

As in PANOSE 1.0, the PANOSE 2.0 number is technology-independent. It works on all platforms with all font formats.

Sections 4, 5, and 6 describe PANOSE 2.0 in detail.

2. Mission and Goals

The mission for PANOSE is as follows:

PANOSE is a comprehensive system for describing the visual properties of a font for the purposes of font matching and creation.

Goals for the PANOSE system are as follows:

PANOSE is the standard for describing and matching fonts in all formats, including distortable fonts, on all platforms.
New digits and classifications may be added to PANOSE without modifications to existing PANOSE software.
PANOSE is scalable (grows with the needs of fonts and platforms) and is backward compatible.

3. Rich Font Environment

This section presents how PANOSE fits into application and operating system font architecture or, more accurately, how we believe distortable fonts necessitate the use of PANOSE.

We call the mechanism that handles fonts in an operating system the font server. Its services include:

1. Retrieving the font the user has requested and supplying it to the application.
2. Given a font request from an application, locating the font or the best substitute.
3. Distorting, rasterizing, or downloading the font as necessary to cause it to be displayed or printed.

Many of these operations, especially those in #3, are handled by special purpose software like commercial rasterizers or device drivers that we call font makers, but the action is administered by the font server.

The Rich Font Environment (RFE) is shown in Figure 1.

Figure 1. The RFE, showing the flow of prompting the user to select a font.

The diagram contains the following components:

Application
The application drives the font server. It initiates font queries and determines when to use fonts. It stores font information (the Rich Font Description) in its documents.
Font Server
This is the operating system service that manages fonts for the application, and interfaces to all the font makers.
Font Maker
This is a font technology capable of producing fonts, like Adobe Type Manager (ATM), SuperATM, Apple TrueType 1.0, Apple TrueType 2.0, Bitstream FaceLift, or HP Intellifont.

The diagram shows a font request being stored in the application document. The application initiates the request by calling a routine in the font server, which displays a standard font selection dialog. From this dialog, the user may access sub-dialogs specific to distortable font technologies.

The application may still enumerate (request a list of) all available fonts and create its own font selection dialog. With the advent of distortable font technologies, however, the application vendor will also have to provide a mechanism for the user to access the distortable font dialog.

We call the font request the Rich Font Description (RFD). The key word is 'rich' which means it is a general, platform-independent, font-format-independent description of the font. In this document, we present our ideas on what an RFD should contain, but more importantly we put forth the following definition of an RFD:

A technology-independent font description that is rich enough to describe any font for any technology in any environment.

We describe some technology-dependent components that we feel are likely to be important and can probably be stored in a somewhat generic manner, but the important thing is that the technology-independent component be rich enough to provide satisfactory font information under any condition. We believe the PANOSE number makes this possible.

We define the Physical Font Description (PFD) to contain everything necessary to actually produce the physical font. For example, it contains the outline, hints, and so on necessary to rasterize the font. The PFD, by definition, is specific to one environment and one font technology. In other words, the PFD is technology-dependent. Also, the PFD contains an RFD.

We use the terms instance and instantiation to mean the process of creating one displayable or printable typeface from a distortable font master. That is, the process of supplying the font maker with distortion parameters is called instantiation. The resultant font is an instance of the distortable font.

The application stores the RFD in its document. At some time later it requests the font from the font server by passing it the RFD. By 'later' we mean sometime during the session in which the font was selected, 6 months later when the user needs to update the document, on another user's machine, on another operating system, or even from another software package. In other words, sometime 'later' when the font that was originally requested is not necessarily available.

An example of using the font is shown in Figure 2.

Figure 2. The RFE, showing the flow of font creation.

The application passes the RFD to the font server, which determines which font to use, and then creates the font. The font server passes back an 'id' to the application which may be used to access more information about the font, or to draw with the font.

PANOSE aids the font server in three very important places:

Selecting which font to use, which may include selecting the distortable font technology that will provide the best substitute.
Describing the font, including distortable fonts, in a technology-independent manner.
Instantiating a distortable font in a technology-independent manner.

PANOSE is a font matching system based upon visual characteristics. This means PANOSE selects fonts based upon how visually close they are to the original request. This is especially useful with distortable fonts, where the technology has the ability to modify the shape of the letters to more closely resemble the requested font. PANOSE is able to quantify which among the distortable font technologies will come closest visually to the request. Likewise, PANOSE contains the ability to convert the font request (a PANOSE number) into an instantiation request.

The PANOSE number is one component of the RFD, as shown in Figure 3.

Figure 3. The Rich Font Description (RFD).

The RFD is what applications store in their documents. It replaces or supplements the concept of font embedding, where the entire font is stored in the document. The purpose of the RFD is to store the minimum information necessary to select or simulate the font the next time it is needed. As described earlier, this may happen on another machine, operating system, or from a different application.

The font name is the key to locating the original font. If the original font is not available, then PANOSE is the key to locating the best substitute. In the case of distortable fonts, PANOSE is also the key to instantiating the font.

The software from Hewlett-Packard includes support for the entire RFD. The software is designed in layers, where the lowest layer is the base PANOSE matching software. The highest layer is the Mapper Application Interface (MAI), which handles the entire font selection process. It searches first for a font by name. If an exact match is not available, then its substitution algorithm takes into account non-PANOSE parameters like matching vendors, metrics, and character sets. Licensees of the software may choose to use just the base PANOSE algorithms to supplement their own software, or may use the MAI to handle the entire font selection and management process.

PANOSE describes visual characteristics of a font. There are some characteristics that are arguably visual, like character set, that are not easily expressed as PANOSE digits. We regard these as not part of PANOSE, but instead part of the general font substitution algorithm of which PANOSE is a major component (like the MAI).

Also, parameters like point size, set-width, and pair kerning are not part of font substitution. Instead, these are inputs provided to the software that scales and rasterizes the font. Rasterizer controls like shadow, outline, and strike-through are embellishments made by the rasterizer software.

The details of the software are presented in separate technical documentation available from Hewlett-Packard. The remainder of this document focuses on the PANOSE system.

4. PANOSE Model

This section presents the concepts behind PANOSE, but not the implementation. The examples use fictional fonts and numbers for the purposes of illustrating a point. Please refer to Section 5, Implementation, and Section 6, Classification, for details on the algorithms and exact meaning of PANOSE digits.

4.1. PANOSE Space

In PANOSE 1.0 the digits stored information in 'buckets;' that is, fixed identifiers to a range of possible values. For example, a font's weight would be measured and categorized into one of 11 'buckets.' In PANOSE 2.0, on the other hand, each digit represents an axis in a virtual PANOSE space. A sample of this is shown in Figure 4.

Figure 4. Two-dimensional PANOSE Space.

In this example, PANOSE is limited to two digits: weight and contrast. A static font is represented by a point in this 2-dimensional space, with a weight value of 300 and contrast value of 400.

Notice we use the term 'digit' to mean the axis value. In other words, for the purpose of describing the concepts in this section, a digit is a real number. In the section titled Implementation we refine the term to satisfy storage requirements.

The properties of PANOSE space are as follows:

Each digit represents an axis; thus PANOSE space may have up to m dimensions, where m is the maximum number of PANOSE digits.
A single static font is represented as a point in PANOSE space.
A distortable font is represented as a higher-order object, such as a line, rectangle, triangle, cube, trapezoid, and so forth.
The distance between two fonts in PANOSE space is a measure of how visually close the two fonts are. In other words, the shorter the distance between two fonts, the greater the visual similarities are between them.
PANOSE space is extensible. In the rare circumstance that a font is created that does not exist in PANOSE space, we may issue new digits to account for it, thus widening the scope of PANOSE space to include the font.

Given these base properties, we are able to define any font in terms of PANOSE space. Thus we have a comprehensive system for describing and comparing fonts.

4.2. Static Font Substitution

The PANOSE font matching system simply finds the distance between fonts, as illustrated in Figure 5.

Figure 5. PANOSE font matching.

Given that each static font is represented by a point in PANOSE space, the process of finding the best substitute for a desired font is that of finding the closest point.

The distance between two points, that is the distance between two fonts, is called the match value.

There also exists a threshold at which no font is considered a reasonable substitute. This is represented by a circle in the diagram. The purpose of the threshold is to place a limit on what is considered a desirable match. If no fonts are within the limit, then the desired behavior is to return the system default font, like Courier, to the user instead of returning a font grossly out of range (of course Courier is probably also grossly out of range, but it signifies to the user that no acceptable substitute was found).
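The closest-point search and the threshold test can be sketched in a few lines of Python. This is an illustration only: the font names, coordinates, threshold, and the function name `best_substitute` are invented for the example, and a plain (unweighted) Euclidean distance stands in for the match value.

```python
import math

def best_substitute(request, fonts, threshold, default_font):
    """Return the name of the closest font, or default_font if none is close enough."""
    best_name, best_value = default_font, math.inf
    for name, point in fonts.items():
        value = math.dist(request, point)   # match value: distance in PANOSE space
        if value < best_value:
            best_name, best_value = name, value
    # Outside the threshold circle nothing counts as a reasonable substitute.
    return best_name if best_value <= threshold else default_font

# Fictional static fonts as (weight, contrast) points.
fonts = {"Alpha": (300, 400), "Beta": (500, 450), "Gamma": (320, 380)}

print(best_substitute((315, 385), fonts, threshold=50, default_font="Courier"))  # Gamma
print(best_substitute((900, 900), fonts, threshold=50, default_font="Courier"))  # Courier
```

The second request lies far from every font, so the sketch falls back to the default rather than returning a grossly mismatched face.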

4.3. Distortable Fonts

A distortable font is represented in PANOSE space as a line, rectangle, triangle, cube, trapezoid, and so on. In other words, in PANOSE space the font is a k-dimensional object where k is less than or equal to the number of axes defined in the font's own space.

PANOSE space itself may be expressed in more dimensions than the distortable font, so given the distortable font is a k-dimensional object, it may reside in m dimensions in PANOSE space, where m is greater than or equal to k.

We limit the examples in the following sections to three dimensions or fewer because those are the easiest to draw. However, the problem is linear and may be extended to m dimensions.

A simple example is shown in Figure 6.

Figure 6. A distortable font mapped to PANOSE space.

In the example, a distortable font with one axis is represented. The axis is called 'color' and affects, in PANOSE terms, the weight and contrast of the font. To be specific, the further along the color axis the darker and slightly more expanded the font becomes. Thus, a font which varies only in one dimension may vary in two dimensions in PANOSE space.

If there were a one-to-one mapping, for example, color to weight, then it would be represented by a line with zero slope in PANOSE space.

A slightly more complex example is shown in Figure 7.

Figure 7. A distortable font with 2-axes mapped to a 3-dimensional square in PANOSE space.

In the example, a distortable font is shown with two axes, 'color' and 'width.' This maps to weight, contrast, and width axes in PANOSE space.

We assume a distortable font described in its own space is, by definition, rectilinear. In other words, for one axis it is a line, for two axes it is a rectangle, for three axes it is a brick (rectilinear volume), and so on.

4.3.1. Substitution

An example font substitution with distortable fonts is shown in Figure 8.

Figure 8. Font matching with distortable fonts.

As described with static fonts, the process of locating the best substitute is that of simply finding the closest font in PANOSE space. For distortable fonts, this means finding the closest point on the distortable font to the desired font.

In the example, a distortable font represented by a line was determined to be the closest font.

A circle represents the threshold at which a reasonable match may be found. In the case of distortable fonts, if any part of the font falls within the circle, then that means the font is capable of providing a reasonable match.

4.3.2. Instantiation

Instantiation is the term we use to mean the creation of a single static font from the distortable font. The resultant static font is an instance of the distortable font. The process of creating that font, which is providing a value along each axis of the distortable font, is instantiation.

In order to instantiate a font represented in PANOSE space, we must determine the values along the axes in the distortable font space that will yield the desired font. Another way of saying it is that we want to project the desired font onto the surface of the distortable font in PANOSE space. This is shown in Figure 9.

Figure 9. Instantiating a distortable font.

In the example, the desired font is represented by a point and the distortable font is represented by a line. To find the best font substitute we needed to find the closest point on the line to the desired font. The distance from the point to the line was the match value used in font substitution.

To instantiate that font, we locate its distance along the line, which is equivalent to the distance along the distortable font's axis.
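For a one-axis font, both the match value and the axis value fall out of a single projection onto the line. The sketch below assumes the font is a straight segment in PANOSE space and that the design axis varies linearly along it; the function name `instantiate_on_line`, the coordinates, and the 0..1000 axis range are hypothetical.

```python
import math

def instantiate_on_line(request, start, end, axis_min, axis_max):
    """Return (match_value, axis_value) for a one-axis font drawn as segment start-end."""
    direction = [e - s for s, e in zip(start, end)]
    offset = [r - s for s, r in zip(start, request)]
    # Fraction of the way along the segment, clamped to the font's range.
    # Assumes start != end, so the squared length is non-zero.
    t = sum(o * d for o, d in zip(offset, direction)) / sum(d * d for d in direction)
    t = max(0.0, min(1.0, t))
    closest = [s + t * d for s, d in zip(start, direction)]
    match_value = math.dist(request, closest)          # used for substitution
    axis_value = axis_min + t * (axis_max - axis_min)  # used for instantiation
    return match_value, axis_value

# Fictional 'color' axis running 0..1000, drawn from (200, 300) to (600, 500)
# in (weight, contrast) space.
match, color = instantiate_on_line((400, 300), (200, 300), (600, 500), 0, 1000)
print(round(match, 2), round(color, 1))
```

Clamping `t` keeps the result on the font: a request beyond either end of the line instantiates the nearest end-point.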

A more complex example is shown in Figure 10.

Figure 10. Instantiation with a 2-axis distortable font.

In this example the distortable font is represented by a square in a 3-dimensional PANOSE space. As always, the desired font is represented by one point. The shortest distance from the point to the square is the match value used in substitution. The point on the square that is closest to the font is used to instantiate the font. Its distance along each axis represents the same distance along each corresponding axis in the distortable font space.

The example showed a point that was not on the square. In another scenario the desired font may be contained within the square. In that case the distance to the square, which is the match value, is zero. The distance along each edge corresponds to the axis value in the distortable font space.
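The two-axis case can be sketched the same way. The code below assumes the font is a flat rectangle in PANOSE space described by one corner and two edge vectors, and that the two edges are perpendicular (as for the square in the example), which is what makes clamping each fraction independently give the true closest point. All names and numbers are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def project_onto_rectangle(request, origin, edge_u, edge_v):
    """Return (match_value, u, v), where u and v are fractions along each edge.

    Assumes edge_u and edge_v are perpendicular and non-zero.
    """
    rel = [r - o for r, o in zip(request, origin)]
    u = max(0.0, min(1.0, dot(rel, edge_u) / dot(edge_u, edge_u)))
    v = max(0.0, min(1.0, dot(rel, edge_v) / dot(edge_v, edge_v)))
    closest = [o + u * eu + v * ev for o, eu, ev in zip(origin, edge_u, edge_v)]
    return math.dist(request, closest), u, v

# PANOSE axes here are (weight, contrast, width); the values are fictional.
origin = (200, 300, 400)
color_edge = (400, 200, 0)   # the font's 'color' axis moves weight and contrast
width_edge = (0, 0, 300)     # the font's 'width' axis moves PANOSE width only

# A request off the surface: non-zero match value.
print(project_onto_rectangle((400, 300, 550), origin, color_edge, width_edge))
# A request lying on the surface: the match value is zero.
print(project_onto_rectangle((400, 400, 460), origin, color_edge, width_edge))
```

The returned fractions `u` and `v` would then be scaled to each design axis's own range, exactly as in the one-axis case.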

4.3.3. Piece-wise Linear Representation

So far we have presented distortable fonts that have a direct mapping into PANOSE space. In other words, if the distortable font had one axis, then it was represented by a single line segment (not necessarily in one dimension) in PANOSE space, a distortable font with two axes mapped neatly to a rectangle in PANOSE space, and so on.

Distortable fonts may not necessarily offer such a smooth transition. An example is shown in Figure 11.

Figure 11. A single-axis distortable font with a piece-wise linear representation in PANOSE space.

In the example, a distortable font's 'color' axis is represented in PANOSE space with weight and contrast. Over the first three-quarters of the color axis, weight increases in proportion to color but contrast increases only slightly. Over the last quarter of the axis, contrast increases dramatically.

To handle this, the font substitution algorithm adds an initial step: find the closest line segment to the desired font. From there, the algorithm is the same: the point on the line segment that is closest to the desired font represents the best match for the distortable font. The distance from the point to the desired font is the match value used in substitution. The distance along the line segment, normalized to the axis values represented by the end-points of the line segment, is the value used to instantiate the font.
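A minimal sketch of the piece-wise procedure follows. It assumes the font is stored as an ordered list of knots, each pairing a design-axis value with a point in PANOSE space; that storage layout, the function name `instantiate_piecewise`, and the numbers are assumptions made for illustration.

```python
import math

def instantiate_piecewise(request, knots):
    """knots: [(axis_value, panose_point), ...] in axis order.

    Returns (match_value, axis_value) for the closest point on the polyline.
    Assumes consecutive knots are distinct points.
    """
    best_distance, best_axis = math.inf, None
    for (axis0, p0), (axis1, p1) in zip(knots, knots[1:]):
        seg = [b - a for a, b in zip(p0, p1)]
        rel = [r - a for a, r in zip(p0, request)]
        t = sum(x * y for x, y in zip(rel, seg)) / sum(c * c for c in seg)
        t = max(0.0, min(1.0, t))                 # stay on this segment
        closest = [a + t * c for a, c in zip(p0, seg)]
        distance = math.dist(request, closest)
        if distance < best_distance:
            # Normalize t to the axis values at the segment's end-points.
            best_distance, best_axis = distance, axis0 + t * (axis1 - axis0)
    return best_distance, best_axis

# Fictional 'color' axis, 0..1000, in (weight, contrast) space: contrast barely
# moves over the first three-quarters, then climbs steeply.
knots = [(0, (200, 300)), (750, (500, 320)), (1000, (600, 500))]

print(instantiate_piecewise((550, 410), knots))
```

The request in the usage line sits halfway along the second segment, so the sketch reports a zero match value and an axis value midway between 750 and 1000.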

A more complex example is shown in Figure 12.

Figure 12. A 2-axis distortable font with a piece-wise linear representation in PANOSE space.

In the example, a 2-axis distortable font is mapped to 2-dimensional PANOSE space, but it is not a direct one-to-one mapping. We did not attempt to label the axes or the points because the purpose of the diagram is to show how a font that is rectilinear in its own space may be non-rectilinear in PANOSE space.

Substitution and instantiation operate as follows: each quadrilateral is broken into triangles. The nearest triangle is located. The distance to it is the match value. The distance along the edge of the triangle represents the value along the axis.

Notice that the requested font in the example is on the surface of the distortable font, so its match value is zero.
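For a request that lies on the surface, one way to recover the axis values is to interpolate between the triangle's corners with barycentric coordinates. This is offered as a sketch of one possible reading of "the distance along the edge," not as the method used by the PANOSE software; the quadrilateral, its axis labels, and the function names are hypothetical, and the nearest-triangle projection for off-surface requests is not shown.

```python
def barycentric(p, a, b, c):
    """Barycentric coordinates of 2-D point p with respect to triangle a, b, c."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    wa = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    wb = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return wa, wb, 1.0 - wa - wb

def instantiate_in_triangles(request, triangles, eps=1e-9):
    """triangles: each is three (panose_point, axis_values) vertices.

    Returns interpolated axis values if the request lies inside some triangle
    (match value zero), otherwise None.
    """
    for (pa, xa), (pb, xb), (pc, xc) in triangles:
        wa, wb, wc = barycentric(request, pa, pb, pc)
        if min(wa, wb, wc) >= -eps:
            return tuple(wa * va + wb * vb + wc * vc
                         for va, vb, vc in zip(xa, xb, xc))
    return None

# A font that is a unit square in its own space but a skewed quadrilateral
# in 2-D PANOSE space (fictional coordinates), split along one diagonal.
A = ((200, 300), (0.0, 0.0))
B = ((600, 350), (1.0, 0.0))
C = ((700, 700), (1.0, 1.0))
D = ((250, 600), (0.0, 1.0))
triangles = [(A, B, C), (A, C, D)]

print(instantiate_in_triangles((450, 500), triangles))
```

The request in the usage line is the midpoint of the diagonal from A to C, so it maps back to the center of the square in the font's own space.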

As described earlier, we avoided examples that would be difficult to draw. Suffice it to say that the problem is linear and may be extended to m-dimensions. For example, a cube in distortable font space may be represented by a cube, or possibly several linked trapezoids, in PANOSE space.

It is also possible that a rectangle in distortable font space (a 2-dimensional object) may be represented by rectangles or triangles in 3-dimensional PANOSE space. Because the distortable font is 2-dimensional, the representative object in PANOSE space would be planar, but the space itself may be greater than 2 dimensions.

The algorithms are the same: find the closest thing; the distance to that thing is the match value used in substitution. The distance is zero if the desired font is contained within the thing. The distance along the edges of the thing equates to the distance along the distortable font axis.

4.3.4. Many-to-one Representation

As just described, the shape of a distortable font in PANOSE space may not be identical to the shape of the font in its own space. In fact, we anticipate this will normally be the case.

It is also possible to have the situation shown in Figure 13.

Figure 13. Example of a many-to-one representation.

In the example, a 2-axis distortable font is a square in its own space, but is represented by a triangle in PANOSE space. This means the top part of the square converges to one point in PANOSE space. In other words, PANOSE is incapable of accurately representing the top part of the square. This could be because of two situations:

The distortable font has a characteristic that is not described in PANOSE.
The PANOSE digit does not have the precision necessary to describe the distortable font in full detail.

The transformation from PANOSE to distortable font space is, by definition, one-to-one, so the single point in PANOSE space will match to one selected point on the square in distortable font space. In the case where one point in PANOSE space may logically transform to more than one point in distortable space, the PANOSE number will contain the default selection. In other words, the person who classifies the font determines what value to use.

Another example is shown in Figure 14.

Figure 14. Another many-to-one representation, where an entire axis is not represented.

Given the example, suppose there exists a 'year' axis that is not represented in PANOSE space. In other words, it modifies some visual characteristic other than weight, contrast, serif style, and so on. It modifies a characteristic that is not currently represented in PANOSE space. The transform function from PANOSE to distortable font space for this axis would simply be a constant. That is, one value along the year axis would be selected as its default. PANOSE would access one plane in the distortable font.

In this example and the previous one, the issue is that this is a possible representation of a distortable font in PANOSE space. It is an undesirable and unlikely situation but PANOSE is designed to handle it. To prevent situations like this, Hewlett-Packard will define new PANOSE digits as needed to account for new distortable font characteristics not represented by the current digits. Because the architecture is designed to handle this, we can add new digits without affecting other classifications.
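The constant transform for an unrepresented axis such as 'year' amounts to filling in a stored default. A small sketch, in which the axis names, the default values, and the function name `to_design_axes` are all hypothetical:

```python
def to_design_axes(derived, defaults):
    """Fill in axes PANOSE cannot express with the classifier's default values."""
    axes = dict(defaults)   # start from the defaults chosen when the font was classified
    axes.update(derived)    # overwrite the axes that PANOSE space did determine
    return axes

defaults = {"color": 500, "width": 500, "year": 1930}   # made-up stored defaults
derived = {"color": 400, "width": 620}                  # values obtained by projection

axes = to_design_axes(derived, defaults)
print(axes["year"])   # the 'year' axis keeps its default
```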

5. Implementation

The previous section, PANOSE Model, presented the concepts behind PANOSE; this section presents the algorithms and formulas used by the PANOSE software.

We do not attempt to describe the meaning of PANOSE digits in this section. That is presented in Section 6, Classification. For the purposes of describing the implementation, we are concerned only with the fact that each digit represents an axis in PANOSE space and may contain any value up to the limits dictated by storage requirements.

5.1. Algorithms

The previous section on the Model explained that substitution is the process of finding the shortest distance between two fonts, and that instantiation is the process of projecting the desired font onto the surface of the available font and from that deriving the distortable font axes.

We formally present the algorithms here.

5.1.1. Assumptions

We make the following assumptions:

The font request is always a static font. In other words it is represented by a single point in PANOSE space.
An n-dimensional region in distortable font space becomes a k-dimensional region embedded in m-dimensions in PANOSE space, where k<=n and k<=m.
All distortable fonts may be expressed at least piece-wise linearly in PANOSE space. In the event a curve would be a more accurate representation, we approximate with line segments.
The transformation from PANOSE space to distortable font space is at least piece-wise linear.
The transform from PANOSE space to distortable font space is 1-to-1. In other words, one point in PANOSE space goes exactly to one point in distortable font space.
The transform from distortable font space to PANOSE space may be many-to-1, or 1-to-1. In other words, many points in distortable font space may converge onto one point in PANOSE space.

5.1.2. PANOSE Matching Heuristic

Given a font request that is represented by a single point in PANOSE space, the substitution algorithm walks each of the available fonts computing a match value. The match value is the distance between the desired and the available font. The smaller the value, the better the match.

The general algorithm for computing the match value is as follows:

1. If the font being examined is a single point, continue with step #6.
2. If the font being examined is a single surface, continue with step #5.
3. If the desired font is contained within the font being examined, then the match value is zero (end of algorithm).
4. Locate the surface on the font being examined that is closest to the desired font.
5. Locate the point on the surface that is closest to the desired font.
6. Compute the distance from the point to the desired font. This is the match value.
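The steps above can be sketched for a mixed list of static and one-axis distortable fonts. The representation used here, a tagged tuple that is either a point or a list of line segments, is an assumption made for the example, as are the font names; weights are omitted to keep the control flow visible.

```python
import math

def distance_to_segment(p, a, b):
    """Distance from point p to the closest point on segment a-b (a != b)."""
    seg = [y - x for x, y in zip(a, b)]
    rel = [y - x for x, y in zip(a, p)]
    t = sum(r * s for r, s in zip(rel, seg)) / sum(s * s for s in seg)
    t = max(0.0, min(1.0, t))
    return math.dist(p, [x + t * s for x, s in zip(a, seg)])

def match_value(request, font):
    """font is ('point', p) or ('segments', [(a, b), ...])."""
    kind, shape = font
    if kind == "point":
        return math.dist(request, shape)   # a static font: go straight to the distance
    # A distortable font: take the nearest surface and the nearest point on it.
    # A request lying on a segment yields zero, which covers the containment case.
    return min(distance_to_segment(request, a, b) for a, b in shape)

def best_match(request, fonts):
    """Walk every available font and return the name with the smallest match value."""
    return min(fonts, key=lambda name: match_value(request, fonts[name]))

fonts = {
    "StaticA": ("point", (300, 400)),
    "FlexB": ("segments", [((200, 300), (600, 500))]),
}
print(best_match((400, 400), fonts))   # FlexB: the request lies on its line
```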

The formula for computing the distance between two points is the Pythagorean theorem extended to n dimensions, with a bias added. Let Ri represent a digit in the requested font, Ai the corresponding digit in the available font, Wi the weighting applied to each digit, and n the number of PANOSE digits. The formula is as follows:
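One form consistent with this description, assuming the weight multiplies each digit difference before squaring (which is what scaling an axis implies), is shown here. The symbol M for the match value is a name introduced for this sketch.

```latex
M = \sqrt{\sum_{i=1}^{n} \bigl( W_i \, (R_i - A_i) \bigr)^2}
```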

The weight value W allows certain PANOSE axes to have greater or lesser effect on the match value. For example, it may be desired that fonts match closely on the serif digit. This digit would then have a high weighting to bias any fonts that are close in serif values. In terms of PANOSE space, the weighting value has the effect of scaling the axis.
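The distance computation for static fonts can be sketched in a few lines. This is a minimal illustration, not the PANOSE software: the function names are hypothetical, and we assume each weight scales its axis before the difference is squared.

```python
import math

def match_value(request, available, weights):
    """Weighted Euclidean distance between two static PANOSE numbers.

    Each argument is a sequence with one entry per digit, in the same order.
    Assumption: the weight scales the axis, so it is applied before squaring.
    """
    if not (len(request) == len(available) == len(weights)):
        raise ValueError("digit and weight sequences must have equal length")
    return math.sqrt(sum((w * (r - a)) ** 2
                         for r, a, w in zip(request, available, weights)))

def rank_fonts(request, fonts, weights):
    """Return font names ordered from the closest match to the farthest.

    fonts maps a font name to its sequence of digits.
    """
    return sorted(fonts, key=lambda name: match_value(request, fonts[name], weights))
```

With equal weights, a font that differs by 3000 on the first digit ranks ahead of one that differs by 4000 on the second. Doubling the weight of the first digit reverses that order, which is the effect of scaling the axis.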

5.1.3. Instantiation

In order for a distortable font to be instantiated, the point on the distortable font closest to the desired font must be located. In other words, the substitution algorithm must first be executed to locate the font, and then instantiation adds one more step of projecting the point into the surface of the distortable font (in PANOSE space).

Instantiation is therefore the method we use to transform a point in PANOSE space to its equivalent point in distortable font space. We limit distortable font space to the font itself, so the result of instantiation is always a point contained in the distortable font. In PANOSE space, however, the requested point may well lie outside the distortable font. Thus we must first project the requested font onto the distortable font in order to transform from PANOSE space to distortable font space.

There are two high-level steps in instantiation, as follows:

Project the font onto the distortable font surface in PANOSE space (if it is not contained within the distortable font).
Transform from PANOSE space to distortable font space.

The process of projection is that of finding the surface on the font closest to the requested font. If the distortable font contains the request then projection is not necessary.

In order to project a point onto the surface of a distortable font, we construct a projection space. An example is shown in Figure 15.

Figure 15. Example of a projection space.

The example shows a projection space of a single line segment (a 1-axis distortable font). It is bounded by two parallel planes perpendicular to the line segment, one at each end-point.

Projection spaces for 2-dimensional objects and higher can become quite complex. For example, in 3 dimensions each edge and each surface has a projection space. The edge projections are like that of the single line segment just described. The surface projections are like tubes extending out of the surface, perpendicular to it.

The algorithms do not differentiate between projections from edges or surfaces. Thus we will simply call them projection spaces, and both are implied.

The general algorithm for projecting the font request (a point) onto the distortable font is as follows:

1. Test to see if the point is contained within the distortable font. If this is the case, no projection is necessary (end of algorithm).
2. Locate all the projection spaces that contain the point. If none are found, then skip to step #4.
3. Compute the distance from the point to each surface/edge whose projection space contains the point.
4. Compute the distance from the point to each of the vertices.
5. If the point is closest to a vertex (or not contained in any projection spaces), then project the point to the vertex.
6. If the point is closest to a surface/edge, then find the line that runs through the point and is perpendicular to the surface/edge. The intersection of that line with the surface/edge is the projected point.
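For the simplest case, a 1-axis distortable font represented by a single line segment, the steps above reduce to a clamped perpendicular projection. The following is a minimal sketch under that assumption. The function name is hypothetical, and the axes are taken to be unweighted (or already scaled by their weights).

```python
def project_onto_segment(point, start, end):
    """Project a point onto the segment start-end in PANOSE space.

    Returns (projected_point, t), where t runs from 0.0 at start to 1.0 at end.
    A raw t between 0 and 1 means the point lies inside the projection space
    bounded by the two planes at the end-points; otherwise the nearest
    vertex is used.
    """
    direction = [e - s for s, e in zip(start, end)]
    offset = [p - s for s, p in zip(start, point)]
    length_sq = sum(d * d for d in direction)
    if length_sq == 0:
        return list(start), 0.0  # degenerate segment: a single point
    t = sum(o * d for o, d in zip(offset, direction)) / length_sq
    t = max(0.0, min(1.0, t))  # clamp to the nearest vertex when outside
    return [s + t * d for s, d in zip(start, direction)], t
```

Because the transformation to distortable font space is assumed to be piece-wise linear, the value t could then be interpolated between the axis settings stored for the two end-points.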

The result of projection is that the desired font is contained within the distortable font in PANOSE space. It is then transformed to the distortable font in its own space. From this, the axis values may be derived to provide inputs to the distortable font maker.

Notice that part of this process is required to find the match value in substitution. In order to find the 'distance' or match value to a distortable font we need to know which surface to measure from. Thus, the process of projecting is the step 'locate the surface on the font that is closest to the desired point' described in the substitution algorithm.

Also, we did not attempt to describe optimizations in the algorithm here. It is possible, for example, that we would not build a projection space for every surface. We may pick only 'close' surfaces to begin with.

5.1.4. Normalization

Normalization is the process of making sure the match values computed by comparing the requested PANOSE number to all the available numbers are all within the same range. The match values would be inconsistent if the PANOSE numbers do not all contain the same digits.

The matching heuristic does the following:

For each digit not present in the compared-to number that is present in the requested number, the algorithm assesses a distance of 10,000.

As described in Section 6.1., Digit Characteristics, it is part of the PANOSE definition that most meaningful values for a digit fall between the range of -10,000 to 10,000. Thus a distance of 10,000 means half the worst expected penalty. This implies the following logic: if another font has that digit, and its distance is better than half, then PANOSE favors that font. If its distance is not better than half, then the font not containing the digit is given the benefit of the doubt.

In general, the expected behavior is that not having a digit penalizes the font. Missing one digit is unlikely to affect the match if the font clearly matches closely on all the other digits. However, the higher the number of missing digits, the more likely another font is going to match better.
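The missing-digit rule can be sketched as follows. This is an illustration only: the function name and the dictionary representation are hypothetical, and we assume the 10,000 distance is weighted and squared like any other digit difference. Digits present only in the available font are ignored here, since the text does not say how they are treated.

```python
import math

MISSING_DIGIT_DISTANCE = 10_000  # half of the expected -10,000..10,000 range

def normalized_match_value(request, available, weights=None):
    """Match value when the two numbers may not contain the same digits.

    request and available map a digit id to its value; weights maps a
    digit id to its weighting (1.0 when absent).
    """
    weights = weights or {}
    total = 0.0
    for digit_id, requested in request.items():
        if digit_id in available:
            difference = requested - available[digit_id]
        else:
            difference = MISSING_DIGIT_DISTANCE  # penalty for the missing digit
        total += (weights.get(digit_id, 1.0) * difference) ** 2
    return math.sqrt(total)
```

A font that carries the digit at a distance of 7,000 beats a font that omits it, while a font that carries it at a distance of 12,000 does not.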

Hewlett-Packard establishes guidelines for font vendors on how to assign digits, and what digits should appear in given kinds of fonts. The more consistently vendors follow these guidelines, the more accurately PANOSE will handle all fonts.

5.2. Data Structures

This section describes the data structures. The pictures of the structures are self-explanatory. The size in bytes of each variable is listed to the left of the structure.

The byte ordering follows the Motorola format (high byte followed by low byte; high word followed by low word).

The sign of variables uses the following general rules:

Variables providing information about the data structure are unsigned, like the size of the structure, offsets to tables, indices to points, flags fields, and format numbers.
Variables providing information about the typeface are signed, like the PANOSE digits and the distortable font axis values.

All byte offsets are relative to the header (the first byte in the header is byte zero).
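As a minimal sketch of these rules, assuming Motorola format means big-endian storage (most significant byte first), 16-bit fields could be read as shown. The helper names are hypothetical.

```python
import struct

def read_unsigned16(data, offset):
    """Read a structure field (size, offset, index, flags) at a byte offset."""
    return struct.unpack_from(">H", data, offset)[0]

def read_signed16(data, offset):
    """Read a typeface field (PANOSE digit, axis value) at a byte offset."""
    return struct.unpack_from(">h", data, offset)[0]
```

For example, the bytes 0x00 0x02 read as the unsigned count 2, and the bytes 0xF0 0x60 read as the signed digit value -4000.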

The header for the PANOSE structure is shown in Figure 16.

Figure 16. The format of the PANOSE structure header.

The first field in the table is the common digit format number. This value is used as a shortcut to reduce the size of the structure. A value of 1 means the remainder of the structure is exactly 10 bytes, containing the 10-digit PANOSE 1.0 number. A value of 0 means no shortcut; the number is followed by a count followed by count directory entries. A value of 2 means the Latin Text PANOSE 2.0 digits listed in Section 6.5., PANOSE 2.0 Digits.

The tag is a four-letter acronym for the type of table indexed by the header entry. The following tables are supported:

pan1 PANOSE 1.0 format number.

p2.0 PANOSE 2.0 format number.

fvar TrueType 2.0 font variations table.

name TrueType 2.0 name table.

The pan1 tag indicates a PANOSE 1.0 format number, which is an array of 10 bytes. The p2.0 tag indicates the PANOSE 2.0 structure defined below.

There may be multiple pan1 and p2.0 header entries, which means there may be more than one PANOSE number structure representing one font. For example, one font may contain both Latin text and Kanji text characters, which would require a Latin Text number and a Kanji Text number, respectively.

For clarity, we will refer to the actual PANOSE numbers in the structure as the 'sub-numbers,' and reserve the term 'PANOSE number' to mean the entire structure. Thus there is one PANOSE number per font, but it may contain multiple sub-numbers corresponding to the different variations packed into the font.

Each sub-number is indexed by the following information:

PANOSE classification digit
This is the language of the typeface, like Latin, Kanji, or Hebrew.
PANOSE genre digit
This is the usage of the typeface, like text, script, decorative, or symbol.
Font variation index
This is a single typeface from a font family that may be packed into one file, like 'normal,' 'bold,' or 'italic.'

The PANOSE software first locates the sub-number that matches the desired index, and then performs the PANOSE match heuristic on the sub-number. Notice the index variables (class digit, genre digit, and fvar index) are only valid when the tag indicates a PANOSE sub-number (the tag is either pan1 or p2.0). For the fvar and name tables the index variables are zero.

When the PANOSE number is contained within a TrueType font file, the fvar and name tables are identical to the ones defined separately in the font file, and are therefore omitted from the PANOSE number. These tables are defined in the TrueType font file documentation.

In a PANOSE number implementation external to a TrueType font file, the fvar and name tables may be present to support the fvar indices in the PANOSE header. If the tables are not included, then the indices in the header are ignored.

When stored in the PANOSE number, the fvar table need not contain the font tuples array or font instances array. These arrays contain coordinate information about the font space, which is ignored by the PANOSE software. The nameID member of the font variations array is ignored if a name table is not present.

5.2.1. Static Digits

The structure of the PANOSE 2.0 number table is shown in Figure 17.

Figure 17. The format of the PANOSE number.

The common digit format number represents a specific digit-types table known to the PANOSE software. It is useful for two reasons:

It removes the requirement for a digit-types table.
It encourages standardization on the digits contained in the PANOSE number.

A value of zero means there is no common digit format. A value of 1 means a PANOSE 1.0 number. A value of 2 means the Latin Text PANOSE 2.0 digits listed in Section 6.5., PANOSE 2.0 Digits.

If the software encounters a digit format it does not recognize and a digit-types table is provided, it will learn the digit format from that number. If the software later encounters other numbers with the same digit format and no types table, it will presume the types table it learned from the earlier number.

The count of digits has a specific value if the digit format is non-zero. The software uses the count as a sanity check against the format, and reports a warning if it is not correct.

The digit-types table is an array of digit types variables. Each digit-type is a 3-byte variable containing the following information:

digit-id (2-bytes) The id number identifying the digit.

bias (1-byte) The weighting value W in the PANOSE matching heuristic.
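A digit-types table laid out this way could be decoded as in the sketch below. The function name is hypothetical. We assume big-endian byte order and a signed bias byte, which is consistent with a largest bias value of 127 but is not stated explicitly.

```python
import struct

# One record: 2-byte unsigned digit-id, then 1-byte bias (assumed signed).
_DIGIT_TYPE = struct.Struct(">Hb")

def parse_digit_types(data, count):
    """Return a list of (digit_id, bias) tuples from count 3-byte records."""
    size = _DIGIT_TYPE.size * count
    if len(data) < size:
        raise ValueError("digit-types table is shorter than its count implies")
    return list(_DIGIT_TYPE.iter_unpack(data[:size]))
```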

As described earlier, the PANOSE matching heuristic compares digits with the same id number. These numbers are assigned by Hewlett-Packard when a digit is defined.

The PANOSE matching heuristic uses the average of the two bias values when comparing two digits. It is possible for the same digit to have a different bias in each of the two numbers. This does not happen when comparing two PANOSE numbers with the same class digit, but it may happen when comparing two numbers from different classifications. For example, the serif digit exists in both Latin and Kanji, but it has a lower bias in Kanji than in Latin.

In the previous section 5.1.2., PANOSE Matching Heuristic, for simplicity we let the variable Wi represent the weighting of the two bias values. The following is a more accurate representation of the formula:
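One reading consistent with the surrounding text is sketched here. It assumes that the two bias values are averaged and that Wmax normalizes the result to the range 0 to 1. The subscripts A and R denote the available and requested numbers, and M is a name introduced here for the match value.

```latex
M = \sqrt{\sum_{i=1}^{n} \left( \frac{W_{A,i} + W_{R,i}}{2\,W_{max}} \, (R_i - A_i) \right)^2}
```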

The value of Wmax is the largest possible bias value (127). The application may override the bias values in the following ways:

Provide new values for W for each digit.
Instruct the weight function to either: 1) average WA and WR (as described above), 2) use WA, or 3) use WR.
Provide its own weight function.
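The override options above might look like the following sketch. The function name and mode names are hypothetical. Dividing by Wmax to normalize the result is an assumption about how Wmax enters the formula.

```python
W_MAX = 127  # largest possible bias value

def combine_bias(w_available, w_request, mode="average"):
    """Combine the two bias values for one digit into a single weighting.

    mode is "average", "available", "request", or a callable supplied by
    the application that takes both bias values and returns the weighting.
    """
    if callable(mode):
        return mode(w_available, w_request)
    if mode == "average":
        combined = (w_available + w_request) / 2
    elif mode == "available":
        combined = w_available
    elif mode == "request":
        combined = w_request
    else:
        raise ValueError("unknown weight mode: %r" % (mode,))
    return combined / W_MAX  # assumed normalization to the range 0..1
```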

The static-digits table is the array of PANOSE digits that describe the font. The order of values is parallel to the order of the digit-types table.

5.2.2. Dynamic Digits

The dynamic-digits table consists of several data structures, preceded by a header, which is shown in Figure 18.

Figure 18. The format of the dynamic digits table.

The reserved flags word should be zero and is reserved for future use. It will be used to indicate extensions to the structure. The structure will always contain at least the variables described here.

The create message is used to describe the distortion settings to the font distortion software. Much like the 'C' printf() function, the software will merge the axis settings into the message. The message may be one of the following:

A character string, not necessarily null-terminated (the length is stored in the dynamic digits table).
A binary data structure.

We assume the message may be formed in a way that is compatible with the distortable font technology. In situations where this is not the case, we assume it will be possible to write 'glue' software that converts from the message to an acceptable format. In the section below titled PANOSE Software, this would be handled by the Mapper Application Interface (MAI) component of the PANOSE software.

The message format will be documented in full detail in a future technical document from Hewlett-Packard, but suffice it to say the intent is that the format will be very close to that accepted by the printf() function. The %d and %u will cause the axis values to be merged into the message in the order they appear in the points table. The %f will cause the axis values to be converted to type float by dividing by the denominator value stored in the values array table (described below).

The printf() behavior will be extended to include the following:

The merge function will allow any character, including nulls, in the string. It will operate on string length, not null-terminator.
The format of the % characters will be extended to include additional PANOSE-specific flags. This will be something on the order of squiggly braces immediately following the % sign.

The extra PANOSE-specific flags will include the following:

The letter b to indicate merge binary, not character. For example, '%{b}d' would merge a 16-bit signed integer, '%{b}ld' would merge a 32-bit sign-extended long, or '%{b}lu' would merge a 32-bit unsigned long.
A number to indicate a zero-based index into the points-table. For example, '%{1}d' would cause the second axis value in the table to be merged, or '%{b0}ld' would cause the first axis value to be merged as a binary 32-bit sign-extended long. The same index could be used more than once.

Given that the merge routine will preserve all characters, and the % flags will be extended to allow for binary to be merged, then it follows that the message may be a binary data structure.

For example, a create message for Multiple Masters may take the following form:

MyriadMM_%u wt %u wd

A binary create message for TrueType 2.0 may be the following:

\000\002wght%{b}ldwdth%{b}ld

In the example, the first 2 bytes are the number of axes, followed by an array of axis/value pairs.
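Since the message format is not yet fully documented, the following is only a sketch of how such a merge routine might behave. The function name is hypothetical. It assumes big-endian binary output, that a specifier with an explicit index does not advance the sequential position, and that only the %d, %u, and %f conversions are present (a literal %% is not handled).

```python
import re
import struct

# "%", an optional {flags} group, an optional "l", then a conversion letter.
_SPEC = re.compile(rb"%(?:\{(b?)(\d*)\})?(l?)([duf])")

def merge_create_message(message, axis_values, denominators=None):
    """Merge axis values into a create message given as bytes.

    Works on the full byte length, so embedded nulls are preserved.
    """
    next_index = 0  # position used by specifiers without an explicit index

    def replace(match):
        nonlocal next_index
        binary = bool(match.group(1))
        index_text = match.group(2)
        is_long = bool(match.group(3))
        conversion = match.group(4)
        if index_text:
            index = int(index_text)
        else:
            index = next_index
            next_index += 1
        value = axis_values[index]
        if conversion == b"f":
            if binary or denominators is None:
                raise ValueError("%f needs denominators and has no binary form here")
            return ("%f" % (value / denominators[index])).encode("ascii")
        if binary:
            code = "l" if is_long else "h"  # 32-bit or 16-bit
            if conversion == b"u":
                code = code.upper()         # unsigned variant
            return struct.pack(">" + code, value)
        return str(value).encode("ascii")

    return _SPEC.sub(replace, message)
```

Applied to the two example messages with axis values 400 and 600, the first yields the text MyriadMM_400 wt 600 wd, and the second yields the two-byte axis count followed by each four-letter axis tag and its 32-bit value.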

The common distortable font shape number indicates a standard distortable font shape, like a line segment, square, or cube. If this number is non-zero (indicating a common distortable font shape), then the vertices table is omitted, and a specific count and order of points is implied.

The digit-types table is identical to the types table for the static digits described earlier, except it lists the types for the points described in PANOSE space. It normally will contain fewer entries than the static digits table.

The vertices table defines the connections between all the points. It is shown in Figure 19.

Figure 19. The format of the vertices table.

The vertices may be stored in 8-bit or 16-bit values, depending upon the count of points stored in the points table. The information stored is the indices into the points table. If the count of points is less than 256, 8-bit indices are stored. Otherwise, 16-bit indices are stored.
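The index-width rule can be sketched as follows. The function name is hypothetical, big-endian order is assumed for the 16-bit case, and only the index array itself is decoded, not the surrounding table layout shown in the figure.

```python
import struct

def parse_vertex_indices(data, index_count, point_count):
    """Decode index_count indices into the points table.

    Indices are 8-bit when the points table holds fewer than 256 points,
    and 16-bit otherwise.
    """
    code = "B" if point_count < 256 else "H"
    layout = ">%d%s" % (index_count, code)
    size = struct.calcsize(layout)
    if len(data) < size:
        raise ValueError("vertices data is shorter than expected")
    return list(struct.unpack(layout, data[:size]))
```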

The format of the values array entry is shown in Figure 20.

Figure 20. The format of the values array entry.

The count of items in the array matches the 'count of dimension in distortable font space' value in the header to the dynamic digits table. The order of the array matches the order of the axis values stored in the points table, and listed in the create message.

The software ensures the axis values it computes from transforming a point from PANOSE space to distortable font space fall between the low-value and high-value limits. In the event a distortable font axis is not represented in PANOSE space, as described in section 4.3.4, Many-to-one Representations, the software uses the default value.

The purpose of the normal values is to detach the axis value calculation from the real axis range. This allows for potentially higher accuracy if the real range is very small.
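The limit and default handling described above could be sketched as follows. The function and parameter names are hypothetical, and a value of None stands in for an axis that is not represented in PANOSE space.

```python
def resolve_axis_value(computed, low_value, high_value, default_value):
    """Return the axis value to pass to the distortable font.

    computed is the value obtained from the PANOSE-space transformation,
    or None when the axis has no representation in PANOSE space.
    """
    if computed is None:
        return default_value
    # Keep the result inside the limits of the values array entry.
    return max(low_value, min(high_value, computed))
```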

5.3. PANOSE in Fonts

The PANOSE 1.0 number for Latin Text classification appears in the TrueType font file format. It is in the 'OS/2' table. The reader should refer to the documentation on the TrueType font file for details.

5.4. PANOSE Software

The presentation of the software interface is beyond the scope of this document. A future PANOSE Technical Note will describe the exact interface. Suffice it to say that the interface will be very similar to the PANOSE 1.0 software.

The high-level architecture for the software is shown in Figure 21.

Figure 21. The high-level architecture of the PANOSE software.

The software contains the following components:

A 'core' (platform-independent) component that executes the PANOSE matching heuristic. Given two PANOSE numbers it will return their match value. Given a list of PANOSE numbers and a 'request' number, it will return the list ordered by closeness to the request.
A 'core' (platform-independent) component that executes a full font matching algorithm, of which glyph-shape (PANOSE) matching is one component. Operating on the Rich Font Description (RFD), with several controls from the application, the software may include items like face name, font family, character set, or vendor as components of the font matching process.
A platform-dependent Mapper Application Interface (MAI) containing a full set of font management services, including dialogs and platform-dependent font handling. The MAI will accept a font description in the native format of the operating system and generate an RFD from it.
Tools for testing and demonstrating the PANOSE software, along with tools for generating a PANOSE number and converting PANOSE 1.0 numbers to PANOSE 2.0.

The software will feature several tools for accessing the contents of a PANOSE number. There will also be a C++ object for the PANOSE number with these tools built in.


6. Classification

The previous sections described the PANOSE matching system without regard to the actual meaning of the digits. This section describes how digits are created and what their values mean.

6.1. Digit Characteristics

A PANOSE 2.0 digit, as described earlier, represents an axis in PANOSE space. For the purposes of describing the PANOSE model in previous sections, it was assumed that the digit was an infinitely large number and that PANOSE space had no bounds.

In reality, of course, in order to store these numbers with any reasonable space requirements, we must establish limits.

The following are characteristics of the PANOSE digit:

The value for the digit is integer-based (not fractional) and is stored in a 16-bit number.
The smallest value is -32,767 and the largest is 32,767.
The value zero (0) is considered to be the 'normal' for the digit. For example, for the weight digit zero represents medium weight.
There are a given set of defined values for each digit that represent the characteristics of most fonts. For example, for the weight digit there are values that identify light, semi-bold, bold, and heavy characteristics.
The defined values fall in the range between -10,000 and 10,000.

It may be inferred from this description that, because digits by definition have their most common values between -10,000 and 10,000, this also is the general limit of PANOSE space. In other words, most fonts will fall within 10,000 units of the origin in PANOSE space.

In PANOSE 1.0 the digits had special values that indicated 'no-fit' and 'any.' Each of these had a peculiar meaning: 'no-fit' meant that the digit did not apply to the particular font, and 'any' meant that any value was acceptable. In font substitution, the 'any' translated to a 'don't care' parameter. The 'any' value was also a crude attempt at representing distortable fonts, meaning the font could service any value for the digit (no limits).

The 'no-fit' digit does not make sense in PANOSE 2.0 because a digit that does not apply simply should not be included in the number. The 'any' value is replaced by the more comprehensive distortable font descriptions.

6.2. Digit Composition

The determination of what becomes a PANOSE digit, and how it is defined, is the job of Hewlett-Packard Corporation. The digit definition is what we consider to be the proprietary part of PANOSE.

Clearly defined digits along with carefully classified fonts are what make PANOSE work. Because this is our business we have a strong interest in making sure that users of PANOSE understand the classification process, or at least use our services to classify fonts.

Digits represent visual characteristics of fonts. For example, the weight digit describes the heaviness of the letters, the contrast digit describes the ratio of thick-to-thin strokes, the serif digit describes the serifs, and so on. The selection of the visual characteristic depends upon two factors, as follows:

Characteristics that are important for finding the best font substitute.
Characteristics that are likely to equate to a distortable font axis.

Once a visual characteristic is selected, many fonts are examined to identify specific instances. For example, for the serif digit, several variations of sans serif and serifed styles were identified. For the weight digit, light, semi-bold, bold, and heavy are examples of instances.

Also, the 'normal' is determined, which is the value zero for the digit. This value is sometimes obvious, as it is with the weight digit (the normal is medium weight), and is sometimes not as obvious, as it is with the serif digit. Given a defined set of instances of the digit, the selected normal is usually somewhere in the middle of the range.

Values are assigned to the instances such that the vast majority of fonts will fall within the range of -10,000 to 10,000. As described above, this is one of the defined characteristics of PANOSE digits.

The diagram in Figure 22 illustrates the formation of the weight digit.

Figure 22. Example of weight digit, showing PANOSE 1.0 and PANOSE 2.0.

Let the variable at represent the true attribute of the font. In other words, at is expressed in measurements taken from the font itself. In the case of the weight digit, the width of the upright stem of the capital letter 'E' is measured as well as the cap height. These are referred to as WStemE and CapH in the diagram. The ratio WStemE to CapH is used to determine the weight attribute.

The PANOSE Classification Guide categorizes the result of this ratio into the PANOSE 1.0 digits.

Let the variable am represent the attribute stored in the PANOSE 2.0 weight digit. Rather than representing a category, it represents a visual weighting for the digit. The difference between it and the attribute from the same digit in another font determines the penalty for this digit.

The value am is derived from at. A function, f, defined by Hewlett-Packard, establishes the relationship between the true measure and the attribute such that the difference between two am for the same digit yields the penalty for that digit. In other words, the following is true:
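Written out, and assuming the penalty is the absolute difference, the relationship takes the following form. The superscripts (1) and (2) are introduced here to label the two fonts being compared, and a_t and a_m stand for at and am.

```latex
\text{penalty} = \left| a_m^{(1)} - a_m^{(2)} \right|
               = \left| f\bigl(a_t^{(1)}\bigr) - f\bigl(a_t^{(2)}\bigr) \right|
```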

The example shown in the previous figure showed an almost direct correlation between the PANOSE 1.0 and 2.0 weight digit. The digit values changed from 2 through 11, stepping by 1, to -4000 to 5000, stepping by 1000. The important difference, however, is in the change in logic. In PANOSE 1.0 tables were necessary to look up distances even in this simple case. In PANOSE 2.0 the differences between the digits are the distances.
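For this weight example only, the correspondence between the two versions reduces to a linear mapping. The sketch below uses a hypothetical function name and covers just the values quoted above. The actual conversions are those defined by Hewlett-Packard.

```python
def weight_digit_1_to_2(digit):
    """Map a PANOSE 1.0 weight digit (2..11) to the 2.0 example value.

    2 maps to -4000 and 11 maps to 5000, in steps of 1000, so the value 6
    lands on zero, the 'normal' for the digit.
    """
    if not 2 <= digit <= 11:
        raise ValueError("PANOSE 1.0 weight digit must be between 2 and 11")
    return (digit - 6) * 1000
```

With values like these, the penalty between two fonts is simply the difference of their digits. No lookup table is involved.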

An example of a digit that does not have a direct correlation is the serif digit. In PANOSE 1.0, there were 13 different kinds of serifs. All serifs were measured and 'bucketized' into one of those 13 kinds.

In PANOSE 2.0 there are several serif digits, as shown in Figure 23.

Figure 23. Illustration of 6 of the serif digits.

The digits correlate less to a specific kind of serif (as in PANOSE 1.0) and more closely to the individual characteristics that can be used to describe any serif. In other words, the am values more directly correlate to the true measures at that are taken on the serif.

PANOSE 2.0 has the ability to more precisely describe the serif attribute of the font, but because digits are optional it is not necessary to include all the digits. The three basic measures width, tip, and tall for the most part identify the distinguishing characteristics of most serifs, and should be included in the PANOSE numbers for all Latin fonts. The other digits are more useful for matching or distorting the more subtle nuances of the serif.

6.3. Typeface Measurement

The process of assigning a PANOSE number to a static typeface involves the following steps:

A page is printed containing samples of the type at 50, 150, and 300 points.
Significant parts of selected letters are measured (like the width of the vertical stem on the uppercase 'E'). These are the at measures.
The at measurements are entered into a database. The f(at) functions are executed on the database to determine the am values.
At least one other person visually checks the measurements and the am values.

These are the steps Hewlett-Packard follows, and encourages others who choose to classify fonts themselves to follow.

Distortable typefaces are measured as follows:

A typeface is generated at the end-points of each distortable axis.
A typeface is generated at each interesting internal point along selected axes.
Each typeface is measured as described above.
Only the am values that vary significantly are stored in the PANOSE number.

The PANOSE number for the distortable typeface stores all the am values at least once. However, only the am values that change significantly (approximately 100 units) are stored for each of the typefaces measured. This smaller group of am values is used in the algorithms for projecting and transforming a font request from PANOSE space to distortable font space. The algorithms are described in the previous sections titled PANOSE Model and Implementation.
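Selecting the am values that vary significantly might be sketched as below. The function name, digit names, and the use of max minus min as the measure of change are assumptions. The threshold of 100 units is the approximate figure given above.

```python
def significant_digits(instances, threshold=100):
    """Return the digit ids whose am value varies across measured instances.

    instances is a list of dicts, one per measured typeface, each mapping
    a digit id to its am value. All instances must hold the same digits.
    """
    digit_ids = instances[0].keys()
    return sorted(
        digit_id for digit_id in digit_ids
        if max(i[digit_id] for i in instances)
        - min(i[digit_id] for i in instances) >= threshold
    )
```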

The generation of the distortable typeface PANOSE number is a partially subjective process. The person measuring the typeface will take into consideration the storage and processing requirements for the number. The larger the number of measurements, and the larger the number of dimensions in PANOSE space that are stored (that is the larger the number of am values stored for each typeface measured), the more storage required and the longer the PANOSE match heuristic will require to match and instantiate the font.

6.4. Cross-Class Digits

We renamed the family digit from PANOSE 1.0 to the class digit in PANOSE 2.0. It indicates the classification system used to measure the font. The class indicates the language and character set, like Latin, Kanji, Hebrew, or Cyrillic. We also added a genre digit which indicates text faces, display faces, symbol faces, and so on.

The class and genre digits are in the header of the PANOSE number. They are used as a key by the PANOSE matching software to determine whether or not it should even compute the match value. By default, the mapper only searches for fonts with a class and genre matching that of the requested font.

However, the application software can explicitly request the software match against a different class and genre. For example, a common action in mixed-language systems is to take a font request made in Latin and use it to request a font in another language like Kanji. Thus, the Latin characteristics are desired in Kanji. This means that, even though the shapes of the letters are different, the same stroke weight, contrast, serif style, and so on are desired.

Once the software has decided which numbers to compare, by keying on the class and genre digits, the rest of the matching heuristic is identical. The software locates the digits with the same type, and compares them.
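The keying step can be sketched as follows. The dictionary layout, the function names, and the use of strings for class and genre are all illustrative assumptions. In the actual structure these are digits in the header.

```python
def select_candidates(request, sub_numbers, target_class=None, target_genre=None):
    """Return the sub-numbers whose class and genre match the wanted key.

    By default the key is taken from the request. The application may
    override it to match across classes, for example Latin to Kanji.
    """
    wanted_class = request["class"] if target_class is None else target_class
    wanted_genre = request["genre"] if target_genre is None else target_genre
    return [s for s in sub_numbers
            if s["class"] == wanted_class and s["genre"] == wanted_genre]

def shared_digit_types(request, candidate):
    """Return the digit types present in both numbers, which are compared."""
    return sorted(set(request["digits"]) & set(candidate["digits"]))
```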

It is possible for one digit to have different meanings in each class. For example, the serif in Latin Text truly is the serif as we know it. There also exists a serif in Kanji Text which represents the termination on strokes. This is not truly a 'serif' in the conventional sense, but, by design, it is visually equated to the same thing.

Because PANOSE 2.0 digits store the weighting, not the definition of the digit, it is possible for different classes to define different mechanisms for arriving at the weighting value.

Take the example of the serif: there exists a digit with a type of serif width. It stores a value am which represents a weighting for that digit. There exists a serif width digit for both Latin Text and Kanji Text. This is shown in Figure 24.

Figure 24. Comparison of Latin serif width digit to Kanji serif width digit, demonstrating how each has a different definition but is represented by the same units of measure in the PANOSE digit.

Even though both classification systems share the same am, each has its own true measure at and conversion function f for obtaining am. The advantage of this is that the logic for differentiating between the two classifications is built into the function f for each classification, not the am value stored in the PANOSE digit. The process of converting between the two classification systems happens when the number is defined, not when it is matched.

As described in the previous section titled Data Structures, the bias or weighting value applied to each digit differs for each classification system. For example, the serif digit in Latin has a very high bias, but in Kanji it has a low bias. During cross-class matching, the software uses the average bias between the two.

6.5. PANOSE 2.0 Digits

The focus of this document has been the concepts behind PANOSE 2.0. The design of this new version of PANOSE is such that new digits may be defined without the need to update software. As Hewlett-Packard defines new digits, it will release PANOSE Technical Documents describing the measurements, at, the digits, am, and the functions, f(at) = am, for converting between the two. This information will be available to licensees of the PANOSE classification system.

Figure 25 shows the PANOSE 2.0 equivalents for the PANOSE 1.0 Latin digits.

Figure 25. PANOSE 2.0 equivalents for PANOSE 1.0 Latin digits.

We show this diagram to demonstrate the evolution from PANOSE 1.0 to PANOSE 2.0. The exact definitions are given in the PANOSE Technical Documents.

All of these digits are derived from the same at values measured to create the PANOSE 1.0 digits. The difference is that more of the information is exposed, and at a much higher precision than with PANOSE 1.0. However, because they are the same measures, we can rapidly assign PANOSE 2.0 numbers to all fonts for which we have already assigned a PANOSE 1.0 number.
