April 16, 2015

W3C Blog

Idea for a Web Payments Visual Identity

Today the Web Payments Interest Group published the first draft of Web Payments Use Cases 1.0. As we progress toward an architecture for payments well-integrated into to the Open Web Platform, I am thinking about a visual identity for Web Payments. Here is a draft idea:

HTML5 logo tweaked to look like a currency symbol

I hacked up the SVG by hand so it will surely need refinement in the next revision. While I like the idea of extending the HTML5 logo just a little bit to mean “Web Payments,” there are some issues. One is that the Web of course is more than HTML5. Also, a Web Payments architecture will be global and accommodate all currencies (not just the US dollar implied by the logo).

I hope with this post we will start the conversation about a visual identity for future W3C standards for Web Payments. I look forward to your comments!

by Ian Jacobs at April 16, 2015 07:20 PM

April 09, 2015

W3C Blog

One to watch: Web and TV progress

That TV and video is moving to the web is not new — a recent report by Nielsen showed that the number of American households subscribing to an internet video streaming service is already 40%. However there are still areas where these services may not meet user expectations, for example when compared to the instant availability of broadcast TV or the full-featured extras in DVD releases. Equally, broadcasters are increasingly integrating the web into their services but these experiences often need to be more seamless to encourage broad adoption. There are also new ways to enjoy content that the web has the potential to realize, such as multiple simultaneous camera views or customizable synchronization with other online and data services.

There are several groups within W3C working to make this a reality and the entry point for this activity is the Web and TV Interest Group. It’s here where use cases and requirements are established and gaps in standards are identified. Most recently, the needs of video delivery on the web today include:

  • Multi-screen content delivery
  • Stream synchronization
  • TV function and channel control
  • Mixed media sources and content overlays
  • Stream recognition and identification
  • Server-side content rendering (e.g. for low-powered STBs)
  • Improvements to existing features (e.g. adaptive streaming, timed text)

The incubator-style role of the Web and TV Interest Group has led to the creation and support of various groups that are aiming to address these issues and currently there are some exciting developments to be aware of and ideally participate in.

Diagram showing relationship of TV-related groups.

GGIE (Glass-to-Glass Internet Ecosystem) Task Force

A young Task Force within the Web and TV Interest Group that has attracted a lot of attention, it has a broad focus of looking at all phases of the video life cycle: Capture → Edit → Package → Distribute → Find → Watch. The ultimate goal is to identify essential elements in digital video’s life cycle and features that would be appropriate for recommendation for standardization in the appropriate SDOs, not just W3C. To achieve this, the Task Force is currently gathering use cases and all members of the Web and TV Interest Group are welcome to join in the discussion. See the Task Force page for more.

TV Control API Community Group

Thanks to the contributions of a growing number of participants, an API to control TV-like content and features is taking shape with the hope of eventually producing a new standard for media devices, set-top-boxes and of course televisions. We’ve used existing TV APIs for reference but there’s still lots of work to do on the draft specification for it to one day become a standard. See the group page for more and to join.

Multi-device Timing Community Group

The newest TV-related group looking at how to accurately synchronize media streams across the web. This could be friends on a train wanting to watch the same movie on their separate devices, laughing at the same time. Another use case is watching a sports event on a large screen and having a separate single player or athlete view on your phone or tablet. Some interesting demos have been presented to the group but it’s still early days and a great opportunity to influence its direction and deliverables. See the group page for more and to join.

Media Resource In-band Tracks Community Group

This group is developing a specification defining how user agents should expose in-band tracks as HTML5 media element video, audio and text tracks. In other words, web applications would be able to access information (e.g. metadata, captions, translations, etc.) within media stream containers through the media element. The draft specification that the group is working on currently covers the following media stream container types:

  • MPEG-2 Transport Stream (MPEG-2 TS) (video/mp2t)
  • ISO Base Media File Format (ISOBMFF aka MP4) (*/mp4)
  • WebM (*/webm)
  • OGG (*/ogg)
  • DASH (application/dash+xml)

Other formats could be considered in the future, such as RTP streams. . See the group page for more and to join.

Second Screen Presentation Community Group & Working Group

The Web and TV Interest Group also closely follows this work which is a good example of the evolution of an idea to a standard — it started as a proposal brought to W3C with a Community Group created for easy collaboration. A draft specification for displaying web content on secondary screens was edited and improved by a variety of stakeholders to the point where it formed the basis of a new Working Group. At this point it’s officially on the standards track and further stabilization should see it implemented and brought to a big screen new you. Meanwhile, the Community Group remains open to foster discussion and ideas for future features. See the Community Group page and the Working Group page for more.

These are just some of the recent developments and as you can see, now is a prime time for those wanting to influence and guide new standards that will affect video on the web worldwide.

by Daniel Davis at April 09, 2015 12:23 PM

April 08, 2015

W3C Blog

W3C Interview: Capital One and Tyfone on Tokenization for Web Payments

W3C’s Web Payments Interest Group is gaining momentum in its pursuit of the integration of payments into the Open Web Platform. As part of building understanding security, and the role of the Web, I am organizing a series of interviews on Web payments. In this first interview with Tom Poole and Drew Jacobs of Capital One, and Siva Narendra of Tyfone, we open with the broad question of what the Web needs to facilitate eCommerce.

Ian Jacobs (IJ): Let’s jump right in. The Web is 25 and people have been engaging in commerce from the start. But smart phones, cryptocurrencies, and stories of stolen passwords and other sensitive information have created a lot of new activity around payments. What does the Web need so that we achieve the full potential of ecommerce?

Drew Jacobs (DJ): The Web is clearly one of the most prominent channels for payments today. But there are gaps and pain points across the value chain, from the consumer to the merchant to the financial institution. The reality today is that the process for online purchases today can be convoluted, often requiring a lengthy checkout process where you provide a lot of information without a lot of security. Over the years solutions have emerged such as Amazon¹s One Click to reduce friction at checkout. Other capabilities in market today attempt to solve the challenge of convenience and security, but no clear winners have emerged. We see a lot of opportunities in the payments ecosystem for improving the lives of customers, and giving them new tools for payments.

DJ: For example, online credit card transactions today leverage static data and are therefore vulnerable. We are seeing a move toward tokenization.

Editor’s note on tokenization: In eCommerce today, consumers typically provide sensitive information such as credit card numbers directly to merchants. Many in industry see tokenization as a promising secure alternative. As an example of tokenization: when I want to pay for something with a credit card issued by my bank, the bank can send me an electronic token that I give to the merchant. To the merchant the token is a meaningless sequence of numbers, but it can be “redeemed” at my bank as part of settling payment. This indirection through a secure token benefits the consumer, who keeps control of sensitive information, and also merchants, whose liability is reduced in the case of attacks on their IT system.

DJ: But tokenization should not be a separate process from other forms of payments, we need a cohesive solution across channels. I want to add that there are also many new opportunities to provide customers with richer experiences by leveraging data available to merchants or to banks.

Siva Narendra (SN): I agree with Drew that we need to improve security. Today, ecommerce is a relatively small proportion of the world’s overall commerce; I believe around 7%. But there is a fraud rate of about .9% for ecommerce while it is .09% for other forms of transactions. So the fraud rate for ecommerce is 10 times what it is for non-ecommerce. There are a number of reasons for this, including the fact that passwords are not very effective. Tokenization, as Drew mentioned, is an important path for the future. But securely authenticating the right user is being provisioned the right token is necessary, otherwise criminals can steal tokens, too. When you have a secure element on your phone that can be used for authentication, for example, your fraud footprint decreases significantly. Tools provided by the Fido Alliance are useful, but they do not yet take into account tokenization standards already in use. I also agree with Drew that leveraging data can be powerful, but I think consumer privacy protection is going to be a big issue, and there may be pushback if users start to receive a lot of annoying alerts.

DJ: Our data can enrich the experience or make it more efficient, but I agree with Siva about the importance of privacy.

IJ: It sounds like security should be our first focus.

DJ: Yes, we need to secure both authentication of the user and then the transaction itself. But we also need to simplify payments. Today there are many disjointed ways to pay: PayPal, Credit Cards, wallets from Apple, Google, etc. There’s already a proliferation of checkout bugs for payment instruments on Web sites. Introducing more is not the answer.

Tom Poole (TP): The more checkout bugs, the less likely any particular one will be noticed. There are three different levels where payments could be improved. The first involves adding support for secure storage of information, such as via a browser plug-in. An open standard would enable multiple providers of such plugins (and of course, browsers might provide their own solutions). The next level up is the “white label container” like Softcard that could provide consistency for payment scheme providers, but still allow for innovation. The third layer would be to build on something like Apple Pay, but that would mean very little differentiation and a single vendor would drive the normalization of payments. But I don’t think many people want to invest in that sort of centralized solution. Rather, I think they will want to differentiate by building better or smoother experiences. W3C should focus on core service and leave a lot of room and levers for innovation.

IJ: Ok, so within the first two levels, where should W3C focus?

TP: Identify what is the narrowest service that must be delivered. One will involve delivery of payments credentials. As Drew and Siva have mentioned, tokenization will play an important role, and the browser (or plugin) would securely store credentials. Over time, we could see direct fill of information in the background, which could further increase security by reducing attacks like keylogging. Ultimately it would be create to standardize the interactions with merchant checkouts, so that sites have a common, secure approach that works across browsers, speeding up transactions, and enabling financial institutions opportunities to add value.

SN: I agree we should start by securing authentication and the transaction. W3C is working on a Web Crypto API that gives developers access to cryptographic operations from JavaScript. I think there’s an assumption in the browser community today that the only token that browsers will support is FIDO Alliance-based. But I think we need greater interoperability. We do need to be looking at secure elements, but chips in phones are not the only way to achieve that. There is a large existing infrastructure for security and we need to extend those capabilities to the Web to achieve scale and success.

IJ: What are the most important bits of infrastructure that you think we should be leveraging for the Web? What should we be connecting to?

SN and DJ simultaneously: Tokenization!

SN: There is no better security than hardware in someone’s hands. Browsers should allow integration of the security modules available through these portable devices. These security models might be running in the cloud, as software on a device, or in a secure element in hardware. For example, browsers might provide access to secure element via a password (and the risk and reward will be different according to application).

DJ: I agree we should be connecting to existing infrastructure, especially around tokenization, and that there should not be a single solution for all applications. Different needs will drive different solutions.

SN: In my view, anything other than tokenization to secure the transaction is probably not going to be acceptable to the financial industry. We also want to see convergence between ecommerce and in-store purchases.

IJ: What will that require?

SN: Building on top of existing tokenization standards will also facilitate convergence. There are questions about liability between physical and ecommerce transactions, but there are well-understood rules of engagement between banks.

DJ: The key point for us is that W3C has a unique opportunity to provide underlying infrastructure standards that leverage existing work around tokenization. That is the biggest pain point for us today: tokenization doesn’t exist easily online, and we need greater security online. We think browsers can play a role in bringing this together. We also see opportunities around improved authentication and identification of the real user.

IJ: What will be the biggest benefits to banks if we can do this?

DJ: Improved security, both in terms of user perception and also protection of assets. And merchants and banks will both benefit from lower rates of shopping cart abandonment if we¹re able to build an infrastructure that helps reduce the friction that exists in today¹s checkout experience.

IJ: Thank you all for your time!

by Ian Jacobs at April 08, 2015 05:12 PM

March 30, 2015

W3C Blog

Linked Data Platform WG Open Meeting

A special open meeting of the W3C Linked Data Platform (LDP) Working Group to discuss potential future work for the group. The deliverable from the workshop will be a report that the LDP WG will take into consideration as it plans its way forward.

LDP offers an alternative vision to data lockdown, providing a clean separation between software and data, so access to the data is simple and always available. If you run a business, using LDP means your vital data isn’t locked out of your reach anymore. Instead, every LDP data server can be accessed using a standard RESTful API, and every LDP-based application can be integrated. If you develop software, LDP gives you a chance to focus on delivering value while respecting your customer’s overall needs. If you are an end user, LDP software promises to give you choice and freedom in the new online world.

So how will this vision become reality? LDP 1.0 has recently become a W3C Recommendation, but there’s still a lot of work to do. Come join the conversation about where we are and what happens next, on April 21st in San Francisco.

See the event wiki page for details.

by Andrei Sambra at March 30, 2015 07:48 PM

February 23, 2015

W3C Blog

W3C, Mobile World Congress 2015 and you

The past year has seen been one of significant milestones in W3C with the highlight being that HTML5 became a Recommendation in late October of 2014. We have seen the Open Web Platform continuing to have an impact on a diverse set of Industries. Within W3C this is manifest by the continued growth of our Digital Publishing Interest Group and the launch of both our Web Payments Interest Group and Automotive Working Group. When you look at the hottest thing going on across Industries you have to recognize the Internet of Things as a movement that has gained an incredible amount of traction. This work within W3C is happening in our Web of Things Interest Group. When you look at the Exhibitors and Attendees as well as the Sessions being held during Mobile World Congress 2015 there is a huge overlap in these conversations. In fact, many of the GSMA Members that also Members of W3C are active participants in one or more of these groups – shouldn’t you be?

W3C will be at Mobile World Congress 2015 but we will not have a booth this year. We will be represented by Dominique Hazael-Massieux, our Mobile Guru, and J. Alan Bird, our Global Business Development Leader. They will be on-site at Fira Gran Via in Barcelona, Spain from the afternoon of Monday, 02 March 2015 through Thursday, 05 March 2015. If you are in any of the industries that are being touched by the Open Web Platform we’d love to have a conversation with you. If you’re not in one of the Industries mentioned above but are curious about W3C, we’d also love to have a conversation with you. The best way to schedule that is to send an e-mail to abird@w3.org.

We look forward to seeing you in Barcelona and working together to help the Web reach its Full Potential!

by J. Alan Bird at February 23, 2015 07:04 PM

February 06, 2015

W3C Blog

W3C Updates General Document License

W3C announced today an update to liberalize its general document license. The updated license —applied today to all documents the W3C has published under its general document license— permits the creation of derivative works not for use as technical specifications, and the excerpting of Code Components under the W3C Software License.

When writing Recommendations, we want to encourage contribution toward and implementation of standards. We also want to encourage consistent implementation of standards and limit the likelihood of confusion or non-interoperability from divergent versions of a single specification. The updated license works to balance these concerns. Accordingly, this update facilitates the re-use of code, including in packages licensed under the GNU GPL. It also grants clear permissions to enable those documenting software, writing guides and tutorials, and implementing specifications to use excerpts of W3C documents as authoritative source material. The copyright license does not permit the modification of W3C documents to create competing technical specifications.

This license update stems from numerous discussions in the Patents & Standards Interest Group, from experimentation in other groups such as the Second Screen Working Group and the HTML Working Group, and discussion of re-licensing of unfinished specifications. Recognizing that this change may not satisfy all users or use cases, we will continue to seek consensus on options that meet even more needs.

by Wendy Seltzer at February 06, 2015 09:45 PM

January 30, 2015

W3C Blog

This week: W3C TAG on Securing the Web, Web Of Things Interest Group, YouTube defaults to HTML5 video, etc.

This is the 23-30 January 2015 edition of a “weekly digest of W3C news and trends” that I prepare for the W3C Membership and public-w3c-digest mailing list (publicly archived). This digest aggregates information about W3C and W3C technology from online media — a snapshot of how W3C and its work is perceived in online media.

This is the last edition for a while. I hope you have all learned useful information via this digest.

W3C and HTML5 related Twitter trends

[What was tweeted frequently, or caught my attention. Most recent first]

Other news

W3C in the Press (or blogs)

10 articles since the 23-Jan digest; a selection follows. You may read all articles in our Press Clippings page.

by Coralie Mercier at January 30, 2015 04:10 PM

W3C and NETmundial

On 24 December 2014, the Inaugural Coordination Council of the NETmundial Initiative was announced.

The NMI is being set up to implement the Internet Governance NETmundial Principles and, according to their recent declaration, will start with a phase of community discussions on their exact charter and roles.

Jean-François Abramatic, former W3C Chairman and currently W3C Fellow seconded by Inria was selected (by the NMI Organizing Partners, CGI.br – Brazilian Internet Steering Committee, ICANN and the WEF - World Economic Forum) to be part of the Council.

The W3C Team, with advice from the W3C Advisory Board, has prepared a W3C and NETmundial Initiative FAQ to give our community some context on this selection.


by Daniel Dardailler at January 30, 2015 08:13 AM

January 29, 2015

ishida >> blog

Bopomofo on the Web

Three bopomofo letters with tone mark.

Light tone mark in annotation.

A key issue for handling of bopomofo (zhùyīn fúhào) is the placement of tone marks. When bopomofo text runs vertically (either on its own, or as a phonetic annotation), some smarts are needed to display tone marks in the right place. This may also be required (though with different rules) for bopomofo when used horizontally for phonetic annotations (ie. above a base character), but not in all such cases. However, when bopomofo is written horizontally in any other situation (ie. when not written above a base character), the tone mark typically follows the last bopomofo letter in the syllable, with no special handling.

From time to time questions are raised on W3C mailing lists about how to implement phonetic annotations in bopomofo. Participants in these discussions need a good understanding of the various complexities of bopomofo rendering.

To help with that, I just uploaded a new Web page Bopomofo on the Web. The aim is to provide background information, and carry useful ideas from one discussion to the next. I also add some personal thoughts on implementation alternatives, given current data.

I intend to update the page from time to time, as new information becomes available.

by r12a at January 29, 2015 12:07 PM

January 23, 2015

W3C Blog

This week: W3C WoT initiative, Accessibility Research, Cory Doctorow Rejoins EFF, etc.

This is the 16-23 January 2015 edition of a “weekly digest of W3C news and trends” that I prepare for the W3C Membership and public-w3c-digest mailing list (publicly archived). This digest aggregates information about W3C and W3C technology from online media —a snapshot of how W3C and its work is perceived in online media.

W3C and HTML5 related Twitter trends

[What was tweeted frequently, or caught my attention. Most recent first]

Other news

W3C in the Press (or blogs)

4 articles since the 16-Jan digest; a selection follows. You may read all articles in our Press Clippings page.

by Coralie Mercier at January 23, 2015 02:19 PM

January 18, 2015

ishida >> blog

Bengali picker & character & script notes updated

Screen Shot 2015-01-18 at 07.42.56

Version 16 of the Bengali character picker is now available.

Other than a small rearrangement of the selection table, and the significant standard features that version 16 brings, this version adds the following:

  • three new buttons for automatic transcription between latin and bengali. You can use these buttons to transcribe to and from latin transcriptions using ISO 15919 or Radice approaches.
  • hinting to help identify similar characters.
  • the ability to select the base character for the display of combining characters in the selection table.

For more information about the picker, see the notes at the bottom of the picker page.

In addition, I made a number of additions and changes to Bengali script notes (an overview of the Bengali script), and Bengali character notes (an annotated list of characters in the Bengali script).

About pickers: Pickers allow you to quickly create phrases in a script by clicking on Unicode characters arranged in a way that aids their identification. Pickers are likely to be most useful if you don’t know a script well enough to use the native keyboard. The arrangement of characters also makes it much more usable than a regular character map utility. See the list of available pickers.

by r12a at January 18, 2015 08:10 AM

January 16, 2015

W3C Blog

This week: W3C TAG election, HTML5 Japanese CG, W3C in figures (2014), etc.

This is the 9-16 January 2015 edition of a “weekly digest of W3C news and trends” that I prepare for the W3C Membership and public-w3c-digest mailing list (publicly archived). This digest aggregates information about W3C and W3C technology from online media —a snapshot of how W3C and its work is perceived in online media.

W3C and HTML5 related Twitter trends

[What was tweeted frequently, or caught my attention. Most recent first]

Net Neutrality & Open Web

  • n/a

W3C in the Press (or blogs)

4 articles since the 9-Jan digest; see one below. You may read all articles in our Press Clippings page.

by Coralie Mercier at January 16, 2015 03:00 PM

January 13, 2015

ishida >> blog

Initial letter styling in CSS


The CSS WG needs advice on initial letter styling in non-Latin scripts, ie. enlarged letters or syllables at the start of a paragraph like those shown in the picture. Most of the current content of the recently published Working Draft, CSS Inline Layout Module Level 3 is about styling of initial letters, but the editors need to ensure that they have covered the needs of users of non-Latin scripts.

The spec currently describes drop, sunken and raised initial characters, and allows you to manipulate them using the initial-letter and the initial-letter-align properties. You can apply those properties to text selected by ::first-letter, or to the first child of a block (such as a span).

The editors are looking for

any examples of drop initials in non-western scripts, especially Arabic and Indic scripts.

I have scanned some examples from newspapers (so, not high quality print).

In the section about initial-letter-align the spec says:

Input from those knowledgeable about non-Western typographic traditions would be very helpful in describing the appropriate alignments. More values may be required for this property.

Do you have detailed information about initial letter styling in a non-Latin script that you can contribute? If so, please write to www-style@w3.org (how to subscribe).

by r12a at January 13, 2015 12:13 PM

January 12, 2015

W3C Blog

2014 in figures

Infographic showing numbers for W3C activity in 2014

Text only version

Notes & assumptions

  • Average orbital height of the International Space Station is estimated at 415 km (258 miles).
  • The world’s largest hypostyle hall is the temple of Amon-Re at Karnak, in Egypt.
  • Sumo wrestler weight is 149 kg (328 lb), calculated as the average of the current three yokozuna (grand champions) — Hakuho, Harumafuji and Kakuryu.
  • Emails are assumed to be individually printed on A4 paper (80 g/m2).
  • Airliner capacity is based on Boeing 737-800 planes with 189 seats in a single-class layout.
  • Thank you to Openclipart for being a great source of SVG images.

by Daniel Davis at January 12, 2015 06:00 AM

January 09, 2015

W3C Blog

Last week: W3C and OGC to work on Spatial Data on the Web, WAI Tutorials, W3Training, etc.

This is the 2-9 January 2015 edition -after a hiatus on 19 December 2014- of a “weekly digest of W3C news and trends” that I prepare for the W3C Membership and public-w3c-digest mailing list (publicly archived). This digest aggregates information about W3C and W3C technology from online media —a snapshot of how W3C and its work is perceived in online media.

W3C and HTML5 related Twitter trends

[What was tweeted frequently, or caught my attention. Most recent first]

Net Neutrality

  • Ars Technica: Title II for Internet providers is all but confirmed by FCC chairmanFederal Communications Commission (FCC) Chairman Tom Wheeler implied that Title II of the Communications Act will be the basis for new net neutrality rules governing the broadband industry. […] proposed rules […] will be circulated within the Commission on February 5 and voted on on February 26.

W3C in the Press (or blogs)

5 articles since the last digest; a selection follows. You may read all articles in our Press Clippings page.

by Coralie Mercier at January 09, 2015 04:41 PM

January 06, 2015

ishida >> blog

The Combining Character Conundrum

I’m struggling to show combining characters on a page in a consistent way across browsers.

For example, while laying out my pickers, I want users to be able to click on a representation of a character to add it to the output field. In the past I resorted to pictures of the characters, but now that webfonts are available, I want to replace those with font glyphs. (That makes for much smaller and more flexible pages.)

Take the Bengali picker that I’m currently working on. I’d like to end up with something like this:


I put a no-break space before each combining character, to give it some width, and because that’s what the Unicode Standard recommends (p60, Exhibiting Nonspacing Marks in Isolation). The result is close to what I was looking for in Chrome and Safari except that you can see a gap for the nbsp to the left.


But in IE and Firefox I get this:


This is especially problematic since it messes up the overall layout, but in some cases it also causes text to overlap.

I tried using a dotted circle Unicode character, instead of the no-break space. On Firefox this looked ok, but on Chrome it resulted in two dotted circles per combining character.

I considered using a consonant as the base character. It would work ok, but it would possibly widen the overall space needed (not ideal) and would make it harder to spot a combining character by shape. I tried putting a span around the base character to grey it out, but the various browsers reacted differently to the span. Vowel signs that appear on both sides of the base character no longer worked – the vowel sign appeared after. In other cases, the grey of the base character was inherited by the whole grapheme, regardless of the fact that the combining character was outside the span. (Here are some examples ে and ো.)

In the end, I settled for no preceding base character at all. The combining character was the first thing in the table cell or span that surrounded it. This gave the desired result for the font I had been using, albeit that I needed to tweak the occasional character with padding to move it slightly to the right.

On the other hand, this was not to be a complete solution either. Whereas most of the fonts I planned to use produce the dotted circle in these conditions, one of my favourites (SolaimanLipi) doesn’t produce it. This leads to significant problems, since many combining characters appear far to the left, and in some cases it is not possible to click on them, in others you have to locate a blank space somewhere to the right and click on that. Not at all satisfactory.


I couldn’t find a better way to solve the problem, however, and since there were several Bengali fonts to choose from that did produce dotted circles, I settled for that as the best of a bad lot.

However, then i turned my attention to other pickers and tried the same solution. I found that only one of the many Thai fonts I tried for the Thai picker produced the dotted circles. So the approach here would have to be different. For Khmer, the main Windows font (Daunpenh) produced dotted circles only for some of the combining characters in Internet Explorer. And on Chrome, a sequence of two combining characters, one after the other, produced two dotted circles…

I suspect that I’ll need to choose an approach for each picker based on what fonts are available, and perhaps provide an option to insert or remove base characters before combining characters when someone wants to use a different font.

It would be nice to standardise behaviour here, and to do so in a way that involves the no-break space, as described in the Unicode Standard, or some other base character such as – why not? – the dotted circle itself. I assume that the fix for this would have to be handled by the browser, since there are already many font cats out of the bag.
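As a rough illustration of the base-character idea, here is a small Python sketch (my own, not code from the pickers) that prepends a dotted circle to isolated combining marks, so that even fonts which don’t synthesise one still show something visible and clickable. The function name and the choice of U+25CC as the base are assumptions for the example.

```python
import unicodedata

DOTTED_CIRCLE = "\u25CC"  # ◌, the conventional stand-in base character

def with_display_base(ch, base=DOTTED_CIRCLE):
    """Prepend a visible base character to combining marks (general
    categories Mn, Mc, Me) so they render as a self-contained grapheme;
    leave everything else untouched."""
    if unicodedata.category(ch) in ("Mn", "Mc", "Me"):
        return base + ch
    return ch

print(with_display_base("\u09C7"))  # Bengali vowel sign E, shown on a dotted circle
print(with_display_base("\u0995"))  # Bengali letter KA, unchanged
```

A real picker would of course also need to strip the extra base character back out of any text the user copies from the selection table.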

Does anyone have an alternative solution? I thought I heard someone at the last Unicode conference mention some way of controlling the behaviour of dotted circles via some script or font setting…?

Update: See Marc Durdin’s blog for more on this topic, and his experiences while trying to design on-screen keyboards for Lao and other scripts.

by r12a at January 06, 2015 05:28 PM

January 05, 2015

ishida >> blog

Khmer character picker v16


I have uploaded a new version of the Khmer character picker.

The new version uses characters instead of images for the selection table, making it faster to load and more flexible. If you prefer, you can still access the previous version.

Other than a small rearrangement of the default selection table to accommodate fonts rather than images, and the significant standard features that version 16 brings, there are no additional changes in this version.

For more information about the picker, see the notes at the bottom of the picker page.

About pickers: Pickers allow you to quickly create phrases in a script by clicking on Unicode characters arranged in a way that aids their identification. Pickers are likely to be most useful if you don’t know a script well enough to use the native keyboard. The arrangement of characters also makes a picker much more usable than a regular character map utility. See the list of available pickers.

by r12a at January 05, 2015 10:12 AM

Devanagari, Gurmukhi & Uighur pickers available




I have updated the Devanagari picker, the Gurmukhi picker and the Uighur picker to version 16.

You may have spotted a previous, unannounced, version of the Devanagari and Uighur pickers on the site, but essentially these versions should be treated as new. The Gurmukhi picker has been updated from a very old version.

In addition to the standard features that version 16 of the character pickers brings, things to note include the addition of hints for all pickers and, for the Devanagari picker, automated transcription from Devanagari to ISO 15919 and vice versa.

For more information about the pickers, see the notes at the bottom of the relevant picker page.


by r12a at January 05, 2015 09:45 AM

January 04, 2015

ishida >> blog

More picker changes: Version 16

A couple of posts ago I mentioned that I had updated the Thai picker to version 16. I have now updated a few more. For ease of reference, I will list here the main changes between version 16 pickers and previous versions back to version 12.

  • Fonts rather than graphics. The main selection table in version 12 used images to represent characters. These have now gone, in favour of fonts. Most pickers include a web font download to ensure that you will see the characters. This reduces the size and download time significantly when you open a picker. Other source code changes have reduced the size of the files even further, so that the main file is typically only a small fraction of the size it was in version 14.

    It is also now possible, in version 16, to change the font of the main selection table and the font size.

  • UI. The whole look and feel of the user interface has changed from version 14 onwards, and includes useful links and explanations off the top of the normal work space.

    In particular, the vertical menu, introduced in version 14, has been adjusted so that input features can be turned on and off independently, and new panels appear alongside the others, rather than toggling the view from one mode to another. So, for example, you can have hints and shape-based selectors turned on at the same time. When something is switched on, its label in the menu turns orange, and the full text of the option is followed by a check mark.

  • Transcription panels. Some pickers had one or more transcription views in versions below 16. These enable you to construct some non-Latin text when working from a Latin transcription. In version 16 these alternate views are converted to panels that can be displayed at the same time as other information. They can be shown or hidden from the vertical menu. When there is ambiguity as to which characters to use, a pop up displays alternatives. Click on one to insert it into the output. There is also a panel containing non-ASCII Latin characters, which can be used when typing Latin transcriptions directly into the main output area. This panel is now hidden by default, but can be easily shown from the vertical menu.

  • Automated transcription. Version 16 pickers carry forward, and in some cases add, automated transcription converters. In some cases these are intended to generate only an approximation to the needed transcription, in order to speed up the transcription process. In other cases, they are complete. (See the notes for the picker to tell which is which.) Where there is ambiguity about how to transcribe a sequence of characters, the interface offers you a choice of alternatives. Just click on the character you want and it will replace all the options proposed. In some cases, particularly for South-East Asian scripts, the text you want to transcribe has to be split into syllables first, using spaces and/or hyphens. Where this is necessary, a condense button is provided to quickly strip out the separators after the transcription is done.

  • Layout. The default layout of the main selection table has usually been improved, to make it easier to locate characters. Rarely used, deprecated, etc. characters appear below the main table, rather than to the right.

  • Hints. Very early versions of the pickers used to automatically highlight similar and easily confusable characters when you hovered over a character in the main selection table. This feature has been reintroduced as standard for version 16 pickers. It can be turned on or off from the vertical menu, and is very helpful for people who don’t know the script well.

  • Shape-based selection. In previous versions the shape-based view replaced the default view. In version 16 the shape selectors appear below the main selection table and highlight the characters in that table. This arrangement has several advantages.

  • Applying actions to ranges of text. When clicking on the Codepoints and Escapes buttons, it is possible to apply the action to a highlighted range of characters, rather than all the characters in the output area. It is also possible to transcribe only highlighted text, when using one of the automated transcription features.

  • Phoneme bank. When composing text from a Latin transcription in previous versions you had to make choices about phonetics. Those choices were stored on the UI to speed up generation of phonetic transcriptions in addition to the native text, but this feature somewhat complicated the development and use of the transcription feature. It has been dropped in version 16. Hopefully, the transcription panels and automated transcription features will be useful enough in future.

  • Font grid. The font grid view was removed in version 16. It is of little value when the characters are already displayed using fonts.

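To make the Codepoints and Escapes actions mentioned above a little more concrete, here is a minimal Python sketch of the two conversions. The function names are mine and the pickers themselves are written in JavaScript, so this is just a model of the behaviour, not the pickers’ actual code.

```python
def to_codepoints(text):
    """Render text as space-separated U+XXXX values, as a
    Codepoints action might do for a highlighted range."""
    return " ".join(f"U+{ord(c):04X}" for c in text)

def to_html_escapes(text):
    """Render text as hexadecimal numeric character references,
    one common form of escape."""
    return "".join(f"&#x{ord(c):x};" for c in text)

sample = "\u0995\u09CD\u09B7"  # Bengali conjunct: KA + VIRAMA + SSA
print(to_codepoints(sample))    # U+0995 U+09CD U+09B7
print(to_html_escapes(sample))  # &#x995;&#x9cd;&#x9b7;
```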

by r12a at January 04, 2015 12:53 PM

December 26, 2014

ishida >> blog

Language Subtag Lookup tool updated

This update to the Language Subtag Lookup tool brings back the Check function that had been out of action since last January. The code had to be pretty much completely rewritten to migrate it from the original PHP. In the process, I added support for extension and private use tags, and added several more checks. I also made various changes to the way the results are displayed.

Give it a try with this rather complicated, but valid language tag: zh-cmn-latn-CN-pinyin-fonipa-u-co-phonebk-x-mytag-yourtag

Or try this rather badly conceived language tag, to see some error messages: mena-fr-latn-fonipa-biske-x-mylongtag-x-shorter

The IANA database information is up-to-date. The tool currently supports the IANA Subtag registry of 2014-12-17. It reports subtags for 8,081 languages, 228 extlangs, 174 scripts, 301 regions, 68 variants, and 26 grandfathered subtags.
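For readers curious about what the Check function has to untangle, here is a deliberately simplified Python sketch of splitting a language tag into subtags by their syntactic shape alone. It is illustrative only, and entirely my own: it ignores extension singletons other than x, ignores grandfathered tags, and performs none of the IANA registry lookups that the real tool does.

```python
def split_tag(tag):
    """Classify the subtags of a language tag by length and character
    class, roughly following the BCP 47 syntax rules. Simplified for
    illustration: no extensions, no grandfathered tags, no registry."""
    parts = tag.split("-")
    out = {"language": parts[0].lower(), "extlangs": [], "script": None,
           "region": None, "variants": [], "private": []}
    rest = iter(parts[1:])
    for p in rest:
        if p.lower() == "x":
            # private use: everything after "x-" belongs here
            out["private"] = list(rest)
        elif len(p) == 3 and p.isalpha() and out["script"] is None \
                and out["region"] is None and not out["variants"]:
            out["extlangs"].append(p.lower())
        elif len(p) == 4 and p.isalpha() and out["script"] is None:
            out["script"] = p.title()
        elif out["region"] is None and (
                (len(p) == 2 and p.isalpha()) or (len(p) == 3 and p.isdigit())):
            out["region"] = p.upper()
        else:
            out["variants"].append(p.lower())
    return out

print(split_tag("zh-cmn-Latn-CN-pinyin"))
```

Even this toy version shows why validation is fiddly: whether a three-letter subtag is an extlang or something else depends on what has come before it, which is one reason the real Check function works against the registry rather than syntax alone.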

by r12a at December 26, 2014 08:11 AM