Web Fonts Working Group Teleconference -- 04 Nov 2019

<Vlad> scribenick: Garret

Vlad: want to look at the open action items.

Vlad: 212, 213, 214 all belong to Myles.

Myles: been making good progress but don't have anything to present.

Myles: first is interacting with the model the Google team is creating.

Myles: the other two are research based, making progress but nothing to present.

Vlad: when would you have something?

Myles: couple of weeks

Vlad: how about Dec. 2nd?

Myles: sounds good

<myles> https://www.w3.org/Fonts/WG/track/actions/open

<myles> Garret: I've got a basic implementation of patch and subset client and server. You can simulate a session and have a font get enriched over time. Next goal is to publish to W3C's repo that Chris made. We also want to run an analysis against it, which would allow Myles to plug in his stuff too.

<myles> Vlad: Was there any update on the status of the subsetter?

<myles> Garret: Yes! Lots of updates. We've been working on the various layout types. The latest one is GSUB type 6.

<myles> Vlad: Do they have their own repo?

<myles> Garret: It's on Harfbuzz's open source repo.

https://github.com/harfbuzz/harfbuzz

<myles> Vlad: Thank you.

<myles> Garret: No progress on the analysis framework, but I'd like to start that this week. Patch and subset is partly implemented. Enough that we can start doing basic testing. No progress on current state of the art transfer methods. The December 2nd date would be fine for revisiting these to see progress.

<myles> Vlad: *cautiously* okay....

Vlad: had early discussion about using Monotype's fonts for testing.
... want to create a test user account + what fonts are needed.

<myles> Vlad: I had a discussion internally to give access to our font stack in use. The question that was asked was, what if we gave a test user account, which fonts would be granted. This would need to be not just <missed> fonts but also downloaded fonts.

Vlad: would need to be testable not only as web fonts but as downloaded fonts as well. For example for Myles's approach.

Myles: working on two pieces, the compression does need the font data, and the sorting only needs to have info on glyph sizes.
... if you couldn't provide the font data we could get part way there with cmap, loca.
... if there's a lot of composite glyphs that's an issue, but if there's not it's fine.

Vlad: that depends on the type of font, for example some CJK files will have lots of composites.

Myles: in that case if the glyphs can be flattened that would work.
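A minimal sketch of the point about getting glyph sizes from cmap and loca alone, not taken from the call. It assumes the raw 'loca' bytes, the glyph count, and a codepoint-to-glyph-ID mapping are already available; the function names are hypothetical. As noted in the discussion, the size read this way for a composite glyph covers only the composite record, not the components it references, so the numbers are only reliable for fonts with few composites or with flattened glyphs.

```python
import struct


def parse_loca(data, num_glyphs, long_format):
    """Return the num_glyphs + 1 byte offsets stored in a 'loca' table."""
    count = num_glyphs + 1
    if long_format:
        # Long format: offsets are stored directly as uint32.
        return list(struct.unpack(f">{count}L", data[:count * 4]))
    # Short format: each uint16 holds the real offset divided by 2.
    return [2 * v for v in struct.unpack(f">{count}H", data[:count * 2])]


def glyph_sizes(offsets):
    """Size of glyph i in 'glyf' is the gap between consecutive offsets."""
    return [end - start for start, end in zip(offsets, offsets[1:])]


def codepoint_sizes(cmap, sizes):
    """Map each codepoint to the byte size of its glyph (composites undercount)."""
    return {cp: sizes[gid] for cp, gid in cmap.items() if gid < len(sizes)}


# Hypothetical three-glyph font in short 'loca' format.
loca_bytes = struct.pack(">4H", 0, 10, 10, 25)
offsets = parse_loca(loca_bytes, num_glyphs=3, long_format=False)
sizes = glyph_sizes(offsets)
per_codepoint = codepoint_sizes({0x41: 0, 0x42: 2}, sizes)
```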

Vlad: still figuring out what we can share.

Myles: would be interested in hearing what the script breakdown for the available fonts is.
... primarily interested in CJK since they are the biggest.

Vlad: CJK files are most likely to have the largest # of glyphs as composites.
... updated all AIs to Dec. 2nd and that will likely be the date of our next call.
... unless something else comes up that needs discussion.
... next agenda item is web page corpus.

Myles: been doing research, to know if a font is sorted well we need a corpus.
... been a long time since the Google corpus has been presented.
... we'll need some stuff in the mean time to do work for my research.
... my research is corpus agnostic. Can plug in other corpuses. I used a web crawler to grab a few thousand pages.
... would likely need more.
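A minimal sketch of how a page corpus could feed a sorting experiment, not taken from the call and not Myles's actual method. It assumes the corpus is simply a list of page texts and ranks codepoints by how many pages use them; the function names are hypothetical.

```python
from collections import Counter


def codepoint_page_counts(pages):
    """Count, for each character, how many pages in the corpus contain it."""
    counts = Counter()
    for text in pages:
        counts.update(set(text))  # each character counts once per page
    return counts


def frequency_order(pages):
    """Codepoints sorted by descending page count, ties broken by codepoint."""
    counts = codepoint_page_counts(pages)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], ord(kv[0])))
    return [ord(ch) for ch, _ in ranked]


# Hypothetical three-page corpus.
order = frequency_order(["ab", "bc", "b"])
```

Because the input is just a list of strings, a different corpus (a crawl, a Wikipedia dump, an archive sample) can be swapped in without changing the analysis code.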

<chris> yes, we certainly should

Myles: two questions, should we investigate a corpus beyond the Google one?

Vlad: yes I think we should, not sure how we can get there.

Myles: one way is to use Wikipedia, or run a web crawler as the W3C, or third use the Internet Archive.
... don't want perfect to be enemy of the good. Should pursue some idea.
... Wikipedia publishes dumps of the whole site.
... don't think we have to operate at a huge scale. Was able to gather a few thousand pages manually.
... doesn't need to be up to date. If our content is stale that's fine.

Garret: suggest looking at HTTP Archive.

Chris: may have a heavy English focus.

Persa: another piece that's going to be interesting, beyond length, is how the content shows up in different orders. Wikipedia is mostly static. In our implementation highly dynamic pages are particularly interesting.
... don't know if we want to also capture that in the analysis.

Myles: we should, that should be a stated goal.

Garret: can hopefully simulate dynamic pages by breaking static ones into pieces.

Vlad: some of the Yahoo publications have comment sections.
... looking at the sequence of comments you can recreate the dynamic nature of it.
... Yahoo is an example, but any page with comment sections gives us a clue how it was modified over time.
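A minimal sketch of the idea of simulating a dynamic page by breaking a static one into pieces, not taken from the call. It assumes a page is plain text, that pieces arrive in order, and that the client only requests codepoints it has not already received; the function names are hypothetical and this is not the patch and subset implementation mentioned earlier.

```python
def split_into_pieces(text, num_pieces):
    """Split text into consecutive pieces of roughly equal length."""
    size = max(1, -(-len(text) // num_pieces))  # ceiling division
    return [text[i:i + size] for i in range(0, len(text), size)]


def simulate_session(pieces):
    """For each piece, return the codepoints the client has not yet seen."""
    seen = set()
    requests = []
    for piece in pieces:
        needed = {ord(ch) for ch in piece} - seen
        requests.append(needed)
        seen |= needed
    return requests


# Hypothetical static page treated as three successive updates.
pieces = split_into_pieces("abcabd", 3)
requests = simulate_session(pieces)
```

The same loop would work on a sequence of comments ordered by posting time, with each comment treated as one piece.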

Myles: I can do an investigation on this, but already have 3 action items so other volunteers?
... Ok guess add it to my queue and give me some extra time to deliver it.

Jason: I'm happy to help if I can.

Myles: would you like to drive or just assist me?

Jason: probably don't know enough to drive.

Vlad: will add action item for Myles.

action Myles investigate the creation of a data corpus for use in the analysis.

<trackbot> Created ACTION-220 - Investigate the creation of a data corpus for use in the analysis. [on Myles Maxfield - due 2019-11-11].

Vlad: due date?

Myles: after Dec. 2nd.
... last Monday I work in this year is the 16th. So either the 16th or after that.

Vlad: Ok will set 16th, we can push if needed.
... that was the last topic.

<myles> https://github.com/litherum

Garret: github username is garretrieger

<Vlad> my github username is vlevantovsky

<Persa_Zula_> my github username is pzula

- DRAFT -
