13:52:37 RRSAgent has joined #poiwg 13:52:37 logging to http://www.w3.org/2011/09/08-poiwg-irc 13:52:39 RRSAgent, make logs public 13:52:39 Zakim has joined #poiwg 13:52:41 Zakim, this will be UW_POI 13:52:41 ok, trackbot; I see UW_POI(POIWG)10:00AM scheduled to start in 8 minutes 13:52:42 Meeting: Points of Interest Working Group Teleconference 13:52:42 Date: 08 September 2011 13:52:46 zakim, ping me in 6 13:52:46 ok, matt 13:56:25 hey matt 13:57:01 great thanks...you? 13:58:46 matt, you asked to be pinged at this time 13:58:51 zakim, thanks 13:58:51 you are very welcome, matt 13:59:02 sure...lets talk later 14:00:12 zakim, dial matt-voip 14:00:12 ok, matt; the call is being made 14:00:13 UW_POI(POIWG)10:00AM has now started 14:00:15 +Matt 14:01:09 +??P11 14:01:09 cperey has joined #poiwg 14:01:11 +??P14 14:01:26 zakim, ??p11 is ahill2 14:01:30 +ahill2; got it 14:01:31 zakim, ??p14 is robman 14:01:32 +robman; got it 14:01:44 ahill2 has joined #poiwg 14:02:14 + +1.617.848.aaaa 14:02:29 848.aaaa is Christine 14:02:31 zakim, aaaa is cperey 14:02:35 +cperey; got it 14:03:42 rsingh2 has joined #poiwg 14:04:03 + +1.212.939.aabb 14:04:25 zakim, aabb is Henning 14:04:25 +Henning; got it 14:05:29 + +1.617.764.aacc 14:05:34 called BuildAR 14:05:38 zakim, aacc is rsingh2 14:05:38 +rsingh2; got it 14:07:20 zakim, who is on the phone? 14:07:20 On the phone I see Matt, ahill2, robman, cperey, Henning, rsingh2 14:07:23 danbri has joined #poiwg 14:08:23 zakim, mute me 14:08:23 Matt should now be muted 14:08:25 Scribe: Matt 14:08:40 Topic: Identifying and Categorizing POIs, presented by Henning Schulzrinne 14:09:02 -> http://lists.w3.org/Archives/Member/member-poiwg/2011Sep/att-0005/poi-urn.pdf Slides 14:09:10 Henning: pulled some use cases out of the draft 14:09:14 zakim, who is noisy? 14:09:25 matt, listening for 10 seconds I heard sound from the following: ahill2 (15%) 14:09:32 Henning: The ones I pulled out seemed to be about finding categories of things. 
14:09:37 zakim, mute ahill2 temporarily 14:09:37 ahill2 should now be muted 14:09:48 zakim, who is noisy? 14:09:52 ahill2 should now be unmuted again 14:09:59 matt, listening for 11 seconds I heard sound from the following: 6 (77%) 14:10:35 Henning: How can we divide the millions of POIs into manageable categories for searching? 14:10:45 Henning: Some problems we won't be solving: 14:11:22 Henning: Properties of POIs vs category, e.g. "restaurant that takes credit cards" isn't a category. 14:12:29 restaurant a favorite category :-) 14:12:31 danbri has joined #poiwg 14:12:37 Henning: The distinction between what is or isn't a category is somewhat arbitrary. You can claim that "Italian Restaurant" could be a category, or it could be a cuisine attribute of a restaurant. 14:12:54 Henning: It's fuzzy, and one has to make pragmatic choices about what users typically expect. 14:13:27 Henning: The two characteristics I see defining categories are: 1. ?? and 2. that categories are not interchangeable 14:13:59 Henning: A "gas station" and a "restaurant" are not interchangeable, but even this gets tricky, e.g. "synagogue" and "Christian Church". 14:14:35 Henning: North American Industry Classification System (NAICS) is one standard, the only one I've noticed. Based on Census. 14:15:13 Henning: Very much comes out of industry, designed for classical manufacturing type industries, i.e. "this establishment produces cutlery" 14:15:23 Henning: It struggles today with identifying services. 14:16:59 Henning: While it's a fine example of categories, it isn't what we'd want to use though. Restaurants for instance have just two classifications: full-service and limited service. 14:17:38 Henning: That may be okay, but is somewhat limiting, and not really what I'd expect users to care about. 14:18:06 Henning: Great from a statistical perspective though. 
14:18:20 Henning: Many of the things you'd want to look up aren't in the system at all 14:18:21 zakim, mute me 14:18:24 Matt was already muted, matt 14:18:54 Henning: I tried some things that are common from GPS POIs and they don't appear at all in NAICS, e.g. ATMs, wifi hotspots, monuments, etc. 14:19:27 zakim,unmute me 14:19:27 Matt should no longer be muted 14:19:32 zakim,mute me 14:19:32 Matt should now be muted 14:20:23 Henning: One alternative is to say we've got Google, just use free-text. That works, and is probably better than many alternatives, but free text is also hard to translate into other languages, and the same service has many names. 14:20:41 Henning: e.g. ATM, cash machine, automated teller machine, etc 14:21:36 Henning: Then there are also things like distinguishing between a diner, a café, coffee shop, etc. McDonalds calls itself a restaurant, but most of us would think of it as another term. 14:21:41 McDonalds =? Restaurant? 14:21:51 Henning: So you might get lots of things that you wouldn't expect showing up on a list. 14:22:10 Henning: There are also properties, such as "public" or "university", of a library. 14:22:27 Henning: and hierarchy is missing too: French vs French Restaurant. 14:22:50 Henning: Another option is map overlay labels. These are usually used to label topographical features and not services, like restaurants, ATMs. 14:22:58 Henning: GPS POIs are much more consumer applicable. 14:23:28 Henning: As far as I can tell, there isn't any standardization for POI labels, at best informally standardized. I haven't confirmed with every vendor though. 14:24:11 Henning: Some of the categories are a bit odd too, sometimes very broad: all government services labeled as one, even if it's everything from libraries to prisons, but sometimes libraries are separate etc. It's inconsistent and not clear if it's official or just made up. 14:24:20 Henning: And lastly, the Yellow Pages labels. 
14:24:53 Henning: I've heard every region does their own thing, and labels are often picked in a way that businesses would appear in as many places as possible, rather than where users might expect them. 14:25:30 Henning: Yellow Pages wouldn't contain bathrooms and ATMs or other things without a phone number. 14:25:50 Henning: Coming from another direction, we had a similar problem when we started redesigning the emergency calling system in the US. 14:26:24 Henning: One of the big problems is that every country pretty much has, for historical reasons, used a different digit pattern. 14:27:27 Henning: Confusing scenario where in some cases a number is used generically for all services, i.e. 911 (excepting poison control), but other countries have different numbers for police, fire, ambulance, etc. So, there was no hope of a standard for a number identifier. 14:27:54 Henning: We started on something different, RFC 5031. It's a URN for services. urn:sos 14:28:06 Henning: They're extensible via IANA. 14:28:22 -> http://tools.ietf.org/html/rfc5031 RFC 5031 14:28:41 s/sos/service:sos/ 14:28:48 Henning: These are internal to the system, used for call routing. 14:29:10 Henning: Allows devices to have an entry that gets to the right people. 14:29:44 Henning: NG911 and NG112 have been moving in this direction. For 911 it seems to be accepted and getting deployment. 14:30:12 Henning: We determined that this was extensible to other services too. 14:30:45 Henning: N11 services -- reserved 3 digit numbers that start with a digit other than 0 or 1 and end in 11, e.g. 211, 311 14:31:41 611 is used by the cell company for their information 14:32:07 Henning: In that same spirit, we explored extending it to non-communication services. 14:33:38 Henning: Designed for consumer use, things that we can label, e.g. "food", "fuel", "business", "communication" (wifi hotspots, internet café). Designed not to have thousands of categories, but still similar to a GPS POI finder. 
14:34:19 Henning: 13 top level categories that then have further detail within. 14:34:31 Henning: e.g. transportation.airport 14:34:52 Henning: So, remaining issues and to-dos: 14:35:23 Henning: Are there systems out there already? Can we extend it without breaking? Maintained in some way that there is coherence in the labeling? If not, and we go forward with the URN model, would we register these things? With IANA? 14:35:52 Henning: IANA would be just a database, would we sub-delegate that? 14:35:58 Henning: How would maintenance be done on that? 14:36:23 Henning: How is it maintained? I'm partial to something like the Olson time zone database model. 14:37:08 Henning: Not an official group, or a government thing. It used to just be one person, but now there's a mailing list with consensus process. 14:37:28 Henning: It will be maintained on a long term basis through IANA. 14:37:44 -> http://tools.ietf.org/html/draft-lear-iana-timezone-database-02.html IANA Procedures for Maintaining the Timezone Database 14:38:25 Henning: So that's my brief summary of what I'm trying to do. 14:38:56 Henning: So two things: is there obviously related work that someone else has done? Or if not is there a group of people that might be interested in forming a nucleus of an organization to do this? 14:39:15 Henning: If people take it up, yes, great, if not the harm is fairly limited. 14:39:19 zakim, unmute me 14:39:19 Matt should no longer be muted 14:39:39 robman: Two questions: how would URNs support non-English? 14:39:48 Henning: Are you familiar with IETF i18n document? 14:40:33 Henning: This falls in the category of protocol label, not meant for human consumption. Each label would be called something different in each language, but as a protocol label for routing and querying is that you should stick to one language. 14:41:00 Henning: At the IETF, it's been English. There's nothing inherently in the labels to prevent other languages, modulo the i18n URN problems. 
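The split Henning describes, a single-language protocol label for routing and querying plus a per-language display name, can be sketched as follows. This is a minimal illustration under stated assumptions, not the scheme from the slides: the `urn:service:` prefix and dot-separated labels follow RFC 5031 as recalled here, `urn:service:transportation.airport` is assumed from the "transportation.airport" example above, and the `DISPLAY` table with its translations is entirely hypothetical.

```python
import re

# Shape of one service label as recalled from RFC 5031: letters and digits,
# optional inner hyphens, at most 27 characters. Treat this as an assumption.
_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,25}[A-Za-z0-9])?")


def parse_service_urn(urn):
    """Split 'urn:service:a.b' into ['a', 'b']; raise ValueError if malformed."""
    parts = urn.split(":", 2)
    if len(parts) != 3 or parts[0].lower() != "urn" or parts[1].lower() != "service":
        raise ValueError("not a service URN: %r" % urn)
    # Lowercasing for comparison is a choice made in this sketch.
    labels = parts[2].lower().split(".")
    for label in labels:
        if not _LABEL.fullmatch(label):
            raise ValueError("bad service label %r in %r" % (label, urn))
    return labels


# Hypothetical localization table: protocol label path -> display names.
DISPLAY = {
    ("transportation",): {"en": "Transportation", "de": "Verkehr"},
    ("transportation", "airport"): {"en": "Airport", "de": "Flughafen"},
    ("sos", "fire"): {"en": "Fire brigade", "de": "Feuerwehr"},
}


def display_label(urn, lang):
    """Return a display name for the URN, falling back to parent categories."""
    labels = tuple(parse_service_urn(urn))
    while labels:
        names = DISPLAY.get(labels)
        if names and lang in names:
            return names[lang]
        labels = labels[:-1]  # try the broader category
    return urn  # no translation known: show the raw protocol label


print(display_label("urn:service:transportation.airport", "de"))
```

The URN itself never changes with the user's language; only the lookup table does, which is why an application vendor could ship its own table without any change to the labels used on the wire.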
14:41:39 robman: Because it's much more into the meaning than normal labels, it'd have labels that mean things just in certain cultures. 14:42:19 Henning: Right, if there was a category that doesn't have a good label in English, it's still plausible. It's not intended to preclude anything, but with it being a protocol label there's less concern over it at the moment. 14:42:54 robman: This is very top down. What if people could create random labels? And if they're not used, they'll die off or be redundant anyway? 14:43:51 Henning: There's likely not going to be a way to solve the language labeling problem. I'm not opposed to the free text model. Because of the need for i18n, and because it's not used as a search term, that I have some doubt that free text will be successful. 14:44:46 [[I was just thinking of schema.org given this conversation, and now I see: http://www.schema.org/Place]] 14:45:03 Henning: I'm not thinking the urn would be typed into a search for instance. 14:46:21 Henning: Right now we don't have network databases, but these static ones that vendors maintain. 14:46:31 robman: I think the international bridging point you made is quite a good argument for it. 14:46:49 ahill2: How do you see this translation between labels and URNs happening? 14:47:25 ahill2: I can imagine a world where Google says "these are the categorizations" and translate them to URNs, but for an ordinary individual, what services are available to them to translate their terms and searches into these categories? 14:48:23 Henning: I imagine the software built by say a tourist app or a GPS vendor, would in turn, depending on the appropriate interface, would build some subset of these labels into their system and translate and expose them appropriately. 14:50:57 ahill2: You think this translation between common labels and URNs happens ad-hoc? No central database of any sort? 
14:51:26 Henning: I hadn't thought that far ahead, but if we were more ambitious, there would be a description for each term, geared towards localization. 14:51:41 Henning: Nothing prevents that if we get enough people together. 14:53:04 ahill2: Isn't that what is happening today with Google? Isn't it the translator today? 14:53:56 ahill2: If this knowledge is being crowd sourced somewhere, Google or OSM, we should use that. 14:54:03 Henning: I was unimpressed by OSM process. 14:55:40 ahill2: I'm envisioning a search where these urns are available, e.g. every result has a category. That would be neat, because then the search would be the translator between free text and labels and urns. 14:56:17 Henning: If someone could do that I'd be delighted. 14:56:21 http://en.wikipedia.org/wiki/Web_Ontology_Language 14:56:32 robman: What about these taxonomy communities that are working in their own domains, like OWL. 14:56:42 robman: It's not just locations we're talking about, but we're cross domain here. 14:57:04 Henning: Yes, that's part of why I was asking for pointers to these communities. Once we get into the ontology side more, that would be helpful. 14:58:03 Henning: I'm unsure that the type of work, if it's property attributes, etc, if it's directly applicable, or if sub-pieces of that can be pulled out. We don't, from a POI perspective, want a complete ontology that crosses categories and properties. 14:58:48 Henning: If there are communities we should know about, please let me know. We looked about a year ago into this, building a system than could combine ontologies, e.g. find a specific movie and dinner with a cuisine type. We didn't find anything then, but we might have looked in the wrong place. 14:59:17 ahill2: Can we remind ourselves of some of the other categorization efforts that we discussed. I believe Library of Congress was discussed and a number of them had URLs involved. 15:00:09 Henning: Any pointers you have, please pass along. 
One difference between identifying specific objects and categorization is one-to-one vs one-to-many. 15:00:14 zakim, unmute me 15:00:14 Matt was not muted, matt 15:00:50 Raj: Geonames 15:00:56 ahill2: Does that do categorization? 15:01:12 rsingh2: Yes, but they're just categorizing places, not business classifications. 15:01:26 rsingh2: They started with the USGS classification system, but theirs is a much smaller problem than ours. 15:01:44 karls has joined #poiwg 15:01:48 Henning: Looking at geonames, they've got postal codes as the lowest level I see. 15:01:50 rrsagent, draft minutes 15:01:50 I have made the request to generate http://www.w3.org/2011/09/08-poiwg-minutes.html matt 15:01:54 rrsagent, make logs public 15:02:06 Henning: School, post office, cemetery, etc. Not sure how many features they have. 15:02:12 rsingh2: That's right out of USGS. 15:02:18 hi 15:02:40 ahill2: What did we propose to use? 15:02:56 Henning: All I know is from the doc: NAICS. 15:03:44 rsingh2: I think coming up with a single country classification scheme is easy, but what's harder is a POI system like for AT&T, where they want you to search for say where to get phone cards. 15:03:48 + +1.312.894.aadd 15:03:52 rsingh2: That's at one type of business in the USA, but another type in other countries. 15:04:22 rsingh2: Reconciling that between countries is very difficult. 15:04:33 karls: There's a ton of work on brand binding and chain binding to help that work. 15:04:43 karls: That side-steps classification though. 15:05:06 karls: Using NAICS is mostly for information exchange, most of the time these are hand tuned by the app devs. Many schemes are app specific. 15:05:18 karls: The low-level standards are just used for hand-off so people can do mappings. 15:05:28 Henning: That's what I've seen as well. 
15:06:04 karls: It's useful to carry around NAICS codes in terms of the spec, as our spec is about exchanging information, but in terms of customer facing stuff, it's pretty open ended. Our model should be we'll support the structure, but you make it up. 15:06:15 rsingh2: My instinct is similar, we're not ready to tackle that in version 1. 15:06:20 zakim, aadd is rsingh2 15:06:20 +rsingh2; got it 15:06:34 zakim, rsingh2 is karls 15:06:34 +karls; got it 15:06:38 zakim, who is on the phone? 15:06:38 On the phone I see Matt, ahill2, robman, cperey, Henning, karls, rsingh2.a 15:07:01 rsingh2: We might be overstepping the bounds of what innovative developers would build. 15:07:21 karls: Typically these systems were done for handhelds, or constrained environments. I think search trumps all though. 15:07:33 karls: The conversation at Microsoft/Nokia/NavTEQ is do we care about categorization anymore? 15:08:14 Henning: The search experience is good from large providers, but it requires a fair amount of user skill to get what you want. Looking at Restaurant, you have two things like Google maps, but also specific ones like Urban Spoon. 15:08:25 Henning: There's more relevant hits in the latter. 15:08:58 karls: Here's what I see: one end there are proprietary category systems, on the other there's web page crawling for open ended search. 15:09:17 karls: In between, you've got a lot of POI gazetteers who are doing meta tagging, as it facilitates parametric search. 15:09:42 karls: The middle ground is the tagging. I thought the spec addressed that capability to open endedly do the metatagging. 15:09:56 ahill2: Can you elucidate on that a bit more karls? 15:10:11 karls: Take a service like Open Table, where they have restaurant categories and sub-categories. 15:10:55 +1 to link based structured data 8) 15:10:56 karls: You're not going to get that information out of scraping a web page. 
That information is best consumed by an application by OT if a POI has a pre-set, open-ended list of terms that describe it well. It's tantamount to the meta tag on HTML pages. 15:11:12 danbri has joined #poiwg 15:11:39 karls: Gazetteers are doing field ops, web scraping, crowd sourcing, etc, to distill down to ten or twenty keywords that are the most descriptive to put in the POI. 15:11:53 parametric search = faceted search 15:11:54 http://en.wikipedia.org/wiki/Faceted_search 15:11:56 karls: Typically the app tier puts a parametric search on top of that: hours, beer, etc. 15:12:17 ahill2: We're talking about somewhere between category only search and free text. 15:12:36 karls: You could argue that it's all categories or parameters, e.g. 24 hour restaurant could be a category or a property. 15:12:49 rsingh2: The popular term would be faceted search. 15:13:14 -karls 15:13:15 Henning: Close, but not quite, you might have things like types of credit cards accepted, and it might be labels drawn from a set, or specific information that isn't categorized: e.g. open hours. 15:13:18 I'm late for another call. Bye all. 15:13:52 robman: That's why we were thinking open ended links, because it is so closely tied to the users' mind space when they search. 15:14:02 robman: If we approach it as a categorization problem we have to approach it differently. 15:14:19 Henning: I think I differ on that. If you look at OT, they do do categorization, they do much better than just crowd source tagging. 15:14:44 karls: I think what we want to do is to be able to have OT exchange their POIs outside their business sphere. 15:15:07 karls: So, we want to make sure the spec can support rich and proprietary tagging, without defining the facets ourselves. 15:15:31 Henning: Why not some of the facets? I think I've demonstrated that some are viable. 15:16:03 ahill2: One of the things we've been careful about is making sure that there are multiple categorizations that could apply to a POI. 
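The middle ground discussed above, a POI that carries categories from more than one scheme plus an open-ended set of descriptive tags, with a parametric (faceted) search layered on top by the application, can be sketched as follows. This is only an illustration: the `Poi` structure, the scheme names `urn-draft` and `vendor`, and all labels and tags are hypothetical and are not taken from the draft spec.

```python
from dataclasses import dataclass, field


@dataclass
class Poi:
    name: str
    # scheme name -> dot-separated hierarchical label (hypothetical schemes)
    categories: dict = field(default_factory=dict)
    # open-ended descriptive keywords supplied by the data provider
    tags: set = field(default_factory=set)


def matches(poi, category=None, tags=()):
    """True if the POI is in the category (or a sub-category) and has all tags."""
    if category is not None:
        scheme, wanted = category
        label = poi.categories.get(scheme, "")
        if label != wanted and not label.startswith(wanted + "."):
            return False
    return set(tags) <= poi.tags


def search(pois, category=None, tags=()):
    """Names of the POIs that satisfy every requested facet."""
    return [p.name for p in pois if matches(p, category, tags)]


pois = [
    Poi("Luigi's", {"urn-draft": "food.restaurant", "vendor": "dining.italian"},
        {"credit-cards", "beer"}),
    Poi("All Night Diner", {"urn-draft": "food.restaurant"}, {"24-hours"}),
    Poi("Corner Fuel", {"urn-draft": "fuel"}, {"24-hours", "credit-cards"}),
]

print(search(pois, category=("urn-draft", "food")))
print(search(pois, tags=["24-hours"]))
print(search(pois, category=("urn-draft", "food"), tags=["credit-cards"]))
```

The category facet and the tag facet are independent here, which mirrors karls's point that something like "24 hour restaurant" could be modeled either way; the structure carries both without the exchange format having to define the vocabulary.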
15:16:23 Henning: It could have multiple category schemes too. 15:18:10 ahill2: In your proposal, are you open to the idea that NAICS ends up adding some of these categorizations that are facets as opposed to routing to a specific business? 15:18:43 ahill2: That is: if there were a number of different categorizations that a business has, would NAICS be the appropriate place to build up a category? 15:19:18 danbri has joined #poiwg 15:19:24 Henning: I'm not part of NAICS, but given that they're part of the census, I imagine they wouldn't be looking at these properties. I can't say what they should do, but my perception is that their mission is industry classification statistics. 15:19:39 Henning: e.g. how many people work in fast food restaurants, rather than say what credit cards they take. 15:19:50 danbri has joined #poiwg 15:19:51 karls: They're also missing juicy POIs, like golf courses, transit stops, etc. 15:20:14 Henning: Yes, so far it seems outside their mission of what they're doing. 15:21:08 ahill2: Sorry, I think I asked the question wrong. In your URN proposal, would you see those categories, which are facets outside of a category, as being appropriate, e.g. hours, or all the way down to the kind of information from crowd sourcing? 15:21:13 ahill2: Where do you draw the line? 15:22:23 Henning: A URN to my mind is not as suitable for these non-categorization models. You've identified some binary things, but many are not easily represented in the same fashion. That said, we have separately, and I didn't talk about it here as it's preliminary, in the system we built, the ability to retrieve an XML type document with suitable tags that have that information. 15:22:53 Henning: We could envision that being useful for us to agree on labeling to enable exchange. 
15:23:41 thanks, that answers my question 15:23:43 Henning: There's an opportunity there, didn't discuss it here, and it's to some extent orthogonal, but there's a need for that as well, maybe industry specific bodies, which might be in a position to do that more appropriately. 15:24:25 zakim, who is on the phone? 15:24:25 On the phone I see Matt, ahill2, robman, cperey, Henning, rsingh2.a 15:24:49 Henning: I look forward to the mailing list conversation. 15:25:14 cperey: As for next steps, Matt will publish the minutes of the meeting. It's almost a transcript. 15:25:46 cperey: He publishes that as a URL, it becomes archives for the group. That gets it out to a larger audience, but after that it's kind of up to this group. We're having our F2F in two weeks. 15:26:01 cperey: We should work on this at the F2F and followup with actions from that. 15:26:13 zakim, dial matt-voip 15:26:13 ok, matt; the call is being made 15:26:15 +Matt.a 15:26:21 -Matt 15:26:42 Henning: There's no dependency here, so that's fine. 15:27:06 Henning: Right now, I don't even see it as appropriate to include it in the doc, as it's not specific to this effort. But, I would like to look for a community of interest to take it to the next level of specificity. 15:27:21 rrsagent, draft minutes 15:27:21 I have made the request to generate http://www.w3.org/2011/09/08-poiwg-minutes.html matt 15:27:32 Henning: I'm not asking the WG to take on this particular task, it's probably outside the immediate scope. 15:28:31 -cperey 15:28:33 -rsingh2.a 15:28:33 -Henning 15:28:33 matt: Could be a CG perhaps? POI WG decided not to do this. 15:28:37 matt: Thank you! 15:28:45 -robman 15:28:46 Henning: Thanks, and thanks to Christine for arranging this. 15:28:55 zakim, who is on the phone? 
15:28:55 On the phone I see ahill2, Matt.a 15:29:00 rrsagent, draft minutes 15:29:00 I have made the request to generate http://www.w3.org/2011/09/08-poiwg-minutes.html matt 15:31:04 hey matt - just ping me an email if you'd like my help next week with integrating some of the linked data approach to the spec 15:34:06 -Matt.a 15:34:58 zakim, drop ahill2 15:34:58 ahill2 is being disconnected 15:34:59 UW_POI(POIWG)10:00AM has ended 15:35:01 Attendees were Matt, ahill2, robman, +1.617.848.aaaa, cperey, +1.212.939.aabb, Henning, +1.617.764.aacc, +1.312.894.aadd, karls, Matt.a 15:35:09 cool 15:35:20 thanks rob