07:51:36 <RRSAgent> RRSAgent has joined #mlwrome
07:51:36 <RRSAgent> logging to http://www.w3.org/2013/03/13-mlwrome-irc

07:51:50 <DomJones> meeting: mlw workshop rome, day 2
07:51:55 <philr> philr has joined #mlwrome
07:52:03 <DomJones> scribe: DomJones
07:52:14 <DomJones> rrsagent, make log public
07:52:30 <DomJones> topic: multilingual linked open data patterns
07:52:39 <DomJones> rrsagent, draft minutes
07:52:39 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html DomJones
07:52:47 <DomJones> chair: arle
07:53:23 <DomJones> agenda: http://www.multilingualweb.eu/en/documents/rome-workshop/rome-program
07:53:31 <DomJones> waiting for the meeting to start ...
07:53:47 <DomJones> present+ many, many, many, people
07:53:51 <DomJones> rrsagent, draft minutes
07:53:51 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html DomJones
07:57:16 <Arle> Arle has joined #mlwrome

07:58:07 <DomJones> Jose: best practices for MLOD given at last workshop. One pattern is a solution to a problem. Good to have catalog of patterns for selection. Common vocabs.
07:58:11 <daveL> daveL has joined #mlwrome
07:58:14 <fsasaki> fsasaki has joined #mlwrome
07:58:19 <DomJones> ... best solutions for Multilingual Linked Open Data
07:58:38 <DomJones> ... each pattern has description, example, discussion.
07:59:28 <DomJones> ... Patterns are grouped under naming, dereference, long descriptions, linking and reuse.
07:59:46 <daveL> www.weso.es/MLODPPatterns
07:59:51 <DomJones> ... 20 patterns, for community to add to and adapt
08:00:36 <DomJones> ... example: a person is Armenian and a professor at the University of Leon. The person has birthplace, position and worksat
08:01:23 <DomJones> ... 1st select a uri scheme. URI is human readable ASCII characters
08:02:04 <tadej> tadej has joined #mlwrome
08:02:15 <DomJones> ... another pattern is opaque URIs, where local names are not human readable. These are independent of natural language implementation
08:02:32 <DomJones> ... These are hard to handle by developers
08:02:58 <DomJones> ... So: descriptive URIs, opaque URIs and full IRIs
08:03:40 <DomJones> ... internationalised local names. Domain name is ASCII chars but local name is in local chars
08:04:04 <daveL> correction: http://www.weso.es/app/webroot/MLODPatterns/
08:04:14 <DomJones> ... another pattern is to include language tag in the URI
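
A minimal sketch of the URI patterns listed above (descriptive URIs, opaque URIs, internationalised local names, language tag in the URI). It is not taken from the speaker's catalog: the `example.org` namespace, the function names and the numeric identifier are assumptions made for illustration only.

```python
from urllib.parse import quote

# Hypothetical namespace, used only for illustration.
BASE = "http://example.org/resource/"


def descriptive_uri(label: str) -> str:
    """Descriptive URI: the local name is a human-readable ASCII label."""
    return BASE + label.replace(" ", "_")


def opaque_uri(numeric_id: int) -> str:
    """Opaque URI: the local name carries no natural-language meaning."""
    return f"{BASE}id{numeric_id}"


def iri_with_local_name(label: str) -> str:
    """Internationalised local name: ASCII domain, local name in local script."""
    return BASE + label.replace(" ", "_")


def iri_to_uri(iri: str) -> str:
    """Percent-encode non-ASCII characters (as UTF-8) so the IRI can be used as a URI."""
    return quote(iri, safe=":/#?&=")


def language_tagged_uri(label: str, lang: str) -> str:
    """Language tag in the URI: one path segment identifies the language."""
    return f"{BASE}{lang}/{label.replace(' ', '_')}"


if __name__ == "__main__":
    print(descriptive_uri("Armenia"))
    print(opaque_uri(23457))
    armenian = iri_with_local_name("Հայաստան")
    print(armenian)
    print(iri_to_uri(armenian))
    print(language_tagged_uri("Armenia", "en"))
```

The opaque form stays stable when labels change, at the cost of readability for developers; the IRI form reads naturally for local users but has to be percent-encoded wherever only ASCII URIs are accepted.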
08:04:17 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
08:05:23 <DomJones> ... Dereference: return labels based on language code of the user
08:05:33 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
08:05:43 <DomJones> ... semantic equiv of data needs to be identified.
08:06:31 <DomJones> ... Labelling - label everything, including using multilingual labels. Multilingual labels cause a problem when a query looks for monolingual labels.
08:06:46 <DomJones> ... solutions = labels with no lang tag
08:07:37 <DomJones> ... with this, which language is the default? Longer descriptions are difficult to handle, better to have finer grained descriptions to separate out labels.
08:08:26 <DomJones> ... for longer descriptions there is the possibility of structured literals.
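
One way to picture the labelling discussion above (labels returned by the user's language code, plus a label with no language tag as the default) is the following sketch. The label data and the function name `pick_label` are illustrative assumptions, not part of the pattern catalog.

```python
from typing import Iterable, List, Optional, Tuple

# (text, language tag) pairs for one resource; None means "no language tag".
# The values are illustrative only.
LABELS: List[Tuple[str, Optional[str]]] = [
    ("Armenia", "en"),
    ("Arménie", "fr"),
    ("Armenia", None),  # untagged label acting as the default
]


def pick_label(labels: Iterable[Tuple[str, Optional[str]]],
               preferred_langs: Iterable[str]) -> Optional[str]:
    """Return the first label matching the user's language preferences.

    Falls back from a regional tag ("fr-CA") to its primary subtag ("fr"),
    and finally to the untagged label, if there is one.
    """
    by_lang = {(lang.lower() if lang else None): text for text, lang in labels}
    for lang in preferred_langs:
        tag = lang.lower()
        if tag in by_lang:
            return by_lang[tag]
        primary = tag.split("-")[0]
        if primary in by_lang:
            return by_lang[primary]
    return by_lang.get(None)


if __name__ == "__main__":
    print(pick_label(LABELS, ["fr-CA", "en"]))  # matched through the "fr" subtag
    print(pick_label(LABELS, ["de"]))           # falls back to the untagged label
```

The untagged label answers monolingual queries, but as noted in the talk it leaves open which language that default is actually written in.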
08:08:47 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
08:09:48 <DomJones> ... linking same concepts in different languages which are identified as being the same. However contradictions exist. Link linguistic meta-data exists, 1st class lang annotations.
08:09:50 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
08:10:25 <DomJones> ... reuse: vocabs are generally monolingual. Multilingual vocabs are more difficult to maintain
08:10:53 <DomJones> ... can create new localised vocabs
08:11:25 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
08:11:41 <DomJones> ... future work - session on best practices for ML LOD, opportunity to improve catalog / add to / remove from catalog

08:12:23 <DomJones> topic: multilingualism in Linked Data
08:12:32 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
08:12:54 <DomJones> Asuncion: all ML concepts should be addressed in LD generation
08:13:11 <DomJones> ... model is simple, everything is in rdfs
08:13:55 <DomJones> ... subject, property, value. Unique identifiers, URIs are used. Subjects are represented by URIs.
08:14:02 <DomJones> ... using equiv links to link data sets
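
A small sketch of the model just described: statements as (subject, property, value) triples, with equivalence links joining data sets in different languages. The use of `owl:sameAs` as the equivalence property, the prefixed names and the data are assumptions for illustration; the talk did not name a specific property.

```python
from collections import defaultdict

SAME_AS = "owl:sameAs"   # assumed equivalence property
LABEL = "rdfs:label"

# Illustrative triples from three hypothetical data sets.
TRIPLES = [
    ("ex-en:Armenia", LABEL, ("Armenia", "en")),
    ("ex-es:Armenia", LABEL, ("Armenia", "es")),
    ("ex-hy:c1", LABEL, ("Հայաստան", "hy")),
    ("ex-en:Armenia", SAME_AS, "ex-es:Armenia"),
    ("ex-es:Armenia", SAME_AS, "ex-hy:c1"),
]


def equivalence_classes(triples):
    """Group subjects connected by equivalence links (simple union-find)."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for subject, prop, value in triples:
        find(subject)
        if prop == SAME_AS:
            parent[find(subject)] = find(value)

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return list(groups.values())


def labels_for(subject, triples):
    """Collect labels in every language from all resources equivalent to subject."""
    group = next(g for g in equivalence_classes(triples) if subject in g)
    return {
        value[1]: value[0]
        for s, prop, value in triples
        if prop == LABEL and s in group
    }


if __name__ == "__main__":
    print(labels_for("ex-en:Armenia", TRIPLES))
```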
08:14:58 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
08:15:46 <DomJones> ... lots of info sources in different languages. RDF generation and linked data allows for graphical representation of ML LOD sets
08:16:02 <DomJones> ... currently looking at million literals data set
08:16:31 <DomJones> ... number of literals with language tags has increased from 2011 to 2012
08:17:32 <DomJones> ... still mostly in English. Figures for other languages are similar to each other. Most data is in English as not many countries are providing LD in languages other than English
08:18:09 <DomJones> ... in LD cloud ML queries are achieved through 6 stages.
08:19:01 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
08:19:29 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
08:20:37 <DomJones> ... 1) specification, how to model data sets. 2) Translate labels of ontology into other languages, align vocabs of other languages. Reuse / align existing vocabs. 3) RDF generation use richer models for applications
08:20:45 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
08:21:01 <tadej> tadej has joined #mlwrome
08:21:10 <DomJones> ... 4) link generation - how to discover cross lingual links - how to represent cross-lingual links - how to store and reuse links.
08:22:25 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
08:23:05 <DomJones> ... concepts are tagged in language-based ontologies; these ontologies are linked with cross-lingual links. Properties describe medicine
08:24:13 <DomJones> ... ontology in german and spanish, translate german into spanish and check for alignments or use cross-lingual-ontology matching across both
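
The "translate, then check for alignments" option can be sketched roughly as below. This is only an illustration of the idea: the concept identifiers, the labels and the tiny German-to-Spanish dictionary are made up, and a real system would use a translation service and a proper ontology matcher rather than exact label comparison.

```python
import unicodedata


def normalise(label: str) -> str:
    """Case-fold and strip diacritics so that label comparison is lenient."""
    decomposed = unicodedata.normalize("NFKD", label.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch)).strip()


def align(german_ontology, spanish_ontology, de_to_es):
    """Return (german_concept, spanish_concept) pairs whose labels match after translation."""
    spanish_index = {normalise(label): concept
                     for concept, label in spanish_ontology.items()}
    links = []
    for concept, label in german_ontology.items():
        translated = de_to_es.get(label)
        if translated is None:
            continue  # no translation available, so no candidate link
        match = spanish_index.get(normalise(translated))
        if match is not None:
            links.append((concept, match))
    return links


# Hypothetical data, for illustration only.
GERMAN = {"de:C1": "Arzneimittel", "de:C2": "Nebenwirkung"}
SPANISH = {"es:K7": "Medicamento", "es:K9": "Dosis"}
DE_TO_ES = {"Arzneimittel": "medicamento", "Nebenwirkung": "efecto secundario"}

if __name__ == "__main__":
    print(align(GERMAN, SPANISH, DE_TO_ES))
```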
08:24:27 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
08:24:55 <DomJones> ... 5) publication - links can be discovered at run time or offline, some storage method is needed for links already discovered.
08:25:42 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
08:25:57 <DomJones> ...  6) Exploitation how to adapt semantic query to linguistic and cultural background of a user. Also how should results of semantic query be adapted.
08:27:49 <DomJones> ... For ML LOD many services need to exist from generation through to consumption - ML LOD should be provided through service translation but now we should start including lang features in the generation of data

08:28:38 <DomJones> Topic: Public Linked Open Data
08:29:26 <DomJones> Peter: Large repository for public linked open data
08:29:48 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
08:30:32 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
08:30:42 <DomJones> ... Publications Office of the EU is the publisher for EU institutions: legislative and non-legislative documents. Whole process of document management. Finally moving from paper to electronic model and from publisher to data provider
08:31:04 <DomJones> ... shift from paper to elec makes the electronic version of EU journal legally binding.
08:32:02 <DomJones> ... Multilingualism is core, 23 languages used. Every EU member state requires publication in their own language. For example 2600 pages per document * 23 langs
08:32:36 <DomJones> ... ML supports all member states equally therefore ML public websites must exist. For Law, procurement, CORDIS and general publication bookshop.
08:32:53 <DomJones> ... Four systems for the ML semantic web
08:33:10 <DomJones> ... CELLAR, EU Data Portal, Eurovoc, MDR
08:33:56 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
08:34:58 <DomJones> ... 1) CELLAR is currently in production, not yet public, being loaded / populated. Some key concepts - repos is defined by a common data model (ontology). Semantic model is built up from these components. Loading is standardised, 30TB of data
08:35:25 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
08:35:55 <DomJones> ... in repos content is stored in top level, meta-data is linked to this. Distribution side and SPARQL end point.
08:36:22 <DomJones> ... 700M triples in the store. Mainly PDF, XML and XHTML.
08:36:39 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
08:36:59 <DomJones> ... accessible through RESTFul API or through SPARQL endpoint
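
As a rough illustration of what access through a SPARQL endpoint looks like for multilingual labels, the sketch below builds a query that filters labels by language and encodes it as a GET request URL. The endpoint address is a placeholder (CELLAR was described as not yet public), and the use of `skos:prefLabel` is an assumption, not a statement about the CELLAR data model.

```python
import re
from urllib.parse import urlencode

# Placeholder endpoint, for illustration only.
ENDPOINT = "http://example.org/sparql"

# Conservative check so the language tag cannot break out of the query string.
_LANG_TAG = re.compile(r"^[A-Za-z]{2,8}(-[A-Za-z0-9]{1,8})*$")


def label_query(lang: str, limit: int = 10) -> str:
    """Build a SPARQL query returning concept labels in one language."""
    if not _LANG_TAG.match(lang):
        raise ValueError(f"not a plausible language tag: {lang!r}")
    return (
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        "SELECT ?concept ?label WHERE {\n"
        "  ?concept skos:prefLabel ?label .\n"
        f'  FILTER langMatches(lang(?label), "{lang}")\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def query_url(endpoint: str, query: str) -> str:
    """Encode the query as the 'query' parameter of a GET request."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    print(label_query("fr"))
    print(query_url(ENDPOINT, label_query("fr")))
```

`langMatches` is used instead of a plain string comparison so that a request for "fr" also matches labels tagged with regional variants such as "fr-BE".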
08:37:47 <ivan> ivan has joined #mlwrome
08:38:25 <DomJones> ... 2) EU Data Portal. Single point of access to all structured data for linking and reuse of commercial and non-commercial data
08:38:25 <lmatteis_> hi
08:38:39 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
08:39:07 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
08:39:15 <DomJones> ... RDF based interface for upload of meta-data
08:39:45 <DomJones> ... 3) EUROVOC available in SKOS/RDF or XML format.
08:40:45 <DomJones> ... 4) Meta-data Registry (MDR) for concepts; once validated they are published through CELLAR. Controlled vocabs etc.
08:40:58 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
08:41:44 <DomJones> ... 5) For English all the languages of the EU are presented, translations are discussed between all units in the EU and therefore official translations (by member states) exist
08:42:10 <lmatteis> which presentator is this? i'm watching live and just opened the live stream?
08:42:26 <DomJones> ... European Legislation Identifier (ELI) follows W3C RDF / XML to provide data in standardised way.
08:43:14 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki

08:43:38 <fsasaki> topic: Multilingual Issues in the Representation of International Bibliographic Standards for the Semantic Web
08:43:40 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
08:44:08 <DomJones> Gordon: IFLA is the body which maintains global standards for the library and bibliographic environment.
08:44:52 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
08:45:04 <DomJones> ... Separate from IFLA are ISBD and UNIMARC; all three relate to library / bibliographic standards.
08:45:09 <DomJones> ... all three are used internationally.
08:45:56 <DomJones> ... IFLA has own namespace for standards. Supports conversion from library linked data without loss of information.
08:47:13 <DomJones> ... IFLA has 7 languages. Standards generally written in English and then translated into the 7 languages.
08:47:40 <DomJones> ... ML website launched in spanish, partial doc in spanish of what exists in spanish already.
08:47:47 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
08:48:33 <DomJones> ... Open meta data reg is used to store classes, URIs for each maintainers. These are Opaque as to avoid lang bias when used in RDF
08:48:53 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
08:49:44 <Monica> Monica has joined #mlwrome
08:50:07 <DomJones> ... ISBD elements - problems occurred when namespace was translated. Translation into spanish became guidelines for doing future translations. Contains much info on the problems / issues of translations.
08:50:13 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
08:50:30 <nwaltham> nwaltham has joined #mlwrome
08:51:29 <DomJones> ... Problems 1) scope: what is translated first and what is most useful (developers - element definitions, labels) (users - what they see, labels of concepts in value vocabs).
08:51:35 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
08:52:03 <DomJones> ... 2) Style: Verbal phrasing, CamelCasing etc
08:52:51 <DomJones> ... hasAuthor, hasTitle do not translate perfectly into other languages. CamelCasing looks bad in other languages, what's ok in one language may not work in another language.
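
To make the style problem concrete, here is a small sketch that derives a display label from a camel-cased property name such as hasAuthor. The derived text is an English verbal phrase, which is exactly what does not carry over to other languages, so the sketch only falls back to it when no translated label is supplied. The function names and the Spanish label are illustrative assumptions.

```python
import re


def camel_to_words(local_name: str) -> str:
    """Split an ASCII camel-cased name into lower-case words."""
    words = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", local_name)
    return " ".join(word.lower() for word in words)


def display_label(local_name: str, labels: dict, lang: str) -> str:
    """Prefer a human-written label in the requested language.

    The camel-case derivation is only a last resort and always yields
    an English-style phrase, whatever language was asked for.
    """
    return labels.get(lang) or camel_to_words(local_name)


if __name__ == "__main__":
    labels = {"es": "autor"}  # illustrative translated label
    print(display_label("hasAuthor", labels, "es"))
    print(display_label("hasAuthor", labels, "de"))  # no German label: English fallback
```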
08:53:03 <lmatteis> DomJones: are you a bot?
08:53:16 <lmatteis> fsasaki: can i ask questions to speakers from irc?
08:53:59 <DomJones> ... 3) Disambiguation: methods for creating labels may vary between languages.
08:54:05 <fsasaki> lmatteis, you can ask the question and I can relay your question in the q/a session - which starts 10:15
08:54:38 <lmatteis> ok thanks!
08:55:03 <DomJones> ... 4) Language inflection
08:56:16 <DomJones> ... Partial translations: only the preferred label is translated. Have to track status of translation through a number of stages; schedules and status tracking are required.
08:57:08 <DomJones> ... MulDiCat for authoritative translations of IFLA standards, available in the open meta-data registry as well. More than 26 langs represented.

08:57:48 <fsasaki> topic: Language Technology Tools for Supporting the Multilingual Web
08:57:52 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
08:57:59 <lmatteis> "What's the reason behind having 'opaque' URIs, and translating RDF predicates? They are merely identifiers, and as long as 'label' and 'definitions' have been properly translated, I see no reason of further complicating RDF vocabularies with multiple translations"
08:58:02 <lmatteis> fsasaki: that would be mine :)
08:58:13 <nwaltham_> nwaltham_ has joined #mlwrome
08:58:13 <fsasaki> ok
08:58:37 <lmatteis> maybe just the first part ;) if it's too long
08:58:49 <fsasaki> np, will be ok
08:59:19 <DomJones> Thierry: on the web there are ML pages, dictionaries, tools. Not every document is available in every language. When I access the web in German or French I don't often get docs in other languages. Monolingual search
09:00:14 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
09:00:59 <DomJones> ... semantic resources are already available on the web. We have ML web, pages, resources but we want the Sem Web to run in combination with lang tech so we can annotate text
09:02:15 <DomJones> ... GICS - classIDs, Labels, these labels use non-standard formats etc.
09:03:19 <DomJones> ... towards ML linguistic Semantic Web so labels can be encoded in RDF using Lemon model - also want to mention Linguistic Linked Open Data.
09:04:21 <DomJones> ... annotate text available in multiple languages: take all labels, analyse, combine in semantic repos using Lemon and apply to running annotated text. Can also be stored in a queryable tool.
09:04:45 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
09:04:59 <DomJones> ... in one ontology you display suggestion for ML labels encoded in ontology
09:06:14 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
09:06:35 <DomJones> ... NooJ can be used to test NLP analysis of labels; the difference is in the way natural language can be expressed
09:08:25 <DomJones> ... need to harmonise and modify a label for NLP. Terminological expansion of labels provides taxonomies for preferred labels. From 1 label, 5 labels can be generated, annotated using Lemon and exported
09:09:25 <DomJones> ... triggering of ellipsis resolution to cross-lingual labels in other languages. Labels are expanded based on property of another language.
09:10:16 <DomJones> ... From this we discovered semantic annotation of web documents in many languages.
09:11:02 <DomJones> ... Text from Spanish stock market: two similar taxonomies generate two annotations, both labels point to the same concept but are textually different.
09:11:52 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
09:12:00 <DomJones> ... labels can be displayed in many other languages and allows for annotations in higher level languages.
09:12:18 <DomJones> ... needs to make sure these are compliant in terms of standardisation.

09:12:34 <DomJones> Topic: Question And Answer
09:14:00 <fsasaki> question from lmatteis: "What's the reason behind having 'opaque' URIs, and translating RDF predicates? They are merely identifiers, and as long as 'label' and 'definitions' have been properly translated, I see no reason of further complicating RDF vocabularies with multiple translations"
09:14:09 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
09:14:37 <DomJones> Gordon: Several reasons for opaque URIs. 1) A non-opaque URI must be based on something, e.g. a label; if the label changes the URI can't change, so it becomes more confusing. 2) Favouring any language over another is not good practice. 3) When translating property and class labels we're using opaque URIs
09:14:55 <lmatteis> fsasaki: thanks!
09:15:07 <fsasaki> np
09:15:59 <DomJones> Ivan: Linked Data community doesn't know "anything" you guys are doing. Until the larger LD community is aware of your work I don't see anything changing. For devs to take ML LD into account they need to be aware of your work
09:16:21 <lmatteis> isn't Ivan the leader of w3c
09:16:34 <lmatteis> ?
09:17:01 <DomJones> Asuncion: Ontology-Lexical WG is being proposed to be used for representing. Big countries investing in LD are English speaking and are not immediately interested in ML LD.
09:17:19 <fsasaki> lmatteis, ivan is the semantic web activity lead at w3c
09:17:39 <DomJones> ... From SW perspective we need a road-map to push these ML issues. White Paper for community addressing these issues.
09:18:31 <DomJones> Ivan: W3C working groups are not suited for this. For example schema.org represents vocabs that are used, we can't ignore them. Need to try to get the authors of schema.org to think about ML data
09:18:47 <DomJones> ... labelling and documentation in ML form would be a huge step
09:18:50 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
09:19:03 <DomJones> Jose: I agree with your point, hence the catalog of patterns has been produced.
09:19:32 <DomJones> ... need to educate, hence best practices for ML LOD
09:19:35 <Monica> Monica has joined #mlwrome
09:20:35 <DomJones> Asuncion: Trying to analyse how languages are used and how these linguistic choices are applied to data sets
09:22:06 <DomJones> chaals: annotating other people's vocabs is socially difficult. Do opaque URIs avoid having a language bias? No, the bias exists in the model; opaque URIs hide this from the top level view. We should be publishing annotations on other people's vocabs that are broken
09:23:42 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
09:23:56 <DomJones> jose: in the case of annotating and translating the label that you want.
09:24:16 <nwaltham_> nwaltham_ has joined #mlwrome
09:24:16 <DomJones> ... labelsforall.info, similar to prefix.cc, for label and translation recording.
09:26:18 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
09:26:49 <DomJones> Q: Different communities, I second Ivan's views. In terms of ML-LOD cloud, when someone asks where is ML-ism? A URI is a resource that can be in many languages. Dimensions: Peter S talks of TBs of LOD. Many people talk in terms of one record. In ML-LOD 1) concept
09:27:51 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
09:28:14 <DomJones> r12a: Aim of workshop is not just for talks but to get people together networking to move things forward.
09:29:51 <DomJones> christian L: 2 things: 1) how far does the work you are driving / continuing affect the content authors, user cataloguing, etc.? 2) how far is the reviewing activity considered a general reviewer's toolkit?
09:30:33 <DomJones> Peter: no direct connections to author services, everything is translated, we're just proofreading. In being efficient we work with coded data and cataloguing.
09:31:49 <DomJones> thierry: our work has implications for labels and taxonomies. In terms of impact, it is important that we provide recommendations to change terminology to make it more applicable.
09:32:23 <DomJones> q: relation between work of the speakers and repositories like free-base?
09:34:07 <DomJones> Feiyu: instance of freebase can be used as a kind of interlingual can be really useful for ML-LOD
09:34:19 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:03:38 <philr> scribe: philr

10:03:41 <philr> topic: Users
10:03:51 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki

10:04:24 <philr> Pat Kane (Verisign)
10:04:39 <philr> Internationalized Domain Names
10:05:07 <nwaltham_> nwaltham_ has joined #mlwrome
10:05:07 <fsasaki> topic: Internationalized Domain Names - pat kane
10:05:16 <philr> ...focus on end users
10:05:17 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:05:36 <philr> ...Users want to use their own scripts
10:06:06 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:06:41 <philr> ...growth in Asia Pacific driving non-English domain names and urls
10:07:25 <philr> ...1m+ international domain names registered in first six months
10:07:25 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:07:40 <philr> ...50% CJK ideographs
10:08:10 <philr> ...Armenian scripts under-served
10:08:37 <philr> ...major browsers handle IDNs quite well
10:09:17 <philr> ...email addresses used a lot as identifiers in log in's
10:10:01 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:10:13 <philr> ...What's hindering domain registrations? greater user awareness, registrar's
10:10:37 <philr> ...better mobile browser support, management tools
10:11:32 <philr> ...results in a lack of trust (intent for a user to register)
10:12:13 <philr> ...users want full idn support
10:12:31 <philr> ...lack of ubiquity an issue
10:13:05 <philr> ...idn's are second class domains, users are suspicious of them
10:13:53 <philr> ...not comfortable with idn.ascii
10:14:21 <philr> ...SME's in China are more open to idn.idn
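
For readers unfamiliar with how an IDN travels through ASCII-only infrastructure, the sketch below converts a domain name to its ASCII ("xn--") form and back, and classifies a name as idn.idn, idn.ascii or ascii.ascii in the sense used above. It relies on Python's built-in "idna" codec, which implements the older IDNA 2003 rules; the helper names and example domains are assumptions for illustration.

```python
def to_ascii(domain: str) -> str:
    """Convert a Unicode domain name to its ASCII-compatible (Punycode) form."""
    return domain.encode("idna").decode("ascii")


def to_unicode(domain: str) -> str:
    """Convert an ASCII-compatible domain name back to Unicode."""
    return domain.encode("ascii").decode("idna")


def shape(domain: str) -> str:
    """Classify a name as 'idn.idn', 'idn.ascii', 'ascii.idn' or 'ascii.ascii'.

    The part before the last dot and the top-level domain are checked separately.
    """
    name, _, tld = domain.rpartition(".")

    def kind(part: str) -> str:
        return "ascii" if part.isascii() else "idn"

    return f"{kind(name)}.{kind(tld)}"


if __name__ == "__main__":
    print(to_ascii("bücher.example"))
    print(to_unicode("xn--bcher-kva.example"))
    print(shape("bücher.example"))
    print(shape("例え.テスト"))
```

The ASCII form is what applications and registries exchange; whether users ever see it depends on browser and mobile application support, which is the ubiquity issue raised in the talk.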
10:14:30 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:15:56 <philr> ...5 key insights: more utility needed, initial resistance to adoption, translation preferences,
10:16:10 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:17:11 <philr> ...moderate interest in registration and registrar channel expectations.
10:17:28 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:17:36 <philr> Chinese want idn.idn not idn.ascii
10:18:10 <philr> In India respondents do not visit idn.idn
10:18:36 <philr> In Japan comfortable with ascii.ascii
10:18:42 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:19:13 <philr> Korea more passionate about idn.idn
10:20:13 <philr> Need multi-disciplinary groups to push adoption
10:20:23 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:20:37 <labra> labra has joined #mlwrome
10:21:27 <philr> ...Key roles: Registries, Registrar's, content creators, application developers, Governments and businesses
10:21:36 <philr> ...and standards organisations
10:22:06 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:22:13 <philr> ...circle of dependency: adoption -- ubiquity
10:22:55 <philr> ...change ecosystem to enhance user experience
10:23:01 <philr> ...ubiquity drives trust
10:24:23 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:24:23 <philr> ...ubiquity means not just desktop but also mobile
10:24:39 <philr> ...mobile applications are much less capable of handling IDN's
10:26:05 <philr> next speaker, Richard Ishida who is a late change to the programme
10:26:20 <philr> topic: What's in a name?
10:26:40 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki

10:27:08 <philr> Richard Ishida concerned about data and data formats: specifically people's names
10:27:35 <philr> ...web sites usually ask for "first" and "last"name
10:28:11 <philr> ...Use "given" and "family" name
10:28:20 <philr> ...names are more complicated than we generally think
10:28:20 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:29:38 <philr> ...applications want to parse names and do things with them - e.g. in salutations, search and sorting
10:29:42 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:30:32 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:30:42 <philr> ...Björk's "surname" is actually her father's name
10:31:06 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:31:08 <philr> ..."bin" == son of
10:31:40 <philr> ...Mao Ze Dong - Ze == generational name
10:32:10 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:32:14 <philr> ...How you would address him depends on a lot of things
10:32:56 <philr> ...typically he would use a western name to make things easier for western people
10:33:26 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:33:35 <philr> ...multiple family names: given name plus two family names
10:34:21 <philr> ...father's name first, mother's name second - varies by country
10:34:43 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:34:59 <philr> ...Variant word forms indicating gender
10:35:13 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:36:14 <philr> ...how names are inherited varies
10:36:53 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:37:39 <philr> ...nicknames used often to help
10:37:39 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:38:01 <philr> ...written forms can be ambiguous
10:38:36 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:38:48 <philr> ...many asian names can be transcribed identically
10:39:30 <philr> Recommendation: ask people how they would like to be addressed
10:39:54 <philr> ...this topic needs a lot more work
10:40:06 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:40:35 <philr> ...need authoritative guidance on the problems of handling names
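
The recommendation above can be sketched as a data structure that stores the name as the person writes it and, separately, how they asked to be addressed, rather than parsing "first" and "last" fields. The class and field names are assumptions made for illustration, not a format proposed in the talk.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PersonName:
    # The name exactly as the person writes it: any script, any order of parts.
    full_name: str
    # What the person asked to be called, e.g. in a salutation.
    addressed_as: Optional[str] = None
    # Optional sort form supplied by the person or an editor, never guessed by parsing.
    sort_key: Optional[str] = None

    def salutation(self) -> str:
        """Use the requested form of address, falling back to the full name."""
        return f"Dear {self.addressed_as or self.full_name},"

    def sorting_value(self) -> str:
        """Value to sort on; falls back to the full name as written."""
        return (self.sort_key or self.full_name).casefold()


if __name__ == "__main__":
    people = [
        PersonName("Björk Guðmundsdóttir", addressed_as="Björk"),
        PersonName("Mao Ze Dong"),
    ]
    for person in people:
        print(person.salutation())
```

Nothing here tries to split a name into given and family parts, since the examples in the talk (patronymics, generational names, multiple family names) show that such parsing is unreliable.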
10:41:31 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki

10:42:07 <philr> next speaker, Sebastian Hellmann (another late programme change)
10:42:55 <philr> topic: LOD2 Stack and the NLP2RDF project
10:43:01 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:43:21 <philr> ...LOD == Linked Open Data cloud
10:43:57 <philr> ...http://lod-cloud.net data sets published on the net
10:44:31 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:44:31 <philr> ...free, open and open licensed
10:44:43 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:45:33 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:47:12 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:48:12 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:49:56 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:50:52 <fsasaki> sebastian going through the lod2 stack
10:51:26 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:52:18 <philr> philr has joined #mlwrome
10:52:18 <fsasaki> now about NIF format
10:52:23 <philr> scribe: philr
10:52:52 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:53:50 <philr> ...linguistic LOD cloud
10:54:31 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:55:17 <philr> ...in NIF use fragment identifiers to address primary data
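
A minimal sketch of addressing a span of primary text with a fragment identifier. It assumes a character-offset fragment of the form `#char=begin,end` in the style of RFC 5147; the exact fragment syntax used by the NIF version presented is not recorded here, and the document URI and function names are hypothetical.

```python
def char_fragment_uri(document_uri: str, begin: int, end: int) -> str:
    """Build a URI that addresses characters begin..end of a text document."""
    if not 0 <= begin <= end:
        raise ValueError("expected 0 <= begin <= end")
    return f"{document_uri}#char={begin},{end}"


def resolve(uri: str, text: str) -> str:
    """Return the substring of text addressed by a '#char=begin,end' fragment."""
    _, _, fragment = uri.partition("#")
    scheme, _, span = fragment.partition("=")
    if scheme != "char":
        raise ValueError(f"unsupported fragment scheme: {scheme!r}")
    begin, end = (int(number) for number in span.split(","))
    return text[begin:end]


if __name__ == "__main__":
    text = "Rome hosts the workshop."
    uri = char_fragment_uri("http://example.org/doc.txt", 0, 4)
    print(uri)
    print(resolve(uri, text))
```

Because the annotation target is itself a URI, statements about the span (for example a part-of-speech tag or an entity link) can be expressed as ordinary triples with that URI as subject.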
10:55:25 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
10:55:56 <philr> ...can query NIF components as a web service
10:56:44 <philr> ...OLiA: Vocabulary Module - mapping of over 50 Tagsets
10:57:41 <philr> ...NIF 2.0 plans - links to ITS 2.0, Lemon ontology, XPath uri scheme
10:57:41 <philr> ...NIF will be free and open
10:58:11 <philr> ...looking for contributors

10:59:03 <philr> next speaker: Fernando Serván of FAO
10:59:27 <philr> topic: Reorganizing Information in a Multilingual Website
11:00:11 <philr> Fernando Serván: FAO has presence in 82 country offices
11:00:22 <philr> ...uses 6 official UN languages
11:00:32 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
11:01:50 <philr> ...FAO users language primarily English followed by Spanish and French
11:02:26 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html fsasaki
11:02:54 <philr> ...currently reorganizing content to focus on decentralization and partnerships
11:03:26 <philr> ...need to accommodate locally generated content
11:05:36 <philr> ...Issues faced: a lot of unstructured content, web content, language versions do not match, no localized uri's, low reuse of content
11:06:43 <philr> ...lack of mono- and multilingual ontologies to drive navigation
11:07:13 <lmatteis> I still don't understand the need of having opaque URIs or URIs with numbers instead of meaningful words. This isn't going to solve the multilingual issue, it's going to make it worse if anything. Because at least the URI is readable in english, instead of not readable at all
11:07:16 <philr> ...do have a stable geopolitical ontology
11:09:13 <philr> ...need to make best use of existing content, identify normative content, use CMS-independent content, use MT (for Arabic and Chinese), better understand (intended) users
11:11:05 <philr> ...want to utilize standards and best practices: XLIFF, RDF, ITS 2.0, learn from translation workflows, get social - on-demand translation
11:11:55 <philr> ...allow users to vote for pages that should be translated
11:12:44 <philr> ...have a set of short-term and longer-term goals
11:12:54 <tadej> tadej has joined #mlwrome
11:13:38 <philr> ...want to prioritize for Chinese, Russian, etc.

11:14:25 <philr> final speaker of this session is Paula Shannon of Lionbridge
11:14:48 <philr> topic: The Globalization Penalty
11:15:08 <philr> Paula Shannon: International SEO
11:16:06 <philr> ...McKinsey - "Strong multinationals seem less healthy..."
11:16:56 <philr> ...Local firms in emerging markets succeed where Multinationals fail
11:17:54 <philr> ...Marketing defined as a key function
11:18:05 <philr> ...global means complexity which means cost
11:18:26 <philr> ...balance central vs local
11:18:37 <philr> ...The Consumer Decision Journey
11:19:20 <philr> ...written about in the Harvard Business Review
11:20:17 <philr> ...many people in the digital age already know what they want to buy before they go to purchase it.
11:21:07 <philr> ...consumers in the digital age trust social marketing
11:21:43 <philr> ...push branding is irrelevant in the digital age
11:23:13 <philr> ...so how do you form your pre-purchase opinion? Search
11:25:07 <philr> ...changing the rules: The Global Customer Lifecycle
11:25:53 <philr> ...71% decide based on in-language search and peer recommendations
11:26:33 <philr> ...3 Biggest Problems: Traffic, Conversions, Management
11:27:59 <philr> ...when search is bad: "Can't find... won't buy"
11:30:00 <philr> ...Web Localization Maturity Model
11:31:05 <philr_> philr_ has joined #mlwrome
11:31:35 <philr_> scribe: philr_
11:31:48 <philr_> ...SEO localization generates 15-40% more traffic
11:32:22 <philr_> ...increase search rankings and traffic
11:33:32 <philr_> ...be in the top 3 search results by benchmarking against competitors
11:34:32 <philr_> ...SEO-optimised translation is an iterative process
11:35:01 <philr_> ...look at baseline, keyword research, translation, QA, repeat
11:37:17 <philr_> ...percolate keywords throughout content
11:37:37 <philr_> ...analytics and reporting against competitors
11:38:43 <philr_> ...it's not just Google. 6/10 popular social networks in China.
11:39:00 <philr_> ...Yandex expanding out of Russia
11:39:36 <philr_> ...pace of change is accelerating
11:40:15 <philr_> ...global companies need to be hyper-local.
11:41:00 <philr_> ...utilize local search term experts
11:42:20 <philr_> ...Long Tail of search terms
11:44:05 <philr_> ...Methods for executing multi-lingual Pay Per Click: MT, Local Offices, Human Translation, Localization and Optimisation
11:44:34 <philr_> ...hosting 1.5 million pages for clients

11:46:00 <philr_> Q&A
11:46:53 <philr_> Reinhard Schaler: existing notion of "give up the illusion of control".
11:48:02 <philr_> ...What is stopping the localization industry from handing over to the users?
11:49:49 <philr_> Paula Shannon: Localization is the step-child. No clear ROI. Localization has been a cost center and thus focused on efficiency and cost reduction
11:50:39 <philr_> Des Oates: Some companies are making steps to user empowerment
11:51:10 <nwaltham_> nwaltham_ has joined #mlwrome
11:51:23 <philr_> ...Adobe has ceded control of certain products to users
11:52:17 <philr_> Paula Shannon: text enrichment can help
11:53:03 <philr_> Fernando Serván: monitor demand, traffic analytics
11:53:13 <philr_> ...demand to drive translation
11:53:59 <philr_> Charles: when you prioritise translation on demand, how do you decide?
11:54:18 <philr_> ...how do you balance those things?
11:54:30 <philr_> ...your goal is to serve existing users
11:55:06 <philr_> Fernando: it is tricky, I agree but we are trying to understand users better
11:55:54 <philr_> ...time will tell
11:56:34 <philr_> Richard Ishida: In India most people are used to ascii.ascii domain names, yet only a small percentage of users speak English
11:56:51 <philr_> ...should the market for IDNs be bigger?
11:57:22 <philr_> Pat Kane: the biggest challenge is the number of languages and scripts
11:58:13 <philr_> ...Globalization vs Localization vs Multiculturalism
11:58:30 <philr_> ...we ignore the multicultural component
11:58:58 <philr_> ...Fernando mentioned using MT for Arabic and Chinese
11:59:25 <philr_> ...these are difficult MT languages, do you have specific reasons for using MT for these languages?
11:59:47 <philr_> Fernando: we know Spanish and French are easier
12:00:13 <philr_> ...it is difficult to find the volume of translators for Chinese and Arabic
12:01:22 <philr_> SDL: it's important to consider user feedback
12:02:46 <philr_> Arle: Session ended, break for lunch. After lunch, split into parallel breakout sessions.
12:07:53 <philr_> rrsagent, draft minutes
12:07:53 <RRSAgent> I have made the request to generate http://www.w3.org/2013/03/13-mlwrome-minutes.html philr_
13:35:48 <daveL> daveL has joined #mlwrome
13:35:57 <daveL> scribe: DaveL

13:36:18 <daveL> Des Oates: breakout sessions
13:36:26 <daveL> ... international domain names, chaired by Pat Kane
13:36:47 <daveL> ... best practice in multilingual linked open data, chaired by Dom Jones
13:37:10 <daveL> ... translation quality, chaired by Arle Lommel
13:37:23 <daveL> from floor
13:37:48 <daveL> ... interest in translation quality for post-editing as well as human translation
13:38:09 <daveL> Des: it is now over to the audience to propose other topics
13:38:39 <daveL> ... aim is not just to discuss but to propose action plans to deliver on later
13:38:53 <daveL> Des: some personal findings from the workshop to date
13:39:49 <daveL> ... input from content creators, advances using ITS 2.0 from Cocomore, also with Joomla, and use with XLIFF in Drupal and DITA
13:39:49 <daveL> ... also real-world use cases with the Spanish tax office
13:40:31 <daveL> Des: the other big topic is multilingual search and ML SEO, including insights from Paula
13:40:41 <daveL> ... a big issue for Adobe related to keyword and term management
13:45:03 <nwaltham_> nwaltham_ has joined #mlwrome
13:48:37 <nwaltham___> nwaltham___ has joined #mlwrome
13:51:31 <daveL> daveL has joined #mlwrome
13:51:46 <daveL> topic: deciding discussion topics for breakout sessions

Chair: Des Oates – Adobe
Topic: selecting topics
International Domain Names
ML-LOD (Dom)
Trans Quality (Arle) in library ~25
Standards – XLIFF, ITS, OASIS, W3C ~4
Crowdsourcing and non-market strategies ~3
Names session (Richard) ~3


MLW-LOD breakout note
Working doc at: http://goo.gl/Th2VA

Topic: Ivan Herman: sem web tech at W3C, updated version
LOD cloud
Community needs more deployment, use cases, data and _linked_ data 
Underlying tech needs to be seen as stable, so people are not waiting for the next big thing, so W3C is not planning to add anything more to the stack
W3C won’t standardise things, but will rely on community groups, e.g. Open Annotation CG
W3C want to extend this and perhaps host vocabs with stable URIs, with registry of meta-data. Not a value measure, just cataloguing meta-data and governance and version control of vocab – will include localisation quality
Need better validation, RDF is not well suited to this, so need structural validation (schema-like) and quality validation – but how to validate a multilingual vocab – a question for this vocab
Disconnect between LOD and non-LOD, e.g. CSV, text files etc. Web site developers use data, not linked data
Reference London Open Data on the Web workshop
Multilinguality in non-RDF data

Topic: Gordon Dunsire: IFLA – similar to earlier presentation in plenary session
Have a few trillion triples potentially, a large high quality collection
Some translated, some not, and partial translations
Have an ML dictionary for authoritative publication of pub categories, but not very accessible to end users or web developers

Topic: Jose Labra: 
Practical ML LOD guidelines 
Naming guidelines: extrapolated from monolingual guidelines
Opaque URI raises some controversy

Topic: Charles Neville:
Tools need to be good
Opaque URI not helpful
Yandex is in schema.org, but only looked at microdata rather than RDFa because it was easier; now might be regretting this, as RDFa might be better. Because of this, microdata is more common than RDFa in Russia.
At Yandex, people tend to add labels/meta-data in English, but it was a better process to do it in Russian – on the whole you perform better in your own language.

Topic: Roberto Navigli:
BabelNet – wide ML semantic network, with encyclopedic and lexicographic knowledge from Wikipedia and WordNet
http://babelnet.org/
6 languages covered, moving to 40+
3 million synsets
Planning to integrate BabelNet into linguistic linked open data cloud
Contribution to LOD: 
Make available in lemon, real large ML LOD example

Topic: Haofen Wang: Apex labs, China – data and knowledge management labs
http://zhishi.me
Chinese LOD (CLOD) – 8 million instances, 1 billion triples, from Chinese Wikipedia, baike.com and the Baidu encyclopedia site
Issues: need to use IRIs, but limited by use of older browsers
Naming resources: Wikipedia uses traditional Chinese rather than simplified Chinese
Integrating with e-commerce sites 360buy and Taobao and social networks Weibo and Dianping to motivate more open LOD data streams
Align with schema.org
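
A minimal sketch of the IRI issue noted above, assuming a client that only accepts ASCII URIs: the non-ASCII parts of an IRI can be percent-encoded as UTF-8 before dereferencing. The example.org address and the function name iri_to_uri are illustrative only and not taken from zhishi.me; non-ASCII host names would need separate IDN handling, which is not shown.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def iri_to_uri(iri: str) -> str:
    """Percent-encode non-ASCII characters in the path, query and fragment.

    Assumption: the host name is already ASCII. "%" is kept as a safe
    character so that existing percent-escapes are not encoded twice.
    """
    parts = urlsplit(iri)
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    fragment = quote(parts.fragment, safe="%")
    return urlunsplit((parts.scheme, parts.netloc, path, query, fragment))


# Hypothetical resource named in Chinese characters.
print(iri_to_uri("http://example.org/resource/北京"))
# prints http://example.org/resource/%E5%8C%97%E4%BA%AC
```

The readable IRI can still be shown to users; only the form sent to older software changes.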

Group Reports:
Topic: Jose: ML-LOD group report
Notes taken on google docs.
Topics: Naming: descriptive vs opaque, depends on the use case, useful for both
Labelling: should always have language tag
Interlinks: sameAs and seeAlso may not be useful in all cases, lexical/linguistic resource interlinking not always the same as conceptual interlinking
Will start community group on best practice in ML LOD
Richard Ishida notes this is easy to set up and join
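
The labelling point above (every label carries a language tag) can be sketched as follows. This is an illustration under assumptions, not the group's agreed pattern: the opaque identifier, the label data and the helper pick_label are all hypothetical, and the fallback order is just one possible choice.

```python
# Hypothetical resource: the URI is opaque, and language-tagged labels
# carry the human-readable names.
LABELS = {
    "http://example.org/resource/r0001": [
        ("Armenia", "en"),
        ("Armenia", "es"),
        ("Հայաստան", "hy"),
    ],
}


def pick_label(uri, preferred, labels=LABELS):
    """Return (text, tag) for the first preferred language that has a label.

    Languages are tried in the order given; returns None if nothing matches.
    """
    candidates = labels.get(uri, [])
    for want in preferred:
        want = want.lower()
        for text, tag in candidates:
            tag = tag.lower()
            # Accept an exact match or a regional variant ("es-AR" for "es").
            if tag == want or tag.startswith(want + "-"):
                return text, tag
    return None


print(pick_label("http://example.org/resource/r0001", ["hy", "en"]))
print(pick_label("http://example.org/resource/r0001", ["fr", "en"]))
```

Keeping the tag next to each label lets an application fall back to another language explicitly rather than guessing what language an untagged string is in.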

Topic: Arle: translation quality group report 
Need to decouple production method from end use
Source quality is an issue; it is not always the translator's fault
Quality is dependent on the step in the workflow where it is used
Expectations need to be clear
Are existing metrics actually valid and reproducible? Some are academic and not useful for production
Need some process metrics to track these. Ethnographic studies of posteditors.
Additional factors, see slides.
So what does multilingual web do to help, three points:
Context, audience and use – common methods for HT, MT and PR MT – but need to be broadened out beyond QT Launchpad (see workshop tomorrow)
Don’t reinvent the wheel, harmonise parallel work
An ongoing effort needed perhaps centred on MLW community at W3C


Topic: Pat Kane: International Domain Names group report
It is an ecosystem problem, not just for W3C but for other bodies; other voices needed also.
Perhaps a W3C working group could be a starting point

Topic: David Filip: Standards group report
Gabor, Ioannis, Bryan, Yves, DF
CMS-L10n roundtrip, term management, and harmonisation efforts
Seemed to cover many issues in existing ITS 2.0-XLIFF mapping, e.g. terminology (usage and forbidden)
But do need some standardised API, connectors and brokers
In terminology need a message broker, especially in interactive scenarios with multiple terminology systems in real time.

Topic: Juan: Names group report
Focus on 3 use cases: 
1) Recognition, e.g. named entity recognition and resolution, focussed on person names, for MT, for search and also segmentation (over line boundaries)
2) Display: sorting names in lists. Contextual usage – formal, familial, full (postal), autocompletion, abbreviation (e.g. in paper author list), text to speech
3) Capturing names: transliteration, speech to text, input form – size, order, labels
Problems listed
Propose perhaps define an ontology of names

Topic: Reinhard Schaler - Crowdsourcing group report (Easyling and FAO)
Discussed different scenarios, e.g. for commercial and for non-market/non-profit, people motivation and associated support systems
Practical implementations: environments need to be easy to use, looking at Easyling, FAO and SOLAS
Too few to set up a bigger group, but invite others to participate.

Topic: Des wrap up and questions
… asks group chairs to come on stage to take questions
Question: Christian Lieske – why do we need an additional terminology-related standard, can’t we reuse existing LOD mechanisms?
David F – agrees that linked data can help but need specific support for terminology. This area also suffers from many poorly adopted standards so a new one might make sense.
Christian: good to bring LOD, terminology communities together
David F: agrees, standards harmonisation is key to going forward. But also there is a gap in the API level
Felix: looks for more standardisation people and localisation companies in the ML LOD best practice group.
Des: supports this call
… asks about CMIS
David Filip: yes, work 
Pedro: we are going to GALA, so if there is a clear message we will
Ioannis: also visiting an industrial term working
Dave: let's open group now, so we have a concrete URL to point people at
Christian: how will names discussion advance?
Richard: no specific plans
David: asks about locatives in names
Richard: not addressed this yet, as there were more immediate use cases, nor inflection, or other context 
Dave: asks if I18n interest group is a good place for this
Richard: community group can be more focussed, IG may reach more people but interest can be more focussed
Felix: agree we need to think about where best to place this. But in all cases we need a hero to drive it forward – the ML-LOD BP seems to have two to three
Richard: +1 need committed driving person
Des: wrap up there, thanks everyone

Topic: Arle – closing remarks
Slides will be available soon, by next week, linked from the programme page
There is streaming video available already, provided by FAO, and better quality lectures available from Video Lectures
Report will be produced soon, based on scribes
Thanks to sponsors: Verisign, who supported the workshop and dinner, and QTLaunchpad, who will having, FAO for local support, DFKI and Nieves, and Felix. Thanks to EC for their sponsorship, W3C for logistical home, programme committee, organising group, speakers, chairs and scribes (especially Felix).
Funding for the conference series comes from MLW-LT, which finishes at the end of the year. Waiting to hear on further funding from EU projects, but if anyone has further opportunities for funding or is willing to host future events, please talk to Felix or Arle.