Semantic Web

In addition to the classic “Web of documents,” W3C is helping to build a technology stack to support a “Web of data,” the sort of data you find in databases. The ultimate goal of the Web of data is to enable computers to do more useful work and to develop systems that can support trusted interactions over the network. The term “Semantic Web” refers to W3C’s vision of the Web of linked data. Semantic Web technologies enable people to create data stores on the Web, build vocabularies, and write rules for handling data. Linked data are empowered by technologies such as RDF, SPARQL, OWL, and SKOS.

Linked Data

The Semantic Web is a Web of data — of dates and titles and part numbers and chemical properties and any other data one might conceive of. RDF provides the foundation for publishing and linking your data. Various technologies allow you to embed data in documents (RDFa, GRDDL) or expose what you have in SQL databases, or make it available as RDF files.
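
A minimal sketch of what this can look like in practice, assuming Python’s rdflib library (which this page does not itself mention) and using invented example.org URIs:

```python
# Sketch: building and publishing a few RDF triples with Python's rdflib.
# Assumes `pip install rdflib`; all URIs below are invented for illustration.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import FOAF, RDF

EX = Namespace("http://example.org/people/")

g = Graph()
g.bind("foaf", FOAF)
g.bind("ex", EX)

alice = EX.alice
g.add((alice, RDF.type, FOAF.Person))        # ex:alice a foaf:Person
g.add((alice, FOAF.name, Literal("Alice")))  # ex:alice foaf:name "Alice"
g.add((alice, FOAF.knows, EX.bob))           # a link to another resource

print(g.serialize(format="turtle"))          # Turtle you could publish as a file
```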

Vocabularies

At times it may be important or valuable to organize data. Using OWL (to build vocabularies, or “ontologies”) and SKOS (for designing knowledge organization systems) it is possible to enrich data with additional meaning, which allows more people (and more machines) to do more with the data.
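
For illustration only, a tiny SKOS concept hierarchy could be assembled with the same rdflib library; the vocabulary and URIs below are invented for the sketch:

```python
# Sketch: a two-concept SKOS vocabulary expressed with rdflib (invented URIs).
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, SKOS

VOC = Namespace("http://example.org/vocab/")

g = Graph()
g.bind("skos", SKOS)

g.add((VOC.Beverage, RDF.type, SKOS.Concept))
g.add((VOC.Beverage, SKOS.prefLabel, Literal("Beverage", lang="en")))

g.add((VOC.Beer, RDF.type, SKOS.Concept))
g.add((VOC.Beer, SKOS.prefLabel, Literal("Beer", lang="en")))
g.add((VOC.Beer, SKOS.broader, VOC.Beverage))  # Beer is the narrower concept

print(g.serialize(format="turtle"))
```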

Query

Query languages go hand-in-hand with databases. If the Semantic Web is viewed as a global database, then it is easy to understand why one would need a query language for that data. SPARQL is the query language for the Semantic Web.
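
As a minimal sketch (again assuming Python’s rdflib; the data and URIs are invented), running a SPARQL query over a small graph might look like this:

```python
# Sketch: loading a few triples and querying them with SPARQL via rdflib.
from rdflib import Graph

g = Graph()
g.parse(data="""
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    @prefix ex:   <http://example.org/people/> .
    ex:alice foaf:name "Alice" ; foaf:knows ex:bob .
    ex:bob   foaf:name "Bob" .
""", format="turtle")

results = g.query("""
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name
    WHERE { ?person foaf:knows ?friend . ?friend foaf:name ?name . }
""")

for row in results:
    print(row.name)  # prints: Bob
```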

Inference

Near the top of the Semantic Web stack one finds inference — reasoning over data through rules. W3C work on rules, primarily through RIF and OWL, is focused on translating between rule languages and exchanging rules among different systems.
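
The snippet below is only a toy stand-in for that idea: one hand-written rule applied in a single pass over an rdflib graph, rather than a real RIF or OWL reasoner (the URIs are invented):

```python
# Sketch: a single forward-chaining rule over RDF data, using rdflib only.
from rdflib import Graph, Namespace
from rdflib.namespace import RDF, RDFS

EX = Namespace("http://example.org/")

g = Graph()
g.add((EX.Beer, RDFS.subClassOf, EX.Beverage))
g.add((EX.pilsner, RDF.type, EX.Beer))

# Rule: if ?x rdf:type ?c and ?c rdfs:subClassOf ?d, then ?x rdf:type ?d.
inferred = [
    (x, RDF.type, d)
    for x, _, c in g.triples((None, RDF.type, None))
    for _, _, d in g.triples((c, RDFS.subClassOf, None))
]
for triple in inferred:
    g.add(triple)

print((EX.pilsner, RDF.type, EX.Beverage) in g)  # True: the new fact was derived
```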

Vertical Applications

W3C is working with different industries — for example in Health Care and Life Sciences, eGovernment, and Energy — to improve collaboration, research and development, and innovation adoption through Semantic Web technology. For instance, by aiding decision-making in clinical research, Semantic Web technologies will bridge many forms of biological and medical information across institutions.

News

See the program. The keynote speaker will be Alolita Sharma, Director of Language Engineering at the Wikimedia Foundation. She is followed by a strong line-up in sessions entitled Developers, Creators, Localizers, Machines, and Users, including speakers from Microsoft, the Wikimedia Foundation, the UN FAO, W3C, Yandex, SDL, Lionbridge, Asia Pacific TLD, Verisign, DFKI, and many more. On the afternoon of the second day we will hold Open Space breakout discussions. Abstracts and details about an additional poster session will be provided shortly.

The program will also feature an LD4LT event on May 8-9, focusing on text analytics and the usefulness of Wikipedia and DBpedia for multilingual text and content analytics, and on language resources and aspects of converting selected types of language resources into RDF.

Participation in both events is free. See the Call for Participation for details about how to register for the MultilingualWeb workshop. The LD4LT event requires a separate registration, and you have the opportunity to submit position statements about language resources and RDF.

If you haven’t registered yet, note that space is limited, so please be sure to register soon to ensure that you get a place.

The MultilingualWeb workshops, funded by the European Commission and coordinated by the W3C, look at best practices and standards related to all aspects of creating, localizing, and deploying the multilingual Web. The workshops are successful because they attract a wide range of participants from fields such as localization, language technology, browser development, content authoring, and tool development, creating a holistic view of the interoperability needs of the multilingual Web.

We look forward to seeing you in Madrid!

Register now for the recently announced workshop on Linked Data, Language Technologies and Multilingual Content Analytics (8-9 May, Madrid). A preliminary agenda has been created and the registration form is available.

If you are interested in contributing a position statement, please indicate this in the dedicated field in the registration form. The workshop organizers will come back to you with questions to answer in the position statement. We will then select which statements are appropriate for presentation on 9 May, and inform you by 28 April.

We look forward to seeing you in Madrid, both for this event and the MultilingualWeb workshop!

This update brings the article in line with recent developments in HTML5 and reorganizes the material so that readers can find information more quickly. As a result, the article has been almost completely rewritten.

The article addresses the question: Which character encoding should I use for my content, and how do I apply it to my content?
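
As a hedged illustration of the usual answer (choose UTF-8 and declare it in the document itself), and not an excerpt from the article, here is a small Python sketch:

```python
# Sketch: writing out an HTML page as UTF-8 and declaring that encoding
# in the markup, so the bytes on disk and the declaration agree.
html = """<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">  <!-- declare the encoding early in <head> -->
    <title>Déjà vu</title>
  </head>
  <body><p>Smörgåsbord, naïve, 日本語: all fine in UTF-8.</p></body>
</html>
"""

with open("page.html", "w", encoding="utf-8") as f:
    f.write(html)
```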

German, Spanish, Brazilian Portuguese, Russian, Swedish, and Ukrainian translators are asked to update their translations of this article within the next month; otherwise, since the changes are substantive, the translations will be removed per the translation policy.

Aligned with the MultilingualWeb workshop (7-8 May, Madrid), the LIDER project is organizing a roadmapping workshop on 8-9 May. The 8 May afternoon session will feature a keynote by Seth Grimes and will focus on the topic of Wikipedia for multilingual Web content. Via several panels, including contributions from key Wikipedia engineers, we will discuss cross-lingual analytics and intelligent multilingual content handling in Wikipedia. On 9 May, a half-day session will focus on aspects of migrating language resources into linked data.

Mark your calendar now! A dedicated registration form including ways to contribute to the workshop agenda will be made available soon.

sandhawke

11 March 2014

from Decentralyze

Lots of people can’t seem to understand the relationship of the Web to the Internet. So I’ve come up with a simple analogy:

The Web is to the Internet as Beer is to Alcohol.

For some people, sometimes, they are essentially synonymous, because they are often encountered together. But of course they are fundamentally different things.

In this analogy, Email is like Wine: it’s the other universally popular use of the Internet/Alcohol.

But there are lots of other uses, too, somewhat more obscure. We could say the various chat protocols are the various Whiskeys. IRC is Scotch; XMPP is Bourbon.

gopher is obscure and obsolete… maybe melomel.

ssh is potato vodka.

I leave the rest to your imagination.

Note that the non-technician never encounters raw Internet, just like they never encounter pure alcohol. They wouldn’t know what it was if it stepped on their foot. Of course, chemists are quite familiar with pure alcohol, and network technicians and programmers are familiar with TCP, UDP, and IP.

The familiar smell of alcohol, that you can detect to some degree in nearly everything containing alcohol — that’s DNS.


We would like to remind you that the deadline for speaker proposals for the 7th MultilingualWeb Workshop (May 7–8, 2014, Madrid, Spain) is on Friday, March 14, at 23:59 UTC.

Featuring a keynote by Alolita Sharma (Director of Engineering, Wikipedia) and breakout sessions on linked open data and other critical topics, this Workshop will focus on the advances and challenges faced in making the Web truly multilingual. It provides an outstanding and influential forum for thought leaders to share their ideas and gain critical feedback.

While the organizers have already received many excellent submissions, there is still time to make a proposal, and we encourage interested parties to do so by the deadline. With roughly 200 attendees anticipated for the Workshop from a wide variety of profiles, we are certain to have a large and diverse audience that can provide constructive and useful feedback, with stimulating discussion about all of the presentations.

For more information and to register, please visit the Madrid Workshop Call for Participation.

sandhawke

27 February 2014

from Decentralyze

The world of computing has a huge problem with surveillance. Whether you blame the governments doing it or the whistleblowers revealing it, the fact is that consumer adoption and satisfaction are being inhibited by an entirely justified lack of trust in the systems.

Here’s how the NSA can fix that, increase the safety of Americans, and, I suspect, redeem itself in the eyes of much of the country. It’s a way to act with honor and integrity, without betraying citizens, businesses, or employees. The NSA can keep doing all the things it feels it must to keep America safe (until/unless Congress or the administration changes those rules), and by doing this additional thing it would be helping protect us all from the increasing dangers of cyber attacks. And it’s pretty easy.

The proposal is this: establish a voluntary certification system, where vendors can submit products and services for confidential NSA review. In concluding its review, the NSA would enumerate for the public all known security vulnerabilities of the item. It would be under no obligation to discover vulnerabilities. Rather, it would simply need to disclose to consumers all the vulnerabilities of which it happens to know, at that time and on an ongoing basis going forward.

Vendors could be charged a reasonable fee for this service, perhaps on the order of 1% of gross revenue for that product.

Crucially, the NSA would accept civil liability for any accidental misleading of consumers in its review statements. Even more important: the NSA chain of command from the top down to the people doing the review would accept criminal liability for any intentionally misleading statements, including omissions. I am not a lawyer, but I think this could be done easily by having the statements include sworn affidavits stating both their belief in these statements and their due diligence in searching across the NSA and related entities. I’m sure there are other options too.

If Congress wants to get involved, I think it might be time to pass an anti-zero-day law supporting NSA certification. Specifically, I’d say that anyone who knows of a security vulnerability in an NSA-certified product must report it immediately to the NSA or the vendor (which must tell each other). Ninety days after reporting it, the person who reported it would be free to tell anyone and everyone, with full whistleblower protection. Maybe this could just be done through the product TOS.

NSA-certified products could still include backdoors and weaknesses of all sorts, but their existence would no longer be secret. In particular, if there’s an NSA back door, a cryptographic hole for which they believe they have the only key, they would have to disclose that.

That’s it. Dear NSA, can you do this please?

For the rest of you: if you work at the kind of company the Snowden documents reveal to have been compromised, the companies that somehow handle user data, would you support this? Would your company participate in the program to regain user trust?


We are pleased to announce that Alolita Sharma, Director of Engineering for Internationalization and Localization at Wikipedia, will deliver the keynote at the 7th Multilingual Web Workshop, “New Horizons for the Multilingual Web,” in Madrid, Spain (7–8 May 2014).

With over 30 million articles in 286 languages as of January 1, 2014, Wikipedia has now become one of the largest providers of multilingual content in the world. Because of its user-generated and constantly changing content, many traditional processes for managing multilingual content on the web either do not work or do not scale well for Wikipedia. Alolita Sharma’s keynote will highlight Wikipedia’s diversity in multilingual user-generated content and the language technologies that Wikipedia has had to develop to support its unprecedented growth of content. She will also discuss the many challenges Wikipedia faces in providing language support for the mobile web.

The Multilingual Web Workshop series brings together participants interested in the best practices, new technologies, and standards needed to help content creators, localizers, language tool developers, and others address the new opportunities and challenges of the multilingual Web. It will also provide opportunities for networking across communities and building connections.

Registration for the Workshop is free, and early registration is recommended since space at the Workshop is limited.

There is still an opportunity for individuals to submit proposals to speak at the workshop. Ideal proposals will highlight emerging challenges or novel solutions for reaching out to a global, multilingual audience. The deadline for speaker proposals is March 14, but early submission is strongly encouraged. See the Call for Participation for more details.

This workshop is made possible by the generous support of the LIDER project, which will organize a roadmapping workshop on linked data and content analytics as one of the tracks at the Multilingual Web Workshop.

The MultilingualWeb-LT Working Group has been closed, since it successfully completed the work in its charter.

We thank the co-chairs, the editors, the implementers, and the Working Group for achieving the goal of publishing the Internationalization Tag Set (ITS) 2.0 as a W3C Recommendation, and for doing so ahead of the original schedule.

Work on enlarging the community around ITS and gathering feedback and requirements for future work will now continue in the ITS Interest Group.

The LD4LT (Linked Data for Language Technology) Workshop will be held on 21 March in Athens, Greece, aligned with the European Data Forum 2014. See the agenda.

The workshop is a free community event – there is no admission fee for participants, but registration is required.

You are encouraged to provide a title for a position statement in your registration form. This is a simple, short statement that summarizes your ideas, technologies, or use cases related to Linked Data and Language Technology.

The meeting is supported by the LIDER project, the MultilingualWeb community, the NLP2RDF project, the Working Group for Open Data in Linguistics, as well as the DBpedia Project.

As input to the discussion and the work of the LD4LT group, you may also want to fill in the first LIDER survey.