<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="http://www.w3.org/wiki/skins/common/feed.css?207"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
	<channel>
		<title>W3C Wiki - User contributions [en]</title>
		<link>http://www.w3.org/wiki/Special:Contributions/Rcygania2</link>
		<description>From W3C Wiki</description>
		<language>en</language>
		<generator>MediaWiki 1.15.5</generator>
		<lastBuildDate>Fri, 24 May 2013 18:30:36 GMT</lastBuildDate>
		<item>
			<title>ConverterToRdf</title>
			<link>http://www.w3.org/wiki/ConverterToRdf</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;clean up info around CSV/TSV/Excel/spreadsheets; add SDMX&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
A Converter to RDF is a tool which converts application data from an application-specific format into RDF for use with RDF tools and integration with other data. Converters may be part of a one-time migration effort, or part of a running system which provides a semantic web view of a given application. See also: [[RDFImportersAndAdapters]]&lt;br /&gt;
&lt;br /&gt;
__TOC__&lt;br /&gt;
&lt;br /&gt;
Please add converters as you make them or hear of them.&lt;br /&gt;
&lt;br /&gt;
== Formats ==&lt;br /&gt;
&lt;br /&gt;
In alphabetical order:&lt;br /&gt;
&lt;br /&gt;
=== [[BibTex]] ===&lt;br /&gt;
&lt;br /&gt;
[[BibTex]] is the format for bibliographic references in TeX.&lt;br /&gt;
&lt;br /&gt;
* [http://data.bibbase.org BibBase] transforms BibTeX files (given in a URL) into Linked Data with RDF/XML output support. &lt;br /&gt;
** Uses a custom  [http://purl.org/bibbase/ontology BibTeX ontology] but provides a table of mappings to other ontologies&lt;br /&gt;
** Also provides [http://bibbase.org HTML interface] and RSS feed&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/bibtex2rdf/ bibtex2rdf] transforms BibTeX files into RDF/XML. (Simile)&lt;br /&gt;
* [http://www.l3s.de/~siberski/bibtex2rdf/ bibtex2rdf] - A configurable BibTeX to RDF Converter by  Wolf Siberski. &lt;br /&gt;
* [http://www.cs.vu.nl/%7Emcaklein/bib2rdf/ An online service] set up at the Vrije Universiteit in Amsterdam, the Netherlands, following the [http://ontoweb.aifb.uni-karlsruhe.de/ OntoWeb portal] vocabulary. The [http://www.cs.vu.nl/%7Emcaklein/bib2rdf/bib2rdf perl source] can also be downloaded.&lt;br /&gt;
* [http://www.aifb.uni-karlsruhe.de/WBS/pha/bib/index.html Java BibTeX-To-RDF Converter] based on the [http://ontobroker.semanticweb.org/ontos/swrc.html SWRC] terminology.&lt;br /&gt;
&lt;br /&gt;
=== Bittorrent ===&lt;br /&gt;
&lt;br /&gt;
* http://www.inf.unideb.hu/~jeszy/rdfizers is alas now 404 (in 2007). This was a link from RDFizers but may be incorrect.&lt;br /&gt;
&lt;br /&gt;
=== CSV (Comma-Separated Values) ===&lt;br /&gt;
&lt;br /&gt;
See also: [[#Flat files|Flat Files]] and [[#TSV|TSV]]&lt;br /&gt;
&lt;br /&gt;
* An [http://lab.linkeddata.deri.ie/2010/grefine-rdf-extension/ RDF Extension] is available for [http://code.google.com/p/google-refine/ Google Refine]. It can convert Excel, CSV, and other tabular data to RDF. The schema mapping can be defined in a graphical UI.&lt;br /&gt;
* [http://rdf123.umbc.edu/ RDF123] has Windows and Linux applications to download, a Java application and servlet. &lt;br /&gt;
* [http://xlwrap.sourceforge.net XLWrap] wraps spreadsheets (including cross tables) to arbitrary RDF graphs; supports Excel/OpenDocument/CSV streamed processing, local/HTTP loading, expressions similar to Excel/OpenOffice Calc, custom functions, usage via API or SPARQL endpoint&lt;br /&gt;
* [http://purl.org/twc/id/software/csv2rdf4lod csv2rdf4lod] uses declarative RDF enhancement parameters to specify how to transform tabular data into well-structured, well-connected RDF. The tool uses identifiers for ''source'' organization, ''dataset'', and ''version'' to establish default namespaces for all URIs created and provides VoID and provenance metadata as part of the conversion output.&lt;br /&gt;
* [http://github.com/cygri/tarql Tarql] is a command-line application that converts CSV to RDF with a user-defined mapping. The mapping is written in standard SPARQL 1.1.&lt;br /&gt;
&lt;br /&gt;
=== Debian  ===&lt;br /&gt;
&lt;br /&gt;
The package information in Debian and similar systems (Ubuntu, Fink, etc), with its general usefulness and its graph-like nature, is a clear candidate for conversion to RDF.&lt;br /&gt;
&lt;br /&gt;
See [http://blog.drinsama.de/erich/en/xml/2007011204-rdf-representation-of-packages VitaVoni blog] about this.&lt;br /&gt;
&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/util/fink2n3.py fink2n3.py] takes Fink (OS X port of Debian packaging) dependencies and converts them to RDF/N3. (SWAP) It is unclear whether it could be quickly adapted to export Debian data.&lt;br /&gt;
* [http://github.com/nbarrientos/steamy STEAMY] converts Debian packages to RDF.&lt;br /&gt;
&lt;br /&gt;
=== Email (RFC822 headers) ===&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/email2rdf/ email2rdf ] transforms email mbox files into RDF/XML. (Simile)&lt;br /&gt;
* [http://www.w3.org/2000/04/maillog2rdf/aboutMsg.py aboutMsg.py] converts email metadata to RDF. (SWAP)&lt;br /&gt;
* [http://swaml.berlios.de/ SWAML] transforms a mailing list into RDF/XML and XHTML+RDFa using [[SIOC]].&lt;br /&gt;
** And [http://linkedmarkmail.wikier.org/ LinkedMarkMail] live transforms into RDF/XML the mailing lists' archives indexed by [http://markmail.org/ MarkMail].&lt;br /&gt;
* [http://search.cpan.org/dist/Email-MIME-XMTP/ Email::MIME::XMTP] Perl extension to read and write [http://www.openhealth.org/xmtp/ XMTP] &lt;br /&gt;
* [http://aperture.sourceforge.net aperture.sf.net] IMAP crawler&lt;br /&gt;
&lt;br /&gt;
There are others in this vein which run over IMAP or mailbox files.@@&lt;br /&gt;
&lt;br /&gt;
=== Excel ===&lt;br /&gt;
&lt;br /&gt;
* Cambridge Semantics' [http://www.cambridgesemantics.com/products/anzo_for_excel Anzo for Excel] extracts RDF data from Excel spreadsheets while keeping the spreadsheet in-sync with the underlying data as things change&lt;br /&gt;
* [http://xlwrap.sourceforge.net XLWrap] wraps spreadsheets (including cross tables) to arbitrary RDF graphs; supports Excel/OpenDocument/CSV streamed processing, local/HTTP loading, expressions similar to Excel/OpenOffice Calc, custom functions, usage via API or SPARQL endpoint&lt;br /&gt;
* [http://www.topbraidcomposer.com TopBraid Composer] can convert Excel spreadsheets into instances of an RDF schema.&lt;br /&gt;
* [http://github.com/Data2Semantics/TabLinker TabLinker] can convert non-standard Excel spreadsheets to the Data Cube vocabulary, e.g. Excel files that contain hierarchical information in row and column headers etc.&lt;br /&gt;
* [http://oeg-dev.dia.fi.upm.es/nor2o/ NOR2O] can convert Excel to SCOVO and the Data Cube Vocabulary.&lt;br /&gt;
* [http://www.mindswap.org/%7Erreck/excel2rdf.shtml Excel2rdf] is a Microsoft Windows program (exe) that converts Excel files into valid RDF. It has been tested on Windows 98 and Windows 2000 Professional. ([[MindSwap]]) Export can be done via comma- or tab-separated values. See [[#Flat files|Flat Files]].&lt;br /&gt;
* [http://aperture.sourceforge.net aperture.sf.net] includes a Java crawler for Excel and OpenDocument. It extracts only plain text and basic metadata, though.&lt;br /&gt;
* [http://www.tao-project.eu/researchanddevelopment/demosanddownloads/RDBToOnto.html RDBToOnto], see description below under SQL section.&lt;br /&gt;
&lt;br /&gt;
=== EXIF ===&lt;br /&gt;
&lt;br /&gt;
See JPEG.&lt;br /&gt;
&lt;br /&gt;
=== File Systems ===&lt;br /&gt;
&lt;br /&gt;
* [http://www.cs.univie.ac.at/publication.php?pid=5750 TripFS] exposes an entire file system as linked data, tracks changes, and links files to external data sources.&lt;br /&gt;
&lt;br /&gt;
=== Flickr data ===&lt;br /&gt;
&lt;br /&gt;
* Dave Beckett's [http://librdf.org/flickcurl/ flickcurl] library can access Flickr information (including machine tags) and convert it to RDF &lt;br /&gt;
&lt;br /&gt;
=== Flat files ===&lt;br /&gt;
&lt;br /&gt;
See also: [[#CSV|CSV]] and [[#TSV|TSV]]&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/flat2rdf/ flat2rdf]  converts classic unix text database files, like /etc/passwd, into RDF/N3 (Simile)&lt;br /&gt;
&lt;br /&gt;
=== GPS ===&lt;br /&gt;
&lt;br /&gt;
* [http://www.hackdiary.com/archives/000040.html garmin2rdf.py] Reads a Garmin GPS receiver, dumping the contents in RDF/XML. (Matt Biddulph)&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/pim/fromGarmin.py fromGarmin.py] Downloads GPS data from a Garmin on a serial link to an RDF/N3 file. (SWAP)&lt;br /&gt;
&lt;br /&gt;
=== iCalendar ===&lt;br /&gt;
&lt;br /&gt;
iCalendar is an IETF standard for calendar (event and to-do list) data.  &lt;br /&gt;
iCalendar files are typically stored with a .ics extension.&lt;br /&gt;
&lt;br /&gt;
* [http://www.w3.org/2002/12/cal/fromIcal.py fromIcal.py] converts iCalendar format to RDF.&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/pim/toIcal.py toIcal.py] converts RDF back into iCalendar.&lt;br /&gt;
* [http://aperture.sourceforge.net aperture.sf.net] includes a Java converter for iCalendar.&lt;br /&gt;
* [http://torrez.us/ics2rdf/] iCal to RDF Service&lt;br /&gt;
&lt;br /&gt;
=== Java bytecode ===&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/java2rdf/ java2rdf] scans [http://java.sun.com/ java] bytecode for method calls and creates a description of the dependencies between classes and the package/archive encoded in RDF/N3. (Simile)&lt;br /&gt;
&lt;br /&gt;
=== Javadoc ===&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/javadoc2rdf/ javadoc2rdf] is a doclet that makes javadoc output metadata about your code (structure of the classes, methods, comments, etc.) encoded in RDF/N3. (Simile) &lt;br /&gt;
&lt;br /&gt;
=== Issue tracking: [http://www.atlassian.com/software/jira/ Jira] ===&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/jira2rdf/ jira2rdf]  transforms Atlassian Jira's events about bug reports and issue tracking into RDF/N3.&lt;br /&gt;
&lt;br /&gt;
=== JPEG ===&lt;br /&gt;
&lt;br /&gt;
The metadata within a JPEG photo is encoded in the EXIF standard.&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/jpeg2rdf/ jpeg2rdf] scans a folder for JPEG files, parses the EXIF and IPTC metadata found in those files and dumps an RDF/N3 representation of it into a file. (Simile)&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/pim/jhead/ An adapted version of jhead] extracts RDF data from the EXIF encoded in JPEG files within a directory. Generates RDF/N3. (SWAP)&lt;br /&gt;
&lt;br /&gt;
=== LDIF ===&lt;br /&gt;
&lt;br /&gt;
This is the format used for contact information in LDAP server systems. It is, for example, exported by Thunderbird's address book.&lt;br /&gt;
&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/pim/ldif2n3.py ldif2n3.py]  Very incomplete, but useful. Generates FOAF. Hides email addresses by hashing in the FOAF style if the -m command flag is given. (SWAP)&lt;br /&gt;
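&lt;br /&gt;
The FOAF-style hashing mentioned for ldif2n3.py can be sketched as follows. This is not taken from ldif2n3.py itself: it assumes the usual FOAF convention, in which foaf:mbox_sha1sum is the SHA-1 digest of the full mailto: URI, and the function name mbox_sha1sum and the sample address are made up for illustration.&lt;br /&gt;
&lt;pre&gt;
```python
import hashlib


def mbox_sha1sum(email):
    """Return the hex SHA-1 digest of the mailto: URI for an address.

    Assumption: the hash input is the string "mailto:" followed by the
    address, as in the FOAF description of foaf:mbox_sha1sum.
    """
    uri = "mailto:" + email.strip()
    return hashlib.sha1(uri.encode("utf-8")).hexdigest()


# Hypothetical sample address, used only to show the call.
digest = mbox_sha1sum("alice@example.org")
print(digest)
```
&lt;/pre&gt;
The digest can be published in place of the address, so that two descriptions of the same mailbox can still be matched without revealing the address itself.&lt;br /&gt;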
&lt;br /&gt;
=== Makefile ===&lt;br /&gt;
&lt;br /&gt;
The unix Makefile syntax expresses dependencies between files in a software build.&lt;br /&gt;
&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/util/make2n3.py make2n3.py]  Converts the makefiles in several directories into RDF and merges them to get the big picture. (SWAP)&lt;br /&gt;
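&lt;br /&gt;
As a rough illustration of how Makefile dependencies map onto graph data, the sketch below reads simple "target: prerequisites" rules and returns them as pairs. It is not the make2n3.py code: the function name makefile_dependencies is hypothetical, only plain explicit rules are handled (no variables, pattern rules, or line continuations), and the pairs are left as tuples rather than serialized to N3.&lt;br /&gt;
&lt;pre&gt;
```python
def makefile_dependencies(text):
    """Return (target, prerequisite) pairs for simple explicit rules."""
    pairs = []
    for line in text.splitlines():
        # Recipe lines start with a tab; comment lines start with "#".
        if line.startswith("\t") or line.lstrip().startswith("#"):
            continue
        # Skip lines that are not rules, and variable assignments.
        if ":" not in line or "=" in line:
            continue
        targets, _, prerequisites = line.partition(":")
        for target in targets.split():
            for prerequisite in prerequisites.split():
                pairs.append((target, prerequisite))
    return pairs


# Made-up Makefile content for the example.
sample = "app: main.o util.o\n\tcc -o app main.o util.o\nmain.o: main.c util.h\n"
pairs = makefile_dependencies(sample)
for pair in pairs:
    print(pair)
```
&lt;/pre&gt;
Each pair can then be written out as one triple with a "depends on" property, and the pairs from several directories can be merged into a single graph.&lt;br /&gt;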
&lt;br /&gt;
=== MARC ===&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/marcmods2rdf/ marcmods2rdf] transforms [http://www.loc.gov/marc/ MARC] records from Z39.2 format into [http://www.loc.gov/standards/mods/ MODS] and then from MODS to an RDF representation of MODS.&lt;br /&gt;
* [http://www.oeg-upm.net/index.php/en/technologies/228-marimba MARiMbA] is a command-line tool, designed with librarians in mind, to transform MARC (MAchine-Readable Cataloging) records to RDF, following Linked Data best practices.&lt;br /&gt;
&lt;br /&gt;
=== Meteographical ===&lt;br /&gt;
* [http://inamidst.com/sw/meteo/ Meteo] is UK weather forecast data in RDF, extracted from NOAA's public domain GRIB files. Example: [http://inamidst.com/sw/meteo/rdf/London London].&lt;br /&gt;
&lt;br /&gt;
=== Microformats ===&lt;br /&gt;
* [http://incubator.apache.org/any23/ Apache Any23] is a Java library, web service, and command-line tool for parsing multiple document formats and extracting structured data in RDF format from a variety of Web documents. It is used by [http://sindice.com/ Sindice.com]. The microformats support is [http://sindice.com/developers/microformat detailed in the Sindice.com documentation].&lt;br /&gt;
&lt;br /&gt;
=== Multimedia ===&lt;br /&gt;
&lt;br /&gt;
Following the [http://en.wikipedia.org/wiki/Don't_repeat_yourself DRY principle], a pointer to tools in the realm of multimedia (origin: [http://www.w3.org/2005/Incubator/mmsem MMSEM-XG]):&lt;br /&gt;
&lt;br /&gt;
* [http://www.w3.org/2005/Incubator/mmsem/wiki/Tools_and_Resources Multimedia Semantics Tools]&lt;br /&gt;
&lt;br /&gt;
=== OAI-PMH ===&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/oai2rdf/ oai2rdf] harvests an [http://www.openarchives.org/OAI/openarchivesprotocol.html OAI-PMH] repository and transforms the captured metadata into an RDF representation through pluggable XSLT stylesheets.&lt;br /&gt;
&lt;br /&gt;
=== Outlook ===&lt;br /&gt;
&lt;br /&gt;
Microsoft Outlook stores contact data, event data, and so on in a proprietary format.&lt;br /&gt;
&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/pim/lookout.py Lookout.py] converts the Microsoft Outlook calendar and address format into RDF.  (SWAP)&lt;br /&gt;
* [http://aperture.sourceforge.net aperture.sf.net] includes a Java crawler for MS Outlook&lt;br /&gt;
&lt;br /&gt;
=== Open Financial Exchange (OFX) ===&lt;br /&gt;
&lt;br /&gt;
[http://www.ofx.net/ OFX] is the format for downloaded bank statements and other financial information.&lt;br /&gt;
There are various levels of OFX, the early ones being HTTP headers followed by SGML, the later ones being HTTP-like headers followed by XML.&lt;br /&gt;
&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/pim/financial/OFX-to-n3.py OFX-to-n3.py] converts OFX format to RDF/N3. The conversion is only syntactic.  The OFX modeling is pretty well thought out, so taking it as defining an RDF ontology seems to make sense. Rules can then be used to define a mapping into your favorite ontology.&lt;br /&gt;
&lt;br /&gt;
=== Open [[CourseWare]] ===&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/ocw2rdf/ ocw2rdf] harvests metadata from the MIT [http://ocw.mit.edu/ OpenCourseWare] web site and transforms it into an RDF representation of [http://meta.wikimedia.org/wiki/IEEE_LOM  IEEE LOM].&lt;br /&gt;
&lt;br /&gt;
=== Palm OS ===&lt;br /&gt;
&lt;br /&gt;
* [http://dev.w3.org/cvsweb/2001/palmagent Palmagent] converts the calendar format of PalmOS into RDF. (SWAP)&lt;br /&gt;
&lt;br /&gt;
=== plist ===&lt;br /&gt;
&lt;br /&gt;
The Apple OS X property list (.plist) file type is an XML format for arbitrary structured data.&lt;br /&gt;
Numeric keys are used as local IDs. OS X applications store many kinds of data in these files, including configuration data, iPhoto album and photo data, iTunes metadata, and so on.&lt;br /&gt;
&lt;br /&gt;
To convert plists well, added information is necessary, such as a namespace for the properties.&lt;br /&gt;
&lt;br /&gt;
[http://dev.w3.org/cvsweb/2000/10/swap/util/plist2rdf.xsl plist2rdf.xsl] is an XSLT script to convert a plist file into RDF/XML. It does not add namespaces to the exported data.&lt;br /&gt;
&lt;br /&gt;
=== Quicken Interchange Format (QIF) ===&lt;br /&gt;
&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/pim/qif2n3.py qif2n3.py] Takes Quicken Interchange Format and converts it to RDF/N3. (SWAP)&lt;br /&gt;
&lt;br /&gt;
=== Quick and Dirty CSV to RDF Converter (QUIDICRC) ===&lt;br /&gt;
&lt;br /&gt;
* [http://www.mindswap.org/~anant/quidicrc/ quidicrc] A Perl script for rapidly converting CSV to RDF with some translation in the middle. (Not actively maintained; available as open source. SWAP)&lt;br /&gt;
&lt;br /&gt;
=== Random ===&lt;br /&gt;
&lt;br /&gt;
Seriously.&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/random2rdf/ random2rdf] generates synthetic random graphs encoded in RDF/N3.&lt;br /&gt;
&lt;br /&gt;
=== SDMX ===&lt;br /&gt;
&lt;br /&gt;
[http://sdmx.org/ SDMX] is an XML-based exchange format for statistical data and metadata, used by major statistics-producing organizations such as Eurostat, the World Bank, OECD, and the IMF.&lt;br /&gt;
&lt;br /&gt;
* [https://github.com/csarven/sdmx-to-qb SDMX to QB] is an XSLT-based converter that turns SDMX data sets and data structure definitions into RDF, using the Data Cube Vocabulary.&lt;br /&gt;
&lt;br /&gt;
=== Spreadsheet ===&lt;br /&gt;
&lt;br /&gt;
See [[#CSV]] and [[#Excel]].&lt;br /&gt;
&lt;br /&gt;
=== SQL ===&lt;br /&gt;
&lt;br /&gt;
SQL databases are rich stores of relational data ideal for export as RDF. &lt;br /&gt;
Conference tracks and many papers cover this subject from different angles. See also: [[RdfAndSql]]&lt;br /&gt;
&lt;br /&gt;
* [http://d2rq.org/ D2RQ] provides a mapping from a SQL server (tested with several brands), producing both linked virtual RDF data files and a SPARQL service.  Uses a configuration file in Turtle. (DERI and FU Berlin)&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/dbork/dbview.py dbview.py] provides a mapping from a SQL server (tested with mySQL), producing linked virtual RDF data files. Uses a configuration file in N3. (SWAP)&lt;br /&gt;
* [[VirtuosoUniversalServer|OpenLink Virtuoso]]'s [http://virtuoso.openlinksw.com/wiki/main/Main/VOSSQLRDF declarative N3/Turtle based Metaschema Language] enables the creation of RDF Instance Data for associated RDF Ontologies via RDF VIEWs of ODBC, JDBC, ADO.NET, and OLE-DB accessible SQL Data. It is important to note that these VIEWs also apply to Native Virtuoso Data and/or Heterogeneous Data from other Web Services, HTTP/WebDAV, NNTP, and other Data Sources known to Virtuoso. This is an enhancement of the traditional SQL VIEW concept that enables multiple use of the same base SQL Data from a variety of data access points.&lt;br /&gt;
* [http://triplify.org Triplify] is a small plugin for Web applications, which reveals the semantic structures encoded in relational databases by making database content available as RDF, JSON or Linked Data.&lt;br /&gt;
* [http://www.tao-project.eu/researchanddevelopment/demosanddownloads/RDBToOnto.html RDBToOnto] is a full-fledged conversion tool that can produce accurate RDF/OWL models from various types of relational databases and Excel spreadsheets. The conversion is fully automated while various parameters can be set through the user interface to refine the resulting models (e.g., derivation of rich class hierarchies, proper naming of instances, database optimization before conversion, etc).  &lt;br /&gt;
* [https://github.com/jpcik/morph morph] or [https://github.com/boricles/morph morph] implement [http://www.w3.org/2001/sw/rdb2rdf/r2rml/ R2RML] and perform a transformation from RDB to RDF.&lt;br /&gt;
&lt;br /&gt;
Some RDF Triple stores are implemented using SQL databases, but that is not covered here.&lt;br /&gt;
&lt;br /&gt;
=== Subversion ===&lt;br /&gt;
&lt;br /&gt;
[http://subversion.tigris.org/ Subversion] is a code-management system.&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/svn2rdf/ svn2rdf] A pair of scripts; one can be used in a post-commit subversion hook to generate RDF/N3 with each commit, the other on a working copy. (Simile)&lt;br /&gt;
&lt;br /&gt;
=== TSV (Tab-Separated Values) ===&lt;br /&gt;
&lt;br /&gt;
See also: [[#Flat files|Flat Files]] and [[#CSV|CSV]]&lt;br /&gt;
&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/tab2n3.py tab2n3.py]  Takes tab-separated text (as typically output by all kinds of things, including Microsoft Outlook and spreadsheets) and converts it to N3, using the column headings to generate property URIs.  (SWAP)&lt;br /&gt;
* [http://www.topbraidcomposer.com TopBraid Composer] can convert tab-separated spreadsheet files into an RDF/OWL class with corresponding properties and instances.&lt;br /&gt;
* [http://xlwrap.sourceforge.net XLWrap] wraps CSV files (and spreadsheets) to arbitrary RDF graphs; supports local/HTTP loading, expressions similar to Excel/OpenOffice Calc, custom functions, usage via API or SPARQL endpoint&lt;br /&gt;
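&lt;br /&gt;
The column-heading approach described for tab2n3.py can be sketched roughly as follows. This is not the tab2n3.py code: the example.org namespaces, the function name tsv_to_triples, and the row-numbering scheme are assumptions made for illustration, and triples are returned as plain tuples rather than serialized N3.&lt;br /&gt;
&lt;pre&gt;
```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespaces, chosen only for this illustration.
PROPERTY_NS = "http://example.org/property/"
ROW_NS = "http://example.org/row/"


def tsv_to_triples(text):
    """Return (subject, predicate, object) tuples for tab-separated text.

    The first line supplies the column headings. Each heading becomes a
    property URI, and each later line becomes one subject.
    """
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    headings = next(reader, None)
    if headings is None:
        return []
    # Replace spaces and percent-encode the rest so headings are URI-safe.
    properties = [PROPERTY_NS + quote(h.strip().replace(" ", "_")) for h in headings]
    triples = []
    for number, row in enumerate(reader, start=1):
        subject = ROW_NS + str(number)
        for prop, value in zip(properties, row):
            if value != "":  # empty cells produce no triple
                triples.append((subject, prop, value))
    return triples


# Made-up sample data: a heading line followed by two rows.
sample = "name\thome page\nAlice\thttp://example.org/alice\nBob\t\n"
triples = tsv_to_triples(sample)
for triple in triples:
    print(triple)
```
&lt;/pre&gt;
Deriving property URIs from the headings keeps the conversion free of per-file configuration, at the cost of properties that are only as meaningful as the headings themselves.&lt;br /&gt;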
&lt;br /&gt;
=== Talis SW Format Converter ===&lt;br /&gt;
&lt;br /&gt;
* [http://convert.test.talis.com/ Talis' converter] converts between various formats (including RDF-&amp;gt;RDF with various serializations, RDF-&amp;gt;HTML, etc)&lt;br /&gt;
&lt;br /&gt;
=== UML ===&lt;br /&gt;
&lt;br /&gt;
* [http://www.topbraidcomposer.com TopBraid Composer] can convert UML Class Diagrams (XMI format) into RDF/OWL models.&lt;br /&gt;
* [http://eulergui.sourceforge.net/ EulerGUI] is a lightweight IDE that translates UML and eCore XMI into N3 on the fly. There are also N3 rules to convert UML to OWL.&lt;br /&gt;
&lt;br /&gt;
=== VCARD, Addressbook, … ===&lt;br /&gt;
&lt;br /&gt;
VCARD is a standard for interchange of contact data, such as business cards and address books.&lt;br /&gt;
&lt;br /&gt;
[http://www.w3.org/TR/vcard-rdf &amp;quot;Representing vCard Objects in RDF/XML&amp;quot;] is a W3C note defining an [http://www.w3.org/2006/vcard/ns ontology] for VCARD. FOAF is a widely used ontology covering some of the domain.&lt;br /&gt;
&lt;br /&gt;
* [http://www.holygoat.co.uk/applications/address-book-foaf/projects/ab/ab.py code to convert your Apple Addressbook into a FOAF file] (Richard Newman)&lt;br /&gt;
* [http://people.no-distance.net/ol/software/ab-foaf/ ab-foaf] does the same. &lt;br /&gt;
* [http://search.cpan.org/dist/XML-FOAFKnows-FromvCard/ XML::FOAFKnows::FromvCard], Perl extension to create FOAF dumps from vCards. Does not attempt to create a full model, just foaf:knows. It also has some privacy features. In addition to the module, which conforms to the Formatter API specification, it comes with a command-line tool. &lt;br /&gt;
&lt;br /&gt;
=== Weather ===&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/weather2rdf/ weather2rdf] Given a US city or ZIP code, retrieves weather report data from weather.com and returns it in RDF. (Simile)&lt;br /&gt;
&lt;br /&gt;
=== XML ===&lt;br /&gt;
&lt;br /&gt;
* '''GRDDL:''' Any XML file can be marked up with pointers to XSLT files which convert it to RDF.  The standard for this is [http://www.w3.org/TR/grddl/ GRDDL].  A GRDDL pointer can even be put in an XML schema, so that all XML documents written to that schema automatically have a defined RDF mapping which any GRDDL-aware processor will benefit from. Several XSLT conversion transformations can be found linked from [[MicroModels]]&lt;br /&gt;
* &amp;lt;span id=&amp;quot;krextor&amp;quot;&amp;gt;&amp;lt;/span&amp;gt;[http://kwarc.info/projects/krextor/ Krextor] is a framework for extracting RDF in various notations from various XML languages and can easily be extended for additional input languages.  Support for RDFa and some mathematical markup languages is built in.  The implementation is done in XSLT, with a command-line frontend and a Java wrapper.&lt;br /&gt;
* [http://www.topbraidcomposer.com TopBraid Composer] can convert XML Schema (and their XML instance files) into RDF/OWL models.&lt;br /&gt;
* Rhizomik [http://rhizomik.net/redefer/ ReDeFer] includes XSD2OWL and XML2RDF plus MPEG-7 to RDF (all XSLT-based)&lt;br /&gt;
* '''XHTML:''' Convert ''existing'' pages to RDF. For example, see [[HtmlToRdf]].&lt;br /&gt;
* [http://www.dblab.ntua.gr/~bikakis/SPARQL2XQuery.html SPARQL2XQuery] The SPARQL2XQuery Framework provides mechanisms for: (a) Query translation (SPARQL to XQuery) (b) Mapping specification &amp;amp; generation (Ontology to XML Schema) (c) Schema transformation (XML Schema to OWL) and (d) Data Transformation (XML to RDF and vice versa)&lt;br /&gt;
&lt;br /&gt;
=== XMP ===&lt;br /&gt;
&lt;br /&gt;
[http://www.adobe.com/products/xmp/ XMP] is an Adobe-sponsored specification for putting RDF metadata in virtually any form of file, including binary formats.  XMP metadata is RDF data in fact, but it has to be extracted from the file.&lt;br /&gt;
&lt;br /&gt;
* [http://www.inf.unideb.hu/~jeszy/xmp/ xmpextractor] extracts XMP data. ([http://www.inf.unideb.hu/~jeszy/ Jeszenszky Péter])&lt;br /&gt;
* [http://dev.w3.org/cvsweb/~checkout~/2004/PythonLib-IH/xmp.py A python script to extract XMP]. There is also a service to do that on-line, see [http://www.ivan-herman.net/WebLog/WorkRelated/SemanticWeb/xmpextract.html separate page]&lt;br /&gt;
&lt;br /&gt;
== Frameworks ==&lt;br /&gt;
&lt;br /&gt;
The following are general tools which provide conversion from many formats.&lt;br /&gt;
&lt;br /&gt;
=== AnnoCultor ===&lt;br /&gt;
&lt;br /&gt;
[http://annocultor.eu/ AnnoCultor] was built during several years of practical work on porting various datasets to RDF. It allows converting data from the following data sources:&lt;br /&gt;
* databases via SQL and JDBC;&lt;br /&gt;
* XML files, also in batch;&lt;br /&gt;
* RDF files;&lt;br /&gt;
* Solr servers;&lt;br /&gt;
* custom formats, via format-specific parsers written in Java.&lt;br /&gt;
&lt;br /&gt;
AnnoCultor is particularly suited to situations where XSLT is not sufficient.&lt;br /&gt;
&lt;br /&gt;
It comes with built-in, ready-to-use converters for Geonames and Getty vocabularies (AAT, ULAN, TGN). &lt;br /&gt;
Several additional specific converters illustrate advanced use: converters for collections of Louvre and Joconde, &lt;br /&gt;
Institute Collection Netherlands, Dutch Museum of Asian Ceramics, Tropenmuseum Amsterdam.&lt;br /&gt;
&lt;br /&gt;
As part of conversion, AnnoCultor can semantically tag (enrich) data with links to various vocabularies, with advanced customised disambiguation and term processing possibilities. &lt;br /&gt;
These vocabularies should be represented in RDF or SKOS to be imported via SPARQL queries. &lt;br /&gt;
AnnoCultor comes with built-in tagging with Geonames and a custom time ontology.&lt;br /&gt;
&lt;br /&gt;
AnnoCultor is written in Java, but conversion rules are written in XML. They are extensible with either small Java snippets or custom rule implementations in Java.&lt;br /&gt;
AnnoCultor has been used in practice with datasets ranging from a few records to more than ten million, containing up to dozens of fields each.&lt;br /&gt;
&lt;br /&gt;
=== Apache Any23 ===&lt;br /&gt;
&lt;br /&gt;
[http://incubator.apache.org/any23/ Apache Any23] is a Java library, web service, and command-line tool for parsing multiple document formats and extracting structured data in RDF format from a variety of Web documents. Currently it supports the following input formats:&lt;br /&gt;
&lt;br /&gt;
* RDF/XML, Turtle, Notation 3&lt;br /&gt;
* RDFa&lt;br /&gt;
* Microformats: Adr, Geo, hCalendar, hCard, hListing, hResume, hReview, License, XFN and Species&lt;br /&gt;
&lt;br /&gt;
Apache Any23 is used in major Web of Data applications such as [http://sindice.com/ sindice.com] and [http://sig.ma/ sig.ma].&lt;br /&gt;
&lt;br /&gt;
=== Aperture ===&lt;br /&gt;
&lt;br /&gt;
* [http://aperture.sourceforge.net/ Aperture] is a project written in Java that gathers RDF extractors for many formats, several of which are mentioned in the list above.&lt;br /&gt;
Aperture also supports crawling, which makes it less a converter than a framework for crawling data updates (like rsync).&lt;br /&gt;
&lt;br /&gt;
=== [[PiggyBank]] ===&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/piggy-bank/ Piggy-bank] is a [http://simile.mit.edu/ Simile] project which allows the Firefox-based client to automatically load &amp;quot;[http://simile.mit.edu/RDFizers/ RDFizers]&amp;quot;, JavaScript-based converters to RDF. &lt;br /&gt;
Piggy-bank associates given scraping scripts with given web sites. (How?) &lt;br /&gt;
&lt;br /&gt;
=== Triplr ===&lt;br /&gt;
&lt;br /&gt;
[http://triplr.org/ Triplr] is a general “Stuff in, triples out” system by Dave Beckett. Triplr handles GRDDL, RSS, Atom, and other formats.&lt;br /&gt;
&lt;br /&gt;
=== Virtuoso Sponger ===&lt;br /&gt;
&lt;br /&gt;
[[OpenLinkSoftware|OpenLink Software]] via the &amp;quot;[http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtSponger Sponger]&amp;quot; component of [[VirtuosoUniversalServer|Virtuoso]]'s SPARQL Processor and Proxy Web Service (used by default by [[OpenLinkDataExplorer| OpenLink Data Explorer]]) provides RDFization for:&lt;br /&gt;
* RDFa &lt;br /&gt;
* GRDDL&lt;br /&gt;
* Amazon Web Services&lt;br /&gt;
* eBay Web Services&lt;br /&gt;
* Freebase Web Services&lt;br /&gt;
* Facebook Web Services&lt;br /&gt;
* Yahoo! Finance&lt;br /&gt;
* XBRL Instance documents&lt;br /&gt;
* DOI (includes a custom resolver for HTTP)&lt;br /&gt;
* OAI&lt;br /&gt;
* RSS/Atom Feeds&lt;br /&gt;
* Digital Music Files (various formats via ID3 Tags)&lt;br /&gt;
* Image Files&lt;br /&gt;
* vCard&lt;br /&gt;
* iCalendar&lt;br /&gt;
* Microformats - hCard, hCalendar&lt;br /&gt;
* HR-XML Resumes &lt;br /&gt;
* Flickr &lt;br /&gt;
* Del.icio.us&lt;br /&gt;
* Bugzilla &lt;br /&gt;
* ODBC or JDBC accessible SQL Data&lt;br /&gt;
* [http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtSpongerCartridgeSupportedDataSources Many others]&lt;br /&gt;
&lt;br /&gt;
= Notes =&lt;br /&gt;
&lt;br /&gt;
Historically, this list was made from lists of [http://simile.mit.edu/RDFizers/ RDFizers] and&lt;br /&gt;
[http://www.w3.org/2002/Talks/09-lcs-sweb-tbl/slide31-1.html SWAP converters].&lt;br /&gt;
It has grown significantly from community input since then.&lt;br /&gt;
&lt;br /&gt;
This should be in a data format like Semantic Media Wiki or in N3 -- TimBL&lt;br /&gt;
&lt;br /&gt;
&amp;gt; Would there be an advantage to having this kind of list in an RDF file, specifically to make queries on it? Maybe if we add a format on how to declare it here, we could create a converter to RDF. -- [[KarlDubost]]&lt;br /&gt;
&lt;br /&gt;
&amp;gt; The task force [http://esw.w3.org/topic/SweoIG/TaskForces/InfoGathering InfoGathering] from SWEO works on such a vocabulary, if you want to rewrite this list using this vocab, look here: [http://esw.w3.org/topic/SweoIG/TaskForces/InfoGathering/DataVocabulary DataVocabulary] or contact me -- [[LeoSauermann]] on 22.1.2007&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
[[Category:SwTools]]&lt;/div&gt;</description>
			<pubDate>Thu, 07 Feb 2013 20:20:20 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/Talk:ConverterToRdf</comments>		</item>
		<item>
			<title>TPAC2012</title>
			<link>http://www.w3.org/wiki/TPAC2012</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;/* Session Grid */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;div class=&amp;quot;h-event vevent&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;span class=&amp;quot;p-name summary&amp;quot;&amp;gt;[http://www.w3.org/2012/10/TPAC/ TPAC 2012]&amp;lt;/span&amp;gt; (a [[Events|W3C event]]) takes place &amp;lt;span class=&amp;quot;dt-start dtstart&amp;quot;&amp;gt;&amp;lt;span class=&amp;quot;value-title&amp;quot; title=&amp;quot;2012-10-29&amp;quot;&amp;gt;29 Oct&amp;lt;/span&amp;gt;&amp;lt;/span&amp;gt; to &amp;lt;span class=&amp;quot;dt-end dtend&amp;quot;&amp;gt;&amp;lt;span class=&amp;quot;value-title&amp;quot; title=&amp;quot;2012-11-02&amp;quot;&amp;gt;2 Nov 2012&amp;lt;/span&amp;gt;&amp;lt;/span&amp;gt; in &amp;lt;span class=&amp;quot;p-location location h-adr adr&amp;quot;&amp;gt;&amp;lt;span class=&amp;quot;p-locality locality&amp;quot;&amp;gt;Lyon&amp;lt;/span&amp;gt;, &amp;lt;span class=&amp;quot;p-country-name country-name&amp;quot;&amp;gt;France&amp;lt;/span&amp;gt;&amp;lt;/span&amp;gt;.&lt;br /&gt;
&amp;lt;/div&amp;gt; &lt;br /&gt;
&lt;br /&gt;
&amp;lt;div class=&amp;quot;h-event vevent&amp;quot;&amp;gt;&lt;br /&gt;
On &amp;lt;span class=&amp;quot;dt-start dtstart&amp;quot;&amp;gt;&amp;lt;span class=&amp;quot;value-title&amp;quot; title=&amp;quot;2012-10-31&amp;quot;&amp;gt;Wednesday, 31 October&amp;lt;/span&amp;gt;&amp;lt;/span&amp;gt;,  we hold a &amp;quot;&amp;lt;span class=&amp;quot;p-name summary&amp;quot;&amp;gt;Plenary Day&amp;lt;/span&amp;gt;.&amp;quot; &amp;lt;span class=&amp;quot;p-description description&amp;quot;&amp;gt;As we did for [[TPAC2011]], we will organize most of the day as &amp;quot;camp-style&amp;quot; breakout sessions. The [[TPAC2012-Committee]] is [http://www.w3.org/wiki/TPAC2012-Planning planning] the day. We invite you to add to or comment on [[TPAC2012/SessionIdeas]] . The people at the meeting that day will build the agenda, drawing from ideas socialized in advance and new ideas proposed the day of the meeting. &amp;lt;/span&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Questions? See the [http://www.w3.org/wiki/TPAC2012/FAQ FAQ].&lt;br /&gt;
&lt;br /&gt;
==Plenary Day Schedule==&lt;br /&gt;
&lt;br /&gt;
Draft schedule for Wednesday, 31 October 2012. &amp;quot;*&amp;quot; means &amp;quot;in plenary&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
* 08:30-09:00: W3C Update*, Jeff Jaffe ([http://www.w3.org/2012/Talks/jj-tpac2012-plenary.pptx slides])&lt;br /&gt;
* 09:00-09:30: New and upcoming work* (WGs, CGs, BGs, workshops).&lt;br /&gt;
** Web Performance Workshop, Philippe Le Hégaret ([http://www.w3.org/2012/Talks/1031-webperf-plh/ slides])&lt;br /&gt;
** SVG 2, WOFF 2, Chris Lilley ([http://www.w3.org/Talks/2012/ChrisLilley-TPAC-NewWork/cover-basic.svg slides])&lt;br /&gt;
** Pointer Events, Doug Schepers&lt;br /&gt;
** Web Platform Docs, Doug Schepers&lt;br /&gt;
** Publishing, Ivan Herman (no slides)&lt;br /&gt;
** TAG, Jeni Tennison ([http://www.w3.org/2012/Talks/jt-tpac2012-plenary/assets/fallback/index.html slides])&lt;br /&gt;
* 09:30-10:30: Agenda building*, Tantek Çelik and Ian Jacobs. See [http://www.w3.org/wiki/TPAC2012/FAQ#What_are_the_breakout_room_sizes.3F room availability]&lt;br /&gt;
* 10:30-11:00: Break&lt;br /&gt;
* 11:00-11:50: Breakouts&lt;br /&gt;
* 12:00-13:20: Lunch&lt;br /&gt;
* 13:30-14:20: Breakouts&lt;br /&gt;
* 14:30-15:20: Breakouts&lt;br /&gt;
* 15:20-16:00: Break &lt;br /&gt;
* 16:00-16:50: Breakouts&lt;br /&gt;
* 17:00-17:30: [http://www.w3.org/wiki/TPAC2012/FAQ#How_do_we_share_our_breakout_discussion_with_others.3F Sharing results]* (Ian Jacobs to chair)&lt;br /&gt;
* 17:30-17:40: Wrap-up*, thanks, TPAC 2013, Jeff Jaffe&lt;br /&gt;
* 18:30-21:30: Dinner&lt;br /&gt;
&lt;br /&gt;
===Ideas===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul class=&amp;quot;show_items&amp;quot;&amp;gt;&lt;br /&gt;
* PGP key signing area (e.g., before dinner)?&lt;br /&gt;
* Ask groups for upcoming meeting info for calendar.&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Session Grid ==&lt;br /&gt;
&lt;br /&gt;
{|  border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;5&amp;quot; cellspacing=&amp;quot;0&amp;quot; width=&amp;quot;80%&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! time&lt;br /&gt;
! PASTEUR&lt;br /&gt;
! RHONE 1&lt;br /&gt;
! RHONE 2&lt;br /&gt;
! RHONE 3&lt;br /&gt;
! RHONE 4&lt;br /&gt;
! SAINT CLAIR 1&lt;br /&gt;
! SAINT CLAIR 2&lt;br /&gt;
! SAINT CLAIR 3a&lt;br /&gt;
! SAINT CLAIR 3b&lt;br /&gt;
! SAINT CLAIR 4&lt;br /&gt;
|-&lt;br /&gt;
! 11:00-11:50&lt;br /&gt;
| [http://www.w3.org/wiki/index.php?title=Digital_Publishing Digital Publishing] (Ivan Herman) [http://www.w3.org/2012/10/31-digpub-irc #digpub]&lt;br /&gt;
| [[TPAC2012/Community Groups]] (Ian Jacobs) [http://www.w3.org/2012/10/31-community-irc #community]&lt;br /&gt;
| URL (Anne van Kesteren), [http://www.w3.org/2012/10/31-urlstandard-irc #urlstandard]&lt;br /&gt;
| [http://www.w3.org/wiki/index.php?title=Social_Web Social Web] (Ann Bassetti, Wendy Seltzer) [http://www.w3.org/2012/10/31-social-irc #social]&lt;br /&gt;
| Stereoscope/3D Web (Dong-Yong Lee) [http://www.w3.org/2012/10/31-3dweb-irc #3dweb]&lt;br /&gt;
| Web Quality BP (Elie Sloem, Aurelia Levy) #&lt;br /&gt;
| [[TPAC2012/OfflineApps Offline Apps]] (Ashok Malhotra) [http://www.w3.org/2012/10/31-offline-irc #offline]&lt;br /&gt;
| Testing (Philippe Le Hégaret) #testing [http://www.w3.org/2012/10/31-testing-minutes.html IRC log/minutes]&lt;br /&gt;
| Web Intents (Claes) [http://www.w3.org/2012/10/31-intents-irc #intents]. See wiki [http://www.w3.org/wiki/index.php?title=TPAC2012/session-WebIntents-local-services Session on Web Intents and Web Intents for local services].&lt;br /&gt;
| .&lt;br /&gt;
|-&lt;br /&gt;
! 13:30-14:20&lt;br /&gt;
| Responsive Images (Marcos Caceres), [http://www.w3.org/2012/10/31-respimg-irc #respimg]&lt;br /&gt;
| Extended DRM (Toru and Kiyoshi), [http://www.w3.org/2012/10/31-edrm-irc #edrm]&lt;br /&gt;
| Digital Marketing (Karen Myers), [http://www.w3.org/2012/10/31-digitalm-irc #digitalm]&lt;br /&gt;
| Identity and Privacy on the Web (Henry Story), [http://www.w3.org/2012/10/31-identity-irc #identity]&lt;br /&gt;
| Restyling W3C Specs: boilerplate content and styling (fantasai, darobin, divya), [http://www.w3.org/2012/10/31-restyle-minutes #restyle]&lt;br /&gt;
| Intro of Sys Apps WG (Wonsuk), [http://www.w3.org/2012/10/31-sysapps-irc #sysapps]&lt;br /&gt;
| [[TPAC2012/agile W3C Process Agility]] (Steve Zilles), [http://www.w3.org/2012/10/31-agile-irc #agile]&lt;br /&gt;
| [http://adobe.github.com/web-platform/presentations/testtwf-tpac2012/#/ Test the Web Forward] (Rebecca Hauck, Alan Stearns), [http://www.w3.org/2012/10/31-testtwf-irc #testtwf]&lt;br /&gt;
| XML Memory/Change tracking (Steven Pemberton), [http://www.w3.org/2012/10/31-xmlmemory-minutes #xmlmemory]&lt;br /&gt;
| Linked Data/Gov Publishing (Bernadette Hyland) [http://www.w3.org/2012/10/31-linkeddata-irc #linkeddata]&lt;br /&gt;
|-&lt;br /&gt;
! 14:30-15:20&lt;br /&gt;
| How to be a good chair (Charles McCathieNevile), [http://www.w3.org/2012/10/31-chair-irc #chair]&lt;br /&gt;
| Performance in the real world (Paul Bakaus, Tobie Langel), [http://www.w3.org/2012/10/31-webperf-irc #webperf]&lt;br /&gt;
| [http://www.w3.org/2012/10/31-ld-dev-minutes.html Linked Data for Web Developers] (Francois Daoust), [http://www.w3.org/2012/10/31-ld-dev-irc #ld-dev]&lt;br /&gt;
| Browser Fingerprinting a lost cause? (Brad Hill), [http://www.w3.org/2012/10/31-fingerprint-irc #fingerprint]&lt;br /&gt;
| Social Web Workshop early planning (Ann Bassetti), [http://www.w3.org/2012/10/31-socwork-irc #socwork]&lt;br /&gt;
| APIs for trusted web apps (Dave Raggett), [http://www.w3.org/2012/10/31-sysapps-irc #sysapps]&lt;br /&gt;
| Web Platform Docs (Doug Schepers), [http://www.w3.org/2012/10/31-webplatform-irc #webplatform]&lt;br /&gt;
| Test Infrastructure (JGraham), [http://www.w3.org/2012/10/31-github-irc #github]&lt;br /&gt;
| Media and Disaster (Yosuke Funahashi), [http://www.w3.org/2012/10/31-disaster-irc #disaster]&lt;br /&gt;
| IndieUI (Rich Schwerdtfeger, Janina Sajka), [http://www.w3.org/2012/10/31-indie-ui-irc #indie-ui]&lt;br /&gt;
|-&lt;br /&gt;
! 16:00-16:50&lt;br /&gt;
| [http://www.w3.org/wiki/TPAC2012/SessionIdeas#Making_the_multilingual_web_work Multilingual Web] (Felix Sasaki see [http://www.w3.org/wiki/File:Mlw-tpac2012-slides-fsasaki.pdf session slides]), [http://www.w3.org/2012/10/31-mlw-irc #mlw]&lt;br /&gt;
| [http://www.w3.org/wiki/TPAC2012/SessionIdeas#End-to-end_W3C_APIs End to end W3C APIs] (Alexandre Morgaut), [http://www.w3.org/2012/10/31-jseverywhere-irc #jseverywhere]&lt;br /&gt;
| [[TPAC2012/Modern Guide]] (Ian Jacobs), [http://www.w3.org/2012/10/31-guide-irc #guide]&lt;br /&gt;
| Catch up on DNT (Nick Doty), [http://www.w3.org/2012/10/31-dnt-irc #dnt]&lt;br /&gt;
| [http://www.w3.org/wiki/TPAC2012/tvapis TV APIs] (JC Verdie), [http://www.w3.org/2012/10/31-tvapis-irc #tvapis]&lt;br /&gt;
| Smarter WebApps for Smarter Phones (Bryan Sullivan), [http://www.w3.org/2012/10/31-smarterwebapps-irc #smarterwebapps]&lt;br /&gt;
| Declarative 3D as Polyfill (Johannes Behr), [http://www.w3.org/2012/10/31-dec3d-irc #dec3d]&lt;br /&gt;
| Gov Open Data (Hadley Beeman), [http://www.w3.org/2012/10/31-opendata-irc #opendata]&lt;br /&gt;
| Future of W3C publishing process (Philippe Le Hégaret), [http://www.w3.org/2012/10/31-tr-irc #tr]&lt;br /&gt;
| Speech and HTML (Debbie Dahl), [http://www.w3.org/2012/10/31-speech-irc #speech]&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
OTHER SESSIONS OFF THE GRID:&lt;br /&gt;
&lt;br /&gt;
* 16:00-16:50 [http://microformats.org/wiki/microformats2 microformats2] (Tantek Çelik, Ted O'Connor), Forum 2, [irc://irc.freenode.net/microformats #microformats]&lt;br /&gt;
&lt;br /&gt;
== Feedback ==&lt;br /&gt;
Please feel free to provide feedback on TPAC 2012:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul class=&amp;quot;show_items&amp;quot;&amp;gt;&lt;br /&gt;
* [https://www.w3.org/2002/09/wbs/35125/tpac2012-feedback/ Official TPAC 2012 feedback form]&lt;br /&gt;
* [[TPAC2012/feedback|TPAC2012 feedback wiki page]]&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</description>
			<pubDate>Wed, 31 Oct 2012 14:28:35 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/Talk:TPAC2012</comments>		</item>
		<item>
			<title>TPAC2012/Community Groups</title>
			<link>http://www.w3.org/wiki/TPAC2012/Community_Groups</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Community and Business Groups. A breakout session during Plenary Day at TPAC 2012.&lt;br /&gt;
&lt;br /&gt;
* [http://www.w3.org/2012/Talks/ij-tpac2012-plenary/ Ian Jacobs' slides]&lt;br /&gt;
&lt;br /&gt;
== Minutes ==&lt;/div&gt;</description>
			<pubDate>Wed, 31 Oct 2012 10:10:20 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/Talk:TPAC2012/Community_Groups</comments>		</item>
		<item>
			<title>RdfAndSql</title>
			<link>http://www.w3.org/wiki/RdfAndSql</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;/* Proprietary APIs and query languages */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
While SQL databases are sometimes used for [[StoringRDF]] (a la [[LargeTripleStores|Large Triple Stores]]) it's also quite interesting to take [[Rdb2RdfXG|SQL stores that were not designed with RDF in mind and export/expose them as RDF]]. With SPARQL, the connection becomes even more interesting.&lt;br /&gt;
&lt;br /&gt;
Some implementations map the database contents to an RDF vocabulary that is created automatically from the database schema. Others require a manual mapping of tables and columns to RDF properties and classes, but support use of existing vocabularies without an external rules engine.&lt;br /&gt;
&lt;br /&gt;
== SPARQL-based ==&lt;br /&gt;
&lt;br /&gt;
* [http://d2rq.org/d2r-server D2R Server] - SPARQL and [[LinkedData]] over HTTP; automatic, highly customizable mapping. Part of D2RQ (see below). &lt;br /&gt;
* [http://jena.sourceforge.net/SquirrelRDF/ SquirrelRDF] - SPARQL over API or HTTP, automatic or manual mapping. SquirrelRDF exports LDAP as well as SQL via SPARQL. See [[AdapterArchitecture]] for more along those lines.&lt;br /&gt;
* [http://www.w3.org/2005/05/22-SPARQL-MySQL/XTech SPASQL] - SPARQL over MySQL, automatic mapping&lt;br /&gt;
* [http://sourceforge.net/projects/rdquery RDQuery] - RDQL and SPARQL (?) over GUI, automatic mapping (same as Relational.OWL)&lt;br /&gt;
* [http://esw.w3.org/topic/DartGrid DartGrid] - a Semantic Grid system that includes a SPARQL rewriting component using Datalog-like rules (more details in a [http://ccnt.zju.edu.cn/projects/dartgrid/files/presentation/%5B05-11-29-SKG%5D.ppt PPT presentation] and a [http://ccnt.zju.edu.cn/projects/dartgrid/files/publication/Huajun-ICDE2006.pdf conference paper]).&lt;br /&gt;
* [[VirtuosoUniversalServer|Virtuoso Universal Server]] - [http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VOSSQL2RDF SPARQL-based Declarative Metaschema Language] for exposing any Virtuoso-housed data, as well as HTTP-, ODBC-, JDBC-, and otherwise-accessible SQL and XML data sources as URI Dereferenceable [[LinkedData|Linked RDF Instance Data]]. RDFized data may be retrieved as RDF/XML, JSON, N3, etc.&lt;br /&gt;
* [http://moustaki.org/p2r/ P2R] - provides dynamic access to Prolog knowledge bases, which may wrap SQL queries, calls to web services, XML parsing, etc.&lt;br /&gt;
* [http://www.cambridgesemantics.com/products/anzo_data_collaboration_server Anzo Data Collaboration Server] - from Cambridge Semantics, uses SPARQL to run integrated queries over RDBMSes in addition to RDF stores, LDAP directories, etc. Also translates RDF updates into SQL inserts and updates when possible.&lt;br /&gt;
&lt;br /&gt;
== SPARQL Across Federated Sources ==&lt;br /&gt;
&lt;br /&gt;
'''The Semantic Discovery System''': Lets non-technical users create and execute ad hoc queries through a network graph user interface (the SPARQL-to-SQL translation is generated automatically). It integrates and interconnects data silos of different types, providing a virtual Semantic Web interface to RDBMSs, Web services, Excel spreadsheets, and hybrid file systems.&lt;br /&gt;
* [http://www.semanticdiscoverysystems.com/ Semantic Discovery Systems (Main Site)]&lt;br /&gt;
* [http://www.meaning2go.com/ Semantic Discovery Systems (&amp;quot;Trailer&amp;quot;/Summary Web site)]&lt;br /&gt;
&lt;br /&gt;
== RDF/XML-based ==&lt;br /&gt;
&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/dbork/dbview.py dbview] - Browsable RDF/XML ([[LinkedData]]) over HTTP, manual mapping (?)&lt;br /&gt;
* [http://www.dbs.cs.uni-duesseldorf.de/RDF/ Relational.OWL] - RDF/XML dump, automatic mapping&lt;br /&gt;
* [http://www.wiwiss.fu-berlin.de/suhl/bizer/d2rmap/D2Rmap.htm D2R Map] - RDF/XML dump, manual mapping&lt;br /&gt;
* [http://metamorphoses.sourceforge.net/ METAmorphoses] - Produces RDF/XML through a template language that selects a subgraph. Includes servlet for publishing the RDF/XML as [[LinkedData]]. Manual mapping.&lt;br /&gt;
* [http://triplify.org/Overview Triplify] is a small plug-in for Web applications, which reveals the semantic structures encoded in relational databases by making database content available as RDF, JSON or Linked Data.&lt;br /&gt;
* [[VirtuosoUniversalServer|Virtuoso Universal Server]] (in SPARQL-based section above) facilitates retrieval of RDBMS-hosted data in RDF Linked Data form, including RDF/XML serialization.&lt;br /&gt;
&lt;br /&gt;
* [http://www.dblab.ntua.gr/~bikakis/SPARQL2XQuery.html SPARQL2XQuery] - Querying XML Data with SPARQL.&lt;br /&gt;
&lt;br /&gt;
== Proprietary APIs and query languages ==&lt;br /&gt;
&lt;br /&gt;
* [http://d2rq.org/ D2RQ] - Java library, provides access to relational data from the command line, through SPARQL, and through the Jena API. Automatic, highly customizable mapping.&lt;br /&gt;
* [http://www.w3.org/2004/04/30-RDF-RDB-access/ FeDeRate] - Algae query language, manual mapping (?)&lt;br /&gt;
* [http://www.cs.man.ac.uk/~ocorcho/documents/SWDB2004_BarrasaEtAl.pdf R2O] - proprietary query language, manual mapping&lt;br /&gt;
* [http://jena.hpl.hp.com/juc2006/proceedings/wilkinson/paper.pdf Jena property tables] - Jena API, manual mapping&lt;br /&gt;
* [[VirtuosoUniversalServer|Virtuoso Universal Server]] - (in SPARQL-based section above) uses [[SPASQL|SPASQL]] to implement its own [http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VOSSQL2RDF Declarative Metaschema Language] for exposing ODBC- or JDBC-accessible SQL data sources as URI Dereferenceable [[LinkedData|Linked RDF Instance Data]]. RDFized data may be retrieved as RDF/XML, JSON, N3, etc.&lt;br /&gt;
&lt;br /&gt;
== Other Related Dynamic RDF-&amp;gt;SQL Mappings ==&lt;br /&gt;
* [https://svn.rdflib.net/trunk/rdflib/store/FOPLRelationalModel/ FOPLRelationalModel] - Models a target relational [http://copia.ogbuji.net/files/N3RelationalModel.xml model] as a set of objects which dynamically generate optimized SQL queries (intersections, unions, etc.) from Basic Triple Patterns.&lt;br /&gt;
&lt;br /&gt;
== Benchmarking RDB-to-RDF Tools ==&lt;br /&gt;
&lt;br /&gt;
* The latest information can be found in [[RdfStoreBenchmarking]].&lt;br /&gt;
* Martin Svihla, Ivan Jelinek: [http://metamorphoses.sourceforge.net/Papers/pdf/msvihla_dexa2007.pdf Benchmarking RDF Production Tools] - a paper comparing the performance of relational-database-to-RDF mapping tools (METAmorphoses, D2RQ, SquirrelRDF) with native RDF stores (Jena, Sesame).&lt;br /&gt;
&lt;br /&gt;
== Some History ==&lt;br /&gt;
&lt;br /&gt;
* TimBL wrote a [http://www.w3.org/DesignIssues/RDB-RDF.html design note on Relational Databases on the Semantic Web] in 1998. &lt;br /&gt;
* As a follow-up to some [http://dig.csail.mit.edu/breadcrumbs/node/140 semweb dev track stuff at WWW2006 (e.g., D2R Server talk)], a few of us ([[DanConnolly]], ericP, EliasT, TimBL, AndyS, chimezie) got together for an [http://lists.w3.org/Archives/Public/public-sparql-dev/2006AprJun/0016.html IRC/phone meeting 8 Jun], where we agreed to use [http://lists.w3.org/Archives/Public/public-sparql-dev/ sparql-dev] and this wiki topic.&lt;br /&gt;
* Chris B. and a few others had a [http://lists.w3.org/Archives/Public/public-sparql-dev/2006AprJun/0020.html lunch discussion at ESWC] a few days later.&lt;br /&gt;
* Note the [http://www.w3.org/2007/03/RdfRDB/ W3C Workshop on RDF Access to Relational Databases], 25-26 October 2007, Cambridge, MA, USA.&lt;br /&gt;
&lt;br /&gt;
== Related ==&lt;br /&gt;
* [[DynamicRDFizers|List of Dynamic RDFizers]] for RDFization of non/less-structured sources.&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
[[Category:SwTools]] [[RdfRdbMappingExamples]]&lt;/div&gt;</description>
			<pubDate>Tue, 03 Apr 2012 10:25:30 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/Talk:RdfAndSql</comments>		</item>
		<item>
			<title>RdfAndSql</title>
			<link>http://www.w3.org/wiki/RdfAndSql</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;Update D2R Server link&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
While SQL databases are sometimes used for [[StoringRDF]] (a la [[LargeTripleStores|Large Triple Stores]]) it's also quite interesting to take [[Rdb2RdfXG|SQL stores that were not designed with RDF in mind and export/expose them as RDF]]. With SPARQL, the connection becomes even more interesting.&lt;br /&gt;
&lt;br /&gt;
Some implementations map the database contents to an RDF vocabulary that is created automatically from the database schema. Others require a manual mapping of tables and columns to RDF properties and classes, but support use of existing vocabularies without an external rules engine.&lt;br /&gt;
&lt;br /&gt;
== SPARQL-based ==&lt;br /&gt;
&lt;br /&gt;
* [http://d2rq.org/d2r-server D2R Server] - SPARQL and [[LinkedData]] over HTTP; automatic, highly customizable mapping. Part of D2RQ (see below). &lt;br /&gt;
* [http://jena.sourceforge.net/SquirrelRDF/ SquirrelRDF] - SPARQL over API or HTTP, automatic or manual mapping. SquirrelRDF exports LDAP as well as SQL via SPARQL. See [[AdapterArchitecture]] for more along those lines.&lt;br /&gt;
* [http://www.w3.org/2005/05/22-SPARQL-MySQL/XTech SPASQL] - SPARQL over MySQL, automatic mapping&lt;br /&gt;
* [http://sourceforge.net/projects/rdquery RDQuery] - RDQL and SPARQL (?) over GUI, automatic mapping (same as Relational.OWL)&lt;br /&gt;
* [http://esw.w3.org/topic/DartGrid DartGrid] - a Semantic Grid system that includes a SPARQL rewriting component using Datalog-like rules (more details in a [http://ccnt.zju.edu.cn/projects/dartgrid/files/presentation/%5B05-11-29-SKG%5D.ppt PPT presentation] and a [http://ccnt.zju.edu.cn/projects/dartgrid/files/publication/Huajun-ICDE2006.pdf conference paper]).&lt;br /&gt;
* [[VirtuosoUniversalServer|Virtuoso Universal Server]] - [http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VOSSQL2RDF SPARQL-based Declarative Metaschema Language] for exposing any Virtuoso-housed data, as well as HTTP-, ODBC-, JDBC-, and otherwise-accessible SQL and XML data sources as URI Dereferenceable [[LinkedData|Linked RDF Instance Data]]. RDFized data may be retrieved as RDF/XML, JSON, N3, etc.&lt;br /&gt;
* [http://moustaki.org/p2r/ P2R] - provides dynamic access to Prolog knowledge bases, which may wrap SQL queries, calls to web services, XML parsing, etc.&lt;br /&gt;
* [http://www.cambridgesemantics.com/products/anzo_data_collaboration_server Anzo Data Collaboration Server] - from Cambridge Semantics, uses SPARQL to run integrated queries over RDBMSes in addition to RDF stores, LDAP directories, etc. Also translates RDF updates into SQL inserts and updates when possible.&lt;br /&gt;
&lt;br /&gt;
== SPARQL Across Federated Sources ==&lt;br /&gt;
&lt;br /&gt;
'''The Semantic Discovery System''': Lets non-technical users create and execute ad hoc queries through a network graph user interface (the SPARQL-to-SQL translation is generated automatically). It integrates and interconnects data silos of different types, providing a virtual Semantic Web interface to RDBMSs, Web services, Excel spreadsheets, and hybrid file systems.&lt;br /&gt;
* [http://www.semanticdiscoverysystems.com/ Semantic Discovery Systems (Main Site)]&lt;br /&gt;
* [http://www.meaning2go.com/ Semantic Discovery Systems (&amp;quot;Trailer&amp;quot;/Summary Web site)]&lt;br /&gt;
&lt;br /&gt;
== RDF/XML-based ==&lt;br /&gt;
&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/dbork/dbview.py dbview] - Browsable RDF/XML ([[LinkedData]]) over HTTP, manual mapping (?)&lt;br /&gt;
* [http://www.dbs.cs.uni-duesseldorf.de/RDF/ Relational.OWL] - RDF/XML dump, automatic mapping&lt;br /&gt;
* [http://www.wiwiss.fu-berlin.de/suhl/bizer/d2rmap/D2Rmap.htm D2R Map] - RDF/XML dump, manual mapping&lt;br /&gt;
* [http://metamorphoses.sourceforge.net/ METAmorphoses] - Produces RDF/XML through a template language that selects a subgraph. Includes servlet for publishing the RDF/XML as [[LinkedData]]. Manual mapping.&lt;br /&gt;
* [http://triplify.org/Overview Triplify] is a small plug-in for Web applications, which reveals the semantic structures encoded in relational databases by making database content available as RDF, JSON or Linked Data.&lt;br /&gt;
* [[VirtuosoUniversalServer|Virtuoso Universal Server]] (in SPARQL-based section above) facilitates retrieval of RDBMS-hosted data in RDF Linked Data form, including RDF/XML serialization.&lt;br /&gt;
&lt;br /&gt;
* [http://www.dblab.ntua.gr/~bikakis/SPARQL2XQuery.html SPARQL2XQuery] - Querying XML Data with SPARQL.&lt;br /&gt;
&lt;br /&gt;
== Proprietary APIs and query languages ==&lt;br /&gt;
&lt;br /&gt;
* [http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/ D2RQ] - Java library, provides access to relational data through the Jena API, Sesame API, SPARQL, RDQL. Automatic, highly customizable mapping.&lt;br /&gt;
* [http://www.w3.org/2004/04/30-RDF-RDB-access/ FeDeRate] - Algae query language, manual mapping (?)&lt;br /&gt;
* [http://www.cs.man.ac.uk/~ocorcho/documents/SWDB2004_BarrasaEtAl.pdf R2O] - proprietary query language, manual mapping&lt;br /&gt;
* [http://jena.hpl.hp.com/juc2006/proceedings/wilkinson/paper.pdf Jena property tables] - Jena API, manual mapping&lt;br /&gt;
* [[VirtuosoUniversalServer|Virtuoso Universal Server]] - (in SPARQL-based section above) uses [[SPASQL|SPASQL]] to implement its own [http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VOSSQL2RDF Declarative Metaschema Language] for exposing ODBC- or JDBC-accessible SQL data sources as URI Dereferenceable [[LinkedData|Linked RDF Instance Data]]. RDFized data may be retrieved as RDF/XML, JSON, N3, etc.&lt;br /&gt;
&lt;br /&gt;
== Other Related Dynamic RDF-&amp;gt;SQL Mappings ==&lt;br /&gt;
* [https://svn.rdflib.net/trunk/rdflib/store/FOPLRelationalModel/ FOPLRelationalModel] - Models a target relational [http://copia.ogbuji.net/files/N3RelationalModel.xml model] as a set of objects which dynamically generate optimized SQL queries (intersections, unions, etc.) from Basic Triple Patterns.&lt;br /&gt;
&lt;br /&gt;
== Benchmarking RDB-to-RDF Tools ==&lt;br /&gt;
&lt;br /&gt;
* The latest information can be found in [[RdfStoreBenchmarking]].&lt;br /&gt;
* Martin Svihla, Ivan Jelinek: [http://metamorphoses.sourceforge.net/Papers/pdf/msvihla_dexa2007.pdf Benchmarking RDF Production Tools] - a paper comparing the performance of relational-database-to-RDF mapping tools (METAmorphoses, D2RQ, SquirrelRDF) with native RDF stores (Jena, Sesame).&lt;br /&gt;
&lt;br /&gt;
== Some History ==&lt;br /&gt;
&lt;br /&gt;
* TimBL wrote a [http://www.w3.org/DesignIssues/RDB-RDF.html design note on Relational Databases on the Semantic Web] in 1998. &lt;br /&gt;
* As a follow-up to some [http://dig.csail.mit.edu/breadcrumbs/node/140 semweb dev track stuff at WWW2006 (e.g., D2R Server talk)], a few of us ([[DanConnolly]], ericP, EliasT, TimBL, AndyS, chimezie) got together for an [http://lists.w3.org/Archives/Public/public-sparql-dev/2006AprJun/0016.html IRC/phone meeting 8 Jun], where we agreed to use [http://lists.w3.org/Archives/Public/public-sparql-dev/ sparql-dev] and this wiki topic.&lt;br /&gt;
* Chris B. and a few others had a [http://lists.w3.org/Archives/Public/public-sparql-dev/2006AprJun/0020.html lunch discussion at ESWC] a few days later.&lt;br /&gt;
* Note the [http://www.w3.org/2007/03/RdfRDB/ W3C Workshop on RDF Access to Relational Databases], 25-26 October 2007, Cambridge, MA, USA.&lt;br /&gt;
&lt;br /&gt;
== Related ==&lt;br /&gt;
* [[DynamicRDFizers|List of Dynamic RDFizers]] for RDFization of non/less-structured sources.&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
[[Category:SwTools]] [[RdfRdbMappingExamples]]&lt;/div&gt;</description>
			<pubDate>Tue, 03 Apr 2012 10:24:07 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/Talk:RdfAndSql</comments>		</item>
		<item>
			<title>ConverterToRdf</title>
			<link>http://www.w3.org/wiki/ConverterToRdf</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;update D2R Server/D2RQ link&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
A Converter to RDF is a tool which converts application data from an application-specific format into RDF for use with RDF tools and integration with other data. Converters may be part of a one-time migration effort, or part of a running system which provides a semantic web view of a given application. See also: [[RDFImportersAndAdapters]]&lt;br /&gt;
&lt;br /&gt;
Please add converters as you make them or hear of them.&lt;br /&gt;
&lt;br /&gt;
== Formats ==&lt;br /&gt;
&lt;br /&gt;
in alphabetical order:&lt;br /&gt;
&lt;br /&gt;
=== [[BibTex]] ===&lt;br /&gt;
&lt;br /&gt;
[[BibTex]] is the format for bibliographic references in TeX.&lt;br /&gt;
&lt;br /&gt;
* [http://data.bibbase.org BibBase] transforms BibTeX files (given in a URL) into Linked Data with RDF/XML output support. &lt;br /&gt;
** Uses a custom  [http://purl.org/bibbase/ontology BibTeX ontology] but provides a table of mappings to other ontologies&lt;br /&gt;
** Also provides [http://bibbase.org HTML interface] and RSS feed&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/bibtex2rdf/ bibtex2rdf] transforms BibTeX files into RDF/XML. (Simile)&lt;br /&gt;
* [http://www.l3s.de/~siberski/bibtex2rdf/ bibtex2rdf] - A configurable BibTeX to RDF converter by Wolf Siberski.&lt;br /&gt;
* [http://www.cs.vu.nl/%7Emcaklein/bib2rdf/ An online service] set up at the Vrije Universiteit in Amsterdam, the Netherlands, following the [http://ontoweb.aifb.uni-karlsruhe.de/ OntoWeb portal] vocabulary. The [http://www.cs.vu.nl/%7Emcaklein/bib2rdf/bib2rdf perl source] can also be downloaded.&lt;br /&gt;
* [http://www.aifb.uni-karlsruhe.de/WBS/pha/bib/index.html Java BibTeX-To-RDF Converter] based on the [http://ontobroker.semanticweb.org/ontos/swrc.html SWRC] terminology.&lt;br /&gt;
&lt;br /&gt;
=== Bittorrent ===&lt;br /&gt;
&lt;br /&gt;
* http://www.inf.unideb.hu/~jeszy/rdfizers is alas now 404 (in 2007). This was a link from RDFizers but may be incorrect.&lt;br /&gt;
&lt;br /&gt;
=== Debian  ===&lt;br /&gt;
&lt;br /&gt;
The package information in Debian and similar systems (Ubuntu, Fink, etc), with its general usefulness and its graph-like nature, is a clear candidate for conversion to RDF.&lt;br /&gt;
&lt;br /&gt;
See [http://blog.drinsama.de/erich/en/xml/2007011204-rdf-representation-of-packages VitaVoni blog] about this.&lt;br /&gt;
&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/util/fink2n3.py fink2n3.py] takes Fink (OS X port of Debian packaging) dependencies and converts them to RDF/N3. (SWAP) It is unclear whether this could be quickly adapted to export Debian data.&lt;br /&gt;
* [http://github.com/nbarrientos/steamy STEAMY] converts Debian packages to RDF.&lt;br /&gt;
&lt;br /&gt;
=== Email (RFC822 headers) ===&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/email2rdf/ email2rdf] transforms email mbox files into RDF/XML. (Simile)&lt;br /&gt;
* [http://www.w3.org/2000/04/maillog2rdf/aboutMsg.py aboutMsg.py] converts email metadata to RDF. (SWAP)&lt;br /&gt;
* [http://swaml.berlios.de/ SWAML] transforms a mailing list into RDF/XML and XHTML+RDFa using [[SIOC]].&lt;br /&gt;
** [http://linkedmarkmail.wikier.org/ LinkedMarkMail] transforms the mailing list archives indexed by [http://markmail.org/ MarkMail] into RDF/XML on the fly.&lt;br /&gt;
* [http://search.cpan.org/dist/Email-MIME-XMTP/ Email::MIME::XMTP] Perl extension to read and write [http://www.openhealth.org/xmtp/ XMTP] &lt;br /&gt;
* [http://aperture.sourceforge.net aperture.sf.net] IMAP crawler&lt;br /&gt;
&lt;br /&gt;
There are others in this vein which run over IMAP or mailbox files.@@&lt;br /&gt;
&lt;br /&gt;
=== Excel ===&lt;br /&gt;
&lt;br /&gt;
* Cambridge Semantics' [http://www.cambridgesemantics.com/products/anzo_for_excel Anzo for Excel] extracts RDF data from Excel spreadsheets while keeping the spreadsheet in-sync with the underlying data as things change&lt;br /&gt;
* [http://www.topbraidcomposer.com TopBraid Composer] can convert Excel spreadsheets into instances of an RDF schema.&lt;br /&gt;
* [http://xlwrap.sourceforge.net XLWrap] wraps spreadsheets (including cross tables) to arbitrary RDF graphs; supports Excel/OpenDocument/CSV streamed processing, local/HTTP loading, expressions similar to Excel/OpenOffice Calc, custom functions, usage via API or SPARQL endpoint&lt;br /&gt;
* [http://www.tao-project.eu/researchanddevelopment/demosanddownloads/RDBToOnto.html RDBToOnto], see description below under SQL section.&lt;br /&gt;
* [http://github.com/Data2Semantics/TabLinker TabLinker] can convert non-standard Excel spreadsheets to the Data Cube vocabulary, e.g. Excel files that contain hierarchical information in row and column headers etc.&lt;br /&gt;
&lt;br /&gt;
=== EXIF ===&lt;br /&gt;
&lt;br /&gt;
See JPEG.&lt;br /&gt;
&lt;br /&gt;
=== File Systems ===&lt;br /&gt;
&lt;br /&gt;
* [http://www.cs.univie.ac.at/publication.php?pid=5750 TripFS] exposes an entire file system as linked data, tracks changes, and links files to external data sources.&lt;br /&gt;
&lt;br /&gt;
=== Flickr data ===&lt;br /&gt;
&lt;br /&gt;
* Dave Beckett's [http://librdf.org/flickcurl/ flickcurl] library can access Flickr information (including machine tags) and convert it to RDF.&lt;br /&gt;
&lt;br /&gt;
=== Flat files ===&lt;br /&gt;
&lt;br /&gt;
Unix systems store data such as /etc/passwd in flat text files with delimiter-separated fields (colon-separated, in the case of /etc/passwd).&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/flat2rdf/ flat2rdf] converts classic Unix text database files, like /etc/passwd, into RDF/N3. (Simile)&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/tab2n3.py tab2n3.py] takes tab-separated text (as typically output by all kinds of things, including Microsoft Outlook and spreadsheets) and converts it to N3, using the column headings to generate property URIs. (SWAP)&lt;br /&gt;
* [http://www.topbraidcomposer.com TopBraid Composer] can convert tab-separated spreadsheet files into an RDF/OWL class with corresponding properties and instances.&lt;br /&gt;
* [http://xlwrap.sourceforge.net XLWrap] wraps CSV files (and spreadsheets) to arbitrary RDF graphs; supports local/HTTP loading, expressions similar to Excel/OpenOffice Calc, custom functions, usage via API or SPARQL endpoint&lt;br /&gt;
&lt;br /&gt;
=== GPS ===&lt;br /&gt;
&lt;br /&gt;
* [http://www.hackdiary.com/archives/000040.html garmin2rdf.py] reads a Garmin GPS receiver, dumping the contents in RDF/XML. (Matt Biddulph)&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/pim/fromGarmin.py fromGarmin.py] Downloads GPS data from a Garmin on a serial link to an RDF/N3 file. (SWAP)&lt;br /&gt;
&lt;br /&gt;
=== iCalendar ===&lt;br /&gt;
&lt;br /&gt;
iCalendar is an IETF standard for calendar (event and to-do list) data.&lt;br /&gt;
iCalendar files are typically stored with a .ics extension.&lt;br /&gt;
&lt;br /&gt;
* [http://www.w3.org/2002/12/cal/fromIcal.py fromIcal.py] converts iCalendar data to RDF.&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/pim/toIcal.py toIcal.py] converts RDF back into iCalendar.&lt;br /&gt;
* [http://aperture.sourceforge.net aperture.sf.net] includes a Java converter for iCalendar.&lt;br /&gt;
* [http://torrez.us/ics2rdf/ ics2rdf] is an iCal to RDF service.&lt;br /&gt;
&lt;br /&gt;
=== Java bytecode ===&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/java2rdf/ java2rdf] scans [http://java.sun.com/ Java] bytecode for method calls and creates a description of the dependencies between classes and the package/archive, encoded in RDF/N3. (Simile)&lt;br /&gt;
&lt;br /&gt;
=== Javadoc ===&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/javadoc2rdf/ javadoc2rdf] is a doclet that makes javadoc output metadata about your code (structure of the classes, methods, comments, etc.) encoded in RDF/N3. (Simile) &lt;br /&gt;
&lt;br /&gt;
=== Issue tracking: [http://www.atlassian.com/software/jira/ Jira] ===&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/jira2rdf/ jira2rdf]  transforms Atlassian Jira's events about bug reports and issue tracking into RDF/N3.&lt;br /&gt;
&lt;br /&gt;
=== JPEG ===&lt;br /&gt;
&lt;br /&gt;
The metadata within a JPEG photo is encoded using the EXIF standard.&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/jpeg2rdf/ jpeg2rdf] scans a folder for JPEG files, parses the EXIF and IPTC metadata found in those files and dumps an RDF/N3 representation of it into a file. (Simile)&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/pim/jhead/ An adapted version of jhead] extracts RDF data from the EXIF metadata encoded in JPEG files within a directory. Generates RDF/N3. (SWAP)&lt;br /&gt;
&lt;br /&gt;
=== LDIF ===&lt;br /&gt;
&lt;br /&gt;
LDIF is the format used for contact information in LDAP server systems. It is, for example, exported by Thunderbird's address book.&lt;br /&gt;
&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/pim/ldif2n3.py ldif2n3.py] is very incomplete, but useful. Generates FOAF. Hides email addresses by hashing in the FOAF style if the -m command flag is given. (SWAP)&lt;br /&gt;
&lt;br /&gt;
=== Makefile ===&lt;br /&gt;
&lt;br /&gt;
The Unix Makefile syntax expresses dependencies between files in a software build.&lt;br /&gt;
&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/util/make2n3.py make2n3.py] converts the makefiles in several directories into RDF, so that they can be merged to get the big picture. (SWAP)&lt;br /&gt;
&lt;br /&gt;
=== MARC ===&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/marcmods2rdf/ marcmods2rdf] transforms [http://www.loc.gov/marc/ MARC] records from Z39.2 format into [http://www.loc.gov/standards/mods/ MODS] and then from MODS to an RDF representation of MODS.&lt;br /&gt;
&lt;br /&gt;
=== Meteographical ===&lt;br /&gt;
* [http://inamidst.com/sw/meteo/ Meteo] is UK weather forecast data in RDF, extracted from NOAA's public domain GRIB files. Example: [http://inamidst.com/sw/meteo/rdf/London London].&lt;br /&gt;
&lt;br /&gt;
=== Microformats ===&lt;br /&gt;
* [http://www/incubator.apache.org/any23/ Apache Any23] is a Java library, web service, and command-line tool for parsing multiple document formats and extracting structured data in RDF format from a variety of Web documents. It is used by [http://sindice.com/ Sindice.com]. The microformats support is [http://sindice.com/developers/microformat detailed in the Sindice.com documentation].&lt;br /&gt;
&lt;br /&gt;
=== Multimedia ===&lt;br /&gt;
&lt;br /&gt;
Following the [http://en.wikipedia.org/wiki/Don't_repeat_yourself DRY principle], a pointer to tools in the realm of multimedia (origin: [http://www.w3.org/2005/Incubator/mmsem MMSEM-XG]):&lt;br /&gt;
&lt;br /&gt;
* [http://www.w3.org/2005/Incubator/mmsem/wiki/Tools_and_Resources Multimedia Semantics Tools]&lt;br /&gt;
&lt;br /&gt;
=== OAI-PMH ===&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/oai2rdf/ oai2rdf] harvests an [http://www.openarchives.org/OAI/openarchivesprotocol.html OAI-PMH] repository and transforms the captured metadata into an RDF representation through pluggable XSLT stylesheets.&lt;br /&gt;
&lt;br /&gt;
=== Outlook ===&lt;br /&gt;
&lt;br /&gt;
Microsoft Outlook stores contact data, event data, and so on in a proprietary format.&lt;br /&gt;
&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/pim/lookout.py Lookout.py] converts the Microsoft Outlook calendar and address format into RDF. (SWAP)&lt;br /&gt;
* [http://aperture.sourceforge.net aperture.sf.net] includes a Java crawler for MS Outlook.&lt;br /&gt;
&lt;br /&gt;
=== Open Financial Exchange (OFX) ===&lt;br /&gt;
&lt;br /&gt;
[http://www.ofx.net/ OFX] is the format for downloaded bank statements and other financial information.&lt;br /&gt;
There are various levels of OFX, the early ones being HTTP headers followed by SGML, the later ones being HTTP-like headers followed by XML.&lt;br /&gt;
&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/pim/financial/OFX-to-n3.py OFX-to-n3.py] converts OFX format to RDF/N3. The conversion is only syntactic. The OFX modeling is pretty well thought out, so taking it as defining an RDF ontology seems to make sense. Rules can then be used to define mappings into your favorite ontology.&lt;br /&gt;
&lt;br /&gt;
=== Open [[CourseWare]] ===&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/ocw2rdf/ ocw2rdf] harvests metadata from the MIT [http://ocw.mit.edu/ OpenCourseWare] web site and transforms it into an RDF representation of [http://meta.wikimedia.org/wiki/IEEE_LOM  IEEE LOM].&lt;br /&gt;
&lt;br /&gt;
=== Palm OS ===&lt;br /&gt;
&lt;br /&gt;
* [http://dev.w3.org/cvsweb/2001/palmagent Palmagent] converts the calendar format of PalmOS into RDF. (SWAP)&lt;br /&gt;
&lt;br /&gt;
=== plist ===&lt;br /&gt;
&lt;br /&gt;
The Apple OS X property list (.plist) file type is an XML format for arbitrary structured data.&lt;br /&gt;
Numeric keys are used as local IDs. OS X applications store many kinds of data in these files, including configuration data, iPhoto album and photo data, iTunes metadata, and so on.&lt;br /&gt;
&lt;br /&gt;
To convert plists well, added information is necessary, such as a namespace for the properties.&lt;br /&gt;
&lt;br /&gt;
[http://dev.w3.org/cvsweb/2000/10/swap/util/plist2rdf.xsl plist2rdf.xsl] is an XSLT script to convert a plist file into RDF/XML. It does not add namespaces to the exported data.&lt;br /&gt;
&lt;br /&gt;
=== Quicken Interchange Format (QIF) ===&lt;br /&gt;
&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/pim/qif2n3.py qif2n3.py] takes Quicken Interchange Format data and converts it to RDF/N3. (SWAP)&lt;br /&gt;
&lt;br /&gt;
=== Quick and Dirty CSV to RDF Converter (QUIDICRC) ===&lt;br /&gt;
&lt;br /&gt;
* [http://www.mindswap.org/~anant/quidicrc/ quidicrc] is a Perl script for rapidly transferring CSV to RDF with some translation in the middle. (not actively being maintained, available open source -- SWAP)&lt;br /&gt;
&lt;br /&gt;
=== Random ===&lt;br /&gt;
&lt;br /&gt;
Seriously.&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/random2rdf/ random2rdf] generates synthetic random graphs encoded in RDF/N3.&lt;br /&gt;
&lt;br /&gt;
=== Spreadsheet ===&lt;br /&gt;
&lt;br /&gt;
* An [http://lab.linkeddata.deri.ie/2010/grefine-rdf-extension/ RDF Extension] is available for [http://code.google.com/p/google-refine/ Google Refine]. It can export Excel, CSV, and other tabular data to RDF. The schema mapping can be defined in a graphical UI.&lt;br /&gt;
* [http://www.mindswap.org/%7Erreck/excel2rdf.shtml Excel2rdf] is a Microsoft Windows program (exe) that converts Excel files into valid RDF. It has been tested on Windows 98 and Windows 2000 Professional. ([[MindSwap]]) Export can be done via comma- or tab-separated values. See Flat Files above.&lt;br /&gt;
* [http://aperture.sourceforge.net aperture.sf.net] includes a Java crawler for Excel and OpenDocument files. It extracts only plain text and basic metadata, though.&lt;br /&gt;
* [http://rdf123.umbc.edu/ RDF123] has Windows and Linux applications to download, a Java application and servlet. &lt;br /&gt;
* Cambridge Semantics' [http://www.cambridgesemantics.com/products/anzo_for_excel Anzo for Excel] extracts RDF data from Excel spreadsheets while keeping the spreadsheet in-sync with the underlying data as things change&lt;br /&gt;
* [http://xlwrap.sourceforge.net XLWrap] wraps spreadsheets (including cross tables) to arbitrary RDF graphs; supports Excel/OpenDocument/CSV streamed processing, local/HTTP loading, expressions similar to Excel/OpenOffice Calc, custom functions, usage via API or SPARQL endpoint&lt;br /&gt;
* [http://purl.org/twc/id/software/csv2rdf4lod csv2rdf4lod] uses declarative RDF enhancement parameters to specify how to transform tabular data into well-structured, well-connected RDF. The tool uses identifiers for ''source'' organization, ''dataset'', and ''version'' to establish default namespaces for all URIs created and provides VoID and provenance metadata as part of the conversion output.&lt;br /&gt;
&lt;br /&gt;
=== SQL ===&lt;br /&gt;
&lt;br /&gt;
SQL databases are rich stores of relational data ideal for export as RDF. &lt;br /&gt;
Conference tracks and many papers cover this subject from different angles. See also: [[RdfAndSql]]&lt;br /&gt;
&lt;br /&gt;
* [http://d2rq.org/ D2RQ] provides a mapping from a SQL server (tested with several brands), producing both linked virtual RDF data files and a SPARQL service.  Uses a configuration file in Turtle. (DERI and FU Berlin)&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/dbork/dbview.py dbview.py] provides a mapping from a SQL server (tested with mySQL), producing linked virtual RDF data files. Uses a configuration file in N3. (SWAP)&lt;br /&gt;
* [[VirtuosoUniversalServer|OpenLink Virtuoso]]'s [http://virtuoso.openlinksw.com/wiki/main/Main/VOSSQLRDF declarative N3/Turtle based Metaschema Language] enables the creation of RDF Instance Data for associated RDF Ontologies via RDF VIEWs of ODBC, JDBC, ADO.NET, and OLE-DB accessible SQL Data. It is important to note that these VIEWs also apply to Native Virtuoso Data and/or Heterogeneous Data from other Web Services, HTTP/WebDAV, NNTP, and other Data Sources known to Virtuoso. This is an enhancement of the traditional SQL VIEW concept that enables multiple use of the same base SQL Data from a variety of data access points.&lt;br /&gt;
* [http://triplify.org Triplify] is a small plugin for Web applications, which reveals the semantic structures encoded in relational databases by making database content available as RDF, JSON or Linked Data.&lt;br /&gt;
* [http://www.tao-project.eu/researchanddevelopment/demosanddownloads/RDBToOnto.html RDBToOnto] is a full-fledged conversion tool that can produce accurate RDF/OWL models from various types of relational databases and Excel spreadsheets. The conversion is fully automated while various parameters can be set through the user interface to refine the resulting models (e.g., derivation of rich class hierarchies, proper naming of instances, database optimization before conversion, etc).  &lt;br /&gt;
&lt;br /&gt;
Many RDF Triple stores are implemented using SQL databases, but that is not covered here.&lt;br /&gt;
&lt;br /&gt;
=== Subversion ===&lt;br /&gt;
&lt;br /&gt;
[http://subversion.tigris.org/ Subversion] is a code-management system.&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/svn2rdf/ svn2rdf] A pair of scripts; one can be used in a post-commit subversion hook to generate RDF/N3 with each commit, the other on a working copy. (Simile)&lt;br /&gt;
&lt;br /&gt;
=== Tab Separated Text ===&lt;br /&gt;
&lt;br /&gt;
See flat files.&lt;br /&gt;
&lt;br /&gt;
=== Talis SW Format Converter ===&lt;br /&gt;
&lt;br /&gt;
* [http://convert.test.talis.com/ Talis' converter] converts between various formats (including RDF-&amp;gt;RDF with various serializations, RDF-&amp;gt;HTML, etc.)&lt;br /&gt;
&lt;br /&gt;
=== UML ===&lt;br /&gt;
&lt;br /&gt;
* [http://www.topbraidcomposer.com TopBraid Composer] can convert UML Class Diagrams (XMI format) into RDF/OWL models.&lt;br /&gt;
* [http://eulergui.sourceforge.net/ EulerGUI] is a lightweight IDE that translates UML and eCore XMI into N3 on the fly. It also provides N3 rules to convert UML to OWL.&lt;br /&gt;
&lt;br /&gt;
=== VCARD, Addressbook, … ===&lt;br /&gt;
&lt;br /&gt;
VCARD is a standard for interchange of contact data, such as business cards and address books.&lt;br /&gt;
&lt;br /&gt;
[http://www.w3.org/TR/vcard-rdf &amp;quot;Representing vCard Objects in RDF/XML&amp;quot;] is a W3C note defining an [http://www.w3.org/2006/vcard/ns ontology] for VCARD. FOAF is a widely used ontology covering some of the same domain.&lt;br /&gt;
&lt;br /&gt;
* [http://www.holygoat.co.uk/applications/address-book-foaf/projects/ab/ab.py code to convert your Apple Addressbook into FOAF file] (Richard Newman)&lt;br /&gt;
* [http://people.no-distance.net/ol/software/ab-foaf/ ab-foaf] does the same. &lt;br /&gt;
* [http://search.cpan.org/dist/XML-FOAFKnows-FromvCard/ XML::FOAFKnows::FromvCard] is a Perl extension that creates FOAF dumps from vCards. It does not attempt to create a full model, just foaf:knows. It also has some privacy features. In addition to the module, which conforms to the Formatter API specification, the distribution comes with a command-line tool.&lt;br /&gt;
&lt;br /&gt;
=== Weather ===&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/weather2rdf/ weather2rdf] Given a US city or ZIP code, retrieves weather report data from weather.com and returns it in RDF. (Simile)&lt;br /&gt;
&lt;br /&gt;
=== XML ===&lt;br /&gt;
&lt;br /&gt;
* '''GRDDL:''' Any XML file can be marked up with pointers to XSLT files which convert it to RDF.  The standard for this is [http://www.w3.org/TR/grddl/ GRDDL].  A GRDDL pointer can even be put in an XML schema, so that all XML documents written to that schema automatically have a defined RDF mapping which any GRDDL-aware processor can use. Several XSLT transformations can be found linked from [[MicroModels]].&lt;br /&gt;
* &amp;lt;span id=&amp;quot;krextor&amp;quot;&amp;gt;&amp;lt;/span&amp;gt;[http://kwarc.info/projects/krextor/ Krextor] is a framework for extracting RDF in various notations from various XML languages and can easily be extended for additional input languages.  Support for RDFa and some mathematical markup languages is built in.  The implementation is done in XSLT, with a command-line frontend and a Java wrapper.&lt;br /&gt;
* [http://www.topbraidcomposer.com TopBraid Composer] can convert XML Schema (and their XML instance files) into RDF/OWL models.&lt;br /&gt;
* Rhizomik [http://rhizomik.net/redefer/ ReDeFer] includes XSD2OWL and XML2RDF plus MPEG-7 to RDF (all XSLT-based)&lt;br /&gt;
* '''XHTML:''' Convert ''existing'' pages to RDF. For example, see [[HtmlToRdf]].&lt;br /&gt;
* [http://www.dblab.ntua.gr/~bikakis/SPARQL2XQuery.html SPARQL2XQuery] The SPARQL2XQuery Framework provides mechanisms for: (a) Query translation (SPARQL to XQuery) (b) Mapping specification &amp;amp; generation (Ontology to XML Schema) (c) Schema transformation (XML Schema to OWL) and (d) Data Transformation (XML to RDF and vice versa)&lt;br /&gt;
&lt;br /&gt;
=== XMP ===&lt;br /&gt;
&lt;br /&gt;
[http://www.adobe.com/products/xmp/ XMP] is an Adobe-sponsored specification for putting RDF metadata in virtually any form of file, including binary formats.  XMP metadata is RDF data in fact, but it has to be extracted from the file.&lt;br /&gt;
&lt;br /&gt;
* [http://www.inf.unideb.hu/~jeszy/xmp/ xmpextractor] extracts XMP data. ([http://www.inf.unideb.hu/~jeszy/ Jeszenszky Péter])&lt;br /&gt;
* [http://dev.w3.org/cvsweb/~checkout~/2004/PythonLib-IH/xmp.py A python script to extract XMP]. There is also a service to do that on-line, see [http://www.ivan-herman.net/WebLog/WorkRelated/SemanticWeb/xmpextract.html separate page]&lt;br /&gt;
&lt;br /&gt;
== Frameworks ==&lt;br /&gt;
&lt;br /&gt;
The following are general tools which provide conversion from many formats.&lt;br /&gt;
&lt;br /&gt;
=== AnnoCultor ===&lt;br /&gt;
&lt;br /&gt;
[http://annocultor.eu/ AnnoCultor] was built during several years of practical work on porting various datasets to RDF. It allows converting data from the following data sources:&lt;br /&gt;
* databases via SQL and JDBC;&lt;br /&gt;
* XML files, also in batch;&lt;br /&gt;
* RDF files,&lt;br /&gt;
* Solr servers,&lt;br /&gt;
* custom formats, via format-specific parsers written in Java.&lt;br /&gt;
&lt;br /&gt;
AnnoCultor is specifically suited for the situations where XSLT is not sufficient.&lt;br /&gt;
&lt;br /&gt;
It comes with built-in converters for Geonames and Getty vocabularies (AAT, ULAN, TGN), that are ready to use. &lt;br /&gt;
Several additional specific converters illustrate advanced use: converters for collections of Louvre and Joconde, &lt;br /&gt;
Institute Collection Netherlands, Dutch Museum of Asian Ceramics, Tropenmuseum Amsterdam.&lt;br /&gt;
&lt;br /&gt;
As part of conversion, AnnoCultor can semantically tag (enrich) data with links to various vocabularies, with advanced customised disambiguation and term processing possibilities. &lt;br /&gt;
These vocabularies should be represented in RDF or SKOS to be imported via SPARQL queries. &lt;br /&gt;
AnnoCultor comes with built-in tagging with Geonames and a custom time ontology.&lt;br /&gt;
&lt;br /&gt;
AnnoCultor is written in Java, but conversion rules are written in XML. They are extensible with either small Java snippets or custom rule implementations in Java.&lt;br /&gt;
AnnoCultor has been used in practice with datasets ranging from a few records to more than ten million, each containing up to dozens of fields.&lt;br /&gt;
&lt;br /&gt;
=== Apache Any23 ===&lt;br /&gt;
&lt;br /&gt;
[http://www/incubator.apache.org/any23/ Apache Any23] is a Java library, web service, and command-line tool for parsing multiple document formats and extracting structured data in RDF format from a variety of Web documents. Currently it supports the following input formats:&lt;br /&gt;
&lt;br /&gt;
* RDF/XML, Turtle, Notation 3&lt;br /&gt;
* RDFa&lt;br /&gt;
* Microformats: Adr, Geo, hCalendar, hCard, hListing, hResume, hReview, License, XFN and Species&lt;br /&gt;
&lt;br /&gt;
Apache Any23 is used in major Web of Data applications such as [http://sindice.com/ sindice.com] and [http://sig.ma/ sig.ma].&lt;br /&gt;
&lt;br /&gt;
=== Aperture ===&lt;br /&gt;
&lt;br /&gt;
* [http://aperture.sourceforge.net/ Aperture] is a project written in Java gathering RDF extractors for many formats, mentioned in the list above.&lt;br /&gt;
Aperture supports crawling, making it not a converter but a framework to crawl updates of data (like rsync).&lt;br /&gt;
&lt;br /&gt;
=== [[PiggyBank]] ===&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/piggy-bank/ Piggy-bank] is a [http://simile.mit.edu/ Simile] project which allows the Firefox-based client to automatically load &amp;quot;[http://simile.mit.edu/RDFizers/ RDFizers]&amp;quot;, JavaScript-based converters to RDF. &lt;br /&gt;
Piggy-bank associates given scraping scripts with given web sites. (How?) &lt;br /&gt;
&lt;br /&gt;
=== Triplr ===&lt;br /&gt;
&lt;br /&gt;
[http://triplr.org/ Triplr] is a general “Stuff in, triples out” system by Dave Beckett. Triplr handles GRDDL, RSS, Atom, and other formats.&lt;br /&gt;
&lt;br /&gt;
=== Virtuoso Sponger ===&lt;br /&gt;
&lt;br /&gt;
[[OpenLinkSoftware|OpenLink Software]] via the &amp;quot;[http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtSponger Sponger]&amp;quot; component of [[VirtuosoUniversalServer|Virtuoso]]'s SPARQL Processor and Proxy Web Service (used by default by [[OpenLinkDataExplorer| OpenLink Data Explorer]]) provides RDFization for:&lt;br /&gt;
* RDFa &lt;br /&gt;
* GRDDL&lt;br /&gt;
* Amazon Web Services&lt;br /&gt;
* eBay Web Services&lt;br /&gt;
* Freebase Web Services&lt;br /&gt;
* Facebook Web Services&lt;br /&gt;
* Yahoo! Finance&lt;br /&gt;
* XBRL Instance documents&lt;br /&gt;
* DOI (includes a custom resolver for HTTP)&lt;br /&gt;
* OAI&lt;br /&gt;
* RSS/Atom Feeds&lt;br /&gt;
* Digital Music Files (various formats via ID3 Tags)&lt;br /&gt;
* Image Files&lt;br /&gt;
* vCard&lt;br /&gt;
* iCalendar&lt;br /&gt;
* Microformats - hCard, hCalendar&lt;br /&gt;
* HR-XML Resumes &lt;br /&gt;
* Flickr &lt;br /&gt;
* Del.icio.us&lt;br /&gt;
* Bugzilla &lt;br /&gt;
* ODBC or JDBC accessible SQL Data&lt;br /&gt;
* [http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtSpongerCartridgeSupportedDataSources Many others]&lt;br /&gt;
&lt;br /&gt;
= Notes =&lt;br /&gt;
&lt;br /&gt;
Historically, this list was made from lists of [http://simile.mit.edu/RDFizers/ RDFizers] and&lt;br /&gt;
[http://www.w3.org/2002/Talks/09-lcs-sweb-tbl/slide31-1.html SWAP converters].&lt;br /&gt;
It has grown significantly from community input since then.&lt;br /&gt;
&lt;br /&gt;
This should be in a data format like Semantic Media Wiki or in N3 -- TimBL&lt;br /&gt;
&lt;br /&gt;
&amp;gt; Would there be an advantage to having this kind of list in an RDF file, specifically to make queries on it? Maybe if we add a format on how to declare it here, we could create a converter to RDF. -- [[KarlDubost]]&lt;br /&gt;
&lt;br /&gt;
&amp;gt; The task force [http://esw.w3.org/topic/SweoIG/TaskForces/InfoGathering InfoGathering] from SWEO works on such a vocabulary, if you want to rewrite this list using this vocab, look here: [http://esw.w3.org/topic/SweoIG/TaskForces/InfoGathering/DataVocabulary DataVocabulary] or contact me -- [[LeoSauermann]] on 22.1.2007&lt;br /&gt;
----&lt;br /&gt;
[[Category:SwTools]]&lt;/div&gt;</description>
			<pubDate>Tue, 03 Apr 2012 10:22:55 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/Talk:ConverterToRdf</comments>		</item>
		<item>
			<title>TaskForces/CommunityProjects/LinkingOpenData/DataSets/CKANmetainformation</title>
			<link>http://www.w3.org/wiki/TaskForces/CommunityProjects/LinkingOpenData/DataSets/CKANmetainformation</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Guidelines for Collecting Metadata on Linked Datasets in CKAN =&lt;br /&gt;
&lt;br /&gt;
For keeping the LOD cloud diagram up to date, the Linking Open Data community effort has started to collect meta-information about Linked datasets on [http://www.ckan.net CKAN], a registry of open data and content packages provided by the Open Knowledge Foundation.&lt;br /&gt;
&lt;br /&gt;
This page explains how dataset publishers, or other people who want a dataset to be added to the LOD cloud, should describe datasets on CKAN.&lt;br /&gt;
&lt;br /&gt;
The list of datasets about which we have already collected information can be found here:&lt;br /&gt;
&lt;br /&gt;
http://www4.wiwiss.fu-berlin.de/lodcloud/ckan/validator/&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Which datasets are included into the LOD cloud diagram? ==&lt;br /&gt;
&lt;br /&gt;
All datasets that fulfill the following requirements are included:&lt;br /&gt;
&lt;br /&gt;
# Data items are accessible via dereferenceable URIs&lt;br /&gt;
# The dataset sets at least 50 RDF links pointing at other datasets or at least one other dataset is setting 50 RDF links pointing at your dataset.&lt;br /&gt;
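&lt;br /&gt;
As a rough illustration of the two requirements above, the following Python sketch checks a dataset against them. The function name, its parameters, and the reading of the link rule (outgoing links summed across all target datasets, incoming links counted per linking dataset) are assumptions made for this example and are not part of the guidelines.&lt;br /&gt;
&lt;br /&gt;
```python
MIN_LINKS = 50  # threshold stated in the requirements above


def qualifies_for_lod_cloud(dereferenceable, outlinks, inlinks):
    """Check the two inclusion requirements for one dataset.

    dereferenceable: True if data items are served via dereferenceable URIs.
    outlinks: dict mapping a target dataset id to the number of RDF links set to it.
    inlinks: dict mapping a source dataset id to the number of RDF links it sets to us.
    (Hypothetical structures, chosen for this sketch.)
    """
    if not dereferenceable:
        return False
    # Assumption: outgoing links are summed over all target datasets.
    enough_out = sum(outlinks.values()) >= MIN_LINKS
    # Assumption: a single other dataset must contribute the incoming links.
    enough_in = any(count >= MIN_LINKS for count in inlinks.values())
    return enough_out or enough_in


# The dataset ids below are placeholders.
print(qualifies_for_lod_cloud(True, {"dbpedia": 30, "geonames": 25}, {}))
```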
&lt;br /&gt;
== How do I add a data set to CKAN or edit an existing data set? ==&lt;br /&gt;
&lt;br /&gt;
# Please register with [http://www.ckan.net CKAN] before editing or adding any packages.&lt;br /&gt;
# Please confirm that your data set does not already exist on CKAN before adding a new data set.&lt;br /&gt;
# Add or edit your data set and describe it with the following minimum required information: &lt;br /&gt;
#* CKAN name (a unique id)&lt;br /&gt;
#* title&lt;br /&gt;
#* URL &lt;br /&gt;
#* number of triples&lt;br /&gt;
#* links to other data sets. &lt;br /&gt;
# Please tag newly added data sets with &amp;lt;code&amp;gt;''lod''&amp;lt;/code&amp;gt;. &lt;br /&gt;
# If you are not aware of any in- or outlinks, tag it with &amp;lt;code&amp;gt;''lodcloud.nolinks''&amp;lt;/code&amp;gt;.&lt;br /&gt;
# Please provide as much additional information as possible (e.g. SPARQL endpoint, voiD description, license, and the topic of the data set) as described below. This information helps the community to know more about the development state of the Web of Linked Data and is made available via the CKAN API. &lt;br /&gt;
&lt;br /&gt;
=== Minimum Information ===&lt;br /&gt;
&lt;br /&gt;
Please provide the following minimum information about your data set.  &lt;br /&gt;
&lt;br /&gt;
==== Standard CKAN fields ====&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1px&amp;quot; cellpadding=&amp;quot;2&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|'''Field name'''&lt;br /&gt;
|'''Description'''&lt;br /&gt;
|'''Format/Examples'''&lt;br /&gt;
|-&lt;br /&gt;
|Name&lt;br /&gt;
|Unique ID for your data set on CKAN&lt;br /&gt;
|[a-z0-9-]+ &amp;quot;my-dataset&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
|Title&lt;br /&gt;
|Full name of your data set&lt;br /&gt;
|&amp;quot;My Dataset&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
|URL&lt;br /&gt;
|Link to data set homepage&lt;br /&gt;
|http://example.com/my-ds&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==== Custom CKAN fields ====&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1px&amp;quot; cellpadding=&amp;quot;2&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|'''Field name'''&lt;br /&gt;
|'''Description'''&lt;br /&gt;
|'''Format/Examples'''&lt;br /&gt;
|-&lt;br /&gt;
|triples&lt;br /&gt;
|Approximate size of your data set in RDF triples&lt;br /&gt;
|100000, 62345123&lt;br /&gt;
|-&lt;br /&gt;
|links:xxx&lt;br /&gt;
|Number of RDF links pointing at data set with Data Hub ID xxx (http://thedatahub.org/dataset/xxx). Please provide a separate links:xxx statement for each data set you are linking to.&lt;br /&gt;
|20000&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==== CKAN tags ====&lt;br /&gt;
&lt;br /&gt;
Please use the following tags to provide meta-information about your data set. &lt;br /&gt;
&lt;br /&gt;
We will use the topic information to color the LOD cloud later. &lt;br /&gt;
&lt;br /&gt;
Please also list the vocabularies used by your data set so that the community can get an overview of which vocabularies are commonly used on the Web of Linked Data. &lt;br /&gt;
&lt;br /&gt;
Linked Data published on the Web should be as [http://www.w3.org/2001/tag/doc/selfDescribingDocuments.html self-describing] as possible in order to make it easier for clients to understand and use the data. Important aspects of self-descriptiveness are making vocabulary terms dereferenceable according to the best practices described in [http://www.w3.org/TR/swbp-vocab-pub/ Publishing RDF Vocabularies], using terms from common vocabularies and providing vocabulary mappings for proprietary vocabulary terms. In order to give the community an overview of which data sets implement these best practices, please tag your data set accordingly.&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1px&amp;quot; cellpadding=&amp;quot;2&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|'''Tag'''&lt;br /&gt;
|'''Purpose'''&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;''&amp;amp;lt;topic&amp;amp;gt;''&amp;lt;/code&amp;gt;&lt;br /&gt;
|One of:&lt;br /&gt;
* &amp;lt;code&amp;gt;media&amp;lt;/code&amp;gt;&lt;br /&gt;
* &amp;lt;code&amp;gt;geographic&amp;lt;/code&amp;gt;&lt;br /&gt;
* &amp;lt;code&amp;gt;lifesciences&amp;lt;/code&amp;gt;&lt;br /&gt;
* &amp;lt;code&amp;gt;publications&amp;lt;/code&amp;gt; (including library and museum data)&lt;br /&gt;
* &amp;lt;code&amp;gt;government&amp;lt;/code&amp;gt;&lt;br /&gt;
* &amp;lt;code&amp;gt;ecommerce&amp;lt;/code&amp;gt;&lt;br /&gt;
* &amp;lt;code&amp;gt;socialweb&amp;lt;/code&amp;gt; (people and their activities)&lt;br /&gt;
* &amp;lt;code&amp;gt;usergeneratedcontent&amp;lt;/code&amp;gt; (blog posts, discussions, pictures, ...) &lt;br /&gt;
* &amp;lt;code&amp;gt;schemata&amp;lt;/code&amp;gt; (structural resources, including vocabularies, ontologies, classifications, thesauri)&lt;br /&gt;
* &amp;lt;code&amp;gt;crossdomain&amp;lt;/code&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
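&lt;br /&gt;
The minimum information described above can be checked mechanically before an entry is submitted. The Python sketch below is one way to do that. The layout of the package dict (name, title, url, extras, tags), the function name, and the rule that exactly one topic tag is expected are assumptions made for this example; they are not taken from the CKAN API.&lt;br /&gt;
&lt;br /&gt;
```python
import re

NAME_PATTERN = re.compile(r"[a-z0-9-]+")  # name format from the table above
TOPICS = {
    "media", "geographic", "lifesciences", "publications", "government",
    "ecommerce", "socialweb", "usergeneratedcontent", "schemata", "crossdomain",
}


def check_minimum_info(package):
    """Return a list of problems; an empty list means the checks passed.

    The package layout (name, title, url, extras, tags) is a hypothetical
    dict chosen for this sketch, not a documented CKAN structure.
    """
    problems = []
    if not NAME_PATTERN.fullmatch(package.get("name", "")):
        problems.append("name must match [a-z0-9-]+")
    for field in ("title", "url"):
        if not package.get(field):
            problems.append("missing " + field)

    extras = package.get("extras", {})
    tags = package.get("tags", [])
    if not str(extras.get("triples", "")).isdigit():
        problems.append("triples must be a whole number")

    # Every links:xxx field should carry a link count.
    link_keys = [key for key in extras if key.startswith("links:")]
    for key in link_keys:
        if not str(extras[key]).isdigit():
            problems.append(key + " must be a whole number")
    if not link_keys and "lodcloud.nolinks" not in tags:
        problems.append("add links:xxx fields or the lodcloud.nolinks tag")

    if "lod" not in tags:
        problems.append("missing lod tag")
    # Assumption: exactly one topic tag is expected ("One of:" in the table).
    if len(TOPICS.intersection(tags)) != 1:
        problems.append("expected exactly one topic tag")
    return problems


# Placeholder entry; the linked dataset id "dbpedia" is only an example.
example = {
    "name": "my-dataset",
    "title": "My Dataset",
    "url": "http://example.com/my-ds",
    "extras": {"triples": "100000", "links:dbpedia": "20000"},
    "tags": ["lod", "geographic"],
}
print(check_minimum_info(example))  # prints []
```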
&lt;br /&gt;
=== Enhanced Information ===&lt;br /&gt;
&lt;br /&gt;
Please provide the following additional information about your data set.  This information helps the community to know more about the development state of the Web of Linked Data and is made available via the CKAN API.&lt;br /&gt;
&lt;br /&gt;
==== Standard CKAN fields ====&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1px&amp;quot; cellpadding=&amp;quot;2&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|'''Field name'''&lt;br /&gt;
|'''Description'''&lt;br /&gt;
|'''Format/Examples'''&lt;br /&gt;
|-&lt;br /&gt;
|Version&lt;br /&gt;
|Last modification date or version of your data set&lt;br /&gt;
|&amp;quot;2010-04 (3.5)&amp;quot;, &amp;quot;2006&amp;quot;, &amp;quot;beta&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
|Notes&lt;br /&gt;
|Description of your data set&lt;br /&gt;
|some free text&lt;br /&gt;
|-&lt;br /&gt;
|Author	&lt;br /&gt;
|Name of publishing org and/or person&lt;br /&gt;
|&amp;quot;Talis (Leigh Dodds)&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
|Author email&lt;br /&gt;
|Contact email&lt;br /&gt;
|leigh@ldodds.com&lt;br /&gt;
|-&lt;br /&gt;
|License&lt;br /&gt;
|Standard license drop-down&lt;br /&gt;
|OSI approved::MIT license&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==== Custom CKAN fields ====&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1px&amp;quot; cellpadding=&amp;quot;2&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|'''Field name'''&lt;br /&gt;
|'''Description'''&lt;br /&gt;
|'''Format/examples'''&lt;br /&gt;
|-&lt;br /&gt;
|shortname&lt;br /&gt;
|Short name for LOD bubble&lt;br /&gt;
|&amp;quot;NY Times&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
|license_link&lt;br /&gt;
|Custom license link&lt;br /&gt;
|http://example.com/so-sue-me&lt;br /&gt;
|-&lt;br /&gt;
|sparql_graph_name	&lt;br /&gt;
|Named graph in SPARQL store (if used by your SPARQL endpoint)&lt;br /&gt;
|http://species.geospecies.org&lt;br /&gt;
|-&lt;br /&gt;
|namespace	&lt;br /&gt;
|Instance namespace&lt;br /&gt;
|http://dbpedia.org/resource/&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==== CKAN resource links ====&lt;br /&gt;
&lt;br /&gt;
Links (other than dereferenceable URIs) that enable alternative access to the data set (e.g., via downloads or SPARQL endpoints) should be specified in the Resources section of the CKAN entry form. Please also provide links to the [http://vocab.deri.ie/void/guide voiD description] or [http://sw.deri.org/2007/07/sitemapextension/ Semantic Web Sitemap] describing your data set.&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1px&amp;quot; cellpadding=&amp;quot;2&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|'''Purpose'''&lt;br /&gt;
|'''Format'''&lt;br /&gt;
|'''Description'''&lt;br /&gt;
|-&lt;br /&gt;
|Download page&lt;br /&gt;
| —&lt;br /&gt;
|Download&lt;br /&gt;
|-&lt;br /&gt;
|XML Sitemap&lt;br /&gt;
|&amp;lt;code&amp;gt;meta/sitemap&amp;lt;/code&amp;gt;&lt;br /&gt;
|XML Sitemap&lt;br /&gt;
|-&lt;br /&gt;
|SPARQL endpoint&lt;br /&gt;
|&amp;lt;code&amp;gt;api/sparql&amp;lt;/code&amp;gt;&lt;br /&gt;
|SPARQL endpoint&lt;br /&gt;
|-&lt;br /&gt;
|voiD file&lt;br /&gt;
|&amp;lt;code&amp;gt;meta/void&amp;lt;/code&amp;gt;&lt;br /&gt;
|voiD description&lt;br /&gt;
|-&lt;br /&gt;
|RDF/XML download&lt;br /&gt;
|&amp;lt;code&amp;gt;application/rdf+xml&amp;lt;/code&amp;gt;&lt;br /&gt;
|Download&lt;br /&gt;
|-&lt;br /&gt;
|Turtle download&lt;br /&gt;
|&amp;lt;code&amp;gt;text/turtle&amp;lt;/code&amp;gt;&lt;br /&gt;
|Download&lt;br /&gt;
|-&lt;br /&gt;
|N-Triples download&lt;br /&gt;
|&amp;lt;code&amp;gt;application/x-ntriples&amp;lt;/code&amp;gt;&lt;br /&gt;
|Download&lt;br /&gt;
|-&lt;br /&gt;
|N-Quads download&lt;br /&gt;
|&amp;lt;code&amp;gt;application/x-nquads&amp;lt;/code&amp;gt;&lt;br /&gt;
|Download&lt;br /&gt;
|-&lt;br /&gt;
|RDF Schema&lt;br /&gt;
|&amp;lt;code&amp;gt;meta/rdf-schema&amp;lt;/code&amp;gt;&lt;br /&gt;
|Download link to RDF/OWL Schema used by your data set (in addition to having dereferenceable vocabulary URIs)&lt;br /&gt;
|-&lt;br /&gt;
|RDF/XML example link&lt;br /&gt;
|&amp;lt;code&amp;gt;example/rdf+xml&amp;lt;/code&amp;gt;&lt;br /&gt;
|Link to an example data item within your data set (RDF/XML)&lt;br /&gt;
|-&lt;br /&gt;
|Turtle example link&lt;br /&gt;
|&amp;lt;code&amp;gt;example/turtle&amp;lt;/code&amp;gt;&lt;br /&gt;
|Link to an example data item within your data set  (Turtle)&lt;br /&gt;
|-&lt;br /&gt;
|N-Triples example link&lt;br /&gt;
|&amp;lt;code&amp;gt;example/ntriples&amp;lt;/code&amp;gt;&lt;br /&gt;
|Link to an example data item within your data set  (N-Triples)&lt;br /&gt;
|-&lt;br /&gt;
|HTML+RDFa example link&lt;br /&gt;
|&amp;lt;code&amp;gt;example/rdfa&amp;lt;/code&amp;gt;&lt;br /&gt;
|Link to an example data item within your data set  (RDFa)&lt;br /&gt;
|-&lt;br /&gt;
|Vocabulary Mappings, e.g., OWL, RDFS, RIF, R2R&lt;br /&gt;
|&amp;lt;code&amp;gt;mapping/''&amp;amp;lt;format&amp;amp;gt;''&amp;lt;/code&amp;gt;&lt;br /&gt;
|If your data set uses proprietary vocabulary terms and you know that these terms also exist in other vocabularies, you should set &amp;lt;code&amp;gt;owl:equivalentClass&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;owl:equivalentProperty&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;rdfs:subClassOf&amp;lt;/code&amp;gt;, and/or &amp;lt;code&amp;gt;rdfs:subPropertyOf&amp;lt;/code&amp;gt; links pointing at these terms, or provide mappings expressed as RIF rules or in the R2R Mapping Language. If your mappings can be downloaded as a single file, please provide the link to the download.&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==== CKAN tags ====&lt;br /&gt;
&lt;br /&gt;
Please use the following tags to provide meta-information about your data set. &lt;br /&gt;
&lt;br /&gt;
We will use the topic information to color the LOD cloud later. &lt;br /&gt;
&lt;br /&gt;
Please also list the vocabularies used by your data set so that the community can get an overview of which vocabularies are commonly used on the Web of Linked Data. &lt;br /&gt;
&lt;br /&gt;
Linked Data published on the Web should be as [http://www.w3.org/2001/tag/doc/selfDescribingDocuments.html self-describing] as possible so that clients can more easily understand and use the data. Important aspects of self-descriptiveness are making vocabulary terms dereferenceable according to the best practices described in [http://www.w3.org/TR/swbp-vocab-pub/ Publishing RDF Vocabularies], using terms from common vocabularies, and providing vocabulary mappings for proprietary vocabulary terms. To give the community an overview of which data sets implement these best practices, please tag your data set accordingly.&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1px&amp;quot; cellpadding=&amp;quot;2&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|'''Tag'''&lt;br /&gt;
|'''Purpose'''&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;format-''&amp;amp;lt;prefix&amp;amp;gt;''&amp;lt;/code&amp;gt;&lt;br /&gt;
|A [http://esw.w3.org/TaskForces/CommunityProjects/LinkingOpenData/CommonVocabularies vocabulary] used by the data set, e.g., &amp;lt;code&amp;gt;format-skos&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;format-dc&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;format-foaf&amp;lt;/code&amp;gt;. Use http://prefix.cc/ to find a prefix for a vocabulary. If a vocabulary is not in prefix.cc, then add it there or ignore that vocabulary.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;no-proprietary-vocab&amp;lt;/code&amp;gt;&lt;br /&gt;
|Indicates that your data set does not use a proprietary vocabulary (defined within your top-level domain).&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;deref-vocab&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;no-deref-vocab&amp;lt;/code&amp;gt;&lt;br /&gt;
|Indicates whether the proprietary vocabulary terms used by your data set (the ones that are defined within your top-level domain) are dereferenceable according to the best practices for [http://www.w3.org/TR/swbp-vocab-pub/ Publishing RDF Vocabularies]&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;vocab-mappings&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;no-vocab-mappings&amp;lt;/code&amp;gt;&lt;br /&gt;
|Indicates whether you provide mappings for proprietary vocabulary terms (by setting &amp;lt;code&amp;gt;owl:equivalentClass&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;owl:equivalentProperty&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;rdfs:subClassOf&amp;lt;/code&amp;gt;, and/or &amp;lt;code&amp;gt;rdfs:subPropertyOf&amp;lt;/code&amp;gt; links, or by publishing mappings expressed as RIF rules or in the R2R Mapping Language).&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;provenance-metadata&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;no-provenance-metadata&amp;lt;/code&amp;gt;&lt;br /&gt;
|Indicates whether the data set provides provenance meta-information (creator of the data set, creation date, maybe creation method) as document meta-information or via a voiD description. For instance, using the &amp;lt;code&amp;gt;dc:creator&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;dc:date&amp;lt;/code&amp;gt; properties.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;license-metadata&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;no-license-metadata&amp;lt;/code&amp;gt;&lt;br /&gt;
|Indicates whether the data set provides licensing meta-information as document meta-information or via a voiD description. For instance, using the &amp;lt;code&amp;gt;dc:rights&amp;lt;/code&amp;gt; property.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;published-by-producer&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;published-by-third-party&amp;lt;/code&amp;gt;&lt;br /&gt;
|Indicates whether the data set is published by the original data producer or a third party.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;limited-sparql-endpoint&amp;lt;/code&amp;gt;&lt;br /&gt;
|Indicates that the SPARQL endpoint does not serve the whole data set.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;lodcloud.nolinks&amp;lt;/code&amp;gt;&lt;br /&gt;
|Dataset has no external RDF links to other datasets.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;lodcloud.unconnected&amp;lt;/code&amp;gt;&lt;br /&gt;
|Dataset has no external RDF links to or from other datasets.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;lodcloud.needsinfo&amp;lt;/code&amp;gt;&lt;br /&gt;
|The data provider or dataset homepage does not provide the minimum information (and the information can't be determined from the SPARQL endpoint or downloads).&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;lodcloud.needsfixing&amp;lt;/code&amp;gt;&lt;br /&gt;
|The dataset is currently broken. Provide details in the Notes.&lt;br /&gt;
|}&lt;/div&gt;</description>
			<pubDate>Thu, 01 Dec 2011 13:05:52 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/Talk:TaskForces/CommunityProjects/LinkingOpenData/DataSets/CKANmetainformation</comments>		</item>
		<item>
			<title>TaskForces/CommunityProjects/LinkingOpenData/DataSets/CKANmetainformation</title>
			<link>http://www.w3.org/wiki/TaskForces/CommunityProjects/LinkingOpenData/DataSets/CKANmetainformation</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Guidelines for Collecting Metadata on Linked Datasets in CKAN =&lt;br /&gt;
&lt;br /&gt;
To keep the LOD cloud diagram up to date, the Linking Open Data community effort has started to collect meta-information about Linked datasets on [http://www.ckan.net CKAN], a registry of open data and content packages provided by the Open Knowledge Foundation.&lt;br /&gt;
&lt;br /&gt;
This page explains how dataset publishers, or other people who want a dataset to be added to the LOD cloud, should describe datasets on CKAN.&lt;br /&gt;
&lt;br /&gt;
The list of datasets about which we have already collected information can be found here:&lt;br /&gt;
&lt;br /&gt;
http://www4.wiwiss.fu-berlin.de/lodcloud/ckan/validator/&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Which datasets are included in the LOD cloud diagram? ==&lt;br /&gt;
&lt;br /&gt;
All datasets that fulfil the following requirements are included:&lt;br /&gt;
&lt;br /&gt;
# Data items are accessible via dereferenceable URIs.&lt;br /&gt;
# The dataset sets at least 50 RDF links pointing at other datasets, or at least one other dataset sets 50 RDF links pointing at your dataset.&lt;br /&gt;
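&lt;br /&gt;
The two requirements above can be sketched as a small check. This is only an illustration under stated assumptions: the &amp;lt;code&amp;gt;DatasetInfo&amp;lt;/code&amp;gt; record and its field names are hypothetical (they are not CKAN fields), and both link thresholds are read as "at least 50".&lt;br /&gt;
&lt;br /&gt;
```python
from dataclasses import dataclass

MIN_LINKS = 50  # threshold taken from the requirements above


@dataclass
class DatasetInfo:
    """Hypothetical summary of a dataset; not a CKAN data structure."""
    has_dereferenceable_uris: bool
    outgoing_links: int      # RDF links this dataset sets to other datasets
    max_incoming_links: int  # most RDF links any single other dataset sets to this one


def qualifies_for_lod_cloud(info):
    """Apply both requirements: dereferenceable URIs plus enough RDF links."""
    if not info.has_dereferenceable_uris:
        return False
    return (info.outgoing_links >= MIN_LINKS
            or info.max_incoming_links >= MIN_LINKS)
```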
&lt;br /&gt;
== How do I add a data set to CKAN or edit an existing data set? ==&lt;br /&gt;
&lt;br /&gt;
# Please register with [http://www.ckan.net CKAN] before editing or adding any packages.&lt;br /&gt;
# Please confirm that your data set does not already exist on CKAN before adding a new data set.&lt;br /&gt;
# Add or edit your data set and describe it with the following minimum required information: &lt;br /&gt;
#* CKAN name (a unique id)&lt;br /&gt;
#* title&lt;br /&gt;
#* URL &lt;br /&gt;
#* number of triples&lt;br /&gt;
#* links to other data sets. &lt;br /&gt;
# Please tag newly added data sets with &amp;lt;code&amp;gt;''lod''&amp;lt;/code&amp;gt;. &lt;br /&gt;
# If you are not aware of any in- or outlinks, tag it with &amp;lt;code&amp;gt;''lodcloud.nolinks''&amp;lt;/code&amp;gt;.&lt;br /&gt;
# Please provide as much additional information as possible (e.g. SPARQL endpoint, voiD description, license, and the topic of the data set) as described below. This information helps the community to know more about the development state of the Web of Linked Data and is made available via the CKAN API. &lt;br /&gt;
&lt;br /&gt;
=== Minimum Information ===&lt;br /&gt;
&lt;br /&gt;
Please provide the following minimum information about your data set.  &lt;br /&gt;
&lt;br /&gt;
==== Standard CKAN fields ====&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1px&amp;quot; cellpadding=&amp;quot;2&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|'''Field name'''&lt;br /&gt;
|'''Description'''&lt;br /&gt;
|'''Format/Examples'''&lt;br /&gt;
|-&lt;br /&gt;
|Name&lt;br /&gt;
|Unique ID for your data set on CKAN&lt;br /&gt;
|[a-z0-9-]+ &amp;quot;my-dataset&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
|Title&lt;br /&gt;
|Full name of your data set&lt;br /&gt;
|&amp;quot;My Dataset&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
|URL&lt;br /&gt;
|Link to data set homepage&lt;br /&gt;
|http://example.com/my-ds&lt;br /&gt;
|}&lt;br /&gt;
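&lt;br /&gt;
The name format in the table above can be checked before submitting an entry. The following is a minimal sketch that assumes the pattern &amp;lt;code&amp;gt;[a-z0-9-]+&amp;lt;/code&amp;gt; must match the whole name; the function name is made up for illustration, and CKAN itself may apply further rules.&lt;br /&gt;
&lt;br /&gt;
```python
import re

# Pattern copied from the "Name" row of the table above.
NAME_PATTERN = re.compile(r"[a-z0-9-]+")


def is_valid_ckan_name(name):
    """Return True if the whole string matches the documented name pattern."""
    return NAME_PATTERN.fullmatch(name) is not None
```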
&lt;br /&gt;
==== Custom CKAN fields ====&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1px&amp;quot; cellpadding=&amp;quot;2&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|'''Field name'''&lt;br /&gt;
|'''Description'''&lt;br /&gt;
|'''Format/Examples'''&lt;br /&gt;
|-&lt;br /&gt;
|triples&lt;br /&gt;
|Approximate size of your data set in RDF triples&lt;br /&gt;
|100000, 62345123&lt;br /&gt;
|-&lt;br /&gt;
|links:xxx&lt;br /&gt;
|Number of RDF links pointing at the data set with Data Hub ID xxx (http://thedatahub.org/dataset/xxx). Please provide a separate links:xxx field for each data set you are linking to.&lt;br /&gt;
|20000&lt;br /&gt;
|}&lt;br /&gt;
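&lt;br /&gt;
As an illustration of the custom fields above, the sketch below assembles them into a plain dictionary. The helper name is hypothetical, and storing the counts as strings is an assumption of this sketch, not something this page prescribes.&lt;br /&gt;
&lt;br /&gt;
```python
def build_lod_extras(triples, links):
    """Build the custom fields: one 'triples' entry plus one 'links:ID' entry per target.

    'links' maps the Data Hub ID of a target data set to the number of RDF links.
    """
    extras = {"triples": str(triples)}
    for target_id, count in sorted(links.items()):
        extras["links:" + target_id] = str(count)
    return extras
```

For example, &amp;lt;code&amp;gt;build_lod_extras(100000, {"dbpedia": 20000})&amp;lt;/code&amp;gt; yields one &amp;lt;code&amp;gt;triples&amp;lt;/code&amp;gt; entry and one &amp;lt;code&amp;gt;links:dbpedia&amp;lt;/code&amp;gt; entry.&lt;br /&gt;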
&lt;br /&gt;
==== CKAN tags ====&lt;br /&gt;
&lt;br /&gt;
Please use the following tags to provide meta-information about your data set. &lt;br /&gt;
&lt;br /&gt;
We will use the topic information to color the LOD cloud later. &lt;br /&gt;
&lt;br /&gt;
Please also list the vocabularies used by your data set so that the community can get an overview of which vocabularies are commonly used on the Web of Linked Data. &lt;br /&gt;
&lt;br /&gt;
Linked Data published on the Web should be as [http://www.w3.org/2001/tag/doc/selfDescribingDocuments.html self-describing] as possible so that clients can more easily understand and use the data. Important aspects of self-descriptiveness are making vocabulary terms dereferenceable according to the best practices described in [http://www.w3.org/TR/swbp-vocab-pub/ Publishing RDF Vocabularies], using terms from common vocabularies, and providing vocabulary mappings for proprietary vocabulary terms. To give the community an overview of which data sets implement these best practices, please tag your data set accordingly.&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1px&amp;quot; cellpadding=&amp;quot;2&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|'''Tag'''&lt;br /&gt;
|'''Purpose'''&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;''&amp;amp;lt;topic&amp;amp;gt;''&amp;lt;/code&amp;gt;&lt;br /&gt;
|One of:&lt;br /&gt;
* &amp;lt;code&amp;gt;media&amp;lt;/code&amp;gt;&lt;br /&gt;
* &amp;lt;code&amp;gt;geographic&amp;lt;/code&amp;gt;&lt;br /&gt;
* &amp;lt;code&amp;gt;lifesciences&amp;lt;/code&amp;gt;&lt;br /&gt;
* &amp;lt;code&amp;gt;publications&amp;lt;/code&amp;gt; (including library and museum data)&lt;br /&gt;
* &amp;lt;code&amp;gt;government&amp;lt;/code&amp;gt;&lt;br /&gt;
* &amp;lt;code&amp;gt;ecommerce&amp;lt;/code&amp;gt;&lt;br /&gt;
* &amp;lt;code&amp;gt;socialweb&amp;lt;/code&amp;gt; (people and their activities)&lt;br /&gt;
* &amp;lt;code&amp;gt;usergeneratedcontent&amp;lt;/code&amp;gt; (blog posts, discussions, pictures, ...) &lt;br /&gt;
* &amp;lt;code&amp;gt;schemata&amp;lt;/code&amp;gt; (structural resources, including vocabularies, ontologies, classifications, thesauri)&lt;br /&gt;
* &amp;lt;code&amp;gt;crossdomain&amp;lt;/code&amp;gt;&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
=== Enhanced Information ===&lt;br /&gt;
&lt;br /&gt;
Please provide the following additional information about your data set.  This information helps the community to know more about the development state of the Web of Linked Data and is made available via the CKAN API.&lt;br /&gt;
&lt;br /&gt;
==== Standard CKAN fields ====&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1px&amp;quot; cellpadding=&amp;quot;2&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|'''Field name'''&lt;br /&gt;
|'''Description'''&lt;br /&gt;
|'''Format/Examples'''&lt;br /&gt;
|-&lt;br /&gt;
|Version&lt;br /&gt;
|Last modification date or version of your data set&lt;br /&gt;
|&amp;quot;2010-04 (3.5)&amp;quot;, &amp;quot;2006&amp;quot;, &amp;quot;beta&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
|Notes&lt;br /&gt;
|Description of your data set&lt;br /&gt;
|some free text&lt;br /&gt;
|-&lt;br /&gt;
|Author	&lt;br /&gt;
|Name of publishing org and/or person&lt;br /&gt;
|&amp;quot;Talis (Leigh Dodds)&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
|Author email&lt;br /&gt;
|Contact email&lt;br /&gt;
|leigh@ldodds.com&lt;br /&gt;
|-&lt;br /&gt;
|License&lt;br /&gt;
|Standard license drop-down&lt;br /&gt;
|OSI approved::MIT license&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==== Custom CKAN fields ====&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1px&amp;quot; cellpadding=&amp;quot;2&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|'''Field name'''&lt;br /&gt;
|'''Description'''&lt;br /&gt;
|'''Format/examples'''&lt;br /&gt;
|-&lt;br /&gt;
|shortname&lt;br /&gt;
|Short name for LOD bubble&lt;br /&gt;
|&amp;quot;NY Times&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
|license_link&lt;br /&gt;
|Custom license link&lt;br /&gt;
|http://example.com/so-sue-me&lt;br /&gt;
|-&lt;br /&gt;
|sparql_graph_name	&lt;br /&gt;
|Named graph in SPARQL store (if used by your SPARQL endpoint)&lt;br /&gt;
|http://species.geospecies.org&lt;br /&gt;
|-&lt;br /&gt;
|namespace	&lt;br /&gt;
|Instance namespace&lt;br /&gt;
|http://dbpedia.org/resource/&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==== CKAN resource links ====&lt;br /&gt;
&lt;br /&gt;
Links (other than dereferenceable URIs) that enable alternative access to the data set (e.g., via downloads or SPARQL endpoints) should be specified in the Resources section of the CKAN entry form. Please also provide links to the [http://vocab.deri.ie/void/guide voiD description] or [http://sw.deri.org/2007/07/sitemapextension/ Semantic Web Sitemap] describing your data set.&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1px&amp;quot; cellpadding=&amp;quot;2&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|'''Purpose'''&lt;br /&gt;
|'''Format'''&lt;br /&gt;
|'''Description'''&lt;br /&gt;
|-&lt;br /&gt;
|Download page&lt;br /&gt;
| —&lt;br /&gt;
|Download&lt;br /&gt;
|-&lt;br /&gt;
|XML Sitemap&lt;br /&gt;
|&amp;lt;code&amp;gt;meta/sitemap&amp;lt;/code&amp;gt;&lt;br /&gt;
|XML Sitemap&lt;br /&gt;
|-&lt;br /&gt;
|SPARQL endpoint&lt;br /&gt;
|&amp;lt;code&amp;gt;api/sparql&amp;lt;/code&amp;gt;&lt;br /&gt;
|SPARQL endpoint&lt;br /&gt;
|-&lt;br /&gt;
|voiD file&lt;br /&gt;
|&amp;lt;code&amp;gt;meta/void&amp;lt;/code&amp;gt;&lt;br /&gt;
|voiD description&lt;br /&gt;
|-&lt;br /&gt;
|RDF/XML download&lt;br /&gt;
|&amp;lt;code&amp;gt;application/rdf+xml&amp;lt;/code&amp;gt;&lt;br /&gt;
|Download&lt;br /&gt;
|-&lt;br /&gt;
|Turtle download&lt;br /&gt;
|&amp;lt;code&amp;gt;text/turtle&amp;lt;/code&amp;gt;&lt;br /&gt;
|Download&lt;br /&gt;
|-&lt;br /&gt;
|N-Triples download&lt;br /&gt;
|&amp;lt;code&amp;gt;application/x-ntriples&amp;lt;/code&amp;gt;&lt;br /&gt;
|Download&lt;br /&gt;
|-&lt;br /&gt;
|N-Quads download&lt;br /&gt;
|&amp;lt;code&amp;gt;application/x-nquads&amp;lt;/code&amp;gt;&lt;br /&gt;
|Download&lt;br /&gt;
|-&lt;br /&gt;
|RDF Schema&lt;br /&gt;
|&amp;lt;code&amp;gt;meta/rdf-schema&amp;lt;/code&amp;gt;&lt;br /&gt;
|Download link to RDF/OWL Schema used by your data set (in addition to having dereferenceable vocabulary URIs)&lt;br /&gt;
|-&lt;br /&gt;
|RDF/XML example link&lt;br /&gt;
|&amp;lt;code&amp;gt;example/rdf+xml&amp;lt;/code&amp;gt;&lt;br /&gt;
|Link to an example data item within your data set (RDF/XML)&lt;br /&gt;
|-&lt;br /&gt;
|Turtle example link&lt;br /&gt;
|&amp;lt;code&amp;gt;example/turtle&amp;lt;/code&amp;gt;&lt;br /&gt;
|Link to an example data item within your data set  (Turtle)&lt;br /&gt;
|-&lt;br /&gt;
|N-Triples example link&lt;br /&gt;
|&amp;lt;code&amp;gt;example/ntriples&amp;lt;/code&amp;gt;&lt;br /&gt;
|Link to an example data item within your data set  (N-Triples)&lt;br /&gt;
|-&lt;br /&gt;
|HTML+RDFa example link&lt;br /&gt;
|&amp;lt;code&amp;gt;example/rdfa&amp;lt;/code&amp;gt;&lt;br /&gt;
|Link to an example data item within your data set  (RDFa)&lt;br /&gt;
|-&lt;br /&gt;
|Vocabulary Mappings, e.g., OWL, RDFS, RIF, R2R&lt;br /&gt;
|&amp;lt;code&amp;gt;mapping/''&amp;amp;lt;format&amp;amp;gt;''&amp;lt;/code&amp;gt;&lt;br /&gt;
|If your data set uses proprietary vocabulary terms and you know that these terms also exist in other vocabularies, you should set &amp;lt;code&amp;gt;owl:equivalentClass&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;owl:equivalentProperty&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;rdfs:subClassOf&amp;lt;/code&amp;gt;, and/or &amp;lt;code&amp;gt;rdfs:subPropertyOf&amp;lt;/code&amp;gt; links pointing at these terms, or provide mappings expressed as RIF rules or in the R2R Mapping Language. If your mappings can be downloaded as a single file, please provide the link to the download.&lt;br /&gt;
|}&lt;br /&gt;
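&lt;br /&gt;
The format strings in the table above can be checked mechanically when preparing resource entries. This is a rough sketch only: the helper names and the dictionary keys &amp;lt;code&amp;gt;url&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;format&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;description&amp;lt;/code&amp;gt; are assumptions made for illustration, and the download page row (which has no format string) is not covered.&lt;br /&gt;
&lt;br /&gt;
```python
# Format strings copied from the table above.
FIXED_FORMATS = {
    "meta/sitemap", "api/sparql", "meta/void",
    "application/rdf+xml", "text/turtle",
    "application/x-ntriples", "application/x-nquads",
    "meta/rdf-schema",
    "example/rdf+xml", "example/turtle", "example/ntriples", "example/rdfa",
}


def is_listed_format(fmt):
    """True for a format from the table, including 'mapping/' plus a format name."""
    if fmt in FIXED_FORMATS:
        return True
    prefix, sep, rest = fmt.partition("/")
    return prefix == "mapping" and sep == "/" and rest != ""


def make_resource(url, fmt, description=""):
    """Build one resource entry, rejecting format strings that are not in the table."""
    if not is_listed_format(fmt):
        raise ValueError("format not in the table: " + fmt)
    return {"url": url, "format": fmt, "description": description}
```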
&lt;br /&gt;
==== CKAN tags ====&lt;br /&gt;
&lt;br /&gt;
Please use the following tags to provide meta-information about your data set. &lt;br /&gt;
&lt;br /&gt;
We will use the topic information to color the LOD cloud later. &lt;br /&gt;
&lt;br /&gt;
Please also list the vocabularies used by your data set so that the community can get an overview of which vocabularies are commonly used on the Web of Linked Data. &lt;br /&gt;
&lt;br /&gt;
Linked Data published on the Web should be as [http://www.w3.org/2001/tag/doc/selfDescribingDocuments.html self-describing] as possible so that clients can more easily understand and use the data. Important aspects of self-descriptiveness are making vocabulary terms dereferenceable according to the best practices described in [http://www.w3.org/TR/swbp-vocab-pub/ Publishing RDF Vocabularies], using terms from common vocabularies, and providing vocabulary mappings for proprietary vocabulary terms. To give the community an overview of which data sets implement these best practices, please tag your data set accordingly.&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1px&amp;quot; cellpadding=&amp;quot;2&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|'''Tag'''&lt;br /&gt;
|'''Purpose'''&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;format-''&amp;amp;lt;prefix&amp;amp;gt;''&amp;lt;/code&amp;gt;&lt;br /&gt;
|A [http://esw.w3.org/TaskForces/CommunityProjects/LinkingOpenData/CommonVocabularies vocabulary] used by the data set, e.g., &amp;lt;code&amp;gt;format-skos&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;format-dc&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;format-foaf&amp;lt;/code&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;no-proprietary-vocab&amp;lt;/code&amp;gt;&lt;br /&gt;
|Indicates that your data set does not use a proprietary vocabulary (defined within your top-level domain).&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;deref-vocab&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;no-deref-vocab&amp;lt;/code&amp;gt;&lt;br /&gt;
|Indicates whether the proprietary vocabulary terms used by your data set (the ones that are defined within your top-level domain) are dereferenceable according to the best practices for [http://www.w3.org/TR/swbp-vocab-pub/ Publishing RDF Vocabularies]&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;vocab-mappings&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;no-vocab-mappings&amp;lt;/code&amp;gt;&lt;br /&gt;
|Indicates whether you provide mappings for proprietary vocabulary terms (by setting &amp;lt;code&amp;gt;owl:equivalentClass&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;owl:equivalentProperty&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;rdfs:subClassOf&amp;lt;/code&amp;gt;, and/or &amp;lt;code&amp;gt;rdfs:subPropertyOf&amp;lt;/code&amp;gt; links, or by publishing mappings expressed as RIF rules or in the R2R Mapping Language).&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;provenance-metadata&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;no-provenance-metadata&amp;lt;/code&amp;gt;&lt;br /&gt;
|Indicates whether the data set provides provenance meta-information (creator of the data set, creation date, maybe creation method) as document meta-information or via a voiD description. For instance, using the &amp;lt;code&amp;gt;dc:creator&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;dc:date&amp;lt;/code&amp;gt; properties.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;license-metadata&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;no-license-metadata&amp;lt;/code&amp;gt;&lt;br /&gt;
|Indicates whether the data set provides licensing meta-information as document meta-information or via a voiD description. For instance, using the &amp;lt;code&amp;gt;dc:rights&amp;lt;/code&amp;gt; property.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;published-by-producer&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;published-by-third-party&amp;lt;/code&amp;gt;&lt;br /&gt;
|Indicates whether the data set is published by the original data producer or a third party.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;limited-sparql-endpoint&amp;lt;/code&amp;gt;&lt;br /&gt;
|Indicates that the SPARQL endpoint does not serve the whole data set.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;lodcloud.nolinks&amp;lt;/code&amp;gt;&lt;br /&gt;
|Dataset has no external RDF links to other datasets.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;lodcloud.unconnected&amp;lt;/code&amp;gt;&lt;br /&gt;
|Dataset has no external RDF links to or from other datasets.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;lodcloud.needsinfo&amp;lt;/code&amp;gt;&lt;br /&gt;
|The data provider or dataset homepage does not provide the minimum information (and the information can't be determined from the SPARQL endpoint or downloads).&lt;br /&gt;
|-&lt;br /&gt;
|&amp;lt;code&amp;gt;lodcloud.needsfixing&amp;lt;/code&amp;gt;&lt;br /&gt;
|The dataset is currently broken. Provide details in the Notes.&lt;br /&gt;
|}&lt;/div&gt;</description>
			<pubDate>Thu, 01 Dec 2011 13:02:07 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/Talk:TaskForces/CommunityProjects/LinkingOpenData/DataSets/CKANmetainformation</comments>		</item>
		<item>
			<title>User:Rcygania2</title>
			<link>http://www.w3.org/wiki/User:Rcygania2</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;/* Nearby */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
&amp;lt;!-- #format wiki --&amp;gt;&lt;br /&gt;
&amp;lt;!-- #language en --&amp;gt;&lt;br /&gt;
== Richard Cyganiak ==&lt;br /&gt;
&lt;br /&gt;
Email: richard@cyganiak.de&lt;br /&gt;
&lt;br /&gt;
Blog: http://dowhatimean.net/&lt;br /&gt;
&lt;br /&gt;
Homepage (in German): http://richard.cyganiak.de/&lt;br /&gt;
&lt;br /&gt;
== Nearby ==&lt;br /&gt;
&lt;br /&gt;
* My [[/RulesOfThumb]]&lt;br /&gt;
* My [[/FavouritePapers]]&lt;br /&gt;
* Notes: [[/HashVsSlash]]&lt;br /&gt;
&lt;br /&gt;
== Semwebby interests ==&lt;br /&gt;
&lt;br /&gt;
* [[JenaFramework]]&lt;br /&gt;
* [[SPARQL]]&lt;br /&gt;
* RDF and relational databases -- [http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/ D2RQ], [http://www.wiwiss.fu-berlin.de/suhl/bizer/d2r-server/ D2R Server], [http://jena.sourceforge.net/sparql2sql/ sparql2sql]&lt;br /&gt;
* [[NamedGraphs]] -- [[NamedGraphsApiForJena]]&lt;br /&gt;
* User interfaces for authoring and browsing RDF -- Snorql&lt;br /&gt;
* [[SemanticWiki]]&lt;br /&gt;
* FOAF&lt;br /&gt;
* [http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData Linking Open Data]&lt;br /&gt;
* [[DBpedia]]&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
[[Category:Homepage]]&lt;/div&gt;</description>
			<pubDate>Thu, 23 Jun 2011 21:01:33 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/User_talk:Rcygania2</comments>		</item>
		<item>
			<title>RichardCyganiak/RulesOfThumb</title>
			<link>http://www.w3.org/wiki/RichardCyganiak/RulesOfThumb</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;moved RichardCyganiak/RulesOfThumb to User:Rcygania2/RulesOfThumb:&amp;amp;#32;ESW style user page =&amp;gt; W3C account&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;#REDIRECT [[User:Rcygania2/RulesOfThumb]]&lt;/div&gt;</description>
			<pubDate>Thu, 23 Jun 2011 21:01:02 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/Talk:RichardCyganiak/RulesOfThumb</comments>		</item>
		<item>
			<title>User:Rcygania2/RulesOfThumb</title>
			<link>http://www.w3.org/wiki/User:Rcygania2/RulesOfThumb</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;moved RichardCyganiak/RulesOfThumb to User:Rcygania2/RulesOfThumb:&amp;amp;#32;ESW style user page =&amp;gt; W3C account&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;A loose collection of [[SemWeb]]-related rules of thumb that I would propose as good practice.&lt;br /&gt;
&lt;br /&gt;
= Modelling a dataset in RDF =&lt;br /&gt;
&lt;br /&gt;
== Labels ==&lt;br /&gt;
* Make sure that everything has an rdfs:label, either directly specified, or by using some property that is defined as a subproperty of rdfs:label&lt;br /&gt;
* Don't be overly concerned with ambiguous labels; just consider the resource in isolation. That's because labels cannot do the job of disambiguation anyway, and trying to do it results in artificial and awkward labels.&lt;br /&gt;
&lt;br /&gt;
== Language tags ==&lt;br /&gt;
* On untyped literals, if the literal is likely to be understood only by speakers of a single language, then add a language tag. If it is likely to work for speakers of many languages, keep it without a language tag. If the file or dataset has only a few exceptions, then it is perhaps better to go for consistency and mark them the same way as the rest of the file.&lt;br /&gt;
* If you publish in multiple languages, then perhaps it's a good idea to include a plain literal in a “default language” without a language tag, to make SPARQLing easy.&lt;br /&gt;
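&lt;br /&gt;
The second rule can be sketched as follows: for each resource, emit one plain literal in a chosen default language plus one language-tagged literal per language. The function names and the &amp;lt;code&amp;gt;ex:&amp;lt;/code&amp;gt; prefix are hypothetical, and the output is only Turtle-like text for illustration, not the result of a full RDF serializer.&lt;br /&gt;
&lt;br /&gt;
```python
def turtle_literal(text, lang=None):
    """Render a quoted literal, optionally with a language tag."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    literal = '"' + escaped + '"'
    if lang:
        literal += "@" + lang
    return literal


def label_statements(subject, labels, default_lang):
    """One untagged label in the default language, then one tagged label per language."""
    lines = [subject + " rdfs:label " + turtle_literal(labels[default_lang]) + " ."]
    for lang in sorted(labels):
        lines.append(subject + " rdfs:label " + turtle_literal(labels[lang], lang) + " .")
    return lines
```

A query that does not care about languages can then match the untagged literal directly.&lt;br /&gt;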
&lt;br /&gt;
== Datatypes ==&lt;br /&gt;
* Avoid xsd:string. Just use a plain literal.&lt;br /&gt;
* For numbers, prefer xsd:decimal and xsd:integer because they are not restricted in accuracy/range.&lt;br /&gt;
* Avoid defining custom datatypes if you can. Better bake the literal semantics into properties.&lt;br /&gt;
* Issue: SKOS demands custom datatypes for skos:notation. Just ignore that?&lt;br /&gt;
* For units of measurement, prefer a pattern such as: ex:length [ ex:meter 5.21 ]&lt;br /&gt;
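&lt;br /&gt;
The unit pattern in the last item can be produced with a small helper. This is a sketch under assumptions: the helper name is made up, the property names are the &amp;lt;code&amp;gt;ex:&amp;lt;/code&amp;gt; placeholders from the example, and the number is written in Turtle's numeric shorthand, where a value with a fractional part is read as xsd:decimal and a value without one as xsd:integer.&lt;br /&gt;
&lt;br /&gt;
```python
from decimal import Decimal


def measurement(prop, unit_prop, value):
    """Render 'prop [ unit_prop number ]' with the number in plain decimal notation."""
    # Decimal avoids binary floating point rounding; "f" avoids exponent notation,
    # which Turtle would read as xsd:double.
    number = format(Decimal(value), "f")
    return prop + " [ " + unit_prop + " " + number + " ]"
```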
&lt;br /&gt;
= Designing vocabularies and ontologies =&lt;br /&gt;
&lt;br /&gt;
@@@ Interesting advice from TimBL: http://lists.w3.org/Archives/Public/public-lod/2011Apr/0282.html&lt;br /&gt;
&lt;br /&gt;
== Naming of properties ==&lt;br /&gt;
* Properties that point to documents (information resources) should have names that announce this fact, e.g. userProfile, userPage, userList, eventRecord, eventForm&lt;br /&gt;
* Relationship nouns make good property names, e.g. “parent” is better than “hasParent” or “isParentOf” (as per TimBL)&lt;br /&gt;
&lt;br /&gt;
== Focus on one problem ==&lt;br /&gt;
* [from an email to rdf-schema-dev on 2008-04-05] Some random half-formed thoughts: It's good if the vocabulary covers all my needs for a given problem. It's good if the vocabulary doesn't contain much extra stuff that I don't need to solve my problem. It's good if the purpose and coverage of the vocabulary can be conveyed in a short term or phrase (e.g. “document metadata” or “issue tracking”). It's good if the level of abstraction is consistent throughout the vocabulary, e.g. don't mix high-level concepts like Service and Container into your down-to-earth photo annotation vocabulary.&lt;br /&gt;
&lt;br /&gt;
== Provide excellent documentation ==&lt;br /&gt;
* [from an email to rdf-schema-dev on 2008-04-05] Random thoughts again: Some introductory narrative. A bunch of good examples for typical usages of the vocabulary. A UML-style overview diagram if the vocab has more than a few classes. Some tutorial-style text. An excellent reference section with all terms and notes about how they are supposed to be used (including notes on what they are NOT supposed to be used for).&lt;br /&gt;
&lt;br /&gt;
== Namespace URIs ==&lt;br /&gt;
From danbri in an email to the DC Architecture list, 30 March 2010 09:40:47 IST:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;blockquote&amp;gt;My current preference / advice (for new work) is for the managers of&lt;br /&gt;
each serious namespace to invest in a distinct domain name for it, and&lt;br /&gt;
for us as a community to come up with social machinery for 'watching&lt;br /&gt;
each other's backs' to ensure that the domains are kept in good&lt;br /&gt;
working order, fees are paid, etc. Sometimes an additional level of&lt;br /&gt;
indirection can add as much risk as it saves.  Initially when I bought&lt;br /&gt;
xmlns.com I have idea it could be a home for lots of namespaces, and&lt;br /&gt;
then the more I thought about it, the less I liked that idea. Each new&lt;br /&gt;
namespace added to the bucket brings some risk to the others using the&lt;br /&gt;
domain, by adding to the complexity and burden for subsequent&lt;br /&gt;
maintainers. So I think a proliferation of independent domain names,&lt;br /&gt;
while painful in its own way, spreads the risk...&amp;lt;/blockquote&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Later in the thread:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;blockquote&amp;gt;Rule of thumb - when wondering what info to include in a namespace&lt;br /&gt;
URI, ... try to leave *out* as much as possible&amp;lt;/blockquote&amp;gt;&lt;br /&gt;
&lt;br /&gt;
And [http://www.w3.org/mid/BANLkTikhWKT90gWq+L-5YJuqsPOdo5ECQA@mail.gmail.com even more concise]:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;blockquote&amp;gt;First rule of namespace URI design &amp;quot;you're more likely to regret&lt;br /&gt;
things you included, than things you omitted&amp;quot;.&amp;lt;/blockquote&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For hash namespaces, the RDF document containing the vocabulary should be typed as owl:Ontology and should be the target of any rdfs:isDefinedBy statements. Note, the RDF document's URI is ''the namespace URI without the trailing hash''.&lt;br /&gt;
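&lt;br /&gt;
As a rough illustration of this rule, the following Python sketch derives the document URI from a hash namespace URI and lists the statements recommended above as plain triples. The function name and the example.org namespace are made up for illustration; it is a sketch, not a complete vocabulary publishing tool.&lt;br /&gt;

```python
OWL_ONTOLOGY = "http://www.w3.org/2002/07/owl#Ontology"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_IS_DEFINED_BY = "http://www.w3.org/2000/01/rdf-schema#isDefinedBy"


def vocabulary_metadata(namespace_uri, term_names):
    """Return (subject, predicate, object) triples for a hash namespace.

    Hypothetical helper: the document URI is taken to be the namespace
    URI without its trailing hash, as described in the text.
    """
    if not namespace_uri.endswith("#"):
        raise ValueError("expected a hash namespace URI ending in '#'")
    document_uri = namespace_uri[:-1]
    # The RDF document itself is typed as an owl:Ontology.
    triples = [(document_uri, RDF_TYPE, OWL_ONTOLOGY)]
    # Every term points back at the document via rdfs:isDefinedBy.
    for name in term_names:
        triples.append((namespace_uri + name, RDFS_IS_DEFINED_BY, document_uri))
    return triples
```

Calling it with the made-up namespace http://example.org/vocab# and the term Photo gives one typing triple for http://example.org/vocab and one rdfs:isDefinedBy triple for http://example.org/vocab#Photo.&lt;br /&gt;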
&lt;br /&gt;
= Publishing RDF on the Web =&lt;br /&gt;
&lt;br /&gt;
== Metadata in RDF documents ==&lt;br /&gt;
* Every RDF document has ''some'' relation to the thing(s) it talks about. It is useful to explicitly state what that relation is. For example, in my FOAF file I state that I'm both the foaf:maker and the foaf:primaryTopic of the file. Other useful properties are: rdfs:isDefinedBy, foaf:topic. A concrete benefit is that consumers can pick out the “important” things from the graph.&lt;br /&gt;
&lt;br /&gt;
== HTML descriptions of URI-denoted things ==&lt;br /&gt;
* To create trust in the stability, reliability and availability of a URI, its HTML description should explicitly state the URI, it should contain an explicit ''Statement of Purpose'' to the effect that the URI is intended to be used as an identifier for the thing, and it should contain a ''Publisher Identification''. It must provide sufficient information to enable human users to know exactly what is being referred to. (This is inspired by [http://www.oasis-open.org/committees/download.php/3050/pubsubj-pt1-1.02-cs.pdf Published Subjects].)&lt;br /&gt;
&lt;br /&gt;
== Content negotiation ==&lt;br /&gt;
* The benefit of CN is that all URIs also work in a standard Web browser, not just in RDF-enabled tools and browsers. Thus it's great for authoring and debugging and when your URIs are exposed to a lot of neophytes (e.g. DBpedia, FOAF, DC). On the other hand, content negotiation is very hard to get right: the devil is in the details, and it has turned out to be quite an interop hassle in practice. So, CN should be thought of as icing on the cake, but not a requirement for publishing RDF.&lt;br /&gt;
* Rule of thumb: If your server solution does CN, then do CN. Otherwise, getting it right will be too much effort.&lt;br /&gt;
* Keep in mind the advice from [http://www.w3.org/2001/tag/doc/alternatives-discovery.html On Linking Alternative Representations]: Provide links between different variants to make them all accessible. This means, if some HTML can be returned in response to RDF/HTML negotiation, there should be an RDF icon nearby, which points to the RDF variant.&lt;br /&gt;
&lt;br /&gt;
== Blank nodes ==&lt;br /&gt;
* Should be avoided in general. Using a blank node is appropriate if the publisher thinks that no one should ever care about this resource except in the context of looking at another, identified, resource in the same RDF document, e.g. a geo:Point that exists solely to give the location of another resource. Another situation where a blank node would be appropriate is when used as an existential variable, but I've never seen them used that way in a Linked Data context.&lt;br /&gt;
&lt;br /&gt;
== Linked Data ==&lt;br /&gt;
* Have rdfs:label (or a subproperty thereof) on everything, always&lt;br /&gt;
* Have rdfs:label for the document URI, always&lt;br /&gt;
* Have a foaf:primaryTopic triple connecting document URI and main resource, always&lt;br /&gt;
* Have as much dc: metadata as possible on the document URI&lt;br /&gt;
* Think hard about possible external links, to other web pages and other RDF documents and entities. Provide as many as possible. These make all the difference.&lt;br /&gt;
&lt;br /&gt;
== URI design ==&lt;br /&gt;
* RDF URIs are always case sensitive, while from HTTP's point of view, some parts of the URI can change without changing any behaviour. So, be clear about the case of your URIs and stick to it once a decision is made. If in doubt, use as much lowercase as possible. (Story: L3S changed case of the domain name in their URIs, broke a Semantic Web Pipes demo.)&lt;br /&gt;
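&lt;br /&gt;
The mismatch can be sketched in a few lines of Python. The two function names and the example.org URIs are illustrative assumptions, and the HTTP side is only an approximation: it treats the scheme and host name as case-insensitive and ignores the fragment, but it does not normalize default ports or percent-encoding.&lt;br /&gt;

```python
from urllib.parse import urlsplit


def same_rdf_uri(a, b):
    """RDF compares URIs character by character, so case always matters."""
    return a == b


def _http_key(uri):
    """Parts of a URI that decide which server and resource HTTP contacts."""
    parts = urlsplit(uri)
    # Scheme and host name are case-insensitive; path and query are not.
    # The fragment is never sent to the server, so it is left out.
    return (parts.scheme.lower(), (parts.hostname or "").lower(),
            parts.port, parts.path, parts.query)


def same_http_request(a, b):
    """Approximate check that two URIs lead to the same HTTP request."""
    return _http_key(a) == _http_key(b)
```

With these definitions, http://Example.org/id/alice and http://example.org/id/alice lead to the same HTTP request but are two different RDF URIs, which is how a change of case in a domain name can silently break consumers of the data.&lt;br /&gt;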
&lt;br /&gt;
= Web Architecture =&lt;br /&gt;
&lt;br /&gt;
== Information resources ==&lt;br /&gt;
* [http://lists.w3.org/Archives/Public/www-tag/2007Sep/0123.html From Harry Halpin]: “If there is a URI that is used to identify a resource one would want to make logical statements about, and these statements do not apply to possible representations of that resource, then one should use the &amp;quot;hash&amp;quot; or 303 redirection to separate  these URIs.”&lt;/div&gt;</description>
			<pubDate>Thu, 23 Jun 2011 21:01:02 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/User_talk:Rcygania2/RulesOfThumb</comments>		</item>
		<item>
			<title>RichardCyganiak/HashVsSlash</title>
			<link>http://www.w3.org/wiki/RichardCyganiak/HashVsSlash</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;moved RichardCyganiak/HashVsSlash to User:Rcygania2/HashVsSlash:&amp;amp;#32;ESW style user page =&amp;gt; W3C account&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;#REDIRECT [[User:Rcygania2/HashVsSlash]]&lt;/div&gt;</description>
			<pubDate>Thu, 23 Jun 2011 21:01:02 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/Talk:RichardCyganiak/HashVsSlash</comments>		</item>
		<item>
			<title>User:Rcygania2/HashVsSlash</title>
			<link>http://www.w3.org/wiki/User:Rcygania2/HashVsSlash</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;moved RichardCyganiak/HashVsSlash to User:Rcygania2/HashVsSlash:&amp;amp;#32;ESW style user page =&amp;gt; W3C account&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
The Semantic Web uses URIs as identifiers for things and documents (as opposed to the traditional World Wide Web, which uses URIs just as identifiers for documents). Publishers can choose between two styles of URIs as identifiers for things: hash URIs and slash URIs. Slash URIs must be used in conjunction with 303 redirects.&lt;br /&gt;
&lt;br /&gt;
This page is [[RichardCyganiak]]'s subjective collection of pros and cons of each approach.&lt;br /&gt;
&lt;br /&gt;
== Pro hash ==&lt;br /&gt;
&lt;br /&gt;
* Easy authoring and publishing just by editing and uploading static files&lt;br /&gt;
* Fits well with the explanation: &amp;quot;The Web is about documents. The Semantic Web is about the things inside those documents.&amp;quot; This seems like a very sound and marketable explanation of the Semantic Web, and hash URIs reinforce this explanation.&lt;br /&gt;
&lt;br /&gt;
Example: [http://dbpedia2.openlinksw.com/about/Berlin#this URI for Entity Berlin]&lt;br /&gt;
&lt;br /&gt;
== Con hash ==&lt;br /&gt;
&lt;br /&gt;
* There are issues with content negotiation. See [[IanDavis]]' blog posts from late 2007, discussion on the TAG list.&lt;br /&gt;
* Doesn't work well with large namespaces&lt;br /&gt;
* Doesn't work well with documents that describe just one thing -- the #it style is ugly and can't be QName-abbreviated, e.g. as in dbpedia:Berlin&lt;br /&gt;
&lt;br /&gt;
== Pro slash ==&lt;br /&gt;
&lt;br /&gt;
* The URIs of the thing and its describing document can be chosen and evolved independently&lt;br /&gt;
&lt;br /&gt;
Example: [http://dbpedia.org/resource/Berlin URI for Entity Berlin]&lt;br /&gt;
&lt;br /&gt;
== Con slash ==&lt;br /&gt;
&lt;br /&gt;
* 303 redirects require an extra HTTP round-trip&lt;br /&gt;
* This will never fit truly well into web architecture. At best, it's &amp;quot;mostly harmless&amp;quot;. It's a hack.&lt;br /&gt;
* When linking to a slashy URI, it's hard to provide metadata about the document describing the thing. With hash URIs, we naturally know the URI of that document. (See sbp's discussion from early January 2008 on the TAG lists)&lt;br /&gt;
* Deployment is challenging&lt;br /&gt;
&lt;br /&gt;
== Attempted synthesis ==&lt;br /&gt;
&lt;br /&gt;
Slash URIs are mostly legacy. For everyday RDF publishing, hash URIs are better. Slash URIs may have their place for identifiers that are supposed to outlive the particular documents that describe them, and for very large collections of classes and properties.&lt;/div&gt;</description>
			<pubDate>Thu, 23 Jun 2011 21:01:02 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/User_talk:Rcygania2/HashVsSlash</comments>		</item>
		<item>
			<title>RichardCyganiak/FavouritePapers</title>
			<link>http://www.w3.org/wiki/RichardCyganiak/FavouritePapers</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;moved RichardCyganiak/FavouritePapers to User:Rcygania2/FavouritePapers:&amp;amp;#32;ESW style user page =&amp;gt; W3C account&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;#REDIRECT [[User:Rcygania2/FavouritePapers]]&lt;/div&gt;</description>
			<pubDate>Thu, 23 Jun 2011 21:01:02 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/Talk:RichardCyganiak/FavouritePapers</comments>		</item>
		<item>
			<title>User:Rcygania2/FavouritePapers</title>
			<link>http://www.w3.org/wiki/User:Rcygania2/FavouritePapers</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;moved RichardCyganiak/FavouritePapers to User:Rcygania2/FavouritePapers:&amp;amp;#32;ESW style user page =&amp;gt; W3C account&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
I'll start a list of my favourite papers here. This is very incomplete at the moment.&lt;br /&gt;
&lt;br /&gt;
  Unlocking the Potential of Public Sector Information with Semantic Web Technology::&lt;br /&gt;
;   : Shows the Semantic Web in practice. Semantic Web projects work well if kept simple. I like the practical, down-to-earth approach. Also, there's this delightful sentence: “Carol Tullo, co-author of this paper, is granted authority by Her Majesty The Queen to manage all copyrights and databases owned by the Crown.” [http://eprints.ecs.soton.ac.uk/14429/ Link]&lt;/div&gt;</description>
			<pubDate>Thu, 23 Jun 2011 21:01:02 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/User_talk:Rcygania2/FavouritePapers</comments>		</item>
		<item>
			<title>RichardCyganiak</title>
			<link>http://www.w3.org/wiki/RichardCyganiak</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;moved RichardCyganiak to User:Rcygania2:&amp;amp;#32;ESW style user page =&amp;gt; W3C account&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;#REDIRECT [[User:Rcygania2]]&lt;/div&gt;</description>
			<pubDate>Thu, 23 Jun 2011 21:01:01 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/Talk:RichardCyganiak</comments>		</item>
		<item>
			<title>User:Rcygania2</title>
			<link>http://www.w3.org/wiki/User:Rcygania2</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;moved RichardCyganiak to User:Rcygania2:&amp;amp;#32;ESW style user page =&amp;gt; W3C account&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
&amp;lt;!-- #format wiki --&amp;gt;&lt;br /&gt;
&amp;lt;!-- #language en --&amp;gt;&lt;br /&gt;
== Richard Cyganiak ==&lt;br /&gt;
&lt;br /&gt;
Email: richard@cyganiak.de&lt;br /&gt;
&lt;br /&gt;
Blog: http://dowhatimean.net/&lt;br /&gt;
&lt;br /&gt;
Homepage (in German): http://richard.cyganiak.de/&lt;br /&gt;
&lt;br /&gt;
== Nearby ==&lt;br /&gt;
&lt;br /&gt;
* My /RulesOfThumb&lt;br /&gt;
* My /FavouritePapers&lt;br /&gt;
* Notes: /HashVsSlash&lt;br /&gt;
&lt;br /&gt;
== Semwebby interests ==&lt;br /&gt;
&lt;br /&gt;
* [[JenaFramework]]&lt;br /&gt;
* [[SPARQL]]&lt;br /&gt;
* RDF and relational databases -- [http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/ D2RQ], [http://www.wiwiss.fu-berlin.de/suhl/bizer/d2r-server/ D2R Server], [http://jena.sourceforge.net/sparql2sql/ sparql2sql]&lt;br /&gt;
* [[NamedGraphs]] -- [[NamedGraphsApiForJena]]&lt;br /&gt;
* User interfaces for authoring and browsing RDF -- Snorql&lt;br /&gt;
* [[SemanticWiki]]&lt;br /&gt;
* FOAF&lt;br /&gt;
* [http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData Linking Open Data]&lt;br /&gt;
* [[DBpedia]]&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
[[Category:Homepage]]&lt;/div&gt;</description>
			<pubDate>Thu, 23 Jun 2011 21:01:01 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/User_talk:Rcygania2</comments>		</item>
		<item>
			<title>Bad RDF Crawlers</title>
			<link>http://www.w3.org/wiki/Bad_RDF_Crawlers</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;devote a separate page to WebID advocacy&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page is intended to hold a list of poorly behaving crawlers that target RDF-publishing websites, unfortunately a recurring problem.&amp;lt;ref&amp;gt;[http://lists.w3.org/Archives/Public/public-lod/2011Jun/0433.html Think before you write Semantic Web crawlers], public-lod post by Martin Hepp, 21 June 2011&amp;lt;/ref&amp;gt; The list will allow publishers to defend themselves by blocking such crawlers.&lt;br /&gt;
&lt;br /&gt;
== Best practices for web crawlers ==&lt;br /&gt;
&lt;br /&gt;
Dereferencing is a privilege, not a right. Crawlers that don't use server resources considerately abuse that privilege. It has bad consequences for the Web in general.&lt;br /&gt;
&lt;br /&gt;
A well-behaved crawler …&lt;br /&gt;
&lt;br /&gt;
* … uses reasonable limits for default crawling speed and re-crawling delay,&lt;br /&gt;
* … obeys [http://www.robotstxt.org/robotstxt.html robots.txt],&lt;br /&gt;
* … obeys crawling speed limitations in robots.txt ([http://help.yahoo.com/l/us/yahoo/search/webcrawler/slurp-03.html Crawl-Delay]),&lt;br /&gt;
* … identifies itself properly with the [http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.43 User-Agent HTTP request header], including contact information therein, &lt;br /&gt;
* … avoids excessive re-crawling,&lt;br /&gt;
* … respects [http://www.peej.co.uk/articles/http-caching.html HTTP cache headers] such as If-Modified-Since, Last-Modified and ETag when re-crawling.&lt;br /&gt;
&lt;br /&gt;
See [[Write Web Crawler]] for further guidelines.&lt;br /&gt;
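&lt;br /&gt;
Several of the points above (robots.txt, Crawl-Delay, a descriptive User-Agent) can be sketched with Python's standard urllib.robotparser module. The bot name, contact details and default delay below are placeholders chosen for illustration, and a real crawler would also need to fetch robots.txt itself, sleep between requests, and send conditional requests when re-crawling.&lt;br /&gt;

```python
from urllib.robotparser import RobotFileParser

# Placeholder identity: a real crawler should name itself and give
# working contact information here.
USER_AGENT = "ExampleRdfBot/0.1 (+http://example.org/bot; mailto:bot-admin@example.org)"
# Assumed fallback delay (seconds) when robots.txt gives no Crawl-delay.
DEFAULT_DELAY_SECONDS = 5


def crawl_policy(robots_txt, user_agent=USER_AGENT):
    """Parse the text of a robots.txt file.

    Returns a pair (allowed, delay): allowed(url) says whether the URL
    may be fetched, and delay is the pause in seconds between requests.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    delay = parser.crawl_delay(user_agent)
    if delay is None:
        delay = DEFAULT_DELAY_SECONDS

    def allowed(url):
        return parser.can_fetch(user_agent, url)

    return allowed, delay
```

The crawler would call allowed(url) before every request and wait at least delay seconds between requests to the same host.&lt;br /&gt;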
&lt;br /&gt;
== Defensive measures ==&lt;br /&gt;
&lt;br /&gt;
If you run large web servers, you may want to consider [http://code.google.com/p/ldspider/wiki/ServerConfig defensive measures] against abuse and attacks.&lt;br /&gt;
&lt;br /&gt;
On Apache web servers, [http://www.fleiner.com/bots/#banning mod_rewrite can be used] to block bad crawlers based on their IP address or User-Agent string.&lt;br /&gt;
&lt;br /&gt;
[[WebID]] has been proposed as a [[stronger defense against over-eager crawlers]].&lt;br /&gt;
 &lt;br /&gt;
There are several sites dedicated to collecting and sharing information about bad web crawlers in general (not RDF-specific):&lt;br /&gt;
&lt;br /&gt;
* [http://www.bot-trap.de/ BotTrap.de] (in German)&lt;br /&gt;
* …&lt;br /&gt;
&lt;br /&gt;
== Incidents ==&lt;br /&gt;
&lt;br /&gt;
To report a poorly behaving crawler, please provide at least the following information:&lt;br /&gt;
&lt;br /&gt;
* Date of incident:&lt;br /&gt;
* What the crawler did wrong:&lt;br /&gt;
* User agent string:&lt;br /&gt;
* IP address range:&lt;br /&gt;
* Access logs (if possible):&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;/div&gt;</description>
			<pubDate>Thu, 23 Jun 2011 20:44:09 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/Talk:Bad_RDF_Crawlers</comments>		</item>
		<item>
			<title>WebID</title>
			<link>http://www.w3.org/wiki/WebID</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;add link to &amp;quot;WebID and crawlers&amp;quot; page&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;=WebIDs and the WebID Protocol=&lt;br /&gt;
&lt;br /&gt;
== What is a WebID? ==&lt;br /&gt;
A WebID is a way to uniquely identify a person, company,  organization, or other agent using a URI.   The term &amp;quot;WebID&amp;quot; was coined by Dan Brickley and Tim Berners-Lee in 2000.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
One direct use of this concept is the protocol known as [[foaf+ssl]] that is now being worked on in the [http://www.w3.org/2005/Incubator/webid/charter WebID Incubator Group] at the W3C.&lt;br /&gt;
&lt;br /&gt;
== FOAF+SSL WebID ==&lt;br /&gt;
&lt;br /&gt;
A FOAF+SSL WebID is a URI that refers to a person (Agents or Robots are ok too) via a uniquely identifying description placed on the web.&lt;br /&gt;
&lt;br /&gt;
[[Image:X509-Sense-and-Reference.jpg]]&lt;br /&gt;
&lt;br /&gt;
The FOAF+SSL WebID here is https://bblfish.net/#hjs . As stated  by [http://labs.apache.org/webarch/uri/rfc/rfc3986.html#fragment the URI specification] &lt;br /&gt;
&amp;lt;blockquote&amp;gt;The fragment identifier component of a URI allows indirect identification of a secondary resource by reference to a primary resource&amp;lt;/blockquote&amp;gt;&lt;br /&gt;
In this case the primary resource is https://bblfish.net/, which is usually thought of as the profile document, and the secondary resource is the WebID itself.&lt;br /&gt;
&lt;br /&gt;
[http://www.foaf-project.org/ FOAF] is one fairly-commonly used WebID-centered vocabulary, used for profiles of ''agents'' (people, organizations, or software). Any [http://www.w3.org/2004/01/rdxh/spec GRDDLable] format - ie, pretty much any consistent format (including JSON due to [http://buzzword.org.uk/2008/jsonGRDDL/spec jsonGRDDL])  - should work as well.&lt;br /&gt;
&lt;br /&gt;
The benefits of having a URI as an identifier are:&lt;br /&gt;
* that it can be linked to by other profiles, to create a linked web of trust&lt;br /&gt;
* that it can be tied to information enabling a method of authentication (such as OpenID or, even more directly, [[foaf+ssl]])&lt;br /&gt;
&lt;br /&gt;
The screen cast [http://www.youtube.com/watch?v=8iZPJBpI2Po  Using a WebID with the FOAF+SSL on the emerging Social Web] should help reveal what is being enabled here. Note that in the case of [[foaf+ssl]] the end user does not need to remember his WebID.&lt;br /&gt;
&lt;br /&gt;
=== Why should I get a WebID? ===&lt;br /&gt;
&lt;br /&gt;
People often publish data about themselves on the Web, such as:&lt;br /&gt;
&lt;br /&gt;
* Who they know&lt;br /&gt;
* What they are interested in&lt;br /&gt;
* Photos they have taken&lt;br /&gt;
* Projects they work on&lt;br /&gt;
* Their curriculum vitae or employment history&lt;br /&gt;
* Their publications&lt;br /&gt;
&lt;br /&gt;
Having a Web ID can allow you to identify yourself when you publish this sort of information online and link&lt;br /&gt;
to each of those resources.&lt;br /&gt;
&lt;br /&gt;
Most importantly of all, it allows people to reference you and declare social relations on the&lt;br /&gt;
web, such as that you are their friend, colleague, or parent, even when their profile is hosted on a different web&lt;br /&gt;
server than yours. This is key to enabling the Social Web, i.e. social networks between individuals, citizens, companies, universities, and&lt;br /&gt;
governments, while allowing each player to remain in control of the data they publish.&lt;br /&gt;
&lt;br /&gt;
In order to deal with privacy issues, a Profile Server should reveal more or less information depending on the identity&lt;br /&gt;
(WebID) of the user.&lt;br /&gt;
&lt;br /&gt;
=== Do I already have a WebID? ===&lt;br /&gt;
&lt;br /&gt;
If you are already a member of [[FoafSites|a number of social networking sites]], you may already&lt;br /&gt;
have a WebID which they have given you automatically (but be aware that some sites, e.g., LiveJournal,&lt;br /&gt;
currently expose FOAF data without creating WebIDs for each user; [http://my.opera.com/community/ MyOpera] and [http://id.myopenlink.net/ MyOpenLink] are&lt;br /&gt;
sites which do assign and use URIs for all users).&lt;br /&gt;
&lt;br /&gt;
These sites export (some of) the data which you have put into them.  It is normally a subset -- perhaps&lt;br /&gt;
just the social graph (i.e., who knows whom on the site).  This is useful, but it doesn't mean you can use the social networking site to&lt;br /&gt;
manage all the data you may want in your public profile on the web.&lt;br /&gt;
&lt;br /&gt;
An important question about any site which exports your data is: ''Does it allow me to make a link&lt;br /&gt;
to data on another site?'' If so, your data becomes linked together from site to site, and people and&lt;br /&gt;
crawlers can find all of it which you choose to expose. If a site does not allow you to link your data&lt;br /&gt;
in this way, they are effectively confining your data to only being used within their service, which&lt;br /&gt;
some people find restrictive.&lt;br /&gt;
&lt;br /&gt;
=== Best Practices for Sites Providing WebIDs for Users ===&lt;br /&gt;
&lt;br /&gt;
* Allow your users to tell your site their existing Web ID(s). Publish these Web IDs in your data.&lt;br /&gt;
** Use &amp;lt;code&amp;gt;&amp;lt;nowiki&amp;gt;owl:sameAs&amp;lt;/nowiki&amp;gt;&amp;lt;/code&amp;gt; to associate their existing Web IDs with the Web ID you provide for them; or&lt;br /&gt;
** Use their existing Web ID in your data ''instead'' of the Web ID you would normally provide them.&lt;br /&gt;
* Be careful to distinguish between your user as a person, and your user's account. In SIOC, these are represented as &amp;lt;code&amp;gt;&amp;lt;nowiki&amp;gt;foaf:Person&amp;lt;/nowiki&amp;gt;&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;&amp;lt;nowiki&amp;gt;sioc:UserAccount&amp;lt;/nowiki&amp;gt;&amp;lt;/code&amp;gt; respectively. These two classes are disjoint (i.e., an entity can't be both at the same time).&lt;br /&gt;
&lt;br /&gt;
=== Can I make my own Web ID? ===&lt;br /&gt;
&lt;br /&gt;
You can use any web server you have authorization to publish your files to.&lt;br /&gt;
&lt;br /&gt;
First you must decide on a stable place to host your profile document.  If you have your own domain name,&lt;br /&gt;
that is perfect as you can change providers without changing your Web ID.  If you don't have your own&lt;br /&gt;
domain, you'll need to pick a provider where you can host a file.  The file you make will be named &amp;lt;code&amp;gt;&amp;lt;nowiki&amp;gt;foaf.rdf&amp;lt;/nowiki&amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
(Not all Web IDs will use &amp;lt;code&amp;gt;&amp;lt;nowiki&amp;gt;foaf.rdf&amp;lt;/nowiki&amp;gt;&amp;lt;/code&amp;gt;, but this is an easy example.)&lt;br /&gt;
&lt;br /&gt;
Your Web ID is made up of the URL of that file on the Web, plus &amp;lt;code&amp;gt;&amp;lt;nowiki&amp;gt;#ABC&amp;lt;/nowiki&amp;gt;&amp;lt;/code&amp;gt;, where you replace ABC with your initials.&lt;br /&gt;
(Technically, you can use &amp;lt;code&amp;gt;&amp;lt;nowiki&amp;gt;#me&amp;lt;/nowiki&amp;gt;&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;&amp;lt;nowiki&amp;gt;#this&amp;lt;/nowiki&amp;gt;&amp;lt;/code&amp;gt;, or any other fragment identifier; initials are currently being&lt;br /&gt;
recommended.) (&amp;lt;-- recommended by whom? and how to handle suffixes, e.g., Jr., Sr., III, IV?)&lt;br /&gt;
&lt;br /&gt;
So your Web ID looks something like:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;nowiki&amp;gt;&lt;br /&gt;
http://your.isp.com/whatever/~yourusername/foaf.rdf#ABC&lt;br /&gt;
&amp;lt;/nowiki&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
You can consult the [http://www.w3.org/TR/swbp-vocab-pub/ Best Practice Recipes for Publishing RDF Vocabularies]&lt;br /&gt;
for some options on how to do this.&lt;br /&gt;
&lt;br /&gt;
There are a number of ways to make a FOAF file.&lt;br /&gt;
&lt;br /&gt;
* Use Tabulator (version supporting this action has not yet been released, as of 2008-11)&lt;br /&gt;
* Use a template and edit it for your own use&lt;br /&gt;
* [http://foafbuilder.qdos.com/ FOAF builder], a recent service that allows you to create a FOAF file to place where you want, or to be saved on their servers (which currently requires an OpenID).&lt;br /&gt;
* Use [http://www.ldodds.com/foaf/foaf-a-matic FOAF-a-Matic]&lt;br /&gt;
* [http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/GetAPersonalURIIn5MinutesOrLess Get A URI in 5 Minutes or Less] via [[OpenLinkDataSpaces|ODS]] from [[OpenLinkSoftware|OpenLink Software]]&lt;br /&gt;
* Use [http://www.foaf-o-matic.org FOAF-o-Matic], a recent service for creating FOAF profiles in which you may decide to reuse pre-existing URIs for yourself or your friends, thereby improving the integration of the FOAF social network. This service is powered by the Entity Name System (ENS), which is developed as part of the [http://www.okkam.org OKKAM] EU-funded project.&lt;br /&gt;
&lt;br /&gt;
It is good to pick a hosting service which allows you to edit the file in place&lt;br /&gt;
using WebDAV. WebDAV is an extension for HTTP which lets you edit as well as read&lt;br /&gt;
web pages, without needing to learn how to download and upload files with that host.&lt;br /&gt;
&lt;br /&gt;
See [[EditingData]] for details of how to make an editable web site.&lt;br /&gt;
&lt;br /&gt;
=== How does a WebID compare with my social networking site? ===&lt;br /&gt;
&lt;br /&gt;
When you get an account on your favorite social networking site, like Facebook, [[MySpace]], [[LinkedIn]] and so on,&lt;br /&gt;
you spend a lot of time telling that site about yourself, who your friends are, what photos include you, and so on.&lt;br /&gt;
This information is re-used to provide, within that site, services like showing you photos of your friends.&lt;br /&gt;
The trouble is, each site is a silo. When you want to use info you gave to one site on another,&lt;br /&gt;
you have to negotiate for each site to access your data on the other.&lt;br /&gt;
&lt;br /&gt;
But this is your data.  You can publish it anywhere you can publish web content.&lt;br /&gt;
&lt;br /&gt;
* See also: [http://www.w3.org/2008/Talks/1022-Steven_Pemberton/ Steven Pemberton: Why You Should Have a Web Site]&lt;br /&gt;
&lt;br /&gt;
That said, these silos are slowly opening; technologies like OpenID, OAuth, [[OpenSocial]], [[PortableContacts]], and so on, are being&lt;br /&gt;
deployed on many such sites. A Web ID is an identifier that you can use to tie together information exposed by such&lt;br /&gt;
services, in a portable and future-proof way.&lt;br /&gt;
&lt;br /&gt;
=== I have a home page, is that a Web ID? ===&lt;br /&gt;
&lt;br /&gt;
No. Your Web page URL identifies the location of a document that is associated with you. In a sense it is similar to your driver's license or social security card: both of these artifacts identify you indirectly.&lt;br /&gt;
&lt;br /&gt;
In DBMS parlance, an indirect identifier takes the form of a Unique Key, and in Semantic Web parlance indirect identifiers are values associated with Inverse Functional Properties (IFPs).  &lt;br /&gt;
&lt;br /&gt;
Since you aren't a Document, a Web Page URL cannot be used to construct an Identifier that uniquely identifies you. It cannot be the Naming mechanism used by other Web users to accurately reference you. &lt;br /&gt;
&lt;br /&gt;
A Web ID looks similar to a home page URL, but it specifically identifies Entity ''You'' of Type: Person. Typically, the definition of Type: Person&lt;br /&gt;
comes from a vocabulary, ontology, or data dictionary. One such vocabulary is FOAF, which is the basis of this effort.&lt;br /&gt;
&lt;br /&gt;
When you get a Web ID, it can be used to accurately reference &amp;quot;You&amp;quot;. It is also the conduit to all information associated with &amp;quot;You&amp;quot; which includes things you've created -- Blog Posts, Shared Bookmarks, Music, Photos etc. -- and relationships you have with other people (your social network for example). &lt;br /&gt;
&lt;br /&gt;
=== Can I use a Web ID for Federated Single Sign-on? ===&lt;br /&gt;
&lt;br /&gt;
Yes!  WebIDs form the basis of the [[foaf+ssl|WebID Protocol]], an innovative approach to federated single sign-on, which is discussed below.&lt;br /&gt;
&lt;br /&gt;
== What is the WebID Protocol? ==&lt;br /&gt;
&lt;br /&gt;
The [[foaf+ssl|WebID Protocol]] authenticates a digital identity (a WebID) by allowing an Agent (e.g., a Web Browser) to prove possession of or access to a private key, whose corresponding public key is tightly bound to that WebID. The private key is usually associated with a &amp;quot;certificate&amp;quot; on the user's computer, while the public key and WebID are part of that certificate and tightly bound to its subject.  Likewise, the public key and WebID are both tightly bound into the profile-oriented document(s) that describe the identity being authenticated.&lt;br /&gt;
&lt;br /&gt;
This proof is accomplished by adding an invisible look-up step into the standard client-side certificate verification process (that is, the TLS Handshake Protocol) used by the SSL protocol -- which is in use anytime you see an &amp;lt;code&amp;gt;https://&amp;lt;/code&amp;gt; link.  This look-up reveals the user accessing the site (i.e., the owner of the WebID) in the overall Web of Linked Data, against which the server can then query to determine the level of trust that user should be granted. ''Note: This process works with self-signed certificates, so it does not require the participation of Certificate Authorities to grow.''&lt;br /&gt;
&lt;br /&gt;
The WebID Protocol promises to be more fine grained than other Federated Single Sign-on methods. ''Federation'' generally implies large entities that work together toward a common goal. The WebID Protocol lets individuals participate at the same level as large entities. &lt;br /&gt;
&lt;br /&gt;
=== Some Protocol Details ===&lt;br /&gt;
&lt;br /&gt;
An excerpt from the introduction of the [http://tools.ietf.org/html/rfc5246#section-1 Transport Layer Security (TLS) Protocol] specification may help clarify matters somewhat.&lt;br /&gt;
&lt;br /&gt;
   The TLS Record Protocol is used for encapsulation of various higher-&lt;br /&gt;
   level protocols.  One such encapsulated protocol, the TLS Handshake&lt;br /&gt;
   Protocol, allows the server and client to authenticate each other and&lt;br /&gt;
   to negotiate an encryption algorithm and cryptographic keys before&lt;br /&gt;
   the application protocol transmits or receives its first byte of&lt;br /&gt;
   data.  The TLS Handshake Protocol provides connection security that&lt;br /&gt;
   has three basic properties:&lt;br /&gt;
   &lt;br /&gt;
   -  The peer's identity can be authenticated using asymmetric, or&lt;br /&gt;
      public key, cryptography (e.g., RSA [RSA], DSA [DSS], etc.).  This&lt;br /&gt;
      authentication can be made optional, but is generally required for&lt;br /&gt;
      at least one of the peers.&lt;br /&gt;
   &lt;br /&gt;
   -  The negotiation of a shared secret is secure: the negotiated&lt;br /&gt;
      secret is unavailable to eavesdroppers, and for any authenticated&lt;br /&gt;
      connection the secret cannot be obtained, even by an attacker who&lt;br /&gt;
      can place himself in the middle of the connection.&lt;br /&gt;
   &lt;br /&gt;
   -  The negotiation is reliable: no attacker can modify the&lt;br /&gt;
      negotiation communication without being detected by the parties to&lt;br /&gt;
      the communication.&lt;br /&gt;
&lt;br /&gt;
The WebID Protocol simply adds a check to the existing TLS Handshake, authenticating the identity of the client-side peer.  That check is a look up against the WebID for the public key received during a successful handshake.&lt;br /&gt;
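&lt;br /&gt;
The look-up step can be sketched roughly as follows in Python. This is not the protocol's reference implementation: the function names, the representation of a public key as a (modulus, exponent) pair, and the fetch_profile_keys callable that stands in for dereferencing and parsing the profile document are all assumptions made for illustration.&lt;br /&gt;

```python
def webid_check(claimed_webid, presented_key, fetch_profile_keys):
    """Return True if the profile behind claimed_webid lists presented_key.

    claimed_webid: URI taken from the client certificate.
    presented_key: (modulus, exponent) pair of the public key whose
        private half the client proved possession of in the handshake.
    fetch_profile_keys: hypothetical callable that dereferences the
        profile document and returns a dict mapping each WebID described
        there to a set of (modulus, exponent) pairs.
    """
    # The profile document is the WebID without its fragment identifier.
    profile_url = claimed_webid.split("#", 1)[0]
    keys_by_webid = fetch_profile_keys(profile_url)
    # The check succeeds only if the profile itself vouches for the key.
    return presented_key in keys_by_webid.get(claimed_webid, set())
```

A server would run this check only after the TLS handshake has succeeded, and would then decide separately how much trust to grant the authenticated WebID.&lt;br /&gt;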
&lt;br /&gt;
=== Why is the WebID Protocol Viable? ===&lt;br /&gt;
&lt;br /&gt;
The simple answer is that URLs can be used to name anything, including, in particular, people. In the [[LinkedData]] pattern one places information about the object at the URL of the object named. This helps in finding the meaning of any URI: you just need to click it, to GET it. The same pattern is applied here. Someone names themselves with a URL, and places a document containing structured data about themselves, typically including links to the people they know, at that URL. Each link they add is a vote of trust in the information at which they are pointing; in this case, trusting that the URL they are using for their friend really is one that reliably and stably refers to and describes their friend, since their friend wants to use that URL to keep track of the information about themselves. This builds a network of trust.&lt;br /&gt;
&lt;br /&gt;
[[foaf+ssl|WebID Protocol]] is just a technology that leverages X.509 Certificates, already in common (and largely invisible) use. By placing that same URL in the certificate that identifies the user, and then placing information at that URL about the public key of the certificate, a web server that receives a user request can verify that the user has write access to that URL. If that can be proven then the server may as well agree that the user is the person in question described by the resource. The value of the trust the server puts into what this person says can then be established by the position of that URL in the network of trust. A more detailed version of this explanation can be found in the article [http://blogs.sun.com/bblfish/entry/more_on_authorization_in_foaf FOAF+SSL: Creating a Web of Trust Without Key Signing Parties].&lt;br /&gt;
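&lt;br /&gt;
The verification step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the protocol's reference implementation: the names PublicKey, profile_keys and webid_claim_is_valid are hypothetical, and the profile document is assumed to have been fetched and parsed already, so that it is reduced to a mapping from WebID URL to the public keys published there. A real verifier would first complete the TLS handshake and dereference the URL.&lt;br /&gt;

```python
from typing import NamedTuple


class PublicKey(NamedTuple):
    """RSA public key reduced to the two numbers that get compared (hypothetical type)."""
    modulus: int
    exponent: int


def webid_claim_is_valid(claimed_webid, cert_key, profile_keys):
    """Return True if the profile at claimed_webid publishes the certificate's key.

    claimed_webid: URL taken from the client certificate.
    cert_key: PublicKey the client proved possession of in the TLS handshake.
    profile_keys: dict mapping a WebID URL to the set of PublicKey values found
        in the document at that URL (assumed to be fetched and parsed already).
    """
    published = profile_keys.get(claimed_webid, set())
    return cert_key in published


# Illustrative data only; the URL and the tiny key values are made up.
alice = "https://alice.example/card#me"
alice_key = PublicKey(modulus=3233, exponent=17)
profiles = {alice: {alice_key}}

print(webid_claim_is_valid(alice, alice_key, profiles))
print(webid_claim_is_valid(alice, PublicKey(modulus=3127, exponent=17), profiles))
```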
&lt;br /&gt;
=== How does the WebID Protocol compare with OpenID? ===&lt;br /&gt;
&lt;br /&gt;
OpenID is also a single-sign-on system.&lt;br /&gt;
&lt;br /&gt;
However, while it does give you an identity for sign on,&lt;br /&gt;
it does not give you an identity for all the types&lt;br /&gt;
of networks mentioned above, such as Friend Of A Friend (FOAF),&lt;br /&gt;
Description Of A Project (DOAP), and so on.&lt;br /&gt;
&lt;br /&gt;
OpenIDs serve as fine ''indirect identifiers.'' Much as you can talk about &amp;quot;the Person who has the blog at example.org&amp;quot; or&lt;br /&gt;
&amp;quot;the Company whose homepage is at example.com&amp;quot;, you can use an OpenID page identifier as an indirect identifier for the person&lt;br /&gt;
who controls that OpenID page. In fact, this is in a way how OpenID works: by demonstrating control over some &lt;br /&gt;
identifying page, you can assure other sites of your real-world identity. OpenID is decentralized in that it allows you to arrange&lt;br /&gt;
for any page (e.g., your longstanding homepage or blog) to serve as an indirect person-identifying page.&lt;br /&gt;
&lt;br /&gt;
Additionally, because OpenID allows you to demonstrate control of your page to other sites, it can also be used to help&lt;br /&gt;
those sites know what the preferred Web ID is for any OpenID user whose pages link to a FOAF description. In other words,&lt;br /&gt;
once you have the combination of OpenID/FOAF set up, you can log in elsewhere using your OpenID and it will also make clear&lt;br /&gt;
what your preferred Web ID is.&lt;br /&gt;
&lt;br /&gt;
It is possible to connect your Web ID to your OpenID in both directions.&lt;br /&gt;
&lt;br /&gt;
For a detailed protocol-level discussion, see [http://blogs.sun.com/bblfish/entry/what_does_foaf_ssl_give A comparison of FOAF+SSL and OpenID].&lt;br /&gt;
&lt;br /&gt;
== Official Specifications ==&lt;br /&gt;
&lt;br /&gt;
* [http://www.w3.org/2005/Incubator/webid/spec/ First set of specifications] (aka WebID 1.0)&lt;br /&gt;
&lt;br /&gt;
== Related (seeAlso) ==&lt;br /&gt;
&lt;br /&gt;
* [http://lists.w3.org/Archives/Public/public-xg-webid/ The WebID Incubator Group] at the W3C&lt;br /&gt;
* [http://www.openlinksw.com/blog/~kidehen/?id=1148 Personal URIs &amp;amp; Data Spaces] from Kingsley Idehen's [http://www.openlinksw.com/blog/~kidehen/ blog]&lt;br /&gt;
* [[SocialNetworks2009Workshop|Social Networks Workshop (Barcelona 2009)]]&lt;br /&gt;
* [http://blogs.sun.com/bblfish/entry/more_on_authorization_in_foaf How FOAF+SSL works]&lt;br /&gt;
* [http://dig.csail.mit.edu/breadcrumbs/node/71 Give yourself a URI] - on Tim Berners-Lee's blog&lt;br /&gt;
* [http://www.w3.org/DesignIssues/ReadWriteLinkedData.html Read-Write Linked Data] and [http://www.w3.org/DesignIssues/CloudStorage.html Socially Aware Cloud Storage] - from Tim Berners-Lee's [http://www.w3.org/DesignIssues/Overview.html Design Issues] series&lt;br /&gt;
* [[WebID and Crawlers]]&lt;br /&gt;
&lt;br /&gt;
=== Screencasts ===&lt;br /&gt;
* [http://www.youtube.com/watch?v=8iZPJBpI2Po Using a WebID with the WebID Protocol (née FOAF+SSL) on the emerging Social Web], from Henry Story&lt;br /&gt;
* [http://www.google.com/search?q=webid+kidehen&amp;amp;tbs=vid:1 several more], from Kingsley Idehen&lt;br /&gt;
* [http://www.youtube.com/watch?v=rnoRQZCL9I4 WebID &amp;amp; Browsers] - presentation made by Henry Story for W3C's Identity in the Browser Workshop 2011.&lt;/div&gt;</description>
			<pubDate>Thu, 23 Jun 2011 20:43:33 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/Talk:WebID</comments>		</item>
		<item>
			<title>WebID and Crawlers</title>
			<link>http://www.w3.org/wiki/WebID_and_Crawlers</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;move section from &amp;quot;Bad RDF Crawlers&amp;quot; to its own page&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[WebID]] can be used as an effective defense against abusive Web crawlers.&lt;br /&gt;
&lt;br /&gt;
== Traditional countermeasures are insufficient ==&lt;br /&gt;
The traditional measures described on [[Bad RDF Crawlers]] have been around since the beginning of Web crawling, and suffer from a number of problems:&lt;br /&gt;
&lt;br /&gt;
* IP addresses are very bad identifiers&lt;br /&gt;
** they can be faked&lt;br /&gt;
** a large number of users can sit behind a single IP address. In the early Web (1995 to 1998), most addresses came through AOL proxies.&lt;br /&gt;
* headers can be faked or forgotten&lt;br /&gt;
* robots.txt works by convention only - it has no enforcement mechanism&lt;br /&gt;
** robot writers need to know about it, and this is not always an evident thing to understand&lt;br /&gt;
** not all users have access to robots.txt, so in any case it is not a very flexible mechanism for setting access control&lt;br /&gt;
&lt;br /&gt;
While these were perfectly fine in a world where there were few writers of robots and computing power for running such tools was expensive, they are no longer appropriate for a world where every laptop has more RAM and CPU than the largest machines search engines were running on in 1996.&lt;br /&gt;
&lt;br /&gt;
== Requirement: Strong distributed access control ==&lt;br /&gt;
What is required is strong and automatic access control that works in a distributed manner. But global authentication is required for this to work. Otherwise, robots would need to find the login page for every web site and create a username and password for themselves on that site, which is clearly an impossible task.&lt;br /&gt;
&lt;br /&gt;
Global Authentication tied into Linked Data is enabled by [[foaf+ssl|FOAF+SSL]], also known as [http://webid.info/spec/ WebID]. Both HTTP and HTTPS resources can be protected this way:&lt;br /&gt;
* HTTPS resources request client-side certificates according to the usual WebID protocol&lt;br /&gt;
* HTTP resources can use cookies and redirect clients to an HTTPS endpoint for authentication if the requestor has no cookie. If the client does not have a WebID-enabled certificate, OpenID or other methods of authentication can be used. Once authenticated, clients (and hence robots) can then be redirected to the HTTP resources and proceed as usual.&lt;br /&gt;
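&lt;br /&gt;
The HTTP case above amounts to a small decision made per request. A minimal sketch, assuming a hypothetical cookie name (session), session store (valid_sessions) and login endpoint URL, none of which come from the WebID specification:&lt;br /&gt;

```python
from urllib.parse import quote

# Hypothetical HTTPS endpoint that performs the WebID (or fallback) authentication.
LOGIN_ENDPOINT = "https://example.org/login"


def respond(path, cookies, valid_sessions):
    """Decide how to answer a plain-HTTP request for path.

    Returns a (status, headers) pair: 200 when the request carries a known
    session cookie, otherwise a 302 redirect to the HTTPS login endpoint that
    remembers where the client wanted to go.
    """
    if cookies.get("session") in valid_sessions:
        return 200, {}
    target = LOGIN_ENDPOINT + "?return_to=" + quote(path, safe="")
    return 302, {"Location": target}


print(respond("/data/dump.rdf", {}, {"abc123"}))
print(respond("/data/dump.rdf", {"session": "abc123"}, {"abc123"}))
```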
&lt;br /&gt;
== Advantages of WebID ==&lt;br /&gt;
The advantages of WebID are many:&lt;br /&gt;
* Robots and crawlers can identify themselves as :Crawler in their WebID Profile document (ontology to be developed), and so get access to special resources more useful to robots, such as full dumps or RSS feeds.&lt;br /&gt;
* Authentication is automatically enforced - so bad robot writers will very soon find out about it, as they won't get access until they do.&lt;br /&gt;
* WebIDs are distributed and can preserve anonymity while enabling authentication. WebIDs can be self-generated and throw-away. There is no center of control. &lt;br /&gt;
* Good WebID users can get better service over time, leading even anonymously-identified robots to pursue a strategy of long-term good behavior.&lt;br /&gt;
* Getting WebIDs is very easy, and most software libraries support client-side certificates, so it should only be a few hours' work for robot authors to enable their crawlers with it.&lt;br /&gt;
* Building WebID-enabled application servers is not that much work either.&lt;br /&gt;
&lt;br /&gt;
== The WebID Incubator Group ==&lt;br /&gt;
The [http://tinyurl.com/webidxg WebID Incubator Group] is very keen to work with robot writers and linked data publishers to help them WebID-enable their apps.&lt;br /&gt;
&lt;br /&gt;
== See Also ==&lt;br /&gt;
* [[WebID]]&lt;br /&gt;
* [[Bad RDF Crawlers]]&lt;/div&gt;</description>
			<pubDate>Thu, 23 Jun 2011 20:43:01 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/Talk:WebID_and_Crawlers</comments>		</item>
		<item>
			<title>Bad RDF Crawlers</title>
			<link>http://www.w3.org/wiki/Bad_RDF_Crawlers</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;make clear that this page is specifically about RDF crawlers&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page is intended to hold a list of poorly behaving crawlers that target RDF-publishing websites, unfortunately a recurring problem.&amp;lt;ref&amp;gt;[http://lists.w3.org/Archives/Public/public-lod/2011Jun/0433.html Think before you write Semantic Web crawlers], public-lod post by Martin Hepp, 21 June 2011&amp;lt;/ref&amp;gt; The list will allow publishers to defend themselves by blocking such crawlers.&lt;br /&gt;
&lt;br /&gt;
== Best practices for web crawlers ==&lt;br /&gt;
&lt;br /&gt;
Dereferencing is a privilege, not a right. Crawlers that don't use server resources considerately abuse that privilege. It has bad consequences for the Web in general.&lt;br /&gt;
&lt;br /&gt;
A well-behaved crawler …&lt;br /&gt;
&lt;br /&gt;
* … uses reasonable limits for default crawling speed and re-crawling delay,&lt;br /&gt;
* … obeys [http://www.robotstxt.org/robotstxt.html robots.txt],&lt;br /&gt;
* … obeys crawling speed limitations in robots.txt ([http://help.yahoo.com/l/us/yahoo/search/webcrawler/slurp-03.html Crawl-Delay]),&lt;br /&gt;
* … identifies itself properly with the [http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.43 User-Agent HTTP request header], including contact information therein, &lt;br /&gt;
* … avoids excessive re-crawling,&lt;br /&gt;
* … respects [http://www.peej.co.uk/articles/http-caching.html HTTP cache headers] such as If-Modified-Since, Last-Modified and ETag when re-crawling.&lt;br /&gt;
&lt;br /&gt;
See [[Write Web Crawler]] for further guidelines.&lt;br /&gt;
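&lt;br /&gt;
Several of the practices listed above can be combined in one place. The following is a rough sketch using Python's standard urllib.robotparser module; the crawler name, contact addresses, default delay, example URLs and the helper plan_request are all hypothetical, and the robots.txt text is given inline rather than fetched so that the example runs without network access.&lt;br /&gt;

```python
import urllib.robotparser

# Hypothetical crawler identity; the User-Agent carries contact information.
USER_AGENT = "ExampleRDFBot/0.1 (+http://crawler.example/about; mailto:admin@crawler.example)"
DEFAULT_DELAY = 5.0  # seconds between requests when robots.txt sets no Crawl-delay

# In a real crawler this text would be fetched from the site's /robots.txt.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /private/
"""

robots = urllib.robotparser.RobotFileParser()
robots.parse(ROBOTS_TXT.splitlines())


def plan_request(url, last_modified=None, etag=None):
    """Return (delay, headers) for a polite request, or None if robots.txt forbids it."""
    if not robots.can_fetch(USER_AGENT, url):
        return None
    delay = robots.crawl_delay(USER_AGENT) or DEFAULT_DELAY
    headers = {"User-Agent": USER_AGENT}
    # Conditional request headers let the server answer 304 Not Modified on re-crawls.
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    if etag:
        headers["If-None-Match"] = etag
    return delay, headers


print(plan_request("http://data.example/private/dump.rdf"))
print(plan_request("http://data.example/resource/1", etag='"abc"'))
```

The caller is expected to sleep for the returned delay between requests to the same host and to send the returned headers with each request.&lt;br /&gt;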
&lt;br /&gt;
== Defensive measures ==&lt;br /&gt;
&lt;br /&gt;
If you run large web servers, you may want to consider [http://code.google.com/p/ldspider/wiki/ServerConfig defensive measures] against abuse and attacks.&lt;br /&gt;
&lt;br /&gt;
On Apache web servers, [http://www.fleiner.com/bots/#banning mod_rewrite can be used] to block bad crawlers based on their IP address or User-Agent string.&lt;br /&gt;
&lt;br /&gt;
There are several sites dedicated to collecting and sharing information about bad web crawlers in general (not RDF-specific):&lt;br /&gt;
&lt;br /&gt;
* [http://www.bot-trap.de/ BotTrap.de] (in German)&lt;br /&gt;
* …&lt;br /&gt;
&lt;br /&gt;
=== Stronger defenses using WebID ===&lt;br /&gt;
&lt;br /&gt;
The above measures have been around since the beginning of Web crawling, and suffer from a number of problems:&lt;br /&gt;
&lt;br /&gt;
* IP addresses are very bad identifiers&lt;br /&gt;
** they can be faked&lt;br /&gt;
** a large number of users can sit behind a single IP address. In the early Web (1995 to 1998), most addresses came through AOL proxies.&lt;br /&gt;
* headers can be faked or forgotten&lt;br /&gt;
* robots.txt works by convention only - it has no enforcement mechanism&lt;br /&gt;
** robot writers need to know about it, and this is not always an evident thing to understand&lt;br /&gt;
** not all users have access to robots.txt, so in any case it is not a very flexible mechanism for setting access control&lt;br /&gt;
&lt;br /&gt;
While these were perfectly fine in a world where there were few writers of robots and computing power for running such tools was expensive, they are no longer appropriate for a world where every laptop has more RAM and CPU than the largest machines search engines were running on in 1996.&lt;br /&gt;
What is required is strong and automatic access control that works in a distributed manner. But global authentication is required for this to work. Otherwise, robots would need to find the login page for every web site and create a username and password for themselves on that site, which is clearly an impossible task.&lt;br /&gt;
&lt;br /&gt;
Global Authentication tied into Linked Data is enabled by [[foaf+ssl|FOAF+SSL]], also known as [http://webid.info/spec/ WebID]. Both HTTP and HTTPS resources can be protected this way:&lt;br /&gt;
* HTTPS resources request client-side certificates according to the usual WebID protocol&lt;br /&gt;
* HTTP resources can use cookies and redirect clients to an HTTPS endpoint for authentication if the requestor has no cookie. If the client does not have a WebID-enabled certificate, OpenID or other methods of authentication can be used. Once authenticated, clients (and hence robots) can then be redirected to the HTTP resources and proceed as usual.&lt;br /&gt;
&lt;br /&gt;
The advantages of WebID are many:&lt;br /&gt;
* Robots and crawlers can identify themselves as :Crawler in their WebID Profile document (ontology to be developed), and so get access to special resources more useful to robots, such as full dumps or RSS feeds.&lt;br /&gt;
* Authentication is automatically enforced - so bad robot writers will very soon find out about it, as they won't get access until they do.&lt;br /&gt;
* WebIDs are distributed and can preserve anonymity while enabling authentication. WebIDs can be self-generated and throw-away. There is no center of control. &lt;br /&gt;
* Good WebID users can get better service over time, leading even anonymously-identified robots to pursue a strategy of long-term good behavior.&lt;br /&gt;
* Getting WebIDs is very easy, and most software libraries support client-side certificates, so it should only be a few hours' work for robot authors to enable their crawlers with it.&lt;br /&gt;
* Building WebID-enabled application servers is not that much work either.&lt;br /&gt;
&lt;br /&gt;
The [http://tinyurl.com/webidxg WebID Incubator Group] is very keen to work with robot writers and linked data publishers to help them WebID-enable their apps.&lt;br /&gt;
&lt;br /&gt;
== Incidents ==&lt;br /&gt;
&lt;br /&gt;
To report a poorly behaving crawler, please provide at least the following information:&lt;br /&gt;
&lt;br /&gt;
* Date of incident:&lt;br /&gt;
* What the crawler did wrong:&lt;br /&gt;
* User agent string:&lt;br /&gt;
* IP address range:&lt;br /&gt;
* Access logs (if possible):&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;/div&gt;</description>
			<pubDate>Thu, 23 Jun 2011 20:35:22 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/Talk:Bad_RDF_Crawlers</comments>		</item>
		<item>
			<title>Bad Crawlers</title>
			<link>http://www.w3.org/wiki/Bad_Crawlers</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;moved Bad Crawlers to Bad RDF Crawlers&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;#REDIRECT [[Bad RDF Crawlers]]&lt;/div&gt;</description>
			<pubDate>Thu, 23 Jun 2011 20:34:17 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/Talk:Bad_Crawlers</comments>		</item>
		<item>
			<title>Bad RDF Crawlers</title>
			<link>http://www.w3.org/wiki/Bad_RDF_Crawlers</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;moved Bad Crawlers to Bad RDF Crawlers&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page is intended to hold a list of poorly behaving crawlers, unfortunately a recurring problem.&amp;lt;ref&amp;gt;[http://lists.w3.org/Archives/Public/public-lod/2011Jun/0433.html Think before you write Semantic Web crawlers], public-lod post by Martin Hepp, 21 June 2011&amp;lt;/ref&amp;gt; The list will allow publishers to defend themselves by blocking such crawlers.&lt;br /&gt;
&lt;br /&gt;
== Best practices for web crawlers ==&lt;br /&gt;
&lt;br /&gt;
Dereferencing is a privilege, not a right. Crawlers that don't use server resources considerately abuse that privilege. It has bad consequences for the Web in general.&lt;br /&gt;
&lt;br /&gt;
A well-behaved crawler …&lt;br /&gt;
&lt;br /&gt;
* … uses reasonable limits for default crawling speed and re-crawling delay,&lt;br /&gt;
* … obeys [http://www.robotstxt.org/robotstxt.html robots.txt],&lt;br /&gt;
* … obeys crawling speed limitations in robots.txt ([http://help.yahoo.com/l/us/yahoo/search/webcrawler/slurp-03.html Crawl-Delay]),&lt;br /&gt;
* … identifies itself properly with the [http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.43 User-Agent HTTP request header], including contact information therein, &lt;br /&gt;
* … avoids excessive re-crawling,&lt;br /&gt;
* … respects [http://www.peej.co.uk/articles/http-caching.html HTTP cache headers] such as If-Modified-Since, Last-Modified and ETag when re-crawling.&lt;br /&gt;
&lt;br /&gt;
See [[Write Web Crawler]] for further guidelines.&lt;br /&gt;
&lt;br /&gt;
== Defensive measures ==&lt;br /&gt;
&lt;br /&gt;
If you run large web servers, you may want to consider [http://code.google.com/p/ldspider/wiki/ServerConfig defensive measures] against abuse and attacks.&lt;br /&gt;
&lt;br /&gt;
On Apache web servers, [http://www.fleiner.com/bots/#banning mod_rewrite can be used] to block bad crawlers based on their IP address or User-Agent string.&lt;br /&gt;
&lt;br /&gt;
There are several sites dedicated to collecting and sharing information about bad web crawlers in general:&lt;br /&gt;
&lt;br /&gt;
* [http://www.bot-trap.de/ BotTrap.de] (in German)&lt;br /&gt;
* …&lt;br /&gt;
&lt;br /&gt;
=== Stronger defenses using WebID ===&lt;br /&gt;
&lt;br /&gt;
The above measures have been around since the beginning of Web crawling, and suffer from a number of problems:&lt;br /&gt;
&lt;br /&gt;
* IP addresses are very bad identifiers&lt;br /&gt;
** they can be faked&lt;br /&gt;
** a large number of users can sit behind a single IP address. In the early Web (1995 to 1998), most addresses came through AOL proxies.&lt;br /&gt;
* headers can be faked or forgotten&lt;br /&gt;
* robots.txt works by convention only - it has no enforcement mechanism&lt;br /&gt;
** robot writers need to know about it, and this is not always an evident thing to understand&lt;br /&gt;
** not all users have access to robots.txt, so in any case it is not a very flexible mechanism for setting access control&lt;br /&gt;
&lt;br /&gt;
While these were perfectly fine in a world where there were few writers of robots and computing power for running such tools was expensive, they are no longer appropriate for a world where every laptop has more RAM and CPU than the largest machines search engines were running on in 1996.&lt;br /&gt;
What is required is strong and automatic access control that works in a distributed manner. But global authentication is required for this to work. Otherwise, robots would need to find the login page for every web site and create a username and password for themselves on that site, which is clearly an impossible task.&lt;br /&gt;
&lt;br /&gt;
Global Authentication tied into Linked Data is enabled by [[foaf+ssl|FOAF+SSL]], also known as [http://webid.info/spec/ WebID]. Both HTTP and HTTPS resources can be protected this way:&lt;br /&gt;
* HTTPS resources request client-side certificates according to the usual WebID protocol&lt;br /&gt;
* HTTP resources can use cookies and redirect clients to an HTTPS endpoint for authentication if the requestor has no cookie. If the client does not have a WebID-enabled certificate, OpenID or other methods of authentication can be used. Once authenticated, clients (and hence robots) can then be redirected to the HTTP resources and proceed as usual.&lt;br /&gt;
&lt;br /&gt;
The advantages of WebID are many:&lt;br /&gt;
* Robots and crawlers can identify themselves as :Crawler in their WebID Profile document (ontology to be developed), and so get access to special resources more useful to robots, such as full dumps or RSS feeds.&lt;br /&gt;
* Authentication is automatically enforced - so bad robot writers will very soon find out about it, as they won't get access until they do.&lt;br /&gt;
* WebIDs are distributed and can preserve anonymity while enabling authentication. WebIDs can be self-generated and throw-away. There is no center of control. &lt;br /&gt;
* Good WebID users can get better service over time, leading even anonymously-identified robots to pursue a strategy of long-term good behavior.&lt;br /&gt;
* Getting WebIDs is very easy, and most software libraries support client-side certificates, so it should only be a few hours' work for robot authors to enable their crawlers with it.&lt;br /&gt;
* Building WebID-enabled application servers is not that much work either.&lt;br /&gt;
&lt;br /&gt;
The [http://tinyurl.com/webidxg WebID Incubator Group] is very keen to work with robot writers and linked data publishers to help them WebID-enable their apps.&lt;br /&gt;
&lt;br /&gt;
== Incidents ==&lt;br /&gt;
&lt;br /&gt;
To report a poorly behaving crawler, please provide at least the following information:&lt;br /&gt;
&lt;br /&gt;
* Date of incident:&lt;br /&gt;
* What the crawler did wrong:&lt;br /&gt;
* User agent string:&lt;br /&gt;
* IP address range:&lt;br /&gt;
* Access logs (if possible):&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;/div&gt;</description>
			<pubDate>Thu, 23 Jun 2011 20:34:17 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/Talk:Bad_RDF_Crawlers</comments>		</item>
		<item>
			<title>Bad RDF Crawlers</title>
			<link>http://www.w3.org/wiki/Bad_RDF_Crawlers</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;Revert Karl Dubost edit. I think the list of bad crawlers is important. Made general improvements to page.&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page is intended as a list of poorly behaving crawlers that target RDF-publishing websites, unfortunately a recurring problem&amp;lt;ref&amp;gt;[http://lists.w3.org/Archives/Public/public-lod/2011Jun/0433.html Think before you write Semantic Web crawlers], public-lod post by Martin Hepp, 21 June 2011&amp;lt;/ref&amp;gt;. The list will allow publishers to defend themselves by blocking such crawlers.&lt;br /&gt;
&lt;br /&gt;
== Best practices for web crawlers ==&lt;br /&gt;
&lt;br /&gt;
Dereferencing is a privilege, not a right. Crawlers that don't use server resources considerately abuse that privilege. It has bad consequences for the Web in general.&lt;br /&gt;
&lt;br /&gt;
A well-behaved crawler …&lt;br /&gt;
&lt;br /&gt;
* … uses reasonable limits for default crawling speed and re-crawling delay,&lt;br /&gt;
* … obeys [http://www.robotstxt.org/robotstxt.html robots.txt],&lt;br /&gt;
* … obeys crawling speed limitations in robots.txt ([http://help.yahoo.com/l/us/yahoo/search/webcrawler/slurp-03.html Crawl-Delay]),&lt;br /&gt;
* … identifies itself properly with the [http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.43 User-Agent HTTP request header], including contact information therein, &lt;br /&gt;
* … avoids excessive re-crawling,&lt;br /&gt;
* … respects [http://www.peej.co.uk/articles/http-caching.html HTTP cache headers] such as If-Modified-Since, Last-Modified and ETag when re-crawling.&lt;br /&gt;
&lt;br /&gt;
See [[Write Web Crawler]] for further guidelines.&lt;br /&gt;
&lt;br /&gt;
== Defensive measures ==&lt;br /&gt;
&lt;br /&gt;
If you run large web servers, you may want to consider [http://code.google.com/p/ldspider/wiki/ServerConfig defensive measures] against abuse and attacks.&lt;br /&gt;
&lt;br /&gt;
On Apache web servers, [http://www.fleiner.com/bots/#banning mod_rewrite can be used] to block bad crawlers based on their IP address or User-Agent string.&lt;br /&gt;
&lt;br /&gt;
There are several sites dedicated to collecting and sharing information about bad web crawlers in general (not RDF-specific):&lt;br /&gt;
&lt;br /&gt;
* [http://www.bot-trap.de/ BotTrap.de] (in German)&lt;br /&gt;
* …&lt;br /&gt;
&lt;br /&gt;
== Incidents ==&lt;br /&gt;
&lt;br /&gt;
To report a poorly behaving crawler, please provide at least the following information:&lt;br /&gt;
&lt;br /&gt;
* Date of incident:&lt;br /&gt;
* What the crawler did wrong:&lt;br /&gt;
* User agent string:&lt;br /&gt;
* IP address range:&lt;br /&gt;
* Access logs (if possible):&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;/div&gt;</description>
			<pubDate>Thu, 23 Jun 2011 15:34:59 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/Talk:Bad_RDF_Crawlers</comments>		</item>
		<item>
			<title>Bad RDF Crawlers</title>
			<link>http://www.w3.org/wiki/Bad_RDF_Crawlers</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;Created page with &amp;quot;This page is intended as a list of poorly behaving crawlers that target RDF-publishing websites.  This will allow publishers to defend themselves by blocking such crawlers based …&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page is intended as a list of poorly behaving crawlers that target RDF-publishing websites.&lt;br /&gt;
&lt;br /&gt;
This will allow publishers to defend themselves by blocking such crawlers based on blocking of user agents or IP ranges.&lt;br /&gt;
&lt;br /&gt;
For background, see this public-lod thread: [http://lists.w3.org/Archives/Public/public-lod/2011Jun/0433.html Think before you write Semantic Web crawlers]&lt;br /&gt;
&lt;br /&gt;
== Best practices ==&lt;br /&gt;
Dereferencing is a privilege, not a right. Crawlers that don't use server resources considerately abuse that privilege. A well-behaved crawler must:&lt;br /&gt;
&lt;br /&gt;
* use reasonable limits for default crawling speed and re-crawling delay,&lt;br /&gt;
* obey [http://www.robotstxt.org/robotstxt.html robots.txt],&lt;br /&gt;
* obey crawling speed limitations in robots.txt ([http://help.yahoo.com/l/us/yahoo/search/webcrawler/slurp-03.html Crawl-Delay]),&lt;br /&gt;
* identify itself properly with the [http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.43 User-Agent HTTP request header], including contact information therein, &lt;br /&gt;
* avoid excessive re-crawling.&lt;br /&gt;
&lt;br /&gt;
== Incidents ==&lt;br /&gt;
&lt;br /&gt;
To report a poorly behaving crawler, please provide at least the following information:&lt;br /&gt;
&lt;br /&gt;
* Date of incident:&lt;br /&gt;
* What the crawler did wrong:&lt;br /&gt;
* User agent string:&lt;br /&gt;
* IP address range:&lt;/div&gt;</description>
			<pubDate>Wed, 22 Jun 2011 21:41:08 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/Talk:Bad_RDF_Crawlers</comments>		</item>
		<item>
			<title>HttpRange14Webography</title>
			<link>http://www.w3.org/wiki/HttpRange14Webography</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;/* See also */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;p align=&amp;quot;right&amp;quot;&amp;gt;&amp;lt;i&amp;gt; This is an old issue, and people are tired of it.&amp;lt;/i&amp;gt;  --Sandro Hawke, 2003&amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The chronology of a permathread.&lt;br /&gt;
&lt;br /&gt;
Not going to list the thousands of messages on the subject, but here are some highlights.&lt;br /&gt;
&lt;br /&gt;
* 2002-03-18 [http://www.w3.org/2002/03/18-tagmem-irc.html Discussion on TAG telcon]&lt;br /&gt;
* 2002-03-19 [http://lists.w3.org/Archives/Public/www-tag/2002Mar/0092.html TimBL &amp;quot;The range of the HTTP dereference function&amp;quot;] (look at the whole thread)&lt;br /&gt;
* 2002-03-25 [http://www.w3.org/2002/03/25-tag-summary TAG decides to accept the issue]&lt;br /&gt;
* 2002-03-25 [http://www.w3.org/2001/tag/issues.html?type=1#httpRange-14 Issue 14 entry in old TAG issues list]&lt;br /&gt;
* 2002-07-26 [http://www.w3.org/DesignIssues/HTTP-URI Tim Berners-Lee, &amp;quot;What do HTTP URIs Identify?&amp;quot;]&lt;br /&gt;
* 2002-09-24 [http://www.w3.org/2002/09/24-tag-summary#httpRange-14 TAG F2F discussion]&lt;br /&gt;
* 2003-01-12 [http://www.w3.org/2002/12/rdf-identifiers/ Sandro Hawke, &amp;quot;Disambiguating RDF Identifiers&amp;quot;]&lt;br /&gt;
* 2003-02-06 [http://www.w3.org/2003/02/06-tag-summary#httpRange-14 TAG F2F, vote to reopen issue 14 (with whiteboard photos)]&lt;br /&gt;
* 2003-07-24 [http://lists.w3.org/Archives/Public/www-tag/2003Jul/0317 Summary by Norman Walsh]&lt;br /&gt;
* 2004-04-20 [http://www.ontopia.net/topicmaps/materials/identitycrisis.html Curing the Web's Identity Crisis: Subject Indicators for RDF]&lt;br /&gt;
* 2004-05-13 [http://www.w3.org/2004/05/HTTPRange14.txt TimBL proposal, origin of &amp;quot;information resource&amp;quot;]&lt;br /&gt;
* 2004-10-07 [http://lists.w3.org/Archives/Public/www-tag/2004Oct/0014.html Dan C on IRs and copyright]&lt;br /&gt;
* 2004-10-14 [http://lists.w3.org/Archives/Public/www-tag/2004Oct/0101.html Sandro Hawke, &amp;quot;Referendum on httpRange-14&amp;quot;]&lt;br /&gt;
* 2004-10-25 [http://lists.w3.org/Archives/Public/www-tag/2004Oct/0161.html TimBL in &amp;quot;referendum&amp;quot; thread]&lt;br /&gt;
* 2005-05-31 [http://www.w3.org/2005/05/31-tagmem-minutes#item05 Telcon discussion]&lt;br /&gt;
* 2005-06-09 [http://www.w3.org/DesignIssues/HTTP-URI2 Tim Berners-Lee, &amp;quot;What do HTTP URIs Identify?&amp;quot;]&lt;br /&gt;
* 2005-06-14 [http://www.w3.org/2001/tag/2005/06/14-16-minutes#item023 TAG F2F discussion]&lt;br /&gt;
* 2005-06-18 [http://lists.w3.org/Archives/Public/www-tag/2005Jun/0039.html Roy Fielding, &amp;quot;httpRange-14 Resolved&amp;quot;]&lt;br /&gt;
* 2006-01-16 [http://www.w3.org/2000/10/swap/doc/Reach.html TimBL, &amp;quot;Reaching out onto the Web&amp;quot;]&lt;br /&gt;
* 2006-05-23 [http://www.w3.org/2006/04/irw65/urisym.html Dan Connolly, &amp;quot;A Pragmatic Theory of Reference for the Web&amp;quot;]&lt;br /&gt;
* 2007-02-18 [http://norman.walsh.name/2007/02/18/bigBang Norm Walsh, &amp;quot;Hacking httpRange-14&amp;quot;]&lt;br /&gt;
* 2007-07-12 [http://lists.w3.org/Archives/Public/www-tag/2007Jul/0034 Giovanni Tumarello on performance of 303]&lt;br /&gt;
* 2007-08-27 [http://lists.w3.org/Archives/Public/www-tag/2007Aug/0045.html &amp;quot;ISSUE-57: The use of HTTP Redirection&amp;quot;]&lt;br /&gt;
* 2007-08-28 TAG switches to use of tracker; [http://www.w3.org/2001/tag/group/track/actions/14 Issue 14 page]&lt;br /&gt;
* 2007-08-28 [http://www.w3.org/2001/tag/group/track/issues/57 Issue 57 page in tracker]&lt;br /&gt;
* 2007-10-04 [http://www.w3.org/2001/tag/doc/httpRange-14/2007-05-31/HttpRange-14 Rhys Lewis draft finding on the issue (withdrawn)]&lt;br /&gt;
* 2007-10-19 [http://www.dehora.net/journal/2007/10/19/fragged/ Bill de h&amp;amp;Oacute;ra, &amp;quot;Fragged&amp;quot;]&lt;br /&gt;
* 2007-11-24 [http://lists.w3.org/Archives/Public/www-tag/2007Nov/0029.html Pat Hayes, &amp;quot;Conforming is such sweet sorrow&amp;quot;]&lt;br /&gt;
* 2007-11-25 [http://lists.w3.org/Archives/Public/www-tag/2007Nov/0041.html Information resources] (www-tag thread)&lt;br /&gt;
* 2007-12-04 [http://lists.w3.org/Archives/Public/www-tag/2007Dec/0008.html Sean Palmer, &amp;quot;httpRange-14 Two Years On&amp;quot;]&lt;br /&gt;
* 2007-12-15 [http://lists.w3.org/Archives/Public/www-tag/2007Dec/0067.html What is an Information Resource?] (40 messages)&lt;br /&gt;
* 2008-04-29 [http://lists.w3.org/Archives/Public/public-awwsw/2008Apr/0046.html David Booth, definition of information resource as function]&lt;br /&gt;
* 2008-12-03 [http://www.w3.org/TR/2008/NOTE-cooluris-20081203/ Cool URIs for the Semantic Web]&lt;br /&gt;
* 2009-01-29 [http://lists.w3.org/Archives/Public/www-tag/2009Jan/0132.html Lisa Dusseault dissents]; [http://lists.w3.org/Archives/Public/www-tag/2009Jan/0135.html TimBL in same thread]&lt;br /&gt;
* 2009-06-25 [http://lists.w3.org/Archives/Public/semantic-web/2009Jun/0241.html Pat Hayes &amp;quot;.htaccess a major bottleneck to Semantic Web adoption&amp;quot;]&lt;br /&gt;
* 2009-08-01 [http://www.w3.org/mid/8618F212-C191-4CCC-9F27-6BF7829622FE@w3.org Proposed IETF/W3C task force: &amp;quot;Resource meaning: Review of new HTTPbis text for 303 See Other&amp;quot;]&lt;br /&gt;
* 2010-04-06 [http://lists.w3.org/Archives/Public/public-awwsw/2010Apr/0004.html Metadata subjects + 200 - a poll (thread)]&lt;br /&gt;
* 2010-07-01 [http://derivadow.com/2010/07/01/linked-things/ Tom Scott, &amp;quot;Linked things&amp;quot;]&lt;br /&gt;
* 2010-07-06 [http://www.w3.org/QA/2010/07/new_opportunities_for_linked_d.html J Rees, New opportunities for linked data nose-following]&lt;br /&gt;
* 2010-07-07 [http://inkdroid.org/journal/2010/07/07/linking-things-and-common-sense/ Ed Summers, Linking things and common sense]&lt;br /&gt;
* 2010-11-03 [http://iand.posterous.com/is-303-really-necessary Ian Davis, Is 303 Really Necessary?]&lt;br /&gt;
* 2010-11-07 [http://lists.w3.org/Archives/Public/public-lod/2010Nov/0270.html John Sheridan, &amp;quot;200 OK with Content-Location might work&amp;quot; caution] part of long public-lod thread&lt;br /&gt;
* 2010-11-09 [http://prototypo.blogspot.com/2010/11/another-guide-to-publishing-linked-data.html David Wood, &amp;quot;A(nother) Guide to Publishing Linked Data Without Redirects&amp;quot;]&lt;br /&gt;
* 2010-11-10 [http://tomheath.com/blog/2010/11/arguments-about-http-303-considered-harmful/ Tom Heath, Arguments about HTTP 303 Considered Harmful]&lt;br /&gt;
* 2011-01-16 [http://lists.w3.org/Archives/Public/public-awwsw/2011Jan/0012.html Manu Sporny on # and 303]&lt;br /&gt;
* 2011-01-20 [http://tools.ietf.org/html/draft-masinter-dated-uri-08 Larry Masinter, &amp;quot;The 'tdb' and 'duri' URI schemes&amp;quot;]&lt;br /&gt;
* 2011-01-20 [http://lists.w3.org/Archives/Public/public-awwsw/2011Jan/0021.html Harry Halpin on # and 303]&lt;br /&gt;
* 2011-02-01 [http://www.w3.org/2001/tag/awwsw/2011/status-2011-02.html AWWSW status report] (xhtml:license interop problem)&lt;br /&gt;
* 2011-02-09 [http://www.w3.org/2001/tag/2011/02/metadata-arch#slide9 JAR's slides for TAG F2F] (what you write with and without the httpRange-14 rule)&lt;br /&gt;
* 2011-03-03 [http://www.w3.org/2001/tag/2011/03/03-minutes#item03 TAG ISSUE-57 redescribed as &amp;quot;Mechanisms for obtaining information about the meaning of a given URI&amp;quot;]&lt;br /&gt;
&lt;br /&gt;
== See also ==&lt;br /&gt;
&lt;br /&gt;
* 1989-03-xx [http://www.w3.org/History/1989/proposal.html Information Management: A Proposal] see section &amp;quot;The problem with keywords&amp;quot;&lt;br /&gt;
* 1996-11-13 [http://www.w3.org/Architecture/NOTE-link.html Describing and Linking Web Resources]&lt;br /&gt;
* 1997-01-06 [http://www.w3.org/DesignIssues/Metadata Metadata architecture design note]&lt;br /&gt;
* 1997-05-14 [http://www.w3.org/Member/9705/WD-pics-ng-metadata-970514.html PICS-NG Metadata Model and Label Syntax]&lt;br /&gt;
* 1997-10-02 [http://www.w3.org/TR/WD-rdf-syntax-971002/ An early RDF draft] &amp;quot;RDF is a foundation for processing metadata&amp;quot;&lt;br /&gt;
* 1998-10-08 [http://www.w3.org/TR/1998/WD-rdf-syntax-19980216/ Another early RDF draft] &lt;br /&gt;
* 2000-xx-xx [http://www.w3.org/DesignIssues/Generic.html Generic Resources design note] first published&lt;br /&gt;
* 2003-05-21 [http://www.tbray.org/ongoing/When/200x/2003/05/21/RDFNet History of RDF (Tim Bray blog post on RDF.net)]&lt;/div&gt;</description>
			<pubDate>Mon, 20 Jun 2011 14:52:11 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/Talk:HttpRange14Webography</comments>		</item>
		<item>
			<title>User:Rcygania2/RulesOfThumb</title>
			<link>http://www.w3.org/wiki/User:Rcygania2/RulesOfThumb</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;/* Namespace URIs */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;A loose collection of [[SemWeb]]-related rules of thumb that I would propose as good practice.&lt;br /&gt;
&lt;br /&gt;
= Modelling a dataset in RDF =&lt;br /&gt;
&lt;br /&gt;
== Labels ==&lt;br /&gt;
* Make sure that everything has an rdfs:label, either directly specified, or by using some property that is defined as a subproperty of rdfs:label&lt;br /&gt;
* Don't be overly concerned with ambiguous labels; just consider the resource in isolation. That's because labels cannot do the job of disambiguation anyway, and trying to do it results in artificial and awkward labels.&lt;br /&gt;
&lt;br /&gt;
== Language tags ==&lt;br /&gt;
* On untyped literals, if the literal is likely to be understood only by speakers of a single language, then add a language tag. If it is likely to work for speakers of many languages, keep it without a language tag. If the file or dataset has only a few exceptions, then it is perhaps better to go for consistency and mark them the same way as the rest of the file.&lt;br /&gt;
* If you publish in multiple languages, then perhaps it's a good idea to include a plain literal in a “default language” without a language tag, to make SPARQLing easy.&lt;br /&gt;
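&lt;br /&gt;
A minimal sketch in Python of how a consumer might benefit from the untagged default. It assumes labels are held as (text, language tag) pairs, with None standing for a plain literal; pick_label and the sample labels are hypothetical and not taken from any RDF library.&lt;br /&gt;

```python
# Hypothetical helper: labels are (text, language_tag) pairs; None marks a
# plain literal without a language tag.
def pick_label(labels, preferred_lang):
    """Return the label in preferred_lang, else the untagged one, else None."""
    by_lang = {}
    for text, lang in labels:
        # Language tags compare case-insensitively; keep the first label per tag.
        key = lang.lower() if lang is not None else None
        by_lang.setdefault(key, text)
    if preferred_lang is not None and preferred_lang.lower() in by_lang:
        return by_lang[preferred_lang.lower()]
    # Fall back to the untagged default, if the publisher provided one.
    return by_lang.get(None)


labels = [("Vienna", None), ("Wien", "de"), ("Vienne", "fr")]
print(pick_label(labels, "de"))  # Wien
print(pick_label(labels, "ja"))  # Vienna
```

Without the untagged literal, the second call would find nothing to show, which is the situation the default is meant to avoid.&lt;br /&gt;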
&lt;br /&gt;
== Datatypes ==&lt;br /&gt;
* Avoid xsd:string. Just use a plain literal.&lt;br /&gt;
* For numbers, prefer xsd:decimal and xsd:integer because they are not restricted in accuracy/range.&lt;br /&gt;
* Avoid defining custom datatypes if you can. Better bake the literal semantics into properties.&lt;br /&gt;
* Issue: SKOS demands custom datatypes for skos:notation. Just ignore that?&lt;br /&gt;
* For units of measurement, prefer a pattern such as: ex:length [ ex:meter 5.21 ]&lt;br /&gt;
&lt;br /&gt;
= Designing vocabularies and ontologies =&lt;br /&gt;
&lt;br /&gt;
@@@ Interesting advice from TimBL: http://lists.w3.org/Archives/Public/public-lod/2011Apr/0282.html&lt;br /&gt;
&lt;br /&gt;
== Naming of properties ==&lt;br /&gt;
* Properties that point to documents (information resources) should have names that announce this fact, e.g. userProfile, userPage, userList, eventRecord, eventForm&lt;br /&gt;
* Relationship nouns make good property names, e.g. “parent” is better than “hasParent” or “isParentOf” (as per TimBL)&lt;br /&gt;
 Focus on one problem::&lt;br /&gt;
* [from an email to rdf-schema-dev on 2008-04-05] Some random half-formed thoughts: It's good if the vocabulary covers all my needs for a given problem. It's good if the vocabulary doesn't contain much extra stuff that I don't need to solve my problem. It's good if the purpose and coverage of the vocabulary can be conveyed in a short term or phrase (e.g. “document metadata” or “issue tracking”). It's good if the level of abstraction is consistent throughout the vocabulary, e.g. don't mix high-level concepts like Service and Container into your down-to-earth photo annotation vocabulary.&lt;br /&gt;
 Provide excellent documentation::&lt;br /&gt;
* [from an email to rdf-schema-dev on 2008-04-05] Random thoughts again: Some introductory narrative. A bunch of good examples for typical usages of the vocabulary. A UML-style overview diagram if the vocab has more than a few classes. Some tutorial-style text. An excellent reference section with all terms and notes about how they are supposed to be used (including notes on what they are NOT supposed to be used for).&lt;br /&gt;
&lt;br /&gt;
== Namespace URIs ==&lt;br /&gt;
From danbri in an email to the DC Architecture list, 30 March 2010 09:40:47 IST:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;blockquote&amp;gt;My current preference / advice (for new work) is for the managers of&lt;br /&gt;
each serious namespace to invest in a distinct domain name for it, and&lt;br /&gt;
for us as a community to come up with social machinery for 'watching&lt;br /&gt;
each other's backs' to ensure that the domains are kept in good&lt;br /&gt;
working order, fees are paid, etc. Sometimes an additional level of&lt;br /&gt;
indirection can add as much risk as it saves.  Initially when I bought&lt;br /&gt;
xmlns.com I have idea it could be a home for lots of namespaces, and&lt;br /&gt;
then the more I thought about it, the less I liked that idea. Each new&lt;br /&gt;
namespace added to the bucket brings some risk to the others using the&lt;br /&gt;
domain, by adding to the complexity and burden for subsequent&lt;br /&gt;
maintainers. So I think a proliferation of independent domain names,&lt;br /&gt;
while painful in its own way, spreads the risk...&amp;lt;/blockquote&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Later in the thread:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;blockquote&amp;gt;Rule of thumb - when wondering what info to include in a namespace&lt;br /&gt;
URI, ... try to leave *out* as much as possible&amp;lt;/blockquote&amp;gt;&lt;br /&gt;
&lt;br /&gt;
And [http://www.w3.org/mid/BANLkTikhWKT90gWq+L-5YJuqsPOdo5ECQA@mail.gmail.com even more concise]:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;blockquote&amp;gt;First rule of namespace URI design &amp;quot;you're more likely to regret&lt;br /&gt;
things you included, than things you omitted&amp;quot;.&amp;lt;/blockquote&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For hash namespaces, the RDF document containing the vocabulary should be typed as owl:Ontology and should be the target of any rdfs:isDefinedBy statements. Note, the RDF document's URI is ''the namespace URI without the trailing hash''.&lt;br /&gt;
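&lt;br /&gt;
A small Python sketch of the relation between a hash namespace, its terms and the vocabulary document. The namespace http://example.org/vocab# and the helper names are made up for illustration; only urldefrag comes from the standard library.&lt;br /&gt;

```python
from urllib.parse import urldefrag

# Hypothetical namespace URI, used only for illustration.
NAMESPACE = "http://example.org/vocab#"


def document_uri(uri):
    """Drop the fragment part, giving the URI of the vocabulary document."""
    return urldefrag(uri).url


def term_uri(namespace_uri, local_name):
    """Build a term URI by appending the local name to the namespace URI."""
    return namespace_uri + local_name


print(document_uri(NAMESPACE))                      # http://example.org/vocab
print(term_uri(NAMESPACE, "Person"))                # http://example.org/vocab#Person
print(document_uri(term_uri(NAMESPACE, "Person")))  # http://example.org/vocab
```

Every term in the namespace resolves to the same document URI, so that URI is the natural subject for the owl:Ontology typing and the natural object of rdfs:isDefinedBy.&lt;br /&gt;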
&lt;br /&gt;
= Publishing RDF on the Web =&lt;br /&gt;
&lt;br /&gt;
== Metadata in RDF documents ==&lt;br /&gt;
* Every RDF document has ''some'' relation to the thing(s) it talks about. It is useful to explicitly state what that relation is. For example, in my FOAF file I state that I'm both the foaf:maker and the foaf:primaryTopic of the file. Other useful properties are: rdfs:isDefinedBy, foaf:topic. A concrete benefit is that consumers can pick out the “important” things from the graph.&lt;br /&gt;
&lt;br /&gt;
== HTML descriptions of URI-denoted things ==&lt;br /&gt;
* To create trust in the stability, reliability and availability of a URI, its HTML description should explicitly state the URI, it should contain an explicit ''Statement of Purpose'' to the effect that the URI is intended to be used as an identifier for the thing, and it should contain a ''Publisher Identification''. It must provide sufficient information to enable human users to know exactly what is being referred to. (This is inspired by [http://www.oasis-open.org/committees/download.php/3050/pubsubj-pt1-1.02-cs.pdf Published Subjects].)&lt;br /&gt;
&lt;br /&gt;
== Content negotiation ==&lt;br /&gt;
* The benefit of content negotiation (CN) is that all URIs also work in a standard Web browser, not just in RDF-enabled tools and browsers. Thus it's great for authoring and debugging and when your URIs are exposed to a lot of neophytes (e.g. DBpedia, FOAF, DC). On the other hand, content negotiation is very hard to get right: the devil is in the details, and it has turned out to be quite an interop hassle in practice. So, CN should be thought of as icing on the cake, but not a requirement for publishing RDF.&lt;br /&gt;
* Rule of thumb: If your server solution does CN, then do CN. Otherwise, getting it right will be too much effort.&lt;br /&gt;
* Keep in mind the advice from [http://www.w3.org/2001/tag/doc/alternatives-discovery.html On Linking Alternative Representations]: Provide links between different variants to make them all accessible. This means, if some HTML can be returned in response to RDF/HTML negotiation, there should be an RDF icon nearby, which points to the RDF variant.&lt;br /&gt;
&lt;br /&gt;
== Blank nodes ==&lt;br /&gt;
* Should be avoided in general. Using a blank node is appropriate if the publisher thinks that no one should ever care about this resource except in the context of looking at another, identified, resource in the same RDF document, e.g. a geo:Point that exists solely to give the location of another resource. Another situation where a blank node would be appropriate is when used as an existential variable, but I've never seen them used that way in a Linked Data context.&lt;br /&gt;
&lt;br /&gt;
== Linked Data ==&lt;br /&gt;
* Have rdfs:label (or a subproperty thereof) on everything, always&lt;br /&gt;
* Have rdfs:label for the document URI, always&lt;br /&gt;
* Have a foaf:primaryTopic triple connecting document URI and main resource, always&lt;br /&gt;
* Have as much dc: metadata as possible on the document URI&lt;br /&gt;
* Think hard about possible external links, to other web pages and other RDF documents and entities. Provide as many as possible. These make all the difference.&lt;br /&gt;
&lt;br /&gt;
== URI design ==&lt;br /&gt;
* RDF URIs are always case sensitive, while from HTTP's point of view, some parts of the URI can change without changing any behaviour. So, be clear about the case of your URIs and stick to it once a decision is made. If in doubt, use as much lowercase as possible. (Story: L3S changed case of the domain name in their URIs, broke a Semantic Web Pipes demo.)&lt;br /&gt;
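&lt;br /&gt;
The mismatch can be sketched in Python as follows. The two example.org URIs and the helper name are hypothetical, and the helper assumes URIs without a user:password part, since it lowercases the whole authority component.&lt;br /&gt;

```python
from urllib.parse import urlsplit, urlunsplit


def lowercase_scheme_and_host(uri):
    """Lowercase only scheme and host; path, query and fragment keep their case."""
    parts = urlsplit(uri)
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       parts.path, parts.query, parts.fragment))


# Hypothetical URIs that differ only in the case of the host name.
a = "http://Example.org/people/Alice"
b = "http://example.org/people/Alice"

print(a == b)  # False: RDF compares URIs character by character
print(lowercase_scheme_and_host(a) == lowercase_scheme_and_host(b))  # True
```

An HTTP client treats both URIs as the same address, but an RDF store sees two unrelated nodes, so data that mixes the two spellings silently fails to join.&lt;br /&gt;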
&lt;br /&gt;
= Web Architecture =&lt;br /&gt;
&lt;br /&gt;
== Information resources ==&lt;br /&gt;
* [http://lists.w3.org/Archives/Public/www-tag/2007Sep/0123.html From Harry Halpin]: “If there is a URI that is used to identify a resource one would want to make logical statements about, and these statements do not apply to possible representations of that resource, then one should use the &amp;quot;hash&amp;quot; or 303 redirection to separate  these URIs.”&lt;/div&gt;</description>
			<pubDate>Thu, 09 Jun 2011 09:19:04 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/User_talk:Rcygania2/RulesOfThumb</comments>		</item>
		<item>
			<title>User:Rcygania2/RulesOfThumb</title>
			<link>http://www.w3.org/wiki/User:Rcygania2/RulesOfThumb</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;/* Designing vocabularies and ontologies */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;A loose collection of [[SemWeb]]-related rules of thumb that I would propose as good practice.&lt;br /&gt;
&lt;br /&gt;
= Modelling a dataset in RDF =&lt;br /&gt;
&lt;br /&gt;
== Labels ==&lt;br /&gt;
* Make sure that everything has an rdfs:label, either directly specified, or by using some property that is defined as a subproperty of rdfs:label&lt;br /&gt;
* Don't be overly concerned with ambiguous labels; just consider the resource in isolation. That's because labels cannot do the job of disambiguation anyway, and trying to do it results in artificial and awkward labels.&lt;br /&gt;
&lt;br /&gt;
== Language tags ==&lt;br /&gt;
* On untyped literals, if the literal is likely to be understood only by speakers of a single language, then add a language tag. If it is likely to work for speakers of many languages, keep it without a language tag. If the file or dataset has only a few exceptions, then it is perhaps better to go for consistency and mark them the same way as the rest of the file.&lt;br /&gt;
* If you publish in multiple languages, then perhaps it's a good idea to include a plain literal in a “default language” without a language tag, to make SPARQLing easy.&lt;br /&gt;
&lt;br /&gt;
== Datatypes ==&lt;br /&gt;
* Avoid xsd:string. Just use a plain literal.&lt;br /&gt;
* For numbers, prefer xsd:decimal and xsd:integer because they are not restricted in accuracy/range.&lt;br /&gt;
* Avoid defining custom datatypes if you can. Better bake the literal semantics into properties.&lt;br /&gt;
* Issue: SKOS demands custom datatypes for skos:notation. Just ignore that?&lt;br /&gt;
* For units of measurement, prefer a pattern such as: ex:length [ ex:meter 5.21 ]&lt;br /&gt;
&lt;br /&gt;
= Designing vocabularies and ontologies =&lt;br /&gt;
&lt;br /&gt;
@@@ Interesting advice from TimBL: http://lists.w3.org/Archives/Public/public-lod/2011Apr/0282.html&lt;br /&gt;
&lt;br /&gt;
== Naming of properties ==&lt;br /&gt;
* Properties that point to documents (information resources) should have names that announce this fact, e.g. userProfile, userPage, userList, eventRecord, eventForm&lt;br /&gt;
* Relationship nouns make good property names, e.g. “parent” is better than “hasParent” or “isParentOf” (as per TimBL)&lt;br /&gt;
 Focus on one problem::&lt;br /&gt;
* [from an email to rdf-schema-dev on 2008-04-05] Some random half-formed thoughts: It's good if the vocabulary covers all my needs for a given problem. It's good if the vocabulary doesn't contain much extra stuff that I don't need to solve my problem. It's good if the purpose and coverage of the vocabulary can be conveyed in a short term or phrase (e.g. “document metadata” or “issue tracking”). It's good if the level of abstraction is consistent throughout the vocabulary, e.g. don't mix high-level concepts like Service and Container into your down-to-earth photo annotation vocabulary.&lt;br /&gt;
 Provide excellent documentation::&lt;br /&gt;
* [from an email to rdf-schema-dev on 2008-04-05] Random thoughts again: Some introductory narrative. A bunch of good examples for typical usages of the vocabulary. A UML-style overview diagram if the vocab has more than a few classes. Some tutorial-style text. An excellent reference section with all terms and notes about how they are supposed to be used (including notes on what they are NOT supposed to be used for).&lt;br /&gt;
&lt;br /&gt;
== Namespace URIs ==&lt;br /&gt;
From danbri in an email to the DC Architecture list, 30 March 2010 09:40:47 IST:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;blockquote&amp;gt;My current preference / advice (for new work) is for the managers of&lt;br /&gt;
each serious namespace to invest in a distinct domain name for it, and&lt;br /&gt;
for us as a community to come up with social machinery for 'watching&lt;br /&gt;
each other's backs' to ensure that the domains are kept in good&lt;br /&gt;
working order, fees are paid, etc. Sometimes an additional level of&lt;br /&gt;
indirection can add as much risk as it saves.  Initially when I bought&lt;br /&gt;
xmlns.com I have idea it could be a home for lots of namespaces, and&lt;br /&gt;
then the more I thought about it, the less I liked that idea. Each new&lt;br /&gt;
namespace added to the bucket brings some risk to the others using the&lt;br /&gt;
domain, by adding to the complexity and burden for subsequent&lt;br /&gt;
maintainers. So I think a proliferation of independent domain names,&lt;br /&gt;
while painful in its own way, spreads the risk...&amp;lt;/blockquote&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Later in the thread:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;blockquote&amp;gt;Rule of thumb - when wondering what info to include in a namespace&lt;br /&gt;
URI, ... try to leave *out* as much as possible&amp;lt;/blockquote&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For hash namespaces, the RDF document containing the vocabulary should be typed as owl:Ontology and should be the target of any rdfs:isDefinedBy statements. Note, the RDF document's URI is ''the namespace URI without the trailing hash''.&lt;br /&gt;
&lt;br /&gt;
= Publishing RDF on the Web =&lt;br /&gt;
&lt;br /&gt;
== Metadata in RDF documents ==&lt;br /&gt;
* Every RDF document has ''some'' relation to the thing(s) it talks about. It is useful to explicitly state what that relation is. For example, in my FOAF file I state that I'm both the foaf:maker and the foaf:primaryTopic of the file. Other useful properties are: rdfs:isDefinedBy, foaf:topic. A concrete benefit is that consumers can pick out the “important” things from the graph.&lt;br /&gt;
&lt;br /&gt;
== HTML descriptions of URI-denoted things ==&lt;br /&gt;
* To create trust in the stability, reliability and availability of a URI, its HTML description should explicitly state the URI, it should contain an explicit ''Statement of Purpose'' to the effect that the URI is intended to be used as an identifier for the thing, and it should contain a ''Publisher Identification''. It must provide sufficient information to enable human users to know exactly what is being referred to. (This is inspired by [http://www.oasis-open.org/committees/download.php/3050/pubsubj-pt1-1.02-cs.pdf Published Subjects].)&lt;br /&gt;
&lt;br /&gt;
== Content negotiation ==&lt;br /&gt;
* The benefit of content negotiation (CN) is that all URIs also work in a standard Web browser, not just in RDF-enabled tools and browsers. Thus it's great for authoring and debugging and when your URIs are exposed to a lot of neophytes (e.g. DBpedia, FOAF, DC). On the other hand, content negotiation is very hard to get right: the devil is in the details, and it has turned out to be quite an interop hassle in practice. So, CN should be thought of as icing on the cake, but not a requirement for publishing RDF.&lt;br /&gt;
* Rule of thumb: If your server solution does CN, then do CN. Otherwise, getting it right will be too much effort.&lt;br /&gt;
* Keep in mind the advice from [http://www.w3.org/2001/tag/doc/alternatives-discovery.html On Linking Alternative Representations]: Provide links between different variants to make them all accessible. This means, if some HTML can be returned in response to RDF/HTML negotiation, there should be an RDF icon nearby, which points to the RDF variant.&lt;br /&gt;
&lt;br /&gt;
== Blank nodes ==&lt;br /&gt;
* Should be avoided in general. Using a blank node is appropriate if the publisher thinks that no one should ever care about this resource except in the context of looking at another, identified, resource in the same RDF document, e.g. a geo:Point that exists solely to give the location of another resource. Another situation where a blank node would be appropriate is when used as an existential variable, but I've never seen them used that way in a Linked Data context.&lt;br /&gt;
&lt;br /&gt;
== Linked Data ==&lt;br /&gt;
* Have rdfs:label (or a subproperty thereof) on everything, always&lt;br /&gt;
* Have rdfs:label for the document URI, always&lt;br /&gt;
* Have a foaf:primaryTopic triple connecting document URI and main resource, always&lt;br /&gt;
* Have as much dc: metadata as possible on the document URI&lt;br /&gt;
* Think hard about possible external links, to other web pages and other RDF documents and entities. Provide as many as possible. These make all the difference.&lt;br /&gt;
&lt;br /&gt;
== URI design ==&lt;br /&gt;
* RDF URIs are always case sensitive, while from HTTP's point of view, some parts of the URI can change without changing any behaviour. So, be clear about the case of your URIs and stick to it once a decision is made. If in doubt, use as much lowercase as possible. (Story: L3S changed case of the domain name in their URIs, broke a Semantic Web Pipes demo.)&lt;br /&gt;
&lt;br /&gt;
= Web Architecture =&lt;br /&gt;
&lt;br /&gt;
== Information resources ==&lt;br /&gt;
* [http://lists.w3.org/Archives/Public/www-tag/2007Sep/0123.html From Harry Halpin]: “If there is a URI that is used to identify a resource one would want to make logical statements about, and these statements do not apply to possible representations of that resource, then one should use the &amp;quot;hash&amp;quot; or 303 redirection to separate  these URIs.”&lt;/div&gt;</description>
			<pubDate>Tue, 26 Apr 2011 19:12:48 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/User_talk:Rcygania2/RulesOfThumb</comments>		</item>
		<item>
			<title>User:Rcygania2/RulesOfThumb</title>
			<link>http://www.w3.org/wiki/User:Rcygania2/RulesOfThumb</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;/* Modelling a dataset in RDF */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;A loose collection of [[SemWeb]]-related rules of thumb that I would propose as good practice.&lt;br /&gt;
&lt;br /&gt;
= Modelling a dataset in RDF =&lt;br /&gt;
&lt;br /&gt;
== Labels ==&lt;br /&gt;
* Make sure that everything has an rdfs:label, either directly specified, or by using some property that is defined as a subproperty of rdfs:label&lt;br /&gt;
* Don't be overly concerned with ambiguous labels; just consider the resource in isolation. That's because labels cannot do the job of disambiguation anyway, and trying to do it results in artificial and awkward labels.&lt;br /&gt;
&lt;br /&gt;
== Language tags ==&lt;br /&gt;
* On untyped literals, if the literal is likely to be understood only by speakers of a single language, then add a language tag. If it is likely to work for speakers of many languages, keep it without a language tag. If the file or dataset has only a few exceptions, then it is perhaps better to go for consistency and mark them the same way as the rest of the file.&lt;br /&gt;
* If you publish in multiple languages, then perhaps it's a good idea to include a plain literal in a “default language” without a language tag, to make SPARQLing easy.&lt;br /&gt;
&lt;br /&gt;
== Datatypes ==&lt;br /&gt;
* Avoid xsd:string. Just use a plain literal.&lt;br /&gt;
* For numbers, prefer xsd:decimal and xsd:integer because they are not restricted in accuracy/range.&lt;br /&gt;
* Avoid defining custom datatypes if you can. Better bake the literal semantics into properties.&lt;br /&gt;
* Issue: SKOS demands custom datatypes for skos:notation. Just ignore that?&lt;br /&gt;
* For units of measurement, prefer a pattern such as: ex:length [ ex:meter 5.21 ]&lt;br /&gt;
&lt;br /&gt;
= Designing vocabularies and ontologies =&lt;br /&gt;
&lt;br /&gt;
== Naming of properties ==&lt;br /&gt;
* Properties that point to documents (information resources) should have names that announce this fact, e.g. userProfile, userPage, userList, eventRecord, eventForm&lt;br /&gt;
* Relationship nouns make good property names, e.g. “parent” is better than “hasParent” or “isParentOf” (as per TimBL)&lt;br /&gt;
 Focus on one problem::&lt;br /&gt;
* [from an email to rdf-schema-dev on 2008-04-05] Some random half-formed thoughts: It's good if the vocabulary covers all my needs for a given problem. It's good if the vocabulary doesn't contain much extra stuff that I don't need to solve my problem. It's good if the purpose and coverage of the vocabulary can be conveyed in a short term or phrase (e.g. “document metadata” or “issue tracking”). It's good if the level of abstraction is consistent throughout the vocabulary, e.g. don't mix high-level concepts like Service and Container into your down-to-earth photo annotation vocabulary.&lt;br /&gt;
 Provide excellent documentation::&lt;br /&gt;
* [from an email to rdf-schema-dev on 2008-04-05] Random thoughts again: Some introductory narrative. A bunch of good examples for typical usages of the vocabulary. A UML-style overview diagram if the vocab has more than a few classes. Some tutorial-style text. An excellent reference section with all terms and notes about how they are supposed to be used (including notes on what they are NOT supposed to be used for).&lt;br /&gt;
&lt;br /&gt;
== Namespace URIs ==&lt;br /&gt;
From danbri in an email to the DC Architecture list, 30 March 2010 09:40:47 IST:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;blockquote&amp;gt;My current preference / advice (for new work) is for the managers of&lt;br /&gt;
each serious namespace to invest in a distinct domain name for it, and&lt;br /&gt;
for us as a community to come up with social machinery for 'watching&lt;br /&gt;
each other's backs' to ensure that the domains are kept in good&lt;br /&gt;
working order, fees are paid, etc. Sometimes an additional level of&lt;br /&gt;
indirection can add as much risk as it saves.  Initially when I bought&lt;br /&gt;
xmlns.com I have idea it could be a home for lots of namespaces, and&lt;br /&gt;
then the more I thought about it, the less I liked that idea. Each new&lt;br /&gt;
namespace added to the bucket brings some risk to the others using the&lt;br /&gt;
domain, by adding to the complexity and burden for subsequent&lt;br /&gt;
maintainers. So I think a proliferation of independent domain names,&lt;br /&gt;
while painful in its own way, spreads the risk...&amp;lt;/blockquote&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Later in the thread:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;blockquote&amp;gt;Rule of thumb - when wondering what info to include in a namespace&lt;br /&gt;
URI, ... try to leave *out* as much as possible&amp;lt;/blockquote&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For hash namespaces, the RDF document containing the vocabulary should be typed as owl:Ontology and should be the target of any rdfs:isDefinedBy statements. Note, the RDF document's URI is ''the namespace URI without the trailing hash''.&lt;br /&gt;
&lt;br /&gt;
= Publishing RDF on the Web =&lt;br /&gt;
&lt;br /&gt;
== Metadata in RDF documents ==&lt;br /&gt;
* Every RDF document has ''some'' relation to the thing(s) it talks about. It is useful to explicitly state what that relation is. For example, in my FOAF file I state that I'm both the foaf:maker and the foaf:primaryTopic of the file. Other useful properties are: rdfs:isDefinedBy, foaf:topic. A concrete benefit is that consumers can pick out the “important” things from the graph.&lt;br /&gt;
&lt;br /&gt;
== HTML descriptions of URI-denoted things ==&lt;br /&gt;
* To create trust in the stability, reliability and availability of a URI, its HTML description should explicitly state the URI, it should contain an explicit ''Statement of Purpose'' to the effect that the URI is intended to be used as an identifier for the thing, and it should contain a ''Publisher Identification''. It must provide sufficient information to enable human users to know exactly what is being referred to. (This is inspired by [http://www.oasis-open.org/committees/download.php/3050/pubsubj-pt1-1.02-cs.pdf Published Subjects].)&lt;br /&gt;
&lt;br /&gt;
== Content negotiation ==&lt;br /&gt;
* The benefit of content negotiation (CN) is that all URIs also work in a standard Web browser, not just in RDF-enabled tools and browsers. Thus it's great for authoring and debugging and when your URIs are exposed to a lot of neophytes (e.g. DBpedia, FOAF, DC). On the other hand, content negotiation is very hard to get right: the devil is in the details, and it has turned out to be quite an interop hassle in practice. So, CN should be thought of as icing on the cake, but not a requirement for publishing RDF.&lt;br /&gt;
* Rule of thumb: If your server solution does CN, then do CN. Otherwise, getting it right will be too much effort.&lt;br /&gt;
* Keep in mind the advice from [http://www.w3.org/2001/tag/doc/alternatives-discovery.html On Linking Alternative Representations]: Provide links between different variants to make them all accessible. This means, if some HTML can be returned in response to RDF/HTML negotiation, there should be an RDF icon nearby, which points to the RDF variant.&lt;br /&gt;
&lt;br /&gt;
== Blank nodes ==&lt;br /&gt;
* Should be avoided in general. Using a blank node is appropriate if the publisher thinks that no one should ever care about this resource except in the context of looking at another, identified, resource in the same RDF document, e.g. a geo:Point that exists solely to give the location of another resource. Another situation where a blank node would be appropriate is when used as an existential variable, but I've never seen them used that way in a Linked Data context.&lt;br /&gt;
&lt;br /&gt;
== Linked Data ==&lt;br /&gt;
* Have rdfs:label (or a subproperty thereof) on everything, always&lt;br /&gt;
* Have rdfs:label for the document URI, always&lt;br /&gt;
* Have a foaf:primaryTopic triple connecting document URI and main resource, always&lt;br /&gt;
* Have as much dc: metadata as possible on the document URI&lt;br /&gt;
* Think hard about possible external links, to other web pages and other RDF documents and entities. Provide as many as possible. These make all the difference.&lt;br /&gt;
&lt;br /&gt;
== URI design ==&lt;br /&gt;
* RDF URIs are always case sensitive, while from HTTP's point of view, some parts of the URI can change without changing any behaviour. So, be clear about the case of your URIs and stick to it once a decision is made. If in doubt, use as much lowercase as possible. (Story: L3S changed case of the domain name in their URIs, broke a Semantic Web Pipes demo.)&lt;br /&gt;
&lt;br /&gt;
= Web Architecture =&lt;br /&gt;
&lt;br /&gt;
== Information resources ==&lt;br /&gt;
* [http://lists.w3.org/Archives/Public/www-tag/2007Sep/0123.html From Harry Halpin]: “If there is a URI that is used to identify a resource one would want to make logical statements about, and these statements do not apply to possible representations of that resource, then one should use the &amp;quot;hash&amp;quot; or 303 redirection to separate  these URIs.”&lt;/div&gt;</description>
			<pubDate>Fri, 08 Apr 2011 12:20:06 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/User_talk:Rcygania2/RulesOfThumb</comments>		</item>
		<item>
			<title>User:Rcygania2/RulesOfThumb</title>
			<link>http://www.w3.org/wiki/User:Rcygania2/RulesOfThumb</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;/* Language tags */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;A loose collection of [[SemWeb]]-related rules of thumb that I would propose as good practice.&lt;br /&gt;
&lt;br /&gt;
= Modelling a dataset in RDF =&lt;br /&gt;
&lt;br /&gt;
== Labels ==&lt;br /&gt;
* Make sure that everything has an rdfs:label, either directly specified, or by using some property that is defined as a subproperty of rdfs:label&lt;br /&gt;
* Don't be overly concerned with ambiguous labels; just consider the resource in isolation. That's because labels cannot do the job of disambiguation anyway, and trying to do it results in artificial and awkward labels.&lt;br /&gt;
&lt;br /&gt;
== Language tags ==&lt;br /&gt;
* On untyped literals, if the literal is likely to be understood only by speakers of a single language, then add a language tag. If it is likely to work for speakers of many languages, keep it without a language tag. If the file or dataset has only a few exceptions, then it is perhaps better to go for consistency and mark them the same way as the rest of the file.&lt;br /&gt;
* If you publish in multiple languages, then perhaps it's a good idea to include a plain literal in a “default language” without a language tag, to make SPARQLing easy.&lt;br /&gt;
&lt;br /&gt;
= Designing vocabularies and ontologies =&lt;br /&gt;
&lt;br /&gt;
== Naming of properties ==&lt;br /&gt;
* Properties that point to documents (information resources) should have names that announce this fact, e.g. userProfile, userPage, userList, eventRecord, eventForm&lt;br /&gt;
* Relationship nouns make good property names, e.g. “parent” is better than “hasParent” or “isParentOf” (as per TimBL)&lt;br /&gt;
== Focus on one problem ==&lt;br /&gt;
* [from an email to rdf-schema-dev on 2008-04-05] Some random half-formed thoughts: It's good if the vocabulary covers all my needs for a given problem. It's good if the vocabulary doesn't contain much extra stuff that I don't need to solve my problem. It's good if the purpose and coverage of the vocabulary can be conveyed in a short term or phrase (e.g. “document metadata” or “issue tracking”). It's good if the level of abstraction is consistent throughout the vocabulary, e.g. don't mix high-level concepts like Service and Container into your down-to-earth photo annotation vocabulary.&lt;br /&gt;
== Provide excellent documentation ==&lt;br /&gt;
* [from an email to rdf-schema-dev on 2008-04-05] Random thoughts again: Some introductory narrative. A bunch of good examples for typical usages of the vocabulary. A UML-style overview diagram if the vocab has more than a few classes. Some tutorial-style text. An excellent reference section with all terms and notes about how they are supposed to be used (including notes on what they are NOT supposed to be used for).&lt;br /&gt;
&lt;br /&gt;
== Namespace URIs ==&lt;br /&gt;
From danbri in an email to the DC Architecture list, 30 March 2010 09:40:47 IST:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;blockquote&amp;gt;My current preference / advice (for new work) is for the managers of&lt;br /&gt;
each serious namespace to invest in a distinct domain name for it, and&lt;br /&gt;
for us as a community to come up with social machinery for 'watching&lt;br /&gt;
each other's backs' to ensure that the domains are kept in good&lt;br /&gt;
working order, fees are paid, etc. Sometimes an additional level of&lt;br /&gt;
indirection can add as much risk as it saves.  Initially when I bought&lt;br /&gt;
xmlns.com I have idea it could be a home for lots of namespaces, and&lt;br /&gt;
then the more I thought about it, the less I liked that idea. Each new&lt;br /&gt;
namespace added to the bucket brings some risk to the others using the&lt;br /&gt;
domain, by adding to the complexity and burden for subsequent&lt;br /&gt;
maintainers. So I think a proliferation of independent domain names,&lt;br /&gt;
while painful in its own way, spreads the risk...&amp;lt;/blockquote&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Later in the thread:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;blockquote&amp;gt;Rule of thumb - when wondering what info to include in a namespace&lt;br /&gt;
URI, ... try to leave *out* as much as possible&amp;lt;/blockquote&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For hash namespaces, the RDF document containing the vocabulary should be typed as owl:Ontology and should be the target of any rdfs:isDefinedBy statements. Note, the RDF document's URI is ''the namespace URI without the trailing hash''.&lt;br /&gt;
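&lt;br /&gt;
The hash namespace rule above can be sketched in a few lines of Python. This is only an illustration: the function names and the example.org namespace are hypothetical, and triples are shown as plain tuples of prefixed names rather than in any particular RDF syntax.&lt;br /&gt;

```python
def vocabulary_document_uri(namespace_uri):
    """Return the URI of the RDF document behind a hash namespace."""
    if not namespace_uri.endswith("#"):
        raise ValueError("not a hash namespace: " + namespace_uri)
    # Drop only the single trailing hash; the rest of the URI is untouched.
    return namespace_uri[:-1]


def describe_vocabulary(namespace_uri, term_names):
    """Build (subject, predicate, object) tuples for the rule of thumb."""
    document = vocabulary_document_uri(namespace_uri)
    # The document itself is typed as an ontology ...
    triples = [(document, "rdf:type", "owl:Ontology")]
    # ... and every term points back to it via rdfs:isDefinedBy.
    for name in term_names:
        triples.append((namespace_uri + name, "rdfs:isDefinedBy", document))
    return triples


if __name__ == "__main__":
    # Hypothetical namespace, used for illustration only.
    for triple in describe_vocabulary("http://example.org/vocab#", ["parent"]):
        print(triple)
```

Note that the terms keep the hash in their URIs, while the owl:Ontology resource does not.&lt;br /&gt;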
&lt;br /&gt;
= Publishing RDF on the Web =&lt;br /&gt;
&lt;br /&gt;
== Metadata in RDF documents ==&lt;br /&gt;
* Every RDF document has ''some'' relation to the thing(s) it talks about. It is useful to explicitly state what that relation is. For example, in my FOAF file I state that I'm both the foaf:maker and the foaf:primaryTopic of the file. Other useful properties are: rdfs:isDefinedBy, foaf:topic. A concrete benefit is that consumers can pick out the “important” things from the graph.&lt;br /&gt;
&lt;br /&gt;
== HTML descriptions of URI-denoted things ==&lt;br /&gt;
* To build trust in the stability, reliability and availability of a URI, its HTML description should explicitly state the URI, it should contain an explicit ''Statement of Purpose'' to the effect that the URI is intended to be used as an identifier for the thing, and it should contain a ''Publisher Identification''. It must provide sufficient information to enable human users to know exactly what is being referred to. (This is inspired by [http://www.oasis-open.org/committees/download.php/3050/pubsubj-pt1-1.02-cs.pdf Published Subjects].)&lt;br /&gt;
&lt;br /&gt;
== Content negotiation ==&lt;br /&gt;
* The benefit of CN is that all URIs also work in a standard Web browser, not just in RDF-enabled tools and browsers. Thus it's great for authoring and debugging and when your URIs are exposed to a lot of neophytes (e.g. DBpedia, FOAF, DC). On the other hand, content negotiation is very hard to get right, the devil is in the details and it has turned out to be quite an interop hassle in practice. So, CN should be thought of as icing on the cake, but not a requirement for publishing RDF.&lt;br /&gt;
* Rule of thumb: If your server solution does CN, then do CN. Otherwise, getting it right will be too much effort.&lt;br /&gt;
* Keep in mind the advice from [http://www.w3.org/2001/tag/doc/alternatives-discovery.html On Linking Alternative Representations]: Provide links between different variants to make them all accessible. This means, if some HTML can be returned in response to RDF/HTML negotiation, there should be an RDF icon nearby, which points to the RDF variant.&lt;br /&gt;
&lt;br /&gt;
== Blank nodes ==&lt;br /&gt;
* Should be avoided in general. Using a blank node is appropriate if the publisher thinks that no one should ever care about this resource except in the context of looking at another, identified, resource in the same RDF document, e.g. a geo:Point that exists solely to give the location of another resource. Another situation where a blank node would be appropriate is when used as an existential variable, but I've never seen them used that way in a Linked Data context.&lt;br /&gt;
&lt;br /&gt;
== Linked Data ==&lt;br /&gt;
* Have rdfs:label (or a subproperty thereof) on everything, always&lt;br /&gt;
* Have rdfs:label for the document URI, always&lt;br /&gt;
* Have a foaf:primaryTopic triple connecting document URI and main resource, always&lt;br /&gt;
* Have as much dc: metadata as possible on the document URI&lt;br /&gt;
* Think hard about possible external links, to other web pages and other RDF documents and entities. Provide as many as possible. These make all the difference.&lt;br /&gt;
&lt;br /&gt;
== URI design ==&lt;br /&gt;
* RDF URIs are always case sensitive, while from HTTP's point of view, some parts of the URI can change without changing any behaviour. So, be clear about the case of your URIs and stick to it once a decision is made. If in doubt, use as much lowercase as possible. (Story: L3S changed case of the domain name in their URIs, broke a Semantic Web Pipes demo.)&lt;br /&gt;
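&lt;br /&gt;
A minimal sketch of the mismatch described above, using only the Python standard library. The helper names and the example.org URIs are hypothetical. The sketch lowercases the scheme and host, which are the parts that HTTP clients treat as case insensitive, while RDF tools compare the full string.&lt;br /&gt;

```python
from urllib.parse import urlsplit, urlunsplit


def http_equivalent_form(uri):
    """Lowercase the scheme and host, which HTTP treats case-insensitively."""
    parts = urlsplit(uri)
    # Simplification: netloc is lowercased as a whole, so this sketch
    # assumes the URI carries no user:password section.
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       parts.path, parts.query, parts.fragment))


def same_rdf_identifier(uri_a, uri_b):
    """RDF compares URIs character by character."""
    return uri_a == uri_b


if __name__ == "__main__":
    # Hypothetical URIs that differ only in the case of the domain name.
    old = "http://www.example.org/people#alice"
    new = "http://WWW.Example.org/people#alice"
    print(same_rdf_identifier(old, new))
    print(http_equivalent_form(old) == http_equivalent_form(new))
```

Both URIs fetch the same document over HTTP, yet an RDF store treats them as two unrelated resources, which is why a change of case can break consumers.&lt;br /&gt;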
&lt;br /&gt;
= Web Architecture =&lt;br /&gt;
&lt;br /&gt;
== Information resources ==&lt;br /&gt;
* [http://lists.w3.org/Archives/Public/www-tag/2007Sep/0123.html From Harry Halpin]: “If there is a URI that is used to identify a resource one would want to make logical statements about, and these statements do not apply to possible representations of that resource, then one should use the &amp;quot;hash&amp;quot; or 303 redirection to separate  these URIs.”&lt;/div&gt;</description>
			<pubDate>Fri, 08 Apr 2011 12:15:57 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/User_talk:Rcygania2/RulesOfThumb</comments>		</item>
		<item>
			<title>SweoIG/TaskForces/CommunityProjects/LinkingOpenData/HyderabadGathering</title>
			<link>http://www.w3.org/wiki/SweoIG/TaskForces/CommunityProjects/LinkingOpenData/HyderabadGathering</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;/* Participants */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
== Hyderabad Linked Data Community Gathering ==&lt;br /&gt;
&lt;br /&gt;
=== Date, Time, Venue ===&lt;br /&gt;
&lt;br /&gt;
'''What''': The latest in the ongoing tradition of Linked Data gatherings, combined with the workshop dinner for [http://events.linkeddata.org/ldow2011 LDOW2011].&lt;br /&gt;
&lt;br /&gt;
'''Where''': TBD&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
'''When''': 29th March 2011, after [http://events.linkeddata.org/ldow2011 LDOW2011].&lt;br /&gt;
&lt;br /&gt;
=== Participants ===&lt;br /&gt;
&lt;br /&gt;
* [http://www.eurecom.fr/~troncy/ Raphaël Troncy] (EURECOM)&lt;br /&gt;
* [http://twitter.com/tomayac Thomas Steiner] (Google)&lt;br /&gt;
* [http://harth.org/andreas/ Andreas Harth] (KIT)&lt;br /&gt;
* [http://multimedialab.elis.ugent.be/dvdeurse Davy Van Deursen] (Ghent University - IBBT)&lt;br /&gt;
* [http://olafhartig.de/ Olaf Hartig] (Humboldt-Universität zu Berlin)&lt;br /&gt;
* [http://www.inf.puc-rio.br/~dschwabe/ Daniel Schwabe] (PUC-Rio)&lt;br /&gt;
* [http://www.w3.org/wiki/User:Hglaser3 Hugh Glaser] (ECS - Southampton)&lt;br /&gt;
* [http://richard.cyganiak.de/ Richard Cyganiak] (DERI, NUI Galway)&lt;br /&gt;
* ADD yourself here&lt;/div&gt;</description>
			<pubDate>Mon, 21 Mar 2011 11:34:16 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/Talk:SweoIG/TaskForces/CommunityProjects/LinkingOpenData/HyderabadGathering</comments>		</item>
		<item>
			<title>ConverterToRdf</title>
			<link>http://www.w3.org/wiki/ConverterToRdf</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;mention any23 and RDF Extension for Google Refine&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
A Converter to RDF is a tool which converts application data from an application-specific format into RDF for use with RDF tools and integration with other data. Converters may be part of a one-time migration effort, or part of a running system which provides a semantic web view of a given application. See also: RDFImportersAndAdapters&lt;br /&gt;
&lt;br /&gt;
Please add converters as you make them or hear of them.&lt;br /&gt;
&lt;br /&gt;
== Formats ==&lt;br /&gt;
&lt;br /&gt;
in alphabetical order:&lt;br /&gt;
&lt;br /&gt;
=== [[BibTex]] ===&lt;br /&gt;
&lt;br /&gt;
[[BibTex]] is the format for bibliographic references in TeX.&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/bibtex2rdf/ bibtex2rdf] transforms BibTeX files into RDF/XML. (Simile)&lt;br /&gt;
* [http://www.l3s.de/~siberski/bibtex2rdf/ bibtex2rdf] - A configurable BibTeX to RDF converter by Wolf Siberski.&lt;br /&gt;
* [http://www.cs.vu.nl/%7Emcaklein/bib2rdf/ An online service] set up at the Vrije Universiteit in Amsterdam, the Netherlands, following the [http://ontoweb.aifb.uni-karlsruhe.de/ OntoWeb portal] vocabulary. The [http://www.cs.vu.nl/%7Emcaklein/bib2rdf/bib2rdf perl source] can also be downloaded.&lt;br /&gt;
* [http://www.aifb.uni-karlsruhe.de/WBS/pha/bib/index.html Java BibTeX-To-RDF Converter] based on the [http://ontobroker.semanticweb.org/ontos/swrc.html SWRC] terminology.&lt;br /&gt;
&lt;br /&gt;
=== Bittorrent ===&lt;br /&gt;
&lt;br /&gt;
* http://www.inf.unideb.hu/~jeszy/rdfizers is alas now 404 (in 2007). This was a link from RDFizers but may be incorrect.&lt;br /&gt;
&lt;br /&gt;
=== Debian  ===&lt;br /&gt;
&lt;br /&gt;
The package information in Debian and similar systems (Ubuntu, Fink, etc), with its general usefulness and its graph-like nature, is a clear candidate for conversion to RDF.&lt;br /&gt;
&lt;br /&gt;
See [http://blog.drinsama.de/erich/en/xml/2007011204-rdf-representation-of-packages VitaVoni blog] about this.&lt;br /&gt;
&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/util/fink2n3.py fink2n3.py] takes Fink (OS X port of Debian packaging) dependencies and converts them to RDF/N3. (SWAP) It is unclear whether this could serve as a quick hack for exporting Debian data.&lt;br /&gt;
* [http://github.com/nbarrientos/steamy STEAMY] converts Debian packages to RDF.&lt;br /&gt;
&lt;br /&gt;
=== Email (RFC822 headers) ===&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/email2rdf/ email2rdf] transforms email mbox files into RDF/XML. (Simile)&lt;br /&gt;
* [http://www.w3.org/2000/04/maillog2rdf/aboutMsg.py aboutMsg.py] converts email metadata to RDF. (SWAP)&lt;br /&gt;
* [http://swaml.berlios.de/ SWAML] transforms a mailing list into RDF/XML and XHTML+RDFa using [[SIOC]].&lt;br /&gt;
** [http://linkedmarkmail.wikier.org/ LinkedMarkMail] transforms the mailing list archives indexed by [http://markmail.org/ MarkMail] into RDF/XML on the fly.&lt;br /&gt;
* [http://search.cpan.org/dist/Email-MIME-XMTP/ Email::MIME::XMTP] Perl extension to read and write [http://www.openhealth.org/xmtp/ XMTP] &lt;br /&gt;
* [http://aperture.sourceforge.net aperture.sf.net] IMAP crawler&lt;br /&gt;
&lt;br /&gt;
There are others in this vein which run over IMAP or mailbox files.@@&lt;br /&gt;
&lt;br /&gt;
=== Excel ===&lt;br /&gt;
&lt;br /&gt;
* Cambridge Semantics' [http://www.cambridgesemantics.com/products/anzo_for_excel Anzo for Excel] extracts RDF data from Excel spreadsheets while keeping the spreadsheet in sync with the underlying data as things change.&lt;br /&gt;
* [http://www.topbraidcomposer.com TopBraid Composer] can convert Excel spreadsheets into instances of an RDF schema.&lt;br /&gt;
* [http://xlwrap.sourceforge.net XLWrap] wraps spreadsheets (including cross tables) to arbitrary RDF graphs; supports Excel/OpenDocument/CSV streamed processing, local/HTTP loading, expressions similar to Excel/OpenOffice Calc, custom functions, usage via API or SPARQL endpoint&lt;br /&gt;
* [http://www.tao-project.eu/researchanddevelopment/demosanddownloads/RDBToOnto.html RDBToOnto], see the description below in the SQL section.&lt;br /&gt;
&lt;br /&gt;
=== EXIF ===&lt;br /&gt;
&lt;br /&gt;
See JPEG.&lt;br /&gt;
&lt;br /&gt;
=== File Systems ===&lt;br /&gt;
&lt;br /&gt;
* [http://www.cs.univie.ac.at/publication.php?pid=5750 TripFS] exposes an entire file system as linked data, tracks changes, and links files to external data sources.&lt;br /&gt;
&lt;br /&gt;
=== Flickr data ===&lt;br /&gt;
&lt;br /&gt;
* Dave Beckett's [http://librdf.org/flickcurl/ Flickcurl] library can access Flickr information (including machine tags) and convert it to RDF.&lt;br /&gt;
&lt;br /&gt;
=== Flat files ===&lt;br /&gt;
&lt;br /&gt;
Unix systems store data (such as /etc/passwd) in flat files whose fields are separated by a delimiter character (a colon in the case of /etc/passwd).&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/flat2rdf/ flat2rdf] converts classic Unix text database files, like /etc/passwd, into RDF/N3. (Simile)&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/tab2n3.py tab2n3.py] takes tab-separated text (as typically output by all kinds of things including Microsoft Outlook and spreadsheets) and converts it to N3, using the column headings to generate property URIs. (SWAP)&lt;br /&gt;
* [http://www.topbraidcomposer.com TopBraid Composer] can convert tab-separated spreadsheet files into an RDF/OWL class with corresponding properties and instances.&lt;br /&gt;
* [http://xlwrap.sourceforge.net XLWrap] wraps CSV files (and spreadsheets) to arbitrary RDF graphs; supports local/HTTP loading, expressions similar to Excel/OpenOffice Calc, custom functions, usage via API or SPARQL endpoint&lt;br /&gt;
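&lt;br /&gt;
The general idea behind these flat-file converters, deriving one property URI per column heading, can be sketched as follows. This is not the code of any tool listed above; the function name, the row numbering scheme and the example.org base URIs are assumptions made for illustration, and triples are returned as plain tuples rather than serialized as N3.&lt;br /&gt;

```python
import csv
import io


def tsv_to_triples(text, row_base, property_base):
    """Turn tab-separated text into (subject, property URI, value) tuples."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    headings = next(reader)
    # One property URI per column, derived from the column heading.
    properties = [property_base + h.strip().replace(" ", "_") for h in headings]
    triples = []
    for number, row in enumerate(reader, start=1):
        # Assumption: each data row becomes one resource, named by its position.
        subject = row_base + str(number)
        for prop, value in zip(properties, row):
            if value != "":  # skip empty cells instead of emitting empty literals
                triples.append((subject, prop, value))
    return triples


if __name__ == "__main__":
    sample = "name\thome town\nAlice\tGalway\nBob\t\n"
    for triple in tsv_to_triples(sample, "http://example.org/row/",
                                 "http://example.org/prop#"):
        print(triple)
```

A real converter would also need to decide how to mint stable subject URIs and how to escape headings that contain characters not allowed in URIs.&lt;br /&gt;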
&lt;br /&gt;
=== GPS ===&lt;br /&gt;
&lt;br /&gt;
* [http://www.hackdiary.com/archives/000040.html garmin2rdf.py] Reads a Garmin GPS receiver, dumping the contents in RDF/XML. (Matt Biddulph)&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/pim/fromGarmin.py fromGarmin.py] Downloads GPS data from a Garmin on a serial link to an RDF/N3 file. (SWAP)&lt;br /&gt;
&lt;br /&gt;
=== iCalendar ===&lt;br /&gt;
&lt;br /&gt;
iCalendar is an IETF standard for calendar (event and to-do list) data.  &lt;br /&gt;
iCalendar files are typically stored with a .ics extension.&lt;br /&gt;
&lt;br /&gt;
* [http://www.w3.org/2002/12/cal/fromIcal.py fromIcal.py] converts iCalendar data to RDF.&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/pim/toIcal.py toIcal.py] converts RDF back into iCalendar.&lt;br /&gt;
* [http://aperture.sourceforge.net aperture.sf.net] includes a Java converter for iCalendar.&lt;br /&gt;
* [http://torrez.us/ics2rdf/ iCal to RDF Service]&lt;br /&gt;
&lt;br /&gt;
=== Java bytecode ===&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/java2rdf/ java2rdf] scans [http://java.sun.com/ java] bytecode for method calls and creates a description of the dependencies between classes and the package/archive encoded in RDF/N3. (Simile)&lt;br /&gt;
&lt;br /&gt;
=== Javadoc ===&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/javadoc2rdf/ javadoc2rdf] is a doclet that makes javadoc output metadata about your code (structure of the classes, methods, comments, etc.) encoded in RDF/N3. (Simile) &lt;br /&gt;
&lt;br /&gt;
=== Issue tracking: [http://www.atlassian.com/software/jira/ Jira] ===&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/jira2rdf/ jira2rdf]  transforms Atlassian Jira's events about bug reports and issue tracking into RDF/N3.&lt;br /&gt;
&lt;br /&gt;
=== JPEG ===&lt;br /&gt;
&lt;br /&gt;
The metadata within a JPEG photo is encoded according to the EXIF standard.&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/jpeg2rdf/ jpeg2rdf] scans a folder for JPEG files, parses the EXIF and IPTC metadata found in those files and dumps an RDF/N3 representation of it into a file. (Simile)&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/pim/jhead/ An adapted version of jhead] extracts RDF data from the EXIF metadata embedded in JPEG files within a directory. Generates RDF/N3. (SWAP)&lt;br /&gt;
&lt;br /&gt;
=== LDIF ===&lt;br /&gt;
&lt;br /&gt;
This is the format used for contact information in LDAP server systems. It is, for example, exported by Thunderbird's address book.&lt;br /&gt;
&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/pim/ldif2n3.py ldif2n3.py] Very incomplete, but useful. Generates FOAF. Hides email addresses by hashing in the FOAF style if the -m command flag is given. (SWAP)&lt;br /&gt;
&lt;br /&gt;
=== Makefile ===&lt;br /&gt;
&lt;br /&gt;
The Unix Makefile syntax expresses dependencies between files in a software build.&lt;br /&gt;
&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/util/make2n3.py make2n3.py] Converts the makefiles in several directories into RDF and merges them to get the big picture. (SWAP)&lt;br /&gt;
&lt;br /&gt;
=== MARC ===&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/marcmods2rdf/ marcmods2rdf] &lt;br /&gt;
transforms [http://www.loc.gov/marc/ MARC] records from Z39.2 format into [http://www.loc.gov/standards/mods/ MODS] and then from MODS to an RDF representation of MODS.&lt;br /&gt;
&lt;br /&gt;
=== Meteorological ===&lt;br /&gt;
* [http://inamidst.com/sw/meteo/ Meteo] is UK weather forecast data in RDF, extracted from NOAA's public domain GRIB files. Example: [http://inamidst.com/sw/meteo/rdf/London London].&lt;br /&gt;
&lt;br /&gt;
=== Microformats ===&lt;br /&gt;
* [http://developers.any23.org/ any23] is a Java library for parsing multiple formats to RDF, including many microformats. It is used by [http://sindice.com/ Sindice.com]. The microformats support is [http://sindice.com/developers/microformat detailed in the Sindice.com documentation].&lt;br /&gt;
&lt;br /&gt;
=== Multimedia ===&lt;br /&gt;
&lt;br /&gt;
Following the [http://en.wikipedia.org/wiki/Don't_repeat_yourself DRY principle], a pointer to tools in the realm of multimedia (origin: [http://www.w3.org/2005/Incubator/mmsem MMSEM-XG]):&lt;br /&gt;
&lt;br /&gt;
* [http://www.w3.org/2005/Incubator/mmsem/wiki/Tools_and_Resources Multimedia Semantics Tools]&lt;br /&gt;
&lt;br /&gt;
=== OAI-PMH ===&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/oai2rdf/ oai2rdf] harvests an [http://www.openarchives.org/OAI/openarchivesprotocol.html OAI-PMH] repository and transforms the captured metadata into an RDF representation through pluggable XSLT stylesheets.&lt;br /&gt;
&lt;br /&gt;
=== Outlook ===&lt;br /&gt;
&lt;br /&gt;
Microsoft Outlook stores contact data, event data, and so on in a proprietary format.&lt;br /&gt;
&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/pim/lookout.py Lookout.py] converts the Microsoft Outlook calendar and address format into RDF. (SWAP)&lt;br /&gt;
* [http://aperture.sourceforge.net aperture.sf.net] includes a Java crawler for MS Outlook.&lt;br /&gt;
&lt;br /&gt;
=== Open Financial Exchange (OFX) ===&lt;br /&gt;
&lt;br /&gt;
[http://www.ofx.net/ OFX] is the format for downloaded bank statements and other financial information.&lt;br /&gt;
There are various levels of OFX, the early ones being HTTP headers followed by SGML, the later ones being HTTP-like headers followed by XML.&lt;br /&gt;
&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/pim/financial/OFX-to-n3.py OFX-to-n3.py] converts OFX format to RDF/N3. The conversion is only syntactic. The OFX modeling is pretty well thought out, so taking it as defining an RDF ontology seems to make sense. Rules can then be used to define a mapping into your favorite ontology.&lt;br /&gt;
&lt;br /&gt;
=== Open [[CourseWare]] ===&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/ocw2rdf/ ocw2rdf] harvests metadata from the MIT [http://ocw.mit.edu/ OpenCourseWare] web site and transforms it into an RDF representation of [http://meta.wikimedia.org/wiki/IEEE_LOM  IEEE LOM].&lt;br /&gt;
&lt;br /&gt;
=== Palm OS ===&lt;br /&gt;
&lt;br /&gt;
* [http://dev.w3.org/cvsweb/2001/palmagent Palmagent] converts the calendar format of PalmOS into RDF. (SWAP)&lt;br /&gt;
&lt;br /&gt;
=== plist ===&lt;br /&gt;
&lt;br /&gt;
The Apple OS X property list (.plist) filetype is an XML format for arbitrary structured data.&lt;br /&gt;
Numeric keys are used as local IDs. OS X applications store many kinds of data in these files, including configuration data, iPhoto album and photo data, iTunes metadata, and so on.&lt;br /&gt;
&lt;br /&gt;
To convert plists well, added information is necessary, such as a namespace for the properties.&lt;br /&gt;
&lt;br /&gt;
[http://dev.w3.org/cvsweb/2000/10/swap/util/plist2rdf.xsl plist2rdf.xsl] is an XSLT script to convert a plist file into RDF/XML. It does not add namespaces to the exported data.&lt;br /&gt;
&lt;br /&gt;
=== Quicken Interchange Format (QIF) ===&lt;br /&gt;
&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/pim/qif2n3.py qif2n3.py] Takes Quicken interchange format and converts it to RDF/N3. (SWAP)&lt;br /&gt;
&lt;br /&gt;
=== Quick and Dirty CSV to RDF Converter (QUIDICRC) ===&lt;br /&gt;
&lt;br /&gt;
* [http://www.mindswap.org/~anant/quidicrc/ quidicrc] A Perl script for rapidly converting CSV to RDF with some translation in the middle. (not actively maintained, available as open source -- SWAP)&lt;br /&gt;
&lt;br /&gt;
=== Random ===&lt;br /&gt;
&lt;br /&gt;
Seriously.&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/random2rdf/ random2rdf] generates synthetic random graphs encoded in RDF/N3.&lt;br /&gt;
&lt;br /&gt;
=== Spreadsheet ===&lt;br /&gt;
&lt;br /&gt;
* An [http://lab.linkeddata.deri.ie/2010/grefine-rdf-extension/ RDF Extension] is available for [http://code.google.com/p/google-refine/ Google Refine]. It can export Excel, CSV, and other tabular data to RDF. The schema mapping can be defined in a graphical UI.&lt;br /&gt;
* [http://www.mindswap.org/%7Erreck/excel2rdf.shtml Excel2rdf] is a Microsoft Windows program (exe) that converts Excel files into valid RDF. It has been tested on Windows 98 and Windows 2000 Professional. ([[MindSwap]]) Export can be done via comma- or tab-separated values. See Flat files above.&lt;br /&gt;
* [http://aperture.sourceforge.net aperture.sf.net] includes a Java crawler for Excel and OpenDocument files. It only extracts plain text and basic metadata, though.&lt;br /&gt;
* [http://rdf123.umbc.edu/ RDF123] is available as downloadable Windows and Linux applications, a Java application, and a servlet.&lt;br /&gt;
* Cambridge Semantics' [http://www.cambridgesemantics.com/products/anzo_for_excel Anzo for Excel] extracts RDF data from Excel spreadsheets while keeping the spreadsheet in sync with the underlying data as things change.&lt;br /&gt;
* [http://xlwrap.sourceforge.net XLWrap] wraps spreadsheets (including cross tables) to arbitrary RDF graphs; supports Excel/OpenDocument/CSV streamed processing, local/HTTP loading, expressions similar to Excel/OpenOffice Calc, custom functions, usage via API or SPARQL endpoint&lt;br /&gt;
&lt;br /&gt;
=== SQL ===&lt;br /&gt;
&lt;br /&gt;
SQL databases are rich stores of relational data ideal for export as RDF. &lt;br /&gt;
Conference tracks and many papers cover this subject from different angles. See also: [[RdfAndSql]]&lt;br /&gt;
&lt;br /&gt;
* [http://sites.wiwiss.fu-berlin.de/suhl/bizer/d2r-server/ D2R Server] provides a mapping from a SQL server (tested with several brands), producing both linked virtual RDF data files and a SPARQL service.  Uses a configuration file in N3. (Bizer et al., Freie Universität Berlin)&lt;br /&gt;
* [http://www.w3.org/2000/10/swap/dbork/dbview.py dbview.py] provides a mapping from a SQL server (tested with MySQL), producing linked virtual RDF data files. Uses a configuration file in N3. (SWAP)&lt;br /&gt;
* [[VirtuosoUniversalServer|OpenLink Virtuoso]]'s [http://virtuoso.openlinksw.com/wiki/main/Main/VOSSQLRDF declarative N3/Turtle based Metaschema Language] enables the creation of RDF Instance Data for associated RDF Ontologies via RDF VIEWs of ODBC, JDBC, ADO.NET, and OLE-DB accessible SQL Data. It is important to note that these VIEWs also apply to Native Virtuoso Data and/or Heterogeneous Data from other Web Services, HTTP/WebDAV, NNTP, and other Data Sources known to Virtuoso. This is an enhancement of the traditional SQL VIEW concept that enables multiple use of the same base SQL Data from a variety of data access points.&lt;br /&gt;
* [http://triplify.org Triplify] is a small plugin for Web applications, which reveals the semantic structures encoded in relational databases by making database content available as RDF, JSON or Linked Data.&lt;br /&gt;
* [http://www.tao-project.eu/researchanddevelopment/demosanddownloads/RDBToOnto.html RDBToOnto] is a full-fledged conversion tool that can produce accurate RDF/OWL models from various types of relational databases and Excel spreadsheets. The conversion is fully automated while various parameters can be set through the user interface to refine the resulting models (e.g., derivation of rich class hierarchies, proper naming of instances, database optimization before conversion, etc).  &lt;br /&gt;
&lt;br /&gt;
Many RDF triple stores are implemented using SQL databases, but that is not covered here.&lt;br /&gt;
&lt;br /&gt;
=== Subversion ===&lt;br /&gt;
&lt;br /&gt;
[http://subversion.tigris.org/ Subversion] is a code-management system.&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/svn2rdf/ svn2rdf] A pair of scripts; one can be used in a post-commit subversion hook to generate RDF/N3 with each commit, the other on a working copy. (Simile)&lt;br /&gt;
&lt;br /&gt;
=== Tab Separated Text ===&lt;br /&gt;
&lt;br /&gt;
See flat files.&lt;br /&gt;
&lt;br /&gt;
=== Talis SW Format Converter ===&lt;br /&gt;
&lt;br /&gt;
* [http://convert.test.talis.com/ Talis' converter] converts between various formats (including RDF-&amp;gt;RDF with various serializations, RDF-&amp;gt;HTML, etc.)&lt;br /&gt;
&lt;br /&gt;
=== UML ===&lt;br /&gt;
&lt;br /&gt;
* [http://www.topbraidcomposer.com TopBraid Composer] can convert UML Class Diagrams (XMI format) into RDF/OWL models.&lt;br /&gt;
* [http://eulergui.sourceforge.net/ EulerGUI] is a lightweight IDE that translates UML and eCore XMI into N3 on the fly. It also provides N3 rules to convert UML to OWL.&lt;br /&gt;
&lt;br /&gt;
=== VCARD, Addressbook, … ===&lt;br /&gt;
&lt;br /&gt;
VCARD is a standard for interchange of contact data, such as business cards and address books.&lt;br /&gt;
&lt;br /&gt;
[http://www.w3.org/TR/vcard-rdf &amp;quot;Representing vCard Objects in RDF/XML&amp;quot;] is a W3C note defining an [http://www.w3.org/2006/vcard/ns ontology] for VCARD. FOAF is a widely used ontology covering part of the same domain.&lt;br /&gt;
&lt;br /&gt;
* [http://www.holygoat.co.uk/applications/address-book-foaf/projects/ab/ab.py Code to convert your Apple Address Book into a FOAF file] (Richard Newman)&lt;br /&gt;
* [http://people.no-distance.net/ol/software/ab-foaf/ ab-foaf] does the same. &lt;br /&gt;
* [http://search.cpan.org/dist/XML-FOAFKnows-FromvCard/ XML::FOAFKnows::FromvCard], Perl extension to create FOAF dumps from vCards. Does not attempt to create a full model, just foaf:knows. It also has some privacy features. In addition to the module, which conforms with the Formatter API specification, it comes with a command-line tool.&lt;br /&gt;
&lt;br /&gt;
=== Weather ===&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/repository/RDFizers/weather2rdf/ weather2rdf] Given a US city or ZIP code, retrieves weather report data from weather.com and returns it in RDF. (Simile)&lt;br /&gt;
&lt;br /&gt;
=== XML ===&lt;br /&gt;
&lt;br /&gt;
* '''GRDDL:''' Any XML file can be marked up with pointers to XSLT files which convert it to RDF. The standard for this is [http://www.w3.org/TR/grddl/ GRDDL]. A GRDDL pointer can even be put in an XML schema, so that all XML documents written to that schema automatically have a defined RDF mapping which any GRDDL-aware processor can use. Several XSLT transformations can be found linked from [[MicroModels]].&lt;br /&gt;
* &amp;lt;span id=&amp;quot;krextor&amp;quot;&amp;gt;&amp;lt;/span&amp;gt;[http://kwarc.info/projects/krextor/ Krextor] is a framework for extracting RDF in various notations from various XML languages and can easily be extended for additional input languages.  Support for RDFa and some mathematical markup languages is built in.  The implementation is done in XSLT, with a command-line frontend and a Java wrapper.&lt;br /&gt;
* [http://www.topbraidcomposer.com TopBraid Composer] can convert XML Schema (and their XML instance files) into RDF/OWL models.&lt;br /&gt;
* Rhizomik [http://rhizomik.net/redefer/ ReDeFer] includes XSD2OWL and XML2RDF plus MPEG-7 to RDF (all XSLT-based)&lt;br /&gt;
* '''XHTML:''' Convert ''existing'' pages to RDF. For example, see [[HtmlToRdf]].&lt;br /&gt;
&lt;br /&gt;
=== XMP ===&lt;br /&gt;
&lt;br /&gt;
[http://www.adobe.com/products/xmp/ XMP] is an Adobe-sponsored specification for putting RDF metadata in virtually any form of file, including binary formats.  XMP metadata is RDF data in fact, but it has to be extracted from the file.&lt;br /&gt;
&lt;br /&gt;
* [http://www.inf.unideb.hu/~jeszy/xmp/ xmpextractor] extracts XMP data. ([http://www.inf.unideb.hu/~jeszy/ Jeszenszky Péter])&lt;br /&gt;
* [http://dev.w3.org/cvsweb/~checkout~/2004/PythonLib-IH/xmp.py A python script to extract XMP]. There is also a service to do that on-line, see [http://www.ivan-herman.net/WebLog/WorkRelated/SemanticWeb/xmpextract.html separate page]&lt;br /&gt;
&lt;br /&gt;
== Frameworks ==&lt;br /&gt;
&lt;br /&gt;
The following are general tools which provide conversion from many formats.&lt;br /&gt;
&lt;br /&gt;
=== AnnoCultor ===&lt;br /&gt;
&lt;br /&gt;
[http://annocultor.eu/ AnnoCultor] was built during several years of practical work on porting various datasets to RDF. It allows converting data from the following data sources:&lt;br /&gt;
* databases via SQL and JDBC;&lt;br /&gt;
* XML files, also in batch;&lt;br /&gt;
* RDF files;&lt;br /&gt;
* Solr servers;&lt;br /&gt;
* custom formats, via format-specific parsers written in Java.&lt;br /&gt;
&lt;br /&gt;
AnnoCultor is specifically suited for the situations where XSLT is not sufficient.&lt;br /&gt;
&lt;br /&gt;
It comes with built-in converters for Geonames and Getty vocabularies (AAT, ULAN, TGN), that are ready to use. &lt;br /&gt;
Several additional specific converters illustrate advanced use: converters for collections of Louvre and Joconde, &lt;br /&gt;
Institute Collection Netherlands, Dutch Museum of Asian Ceramics, Tropenmuseum Amsterdam.&lt;br /&gt;
&lt;br /&gt;
As part of conversion, AnnoCultor can semantically tag (enrich) data with links to various vocabularies, with advanced customised disambiguation and term processing possibilities. &lt;br /&gt;
These vocabularies should be represented in RDF or SKOS to be imported via SPARQL queries. &lt;br /&gt;
AnnoCultor comes with built-in tagging with Geonames and a custom time ontology.&lt;br /&gt;
&lt;br /&gt;
AnnoCultor is written in Java, but conversion rules are written in XML. They can be extended with either small Java snippets or custom rule implementations in Java.&lt;br /&gt;
AnnoCultor has been used in practice with datasets ranging from a few records to more than ten million, each containing up to dozens of fields.&lt;br /&gt;
&lt;br /&gt;
=== any23 ===&lt;br /&gt;
&lt;br /&gt;
[http://developers.any23.org/ Anything To Triples (any23)] is a library, a web service (at [http://any23.org/ any23.org]) and a command line tool that extracts structured data in RDF format from a variety of Web documents. Currently it supports the following input formats:&lt;br /&gt;
&lt;br /&gt;
* RDF/XML, Turtle, Notation 3&lt;br /&gt;
* RDFa&lt;br /&gt;
* Microformats: Adr, Geo, hCalendar, hCard, hListing, hResume, hReview, License, XFN and Species&lt;br /&gt;
&lt;br /&gt;
Any23 is used in major Web of Data applications such as [http://sindice.com/ sindice.com] and [http://sig.ma/ sig.ma]. It is written in Java.&lt;br /&gt;
&lt;br /&gt;
=== Aperture ===&lt;br /&gt;
&lt;br /&gt;
* [http://aperture.sourceforge.net/ Aperture] is a project written in Java gathering RDF extractors for many formats, mentioned in the list above.&lt;br /&gt;
Aperture supports crawling, making it not a converter but a framework to crawl updates of data (like rsync).&lt;br /&gt;
&lt;br /&gt;
=== [[PiggyBank]] ===&lt;br /&gt;
&lt;br /&gt;
* [http://simile.mit.edu/piggy-bank/ Piggy-bank] is a [http://simile.mit.edu/ Simile] project which allows the Firefox-based client to automatically load &amp;quot;[http://simile.mit.edu/RDFizers/ RDFizers]&amp;quot;, JavaScript-based converters to RDF. &lt;br /&gt;
Piggy-bank associates given scraping scripts with given web sites. (How?) &lt;br /&gt;
&lt;br /&gt;
=== Triplr ===&lt;br /&gt;
&lt;br /&gt;
[http://triplr.org/ Triplr] is a general “Stuff in, triples out” system by Dave Beckett. Triplr handles GRDDL, RSS, Atom, and other formats.&lt;br /&gt;
&lt;br /&gt;
=== Virtuoso Sponger ===&lt;br /&gt;
&lt;br /&gt;
[[OpenLinkSoftware|OpenLink Software]] via the &amp;quot;[http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtSponger Sponger]&amp;quot; component of [[VirtuosoUniversalServer|Virtuoso]]'s SPARQL Processor and Proxy Web Service (used by default by [[OpenLinkDataExplorer| OpenLink Data Explorer]]) provides RDFization for:&lt;br /&gt;
* RDFa &lt;br /&gt;
* GRDDL&lt;br /&gt;
* Amazon Web Services&lt;br /&gt;
* eBay Web Services&lt;br /&gt;
* Freebase Web Services&lt;br /&gt;
* Facebook Web Services&lt;br /&gt;
* Yahoo! Finance&lt;br /&gt;
* XBRL Instance documents&lt;br /&gt;
* DOI (includes a custom resolver for HTTP)&lt;br /&gt;
* OAI&lt;br /&gt;
* RSS/Atom Feeds&lt;br /&gt;
* Digital Music Files (various formats via ID3 Tags)&lt;br /&gt;
* Image Files&lt;br /&gt;
* vCard&lt;br /&gt;
* iCalendar&lt;br /&gt;
* Microformats - hCard, hCalendar&lt;br /&gt;
* HR-XML Resumes &lt;br /&gt;
* Flickr &lt;br /&gt;
* Del.icio.us&lt;br /&gt;
* Bugzilla &lt;br /&gt;
* ODBC or JDBC accessible SQL Data&lt;br /&gt;
* [http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtSpongerCartridgeSupportedDataSources Many others]&lt;br /&gt;
&lt;br /&gt;
= Notes =&lt;br /&gt;
&lt;br /&gt;
Historically, this list was made from lists of [http://simile.mit.edu/RDFizers/ RDFizers] and&lt;br /&gt;
[http://www.w3.org/2002/Talks/09-lcs-sweb-tbl/slide31-1.html SWAP converters].&lt;br /&gt;
It has grown significantly from community input since then.&lt;br /&gt;
&lt;br /&gt;
This should be in a data format like Semantic Media Wiki or in N3 -- TimBL&lt;br /&gt;
&lt;br /&gt;
&amp;gt; Would there be an advantage to having this kind of list in an RDF file, specifically to make queries on it? Maybe if we add a format on how to declare it here, we could create a converter to RDF. -- [[KarlDubost]]&lt;br /&gt;
&lt;br /&gt;
&amp;gt; The task force [http://esw.w3.org/topic/SweoIG/TaskForces/InfoGathering InfoGathering] from SWEO works on such a vocabulary, if you want to rewrite this list using this vocab, look here: [http://esw.w3.org/topic/SweoIG/TaskForces/InfoGathering/DataVocabulary DataVocabulary] or contact me -- [[LeoSauermann]] on 22.1.2007&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
[[Category:SwTools]]&lt;/div&gt;</description>
			<pubDate>Tue, 08 Mar 2011 00:16:25 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/Talk:ConverterToRdf</comments>		</item>
		<item>
			<title>Ontology Dowsing</title>
			<link>http://www.w3.org/wiki/Ontology_Dowsing</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;/* Evaluation */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt; &amp;quot;Dowsing is a type of divination employed in attempts to locate ground water,&lt;br /&gt;
 buried metals or ores, gemstones, oil, gravesites, and many other objects and&lt;br /&gt;
 materials, as well as so-called currents of earth radiation, without the use of &lt;br /&gt;
 scientific apparatus.&amp;quot;&lt;br /&gt;
--[http://en.wikipedia.org/wiki/Dowsing Wikipedia article on Dowsing], retrieved on 14th January 2010.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
At the moment, the methods used in practice to locate an adequate vocabulary for describing one's data in RDF are more akin to dowsing than to an educated, technically-guided choice, supported by scientific tools and methodologies. While the situation is improving with the progress of Semantic Web search engines and better education, oftentimes data publishers still rely on informal criteria such as word-of-mouth, reputation or follow-your-nose strategies.&lt;br /&gt;
&lt;br /&gt;
This page tries to identify methods, tools, applications, websites and communities that can help Linked Data publishers discover or build the vocabulary they need. The tools identified below are ordered from those that require the least time and effort on the publisher's side to those that require hard work.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Lists of ontologies ==&lt;br /&gt;
&lt;br /&gt;
There are several webpages that reference ontologies by simply matching a theme (e.g., People, Product) to a URI.&lt;br /&gt;
Examples:&lt;br /&gt;
* [http://semanticweb.org/ Semanticweb.org] gives a short list on its homepage;&lt;br /&gt;
* [[VocabularyMarket]] provides links to ontologies, answering simple questions (''how about music collections?'');&lt;br /&gt;
* [http://semanticweb.org/wiki/Ontology Ontology] on Semanticweb.org has a list of ontologies, ranked according to their usage.&lt;br /&gt;
&lt;br /&gt;
This category requires minimal effort: if the publisher's data are in the domains referenced in the list, the corresponding ontology can readily be used.&lt;br /&gt;
&lt;br /&gt;
These lists raise the question &amp;quot;how to define what's in these lists?&amp;quot; Popularity is one aspect; quality may be another. What is a quality ontology? When does it become popular? Who decides?&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Search engines ==&lt;br /&gt;
&lt;br /&gt;
Semantic Web search engines are applications for finding ontologies that require reasonable effort: queries are usually written as natural language keywords and results are ranked. Some additional information is often provided. Examples:&lt;br /&gt;
* [http://sindice.com/ Sindice] is a generic Semantic Web document search engine;&lt;br /&gt;
* [http://iws.seu.edu.cn/services/falcons/ FalconS] has a term search feature;&lt;br /&gt;
* [http://swoogle.umbc.edu/ Swoogle] is the grandfather of Semantic Web search engines;&lt;br /&gt;
* [http://swse.org/ SWSE] is an RDF entity search engine;&lt;br /&gt;
* [http://watson.kmi.open.ac.uk/WatsonWUI/ Watson] is an ontology search engine;&lt;br /&gt;
* [http://www.semanticwebsearch.com/query/ Semantic Web Search] is yet another search engine. &lt;br /&gt;
&lt;br /&gt;
The problem here is that it is still hard to choose between two matching ontologies.  What should guide publishers to the right choice?  Should these ontologies be reused at all?  See also [[BuildOrBuyTerms]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Repositories ==&lt;br /&gt;
&lt;br /&gt;
Ontology repositories are usually more specific than Semantic Web search engines, and their navigation/search interfaces can vary greatly. They offer tools that may be specific to the type of application the repository was designed for. Examples:&lt;br /&gt;
* [http://www.schemaweb.info/ SchemaWeb] is a fairly old ontology directory, but it is still used;&lt;br /&gt;
* [http://schemapedia.com/ Schemapedia] is another ontology directory;&lt;br /&gt;
* [http://cupboard.open.ac.uk:8081/cupboard-search/ Cupboard] is an ontology repository with some advanced features, powered by the Watson semantic search engine;&lt;br /&gt;
* [http://knoodl.com/ Knoodl] is a repository and collaborative ontology management tool;&lt;br /&gt;
* [http://ontologydesignpatterns.org/ Ontology Design Patterns] is a repository for design patterns and for ontology modules that follow those patterns;&lt;br /&gt;
* [http://prefix.cc Prefix.cc] is a namespace lookup service, which can be seen as a kind of vocabulary directory;&lt;br /&gt;
* [http://vocab.deri.ie DERI Vocabularies] is a repository and can be used as an online ontology editor.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Mailing lists/online community ==&lt;br /&gt;
&lt;br /&gt;
If other tools are not sufficient to find an appropriate vocabulary, publishers can (and often do) rely on online communities by asking them directly. Examples:&lt;br /&gt;
* [http://lists.w3.org/Archives/Public/semantic-web/ W3C Semantic Web mailing list];&lt;br /&gt;
* [http://lists.w3.org/Archives/Public/public-lod/ Linking Open Data ML];&lt;br /&gt;
* [http://groups.yahoo.com/group/semanticweb/ Semantic Web Yahoo! Group];&lt;br /&gt;
* [http://semanticoverflow.com/ SemanticOverflow] is a Q&amp;amp;A service about semantic technologies;&lt;br /&gt;
&lt;br /&gt;
This is a rather effortless solution which can be really efficient in some cases. However, repeated enquiries about vocabularies can easily pollute the traffic, and publishers should first try to find a solution on their own, e.g., by following the links and indications on this wiki page.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Ontology Editors ==&lt;br /&gt;
&lt;br /&gt;
If data publishers cannot find a relevant vocabulary, or existing vocabularies are not suitable for the use case, they can make their own ontology. They can be helped by editors, such as:&lt;br /&gt;
* [http://protege.stanford.edu/ Protégé] is a popular, pluggable ontology editor;&lt;br /&gt;
* [http://neon-toolkit.org/ NeOn Toolkit] is another ontology editor with many plugins available. It is especially suited for heavy-weight projects (e.g., multi-modular ontologies, multi-lingual, ontology integration, etc);&lt;br /&gt;
* [http://www.mindswap.org/2004/SWOOP/ SWOOP] is a small and simple ontology editor;&lt;br /&gt;
* [http://neologism.deri.ie Neologism] is an online vocabulary editor and publishing platform;&lt;br /&gt;
* [http://www.topquadrant.com/products/TB_Composer.html TopBraid Composer] is a multipurpose Semantic Web editor;&lt;br /&gt;
* [http://vitro.mannlib.cornell.edu/ Vitro] is an Integrated Ontology Editor and Semantic Web Application;&lt;br /&gt;
* [http://www.knoodl.com/ Knoodl] is a community-oriented ontology and knowledge base editor&lt;br /&gt;
&lt;br /&gt;
This requires considerable effort and some guidelines. Here are some best practices:&lt;br /&gt;
* [[DontWorryBeCrappy]]: it's ok to do it wrong, it can improve later;&lt;br /&gt;
* [http://www.w3.org/TR/swbp-vocab-pub/ Best Practice Recipes for Publishing RDF Vocabularies].&lt;br /&gt;
&lt;br /&gt;
== Evaluation ==&lt;br /&gt;
&lt;br /&gt;
In addition to finding or making an ontology that contains the terms that are needed for the dataset, publishers may like to assess the quality of the ontologies, especially when they have the choice between several of them.&lt;br /&gt;
Some possible factors:&lt;br /&gt;
* Fully documented;&lt;br /&gt;
* Used by independent data publishers;&lt;br /&gt;
* There exist tools that support the vocabulary specifically;&lt;br /&gt;
* The ontology is highly ranked by users in a voting system;&lt;br /&gt;
* All terms are dereferenceable;&lt;br /&gt;
* The ontology just covers the right domain (not an upper level &amp;quot;ontology of everything&amp;quot;);&lt;br /&gt;
* expressive enough: the ontology has axioms that make valuable inferences;&lt;br /&gt;
* not too expressive: the ontology does not define axioms that have limited utility and would make reasoning costly;&lt;br /&gt;
&lt;br /&gt;
Tools:&lt;br /&gt;
* [http://owl.cs.manchester.ac.uk/validator/ OWL 2 Validator] determines whether an ontology is in OWL 2 DL, OWL 2 EL, OWL 2 QL, OWL 2 RL or OWL 2 Full;&lt;br /&gt;
* [http://www.mygrid.org.uk/OWL/Validator OWL 1 Validator] determines whether an ontology is in OWL 1 DL, OWL 1 Lite or OWL 1 Full;&lt;br /&gt;
* [http://www.w3.org/RDF/Validator/ RDF Validator] the official W3C validator for RDF/XML syntax validation;&lt;br /&gt;
* [http://swse.deri.org/SWSEAlerts/ rdf:alerts] is a tool for finding potential problems in linked data;&lt;br /&gt;
* ...&lt;br /&gt;
&lt;br /&gt;
== Related Events, Projects, etc. ==&lt;br /&gt;
&lt;br /&gt;
A significant amount of research is under way to solve parts of the problem of guiding publishers to the right vocabulary:&lt;br /&gt;
* [http://miuras.inf.um.es/ontoqual2010/index.html EKAW 2010 Workshop on Ontology Quality];&lt;br /&gt;
* [http://www.ontologydynamics.org/od/index.php/seres2010/ ISWC 2010 Workshop on Semantic Repositories for Web, SERES 2010];&lt;br /&gt;
* [http://www.seals-project.eu/ SEALS] is a European project on evaluating semantic applications, including ontologies;&lt;br /&gt;
* [http://www.semantic-web-journal.net/ Semantic Web Journal] is an academic journal which encourages the publication of ontology descriptions (=&amp;gt; peer reviewed ontologies =&amp;gt; good ontologies, in principle).&lt;/div&gt;</description>
			<pubDate>Wed, 04 Aug 2010 19:15:34 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/Talk:Ontology_Dowsing</comments>		</item>
		<item>
			<title>Ontology Dowsing</title>
			<link>http://www.w3.org/wiki/Ontology_Dowsing</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;/* Ontology Editors */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt; &amp;quot;Dowsing is a type of divination employed in attempts to locate ground water,&lt;br /&gt;
 buried metals or ores, gemstones, oil, gravesites, and many other objects and&lt;br /&gt;
 materials, as well as so-called currents of earth radiation, without the use of &lt;br /&gt;
 scientific apparatus.&amp;quot;&lt;br /&gt;
--[http://en.wikipedia.org/wiki/Dowsing Wikipedia article on Dowsing], retrieved on 14th January 2010.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
At the moment, the methods used in practice to locate an adequate vocabulary for describing one's data in RDF are more akin to dowsing than to an educated, technically-guided choice, supported by scientific tools and methodologies. While the situation is improving with the progress of Semantic Web search engines and better education, oftentimes data publishers still rely on informal criteria such as word-of-mouth, reputation or follow-your-nose strategies.&lt;br /&gt;
&lt;br /&gt;
This page tries to identify methods, tools, applications, websites and communities that can help Linked Data publishers discover or build the vocabulary they need. The tools identified below are ordered from those that require the least time and effort on the publisher's side to those that require hard work.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Lists of ontologies ==&lt;br /&gt;
&lt;br /&gt;
There are several webpages that reference ontologies by simply matching a theme (e.g., People, Product) to a URI.&lt;br /&gt;
Examples:&lt;br /&gt;
* [http://semanticweb.org/ Semanticweb.org] gives a short list on its homepage;&lt;br /&gt;
* [[VocabularyMarket]] provides links to ontologies, answering simple questions (''how about music collections?'');&lt;br /&gt;
* [http://semanticweb.org/wiki/Ontology Ontology] on Semanticweb.org has a list of ontologies, ranked according to their usage.&lt;br /&gt;
&lt;br /&gt;
This category requires minimal effort: if the publisher's data are in the domains referenced in the list, the corresponding ontology can readily be used.&lt;br /&gt;
&lt;br /&gt;
These lists raise the question &amp;quot;how to define what's in these lists?&amp;quot; Popularity is one aspect; quality may be another. What is a quality ontology? When does it become popular? Who decides?&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Search engines ==&lt;br /&gt;
&lt;br /&gt;
Semantic Web search engines are applications for finding ontologies that require reasonable effort: queries are usually written as natural language keywords and results are ranked. Some additional information is often provided. Examples:&lt;br /&gt;
* [http://sindice.com/ Sindice] is a generic Semantic Web document search engine;&lt;br /&gt;
* [http://iws.seu.edu.cn/services/falcons/ FalconS] has a term search feature;&lt;br /&gt;
* [http://swoogle.umbc.edu/ Swoogle] is the grandfather of Semantic Web search engines;&lt;br /&gt;
* [http://swse.org/ SWSE] is an RDF entity search engine;&lt;br /&gt;
* [http://watson.kmi.open.ac.uk/WatsonWUI/ Watson] is an ontology search engine;&lt;br /&gt;
* [http://www.semanticwebsearch.com/query/ Semantic Web Search] is yet another search engine. &lt;br /&gt;
&lt;br /&gt;
The problem here is that it is still hard to choose between two matching ontologies.  What should guide publishers to the right choice?  Should these ontologies be reused at all?  See also [[BuildOrBuyTerms]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Repositories ==&lt;br /&gt;
&lt;br /&gt;
Ontology repositories are usually more specific than Semantic Web search engines, and their navigation/search interfaces can vary greatly. They offer tools that may be specific to the type of application the repository was designed for. Examples:&lt;br /&gt;
* [http://www.schemaweb.info/ SchemaWeb] is a fairly old ontology directory, but it is still used;&lt;br /&gt;
* [http://schemapedia.com/ Schemapedia] is another ontology directory;&lt;br /&gt;
* [http://cupboard.open.ac.uk:8081/cupboard-search/ Cupboard] is an ontology repository with some advanced features, powered by the Watson semantic search engine;&lt;br /&gt;
* [http://knoodl.com/ Knoodl] is a repository and collaborative ontology management tool;&lt;br /&gt;
* [http://ontologydesignpatterns.org/ Ontology Design Patterns] is a repository for design patterns and for ontology modules that follow those patterns;&lt;br /&gt;
* [http://prefix.cc Prefix.cc] is a namespace lookup service, which can be seen as a kind of vocabulary directory;&lt;br /&gt;
* [http://vocab.deri.ie DERI Vocabularies] is a repository and can be used as an online ontology editor.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Mailing lists/online community ==&lt;br /&gt;
&lt;br /&gt;
If other tools are not sufficient to find an appropriate vocabulary, publishers can (and often do) rely on online communities by asking them directly. Examples:&lt;br /&gt;
* [http://lists.w3.org/Archives/Public/semantic-web/ W3C Semantic Web mailing list];&lt;br /&gt;
* [http://lists.w3.org/Archives/Public/public-lod/ Linking Open Data ML];&lt;br /&gt;
* [http://groups.yahoo.com/group/semanticweb/ Semantic Web Yahoo! Group];&lt;br /&gt;
* [http://semanticoverflow.com/ SemanticOverflow] is a Q&amp;amp;A service about semantic technologies;&lt;br /&gt;
&lt;br /&gt;
This is a rather effortless solution which can be really efficient in some cases. However, repeated enquiries about vocabularies can easily pollute the traffic, and publishers should first try to find a solution on their own, e.g., by following the links and indications on this wiki page.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Ontology Editors ==&lt;br /&gt;
&lt;br /&gt;
If data publishers cannot find a relevant vocabulary, or existing vocabularies are not suitable for the use case, they can make their own ontology. They can be helped by editors, such as:&lt;br /&gt;
* [http://protege.stanford.edu/ Protégé] is a popular, pluggable ontology editor;&lt;br /&gt;
* [http://neon-toolkit.org/ NeOn Toolkit] is another ontology editor with many plugins available. It is especially suited for heavy-weight projects (e.g., multi-modular ontologies, multi-lingual, ontology integration, etc);&lt;br /&gt;
* [http://www.mindswap.org/2004/SWOOP/ SWOOP] is a small and simple ontology editor;&lt;br /&gt;
* [http://neologism.deri.ie Neologism] is an online vocabulary editor and publishing platform;&lt;br /&gt;
* [http://www.topquadrant.com/products/TB_Composer.html TopBraid Composer] is a multipurpose Semantic Web editor;&lt;br /&gt;
* [http://vitro.mannlib.cornell.edu/ Vitro] is an Integrated Ontology Editor and Semantic Web Application;&lt;br /&gt;
* [http://www.knoodl.com/ Knoodl] is a community-oriented ontology and knowledge base editor&lt;br /&gt;
&lt;br /&gt;
This requires considerable effort and some guidelines. Here are some best practices:&lt;br /&gt;
* [[DontWorryBeCrappy]]: it's ok to do it wrong, it can improve later;&lt;br /&gt;
* [http://www.w3.org/TR/swbp-vocab-pub/ Best Practice Recipes for Publishing RDF Vocabularies].&lt;br /&gt;
&lt;br /&gt;
== Evaluation ==&lt;br /&gt;
&lt;br /&gt;
In addition to finding or making an ontology that contains the terms that are needed for the dataset, publishers may like to assess the quality of the ontologies, especially when they have the choice between several of them.&lt;br /&gt;
Some possible factors:&lt;br /&gt;
* Fully documented;&lt;br /&gt;
* Used by independent data publishers;&lt;br /&gt;
* There exist tools that support the vocabulary specifically;&lt;br /&gt;
* The ontology is highly ranked by users in a voting system;&lt;br /&gt;
* All terms are dereferenceable;&lt;br /&gt;
* The ontology just covers the right domain (not an upper level &amp;quot;ontology of everything&amp;quot;);&lt;br /&gt;
* expressive enough: the ontology has axioms that make valuable inferences;&lt;br /&gt;
* not too expressive: the ontology does not define axioms that have limited utility and would make reasoning costly;&lt;br /&gt;
&lt;br /&gt;
Tools:&lt;br /&gt;
* [http://owl.cs.manchester.ac.uk/validator/ OWL 2 Validator] determines whether an ontology is in OWL 2 DL, OWL 2 EL, OWL 2 QL, OWL 2 RL or OWL 2 Full;&lt;br /&gt;
* [http://www.mygrid.org.uk/OWL/Validator OWL 1 Validator] determines whether an ontology is in OWL 1 DL, OWL 1 Lite or OWL 1 Full;&lt;br /&gt;
* [http://www.w3.org/RDF/Validator/ RDF Validator] the official W3C validator;&lt;br /&gt;
* [http://swse.deri.org/SWSEAlerts/ rdf:alerts] is a tool for finding potential problems in linked data;&lt;br /&gt;
* ...&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Related Events, Projects, etc. ==&lt;br /&gt;
&lt;br /&gt;
A significant amount of research is under way to solve parts of the problem of guiding publishers to the right vocabulary:&lt;br /&gt;
* [http://miuras.inf.um.es/ontoqual2010/index.html EKAW 2010 Workshop on Ontology Quality];&lt;br /&gt;
* [http://www.ontologydynamics.org/od/index.php/seres2010/ ISWC 2010 Workshop on Semantic Repositories for Web, SERES 2010];&lt;br /&gt;
* [http://www.seals-project.eu/ SEALS] is a European project on evaluating semantic applications, including ontologies;&lt;br /&gt;
* [http://www.semantic-web-journal.net/ Semantic Web Journal] is an academic journal which encourages the publication of ontology descriptions (=&amp;gt; peer reviewed ontologies =&amp;gt; good ontologies, in principle).&lt;/div&gt;</description>
			<pubDate>Wed, 04 Aug 2010 19:15:04 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/Talk:Ontology_Dowsing</comments>		</item>
		<item>
			<title>User:Rcygania2/RulesOfThumb</title>
			<link>http://www.w3.org/wiki/User:Rcygania2/RulesOfThumb</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;/* Namespace URIs */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;A loose collection of [[SemWeb]]-related rules of thumb that I would propose as good practice.&lt;br /&gt;
&lt;br /&gt;
= Modelling a dataset in RDF =&lt;br /&gt;
&lt;br /&gt;
== Labels ==&lt;br /&gt;
* Make sure that everything has an rdfs:label, either directly specified, or by using some property that is defined as a subproperty of rdfs:label&lt;br /&gt;
* Don't be overly concerned with ambiguous labels; just consider the resource in isolation. That's because labels cannot do the job of disambiguation anyway, and trying to do it results in artificial and awkward labels.&lt;br /&gt;
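&lt;br /&gt;
A minimal sketch of how the first rule could be checked mechanically. It assumes the dataset has already been loaded as a plain Python list of (subject, predicate, object) tuples written with prefixed names; the function names and the sample data are hypothetical and are not taken from any RDF library. Only resources that appear as a subject are checked, and subproperties are recognised only if the data itself contains the rdfs:subPropertyOf statements.&lt;br /&gt;
&lt;br /&gt;
```python
RDFS_LABEL = "rdfs:label"
SUBPROPERTY_OF = "rdfs:subPropertyOf"


def label_properties(triples):
    """Return rdfs:label plus every property declared (transitively) as its subproperty."""
    found = {RDFS_LABEL}
    changed = True
    while changed:
        changed = False
        for s, p, o in triples:
            if p == SUBPROPERTY_OF and o in found and s not in found:
                found.add(s)
                changed = True
    return found


def unlabelled_resources(triples):
    """Return the sorted subjects that carry no label property at all."""
    labels = label_properties(triples)
    subjects = {s for s, p, o in triples}
    labelled = {s for s, p, o in triples if p in labels}
    return sorted(subjects - labelled)


# Hypothetical sample data, written with prefixed names for brevity.
data = [
    ("foaf:name", "rdfs:subPropertyOf", "rdfs:label"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:bob", "foaf:knows", "ex:alice"),
]
print(unlabelled_resources(data))
```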
&lt;br /&gt;
== Language tags ==&lt;br /&gt;
* On untyped literals, if the literal is likely to be understood only by speakers of a single language, then add a language tag. If it is likely to work for speakers of many languages, keep it without a language tag. If the file or dataset has only a few exceptions, then it is perhaps better to go for consistency and mark them the same way as the rest of the file.&lt;br /&gt;
&lt;br /&gt;
= Designing vocabularies and ontologies =&lt;br /&gt;
&lt;br /&gt;
== Naming of properties ==&lt;br /&gt;
* Properties that point to documents (information resources) should have names that announce this fact, e.g. userProfile, userPage, userList, eventRecord, eventForm&lt;br /&gt;
* Relationship nouns make good property names, e.g. “parent” is better than “hasParent” or “isParentOf” (as per TimBL)&lt;br /&gt;
 Focus on one problem::&lt;br /&gt;
* [from an email to rdf-schema-dev on 2008-04-05] Some random half-formed thoughts: It's good if the vocabulary covers all my needs for a given problem. It's good if the vocabulary doesn't contain much extra stuff that I don't need to solve my problem. It's good if the purpose and coverage of the vocabulary can be conveyed in a short term or phrase (e.g. “document metadata” or “issue tracking”). It's good if the level of abstraction is consistent throughout the vocabulary, e.g. don't mix high-level concepts like Service and Container into your down-to-earth photo annotation vocabulary.&lt;br /&gt;
 Provide excellent documentation::&lt;br /&gt;
* [from an email to rdf-schema-dev on 2008-04-05] Random thoughts again: Some introductory narrative. A bunch of good examples for typical usages of the vocabulary. A UML-style overview diagram if the vocab has more than a few classes. Some tutorial-style text. An excellent reference section with all terms and notes about how they are supposed to be used (including notes on what they are NOT supposed to be used for).&lt;br /&gt;
&lt;br /&gt;
== Namespace URIs ==&lt;br /&gt;
From danbri in an email to the DC Architecture list, 30 March 2010 09:40:47 IST:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;blockquote&amp;gt;My current preference / advice (for new work) is for the managers of&lt;br /&gt;
each serious namespace to invest in a distinct domain name for it, and&lt;br /&gt;
for us as a community to come up with social machinery for 'watching&lt;br /&gt;
each other's backs' to ensure that the domains are kept in good&lt;br /&gt;
working order, fees are paid, etc. Sometimes an additional level of&lt;br /&gt;
indirection can add as much risk as it saves.  Initially when I bought&lt;br /&gt;
xmlns.com I have idea it could be a home for lots of namespaces, and&lt;br /&gt;
then the more I thought about it, the less I liked that idea. Each new&lt;br /&gt;
namespace added to the bucket brings some risk to the others using the&lt;br /&gt;
domain, by adding to the complexity and burden for subsequent&lt;br /&gt;
maintainers. So I think a proliferation of independent domain names,&lt;br /&gt;
while painful in its own way, spreads the risk...&amp;lt;/blockquote&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Later in the thread:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;blockquote&amp;gt;Rule of thumb - when wondering what info to include in a namespace&lt;br /&gt;
URI, ... try to leave *out* as much as possible&amp;lt;/blockquote&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For hash namespaces, the RDF document containing the vocabulary should be typed as owl:Ontology and should be the target of any rdfs:isDefinedBy statements. Note, the RDF document's URI is ''the namespace URI without the trailing hash''.&lt;br /&gt;
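&lt;br /&gt;
As a rough illustration of the hash-namespace rule, the sketch below derives the document URI from a namespace URI and lists the statements the rule asks for. The function names, the tuple representation with prefixed names, and the namespace http://example.org/vocab# are all assumptions made for this example.&lt;br /&gt;
&lt;br /&gt;
```python
def ontology_document_uri(namespace_uri):
    """Return the URI of the RDF document behind a hash namespace."""
    if not namespace_uri.endswith("#"):
        raise ValueError("not a hash namespace: " + namespace_uri)
    # The document URI is the namespace URI without the trailing hash.
    return namespace_uri[:-1]


def definition_triples(namespace_uri, local_names):
    """Build the owl:Ontology typing and the rdfs:isDefinedBy statements."""
    doc = ontology_document_uri(namespace_uri)
    triples = [(doc, "rdf:type", "owl:Ontology")]
    for name in local_names:
        triples.append((namespace_uri + name, "rdfs:isDefinedBy", doc))
    return triples


# http://example.org/vocab# is a placeholder namespace, not a real vocabulary.
for triple in definition_triples("http://example.org/vocab#", ["Person", "knows"]):
    print(triple)
```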
&lt;br /&gt;
= Publishing RDF on the Web =&lt;br /&gt;
&lt;br /&gt;
== Metadata in RDF documents ==&lt;br /&gt;
* Every RDF document has ''some'' relation to the thing(s) it talks about. It is useful to explicitly state what that relation is. For example, in my FOAF file I state that I'm both the foaf:maker and the foaf:primaryTopic of the file. Other useful properties are: rdfs:isDefinedBy, foaf:topic. A concrete benefit is that consumers can pick out the “important” things from the graph.&lt;br /&gt;
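&lt;br /&gt;
To illustrate the consumer side of this rule, here is a small sketch of how a client might pick out the “important” things from a graph once the document states its relations explicitly. The graph is assumed to be a list of (subject, predicate, object) tuples; the function name and the example.org URIs are hypothetical. rdfs:isDefinedBy is left out because it points from a term to the document rather than from the document to a resource.&lt;br /&gt;
&lt;br /&gt;
```python
# Properties that relate a document to the things it is about.
DOCUMENT_RELATIONS = ("foaf:primaryTopic", "foaf:topic", "foaf:maker")


def important_resources(triples, document_uri):
    """Map each document-level relation to the resources it points at."""
    result = {}
    for s, p, o in triples:
        if s == document_uri and p in DOCUMENT_RELATIONS:
            result.setdefault(p, []).append(o)
    return result


# Hypothetical FOAF file in which the author is both maker and primary topic.
doc = "http://example.org/foaf.rdf"
me = "http://example.org/foaf.rdf#me"
graph = [
    (doc, "foaf:maker", me),
    (doc, "foaf:primaryTopic", me),
    (me, "foaf:knows", "http://example.org/other#you"),
]
print(important_resources(graph, doc))
```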
&lt;br /&gt;
== HTML descriptions of URI-denoted things ==&lt;br /&gt;
* To create trust in the stability, reliability and availability of a URI, its HTML description should explicitly state the URI, it should contain an explicit ''Statement of Purpose'' to the effect that the URI is intended to be used as an identifier for the thing, and it should contain a ''Publisher Identification''. It must provide sufficient information to enable human users to know exactly what is being referred to. (This is inspired by [http://www.oasis-open.org/committees/download.php/3050/pubsubj-pt1-1.02-cs.pdf Published Subjects].)&lt;br /&gt;
&lt;br /&gt;
== Content negotiation ==&lt;br /&gt;
* The benefit of content negotiation (CN) is that all URIs also work in a standard Web browser, not just in RDF-enabled tools and browsers. Thus it's great for authoring and debugging, and for cases where your URIs are exposed to a lot of neophytes (e.g. DBpedia, FOAF, DC). On the other hand, CN is very hard to get right: the devil is in the details, and it has turned out to be quite an interop hassle in practice. So, CN should be thought of as icing on the cake, not as a requirement for publishing RDF.&lt;br /&gt;
* Rule of thumb: If your server solution does CN, then do CN. Otherwise, getting it right will be too much effort.&lt;br /&gt;
* Keep in mind the advice from [http://www.w3.org/2001/tag/doc/alternatives-discovery.html On Linking Alternative Representations]: Provide links between different variants to make them all accessible. This means, if some HTML can be returned in response to RDF/HTML negotiation, there should be an RDF icon nearby, which points to the RDF variant.&lt;br /&gt;
&lt;br /&gt;
== Blank nodes ==&lt;br /&gt;
* Should be avoided in general. Using a blank node is appropriate if the publisher thinks that no one should ever care about this resource except in the context of looking at another, identified, resource in the same RDF document, e.g. a geo:Point that exists solely to give the location of another resource. Another situation where a blank node would be appropriate is when used as an existential variable, but I've never seen them used that way in a Linked Data context.&lt;br /&gt;
&lt;br /&gt;
== Linked Data ==&lt;br /&gt;
* Have rdfs:label (or a subproperty thereof) on everything, always&lt;br /&gt;
* Have rdfs:label for the document URI, always&lt;br /&gt;
* Have a foaf:primaryTopic triple connecting document URI and main resource, always&lt;br /&gt;
* Have as much dc: metadata as possible on the document URI&lt;br /&gt;
* Think hard about possible external links, to other web pages and other RDF documents and entities. Provide as many as possible. These make all the difference.&lt;br /&gt;
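&lt;br /&gt;
The first three rules are easy to check mechanically. A rough Python sketch, assuming the triples of one document are available as tuples of strings; the function name and the example.org URIs are made up, and subproperties of rdfs:label are not taken into account:&lt;br /&gt;

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
FOAF_PRIMARY_TOPIC = "http://xmlns.com/foaf/0.1/primaryTopic"


def missing_basics(triples, document_uri):
    """Return a list of problems found in the triples of one document."""
    # The document URI must be labelled even if nothing is said about it.
    subjects = {s for (s, p, o) in triples} | {document_uri}
    labelled = {s for (s, p, o) in triples if p == RDFS_LABEL}
    problems = ["no rdfs:label on " + s for s in sorted(subjects - labelled)]
    has_topic = any(
        s == document_uri and p == FOAF_PRIMARY_TOPIC for (s, p, o) in triples
    )
    if not has_topic:
        problems.append("no foaf:primaryTopic on " + document_uri)
    return problems


# Hypothetical document describing one person; the document label is missing.
doc = "http://example.org/alice.rdf"
alice = "http://example.org/alice.rdf#me"
triples = [
    (doc, FOAF_PRIMARY_TOPIC, alice),
    (alice, RDFS_LABEL, "Alice"),
]
print(missing_basics(triples, doc))
```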
&lt;br /&gt;
== URI design ==&lt;br /&gt;
* RDF URIs are always case sensitive, while from HTTP's point of view, some parts of the URI can change without changing any behaviour. So, be clear about the case of your URIs and stick to it once a decision is made. If in doubt, use as much lowercase as possible. (Story: L3S changed case of the domain name in their URIs, broke a Semantic Web Pipes demo.)&lt;br /&gt;
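&lt;br /&gt;
To illustrate the mismatch, a small Python sketch: RDF compares URIs character by character, while scheme and host name are case-insensitive for HTTP. The helper name and the example.org URIs are made up; lowercasing the whole authority component is a simplification that would also touch any user info in it.&lt;br /&gt;

```python
from urllib.parse import urlsplit, urlunsplit


def lowercase_scheme_and_host(uri):
    """Lowercase the parts of a URI that HTTP treats as case-insensitive."""
    parts = urlsplit(uri)
    return urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),  # simplification: assumes no user info
        parts.path,            # path, query and fragment keep their case
        parts.query,
        parts.fragment,
    ))


old = "http://www.Example.org/data/Person1"
new = "http://www.example.org/data/Person1"

# Same HTTP resource, but two different RDF identifiers.
print(old == new)
# Agreeing on lowercase for scheme and host removes the mismatch.
print(lowercase_scheme_and_host(old) == new)
```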
&lt;br /&gt;
= Web Architecture =&lt;br /&gt;
&lt;br /&gt;
== Information resources ==&lt;br /&gt;
* [http://lists.w3.org/Archives/Public/www-tag/2007Sep/0123.html From Harry Halpin]: “If there is a URI that is used to identify a resource one would want to make logical statements about, and these statements do not apply to possible representations of that resource, then one should use the &amp;quot;hash&amp;quot; or 303 redirection to separate  these URIs.”&lt;/div&gt;</description>
			<pubDate>Fri, 07 May 2010 19:10:26 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/User_talk:Rcygania2/RulesOfThumb</comments>		</item>
		<item>
			<title>SweoIG/TaskForces/CommunityProjects/LinkingOpenData/RaleighGathering</title>
			<link>http://www.w3.org/wiki/SweoIG/TaskForces/CommunityProjects/LinkingOpenData/RaleighGathering</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;/* Participants */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
== Raleigh Linked Data Community Gathering ==&lt;br /&gt;
&lt;br /&gt;
=== Date, Time, Venue ===&lt;br /&gt;
&lt;br /&gt;
'''What''': The latest in the ongoing tradition of Linked Data gatherings, combined with workshop dinner for [http://events.linkeddata.org/ldow2010 LDOW2010]. All welcome, but please add your name to the list below so we can confirm numbers to the restaurant.&lt;br /&gt;
&lt;br /&gt;
'''Where''': [http://www.101raleigh.com/ 101 Lounge]&lt;br /&gt;
&lt;br /&gt;
444 S. Blount Street, Raleigh, NC&lt;br /&gt;
&lt;br /&gt;
[http://maps.google.com/maps?f=d&amp;amp;source=s_d&amp;amp;saddr=500+S+Salisbury+St,+Raleigh,+North+Carolina+27601&amp;amp;daddr=35.774246,-78.640319+to:101+Lounge+%2B+Cafe,+Raleigh,+NC&amp;amp;geocode=FU_cIQIdLwZQ-yHachRDc9mAaClJoFpHcV-siTEaBxCJb_mcWw%3B%3BFXLjIQIdlRhQ-yGOf6hhX1IXIQ&amp;amp;hl=en&amp;amp;mra=dpe&amp;amp;mrcr=0&amp;amp;mrsp=1&amp;amp;sz=18&amp;amp;via=1&amp;amp;dirflg=w&amp;amp;sll=35.774655,-78.63975&amp;amp;sspn=0.002611,0.005627&amp;amp;ie=UTF8&amp;amp;ll=35.774528,-78.639354&amp;amp;spn=0.005223,0.011255&amp;amp;z=17 Directions from Convention Center]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
'''When''': 27th April 2010, after [http://events.linkeddata.org/ldow2010 LDOW2010].&lt;br /&gt;
&lt;br /&gt;
=== Participants ===&lt;br /&gt;
&lt;br /&gt;
# [[OlafHartig]]&lt;br /&gt;
# [[JuanSequeda]]&lt;br /&gt;
# [[DanielSchwabe]]&lt;br /&gt;
# [[RaphaelTroncy]]&lt;br /&gt;
# [[ChrisBizer]]&lt;br /&gt;
# [[AlexandrePassant]]&lt;br /&gt;
# [[MatthewRowe]]&lt;br /&gt;
# [[TomHeath]]&lt;br /&gt;
# [[RobVesse]]&lt;br /&gt;
# [[EmanueleDellaValle]]&lt;br /&gt;
# [[BernhardHaslhofer]]&lt;br /&gt;
# [[NikoPopitsch]]&lt;br /&gt;
# [[HughGlaser]]&lt;br /&gt;
# [[JohnSheridan]]&lt;br /&gt;
# [[JeniTennison]]&lt;br /&gt;
# [[DavyVanDeursen]]&lt;br /&gt;
# [[SamCoppens]]&lt;br /&gt;
# [[PaulGroth]]&lt;br /&gt;
# [[RichardCyganiak]]&lt;/div&gt;</description>
			<pubDate>Sat, 24 Apr 2010 23:51:35 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/Talk:SweoIG/TaskForces/CommunityProjects/LinkingOpenData/RaleighGathering</comments>		</item>
		<item>
			<title>Camps:LODCampW3CTrack</title>
			<link>http://www.w3.org/wiki/Camps:LODCampW3CTrack</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;/* Who's coming? */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
&amp;lt;!-- #acl All:read,write --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Linked Open Data (LOD) - W3C Track @ WWW2010 =&lt;br /&gt;
&lt;br /&gt;
== Context ==&lt;br /&gt;
The Linked Open Data Camp, organized by W3C, will be held at the upcoming [http://www2010.org 19th International World Wide Web Conference] in Raleigh, North Carolina (USA), on '''29 April 2010'''. See [http://www.w3.org/2010/04/w3c-track.html W3C Track @ WWW2010] for a more detailed agenda.&lt;br /&gt;
&lt;br /&gt;
The event will feature a mix of structured content (talks, demos, lightning talks, etc.) and unstructured content. Topics of discussion for the two afternoon sessions will be selected at the camp during the morning session. This Wiki page is intended to collect suggestions in advance and to record the discussions that will be held on site.&lt;br /&gt;
&lt;br /&gt;
If you're willing to lead a discussion, please add your name to a topic below. Thx!&lt;br /&gt;
&lt;br /&gt;
== Pre-camp Topic suggestions ==&lt;br /&gt;
Feel free to edit this section and append your own suggestion to the list or refine an already suggested topic!&lt;br /&gt;
&lt;br /&gt;
* Vocabularies&lt;br /&gt;
* Identity (alternatives and refinement for owl:sameAs)&lt;br /&gt;
* Crawling through the LOD&lt;br /&gt;
* Visualizing&lt;br /&gt;
* User Interfaces&lt;br /&gt;
* User scripts for server-side Semantic Apps&lt;br /&gt;
* Methods of mapping public databases to LOD&lt;br /&gt;
* ...&lt;br /&gt;
&lt;br /&gt;
&amp;lt;span id=&amp;quot;LT&amp;quot;&amp;gt;&amp;lt;/span&amp;gt;&lt;br /&gt;
=== Lightning Talks (LTs) ===&lt;br /&gt;
Anything from announcements, controversial statements, project proposals, observations, etc. is great material for a lightning talk. &lt;br /&gt;
&lt;br /&gt;
'''LT format''' = one slide (optional) and two minutes time (sharp). &lt;br /&gt;
&lt;br /&gt;
Suggested procedure: add your talk/topic here, or let EricP or IvanH know at least a bit before the session (just to keep track of how many people are interested).&lt;br /&gt;
&lt;br /&gt;
* [[HCLSIG/LODD| Linked Open Drug Data]]&lt;br /&gt;
   interested: ericP&lt;br /&gt;
* eGov opportunities&lt;br /&gt;
   interested: ericP&lt;br /&gt;
* Health Care/Life Sciences opportunities&lt;br /&gt;
   interested: ericP&lt;br /&gt;
* Financial and Business Data&lt;br /&gt;
   interested: Dave Raggett&lt;br /&gt;
* ...&lt;br /&gt;
&lt;br /&gt;
== Who's coming? ==&lt;br /&gt;
&lt;br /&gt;
If you're planning on participating in the camp, feel free to let others know by adding your name below:&lt;br /&gt;
&lt;br /&gt;
* Marie-Claire Forgue, W3C Track chair [mailto:mcf@w3.org contact]&lt;br /&gt;
* Ivan Herman, W3C&lt;br /&gt;
* Eric Prud'hommeaux, W3C&lt;br /&gt;
* Tim Berners-Lee, W3C&lt;br /&gt;
* Dave Raggett, W3C&lt;br /&gt;
* Fabien Gandon, INRIA&lt;br /&gt;
* [[RaphaelTroncy]], EURECOM&lt;br /&gt;
* Davy Van Deursen, Ghent University - IBBT&lt;br /&gt;
* [http://milstan.net Milan Stankovic], Hypios.com&lt;br /&gt;
* Harald Sack, Hasso-Plattner Institute (HPI), University of Potsdam&lt;br /&gt;
* Jeni Tennison, The Stationery Office&lt;br /&gt;
* Elena Montiel-Ponsoda, Ontology Engineering Group, UPM, Spain&lt;br /&gt;
* Richard Cyganiak, DERI&lt;br /&gt;
* ... 'add your name here!'&lt;/div&gt;</description>
			<pubDate>Tue, 20 Apr 2010 17:27:06 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/Camps_talk:LODCampW3CTrack</comments>		</item>
		<item>
			<title>User:Rcygania2/RulesOfThumb</title>
			<link>http://www.w3.org/wiki/User:Rcygania2/RulesOfThumb</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;/* Modelling a dataset in RDF */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;A loose collection of [[SemWeb]]-related rules of thumb that I would propose as good practice.&lt;br /&gt;
&lt;br /&gt;
= Modelling a dataset in RDF =&lt;br /&gt;
&lt;br /&gt;
== Labels ==&lt;br /&gt;
* Make sure that everything has an rdfs:label, either directly specified, or by using some property that is defined as a subproperty of rdfs:label&lt;br /&gt;
* Don't be overly concerned with ambiguous labels; just consider the resource in isolation. That's because labels cannot do the job of disambiguation anyway, and trying to do it results in artificial and awkward labels.&lt;br /&gt;
&lt;br /&gt;
== Language tags ==&lt;br /&gt;
* On untyped literals, if the literal is likely to be understood only by speakers of a single language, then add a language tag. If it is likely to work for speakers of many languages, keep it without a language tag. If the file or dataset has only a few exceptions, then it is perhaps better to go for consistency and mark them the same way as the rest of the file.&lt;br /&gt;
&lt;br /&gt;
= Designing vocabularies and ontologies =&lt;br /&gt;
&lt;br /&gt;
== Naming of properties ==&lt;br /&gt;
* Properties that point to documents (information resources) should have names that announce this fact, e.g. userProfile, userPage, userList, eventRecord, eventForm&lt;br /&gt;
* Relationship nouns make good property names, e.g. “parent” is better than “hasParent” or “isParentOf” (as per TimBL)&lt;br /&gt;
 Focus on one problem::&lt;br /&gt;
* [from an email to rdf-schema-dev on 2008-04-05] Some random half-formed thoughts: It's good if the vocabulary covers all my needs for a given problem. It's good if the vocabulary doesn't contain much extra stuff that I don't need to solve my problem. It's good if the purpose and coverage of the vocabulary can be conveyed in a short term or phrase (e.g. “document metadata” or “issue tracking”). It's good if the level of abstraction is consistent throughout the vocabulary, e.g. don't mix high-level concepts like Service and Container into your down-to-earth photo annotation vocabulary.&lt;br /&gt;
 Provide excellent documentation::&lt;br /&gt;
* [from an email to rdf-schema-dev on 2008-04-05] Random thoughts again: Some introductory narrative. A bunch of good examples for typical usages of the vocabulary. A UML-style overview diagram if the vocab has more than a few classes. Some tutorial-style text. An excellent reference section with all terms and notes about how they are supposed to be used (including notes on what they are NOT supposed to be used for).&lt;br /&gt;
&lt;br /&gt;
== Namespace URIs ==&lt;br /&gt;
From danbri in an email to the DC Architecture list, 30 March 2010 09:40:47 IST:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;blockquote&amp;gt;My current preference / advice (for new work) is for the managers of&lt;br /&gt;
each serious namespace to invest in a distinct domain name for it, and&lt;br /&gt;
for us as a community to come up with social machinery for 'watching&lt;br /&gt;
each other's backs' to ensure that the domains are kept in good&lt;br /&gt;
working order, fees are paid, etc. Sometimes an additional level of&lt;br /&gt;
indirection can add as much risk as it saves.  Initially when I bought&lt;br /&gt;
xmlns.com I have idea it could be a home for lots of namespaces, and&lt;br /&gt;
then the more I thought about it, the less I liked that idea. Each new&lt;br /&gt;
namespace added to the bucket brings some risk to the others using the&lt;br /&gt;
domain, by adding to the complexity and burden for subsequent&lt;br /&gt;
maintainers. So I think a proliferation of independent domain names,&lt;br /&gt;
while painful in its own way, spreads the risk...&amp;lt;/blockquote&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Later in the thread:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;blockquote&amp;gt;Rule of thumb - when wondering what info to include in a namespace&lt;br /&gt;
URI, ... try to leave *out* as much as possible&amp;lt;/blockquote&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Publishing RDF on the Web =&lt;br /&gt;
&lt;br /&gt;
== Metadata in RDF documents ==&lt;br /&gt;
* Every RDF document has ''some'' relation to the thing(s) it talks about. It is useful to explicitly state what that relation is. For example, in my FOAF file I state that I'm both the foaf:maker and the foaf:primaryTopic of the file. Other useful properties are: rdfs:isDefinedBy, foaf:topic. A concrete benefit is that consumers can pick out the “important” things from the graph.&lt;br /&gt;
&lt;br /&gt;
== HTML descriptions of URI-denoted things ==&lt;br /&gt;
* To create trust in the stability, reliability and availability of a URI, its HTML description should explicitly state the URI, it should contain an explicit ''Statement of Purpose'' to the effect that the URI is intended to be used as an identifier for the thing, and it should contain a ''Publisher Identification''. It must provide sufficient information to enable human users to know exactly what is being referred to. (This is inspired by [http://www.oasis-open.org/committees/download.php/3050/pubsubj-pt1-1.02-cs.pdf Published Subjects].)&lt;br /&gt;
&lt;br /&gt;
== Content negotiation ==&lt;br /&gt;
* The benefit of content negotiation (CN) is that all URIs also work in a standard Web browser, not just in RDF-enabled tools and browsers. Thus it's great for authoring and debugging, and for cases where your URIs are exposed to a lot of neophytes (e.g. DBpedia, FOAF, DC). On the other hand, CN is very hard to get right: the devil is in the details, and it has turned out to be quite an interop hassle in practice. So, CN should be thought of as icing on the cake, not as a requirement for publishing RDF.&lt;br /&gt;
* Rule of thumb: If your server solution does CN, then do CN. Otherwise, getting it right will be too much effort.&lt;br /&gt;
* Keep in mind the advice from [http://www.w3.org/2001/tag/doc/alternatives-discovery.html On Linking Alternative Representations]: Provide links between different variants to make them all accessible. This means, if some HTML can be returned in response to RDF/HTML negotiation, there should be an RDF icon nearby, which points to the RDF variant.&lt;br /&gt;
&lt;br /&gt;
== Blank nodes ==&lt;br /&gt;
* Should be avoided in general. Using a blank node is appropriate if the publisher thinks that no one should ever care about this resource except in the context of looking at another, identified, resource in the same RDF document, e.g. a geo:Point that exists solely to give the location of another resource. Another situation where a blank node would be appropriate is when used as an existential variable, but I've never seen them used that way in a Linked Data context.&lt;br /&gt;
&lt;br /&gt;
== Linked Data ==&lt;br /&gt;
* Have rdfs:label (or a subproperty thereof) on everything, always&lt;br /&gt;
* Have rdfs:label for the document URI, always&lt;br /&gt;
* Have a foaf:primaryTopic triple connecting document URI and main resource, always&lt;br /&gt;
* Have as much dc: metadata as possible on the document URI&lt;br /&gt;
* Think hard about possible external links, to other web pages and other RDF documents and entities. Provide as many as possible. These make all the difference.&lt;br /&gt;
&lt;br /&gt;
== URI design ==&lt;br /&gt;
* RDF URIs are always case sensitive, while from HTTP's point of view, some parts of the URI can change without changing any behaviour. So, be clear about the case of your URIs and stick to it once a decision is made. If in doubt, use as much lowercase as possible. (Story: L3S changed case of the domain name in their URIs, broke a Semantic Web Pipes demo.)&lt;br /&gt;
&lt;br /&gt;
= Web Architecture =&lt;br /&gt;
&lt;br /&gt;
== Information resources ==&lt;br /&gt;
* [http://lists.w3.org/Archives/Public/www-tag/2007Sep/0123.html From Harry Halpin]: “If there is a URI that is used to identify a resource one would want to make logical statements about, and these statements do not apply to possible representations of that resource, then one should use the &amp;quot;hash&amp;quot; or 303 redirection to separate  these URIs.”&lt;/div&gt;</description>
			<pubDate>Mon, 19 Apr 2010 17:35:35 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/User_talk:Rcygania2/RulesOfThumb</comments>		</item>
		<item>
			<title>User:Rcygania2/RulesOfThumb</title>
			<link>http://www.w3.org/wiki/User:Rcygania2/RulesOfThumb</link>
			<description>&lt;p&gt;Rcygania2:&amp;#32;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;A loose collection of [[SemWeb]]-related rules of thumb that I would propose as good practice.&lt;br /&gt;
&lt;br /&gt;
= Modelling a dataset in RDF =&lt;br /&gt;
&lt;br /&gt;
== Language tags ==&lt;br /&gt;
* On untyped literals, if the literal is likely to be understood only by speakers of a single language, then add a language tag. If it is likely to work for speakers of many languages, keep it without a language tag. If the file or dataset has only a few exceptions, then it is perhaps better to go for consistency and mark them the same way as the rest of the file.&lt;br /&gt;
&lt;br /&gt;
= Designing vocabularies and ontologies =&lt;br /&gt;
&lt;br /&gt;
== Naming of properties ==&lt;br /&gt;
* Properties that point to documents (information resources) should have names that announce this fact, e.g. userProfile, userPage, userList, eventRecord, eventForm&lt;br /&gt;
* Relationship nouns make good property names, e.g. “parent” is better than “hasParent” or “isParentOf” (as per TimBL)&lt;br /&gt;
 Focus on one problem::&lt;br /&gt;
* [from an email to rdf-schema-dev on 2008-04-05] Some random half-formed thoughts: It's good if the vocabulary covers all my needs for a given problem. It's good if the vocabulary doesn't contain much extra stuff that I don't need to solve my problem. It's good if the purpose and coverage of the vocabulary can be conveyed in a short term or phrase (e.g. “document metadata” or “issue tracking”). It's good if the level of abstraction is consistent throughout the vocabulary, e.g. don't mix high-level concepts like Service and Container into your down-to-earth photo annotation vocabulary.&lt;br /&gt;
 Provide excellent documentation::&lt;br /&gt;
* [from an email to rdf-schema-dev on 2008-04-05] Random thoughts again: Some introductory narrative. A bunch of good examples for typical usages of the vocabulary. A UML-style overview diagram if the vocab has more than a few classes. Some tutorial-style text. An excellent reference section with all terms and notes about how they are supposed to be used (including notes on what they are NOT supposed to be used for).&lt;br /&gt;
&lt;br /&gt;
== Namespace URIs ==&lt;br /&gt;
From danbri in an email to the DC Architecture list, 30 March 2010 09:40:47 IST:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;blockquote&amp;gt;My current preference / advice (for new work) is for the managers of&lt;br /&gt;
each serious namespace to invest in a distinct domain name for it, and&lt;br /&gt;
for us as a community to come up with social machinery for 'watching&lt;br /&gt;
each other's backs' to ensure that the domains are kept in good&lt;br /&gt;
working order, fees are paid, etc. Sometimes an additional level of&lt;br /&gt;
indirection can add as much risk as it saves.  Initially when I bought&lt;br /&gt;
xmlns.com I have idea it could be a home for lots of namespaces, and&lt;br /&gt;
then the more I thought about it, the less I liked that idea. Each new&lt;br /&gt;
namespace added to the bucket brings some risk to the others using the&lt;br /&gt;
domain, by adding to the complexity and burden for subsequent&lt;br /&gt;
maintainers. So I think a proliferation of independent domain names,&lt;br /&gt;
while painful in its own way, spreads the risk...&amp;lt;/blockquote&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Later in the thread:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;blockquote&amp;gt;Rule of thumb - when wondering what info to include in a namespace&lt;br /&gt;
URI, ... try to leave *out* as much as possible&amp;lt;/blockquote&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Publishing RDF on the Web =&lt;br /&gt;
&lt;br /&gt;
== Metadata in RDF documents ==&lt;br /&gt;
* Every RDF document has ''some'' relation to the thing(s) it talks about. It is useful to explicitly state what that relation is. For example, in my FOAF file I state that I'm both the foaf:maker and the foaf:primaryTopic of the file. Other useful properties are: rdfs:isDefinedBy, foaf:topic. A concrete benefit is that consumers can pick out the “important” things from the graph.&lt;br /&gt;
&lt;br /&gt;
== HTML descriptions of URI-denoted things ==&lt;br /&gt;
* To create trust in the stability, reliability and availability of a URI, its HTML description should explicitly state the URI, it should contain an explicit ''Statement of Purpose'' to the effect that the URI is intended to be used as an identifier for the thing, and it should contain a ''Publisher Identification''. It must provide sufficient information to enable human users to know exactly what is being referred to. (This is inspired by [http://www.oasis-open.org/committees/download.php/3050/pubsubj-pt1-1.02-cs.pdf Published Subjects].)&lt;br /&gt;
&lt;br /&gt;
== Content negotiation ==&lt;br /&gt;
* The benefit of content negotiation (CN) is that all URIs also work in a standard Web browser, not just in RDF-enabled tools and browsers. Thus it's great for authoring and debugging, and for cases where your URIs are exposed to a lot of neophytes (e.g. DBpedia, FOAF, DC). On the other hand, CN is very hard to get right: the devil is in the details, and it has turned out to be quite an interop hassle in practice. So, CN should be thought of as icing on the cake, not as a requirement for publishing RDF.&lt;br /&gt;
* Rule of thumb: If your server solution does CN, then do CN. Otherwise, getting it right will be too much effort.&lt;br /&gt;
* Keep in mind the advice from [http://www.w3.org/2001/tag/doc/alternatives-discovery.html On Linking Alternative Representations]: Provide links between different variants to make them all accessible. This means, if some HTML can be returned in response to RDF/HTML negotiation, there should be an RDF icon nearby, which points to the RDF variant.&lt;br /&gt;
&lt;br /&gt;
== Blank nodes ==&lt;br /&gt;
* Should be avoided in general. Using a blank node is appropriate if the publisher thinks that no one should ever care about this resource except in the context of looking at another, identified, resource in the same RDF document, e.g. a geo:Point that exists solely to give the location of another resource. Another situation where a blank node would be appropriate is when used as an existential variable, but I've never seen them used that way in a Linked Data context.&lt;br /&gt;
&lt;br /&gt;
== Linked Data ==&lt;br /&gt;
* Have rdfs:label (or a subproperty thereof) on everything, always&lt;br /&gt;
* Have rdfs:label for the document URI, always&lt;br /&gt;
* Have a foaf:primaryTopic triple connecting document URI and main resource, always&lt;br /&gt;
* Have as much dc: metadata as possible on the document URI&lt;br /&gt;
* Think hard about possible external links, to other web pages and other RDF documents and entities. Provide as many as possible. These make all the difference.&lt;br /&gt;
&lt;br /&gt;
== URI design ==&lt;br /&gt;
* RDF URIs are always case sensitive, while from HTTP's point of view, some parts of the URI can change without changing any behaviour. So, be clear about the case of your URIs and stick to it once a decision is made. If in doubt, use as much lowercase as possible. (Story: L3S changed case of the domain name in their URIs, broke a Semantic Web Pipes demo.)&lt;br /&gt;
&lt;br /&gt;
= Web Architecture =&lt;br /&gt;
&lt;br /&gt;
== Information resources ==&lt;br /&gt;
* [http://lists.w3.org/Archives/Public/www-tag/2007Sep/0123.html From Harry Halpin]: “If there is a URI that is used to identify a resource one would want to make logical statements about, and these statements do not apply to possible representations of that resource, then one should use the &amp;quot;hash&amp;quot; or 303 redirection to separate  these URIs.”&lt;/div&gt;</description>
			<pubDate>Fri, 02 Apr 2010 18:41:28 GMT</pubDate>			<dc:creator>Rcygania2</dc:creator>			<comments>http://www.w3.org/wiki/User_talk:Rcygania2/RulesOfThumb</comments>		</item>
	</channel>
</rss>