Re: Bioschemas profile specification: how to name a profile

Second sub-thread: how to name a profile?

Three options are being discussed:
(a) The context defines the profile name to be the chosen type URI, e.g.
"Protein": { "@id": "http://purl.obolibrary.org/obo/PR_000000001" }
(b) The context defines a type within the http://bioschemas.org namespace,
such as http://bioschemas.org/Protein. This is a hollow shell that just
denotes that we're talking about a Bioschemas profile.
(c) We use the new schema.org concepts of DefinedTerm and DefinedTermSet,
as in the example provided by Mélanie:
     "@type": "DefinedTerm",
     "@id": "http://purl.obolibrary.org/obo/PR_000000001",
     "name": "Protein",
     "inDefinedTermSet": "http://bioschemas.org/terms",
     "description": "An amino acid chain that is produced de novo by ribosome-mediated translation of a genetically-encoded mRNA.",
     "sameAs": [
         "http://purl.obolibrary.org/obo/NCIT_C17021",
         "http://semanticscience.org/resource/SIO_010043"
     ]

Here are a few thoughts with respect to these options:

My concern with (a) is that a JSON-LD context is only a convenient way to
write data: the string "Protein" is merely a shorthand and could be named
anything else. A webpage may use it this way:
     "@type": "Protein"
But it would be perfectly equivalent _not_ to use the context and write
this instead:
     "@type": "http://purl.obolibrary.org/obo/PR_000000001"
My point is that a tool extracting Bioschemas markup should _not_ rely
on the use of any specific shorthand.
Besides, relying on the shorthand would restrict Bioschemas to JSON-LD,
but what about webpages using other markup formats? Or am I missing
something here?
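
To illustrate the equivalence, here is a minimal Python sketch. It is not
a real JSON-LD processor: expand_type is a hypothetical helper that only
resolves a single string @type against a flat context, and the context
itself is an assumed example of what option (a) would publish.

```python
# Assumed example of the context that option (a) would publish.
BIOSCHEMAS_CONTEXT = {
    "Protein": {"@id": "http://purl.obolibrary.org/obo/PR_000000001"},
}


def expand_type(node, context=None):
    """Return the node's @type as a full URI.

    Simplified stand-in for JSON-LD expansion: it only handles a single
    string @type and a flat term -> {"@id": ...} (or term -> URI) context.
    """
    value = node["@type"]
    term = (context or {}).get(value)
    if term is None:
        # Not a known shorthand: assume the value is already a full URI.
        return value
    return term["@id"] if isinstance(term, dict) else term


# The same statement written with and without the shorthand.
with_context = {"@type": "Protein"}
without_context = {"@type": "http://purl.obolibrary.org/obo/PR_000000001"}

uri_a = expand_type(with_context, BIOSCHEMAS_CONTEXT)
uri_b = expand_type(without_context)
print(uri_a == uri_b)
```

An extractor that compares expanded URIs, as above, does not care which
shorthand (if any) the page author used.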

Hence, I'm more inclined to go for (b), which defines a hollow shell for
each profile, such as http://bioschemas.org/Protein. The advantage is
that the type URI is always the same, whether or not a webpage uses the
Bioschemas context. And this works the same across markup formats:
JSON-LD, RDFa, etc.
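
A rough Python sketch of what this buys an extractor, under stated
assumptions: the triples below are hand-written stand-ins for what a
JSON-LD parser and an RDFa parser might produce from two hypothetical
pages, and subjects_of_profile is an illustrative helper, not part of any
existing tool.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BIOSCHEMAS_PROTEIN = "http://bioschemas.org/Protein"


def subjects_of_profile(triples, profile_uri):
    """Return the sorted subjects whose rdf:type is the given profile URI."""
    return sorted({s for s, p, o in triples if p == RDF_TYPE and o == profile_uri})


# Hypothetical parser output for a page marked up with JSON-LD ...
from_jsonld = [
    ("http://example.org/page1#protein", RDF_TYPE, BIOSCHEMAS_PROTEIN),
    ("http://example.org/page1#protein", "http://schema.org/name", "p53"),
]

# ... and for another page marked up with RDFa.
from_rdfa = [
    ("http://example.org/page2#protein", RDF_TYPE, BIOSCHEMAS_PROTEIN),
    ("http://example.org/page2#protein", "http://schema.org/name", "BRCA1"),
]

# The same check works on both, since only the type URI is compared.
found_jsonld = subjects_of_profile(from_jsonld, BIOSCHEMAS_PROTEIN)
found_rdfa = subjects_of_profile(from_rdfa, BIOSCHEMAS_PROTEIN)
print(found_jsonld, found_rdfa)
```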

(c) seems an interesting alternative. Instead of defining a JSON-LD
context, we would define a Bioschemas vocabulary by means of
DefinedTerms. For now, I don't quite understand how we would refer to
the "Protein" defined term in a webpage's markup. Any clues?
Advantage: this solution avoids defining a Bioschemas profile as a type,
as option (b) does, which blurs the distinction between a type and a
profile.
Still, I agree with Justin that specific code would be needed to cope
with such DefinedTerms. However, is this really an issue? In any case, a
Bioschemas extractor tool will have to know the profile specifications
to figure out what to look for. Also, this is not much different from
the additionalProperty case: there has to be some specific code to cope
with it too. Right?
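
To give an idea of the kind of specific code involved, here is a minimal
Python sketch. It assumes the term set is available as a list of
DefinedTerm objects like Mélanie's example (with sameAs as a list), and
index_defined_terms is a hypothetical helper. How a webpage would refer
to a term is still the open question above; the sketch only covers the
lookup side.

```python
def as_list(value):
    """Wrap a single value in a list; leave lists untouched."""
    return value if isinstance(value, list) else [value]


def index_defined_terms(terms):
    """Map every known URI (@id and sameAs) of a DefinedTerm to its name."""
    index = {}
    for term in terms:
        if term.get("@type") != "DefinedTerm":
            continue  # ignore anything that is not a DefinedTerm
        for uri in [term["@id"], *as_list(term.get("sameAs", []))]:
            index[uri] = term["name"]
    return index


# Assumed content of the term set, based on the example in this thread.
TERMS = [
    {
        "@type": "DefinedTerm",
        "@id": "http://purl.obolibrary.org/obo/PR_000000001",
        "name": "Protein",
        "inDefinedTermSet": "http://bioschemas.org/terms",
        "sameAs": [
            "http://purl.obolibrary.org/obo/NCIT_C17021",
            "http://semanticscience.org/resource/SIO_010043",
        ],
    },
]

index = index_defined_terms(TERMS)
# A URI from an equivalent ontology resolves to the same profile name.
print(index.get("http://semanticscience.org/resource/SIO_010043"))
```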

Franck.


On 28/06/2018 at 19:40, Justin Clark-Casey wrote:
> On Thu, 28 Jun 2018 at 16:42, ljgarcia <ljgarcia@ebi.ac.uk> wrote:
>
>     Hi,
>
>     What Melanie suggests is useful to describe profiles: they would
>     become a DefinedTerm. That would also help to avoid type/profile
>     confusion. We would then talk about DefinedTerms. If we find a way
>     to also describe the properties accepted with their restrictions,
>     that would be even better. That might be a good subject for a
>     different discussion.
>
>
> This means there will have to be special Bioschemas code that knows to 
> look in a DefinedTerm somewhere for this information.  I still think 
> using a subtype to signify a profile will be simpler.
>
> I also disagree with Alasdair in that I think there should be a
> http://bioschemas.org/Protein type.  This would be an empty type that
> just signifies we're talking about a Bioschemas-defined protein, so it
> isn't treading on anybody's toes.  This would have information saying
> it's defined by http://purl.obolibrary.org/obo/PR_000000001 and its
> sameAs terms.  Without this, there's not much point having a
> bioschemas context, and requiring people to use this specific string
> every time is cumbersome, especially if every group chooses something
> from a different ontology.  This makes writing and consuming markup
> harder.
>
>
>     The question remains. How do we choose one term over others to
>     associate with a profile/DefinedTerm?
>
>
> I suggest having members of each specification group propose which 
> term they want and then come to consensus via discussion and/or vote.
>
>
>     Regards,
>
>
>     On 2018-06-28 15:45, Melanie Courtot wrote:
>     > Hi,
>     >
>     > We could consider using the defined terms,
>     >
>     > https://dataliberate.com/2018/06/18/schema-org-introduces-defined-terms/,
>     > to do that.
>     >
>     > So have a protein be defined as
>     >
>     >     "@type": "DefinedTerm",
>     >     "@id": "http://purl.obolibrary.org/obo/PR_000000001",
>     >     "name": "Protein",
>     >     "inDefinedTermSet": "http://bioschemas.org/terms",
>     >     "description": "An amino acid chain that is produced de novo by ribosome-mediated translation of a genetically-encoded mRNA.",
>     >     "sameAs": [
>     >         "http://purl.obolibrary.org/obo/NCIT_C17021",
>     >         "http://semanticscience.org/resource/SIO_010043"
>     >     ]
>     >
>     > (Using random examples of sameAs from
>     > https://www.ebi.ac.uk/ols/search?q=protein)
>     >
>     > Cheers,
>     > Melanie
>     >
>     > ---
>     > Melanie Courtot, PhD
>     > EMBL-EBI
>     > GA4GH/BioSamples project lead
>     >
>     >> On 28 Jun 2018, at 15:18, ljgarcia <ljgarcia@ebi.ac.uk> wrote:
>     >> Hi,
>     >>
>     >> I understood Franck's question in a different way.
>     >>
>     >> Alasdair says
>     >>
>     >>> I also agree that a context file should be provided which has the
>     >>> chosen types and terms in it, i.e. the context file would define
>     >>> Protein to be the URI http://purl.obolibrary.org/obo/PR_000000001.
>     >>
>     >> I think what Franck is asking is how to choose
>     >> http://purl.obolibrary.org/obo/PR_000000001 over other possible
>     >> terms to define a Protein. For the taxon case, same as it happens
>     >> with proteins, there are multiple possibilities. Franck, is this
>     >> your question? If it is, I do not think there is any agreement on
>     >> how to choose, other than going for well-known ontologies broadly
>     >> accepted by the community of interest, even better if the term is
>     >> mapped to other possible ones.
>     >>
>     >> Regards,
>     >>
>     >> On 2018-06-28 11:50, Gray, Alasdair J G wrote:
>     >>> On 27 Jun 2018, at 19:19, Justin Clark-Casey
>     >>> <justinccdev@gmail.com> wrote:
>     >>>> I think we should have mandatory known @types and properties.  In
>     >>>> my view, Bioschemas should be as easy as possible to write and
>     >>>> consume.  Multiple options will increase cognitive load on writers
>     >>>> (which one do I choose?  Why are these 2 examples using these
>     >>>> different terms?) and open the door to greater inconsistency.
>     >>>> Non-mandatory types will also raise the barriers for writing
>     >>>> Bioschemas software that will have to be aware of equivalent
>     >>>> mappings.
>     >>>
>     >>> I completely agree that we should have a single approved type for
>     >>> each profile, and likewise for each property a single chosen term.
>     >>> This is the whole point of having the profiles.
>     >>>
>     >>>> I would go one step further and say that Bioschemas should provide
>     >>>> an http://bioschemas.org context that will define types such as
>     >>>> Taxon, rather than blessing particular ontology terms.
>     >>>
>     >>> I also agree that a context file should be provided which has the
>     >>> chosen types and terms in it, i.e. the context file would define
>     >>> Protein to be the URI http://purl.obolibrary.org/obo/PR_000000001.
>     >>> To be completely explicit, we would not be defining a type in the
>     >>> bioschemas namespace, e.g. http://bioschemas.org/Protein.
>     >>>
>     >>>> This context can also document equivalent terms in different
>     >>>> ontologies.
>     >>>
>     >>> I like the idea that this also contains mappings to the equivalent
>     >>> terms in other ontologies.
>     >>>
>     >>> Alasdair
>     >>>
>     >>> Alasdair J G Gray
>     >>> Fellow of the Higher Education Academy
>     >>> Assistant Professor in Computer Science,
>     >>> School of Mathematical and Computer Sciences
>     >>> (Athena SWAN Bronze Award)
>     >>> Heriot-Watt University, Edinburgh UK.
>     >>> Email: A.J.G.Gray@hw.ac.uk
>     >>> Web: http://www.macs.hw.ac.uk/~ajg33
>     >>> ORCID: http://orcid.org/0000-0002-5711-4872
>     >>> Office: Earl Mountbatten Building 1.39
>     >>> Twitter: @gray_alasdair
>

Received on Friday, 29 June 2018 10:29:53 UTC