Re: dwbp-ACTION-123: Call for comments

Hi everyone, some comments inline below.

On 09/12/2014 18:38, Annette Greiner wrote:
> Thanks for writing up a nice introduction to metadata. I really like that you addressed the issues of different granularity and different types. We may not even need to include the term as something readers need to be familiar with in advance.

+1


> In general, I like the idea of defining terms where they are first
> used in the text. I tend to think we should consider both technical
> people and their managers when determining what level of technicality to
> write to, so that someone charged with publishing data on the web can
> easily point a senior decision-maker to specific best practices in order
> to get buy-in.

+1

>
> Because we are really targeting publishers of data, I think the first few sentences are unnecessary.

-1

I don't agree that we're only targeting publishers. The charter includes 
this:

"Developers would like easy access to data that is 100% accurate, 
regularly updated and guaranteed to be available at all times. Data 
publishers are likely to take a different view. There are disparities 
between different developers too: for many, data means CSV files and 
APIs, for others it means linked data and the two sides are often 
disparaging of each other."

So it talks about developers as much as publishers.

The data usage vocab is a clear example where users are in scope.

It's often the case that data publishers are also data users and, 
actually, I rather like the term data broker (a much nicer word than the 
horrible mangling of the English language that the European Commission 
uses: 'Infomediary' - gah!). Data brokers seem particularly relevant if 
we're talking about data enrichment etc.

So personally, I like a lot of what Laufer has captured, modulo trivial 
editorial nitpicks.

Oh heck, the only way I can explain what I mean is to edit it...

[An hour passes]

Right. I've edited the metadata section in my current fork of the doc at 
http://philarcher1.github.io/dwbp-1/bp.html#metadata

My edited version of Laufer's text presents two primary classes of 
stakeholder - publisher and consumer - but then goes on to say that 
there are many other roles including data brokers.

Incidentally, I had to look up the word 'subjacent.' I kicked off a 
short Twitter discussion and, after that, changed it to 'underlying.'
https://twitter.com/philarcher1/status/543014767884259328

> You could start with the sentence, “Metadata is data about data.”
> That nicely clues the reader to the fact that this is an introduction
> that will explain what metadata is.
>
> I don’t understand why there is a paragraph about distribution formats included here. Not only is it out of scope, it seems largely off topic.

Yes and no. I don't think we can be completely silent about different 
data formats. What we can do, as I've tried to do in my edits, is to say 
that the intentions are normative (an instance of an RFC 2119 keyword is 
coming up shortly) but that the implementation is a suggestion. More in a sec.

>
> I think we should have here some explicit best practices that are about metadata more generally than specific fields, like “metadata should be available in human readable and machine-readable forms”.

+1

Actually, the fact that you have to provide metadata at all is a BP as 
far as I'm concerned. So I wrote it out. That gave me a chance to write 
an actual best practice, which is not as easy as one might imagine, even 
for one as basic as "provide metadata."

I used RFC 2119 in the Intended Outcome section. My proposal is that 
each BP has such a keyword (MUST, SHOULD, MAY).

Several more BPs could also be drawn from Laufer's paragraphs:

1. Human and Machine Readable
2. Standard vocabularies
3. A BP on descriptive metadata - in more detail than in the BP already 
provided.
4. A BP on structural metadata - ditto.
5. Domain-specific (I'm sure Annette and Eric S can come up with 
examples), mine might be GTFS for transport data.

> That is a best practice in itself, so I think it should get more than
> just a mention in the introduction.
>
> The organization of the numbered sections is confusing to me. The last sentence of the intro suggests that the data licenses and other sections below are subsections of metadata, but the numbers indicate otherwise, and it’s not at all clear where the metadata section is meant to end. There is also an allusion to an introduction for a “data organization” subsection that seems to be between the metadata level and the examples of metadata.
>
> In a larger issue, probably not something we can address in the current draft, I’m not sure that the data lifecycle-based document structure is very helpful in terms of finding a specific best practice. I’m finding it difficult to guess where things are. In a way, everything should fit under the rubric of best practices for data publication.

Bernadette has answered these last two points.

HTH

Phil.
(goes off to write to GitHub guru Yaso to work out how the heck to force 
a merge...)


>
> --
> Annette Greiner
> NERSC Data and Analytics Services
> Lawrence Berkeley National Laboratory
> 510-495-2935
>
> On Dec 5, 2014, at 9:38 AM, Laufer <laufer@globo.com> wrote:
>
>> Hello all,
>>
>> I wrote a description for the beginning of the metadata section and I want to ask the group to comment:
>>
>> http://w3c.github.io/dwbp/bp.html#metadata
>>
>> Thank you.
>>
>> Cheers,
>> Laufer
>>
>> --
>> .  .  .  .. .  .
>> .        .   . ..
>> .     ..       .
>
>

-- 


Phil Archer
W3C Data Activity Lead
http://www.w3.org/2013/data/

http://philarcher.org
+44 (0)7887 767755
@philarcher1

Received on Thursday, 11 December 2014 15:13:15 UTC