This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 17417 - Define a security model for requesting access to the MIDIAccess interface
Summary: Define a security model for requesting access to the MIDIAccess interface
Status: CLOSED MOVED
Alias: None
Product: AudioWG
Classification: Unclassified
Component: MIDI API
Version: unspecified
Hardware: PC All
Importance: P2 normal
Target Milestone: TBD
Assignee: Chris Wilson
Reported: 2012-06-05 12:47 UTC by Michael[tm] Smith
Modified: 2013-01-11 09:00 UTC

Attachments
UA Permissioning (67.43 KB, image/jpeg)
2012-12-24 08:45 UTC, Marcos Caceres
Mockup of permissioning model (site preferences) (67.43 KB, text/plain)
2012-12-24 08:48 UTC, Marcos Caceres

Description Michael[tm] Smith 2012-06-05 12:47:58 UTC
Audio-ISSUE-104: Define a security model for requesting access to the MIDIAccess interface [MIDI API]

http://www.w3.org/2011/audio/track/issues/104

Raised by: Jussi Kalliokoski
On product: MIDI API

The initial idea was that we'd use getUserMedia("midi") but this is potentially confusing, as MIDIAccess is not a MediaStream. Maybe extend the Navigator object with a similar function, such as getMIDIAccess(successCallback, ?failureCallback)?
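
To make the suggestion concrete, here is a minimal TypeScript sketch of a callback-style entry point shaped like the getMIDIAccess(successCallback, ?failureCallback) proposed above. Everything in it is hypothetical: `MIDIAccessLike`, `makeNavigatorLike`, and the port names are placeholders for illustration, not part of any specification or browser API.

```typescript
// Hypothetical placeholder for the MIDIAccess object (not a MediaStream).
interface MIDIAccessLike {
  inputs: string[];  // input port names, placeholder representation
  outputs: string[]; // output port names, placeholder representation
}

interface NavigatorLike {
  getMIDIAccess(
    successCallback: (access: MIDIAccessLike) => void,
    failureCallback?: (error: Error) => void
  ): void;
}

// Builds a stand-in navigator. `granted` simulates the outcome of whatever
// security model ends up being defined; it is an assumption of this sketch.
function makeNavigatorLike(granted: boolean): NavigatorLike {
  return {
    getMIDIAccess(successCallback, failureCallback) {
      if (granted) {
        successCallback({ inputs: ["keyboard"], outputs: ["synth"] });
      } else if (failureCallback !== undefined) {
        // The failure callback is optional, so it is only called if supplied.
        failureCallback(new Error("MIDI access denied"));
      }
    },
  };
}
```

A caller would write something like `makeNavigatorLike(true).getMIDIAccess(access => use(access), err => report(err))`, which keeps MIDI access separate from getUserMedia and its MediaStream result.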
Comment 1 Olivier Thereaux 2012-06-06 14:33:06 UTC
From the editor: change set https://dvcs.w3.org/hg/audio/rev/1ab2a972b9bc adds a security model for the MIDI API. 

Please review.
Comment 2 Olivier Thereaux 2012-06-15 11:35:51 UTC
Seeing no objection after more than a week, closing.
Comment 3 Chris Wilson 2012-12-13 19:14:31 UTC
Open question of what precisely the security model around MIDI should be, and what the terminology should be around prompting the user.
Comment 4 Chris Wilson 2012-12-13 19:22:33 UTC
On Thursday, December 13, 2012 at 1:53 PM, Marcos Caceres wrote:
> Obtains an interface to enumerate and request access to MIDI devices on the user's system.
>
> This call may prompt the user for access to MIDI devices.
The above needs to be a SHOULD.
> If the user accepts

accepts should be "If the user gives express permission"
> or the call is otherwise approved, successCallback is invoked, with a MIDIAccess object as its argument.

>
> If the user declines or the call is denied, the errorCallback (if any) is invoked.
All the above should really be in the algorithm or all this should be labelled as non-normative (i.e., this is a note of how it works conceptually, but can't be implemented).
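
The flow described in the quoted text could be sketched roughly as follows. This is only an illustration under stated assumptions: `requestMIDIAccess`, `AccessPolicy`, `AccessHandle`, and the origin strings are hypothetical names, and how a UA actually reaches its decision (by prompting or otherwise) is exactly what remains open in this bug.

```typescript
type Decision = "granted" | "denied";

// Hypothetical policy hooks; a real UA may decide in entirely different ways.
interface AccessPolicy {
  // A decision made without involving the user, or undefined if there is none.
  preApproved(origin: string): Decision | undefined;
  // Asks the user; only reached when there is no pre-existing decision.
  promptUser(origin: string): Decision;
}

// Placeholder for the MIDIAccess object handed to the success callback.
interface AccessHandle {
  origin: string;
}

function requestMIDIAccess(
  origin: string,
  policy: AccessPolicy,
  successCallback: (access: AccessHandle) => void,
  errorCallback?: (error: Error) => void
): void {
  // "or the call is otherwise approved": consult any existing decision first.
  let decision = policy.preApproved(origin);
  if (decision === undefined) {
    // "This call may prompt the user": prompting is one option, not a given.
    decision = policy.promptUser(origin);
  }
  if (decision === "granted") {
    successCallback({ origin });
  } else if (errorCallback !== undefined) {
    // "the errorCallback (if any) is invoked"
    errorCallback(new Error("MIDI access denied for " + origin));
  }
}
```

Whether the `promptUser` step is a SHOULD, a MAY, or purely informative is the normative question; the sketch only shows that both paths end in the same two callbacks.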
Comment 5 Jussi Kalliokoski 2012-12-13 20:20:25 UTC
I agree that the word should be "SHOULD". After all, it's the ideal, and "SHOULD" still isn't "MUST".
Comment 6 Chris Wilson 2012-12-13 21:04:58 UTC
(In reply to comment #5)
> I agree that the word should be "SHOULD". After all, it's the ideal, and
> "SHOULD" still isn't "MUST".

It's true, SHOULD isn't MUST - but I've become much less convinced there's a real fingerprinting issue here, particularly since Java has had unprompted MIDI support for a very long time - and the exploits would be VERY uncommon and very equipment-dependent.  I'm exploring internally with security folks to get their sense, but I don't think that the UA SHOULD prompt the user in the default case.
Comment 7 Jussi Kalliokoski 2012-12-14 09:33:54 UTC
(In reply to comment #6)
> (In reply to comment #5)
> > I agree that the word should be "SHOULD". After all, it's the ideal, and
> > "SHOULD" still isn't "MUST".
> 
> It's true, SHOULD isn't MUST - but I've become much less convinced there's a
> real fingerprinting issue here, particularly since Java has had unprompted
> MIDI support for a very long time

Yes, Java is quite well-known for its security features... Hahaha, sorry, that fruit was hanging way too low for me to resist.

> and the exploits would be VERY uncommon
> and very equipment-dependent.  I'm exploring internally with security folks
> to get their sense, but I don't think that the UA SHOULD prompt the user in
> the default case.

I agree with you on exploits, they're likely to be very uncommon and relatively meaningless, but they're still exploits. The last thing we need is more attack-vector surface on the web.

As for fingerprinting, if the default is not to ask, we void every other working group's often extreme efforts to avoid user fingerprinting and practically give the user's identity on a plate to anyone who wants to take it. That is, if they have any distinguishable MIDI devices. The main reason Java's MIDI API isn't used for fingerprinting often is that it's not very subtle (you want fingerprinting to be subtle). Add that to the fact that just the MIDI information isn't enough to form a reliable pool of entropy to identify users (usually), and it's not a very tempting choice. However, if the user doesn't even notice that you're getting the info, it's a very nice source of entropy. We don't want to add a freebie to the already-too-large pool of entropy each user carries with their browsing session.
Comment 8 Florian Bomers 2012-12-14 15:09:20 UTC
I've always had second thoughts about the fact that MIDI access wasn't governed by a security manager in Java. After all, an exploit is not impossible: with MIDI, we're often communicating directly with kernel drivers, and there are many BAD drivers around. At least a denial of service attack seems possible, provided that you find a corresponding bug.

Also, MIDI can be used with virtual ports to communicate outside any sandbox. E.g. http://audiob.us/ on iOS, which started off by using a virtual MIDI port to transport audio data from app to app in real time (something which is normally not possible due to the sandbox). However, Apple seems to allow this.

Do audio streams require an explicit acknowledgement of the user?
Comment 9 Jussi Kalliokoski 2012-12-14 15:32:35 UTC
(In reply to comment #8)
> Do audio streams require an explicit acknowledgement of the user?

Depends. Generally, you need permission to read streams, e.g. Web Audio API doesn't require explicit permission while accessing a microphone with MediaStreams does. On iOS (afaik), though, you need user interaction to activate any audio playback in the browser.
Comment 10 Marcos Caceres 2012-12-24 08:45:27 UTC
Created attachment 1303 [details]
UA Permissioning

(of course, this won't go into the spec)... this is what I was thinking for the permission model, except the lists would be broken into inputs and outputs. Permissioning then just becomes part of the site preferences of a UA. 

I've been working on implementing a mockup of this (based on Chris' implementation):
 
http://marcoscaceres.github.com/WebMIDIAPIShim/

Though I have not yet added the ability for the user to select individual inputs and outputs. Will add that over next few days.
Comment 11 Marcos Caceres 2012-12-24 08:48:04 UTC
Created attachment 1304 [details]
Mockup of permissioning model (site preferences)

Comment 12 Chris Wilson 2012-12-25 23:14:02 UTC
(In reply to comment #11)
> Created attachment 1304 [details]
> Mockup of permissioning model (site preferences)
> 
> (of course, this won't go into the spec)... this is what I was thinking for
> the permission model, except the lists would be broken into inputs and
> outputs. Permissioning then just becomes part of the site preferences of a
> UA. 
> 
> I've been working on implementing a mockup of this (based on Chris'
> implementation):
>  
> http://marcoscaceres.github.com/WebMIDIAPIShim/
> 
> Though I have not yet added the ability for the user to select individual
> inputs and outputs. Will add that over next few days.

This level of tweakiness is exactly what I'm worried about.  I don't think any sane user will walk through the list of their available MIDI ports and carefully select which they're comfortable "sharing" with a web application - they'll either say OK or not OK.  And even that, I'd like to minimize as much as possible, and I want the spec to continue to make it clear that the implementation does not NEED to prompt the user with UI; this may come inside a web app that has permissions already set, or in a loose environment that has already had MIDI access approved.
Comment 13 Marcos Caceres 2012-12-25 23:40:15 UTC
> This level of tweakiness is exactly what I'm worried about.  I don't think
> any sane user will walk through the list of their available MIDI ports and
> carefully select which they're comfortable "sharing" with a web application
> - they'll either say OK or not OK. 

Right, but there can be multiple representations of this dialog. I personally like having the ability to choose, if only because it allows me as a user to see that everything is plugged in (I know, different use case... but it's still related because I may choose to unplug something at this point for privacy/personal reasons). 

Another representation could be like the geolocation permission bar, but with a way to expand it to give the view that I linked to. 

> And even that, I'd like to minimize as
> much as possible, and I want the spec to continue to make it clear that the
> implementation does not NEED to prompt the user with UI; this may come
> inside a web app that has permissions already set, or in a loose environment
> that has already had MIDI access approved.

Agree. This would be good for just output. For example, in a game, it would suck to have to ask the user if they want to hear MIDI sound effects.

Regardless, I think the point is that there needs to be enough flexibility in the security model to allow for these various scenarios (and that both implementors and users understand the risks ... I know, d'uh Marcos!). 

I think the current text gives that flexibility already and anything else might be overreaching.
Comment 14 Chris Wilson 2012-12-26 17:09:46 UTC
I think I've lost track of what the requested changes are here.

I believe there should be enough flexibility so that an implementation that chooses, under any circumstances, to not prompt the user is not considered non-compliant (or even "making poor choices").  Although I understand that the current language is somewhat loose to allow this, I don't think it "can't be implemented" - it simply offers a choice.  Other specifications have similar security options, implemented differently across browsers; how do we mirror that?
Comment 15 Marcos Caceres 2012-12-29 01:24:25 UTC
I've been trying to come up with a more incremental security model to address the common use case of just getting access to system default ports without needing to ask for permission (perfect for game sound effects) - while at the same time incrementally increasing the security controls to allow users to control what inputs and outputs are made available to an application (and also handle the case of hot plugging and unplugging devices). FWIW, I don't think the current security model handles this well (and may even break if the API does eventually have to deal with people plugging and unplugging devices).

My extremely preliminary thoughts are captured in the link below:

https://gist.github.com/4384745

It would require some significant changes to the API (e.g., having a single midi access point and doing away with the MIDIAccess object). 

As I now have a more or less functional reference implementation of the MIDI API, I'll try to prototype a demo over the next week. However, if anyone wants to help me hash out these ideas, that would be greatly appreciated.
Comment 16 Chris Lilley 2013-01-08 17:03:27 UTC
(In reply to comment #12)
> This level of tweakiness is exactly what I'm worried about.  I don't think
> any sane user will walk through the list of their available MIDI ports and
> carefully select which they're comfortable "sharing" with a web application
> - they'll either say OK or not OK.  And even that, I'd like to minimize as
> much as possible

In general I agree, but I can think of one case where a user might want to have more fine-grained control. Suppose they are happy to share their input devices (keyboards, pads etc) and the output devices that can be played (including bank switch etc) *except* for a device that can be written to destructively (e.g. can have new patches or samples uploaded, losing the previously stored ones).

But maybe that is better addressed as write access or disabling sysex rather than port-by-port.
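
As a rough illustration of the "disabling sysex" idea, the sketch below gates System Exclusive messages on a per-port flag while letting ordinary channel messages through. `GatedOutput` and its fields are hypothetical names invented for this example, and messages are modelled as plain byte arrays; the only MIDI fact relied on is that a sysex message starts with the status byte 0xF0.

```typescript
// Status byte that begins every System Exclusive message.
const SYSEX_START = 0xf0;

function isSysex(message: number[]): boolean {
  return message.length > 0 && message[0] === SYSEX_START;
}

// Hypothetical wrapper around one output port with a sysex switch.
class GatedOutput {
  readonly name: string;
  readonly sent: number[][] = []; // stands in for bytes written to the device
  private sysexEnabled: boolean;

  constructor(name: string, sysexEnabled: boolean) {
    this.name = name;
    this.sysexEnabled = sysexEnabled;
  }

  // Returns true if the message was forwarded, false if it was blocked.
  send(message: number[]): boolean {
    if (isSysex(message) && !this.sysexEnabled) {
      return false; // e.g. a patch dump that could overwrite stored data
    }
    this.sent.push(message);
    return true;
  }
}

const noteOn = [0x90, 60, 100]; // note-on, channel 1, middle C
const mmcStop = [0xf0, 0x7f, 0x7f, 0x06, 0x01, 0xf7]; // MIDI Machine Control "Stop", itself a sysex message

const synth = new GatedOutput("synth", false);
synth.send(noteOn);  // forwarded
synth.send(mmcStop); // blocked, because sysex is disabled on this port
```

Note that a blanket sysex block also stops harmless messages such as the transport command above, which is the trade-off of gating on message type rather than on intent.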
Comment 17 Chris Wilson 2013-01-08 17:51:43 UTC
(In reply to comment #16)
> (In reply to comment #12)
> > This level of tweakiness is exactly what I'm worried about.  I don't think
> > any sane user will walk through the list of their available MIDI ports and
> > carefully select which they're comfortable "sharing" with a web application
> > - they'll either say OK or not OK.  And even that, I'd like to minimize as
> > much as possible
> 
> In general I agree, but I can think of one case where a user might want to
> have more fine-grained control. Suppose they are happy to share their input
> devices (keyboards, pads etc) and the output devices that can be played
> (including bank switch etc) *except* for a device that can be written to
> > destructively (e.g. can have new patches or samples uploaded, losing the
> previously stored ones).

If someone really wants this power, of course it's not my place to say no.  My point is that this is a very advanced tweaky configuration, and experience leads me to believe 99.99...% of users will not mess with such things (kinda like people hand-editing their security zones in IE - super-useful tool, few people mess with it.)  I'm not saying I want to prevent a UA from working this way, I am saying I do not want to mandate it.  The current spec would absolutely let a UA selectively decide to expose each port independently and be compliant.

> But maybe that is better addressed as write access or disabling sysex rather
> than port-by-port.

It would have to be sysex - there's no "write access", other than access to output ports.  And you could selectively enable sysex port-by-port.  That would be marginally acceptable (there are still a lot of devices that use sysex heavily for normal operatio
Comment 18 Chris Wilson 2013-01-08 17:56:57 UTC
Grr.  tab-sended accidentally.

> It would have to be sysex - there's no "write access", other than access to
> output ports.  And you could selectively enable sysex port-by-port.  That
> would be marginally acceptable (there are still a lot of devices that use
> sysex heavily for normal operatio
n - for example, the standardized MIDI Machine Control messages (start/stop/ffw/rewind) are actually sysex messages.  I've been talking to Incident about the GTar; it, like some other devices, uses sysex for normal communication.

Really - I see the potential risks of exposing MIDI; in fact, I've detailed them personally in the specification.  At the same time, in the balance with user experience - I do not see the need to throw a dialog up in the user's face every time they want to use a MIDI controller.  If all I have attached to my machine is a keyboard input device, I should be able to say once "yes it's cool, don't ask me again" and that should be compliant.
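
The "say once, don't ask me again" behaviour could look roughly like the sketch below. It is an assumption-laden illustration, not anything the spec requires: `RememberedPermissions` is a hypothetical name, approvals are keyed by origin, and only approvals are remembered (whether refusals should persist too is left open).

```typescript
// Hypothetical per-origin memory of granted MIDI access.
class RememberedPermissions {
  private approvedOrigins = new Set<string>();
  private ask: (origin: string) => boolean;
  promptCount = 0; // how many times the user was actually asked

  // `ask` stands in for whatever UI (if any) the UA shows.
  constructor(ask: (origin: string) => boolean) {
    this.ask = ask;
  }

  // Returns true if access is allowed for this origin.
  check(origin: string): boolean {
    if (this.approvedOrigins.has(origin)) {
      return true; // approved earlier: no dialog this time
    }
    this.promptCount += 1;
    const allowed = this.ask(origin);
    if (allowed) {
      this.approvedOrigins.add(origin);
    }
    return allowed;
  }
}

const permissions = new RememberedPermissions(() => true);
permissions.check("https://synth.example"); // asks the user once
permissions.check("https://synth.example"); // answered from memory
```

A UA that never prompts at all would simply pre-populate the approved set, or skip the check, and the calling page could not tell the difference.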
Comment 19 Marcos Caceres 2013-01-08 19:57:16 UTC
(In reply to comment #18)
> Really - I see the potential risks of exposing MIDI; in fact, I've detailed
> them personally in the specification.  At the same time, in the balance with
> user experience - I do not see the need to throw a dialog up in the user's
> face every time they want to use a MIDI controller.  If all I have attached
> to my machine is a keyboard input device, I should be able to say once "yes
> it's cool, don't ask me again" and that should be compliant.

I agree. 

Regarding outputs: Ideally, for system default output (if it can be determined by the UA) you should not have to ask for permission. It's not really that different to using <audio autoplay>.

I still think we need to have a bigger discussion about folding MIDIAccess into a single navigator.midi. I think it would simplify the API.
Comment 20 Chris Wilson 2013-01-08 20:11:16 UTC
(In reply to comment #19)
> Regarding outputs: Ideally, for system default output (if it can be
> determined by the UA) you should not have to ask for permission. It's not
> really that different to using <audio autoplay>.

Which browsers have differing opinions of (autoplay without user interaction).  But it is still a bit different, because you can write data that will overwrite patches, etc. - as long as you have access to sysex.  Without it, all you could do maliciously would be to switch to different patches, etc.

> I still think we need to have a bigger discussion about folding MIDIAccess
> into a single navigator.midi. I think it would simplify the API.

Now would be a really really good time.  Can you make Thursday's call?
Comment 21 Marcos Caceres 2013-01-08 20:14:28 UTC
(In reply to comment #20) 
> > I still think we need to have a bigger discussion about folding MIDIAccess
> > into a single navigator.midi. I think it would simplify the API.
> 
> Now would be a really really good time.  Can you make Thursday's call?

Yes, I'll be there.
Comment 22 Olivier Thereaux 2013-01-11 09:00:25 UTC
This issue now tracked at:
https://github.com/WebAudio/web-midi-api/issues/3