W3C Workshop on Web and Machine Learning

AI (Machine Learning): Bias & Garbage In, Bias & Garbage Out - by John Rochford (University of Massachusetts Medical School)

Hi, this is John Rochford.

It's my honor and privilege to present to you today.

My talk title is AI Machine Learning: Bias and Garbage In, Bias and Garbage Out.

I am the INDEX program director and a faculty member at the Eunice Kennedy Shriver Center, part of UMass Medical School.

I'm an Invited Expert for the World Wide Web Consortium.

A little about me: I am legally blind.

I trade privacy for utility.

The 14 Alexas in my home enable me to control devices with interfaces I can't see.

Hey Alexa, what time is it now?

I am tired of telling you what time it is.

Look at a clock.

What the heck, Alexa!

And of course, she records to the cloud everything I say.

I'm an AI researcher focused on ML text simplification and on ML fairness and empowerment, and I have been developing technology for the disabled for 35 years.

These are some of the INDEX staff.

We are from all over the world.

We are speakers of 16 languages.

We are women.

We are the disabled.

A big INDEX project is creating easy-to-understand COVID-19 information worldwide.

With our EasyText.AI startup, INDEX ML engineers and disabled staff are developing our ML model and using crowdsourcing to simplify COVID-19 texts from the websites of every country's government.

Simple COVID-19 texts will help huge populations, including people with cognitive disabilities or low literacy, non-native language speakers, and seniors, make the world safe and healthy.

My presentation targets are ML social bias: why we should care, what we should do about it, and how we can do it.

And I hope to persuade you that for your machine learning efforts to be successful, their development and data must include disabled people.

But first, a shout out to Black Lives Matter.

University of Toronto and MIT researchers found that every facial recognition system they tested performed better on lighter-skinned faces.

That includes a one-in-three failure rate with identifying darker-skinned females.

This is from the article Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification.

The author of that article, Joy Buolamwini, has a TED talk.

It's called How I'm Fighting Bias in Algorithms.

And what she's showing here in the picture is that for her face to be recognized, she has to wear a mask of a white female.

Why care?

Profound responsibility.

“There's nothing artificial about AI.

It's inspired by people, it's created by people and most importantly, it impacts people.

It is a powerful tool we are only just beginning to understand and that is a profound responsibility.” From the article Bias In, Bias Out: The Stanford Scientist Out to Make AI Less White and Male.

Why care about machine learning fairness and disability?

It's the most significant challenge of our time.

If it doesn't work for all, nobody will trust it.

The disabled are part of every segment of society.

Thus, if we can solve the problem for the disabled, we can solve the problem for everyone.

Why care?

What about you?

“Bias in ML has been almost ubiquitous when the application is involved in people and it has already hurt the benefit of people in minority groups or historically disadvantageous groups.

Not only people in minority groups but everyone should care about the bias in AI.

If no one cares, it is highly likely that the next person who suffers from biased treatment is one of us.” From the article A Tutorial on Fairness in Machine Learning.

Being smart can be a deficit in ML.

“People who perform better on a test of pattern detection, a measure of cognitive ability, are also quicker to form and apply stereotypes.” “In other words, being smart might put you at a greater risk of prejudice but you can still fight against those instincts by challenging your thinking and getting to know people who aren't like you.” This is from the article, Smart People Are More Likely To Stereotype.

How can you get to know people who aren't like you?

Include the disabled in machine learning development.

“To ensure AI-based systems are treating people with disabilities fairly, it is essential to include them in the development process.” This is from the article, How to tackle AI Bias For People With Disabilities.

Here's an MIT failure to include the disabled.

An MIT project claimed to translate American Sign Language with machine learning and sign-language gloves.

It failed because sign language is much more than communicating with hands.

It's about body language and facial expressions.

MIT would have known that was essential had it involved the Deaf community.

From the article, Why Sign-Language Gloves Don't Help Deaf People.

A big issue is lack of data from the disabled due to privacy concerns.

Many of us do not disclose our disabilities.

When we do, we are denied employment, housing, and more.

Thus, despite being 15% of the population, we are significantly underrepresented in training data.

What can we do about a lack of data from the disabled?

Accurate analysis.

“We are helping machine learning engineers figure out what questions to ask of their data, to diagnose why their systems may be making unfair predictions.” This is from the article MIT Researchers Show How to Detect and Address AI Bias Without Loss in Accuracy.
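One question an engineer might ask of their data is whether the model makes more mistakes for one group than for another. The following is only a minimal sketch of that idea, not the MIT researchers' method. The records, the group names, and the function name error_rate_by_group are all hypothetical.

```python
from collections import defaultdict

def error_rate_by_group(records):
    """Return {group: fraction of records where the prediction is wrong}."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, label, prediction in records:
        totals[group] += 1
        if label != prediction:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}

# Hypothetical (group, true label, predicted label) triples.
records = [
    ("disclosed_disability", 1, 0),
    ("disclosed_disability", 1, 1),
    ("disclosed_disability", 0, 1),
    ("no_disclosure", 1, 1),
    ("no_disclosure", 0, 0),
    ("no_disclosure", 1, 1),
    ("no_disclosure", 0, 1),
]

rates = error_rate_by_group(records)
for group, rate in sorted(rates.items()):
    print(f"{group}: error rate {rate:.2f}")
```

A large gap between groups does not say why the model is unfair, but it tells you where to look next, for example at how few examples the smaller group has.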

What can we do with data we do have from the disabled?

Fairness through unawareness: “No information about protected attributes, e.g., gender, age, disability, is gathered and used in the decision-making.”

Fairness through awareness: “Membership in a protected group is explicitly known and fairness can be formally defined, tested, and enforced algorithmically.” This is from the article AI Fairness for People with Disabilities: Point of View.
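The difference between the two approaches can be sketched in a few lines. This is an illustration under assumptions, not code from the article: the attribute names, the sample decisions, and the helper names drop_protected, selection_rates, and demographic_parity_difference are hypothetical. The gap in positive-decision rates between groups is one common way to define fairness formally; it is used here only as an example.

```python
def drop_protected(row, protected=("gender", "age", "disability")):
    """Unawareness: remove protected attributes before any decision is made."""
    return {key: value for key, value in row.items() if key not in protected}

def selection_rates(decisions, groups):
    """Awareness: positive-decision rate for each explicitly known group."""
    counts, positives = {}, {}
    for decision, group in zip(decisions, groups):
        counts[group] = counts.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + (1 if decision else 0)
    return {group: positives[group] / counts[group] for group in counts}

def demographic_parity_difference(decisions, groups):
    """Gap between the highest and lowest group selection rates (0 means equal)."""
    rates = selection_rates(decisions, groups)
    return max(rates.values()) - min(rates.values())

# Unawareness: the decision-maker never sees the protected attributes.
applicant = {"years_experience": 7, "disability": True, "age": 52}
unaware_view = drop_protected(applicant)

# Awareness: group membership is known, so the outcome gap can be tested.
decisions = [1, 0, 0, 1, 1, 1, 0, 1]
groups = ["disabled"] * 4 + ["nondisabled"] * 4
gap = demographic_parity_difference(decisions, groups)
print(unaware_view, gap)
```

Note that with unawareness alone there is no group label left to compute the gap from, which is why the awareness approach is the one that can be tested and enforced.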

What can we do to mitigate our lack of data from the disabled?

One possibility is to use synthetic data.

“Given a data set involving a protected attribute with a privileged and unprivileged group, we create an ideal world data set.

For every data sample, we create a new sample having the same features (except the protected attributes) and label as the original sample but with the opposite protected attribute value.” This is from the article Data Augmentation for Discrimination Prevention and Bias Disambiguation.
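The augmentation described in that quote can be sketched as follows, assuming a single binary protected attribute. This is a simplified illustration, not the article's implementation. The field names, the sample values, and the function name augment_with_flipped_attribute are hypothetical.

```python
def augment_with_flipped_attribute(samples, protected_key, values=(0, 1)):
    """Return each original sample followed by a twin whose protected
    attribute has the opposite value; other features and the label are kept."""
    first, second = values
    augmented = []
    for sample in samples:
        augmented.append(dict(sample))
        twin = dict(sample)
        twin[protected_key] = second if sample[protected_key] == first else first
        augmented.append(twin)
    return augmented

# Hypothetical hiring data: "disability" is the protected attribute,
# "hired" is the label.
samples = [
    {"years_experience": 7, "disability": 1, "hired": 0},
    {"years_experience": 3, "disability": 0, "hired": 1},
]

augmented = augment_with_flipped_attribute(samples, "disability")
for row in augmented:
    print(row)
```

In the augmented data set, every combination of features and label appears once for each group, so the protected attribute alone no longer predicts the label.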

What tools will help us implement machine learning fairness?

Well, there are commercial tools.

Many companies, such as Microsoft, Amazon, and IBM, offer bias mitigation services in their ML platforms.

There are open source tools.

There are an increasing number on GitHub.

The following are the best of the current ones.

The Google What-If Tool visually probes the behavior of trained machine learning models with minimal coding.

This interface looks pretty cool to me.

But then again, I'm blind.

Fairlearn, a Python package to assess and improve fairness of machine learning models.

Skater, a Python library designed to demystify the learned structures of a black box model, both globally (inference on the basis of a complete dataset) and locally (inference about an individual prediction).


It detects demographic differences in the output of machine learning models or other assessments.

IBM AI Fairness 360 Open Source Toolkit.

A comprehensive set of fairness metrics for datasets and machine learning models, explanations for these metrics, and algorithms to mitigate bias in data sets and models.

IBM AI Factsheets 360.

A research effort to foster trust in AI by increasing transparency and enabling governance.

Final thought about disability.

There is a saying we shout from the rooftops for all to hear.

I hope you do.

“Nothing about us without us”.

A final thought for you: you need to be woke if you want your AI to be woke.

Here's my contact info.

There's a QR code for my presentation and a link to it.

There's another link to AI Fairness resources.

I thank you for listening to the little I know about machine learning fairness.

Thanks to Futurice for sponsoring the workshop!


Video hosted by WebCastor on their StreamFizz platform.