Basics To Understanding Data Bias

Karen
3 min read · Jul 17, 2020


There is so much to data bias that this small blog post can barely scratch the surface. I was introduced to the subject through the book Invisible Women by Caroline Criado Perez. I do want to note that the author presumes only two sexes and genders in the book. As you learn more about data bias, try to become more inclusive and grow your knowledge.

The Basics:

Researchers in fields from medicine and city planning to aerospace have historically failed to collect and analyze data on women and other underrepresented groups.

This reporting bias creates a situation where guidelines based on the study of one group are generalized and applied to everyone. The result can range from a user pain point to outright harm.

Thoughtful design benefits everyone.

Heart Attacks
Heart attacks are often misdiagnosed in women because the recognized symptoms come from studies conducted on men. For men, chest pain is a prominent symptom. For women, heart attacks often present as fatigue or symptoms like indigestion, with chest pain appearing in only 1 out of 8 cases. As a result, fewer women seek medical help during heart attacks, and those who do are often misdiagnosed.

Machine Learning & Amazon Hiring Tool

Machine learning has been highly successful in the past decade and is used in hiring, predictive policing systems, immigration, voice-activated devices, and more. One glaring flaw is that algorithms are largely trained on white-male-centric data and can amplify biases; if you feed them biased data, they will become more and more biased (e.g., see racist chatbots).

“If we’re building a machine-learning model and we calibrate it on historical data, we’re just going to propagate the inherent biases in the data.
For example, in the criminal justice context, there are racial inequities in sentencing. So that data has inequities built-in, and they’re built into the new models.” — Martin Wells

Amazon shut down a model that scored candidates for employment after realizing that it penalized women. The model was trained to vet applicants by observing patterns in resumes submitted to the company over the previous 10 years, which came largely from men. The system taught itself that male candidates were preferable and filtered out resumes that mentioned women-centric clubs or colleges.
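To see how this happens mechanically, here is a minimal sketch, not Amazon's actual system: a tiny classifier trained on synthetic, historically biased hiring labels. All resumes and outcomes below are made up for illustration.

```python
# A minimal sketch (hypothetical data, not Amazon's model): train a classifier
# on biased historical hiring outcomes and inspect what it learns.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Made-up resumes; the "historical" labels happen to favor the male-coded ones.
resumes = [
    "software engineer chess club captain",
    "software engineer rugby team captain",
    "software engineer women's chess club captain",
    "software engineer women's coding society president",
]
hired = [1, 1, 0, 0]  # biased historical outcomes, not merit

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(resumes)
model = LogisticRegression().fit(X, hired)

# The model assigns a negative weight to the token "women" —
# it has learned the bias baked into the labels, not anything about skill.
for token, weight in zip(vectorizer.get_feature_names_out(), model.coef_[0]):
    print(f"{token:>10s}  {weight:+.2f}")
```

Nothing in the code is malicious; the bias arrives entirely through the labels, which is exactly why calibrating on historical data propagates historical inequities.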

Higher Education

Universities are utilizing predictive analytics and machine learning algorithms too. They use this technology to:
• Identify students most in need of advising services
• Develop adaptive learning courseware that personalizes learning
• Manage enrollment

In the wrong hands, the use of predictive analytics can lead to unethical outcomes. The president of Mount Saint Mary's was willing to use a survey to weed out students who might hurt the college's retention rate, in order to make the school's numbers look better.
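To make that concern concrete, here is a minimal sketch of how a retention model's flags could be audited by group before anyone acts on them. Everything here is hypothetical: the data, the group labels, the 0.15 score skew, and the 0.7 cutoff are invented for illustration.

```python
# A minimal sketch (hypothetical data and cutoff): audit a retention model's
# "at-risk" flags by demographic group before intervening.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
group = rng.choice(["A", "B"], size=n)  # hypothetical demographic split

# Suppose the scores were trained on skewed historical data, so group B
# receives systematically higher risk scores for otherwise similar students.
risk_score = rng.uniform(size=n) + (group == "B") * 0.15

flagged = risk_score > 0.7  # hypothetical intervention cutoff
for g in ("A", "B"):
    rate = flagged[group == g].mean()
    print(f"group {g}: {rate:.0%} flagged as at-risk")

# If flag rates diverge sharply between groups, the model may be encoding
# historical inequities rather than actual need — audit before acting.
```

A simple group-wise audit like this won't fix a biased model, but it can reveal when "retention risk" is quietly standing in for race, gender, or income.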

My General Thoughts:

When institutions use race, ethnicity, age, gender, or socioeconomic status to target students for enrollment or intervention, they can, intentionally or not, reinforce inequities.

As we learn and understand more about how everything is built, it is important to take other people into account as we build products. Thoughtful design benefits everyone. Building something accessible doesn't harm an abled person, but leaving accessibility out does harm someone.

Given the human element, can big data ever be bias-free in the context of diversity?
