
This is not one of those articles that sensationalizes the effects of Artificial Intelligence (AI) and tells everyone to build an apocalyptic bunker because robots will take over the world. It is intended to help social justice advocates get up to speed on the social injustices that some “machines” might unintentionally cause because of prejudice built into them. These biases can take the form of discrimination by characteristics such as race, age, gender, disability, or ethnicity. There is a high chance that you or someone you know works, or will work, in an organization that uses Machine Learning (ML) to make decisions. These decisions could be about offering someone a job, a loan, a salary, a scholarship, and so on. This article will equip you to ask the questions that might have been overlooked while ML models were being developed in your current or future organization.

Getting up to Speed on Machine Learning

Have you ever wondered how Netflix, YouTube, or Spotify correctly (sometimes) recommends a movie or song you like? That is just Machine Learning (ML) predicting what you might like next: a million other people with characteristics similar to yours, who liked the first song or movie you chose, also liked the one being recommended to you. It does this by automatically learning from massive datasets, which can contain a million recorded observations, using statistical algorithms, and by improving with experience without being explicitly programmed to do so. There is heavy math behind these algorithms that even some Data Scientists do not fully understand. For now, don’t worry about it; just remember three things. An ML algorithm (i) learns from a lot of data, (ii) picks up the patterns in that data and builds a model that links input data to outputs, and (iii) makes predictions on new, “current” data based on the patterns it learned.
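For readers who want to see those three steps in miniature, here is a small Python sketch. It is not how Netflix, YouTube, or Spotify actually work; the listening histories, song names, and function names are all made up for illustration. It simply counts which songs were liked together (learning patterns from data) and then uses those counts to suggest a song (making a prediction).

```python
from collections import Counter

# Hypothetical listening histories: each set holds the songs one user liked.
histories = [
    {"song_a", "song_b"},
    {"song_a", "song_b", "song_c"},
    {"song_a", "song_c"},
    {"song_a", "song_b"},
    {"song_d", "song_e"},
]

def learn_co_likes(histories):
    """Steps (i) and (ii): for every song, count which other songs were liked with it."""
    co_likes = {}
    for liked in histories:
        for song in liked:
            others = liked - {song}
            co_likes.setdefault(song, Counter()).update(others)
    return co_likes

def recommend(co_likes, song):
    """Step (iii): predict the song most often liked together with `song`.

    Returns None when the song never appeared in the training data.
    """
    counts = co_likes.get(song)
    if not counts:
        return None
    return counts.most_common(1)[0][0]

model = learn_co_likes(histories)
print(recommend(model, "song_a"))  # song_b: liked alongside song_a by three users
```

Notice that the "model" knows nothing about music. It only repeats whatever patterns were in the data it was given, which is exactly why the quality of that data matters so much.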

Machine Learning in Action 

The application of ML goes beyond entertainment. The list is endless, and even the social sector makes that list. Compelling evidence has been published on how ML is bettering, or could better, society. This includes how outbreaks of infectious disease can be predicted, with unprecedented accuracy as to when and where, and then mitigated with appropriate public health prevention interventions. Another exciting application was published in the Journal of Peace Research under the title “Predicting local violence: Evidence from a panel survey in Liberia.” The publication highlights how machine learning techniques proved to be a better method for “identifying risk factors and forecasting where local violence is most likely to occur,” which should help allocate scarce peacekeeping and policing resources.

Machine learning algorithms are also standard in the credit industry; banks and lending institutions use them to predict the probability that a borrower will pay back a loan, as a basis for deciding whether to grant it. Insurance companies also use these algorithms to determine the premium an individual ought to pay based on the risk associated with them.

Bias in Machine Learning

Despite excelling at accuracy and speed and at eliminating human subjectivity, ML algorithms are liable to bias. Machine Learning bias occurs when machine learning algorithms produce results that are systematically and unfairly prejudiced in favour of or against an individual or group on the basis of a characteristic such as race, age, gender, disability, or ethnicity.
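One simple way to look for this kind of systematic difference is to compare how often a model says "yes" to each group. The sketch below is only an illustration under stated assumptions: the decisions, group labels, and function names are invented, and a real audit would use an organization's actual outcomes and more than one measure.

```python
from collections import defaultdict

# Hypothetical model decisions: (group label, 1 if approved else 0).
decisions = [
    ("group_a", 1), ("group_a", 1), ("group_a", 1), ("group_a", 0),
    ("group_b", 1), ("group_b", 0), ("group_b", 0), ("group_b", 0),
]

def selection_rates(decisions):
    """Return the share of positive decisions for each group."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, outcome in decisions:
        totals[group] += 1
        approved[group] += outcome
    return {group: approved[group] / totals[group] for group in totals}

def rate_ratio(rates):
    """Divide the lowest group rate by the highest; 1.0 means equal rates."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was approved, so all groups share the same rate
    return min(rates.values()) / highest

rates = selection_rates(decisions)
ratio = rate_ratio(rates)
print(rates)
print(round(ratio, 2))
```

In this made-up data, group_a is approved three times out of four and group_b once out of four, so the ratio is about one third. A ratio well below 1 does not prove unfairness on its own, but it is a signal worth asking your ML colleagues about.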

When you search for Machine Learning bias, almost every article in the data science community greets you with the infamous Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) example, so I will greet you with the same. COMPAS is an ML-based assessment tool used to predict prisoners’ probability of re-offending once they are out of prison. Due to overcrowding in prisons, the system is used to identify prisoners with a low likelihood of re-offending. “These prisoners are then scrutinized for potential release as a way to make room for incoming prisoners.” In 2016, third-party audits were conducted to check how trustworthy the system’s predictions were. The review found that the algorithm’s predictions were biased by race. You can find the full COMPAS Machine Bias article here.

The other typical example is Amazon’s recruitment system, which was meant to automate hiring in order to increase recruitment efficiency and eliminate human bias. The model was meant to make predictions based on candidates’ resumes. The resulting model favoured male over female candidates for engineering positions, even when the female candidates were more qualified. Amazon discontinued the project after realizing the unintended bias in the algorithm.

Sources of Bias

Just like any other decision-making process, ML algorithms depend on data. Every prediction an ML model makes reflects the data it drew its patterns from. In the Amazon recruitment project, the model learned from a ten-year pool of resumes dominated by men. The model favoured male candidates because they used certain words that female candidates did not use in their resumes. It was prejudiced against female candidates because it could not find, in their resumes, the words used by the successful male candidates in the training data.
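The toy Python sketch below shows how this can happen. It is not Amazon’s system: the resumes, words, and scoring rule are hypothetical and far simpler than any real model. It scores each word by how often past resumes containing it led to a hire, then scores new resumes by averaging those word scores.

```python
from collections import Counter

# Hypothetical hiring history: (words on the resume, 1 if hired else 0).
# Past hires mostly contain "executed" and "captain", standing in for
# wording that one group happened to use more often.
history = [
    ({"python", "executed", "captain"}, 1),
    ({"java", "executed", "captain"}, 1),
    ({"python", "executed"}, 1),
    ({"python", "collaborated"}, 0),
    ({"java", "collaborated", "mentored"}, 0),
]

def learn_word_scores(history):
    """Score each word by the share of past resumes containing it that were hired."""
    hired = Counter()
    seen = Counter()
    for words, outcome in history:
        for word in words:
            seen[word] += 1
            hired[word] += outcome
    return {word: hired[word] / seen[word] for word in seen}

def score_resume(scores, words, default=0.5):
    """Average the learned word scores; unseen words get a neutral default."""
    if not words:
        return default
    return sum(scores.get(word, default) for word in words) / len(words)

scores = learn_word_scores(history)

# Two new candidates with the same skill ("python") but different wording.
candidate_1 = {"python", "executed"}
candidate_2 = {"python", "collaborated"}
print(score_resume(scores, candidate_1) > score_resume(scores, candidate_2))  # True
```

Both candidates list the same skill, yet the first scores higher purely because of a word that was common among past hires. Nobody told the model to prefer one group; it inherited the preference from the history it was trained on.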

Conclusion

The inclusion of certain classes of people can be threatened if machine learning-based decision tools are developed with too little awareness of bias. As people at the forefront of endorsing diversity and inclusion, we might want to check in with our ML colleagues once in a while to find out how inclusive their models are.