Define the scope or domain where the use case is relevant or prevalent.
This use case relates to bias in the data that is used to train AI models.
What is your main story?
Who are the characters or people in the main story?
I am a data scientist at a well-known company. I love to analyze data and use machine learning algorithms to find the hidden patterns in it. My company collects a lot of data about customers, and my job is to deploy AI models on this data to enhance our products and services. Most of the data comes from human activities, which can introduce bias. If unchecked, this bias can degrade model performance, leading to customer dissatisfaction and losses for the company. The company’s legal team is also very concerned about my work: they want to make sure that I don’t use any aspect of the data (for example age, sex, or gender) that biases the model toward any particular group of people and could cause legal issues for the company. Our company’s policy is to offer fair services to all of our customers.
We now want to launch two new services. These services perform predictive analysis on customer data and try to automate the process of predicting the credit range the company should offer to an applicant. Before developing any model, I want to make sure that there is no bias in the data. If there is any, the system should let me know about it and provide visualizations so that I can plan a mitigation strategy and present it to my team for further discussion.
I am looking for a model that supports fairness in AI algorithms by inspecting data to detect and visualize bias, and by communicating that bias to other components of the system before any predictive modeling is performed on the data.
This bias detection and visualization model should be available before development starts for our new release. We are expecting a solution before 4/15/2020.
AI techniques using big data and algorithmic processing are increasingly used to guide important social decisions, including hiring, admissions, loan granting, and crime prediction. However, AI is only as fair as the data it learns from, and that data is gathered from human activities. As a result, data is often as biased and flawed as human beings are. While we assume that machines are neutral, there is strong evidence that algorithms may sometimes learn human biases and discrimination from data rather than mitigating them. This results in flawed production models, which can lead to legal action against businesses and loss of revenue. The demand for fairness in AI is growing as most businesses move to automate their services and rely more and more on predictive analysis.
I have done some research to find a reliable method that can detect, visualize, and communicate bias in data, but so far I have not found one. This method should be general enough to accept various data types, for example structured, unstructured, video, image, and text data. It should also address issues such as imbalanced data and small data within big data. The focus is not only on detection but also on communication: machines need to communicate with human beings through good visualization so that AI models become more reliable.
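To make the request more concrete, here is a minimal sketch of what the detection and communication steps might look like for structured data only. It compares each group's share of the records (to surface imbalance) with its rate of favorable outcomes, and summarizes the gap as the ratio of the lowest to the highest favorable rate. The column names `gender` and `approved`, the toy records, and the function names are all hypothetical and chosen for illustration; this is not a complete or general solution to the problem described above.

```python
from collections import Counter


def group_report(records, group_key, outcome_key):
    """Return each group's share of the records and its favorable-outcome rate."""
    totals = Counter(r[group_key] for r in records)
    favorable = Counter(r[group_key] for r in records if r[outcome_key])
    n = len(records)
    return {
        g: {"share": totals[g] / n, "favorable_rate": favorable[g] / totals[g]}
        for g in totals
    }


def disparate_impact_ratio(report):
    """Ratio of the lowest to the highest favorable rate (1.0 means equal rates)."""
    rates = [v["favorable_rate"] for v in report.values()]
    highest = max(rates)
    return min(rates) / highest if highest > 0 else float("nan")


def text_bars(report, width=40):
    """Render favorable rates as plain-text bars, a stand-in for a real chart."""
    lines = []
    for group, v in sorted(report.items()):
        bar = "#" * round(v["favorable_rate"] * width)
        lines.append(f"{group:<10} {bar} {v['favorable_rate']:.2f}")
    return "\n".join(lines)


# Hypothetical toy data: group "A" is over-represented and approved more often.
records = (
    [{"gender": "A", "approved": True}] * 6
    + [{"gender": "A", "approved": False}] * 2
    + [{"gender": "B", "approved": True}] * 1
    + [{"gender": "B", "approved": False}] * 1
)

report = group_report(records, "gender", "approved")
ratio = disparate_impact_ratio(report)
print(text_bars(report))
print(f"lowest/highest favorable-rate ratio: {ratio:.2f}")
```

A low ratio or a very small group share would be a signal to investigate before any modeling starts. It does not prove on its own that the data is unfair, and unstructured, image, video, and text data would need different checks.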