- Who will be impacted? Patients infected with bacterial pneumonia, viral pneumonia, or COVID-19.
- Who were sampled? Healthy (normal) patients and patients infected with bacterial pneumonia, viral pneumonia, or COVID-19. At this point in our project, we have sampled 800 images for training and 800 images for testing/validation.
- Who were over-sampled or under-sampled? Although the complete dataset is imbalanced, we sampled it with class balance in mind, which resulted in fairly balanced training and test sets, with viral pneumonia slightly under-sampled.
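The class-aware sampling described above can be sketched as follows. This is a minimal illustration, not our actual pipeline: the class names, image counts, file names, and the `per_class` cap are all hypothetical. A minority class (here, viral pneumonia) contributes everything it has, yielding the slight under-sampling noted above.

```python
import random

def balanced_sample(paths_by_class, per_class, seed=42):
    """Draw up to `per_class` images from each class to mitigate imbalance.

    paths_by_class: dict mapping class name -> list of image paths.
    Classes with fewer than `per_class` images contribute all of their
    images, leaving them slightly under-sampled relative to the others.
    """
    rng = random.Random(seed)  # fixed seed for a reproducible split
    sample = {}
    for label, paths in paths_by_class.items():
        k = min(per_class, len(paths))
        sample[label] = rng.sample(paths, k)
    return sample

# Hypothetical per-class counts mirroring our four classes:
data = {
    "normal": [f"normal_{i}.png" for i in range(300)],
    "bacterial": [f"bact_{i}.png" for i in range(300)],
    "viral": [f"viral_{i}.png" for i in range(150)],  # minority class
    "covid": [f"covid_{i}.png" for i in range(300)],
}
train = balanced_sample(data, per_class=200)
```

With these hypothetical counts, each majority class yields 200 images while the viral class yields only its 150 available images, reproducing the "fairly balanced with one class slightly under-sampled" outcome.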
- Who were the data scientists (yourself and your collaborators)? Sriram Thakur, Shivani Sivasankar, and Deepak Sireeshan, students at UMKC.
- What are the social and cultural impacts? There are no known social or cultural impacts.
- What are the concerns about data privacy, security, and fairness? None identified. We collected the dataset from multiple sources, but each was a Kaggle challenge, so the dataset is clean in the sense that it does not disclose any personal information about patients or physicians.
- When will the social and cultural impacts take place? There are no known social or cultural impacts.
- When should people be concerned about data privacy, security, and fairness? No personal or sensitive information, such as the name, SSN, or date of birth of any patient or physician, has been collected, so this is not a concern for our project.
- Where will the social and cultural impacts take place? There are no known social or cultural impacts.
- Where will data privacy, security, and fairness issues, like data breaches and evaluative bias, likely happen? As no personal information was collected for any of the chest X-ray images, the risk of data-privacy or data-breach issues is minimal.
- Why are the social and cultural impacts important or consequential to the people and/or the community? If an individual's identity or information were leaked, there would be a risk of identity theft, and a patient known to be infected with COVID-19 might experience isolation and estrangement from the community. Hence, it is very important to protect patients' personal information. However, the datasets used in this project contain no personal or geographical information about patients, so no such records exist.
- Why should we be concerned about data privacy, security, and fairness issues? To avoid identity theft and to prevent any sensitive information from being exposed in social networking groups.
- How can we address these societal issues in ML using a community-in-the-loop approach? It is highly important not to collect unnecessary information from the individuals or groups participating in research. Our ML model does not require any such information that could cause privacy issues, and all the images collected come from global sources and do not pertain to any single individual or community.