- Who: In this story, there are three main characters: 1) the people/the community who needs help, 2) the data scientist (that is you), and 3) AI. How much does the data scientist understand Assignment 1 (domain) and Assignment 2 (data)?
The data scientist (the student) understands the domain chapter well, because it captures the complete idea of the project that needs to be done; technically, it can be treated as the project requirements. This domain knowledge is required to develop the project in later stages. Once the domain information is in hand, the data scientist processes and analyzes the data according to the requirements stated in Chapter 2 (Data).
- What models and analysis did the data scientist and AI apply to fulfill the need of the people or the community? Can the data scientist estimate and select data for their goals from Assignment 1? Can they map data sets from Assignment 2 onto appropriate ML models? Can the data scientist connect Story 1 with ML models/stories about what an ML model can do? To perform good ML research, what in-depth knowledge and experience with ML algorithms and ML stories does a data scientist need?
The data scientist selects data according to the goals specified in Chapter 1, gathers that data as specified in the goals of Chapter 2, and then works on it using the domain knowledge. In this specific project, the data scientist needs to have knowledge of the Hadoop ecosystem.
- When has to do with the iterations (Calibration 2). How much time did it take for experimentation? How efficient is the modeling/algorithm?
Can the data scientist determine the acceptance level of the model (validation with accuracy and runtime performance) considering the targeted users?
Various technologies were used in this project. Each analysis done with a given technology in the Hadoop ecosystem has its own runtime performance and accuracy. In this project we used only one NLP technique, sentiment analysis, together with data analysis using different technologies in the Hadoop ecosystem.
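As an illustration of checking an acceptance level on both accuracy and runtime, here is a minimal sketch in plain Python. It is not the code used in this project: the word lists, the sample sentences, the function names `classify` and `evaluate`, and the thresholds `min_accuracy` and `max_seconds` are all hypothetical, and a real run on Hadoop tools would measure each technology separately.

```python
import time

# Hypothetical word lists; a real project would use a trained model or a full lexicon.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "poor", "sad"}


def classify(text):
    """Label text 'pos' or 'neg' by counting lexicon hits (ties go to 'neg')."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "pos" if score > 0 else "neg"


def evaluate(samples, min_accuracy=0.75, max_seconds=1.0):
    """Return (accuracy, elapsed_seconds, accepted) for labeled samples."""
    start = time.perf_counter()
    predictions = [classify(text) for text, _ in samples]
    elapsed = time.perf_counter() - start
    correct = sum(p == label for p, (_, label) in zip(predictions, samples))
    accuracy = correct / len(samples)
    # The model is accepted only if it meets both assumed thresholds.
    accepted = accuracy >= min_accuracy and elapsed <= max_seconds
    return accuracy, elapsed, accepted


# Made-up labeled examples, for illustration only.
SAMPLES = [
    ("I love this great phone", "pos"),
    ("terrible service and poor support", "neg"),
    ("what a happy day", "pos"),
    ("not good at all", "neg"),  # negation: the word-count approach misses this
]

if __name__ == "__main__":
    accuracy, elapsed, accepted = evaluate(SAMPLES)
    print(f"accuracy={accuracy:.2f} runtime={elapsed:.4f}s accepted={accepted}")
```

The same `evaluate` call could be repeated for each technology under comparison, so that the accuracy and runtime figures are collected in the same way before deciding which one is acceptable for the targeted users.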
- Where has to do with the learning environment. Where did this experiential learning process take place? For example, it was part of an online Deep Learning course.
It was part of the Big Data Programming course offered at the University of Missouri-Kansas City, along with a few online courses on NLP techniques.
- How: If you would like, you can add a dimension of how. How did it happen? Sometimes, the answer to how can be covered by what, when and where.
The data scientist can use the data in whatever way the analysis or the tool requires. New data or data sources can be added for further analysis as required; nothing in this process is static.