In ML Experience, we would like to address critical questions such as:
- How well does the story-based machine learning approach work?
- Is ML needed in all stories? Which ML algorithms are useful for which kinds of stories?
- How does the number of training examples influence accuracy?
- Is nearly all ML still dependent on human “guidance”?
- Are there benefits from feedback processes in which machines and humans interact and share?
- How can machine learning pipelines be evaluated in terms of the collaboration of human beings and machines?
- What are the best ways to conduct supervised learning when manually labeled training sets are expensive to build or acquire?
Why Machine Learning?
The recent surge in machine learning and data-driven methods and tools has drastically improved state-of-the-art results in an ever-growing number of domains, leading to new applications of Data Science and AI.
What is Machine Learning Experience (MLE) in OCEL.AI?
- MLE aims to support the machine learning process as it relates to users’ experiences, and focuses on designing a story-based machine learning approach to modeling with real data.
- MLE focuses on building data analysis skills and data analytics with both structured and unstructured data, including image and text.
- MLE is based on story-based machine learning via feedback processes stemming from a mutual understanding between machines and human beings.
- MLE will show that OCEL.AI supports traceability of training data and interpretability/reproducibility of outputs effectively and efficiently.
What are human-machine partnerships in the life cycle?
Our goal is to design a set of educational methodologies that use technology to assess and integrate the role of human experts in the data science/AI life cycle: 1) creating/updating the model, 2) training the model, 3) evaluating the model, and 4) deploying and sharing models.
The figure above illustrates the life cycle of data science projects and the phases in which humans interact with it. A significant obstacle in AI-based learning technologies and experiences is the gap between human beings and machines. Human developers and learners play critical roles in every phase of the development life cycle, yet their roles are still ignored in the teaching of such technologies.
OCEL.AI aims to develop teaching pedagogies and learning strategies that focus on the human role in addressing the current challenges in each task through story-based teaching of the underlying technology.
Phase 1: Data Mapping:
Collecting and preparing the datasets needed to develop the model. Data scientists spend more time on data management than on building algorithms or AI model training. While data science courses focus on collecting and labeling data, they largely ignore the challenges of mislabelled data, privacy, and bias. In OCEL.AI, learners will assess the role of humans in each development stage and ensure that data security and privacy-preserving techniques are used.
Phase 2: ML Experience: Search, Training, and Evaluation:
Developing a deep learning model is an iterative, experimental process that produces tens to hundreds of models before arriving at a satisfactory result. While there has been a surge in the number of software tools that aim to facilitate deep learning, managing the models and their artifacts is still surprisingly challenging and time-consuming.
OCEL.AI acts as a research assistant that provides useful guidelines and visualizations for model building in machine learning. Iteratively comparing modeling experiments and reproducing them for a given domain helps learners see the imperfections in their models. As learners work through ML modeling, OCEL.AI serves as a design assistant for the following steps:
- First, searching for a model with a similar goal for reusability through the transfer learning process, rather than creating a new model from scratch. However, if no such model exists, a new model needs to be created.
- Second, training a deep learning model can be defined as an iterative search and learning problem that goes through tens to hundreds of iterations before arriving at a satisfactory result.
- Third, the validation step will help us to build confidence that the model is robust enough to accept estimated values for many inputs and still produce accurate results.
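The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not OCEL.AI's implementation: `ModelRecord`, `find_reusable_model`, `train_until_satisfactory`, the tag-overlap rule, and the toy data are all hypothetical, and a one-parameter linear model stands in for a deep learning model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRecord:
    """Hypothetical registry entry: a model name plus tags describing its goal."""
    name: str
    goal_tags: frozenset


def find_reusable_model(registry, goal_tags, min_overlap=0.5):
    """Step 1: return the model whose goal tags best match ours, or None.

    Similarity is the Jaccard overlap of tag sets (an assumption for this sketch).
    """
    best, best_score = None, 0.0
    for record in registry:
        union = record.goal_tags | goal_tags
        score = len(record.goal_tags & goal_tags) / len(union) if union else 0.0
        if score > best_score:
            best, best_score = record, score
    return best if best_score >= min_overlap else None


def train_until_satisfactory(train, valid, lr=0.01, max_iters=500, target_mse=1e-3):
    """Steps 2 and 3: run gradient descent on y = w * x, validating after each iteration."""
    w, step, val_mse = 0.0, 0, float("inf")
    for step in range(1, max_iters + 1):
        # Gradient of the mean squared error on the training set.
        grad = sum(2 * (w * x - y) * x for x, y in train) / len(train)
        w -= lr * grad
        # Validation: measure error on held-out examples the model never trained on.
        val_mse = sum((w * x - y) ** 2 for x, y in valid) / len(valid)
        if val_mse <= target_mse:
            break
    return w, step, val_mse


# Hypothetical registry and toy data, for illustration only.
registry = [
    ModelRecord("camp-recommender", frozenset({"recommendation", "camps"})),
    ModelRecord("image-classifier", frozenset({"image", "classification"})),
]
match = find_reusable_model(registry, {"recommendation", "camps", "families"})
print(match.name if match else "no reusable model found; create a new one")

w, steps, val_mse = train_until_satisfactory(
    train=[(1, 2), (2, 4), (3, 6)], valid=[(4, 8), (5, 10)]
)
print(f"w={w:.3f} after {steps} iterations, validation MSE={val_mse:.5f}")
```

Keeping the validation examples separate from the training examples is what lets the final error say something about inputs the model has not seen.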
Phase 3: Model Deployment and Sharing:
With the recent surge in the number of research papers reporting state-of-the-art results in deep learning, the challenge of reproducing a deep learning experiment has come to the forefront. (This will be covered on the ML application site.)
OCEL.AI will support sharing the models through local and cloud-based repositories and enable model reproducibility and exchange between different frameworks.
After training a model, it is necessary to deploy it in a production environment (e.g., a software system) or expose its functionality through REST APIs. Deployed models still require continuous monitoring to identify defects and continuously maintain their performance. In addition, the maintenance process often involves retraining the model on new data instances that were not included in the previous training dataset.
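As a rough sketch of the monitoring and retraining loop described above, the class below tracks a deployed model's accuracy over a sliding window and flags when retraining looks warranted. The class name, window size, and accuracy threshold are assumptions made for illustration; they do not describe an OCEL.AI API.

```python
from collections import deque


class DeployedModelMonitor:
    """Hypothetical monitor: rolling accuracy plus a buffer of new labeled instances."""

    def __init__(self, window=50, min_accuracy=0.8):
        self.outcomes = deque(maxlen=window)  # True/False per recent prediction
        self.min_accuracy = min_accuracy
        self.new_examples = []  # instances that were not in the original training set

    def record(self, features, predicted, actual):
        """Log one production prediction once its true label is known."""
        self.outcomes.append(predicted == actual)
        self.new_examples.append((features, actual))

    def rolling_accuracy(self):
        """Accuracy over the current window, or None before any prediction is logged."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        """Flag retraining only once the window is full and accuracy is below threshold."""
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.rolling_accuracy() < self.min_accuracy


# Toy usage with made-up predictions and labels.
monitor = DeployedModelMonitor(window=4, min_accuracy=0.75)
for features, predicted, actual in [
    ({"age": 7}, "art", "art"),
    ({"age": 9}, "sports", "science"),
    ({"age": 8}, "art", "sports"),
    ({"age": 10}, "science", "science"),
]:
    monitor.record(features, predicted, actual)

print(monitor.rolling_accuracy(), monitor.needs_retraining())
```

The buffered `new_examples` are the candidates for the next retraining round, since they arrived after the previous training dataset was assembled.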
OCEL.AI will support users who wish to explore, deploy, and maintain models and to build applications using such models. These topics will be discussed on the ML application page.
Activities for Student Learning
OCEL.AI designs two types of conceptualization activities to help students learn ML, particularly those with little or no programming knowledge.
Activity 1: Attribute Map
An attribute map is a visual summary of key entities (people and agencies) in the use case, and relevant attributes of each entity. For example, in the EduKC case, there are three key entities: parents who are looking for spring break camps, kids who will be attending the camps, and camps that offer the service. Each entity has some key attributes that may influence consumers’ decision-making.
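For learners who later move from the visual map to code, an attribute map can be written down as a simple mapping from each entity to its attributes. The three entities below come from the EduKC case; the attribute names are placeholders invented for this sketch, not the attributes used in the actual activity.

```python
# Entities are from the EduKC use case; attribute names are hypothetical examples.
attribute_map = {
    "parents": ["budget", "home_location", "work_schedule"],
    "kids": ["age", "interests"],
    "camps": ["price", "location", "activities", "age_range"],
}


def describe(attribute_map):
    """Return one summary line per entity, listing its attributes."""
    return [
        f"{entity}: {', '.join(attributes)}"
        for entity, attributes in attribute_map.items()
    ]


for line in describe(attribute_map):
    print(line)
```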
Activity 2: Creating a machine learning modeling plan.
The modeling plan is a very important and useful activity for students. Here are the steps of developing this modeling plan:
Column 1 “Needs”: Prioritize consumers’ needs based on the use case. This thinking process helps you create research questions and hypotheses about, for example, how a family chooses spring break camps. Column 1 summarizes the hierarchy of decision-making.
Column 2 “Data Types”: Next, go back to your data preparation assignment and fill out Column 2.
Column 3 “If You Do It Manually”: This column helps you take a first step toward modeling. Imagine that you are going to do the work manually: which kinds of analysis would you do to answer the research questions in Column 1? Remember, machines learn from humans and their intelligence, not the other way around.
Column 4 “Types of Analysis”: The good news is that you do not have to do any of those analyses manually. That is why AI and machine learning are useful to human beings. This column identifies the types of analysis to be performed, such as classification, correlation, or regression, on structured and unstructured (text or image) data.
Column 5 “Machine/Deep Learning Models”: This column is the most exciting. The chart below helps you choose among different machine and deep learning models. Learning resources on the website explain each model for various levels of learning needs.
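The five-column plan can also be captured as a small data structure once it has been filled out on paper. The sketch below is one possible representation: the field names mirror the column titles, while the example row (the need, the data, and the chosen model) is invented for illustration and is not taken from the chart or from the EduKC assignment.

```python
from dataclasses import dataclass, fields


@dataclass
class PlanRow:
    """One row of the modeling plan; each field corresponds to one column."""
    needs: str
    data_types: str
    if_done_manually: str
    types_of_analysis: str
    models: str


COLUMN_TITLES = {
    "needs": "Needs",
    "data_types": "Data Types",
    "if_done_manually": "If You Do It Manually",
    "types_of_analysis": "Types of Analysis",
    "models": "Machine/Deep Learning Models",
}


def render_plan(rows):
    """Format the plan as indented text, one block per prioritized need."""
    lines = []
    for number, row in enumerate(rows, start=1):
        lines.append(f"Need {number}")
        for field in fields(PlanRow):
            lines.append(f"  {COLUMN_TITLES[field.name]}: {getattr(row, field.name)}")
    return "\n".join(lines)


# Hypothetical example row, for illustration only.
plan = [
    PlanRow(
        needs="Find a camp the family can afford",
        data_types="structured: camp price (numeric), family budget (numeric)",
        if_done_manually="compare each camp's price against the budget",
        types_of_analysis="regression",
        models="linear regression",
    )
]
print(render_plan(plan))
```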