In this series, we've been exploring the topic of guided labeling by looking at active learning and label density. In the first episode, we introduced active learning and active learning sampling, and in the second article we moved on to look at label density. Here are the two previous episodes:
Guided Labeling Episode 1: An Introduction to Active Learning
Guided Labeling Episode 2: Label Density
In this third episode, we are moving on to look at model uncertainty.
Using label density
We explore the feature space and retrain the model each time with new labels that are both representative of a good subset of the unlabeled data and different from the already labeled data of past iterations. However, besides selecting data points based on the overall distribution, we should also prioritize missing labels based on the attached model predictions. In every iteration, we can score the still-unlabeled data with the retrained model. What can we infer, given those predictions by the constantly retrained model?
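The scoring step described above can be sketched as follows. This is a minimal illustration with a hypothetical model interface (`ToyModel` and its `predict_proba` method are stand-ins, not part of any real library): at each iteration, the retrained model assigns a predicted probability to every row still missing a label.

```python
import math

def score_unlabeled(model, unlabeled_rows):
    """Return (row, predicted_probability) pairs for the unlabeled pool."""
    return [(row, model.predict_proba(row)) for row in unlabeled_rows]

class ToyModel:
    """Stand-in for the constantly retrained classifier (an assumption,
    not a real API): a linear score squashed into [0, 1]."""
    def predict_proba(self, row):
        height_m, weight_kg = row
        z = weight_kg - (60.0 * height_m - 40.0)  # hypothetical linear boundary
        return 1.0 / (1.0 + math.exp(-z))

# Score the rows that still need a label with the current model.
pool = [(1.80, 70.0), (1.60, 80.0)]
scores = score_unlabeled(ToyModel(), pool)
```

These per-row probabilities are exactly the predictions the next sections reason about when deciding which labels to request first.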
Before we can answer this question, we need another common concept in machine learning classification related to the feature space: the decision boundary. The decision boundary defines a hyper-surface in the n-dimensional feature space which separates data points depending on the predicted label.
In Figure 1 below, we refer again to our data set with only two columns: weight and height. In this case, the decision boundary is drawn by a machine learning model trained to predict overweight and underweight conditions. In this example, the boundary is a line; however, it could also have been a curve or a closed shape.
Figure 1: In the 2D feature space of weight vs. height, we train a machine learning model to distinguish overweight and underweight subjects. The model prediction is visually and conceptually represented by the decision boundary: a line dividing the subjects into the two categories.