Stopping Criteria
Stopping criteria indicate when to exit the active learning loop.
Pre-implemented Stopping Criteria
Interface
This interface is one of the trickiest, because a stopping criterion may draw on any information available within
the active learning process (excluding experiment-only information such as the test set, of course).
Therefore, all arguments are optional and default to None, and the interface provides only a
very loose frame for how stopping criteria should be built.
from abc import ABC, abstractmethod


class StoppingCriterion(ABC):

    @abstractmethod
    def stop(self, active_learner=None, predictions=None, proba=None, indices_stopping=None):
        """
        Parameters
        ----------
        active_learner : small_text.active_learner.PoolBasedActiveLearner
            An active learner instance.
        predictions : np.ndarray[int]
            Predictions for a fixed subset (usually the full train set).
        proba : np.ndarray[float]
            Probability distribution over the possible classes for a fixed subset. This is expected
            to have the same length as `predictions` unless one of `predictions` and `proba`
            is `None`.
        indices_stopping : np.ndarray[int]
            Uses the given indices to select a subset for stopping from either `predictions`
            or `proba` if not `None`. The indices are relative to `predictions` and `proba`.
        """
        pass
For an example, see KappaAverage, which stops when the change in the predictions over multiple iterations falls below a fixed threshold.
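Because the interface is so loose, a small custom criterion may help illustrate how it can be filled in. The following is a minimal sketch, not part of the library: `UnchangedPredictions` and its `max_changed_fraction` parameter are hypothetical names, and the `StoppingCriterion` class is a local stand-in that mirrors the interface above so that the snippet runs on its own. It accepts any indexable sequence of labels, which is an assumption made to keep the example free of dependencies.

```python
from abc import ABC, abstractmethod


class StoppingCriterion(ABC):
    """Local stand-in mirroring the interface above (not imported from small_text)."""

    @abstractmethod
    def stop(self, active_learner=None, predictions=None, proba=None, indices_stopping=None):
        pass


class UnchangedPredictions(StoppingCriterion):
    """Hypothetical criterion: stop once few predictions change between two calls."""

    def __init__(self, max_changed_fraction=0.01):
        self.max_changed_fraction = max_changed_fraction
        self.last_predictions = None

    def stop(self, active_learner=None, predictions=None, proba=None, indices_stopping=None):
        if predictions is None:
            raise ValueError('This criterion requires predictions.')
        if indices_stopping is not None:
            # Restrict the comparison to the requested subset.
            predictions = [predictions[i] for i in indices_stopping]
        current = list(predictions)
        if not current:
            raise ValueError('predictions must not be empty.')

        previous, self.last_predictions = self.last_predictions, current
        if previous is None:
            return False  # nothing to compare against on the first call
        if len(previous) != len(current):
            raise ValueError('predictions must refer to the same fixed subset on every call.')

        # Fraction of instances whose predicted label changed since the last call.
        changed = sum(1 for old, new in zip(previous, current) if old != new)
        return changed / len(current) <= self.max_changed_fraction


criterion = UnchangedPredictions(max_changed_fraction=0.25)
first = criterion.stop(predictions=[0, 1, 1, 0])   # first call, no history yet
second = criterion.stop(predictions=[0, 1, 0, 0])  # one of four labels changed
```

Only `predictions` and `indices_stopping` are used here; a criterion based on confidence would read `proba` instead and ignore the rest in the same way.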
Classes
- class small_text.stopping_criteria.kappa.KappaAverage(num_classes, window_size=3, kappa=0.99)
A stopping criterion which measures the agreement between sets of predictions [BV09].
- __init__(num_classes, window_size=3, kappa=0.99)
- num_classes : int
Number of classes.
- window_size : int, default=3
Defines number of iterations for which the predictions are taken into account, i.e. this stopping criterion only sees the last window_size-many states of the prediction array passed to stop().
- kappa : float, default=0.99
The criterion stops when the agreement between consecutive predictions within the window reaches or exceeds this threshold.
- stop(active_learner=None, predictions=None, proba=None, indices_stopping=None)
- Parameters
active_learner (small_text.active_learner.PoolBasedActiveLearner) – An active learner instance.
predictions (np.ndarray[int]) – Predictions for a fixed subset (usually the full train set).
proba (np.ndarray[float]) – Probability distribution over the possible classes for a fixed subset. This is expected to have the same length as predictions unless one of predictions and proba is None.
indices_stopping (np.ndarray[int]) – Uses the given indices to select a subset for stopping from either predictions or proba if not None. The indices are relative to predictions and proba.
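To make the notion of agreement concrete, the following sketch computes Cohen's kappa between consecutive prediction arrays and averages it over a window. It is an illustration of the idea only, not the library's implementation: the helper names `cohens_kappa` and `mean_windowed_kappa` are hypothetical, the window is assumed to cover the last window_size prediction states (and therefore window_size - 1 pairs), and returning 1.0 when both arrays contain the same single label is a choice made for this example.

```python
from collections import Counter


def cohens_kappa(first, second):
    """Cohen's kappa between two equally long label sequences."""
    if len(first) != len(second) or len(first) == 0:
        raise ValueError('Both sequences must be non-empty and of equal length.')
    n = len(first)
    # Observed agreement: fraction of positions with identical labels.
    observed = sum(1 for a, b in zip(first, second) if a == b) / n
    # Agreement expected by chance, from the label frequencies of each sequence.
    counts_first, counts_second = Counter(first), Counter(second)
    expected = sum(counts_first[label] * counts_second[label] for label in counts_first) / (n * n)
    if expected == 1.0:
        return 1.0  # assumption: identical single-label sequences count as full agreement
    return (observed - expected) / (1.0 - expected)


def mean_windowed_kappa(prediction_history, window_size=3):
    """Mean kappa over consecutive pairs in the last `window_size` states, or None if too few."""
    if window_size < 2:
        raise ValueError('window_size must be at least 2.')
    if len(prediction_history) < window_size:
        return None
    window = prediction_history[-window_size:]
    kappas = [cohens_kappa(old, new) for old, new in zip(window, window[1:])]
    return sum(kappas) / len(kappas)


history = [
    [0, 1, 0, 1],
    [0, 1, 1, 1],
    [0, 1, 1, 1],
    [0, 1, 1, 1],
]
early = mean_windowed_kappa(history[:3])  # predictions still changing
late = mean_windowed_kappa(history)       # predictions have stabilized
```

With a threshold of 0.99, the early window would not yet qualify for stopping, while the late window, in which the predictions no longer change, would.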
- class small_text.stopping_criteria.base.DeltaFScore(num_classes, window_size=3, threshold=0.05)
A stopping criterion which stops if the predicted change of the F-score falls below a threshold [AB19].
Note
This criterion is only applicable for binary classification.
- __init__(num_classes, window_size=3, threshold=0.05)
- num_classes : int
Number of classes.
- window_size : int, default=3
Defines number of iterations for which the predictions are taken into account, i.e. this stopping criterion only sees the last window_size-many states of the prediction array passed to stop().
- threshold : float, default=0.05
The criterion stops when the predicted change of the F-score falls below this threshold.
- stop(active_learner=None, predictions=None, proba=None, indices_stopping=None)
- Parameters
active_learner (small_text.active_learner.PoolBasedActiveLearner) – An active learner instance.
predictions (np.ndarray[int]) – Predictions for a fixed subset (usually the full train set).
proba (np.ndarray[float]) – Probability distribution over the possible classes for a fixed subset. This is expected to have the same length as predictions unless one of predictions and proba is None.
indices_stopping (np.ndarray[int]) – Uses the given indices to select a subset for stopping from either predictions or proba if not None. The indices are relative to predictions and proba.
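One way to estimate the change of the F-score without labels is to take one minus the F1 agreement between two consecutive sets of binary predictions, treating the older predictions as the reference. The sketch below follows that formulation; whether it matches the library's computation in every detail is an assumption, and the names `predicted_delta_f`, `should_stop`, and `positive_label` are hypothetical.

```python
def predicted_delta_f(previous, current, positive_label=1):
    """One minus the F1 agreement between two binary prediction sequences."""
    if len(previous) != len(current):
        raise ValueError('Both sequences must have the same length.')
    pairs = list(zip(previous, current))
    both = sum(1 for p, c in pairs if p == positive_label and c == positive_label)
    only_previous = sum(1 for p, c in pairs if p == positive_label and c != positive_label)
    only_current = sum(1 for p, c in pairs if p != positive_label and c == positive_label)
    denominator = 2 * both + only_previous + only_current
    if denominator == 0:
        return 0.0  # assumption: no positive predictions at all is treated as "no change"
    return 1.0 - 2 * both / denominator


def should_stop(delta_history, window_size=3, threshold=0.05):
    """True if the last `window_size` estimated changes are all below `threshold`."""
    if len(delta_history) < window_size:
        return False
    return all(delta < threshold for delta in delta_history[-window_size:])


unstable = predicted_delta_f([1, 1, 0, 0], [1, 0, 0, 1])  # half of the positives disagree
stable = predicted_delta_f([1, 1, 0, 0], [1, 1, 0, 0])    # identical predictions
```

Requiring every value in the window to stay below the threshold, rather than only the most recent one, guards against stopping on a single quiet iteration; this is a design choice of the sketch.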