Training

Apart from active learning, small-text includes several helpers for classification.

Early Stopping

Early stopping is a mechanism that tries to avoid overfitting when training a model. To this end, an early stopping handler monitors certain metrics during training, usually after each epoch, to check whether an early stop should be triggered. If the handler deems an early stop necessary under the given constraints, check_early_stop() returns True. The handler only reports this decision; the respective classifier has to act on it.
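How a classifier might act on that return value can be sketched as follows. This is a minimal illustration and not small-text's training code: StopAfterNChecks and train are hypothetical names, and the loss values are made-up numbers standing in for real validation results.

```python
from typing import Dict, List


class StopAfterNChecks:
    """Hypothetical stand-in handler: requests a stop on the n-th check."""

    def __init__(self, n: int):
        self.n = n
        self.num_checks = 0

    def check_early_stop(self, epoch: int, measured_values: Dict[str, float]) -> bool:
        self.num_checks += 1
        return self.num_checks >= self.n


def train(val_losses: List[float], handler) -> int:
    """Runs a simulated training loop and returns the number of completed epochs."""
    completed = 0
    for epoch, val_loss in enumerate(val_losses, start=1):
        # One epoch of real training and validation would happen here.
        completed = epoch
        if handler.check_early_stop(epoch, {'val_loss': val_loss}):
            break  # the training loop, not the handler, ends the training
    return completed


epochs_run = train([0.9, 0.7, 0.6, 0.6, 0.6], StopAfterNChecks(3))
print(epochs_run)  # 3
```

The handler never interrupts anything by itself; if the loop ignored the return value, all five epochs would run.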

Interface

class EarlyStoppingHandler(ABC):

    @abstractmethod
    def check_early_stop(self, epoch: int, measured_values: Dict[str, float]) -> bool:
        """Checks if the training should be stopped early. The decision is made based on
        the measured values of one or more quantitative metrics over time.

        Parameters
        ----------
        epoch : int
            The number of the current epoch. Multiple checks per epoch are allowed.
        measured_values : dict of str to float
            A dictionary of measured values.
        """
        pass
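A custom handler only needs to implement check_early_stop(). The sketch below stops as soon as the monitored loss is no longer a finite number. The abstract base class is repeated locally so that the snippet runs on its own, and NonFiniteLossStopping is a hypothetical example that is not part of small-text.

```python
import math
from abc import ABC, abstractmethod
from typing import Dict


class EarlyStoppingHandler(ABC):
    # Local copy of the interface so that the example is self-contained.

    @abstractmethod
    def check_early_stop(self, epoch: int, measured_values: Dict[str, float]) -> bool:
        pass


class NonFiniteLossStopping(EarlyStoppingHandler):
    """Hypothetical handler: stops when the monitored loss is NaN or infinite."""

    def __init__(self, metric_name: str = 'val_loss'):
        self.metric_name = metric_name

    def check_early_stop(self, epoch: int, measured_values: Dict[str, float]) -> bool:
        value = measured_values.get(self.metric_name)
        if value is None:
            return False  # the metric was not reported in this check
        return not math.isfinite(value)


handler = NonFiniteLossStopping()
print(handler.check_early_stop(1, {'val_loss': 0.42}))          # False
print(handler.check_early_stop(2, {'val_loss': float('nan')}))  # True
```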

Example Usage

Monitoring validation loss (lower is better):

from small_text.training.early_stopping import EarlyStopping
from small_text.training.metrics import Metric

early_stopping = EarlyStopping(Metric('val_loss'), patience=2)

print(early_stopping.check_early_stop(1, {'val_loss': 0.060}))
print(early_stopping.check_early_stop(2, {'val_loss': 0.061}))  # no improvement, don't stop
print(early_stopping.check_early_stop(3, {'val_loss': 0.060}))  # no improvement, don't stop
print(early_stopping.check_early_stop(4, {'val_loss': 0.060}))  # no improvement, stop

Output:

False
False
False
True

Monitoring validation accuracy (higher is better) with patience=1:

from small_text.training.early_stopping import EarlyStopping
from small_text.training.metrics import Metric

early_stopping = EarlyStopping(Metric('val_acc', lower_is_better=False), patience=1)

print(early_stopping.check_early_stop(1, {'val_acc': 0.80}))
print(early_stopping.check_early_stop(2, {'val_acc': 0.79}))  # no improvement, don't stop
print(early_stopping.check_early_stop(3, {'val_acc': 0.81}))  # improvement
print(early_stopping.check_early_stop(4, {'val_acc': 0.81}))  # no improvement, don't stop
print(early_stopping.check_early_stop(5, {'val_acc': 0.80}))  # no improvement, stop

Output:

False
False
False
False
True

Combining Early Stopping Conditions

What if we want to stop early based on either of two conditions, for example, if the validation loss has not improved during the last 3 checks or the training accuracy crosses 0.99? This can be done with EarlyStoppingOrCondition, which sequentially applies a list of early stopping handlers.

from small_text.training.early_stopping import EarlyStopping, EarlyStoppingOrCondition
from small_text.training.metrics import Metric

early_stopping = EarlyStoppingOrCondition([
    EarlyStopping(Metric('val_loss'), patience=3),
    EarlyStopping(Metric('train_acc', lower_is_better=False), threshold=0.99)
])

EarlyStoppingOrCondition returns True, i.e. triggers an early stop, iff at least one of the early stopping handlers within the given list returns True. Similarly, EarlyStoppingAndCondition stops only when all of the early stopping handlers return True.
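The aggregation itself amounts to a logical or/and over the individual answers. The following sketch illustrates this with plain functions instead of handler objects. The names combine, loss_below, and acc_above are hypothetical, and evaluating every check before aggregating is a design choice of this sketch, not confirmed library behaviour.

```python
from typing import Callable, Dict, List

# A check is any callable with the signature of check_early_stop().
Check = Callable[[int, Dict[str, float]], bool]


def combine(checks: List[Check], epoch: int, measured_values: Dict[str, float],
            mode: str = 'or') -> bool:
    # Evaluate every check first so that stateful handlers would all see the measurement.
    results = [check(epoch, measured_values) for check in checks]
    return any(results) if mode == 'or' else all(results)


def loss_below(epoch, values):
    return values['val_loss'] < 0.05


def acc_above(epoch, values):
    return values['train_acc'] > 0.99


values = {'val_loss': 0.08, 'train_acc': 0.995}
print(combine([loss_below, acc_above], 5, values, mode='or'))   # True
print(combine([loss_below, acc_above], 5, values, mode='and'))  # False
```

Collecting all results before calling any() or all() avoids short-circuiting, which matters when handlers keep a history of measured values.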

Implementations

class small_text.training.early_stopping.EarlyStopping(metric, min_delta: float = 1e-14, patience: int = 5, threshold: float = 0.0)[source]

A default early stopping implementation which supports stopping based on a threshold or on a lack of improvement within a given patience.

Added in version 1.1.0.

__init__(metric, min_delta: float = 1e-14, patience: int = 5, threshold: float = 0.0)

Parameters:

metric (small_text.training.metrics.Metric) – The measured training metric which will be monitored for early stopping.
min_delta (float, default=1e-14) – The minimum absolute value to consider a change in the measured value as an improvement.
patience (int, default=5) – The maximum number of steps (i.e. calls to check_early_stop()) which can yield no improvement. Disable patience-based improvement monitoring by setting patience to a value less than zero.
threshold (float, default=0.0) – If greater than zero, then early stopping is triggered as soon as the current measured value crosses ('val_acc', 'train_acc') or falls below ('val_loss', 'train_loss') the given threshold. Disable threshold-based stopping by setting the threshold to a value less than or equal to zero.

check_early_stop(epoch, measured_values)

Checks if the training should be stopped early. The decision is made based on the measured values of one or more quantitative metrics over time.

Returns True if the threshold is crossed/undercut (for accuracy/loss respectively).
Checks for an improvement and returns True if patience has been exceeded.
Otherwise, returns False.

Parameters:

epoch (int) – The number of the current epoch (1-indexed). Multiple checks per epoch are allowed.
measured_values (dict of str to float) – A dictionary of measured values.

Note

Currently, the supported metrics are validation accuracy (val_acc), validation loss (val_loss), training accuracy (train_acc), and training loss (train_loss). For accuracy metrics, a higher value is better, i.e. patience triggers only when the metric has not exceeded the previous best value. For loss metrics, a lower value is better, i.e. patience triggers only when the metric has not fallen below the previous best value.
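The decision order described for check_early_stop() (threshold first, then patience, otherwise no stop) can be sketched as follows. This is a simplified stand-in that reproduces the documented behaviour for a single metric and is not the library's source: SketchEarlyStopping and its attribute names are made up, and it takes a plain float instead of a dictionary of measured values.

```python
from typing import Optional


class SketchEarlyStopping:
    """Simplified stand-in for the documented decision order (single metric)."""

    def __init__(self, lower_is_better: bool = True, min_delta: float = 1e-14,
                 patience: int = 5, threshold: float = 0.0):
        self.lower_is_better = lower_is_better
        self.min_delta = min_delta
        self.patience = patience
        self.threshold = threshold
        self.best: Optional[float] = None
        self.checks_without_improvement = 0

    def check_early_stop(self, value: float) -> bool:
        # 1. Threshold: stop as soon as the value is past the threshold.
        if self.threshold > 0:
            if self.lower_is_better:
                crossed = value < self.threshold
            else:
                crossed = value > self.threshold
            if crossed:
                return True
        # 2. Patience: count checks that fail to beat the best value so far.
        if self.best is None:
            self.best = value
            return False
        gain = self.best - value if self.lower_is_better else value - self.best
        if gain > self.min_delta:
            self.best = value
            self.checks_without_improvement = 0
        else:
            self.checks_without_improvement += 1
        # 3. Stop only once patience is exceeded; a negative patience disables this.
        return self.patience >= 0 and self.checks_without_improvement > self.patience


stopper = SketchEarlyStopping(lower_is_better=True, patience=2)
results = [stopper.check_early_stop(v) for v in (0.060, 0.061, 0.060, 0.060)]
print(results)  # [False, False, False, True]
```

With patience=2, two checks without improvement are tolerated and the third one triggers the stop, which matches the validation loss example in the usage section.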

class small_text.training.early_stopping.EarlyStoppingAndCondition(early_stopping_handlers)[source]

A sequential early stopping handler which bases its response on a list of sub handlers. Whenever all early stopping sub handlers return True, the aggregated response will be True, i.e. the answer is the combination of single answers aggregated by a logical and.

Added in version 1.1.0.

__init__(early_stopping_handlers)

Parameters: early_stopping_handlers (list of EarlyStoppingHandler) – A list of early stopping (sub-)handlers.

check_early_stop(epoch, measured_values)

Checks if the training should be stopped early. The decision is made based on the measured values of one or more quantitative metrics over time.

Parameters:

epoch (int) – The number of the current epoch (1-indexed). Multiple checks per epoch are allowed.
measured_values (dict of str to float) – A dictionary of measured values.

class small_text.training.early_stopping.EarlyStoppingOrCondition(early_stopping_handlers: List[EarlyStoppingHandler])[source]

A sequential early stopping handler which bases its response on a list of sub handlers. As soon as at least one early stopping handler returns True, the aggregated response will be True, i.e. the answer is the combination of single answers aggregated by a logical or.

Added in version 1.1.0.

__init__(early_stopping_handlers: List[EarlyStoppingHandler])

Parameters: early_stopping_handlers (list of EarlyStoppingHandler) – A list of early stopping (sub-)handlers.

check_early_stop(epoch, measured_values)

Checks if the training should be stopped early. The decision is made based on the measured values of one or more quantitative metrics over time.

Parameters:

epoch (int) – The number of the current epoch (1-indexed). Multiple checks per epoch are allowed.
measured_values (dict of str to float) – A dictionary of measured values.

class small_text.training.early_stopping.NoopEarlyStopping[source]

A no-operation early stopping handler which never stops. This is for developer convenience only; you will likely not need this in an application setting.

Added in version 1.1.0.

check_early_stop(epoch: int, measured_values: Dict[str, float])

Checks if the training should be stopped early. The decision is made based on the measured values of one or more quantitative metrics over time.

Parameters:

epoch (int) – The number of the current epoch (1-indexed). Multiple checks per epoch are allowed.
measured_values (dict of str to float) – A dictionary of measured values.

Model Selection

Given a set of models that have been trained on the same data, model selection chooses the model that is considered best according to some criterion. In the context of neural networks, typical use cases are the training process, where the candidate set consists of the model state after each epoch, and hyperparameter search, where one model is trained per hyperparameter configuration.

Interface

class ModelSelectionManager(ABC):

    def add_model(self, model_id, model_path, measured_values, step=0, fields=dict()):
        """Adds the data for a trained model. This includes measured values of certain metrics
        and additional fields by which a model selection strategy then selects the model.

        Parameters
        ----------
        model_id : str
            Unique identifier for this model.
        model_path : str
            Path to the model.
        measured_values : dict of str to object
            A dictionary of measured values.
        step : int
            The number of the epoch (1-indexed) which is associated with this model.
        fields : dict of str to object
            A dictionary of additional measured fields.
        """

    def select(self, select_by=None):
        """Selects the best model.

        Parameters
        ----------
        select_by : str or list of str
            Name of the strategy that chooses the model. The choices are specific to the
            implementation.

        Returns
        -------
        model_selection_result : ModelSelectionResult or None
            A model selection result object which contains the data of the selected model
            or None if no model could be selected.
        """
        pass

Example Usage

from small_text.training.model_selection import ModelSelection

model_selection = ModelSelection()

measured_values = {'val_acc': 0.87, 'train_acc': 0.89, 'val_loss': 0.123}
model_selection.add_model('model_id_1', 'model_1.bin', measured_values)
measured_values = {'val_acc': 0.88, 'train_acc': 0.91, 'val_loss': 0.091}
model_selection.add_model('model_id_2', 'model_2.bin', measured_values)
measured_values = {'val_acc': 0.87, 'train_acc': 0.92, 'val_loss': 0.101}
model_selection.add_model('model_id_3', 'model_3.bin', measured_values)

print(model_selection.select(select_by='val_acc'))
print(model_selection.select(select_by='train_acc'))
print(model_selection.select(select_by=['val_acc', 'train_acc']))

Output:

ModelSelectionResult('model_id_2', 'model_2.bin', {'val_loss': 0.091, 'val_acc': 0.88, 'train_loss': nan, 'train_acc': 0.91}, {'early_stop': False})
ModelSelectionResult('model_id_3', 'model_3.bin', {'val_loss': 0.101, 'val_acc': 0.87, 'train_loss': nan, 'train_acc': 0.92}, {'early_stop': False})
ModelSelectionResult('model_id_2', 'model_2.bin', {'val_loss': 0.091, 'val_acc': 0.88, 'train_loss': nan, 'train_acc': 0.91}, {'early_stop': False})

Implementations

class small_text.training.model_selection.ModelSelection(default_select_by=DEFAULT_DEFAULT_SELECT_BY, metrics=DEFAULT_METRICS, required=DEFAULT_REQUIRED_METRIC_NAMES, fields_config=dict())[source]

A default model selection implementation.

Added in version 1.1.0.

DEFAULT_METRICS = [Metric('val_loss', dtype=float, lower_is_better=True), Metric('val_acc', dtype=float, lower_is_better=False), Metric('train_loss', dtype=float, lower_is_better=True), Metric('train_acc', dtype=float, lower_is_better=False)]: Default metric configuration to be used.

DEFAULT_REQUIRED_METRIC_NAMES = ['val_loss', 'val_acc']: Names of the metrics that must be reported to add_model().

FIELD_NAME_EARLY_STOPPING = 'early_stop': Field name for the early stopping default field.

DEFAULT_DEFAULT_SELECT_BY = ['val_loss', 'val_acc']: Metrics by which the select() function chooses the best model.

__init__(default_select_by=DEFAULT_DEFAULT_SELECT_BY, metrics=DEFAULT_METRICS, required=DEFAULT_REQUIRED_METRIC_NAMES, fields_config=dict())

Parameters:

default_select_by (str or list of str) – Metric or list of metrics. In case a list is given, the model selection starts with the first metric and switches to the next one in case of a tie.
metrics (list of small_text.training.metrics.Metric) – The metrics whose measured values will be used for deciding which model to use.
required (list of str) – Names of the metrics given by metrics that are required. Non-required metrics can be reported as None.
fields_config (dict of str to type) – A configuration for additional data fields that can be measured and taken into account when selecting the model. Fields can be None by default but can be required by model selection strategies.

add_model(model_id, model_path, measured_values, fields=dict())

Adds the data for a trained model. This includes measured values of certain metrics and additional fields by which a model selection strategy then selects the model.

Parameters:

model_id (str) – Unique identifier for this model.
model_path (str) – Path to the model.
measured_values (dict of str to object) – A dictionary of measured values.
step (int) – The number of the epoch (1-indexed) which is associated with this model.
fields (dict of str to object) – A dictionary of additional measured fields.

select(select_by=None)

Parameters: select_by (str or list of str) – Metric or list of metrics. Takes precedence over self.default_select_by if not None. In case a list is given, the model selection starts with the first metric and switches to the next one in case of a tie.
Returns: model_selection_result – A model selection result object which contains the data of the selected model or None if no model could be selected.
Return type: ModelSelectionResult or None
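The tie-breaking order for a list of metrics can be illustrated with tuple comparison. The sketch below is an assumption-laden illustration and not the library's implementation: select_best and LOWER_IS_BETTER are hypothetical names, and models are represented as plain (id, measured values) pairs.

```python
from typing import Dict, List, Optional, Tuple

# Which metrics are minimised; mirrors the lower_is_better flags in DEFAULT_METRICS.
LOWER_IS_BETTER = {'val_loss': True, 'val_acc': False,
                   'train_loss': True, 'train_acc': False}


def select_best(models: List[Tuple[str, Dict[str, float]]],
                select_by: List[str]) -> Optional[str]:
    """Returns the id of the best model, comparing metrics in the given order."""
    if not models:
        return None

    def sort_key(entry):
        _, values = entry
        # Negate "higher is better" metrics so that smaller is always better.
        return tuple(values[name] if LOWER_IS_BETTER[name] else -values[name]
                     for name in select_by)

    # Tuples compare element-wise, so later metrics only matter on a tie.
    return min(models, key=sort_key)[0]


models = [
    ('model_id_1', {'val_acc': 0.87, 'train_acc': 0.89, 'val_loss': 0.123}),
    ('model_id_2', {'val_acc': 0.88, 'train_acc': 0.91, 'val_loss': 0.091}),
    ('model_id_3', {'val_acc': 0.87, 'train_acc': 0.92, 'val_loss': 0.101}),
]
print(select_best(models, ['val_acc', 'train_acc']))       # model_id_2
print(select_best(models[::2], ['val_acc', 'train_acc']))  # model_id_3 (tie on val_acc)
```

If two models tie on every listed metric, min() keeps the first one that was added; how small-text resolves a full tie is not specified here.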

class small_text.training.model_selection.NoopModelSelection[source]

A no-operation model selection handler. This is for developer convenience only; you will likely not need this in an application setting.

Added in version 1.1.0.

add_model(model_id, model_path, measured_values, step=0, fields=dict())

Adds the data for a trained model. This includes measured values of certain metrics and additional fields by which a model selection strategy then selects the model.

Parameters:

model_id (str) – Unique identifier for this model.
model_path (str) – Path to the model.
measured_values (dict of str to object) – A dictionary of measured values.
step (int) – The number of the epoch (1-indexed) which is associated with this model.
fields (dict of str to object) – A dictionary of additional measured fields.

select(select_by=None)

Selects the best model.

Parameters: select_by (str or list of str) – Name of the strategy that chooses the model. The choices are specific to the implementation.
Returns: model_selection_result – A model selection result object which contains the data of the selected model or None if no model could be selected.
Return type: ModelSelectionResult or None