Transformers Integration Classes

Classification

class small_text.integrations.transformers.classifiers.classification.FineTuningArguments(base_lr, layerwise_gradient_decay, gradual_unfreezing=-1, cut_fraction=0.1)[source]

Arguments to enable and configure gradual unfreezing and discriminative learning rates as used in Universal Language Model Fine-tuning (ULMFiT) [HR18].

__init__(base_lr, layerwise_gradient_decay, gradual_unfreezing=-1, cut_fraction=0.1)
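
A minimal sketch of how these arguments might be used (the checkpoint name and the decay value are illustrative assumptions, not recommended settings):

from small_text.integrations.transformers import TransformerBasedClassification, TransformerModelArguments
from small_text.integrations.transformers.classifiers.classification import FineTuningArguments

# base learning rate plus a per-layer gradient decay factor (ULMFiT-style discriminative learning rates)
fine_tuning_args = FineTuningArguments(base_lr=2e-5, layerwise_gradient_decay=0.975)

clf = TransformerBasedClassification(TransformerModelArguments('bert-base-uncased'),
                                     num_classes=2,
                                     fine_tuning_arguments=fine_tuning_args)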
class small_text.integrations.transformers.classifiers.classification.TransformerBasedClassification(transformer_model, num_classes, multi_label=False, num_epochs=10, lr=2e-05, mini_batch_size=12, validation_set_size=0.1, validations_per_epoch=1, early_stopping_no_improvement=5, early_stopping_acc=-1, model_selection=True, fine_tuning_arguments=None, device=None, memory_fix=1, class_weight=None, verbosity=VERBOSITY_MORE_VERBOSE, cache_dir='.active_learning_lib_cache/')[source]
__init__(transformer_model, num_classes, multi_label=False, num_epochs=10, lr=2e-05, mini_batch_size=12, validation_set_size=0.1, validations_per_epoch=1, early_stopping_no_improvement=5, early_stopping_acc=-1, model_selection=True, fine_tuning_arguments=None, device=None, memory_fix=1, class_weight=None, verbosity=VERBOSITY_MORE_VERBOSE, cache_dir='.active_learning_lib_cache/')
Parameters
  • transformer_model (TransformerModelArguments) – Settings for transformer model, tokenizer and config.

  • num_classes (int) – Number of classes.

  • multi_label (bool, default=False) – If False, the classes are mutually exclusive, i.e. the prediction step results in exactly one predicted label per instance.

  • num_epochs (int, default=10) – Epochs to train.

  • lr (float, default=2e-5) – Learning rate.

  • mini_batch_size (int, default=12) – Size of mini batches during training.

  • validation_set_size (float, default=0.1) – The size of the validation set as a fraction of the training set.

  • validations_per_epoch (int, default=1) – Defines how often the validation set is evaluated during the training of a single epoch.

  • early_stopping_no_improvement (int, default=5) –

    Number of epochs with no improvement in validation loss until early stopping is triggered.

    Deprecated since version 1.1.0: Use the early_stopping kwarg in fit() instead.

  • early_stopping_acc (float, default=-1) –

    Accuracy threshold in the interval (0, 1] which triggers early stopping.

    Deprecated since version 1.1.0: Use the early_stopping kwarg in fit() instead.

  • model_selection (bool, default=True) – If True, the model is saved after each epoch and, at the end of the training step, the model with the lowest validation error is selected.

  • fine_tuning_arguments (FineTuningArguments or None, default=None) – Fine tuning arguments.

  • device (str or torch.device, default=None) – Torch device on which the computation will be performed.

  • memory_fix (int, default=1) – If this value is greater than zero, the CUDA cache is emptied every memory_fix epochs to force unused GPU memory to be released.

  • class_weight ('balanced' or None, default=None) – If ‘balanced’, the loss function is weighted inversely proportional to the label distribution of the current train set.
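
As a concrete sketch of how these parameters are passed at construction time (all values below are illustrative assumptions; 'bert-base-uncased' stands in for any Hugging Face model checkpoint):

from small_text.integrations.transformers import TransformerBasedClassification, TransformerModelArguments

model_args = TransformerModelArguments('bert-base-uncased')
clf = TransformerBasedClassification(model_args,
                                     num_classes=3,
                                     num_epochs=10,
                                     lr=2e-5,
                                     mini_batch_size=12,
                                     validation_set_size=0.1,
                                     class_weight='balanced',
                                     device='cuda')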

fit(train_set, validation_set=None, weights=None, early_stopping=None, model_selection=None, optimizer=None, scheduler=None)

Trains the model using the given train set.

Parameters
  • train_set (TransformersDataset) – Training set.

  • validation_set (TransformersDataset, default=None) – A validation set used for validation during training, or None. If None, the fit operation will split off a subset of the train set to use as a validation set, whose size is set by self.validation_set_size.

  • weights (np.ndarray[np.float32] or None, default=None) – Sample weights or None.

  • early_stopping (EarlyStoppingHandler or 'none') – A strategy for early stopping. Passing ‘none’ disables early stopping.

  • model_selection (ModelSelectionHandler or 'none') – A model selection handler. Passing ‘none’ disables model selection.

  • optimizer (torch.optim.optimizer.Optimizer or None, default=None) – A pytorch optimizer.

  • scheduler (torch.optim._LRScheduler or None, default=None) – A pytorch scheduler.

Returns

self – Returns the current classifier with a fitted model.

Return type

TransformerBasedClassification
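
For example, assuming clf is a TransformerBasedClassification instance and train_set / validation_set are TransformersDataset objects prepared elsewhere:

# explicit validation set
clf.fit(train_set, validation_set=validation_set)

# or let fit() split off a fraction of train_set (validation_set_size) for validation
clf.fit(train_set)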

predict(data_set, return_proba=False)
Parameters
  • data_set (small_text.integrations.transformers.TransformersDataset) – A dataset on whose instances predictions are made.

  • return_proba (bool, default=False) – If True, additionally returns the confidence distribution over all classes.

Returns

  • predictions (np.ndarray[np.int32] or csr_matrix[np.int32]) – List of predictions if the classifier was fitted on single-label data, otherwise a sparse matrix of predictions.

  • probas (np.ndarray[np.float32], optional) – List of probabilities (or confidence estimates) if return_proba is True.
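
Example usage on a fitted classifier (test_set is assumed to be a TransformersDataset):

predictions = clf.predict(test_set)
predictions, proba = clf.predict(test_set, return_proba=True)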

predict_proba(data_set)
Parameters

data_set (small_text.integrations.transformers.TransformersDataset) – A dataset for which the confidence scores are predicted.

Returns

scores – Distribution of confidence scores over all classes.

Return type

np.ndarray
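
Likewise, for a fitted classifier:

scores = clf.predict_proba(test_set)
# scores has one row per instance and one column per class;
# in the single-label case each row is a probability distribution over the classes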