Changelog
Version 1.0.0 - 2022-06-14
First stable release.
Changed
Datasets:
SklearnDataset now checks if the dimensions of the features and labels match.
Query Strategies:
ExpectedGradientLengthMaxWord: Cleaned up code and added checks to detect invalid configurations.
Documentation:
The documentation is now available in full width.
Repository:
Versions in this repository can now be referenced using the respective Zenodo DOI.
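As a rough illustration of the dataset check listed under Datasets above, the following sketch verifies that features and labels describe the same number of instances. The helper names num_instances and check_features_and_labels and the error message are hypothetical; the actual check inside SklearnDataset may differ.

```python
def num_instances(data):
    """Return the number of rows of an array-like (via .shape) or a sequence."""
    shape = getattr(data, 'shape', None)
    return shape[0] if shape is not None else len(data)


def check_features_and_labels(x, y):
    """Raise a ValueError if x and y disagree on the number of instances.

    Hypothetical helper for illustration; not part of the small-text API.
    """
    n_x, n_y = num_instances(x), num_instances(y)
    if n_x != n_y:
        raise ValueError(
            f'Feature and label dimensions do not match: {n_x} != {n_y}'
        )


features = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # three instances
labels = [0, 1, 0]                                # three labels

check_features_and_labels(features, labels)  # passes silently
```

Failing early at construction time is the point of such a check: a mismatch is reported where the dataset is created rather than later during training.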
[1.0.0b4] - 2022-05-04
Added
General:
We now have a concept for optional dependencies which allows components to rely on soft dependencies, i.e. Python dependencies which can be installed on demand (and only when certain functionality is needed).
Datasets:
The Dataset interface now has a clone() method that creates an identical copy of the respective dataset.
Query Strategies:
New strategies: DiscriminativeActiveLearning and SEALS.
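The idea of soft dependencies mentioned under General above can be sketched with the standard library alone. This is a minimal sketch of the general pattern, not the mechanism this library uses: the names MissingOptionalDependencyError and require_optional_dependency are hypothetical.

```python
import importlib.util


class MissingOptionalDependencyError(ImportError):
    """Raised when functionality needs a package that is not installed."""


def require_optional_dependency(module_name):
    """Fail with a helpful message if a top-level module is not installed.

    Hypothetical helper; module_name is expected to be a top-level
    package name (no dots).
    """
    if importlib.util.find_spec(module_name) is None:
        raise MissingOptionalDependencyError(
            f'The optional dependency "{module_name}" is required for this '
            f'functionality. Please install it on demand.'
        )


def functionality_needing_json():
    # The check runs only when this functionality is actually used,
    # so users who never call it do not need the dependency.
    require_optional_dependency('json')
    import json
    return json.dumps({'ok': True})
```

Deferring the check to the call site keeps the base installation small while still producing a clear error message when a component is used without its dependency.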
Changed
Datasets:
Separated the previous DatasetView implementation into interface (DatasetView) and implementation (SklearnDatasetView).
Added clone() method which creates an identical copy of the dataset.
Query Strategies:
EmbeddingBasedQueryStrategy now only embeds instances that are either in the labeled or in the unlabeled pool (and no longer the entire dataset).
Code examples:
Code structure was unified.
Number of iterations can now be passed via a CLI argument.
small_text.integrations.pytorch.utils.data:
Method get_class_weights() now scales the resulting multi-class weights so that the smallest class weight is equal to 1.0.
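To make the rescaling of class weights described above concrete, here is a small worked sketch. It assumes inverse-frequency weights as the starting point, which is an assumption for illustration only; the function scaled_class_weights is hypothetical and is not the library's get_class_weights().

```python
from collections import Counter


def scaled_class_weights(labels, num_classes):
    """Return per-class weights rescaled so that the smallest weight is 1.0.

    Hypothetical helper: the inverse-frequency weighting is assumed here
    and may differ from what get_class_weights() computes.
    """
    counts = Counter(labels)
    # Rarer classes receive larger raw weights. Classes that never occur
    # fall back to a count of 1 to avoid a division by zero.
    raw = [len(labels) / max(counts.get(c, 0), 1) for c in range(num_classes)]
    smallest = min(raw)
    # Dividing by the smallest weight maps it to exactly 1.0.
    return [w / smallest for w in raw]


labels = [0, 0, 0, 0, 1, 1, 2]
weights = scaled_class_weights(labels, num_classes=3)
print(weights)  # [1.0, 2.0, 4.0]
```

Only the ratios between weights matter for reweighting a loss, so anchoring the smallest weight at 1.0 leaves the most frequent class unchanged and expresses every other weight relative to it.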
[1.0.0b3] - 2022-03-06
Added
New query strategy: ContrastiveActiveLearning.
Added Reproducibility Notes.
Changed
Cleaned up and unified argument naming: The naming of variables related to datasets and indices has been improved and unified. The naming of datasets had been inconsistent, and the previous x_ notation for indices was a relic of earlier versions of this library and did not reflect the underlying object anymore.
PoolBasedActiveLearner:
attribute x_indices_labeled was renamed to indices_labeled
attribute x_indices_ignored was unified to indices_ignored
attribute queried_indices was unified to indices_queried
attribute _x_index_to_position was renamed to _index_to_position
arguments x_indices_initial, x_indices_ignored, and x_indices_validation were renamed to indices_initial, indices_ignored, and indices_validation. This affects most methods of the PoolBasedActiveLearner.
QueryStrategy
old: query(self, clf, x, x_indices_unlabeled, x_indices_labeled, y, n=10)
new: query(self, clf, dataset, indices_unlabeled, indices_labeled, y, n=10)
StoppingCriterion
old: stop(self, active_learner=None, predictions=None, proba=None, x_indices_stopping=None)
new: stop(self, active_learner=None, predictions=None, proba=None, indices_stopping=None)
Renamed environment variable which sets the small-text temp folder from ALL_TMP to SMALL_TEXT_TEMP.
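For code that implements its own query strategy, the renamed query() signature listed above could be adopted as in the following sketch. The class RandomSamplingSketch is hypothetical and deliberately standalone: it does not subclass the library's QueryStrategy and only mirrors the new argument names.

```python
import random


class RandomSamplingSketch:
    """Hypothetical strategy that follows the renamed query() signature."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)

    def query(self, clf, dataset, indices_unlabeled, indices_labeled, y, n=10):
        # clf, dataset, indices_labeled, and y are unused by random sampling;
        # they are kept only to mirror the new signature.
        if n > len(indices_unlabeled):
            raise ValueError('n exceeds the size of the unlabeled pool')
        # Draw n distinct indices from the unlabeled pool.
        return self._rng.sample(list(indices_unlabeled), n)


strategy = RandomSamplingSketch(seed=42)
queried = strategy.query(
    None, None,
    indices_unlabeled=[3, 5, 8, 13, 21],
    indices_labeled=[0, 1],
    y=[0, 1],
    n=2,
)
```

Callers that passed x, x_indices_unlabeled, or x_indices_labeled as keyword arguments need to switch to the new names; positional calls keep the same argument order.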
[1.0.0b2] - 2022-02-22
Bugfix release.
Fixed
Fix links to the documentation in README.md and notebooks.
[1.0.0b1] - 2022-02-22
First beta release with multi-label functionality and stopping criteria.
Added
Added a changelog.
All provided classifiers are now capable of multi-label classification.
Changed
Documentation has been overhauled considerably.
PoolBasedActiveLearner: Renamed incremental_training kwarg to reuse_model.
SklearnClassifier: Changed __init__(clf) to __init__(model, num_classes, multi_Label=False)
SklearnClassifierFactory: Changed __init__(clf_template, kwargs={}) to __init__(base_estimator, num_classes, kwargs={}).
Refactored KimCNNClassifier and TransformerBasedClassification.
Removed
Removed device kwarg from PytorchDataset.__init__(), PytorchTextClassificationDataset.__init__() and TransformersDataset.__init__().