Small-Text

Small-Text provides active learning for text classification. It is designed to offer a robust and modular set of components for both experimental and applied active learning.

Why Small-Text?

Interchangeable components: All components are built around the ActiveLearner class. You can mix and match many different initialization strategies, query strategies, and classifiers.
Integrations: Optional integrations allow you to use GPU-based models from the pytorch and transformers libraries.
Common patterns: We provide solutions to common challenges when building experiments and/or applications, such as Data Management and Serialization.
Pre-implemented components: Multiple scientifically evaluated query strategies, initialization strategies, and stopping criteria are ready to use.
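
To illustrate what interchangeable components mean in practice, here is a minimal, library-independent sketch of a pool-based active learning loop. It is not small-text's API: every name in it (ToyActiveLearner, WordCountClassifier, least_confidence, make_random_sampling) is hypothetical and chosen only for this example. The point is that the learner only depends on a classifier and a query strategy, so either one can be swapped without touching the rest.

```python
import random
from collections import Counter

# NOTE: all names here are hypothetical illustrations, not small-text's API.

def least_confidence(confidences, n):
    """Query strategy: positions of the n lowest-confidence items."""
    order = sorted(range(len(confidences)), key=lambda i: confidences[i])
    return order[:n]

def make_random_sampling(seed):
    """Query strategy factory: picks n positions uniformly at random."""
    rng = random.Random(seed)
    def strategy(confidences, n):
        return rng.sample(range(len(confidences)), n)
    return strategy

class WordCountClassifier:
    """Toy binary classifier that counts the words seen per class."""
    def __init__(self):
        self.counts = {0: Counter(), 1: Counter()}

    def fit(self, texts, labels):
        self.counts = {0: Counter(), 1: Counter()}
        for text, label in zip(texts, labels):
            self.counts[label].update(text.lower().split())

    def confidence(self, text):
        words = text.lower().split()
        s0 = sum(self.counts[0][w] for w in words)
        s1 = sum(self.counts[1][w] for w in words)
        total = s0 + s1
        # No known words: the classifier is maximally unsure.
        return 0.5 if total == 0 else max(s0, s1) / total

class ToyActiveLearner:
    """Ties together a pool of texts, a classifier, and a query strategy."""
    def __init__(self, classifier, query_strategy, pool):
        self.classifier = classifier
        self.query_strategy = query_strategy
        self.pool = list(pool)
        self.labeled = {}  # pool index -> label

    def update(self, indices, labels):
        """Add labels (initial or queried) and retrain the classifier."""
        self.labeled.update(zip(indices, labels))
        idx = sorted(self.labeled)
        self.classifier.fit([self.pool[i] for i in idx],
                            [self.labeled[i] for i in idx])

    def query(self, n):
        """Return pool indices of the n items to label next."""
        unlabeled = [i for i in range(len(self.pool)) if i not in self.labeled]
        confidences = [self.classifier.confidence(self.pool[i]) for i in unlabeled]
        return [unlabeled[j] for j in self.query_strategy(confidences, n)]

pool = ["great movie", "terrible movie", "great acting",
        "terrible plot", "boring film"]

# Uncertainty-based learner: initialize with two labels, then query one item.
learner = ToyActiveLearner(WordCountClassifier(), least_confidence, pool)
learner.update([0, 1], [1, 0])
queried = learner.query(1)
learner.update(queried, [0])  # a human annotator would supply this label

# Swapping the query strategy requires no other change.
random_learner = ToyActiveLearner(WordCountClassifier(),
                                  make_random_sampling(seed=0), pool)
random_learner.update([0, 1], [1, 0])
```

In this sketch the initialization step is simply the first call to update with hand-picked indices. A real setup would typically use a dedicated initialization strategy and a stopping criterion to decide when to end the loop.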

Getting Started

Start: Installation | Active Learning Overview
Examples: Notebooks | Code Examples

Citation

A preprint introducing small-text is available on arXiv:

Small-Text: Active Learning for Text Classification in Python.

@misc{schroeder2021smalltext,
    title={Small-Text: Active Learning for Text Classification in Python},
    author={Christopher Schröder and Lydia Müller and Andreas Niekler and Martin Potthast},
    year={2021},
    eprint={2107.10314},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

License

MIT License
