PyTorch Integration Classes
Dataset Classes
- class small_text.integrations.pytorch.datasets.PytorchTextClassificationDataset(data, vocab, target_labels=None, device=None)
Dataset class for classifiers from the PyTorch integration.
- __init__(data, vocab, target_labels=None, device=None)
- Parameters
data (list of tuples (text data [Tensor], label)) – Data set.
vocab (torchtext.vocab.Vocab) – Vocabulary object.
- property x
Returns the features.
- Returns
x – Feature representation.
- Return type
object
- property y
Returns the labels.
- Returns
y – Label representation.
- Return type
object
- property data
Returns the internal list of tuples storing the data.
- Returns
data – Internal list of tuples storing the data.
- Return type
list of tuples (text data [Tensor], label)
- property vocab
Returns the vocab.
- Returns
vocab – Vocab object.
- Return type
torchtext.vocab.Vocab
- property target_labels
Returns a list of possible labels.
- Returns
target_labels – List of possible labels.
- Return type
numpy.ndarray
- to(device=None, dtype=None, non_blocking=False, copy=False, memory_format=torch.preserve_format)
Calls torch.Tensor.to on all Tensors in data.
- Returns
self – This dataset, after torch.Tensor.to has been called on all Tensors in data.
- Return type
PytorchTextClassificationDataset
See also
torch.Tensor.to
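The data layout described above can be illustrated without the library. The following is a minimal sketch, assuming plain Python lists stand in for the Tensors and that x and y simply project the first and second element of each tuple. The helpers features_of and labels_of are hypothetical and not part of small_text, and inferring target_labels from the observed labels is likewise an assumption.

```python
# Stand-in for the dataset's internal `data` list: (token ids, label) tuples.
# Plain lists replace the Tensors here so the sketch runs without torch.
data = [
    ([4, 9, 2], 0),
    ([7, 1], 1),
    ([3, 3, 8, 5], 1),
]


def features_of(data):
    """Hypothetical helper: collect the text part of every tuple (cf. property x)."""
    return [text for text, _ in data]


def labels_of(data):
    """Hypothetical helper: collect the label of every tuple (cf. property y)."""
    return [label for _, label in data]


x = features_of(data)
y = labels_of(data)
# Assumption: the set of possible labels is derived from the labels seen in data.
target_labels = sorted(set(y))
```

In the real class the first tuple element is a Tensor of token ids, so calling to(device) moves exactly these elements while the labels stay untouched.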
Models
- class small_text.integrations.pytorch.models.kimcnn.KimCNN(vocabulary_size, max_seq_length, num_classes=2, out_channels=100, embed_dim=300, padding_idx=0, kernel_heights=[3, 4, 5], dropout=0.5, embedding_matrix=None, freeze_embedding_layer=False)
- Parameters
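vocabulary_size (int) – Number of entries in the vocabulary.
max_seq_length (int) – Maximum length of an input sequence.
num_classes (int) – Number of output classes.
embedding_matrix (2D FloatTensor) – Embedding matrix used to initialize the embedding layer.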
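To make the default arguments concrete, the sketch below works out the intermediate sizes of a CNN in the style of Kim (2014) in plain Python. It assumes stride-1, unpadded convolutions over the full embedding width followed by max-over-time pooling. This is the textbook architecture and not necessarily what this class does internally. The function kimcnn_shapes is hypothetical.

```python
def kimcnn_shapes(max_seq_length, out_channels=100, kernel_heights=(3, 4, 5)):
    """Return per-kernel convolution output lengths and the pooled feature size.

    Assumes stride 1 and no padding: a kernel of height h sliding over
    max_seq_length positions yields max_seq_length - h + 1 outputs.
    """
    conv_lengths = {h: max_seq_length - h + 1 for h in kernel_heights}
    # Max-over-time pooling keeps one value per channel and kernel height,
    # so the concatenated feature vector has this many entries.
    pooled_features = out_channels * len(kernel_heights)
    return conv_lengths, pooled_features


conv_lengths, pooled_features = kimcnn_shapes(max_seq_length=60)
print(conv_lengths)     # {3: 58, 4: 57, 5: 56}
print(pooled_features)  # 300
```

Under these assumptions the pooled feature size depends only on out_channels and the number of kernel heights, not on max_seq_length, which is why sequences must merely be at least as long as the largest kernel height.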
- forward(x)
- Parameters
x (torch.LongTensor or torch.cuda.LongTensor) – Input tensor of shape (batch_size, max_seq_length) containing padded sequences of word ids.
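The expected input layout can be prepared as follows. This is a minimal sketch in plain Python, assuming right-padding with padding_idx=0 and truncation of over-long sequences. The helper pad_sequences is hypothetical and not part of small_text.

```python
def pad_sequences(sequences, max_seq_length, padding_idx=0):
    """Truncate or right-pad each list of word ids to exactly max_seq_length."""
    padded = []
    for seq in sequences:
        seq = list(seq[:max_seq_length])                     # drop surplus ids
        seq += [padding_idx] * (max_seq_length - len(seq))   # fill the remainder
        padded.append(seq)
    return padded


# Three sequences of different lengths become a 3 x 5 batch.
batch = pad_sequences([[5, 8, 2], [7], [4, 4, 9, 1, 6, 3]], max_seq_length=5)
```

With torch installed, torch.tensor(batch, dtype=torch.long) turns this nested list into a LongTensor of shape (batch_size, max_seq_length) suitable for forward(x).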