torchnlp.utils package

The torchnlp.utils package contains additional modules and objects useful for building an NLP pipeline.

torchnlp.utils.collate_tensors(batch, stack_tensors=torch.stack)[source]

Collate a list of type k (dict, namedtuple, list, etc.) with tensors.

Inspired by: https://github.com/pytorch/pytorch/blob/master/torch/utils/data/_utils/collate.py#L31

Parameters:
  • batch (list of k) – List of rows of type k.
  • stack_tensors (callable) – Function to stack tensors into a batch.
Returns:
  Collated batch of type k.
Return type:
  k

Example use case:
This is useful with torch.utils.data.dataloader.DataLoader which requires a collate function. Typically, when collating sequences you’d set collate_fn=partial(collate_tensors, stack_tensors=encoders.text.stack_and_pad_tensors).

Example

>>> import torch
>>> batch = [
...   { 'column_a': torch.randn(5), 'column_b': torch.randn(5) },
...   { 'column_a': torch.randn(5), 'column_b': torch.randn(5) },
... ]
>>> collated = collate_tensors(batch)
>>> {k: t.size() for (k, t) in collated.items()}
{'column_a': torch.Size([2, 5]), 'column_b': torch.Size([2, 5])}
torchnlp.utils.flatten_parameters(model)[source]

Call flatten_parameters on an RNN model loaded from disk, compacting its weights into contiguous memory for efficient execution.
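
A minimal sketch of the use case, assuming the standard PyTorch checkpoint workflow: after restoring an RNN's weights from disk, calling flatten_parameters() compacts them into one contiguous block of memory (which is what torchnlp.utils.flatten_parameters applies to the loaded model).

```python
import tempfile

import torch

model = torch.nn.LSTM(input_size=8, hidden_size=16, num_layers=2)
with tempfile.NamedTemporaryFile(suffix='.pt') as checkpoint:
    torch.save(model.state_dict(), checkpoint.name)
    restored = torch.nn.LSTM(input_size=8, hidden_size=16, num_layers=2)
    restored.load_state_dict(torch.load(checkpoint.name))

# Compact the restored weights; on GPU this avoids the
# non-contiguous memory warning from cuDNN.
restored.flatten_parameters()
output, _ = restored(torch.randn(5, 3, 8))  # (seq_len, batch, input_size)
```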

torchnlp.utils.get_tensors(object_)[source]

Get all tensors associated with object_.

Parameters:
  • object_ (any) – Any object to search for tensors.
Returns:
  List of tensors associated with object_.
Return type:
  (list of torch.Tensor)
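
A minimal sketch of the recursive search this performs over common containers; the actual torchnlp implementation may also inspect object attributes, so treat the helper below (gather_tensors is a hypothetical name) as an illustration rather than the library's code.

```python
import torch

def gather_tensors(object_):
    """Collect every torch.Tensor reachable through dicts, lists and tuples."""
    if torch.is_tensor(object_):
        return [object_]
    if isinstance(object_, dict):
        return [t for value in object_.values() for t in gather_tensors(value)]
    if isinstance(object_, (list, tuple)):
        return [t for item in object_ for t in gather_tensors(item)]
    return []

nested = {'a': torch.randn(2), 'b': [torch.randn(3), (torch.randn(4),)]}
print(len(gather_tensors(nested)))  # → 3
```
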
torchnlp.utils.get_total_parameters(model)[source]

Return the total number of trainable parameters in model.

Parameters:
  • model (torch.nn.Module) – Model to count parameters for.
Returns:
  The total number of trainable parameters in model.
Return type:
  (int)
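
The equivalent computation, shown for illustration: sum the element counts of all parameters that require gradients.

```python
import torch

model = torch.nn.Linear(10, 2)  # weight: 10 * 2 = 20, bias: 2
total = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(total)  # → 22
```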
torchnlp.utils.identity(x)[source]

Return x unchanged.

torchnlp.utils.is_namedtuple(object_)[source]

Return True if object_ is an instance of a namedtuple.
torchnlp.utils.lengths_to_mask(*lengths, **kwargs)[source]

Given a list of lengths, create a batch mask.

Example

>>> lengths_to_mask([1, 2, 3])
tensor([[ True, False, False],
        [ True,  True, False],
        [ True,  True,  True]])
>>> lengths_to_mask([1, 2, 2], [1, 2, 2])
tensor([[[ True, False],
         [False, False]],
<BLANKLINE>
        [[ True,  True],
         [ True,  True]],
<BLANKLINE>
        [[ True,  True],
         [ True,  True]]])
Parameters:
  • *lengths (list of int or torch.Tensor) – Lengths along each mask dimension.
  • **kwargs – Keyword arguments passed to torch.zeros upon initially creating the returned tensor.
Returns:
  torch.BoolTensor

torchnlp.utils.sampler_to_iterator(dataset, sampler)[source]

Given a batch sampler or sampler, yield examples from dataset instead of indices.

Parameters:
  • dataset (torch.utils.data.Dataset) – Dataset to sample from.
  • sampler (torch.utils.data.sampler.Sampler) – Sampler over the dataset.
Returns:
  Generator over dataset examples.
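
A minimal sketch of the mapping this performs, assuming a plain (non-batch) sampler: samplers yield indices, and each index is looked up in the dataset.

```python
from torch.utils.data import SequentialSampler

dataset = ['hello', 'world', '!']
sampler = SequentialSampler(dataset)  # yields indices 0, 1, 2
examples = [dataset[index] for index in sampler]
print(examples)  # → ['hello', 'world', '!']
```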

torchnlp.utils.split_list(list_, splits)[source]

Split list_ using the splits ratio.

Parameters:
  • list_ (list) – List to split.
  • splits (tuple) – Tuple of ratios determining the split sizes; must sum to 1.0.
Returns:
  Splits of the list.
Return type:
  (list)

Example

>>> dataset = [1, 2, 3, 4, 5]
>>> split_list(dataset, splits=(.6, .2, .2))
[[1, 2, 3], [4], [5]]
torchnlp.utils.tensors_to(tensors, *args, **kwargs)[source]

Apply torch.Tensor.to to tensors in a generic data structure.

Inspired by: https://github.com/pytorch/pytorch/blob/master/torch/utils/data/_utils/collate.py#L31

Parameters:
  • tensors (tensor, dict, list, namedtuple or tuple) – Data structure with tensor values to move.
  • *args – Arguments passed to torch.Tensor.to.
  • **kwargs – Keyword arguments passed to torch.Tensor.to.
Example use case:
This is useful as a complementary function to collate_tensors. Following collating, it’s important to move your tensors to the appropriate device.
Returns:
  The input tensors with torch.Tensor.to applied.

Example

>>> import torch
>>> batch = [
...   { 'column_a': torch.randn(5), 'column_b': torch.randn(5) },
...   { 'column_a': torch.randn(5), 'column_b': torch.randn(5) },
... ]
>>> tensors_to(batch, torch.device('cpu'))  # doctest: +ELLIPSIS
[{'column_a': tensor(...}]
torchnlp.utils.torch_equals_ignore_index(tensor, tensor_other, ignore_index=None)[source]

Compute torch.equal, ignoring positions where tensor equals ignore_index.

Parameters:
  • tensor (torch.Tensor) – Tensor to compare.
  • tensor_other (torch.Tensor) – Tensor to compare against.
  • ignore_index (int, optional) – Positions where tensor equals this value are ignored in the comparison.
Returns:
  (bool) True if tensor and tensor_other are equal outside the ignored positions.
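
The equivalent comparison, shown for illustration: mask out positions where the target equals ignore_index, then compare the remaining entries. This is useful for comparing padded target and prediction sequences.

```python
import torch

ignore_index = 0  # e.g. a padding token index
target = torch.tensor([1, 2, 3, ignore_index, ignore_index])
prediction = torch.tensor([1, 2, 3, 9, 9])  # differs only at ignored positions

mask = target.ne(ignore_index)
equal = torch.equal(target[mask], prediction[mask])
print(equal)  # → True
```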