torchnlp.utils package¶
The torchnlp.utils package contains any other module or object that is useful in building out an NLP pipeline.
- torchnlp.utils.collate_tensors(batch, stack_tensors=torch.stack)[source]¶
  Collate a list of type k (dict, namedtuple, list, etc.) with tensors.
  Inspired by: https://github.com/pytorch/pytorch/blob/master/torch/utils/data/_utils/collate.py#L31
  Parameters:
  - batch (list of k) – List of rows of type k.
  - stack_tensors (callable) – Function to stack tensors into a batch.
  Returns: Collated batch of type k.
  Return type: k
  Example use case: This is useful with torch.utils.data.dataloader.DataLoader, which requires a collate function. Typically, when collating sequences you'd set collate_fn=partial(collate_tensors, stack_tensors=encoders.text.stack_and_pad_tensors).
  Example
  >>> import torch
  >>> batch = [
  ...     { 'column_a': torch.randn(5), 'column_b': torch.randn(5) },
  ...     { 'column_a': torch.randn(5), 'column_b': torch.randn(5) },
  ... ]
  >>> collated = collate_tensors(batch)
  >>> {k: t.size() for (k, t) in collated.items()}
  {'column_a': torch.Size([2, 5]), 'column_b': torch.Size([2, 5])}
- torchnlp.utils.flatten_parameters(model)[source]¶
  Run flatten_parameters on an RNN model loaded from disk.
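A minimal sketch of why this helper exists, using only torch (an illustration, not the library's code): after an RNN's weights are restored from a checkpoint they may no longer sit in one contiguous block of memory, which slows cuDNN and triggers a warning. torch.nn.RNNBase.flatten_parameters() re-compacts them; torchnlp.utils.flatten_parameters(model) applies that to a loaded model.

```python
import io
import torch

# Save and restore an LSTM's weights in memory, then re-flatten them.
lstm = torch.nn.LSTM(input_size=4, hidden_size=8, num_layers=2)
buffer = io.BytesIO()
torch.save(lstm.state_dict(), buffer)
buffer.seek(0)

restored = torch.nn.LSTM(input_size=4, hidden_size=8, num_layers=2)
restored.load_state_dict(torch.load(buffer))
restored.flatten_parameters()  # the call this utility makes for you

# (seq_len, batch, input_size) -> (seq_len, batch, hidden_size)
output, _ = restored(torch.randn(3, 1, 4))
print(output.shape)  # torch.Size([3, 1, 8])
```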
- torchnlp.utils.get_tensors(object_)[source]¶
  Get all tensors associated with object_.
  Parameters: object_ (any) – Any object to search for tensors.
  Returns: List of tensors that are associated with object_.
  Return type: (list of torch.Tensor)
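A hedged sketch of the kind of traversal get_tensors performs (an illustration of the idea, not the library's exact implementation): recurse through dicts, lists, and tuples and collect every torch.Tensor found.

```python
import torch

def collect_tensors(object_):
    """Recursively gather all tensors nested inside object_."""
    if torch.is_tensor(object_):
        return [object_]
    if isinstance(object_, dict):
        return [t for value in object_.values() for t in collect_tensors(value)]
    if isinstance(object_, (list, tuple)):
        return [t for value in object_ for t in collect_tensors(value)]
    return []

batch = {'a': torch.randn(2), 'b': [torch.randn(3), {'c': torch.randn(4)}]}
print(len(collect_tensors(batch)))  # 3
```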
- torchnlp.utils.get_total_parameters(model)[source]¶
  Return the total number of trainable parameters in model.
  Parameters: model (torch.nn.Module) – Model to count parameters of.
  Returns: The total number of trainable parameters in model.
  Return type: (int)
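The computation can be sketched in a couple of lines (assuming, as is conventional, that "trainable" means requires_grad=True; this is an illustration, not the library's exact code):

```python
import torch

def count_trainable(model):
    """Sum numel() over parameters that require gradients."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

linear = torch.nn.Linear(10, 5)  # weight: 10 * 5 = 50, bias: 5
print(count_trainable(linear))  # 55
```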
- torchnlp.utils.lengths_to_mask(*lengths, **kwargs)[source]¶
  Given a list of lengths, create a batch mask.
  Example
  >>> lengths_to_mask([1, 2, 3])
  tensor([[ True, False, False],
          [ True,  True, False],
          [ True,  True,  True]])
  >>> lengths_to_mask([1, 2, 2], [1, 2, 2])
  tensor([[[ True, False],
           [False, False]],
  <BLANKLINE>
          [[ True,  True],
           [ True,  True]],
  <BLANKLINE>
          [[ True,  True],
           [ True,  True]]])
  Parameters:
  - *lengths (list of int or torch.Tensor) – Lengths of each batch element.
  - **kwargs – Keyword arguments passed to torch.zeros upon initially creating the returned tensor.
  Returns: torch.BoolTensor
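For the one-dimensional case, the mask construction can be sketched with broadcasting (an illustration of the idea, not the library's exact implementation): position j of row i is True iff j < lengths[i].

```python
import torch

def make_mask(lengths):
    """Build a (batch, max_len) boolean mask from a list of lengths."""
    lengths = torch.tensor(lengths)
    positions = torch.arange(int(lengths.max()))
    # (1, max_len) < (batch, 1) broadcasts to (batch, max_len)
    return positions.unsqueeze(0) < lengths.unsqueeze(1)

print(make_mask([1, 2, 3]).tolist())
# [[True, False, False], [True, True, False], [True, True, True]]
```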
- torchnlp.utils.sampler_to_iterator(dataset, sampler)[source]¶
  Given a batch sampler or sampler, return examples instead of indices.
  Parameters:
  - dataset (torch.utils.data.Dataset) – Dataset to sample from.
  - sampler (torch.utils.data.sampler.Sampler) – Sampler over the dataset.
  Returns: Generator over dataset examples.
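The idea can be sketched as follows (an illustration, not the library's exact code): a sampler yields indices, and a batch sampler yields lists of indices; the generator resolves either into dataset examples.

```python
from torch.utils.data import SequentialSampler

def to_examples(dataset, sampler):
    """Yield dataset examples instead of the sampler's indices."""
    for index in sampler:
        if isinstance(index, list):  # batch sampler: list of indices
            yield [dataset[i] for i in index]
        else:
            yield dataset[index]

dataset = ['a', 'b', 'c']
print(list(to_examples(dataset, SequentialSampler(dataset))))
# ['a', 'b', 'c']
```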
- torchnlp.utils.split_list(list_, splits)[source]¶
  Split list_ using the splits ratio.
  Parameters:
  - list_ (list) – List to split.
  - splits (tuple) – Tuple of ratios (summing to 1.0) that determine the split sizes.
  Returns: Splits of the list.
  Return type: (list)
  Example
  >>> dataset = [1, 2, 3, 4, 5]
  >>> split_list(dataset, splits=(.6, .2, .2))
  [[1, 2, 3], [4], [5]]
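A hedged sketch of how ratios map to slice boundaries (an illustration consistent with the example above, not the library's exact implementation): each ratio times the list length gives a chunk size.

```python
def split_by_ratio(list_, splits):
    """Split list_ into consecutive chunks sized by the given ratios."""
    chunks, start = [], 0
    for ratio in splits:
        size = round(ratio * len(list_))
        chunks.append(list_[start:start + size])
        start += size
    return chunks

print(split_by_ratio([1, 2, 3, 4, 5], (.6, .2, .2)))
# [[1, 2, 3], [4], [5]]
```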
- torchnlp.utils.tensors_to(tensors, *args, **kwargs)[source]¶
  Apply torch.Tensor.to to tensors in a generic data structure.
  Inspired by: https://github.com/pytorch/pytorch/blob/master/torch/utils/data/_utils/collate.py#L31
  Parameters:
  - tensors – A data structure (dict, list, etc.) containing tensors.
  - *args, **kwargs – Arguments forwarded to torch.Tensor.to.
  Example use case: This is useful as a complementary function to collate_tensors. Following collating, it's important to move your tensors to the appropriate device.
  Returns: The inputted tensors with torch.Tensor.to applied.
  Example
  >>> import torch
  >>> batch = [
  ...     { 'column_a': torch.randn(5), 'column_b': torch.randn(5) },
  ...     { 'column_a': torch.randn(5), 'column_b': torch.randn(5) },
  ... ]
  >>> tensors_to(batch, torch.device('cpu'))  # doctest: +ELLIPSIS
  [{'column_a': tensor(...}]
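The underlying traversal can be sketched like this (an illustration of the pattern, not the library's code): recursively apply torch.Tensor.to across dicts, lists, and tuples, passing the arguments through unchanged.

```python
import torch

def apply_to(obj, *args, **kwargs):
    """Recursively apply torch.Tensor.to inside nested containers."""
    if torch.is_tensor(obj):
        return obj.to(*args, **kwargs)
    if isinstance(obj, dict):
        return {k: apply_to(v, *args, **kwargs) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return type(obj)(apply_to(v, *args, **kwargs) for v in obj)
    return obj

batch = [{'column_a': torch.randn(5)}, {'column_a': torch.randn(5)}]
moved = apply_to(batch, torch.float64)  # .to also accepts a dtype
print(moved[0]['column_a'].dtype)  # torch.float64
```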