torchnlp.metrics package

The torchnlp.metrics package provides a set of functions for computing common NLP metrics.

torchnlp.metrics.get_accuracy(targets, outputs, k=1, ignore_index=None)

Get the top-k accuracy between two tensors.

Parameters:
  • targets (1 - 2D torch.Tensor) – Target or true vector against which to measure accuracy
  • outputs (1 - 3D torch.Tensor) – Prediction or output vector
  • k (int, optional) – Number of top predictions to consider for top-k accuracy (default: 1)
  • ignore_index (int, optional) – Specifies a target index that is ignored
Returns:

tuple consisting of accuracy (float), number correct (int) and total (int)

Example

>>> import torch
>>> from torchnlp.metrics import get_accuracy
>>> targets = torch.LongTensor([1, 2, 3, 4, 5])
>>> outputs = torch.LongTensor([1, 2, 2, 3, 5])
>>> accuracy, n_correct, n_total = get_accuracy(targets, outputs, ignore_index=3)
>>> accuracy
0.8
>>> n_correct
4
>>> n_total
5
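
The numbers in the example above imply that a position whose target equals ignore_index is counted as correct and still contributes to the total: three predictions match and the target 3 is ignored, giving 4 of 5. A minimal pure-Python sketch of that reading, for 1D inputs and k=1 only, is given here. accuracy_sketch is a hypothetical helper written for illustration; it is not part of torchnlp, and its behaviour is inferred from the example rather than taken from the library source.

```python
def accuracy_sketch(targets, outputs, ignore_index=None):
    """Hypothetical 1D, k=1 illustration; not the torchnlp implementation."""
    n_correct = 0
    for target, output in zip(targets, outputs):
        # Assumption inferred from the documented example: an ignored
        # target is treated as a match instead of being skipped.
        ignored = ignore_index is not None and target == ignore_index
        if ignored or target == output:
            n_correct += 1
    n_total = len(targets)
    return n_correct / n_total, n_correct, n_total


print(accuracy_sketch([1, 2, 3, 4, 5], [1, 2, 2, 3, 5], ignore_index=3))
# (0.8, 4, 5)
```

Under this reading, ignore_index never lowers the score; it only forgives mismatches at the ignored positions.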

torchnlp.metrics.get_token_accuracy(targets, outputs, ignore_index=None)

Get the token accuracy between two tensors.

Parameters:
  • targets (1 - 2D torch.Tensor) – Target or true vector against which to measure accuracy
  • outputs (1 - 3D torch.Tensor) – Prediction or output vector
  • ignore_index (int, optional) – Specifies a target index that is ignored
Returns:

tuple consisting of accuracy (float), number correct (float) and total (float)

Example

>>> import torch
>>> from torchnlp.metrics import get_token_accuracy
>>> targets = torch.LongTensor([[1, 1], [2, 2], [3, 3]])
>>> outputs = torch.LongTensor([[1, 1], [2, 3], [4, 4]])
>>> accuracy, n_correct, n_total = get_token_accuracy(targets, outputs, ignore_index=3)
>>> accuracy
0.75
>>> n_correct
3.0
>>> n_total
4.0
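
In the example above, the two tokens whose target equals ignore_index are dropped from both the correct count and the total, which leaves 3 correct tokens out of 4. The following pure-Python sketch reproduces that arithmetic for nested lists of equal shape. token_accuracy_sketch is a hypothetical helper for illustration only, not the torchnlp implementation; it returns integer counts, whereas the example above shows 3.0 and 4.0.

```python
def token_accuracy_sketch(targets, outputs, ignore_index=None):
    """Hypothetical illustration of masked token accuracy on nested lists."""
    n_correct = 0
    n_total = 0
    for target_row, output_row in zip(targets, outputs):
        for target, output in zip(target_row, output_row):
            # Ignored tokens are excluded from both counts.
            if ignore_index is not None and target == ignore_index:
                continue
            n_total += 1
            n_correct += int(target == output)
    # Assumes at least one token is not ignored, so n_total > 0.
    return n_correct / n_total, n_correct, n_total


targets = [[1, 1], [2, 2], [3, 3]]
outputs = [[1, 1], [2, 3], [4, 4]]
print(token_accuracy_sketch(targets, outputs, ignore_index=3))
# (0.75, 3, 4)
```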

torchnlp.metrics.get_moses_multi_bleu(hypotheses, references, lowercase=False)

Get the BLEU score using the Moses multi-bleu.perl script.

Script: https://raw.githubusercontent.com/moses-smt/mosesdecoder/master/scripts/generic/multi-bleu.perl

Parameters:
  • hypotheses (list of str) – List of predicted values
  • references (list of str) – List of target values
  • lowercase (bool, optional) – If True, pass the “-lc” flag to the multi-bleu.perl script
Returns:

(np.float32) The BLEU score as a float32 value.

Example

>>> from torchnlp.metrics import get_moses_multi_bleu
>>> hypotheses = [
...   "The brown fox jumps over the dog 笑",
...   "The brown fox jumps over the dog 2 笑"
... ]
>>> references = [
...   "The quick brown fox jumps over the lazy dog 笑",
...   "The quick brown fox jumps over the lazy dog 笑"
... ]
>>> get_moses_multi_bleu(hypotheses, references, lowercase=True)
46.51
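
To make the score in the example above less opaque, here is a minimal pure-Python sketch of the standard corpus-level BLEU computation: clipped n-gram precisions for n = 1 to 4, combined by a geometric mean and scaled by a brevity penalty. It assumes whitespace tokenization and exactly one reference per hypothesis. corpus_bleu_sketch and ngram_counts are hypothetical names used for illustration; this is not the library's implementation, which relies on the Perl script linked above.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu_sketch(hypotheses, references, lowercase=False, max_n=4):
    """Hypothetical corpus BLEU sketch: one reference per hypothesis."""
    matched = [0] * max_n
    total = [0] * max_n
    hyp_len = 0
    ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        if lowercase:
            hyp, ref = hyp.lower(), ref.lower()
        hyp_tokens, ref_tokens = hyp.split(), ref.split()
        hyp_len += len(hyp_tokens)
        ref_len += len(ref_tokens)
        for n in range(1, max_n + 1):
            hyp_ngrams = ngram_counts(hyp_tokens, n)
            ref_ngrams = ngram_counts(ref_tokens, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matched[n - 1] += sum((hyp_ngrams & ref_ngrams).values())
            total[n - 1] += sum(hyp_ngrams.values())
    if min(matched) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / max_n
    # Penalize hypotheses that are shorter overall than the references.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


hypotheses = [
    "The brown fox jumps over the dog 笑",
    "The brown fox jumps over the dog 2 笑",
]
references = [
    "The quick brown fox jumps over the lazy dog 笑",
    "The quick brown fox jumps over the lazy dog 笑",
]
print(round(corpus_bleu_sketch(hypotheses, references, lowercase=True), 2))
# 46.51
```

For these inputs the n-gram precisions are 16/17, 9/15, 6/13 and 4/11, and the brevity penalty is exp(1 - 20/17), which together give the 46.51 shown in the example.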