Bilingual Evaluation Understudy (BLEU) is a metric for evaluating machine translation, with values ranging from 0 to 1. The higher the BLEU score, the closer the machine-generated text is to the human-translated reference text. At its core it uses a weighted sum of n-gram precisions together with a brevity penalty (BP) that penalizes translations that are too short. The BLEU formula is:

  BLEU = BP * exp( sum_{n=1..N} w_n * log p_n ),   where BP = 1 if c > r, else exp(1 - r/c)

Here p_n is the modified (clipped) precision for n-grams of size n, w_n is its weight (typically 1/N), c is the candidate length, and r is the reference length.

Calculation steps:

  • Choose the n-gram order: typically N = 4
  • Count matching n-grams: for each n-gram size, count how many n-grams in the machine-generated text also appear in the reference translations, clipping each count by the maximum number of times that n-gram occurs in any single reference so that repeated phrases are not over-counted
  • Calculate the modified n-gram precisions, i.e. the p_n values: the clipped match count divided by the total number of n-grams in the candidate
  • Take the weighted sum of the log precisions and exponentiate; this is equivalent to a weighted geometric mean of the precisions
  • Apply the brevity penalty (BP) to get the final BLEU score
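The steps above can be sketched in Python. This is a minimal sentence-level sketch with uniform weights (w_n = 1/N); the function and variable names are illustrative, not taken from any particular library:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of the given size in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, references, max_n=4):
    """Minimal BLEU sketch: clipped n-gram precisions, uniform weights,
    and a brevity penalty."""
    log_sum = 0.0
    for n in range(1, max_n + 1):
        cand = ngrams(candidate, n)
        # Clip each candidate n-gram count by the maximum number of times
        # that n-gram appears in any single reference.
        max_ref = Counter()
        for ref in references:
            for gram, cnt in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], cnt)
        matched = sum(min(cnt, max_ref[gram]) for gram, cnt in cand.items())
        total = sum(cand.values())
        if matched == 0:
            return 0.0  # any zero precision drives the geometric mean to zero
        # Uniform weight 1/N on each log precision.
        log_sum += (1.0 / max_n) * math.log(matched / total)
    c = len(candidate)
    # Use the reference length closest to the candidate length as r.
    r = min((len(ref) for ref in references), key=lambda rl: (abs(rl - c), rl))
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(log_sum)
```

For example, a candidate identical to its reference scores 1.0, while a candidate sharing no n-grams with any reference scores 0.0. Real implementations add smoothing so that a single missing n-gram size does not zero out the whole score.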