The ratio of co-occurrence probabilities in GloVe provides an intuitive way to capture relationships between words. This can be illustrated using the example of “ice” and “steam” as center words, with various context words.

Let $P_{ij} = P(j \mid i)$ be the probability of seeing word $j$ in the context of word $i$.
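To make the definition concrete, the following minimal sketch estimates these conditional probabilities from raw counts: the number of times a context word appears near a center word, divided by the total number of context words seen for that center word. The toy sentence, the window size of 2, the use of unweighted counts, and the function name `cooccurrence_probs` are all assumptions made for illustration; they are not taken from the text above.

```python
from collections import Counter, defaultdict


def cooccurrence_probs(tokens, window=2):
    """Estimate P(context | center) from unweighted co-occurrence counts.

    `window` is the number of tokens considered on each side of the
    center word (an illustrative choice, not a prescribed value).
    """
    counts = defaultdict(Counter)
    for pos, center in enumerate(tokens):
        lo = max(0, pos - window)
        hi = min(len(tokens), pos + window + 1)
        for ctx_pos in range(lo, hi):
            if ctx_pos != pos:
                counts[center][tokens[ctx_pos]] += 1

    probs = {}
    for center, ctx_counts in counts.items():
        total = sum(ctx_counts.values())
        # Normalise each row of counts so it sums to 1.
        probs[center] = {ctx: n / total for ctx, n in ctx_counts.items()}
    return probs


# Hypothetical toy corpus, far too small to give meaningful statistics.
tokens = "ice is solid water and steam is gas water".split()
P = cooccurrence_probs(tokens, window=2)
print(P["ice"])
print(P["steam"])
```

On a real corpus the same normalisation would be applied to much larger counts; the values in the table below come from such a corpus, not from this toy example.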

Consider the following co-occurrence probability ratios:

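| Context word $k$ | $P(k \mid \text{ice})$ | $P(k \mid \text{steam})$ | Ratio $P(k \mid \text{ice}) / P(k \mid \text{steam})$ |
|---|---|---|---|
| solid | 0.00019 | 0.000022 | 8.9 |
| gas | 0.000066 | 0.00078 | 0.085 |
| water | 0.003 | 0.0022 | 1.36 |
| fashion | 0.000017 | 0.000018 | 0.96 |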

Interpreting these ratios:

  1. Related to “ice”, unrelated to “steam”:

    • For “solid”, ratio = 8.9 (much larger than 1)
    • Indicates a strong association with “ice” but not with “steam”
  2. Related to “steam”, unrelated to “ice”:

    • For “gas”, ratio = 0.085 (much smaller than 1)
    • Indicates a strong association with “steam” but not with “ice”
  3. Related to both “ice” and “steam”:

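    • For “water”, ratio = 1.36 (close to 1)
    • Both probabilities are comparatively high (0.003 and 0.0022), so “water” is associated with both words and does not distinguish between them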
  4. Unrelated to both “ice” and “steam”:

    • For “fashion”, ratio = 0.96 (very close to 1)
    • Both probabilities are comparatively low (0.000017 and 0.000018), indicating no strong relationship with either word
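The interpretation above can be reproduced with a few lines of Python. This is a minimal sketch: the probabilities are copied from the table, while the `band` threshold that decides what counts as "close to 1" and the helper names `ratio` and `interpret` are assumptions chosen for illustration. Because the tabulated probabilities are rounded, the recomputed ratios differ slightly from the printed ones (for example, roughly 8.6 rather than 8.9 for "solid").

```python
# Probabilities copied from the table: (P(k | ice), P(k | steam)).
probs = {
    "solid": (0.00019, 0.000022),
    "gas": (0.000066, 0.00078),
    "water": (0.003, 0.0022),
    "fashion": (0.000017, 0.000018),
}


def ratio(word):
    """Return P(word | ice) / P(word | steam)."""
    p_ice, p_steam = probs[word]
    return p_ice / p_steam


def interpret(r, band=2.0):
    """Label a ratio; `band` is an illustrative threshold (assumption).

    Ratios inside [1 / band, band] are treated as "close to 1".
    """
    if r > band:
        return "ice"
    if r < 1 / band:
        return "steam"
    return "neither"


for word in probs:
    r = ratio(word)
    print(f"{word:8s} ratio={r:.3f} discriminates toward: {interpret(r)}")
```

Note that "water" and "fashion" both land in the "neither" bucket: the ratio alone cannot tell a word related to both center words from a word related to neither, which is why the individual probabilities still matter for that distinction.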

GloVe uses these ratios to learn word vectors that preserve these relationships in the embedding space. To do so, we can design a function of three word vectors (those of the two center words and of the context word) that fits this ratio; this is the starting point for deriving the GloVe objective function later.
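As a sketch of what such a function looks like, the relation can be written as follows. The notation is an assumption borrowed from the original GloVe paper rather than defined in this section: $w_i$ and $w_j$ are the vectors of the two center words (for example “ice” and “steam”), $\tilde{w}_k$ is the vector of the context word, and $P_{ik}$ is the probability of seeing context word $k$ given center word $i$. The form of $F$ is left unspecified at this point.

```latex
F(w_i, w_j, \tilde{w}_k) \approx \frac{P_{ik}}{P_{jk}}
```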