Topics
Late interaction finds a sweet spot between no-interaction and all-to-all interaction modelling.
In no-interaction models, or bi-encoders, the model takes either the query or the document and produces a single embedding; we compute cosine similarity between the two embeddings to get a score. In all-to-all interaction models, or cross-encoders, we train a model to score query-document pairs directly; internally the model attends across all query and document tokens.
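The bi-encoder scoring step can be sketched as a plain cosine similarity between two pooled vectors. This is a minimal illustration; the embeddings and their dimensionality here are made up, and a real bi-encoder would produce them with a trained encoder model.

```python
import numpy as np

# Hypothetical pooled embeddings from a bi-encoder (values are illustrative).
query_emb = np.array([0.2, 0.8, 0.1])
doc_emb = np.array([0.3, 0.7, 0.2])

def cosine_similarity(a, b):
    """Single-vector score used by bi-encoders: dot product of unit vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

score = cosine_similarity(query_emb, doc_emb)
```

Because each side is compressed to one vector before any comparison, the model never sees which query token matched which document token.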
In late interaction, the model generates context-aware embeddings (note the plural) for the query and the document separately. These per-token embeddings from query and document are then interacted, or cross-encoded, to obtain the similarity score.
Since the query-document interaction happens late, after embeddings have been obtained separately for query and document terms, we call this late interaction.
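The interaction step can be sketched with ColBERT's MaxSim operator: for each query token embedding, take its maximum cosine similarity over all document token embeddings, then sum over query tokens. The token embeddings below are random stand-ins; a real system would produce them with a BERT-style encoder.

```python
import numpy as np

rng = np.random.default_rng(0)
query_tokens = rng.random((4, 8))   # 4 query token embeddings, dim 8 (illustrative)
doc_tokens = rng.random((12, 8))    # 12 document token embeddings, dim 8

def maxsim_score(Q, D):
    """Late interaction a la ColBERT: each query token finds its best-matching
    document token; the per-token maxima are summed into one score."""
    Qn = Q / np.linalg.norm(Q, axis=1, keepdims=True)
    Dn = D / np.linalg.norm(D, axis=1, keepdims=True)
    sim = Qn @ Dn.T                  # (num_q_tokens, num_d_tokens) cosine matrix
    return float(sim.max(axis=1).sum())

score = maxsim_score(query_tokens, doc_tokens)
```

Note the interaction is a cheap matrix product plus a max, not a full transformer forward pass over the concatenated pair, which is what makes it tractable at retrieval time.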
- Example: ColBERT
- Advantages:
- Expressiveness via query-document interaction
- Computational benefits of offline document representation
- Avoids information bottleneck of single embeddings
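The offline-representation advantage above can be sketched as a two-phase split: encode and cache every document's token embeddings once, then at query time encode only the query and score it against the cache. The toy `encode` function is a hypothetical stand-in for a trained encoder.

```python
import numpy as np

_rng = np.random.default_rng(0)

def encode(text, dim=8):
    """Stand-in encoder: one random unit vector per whitespace token.
    A real system would run a trained model here."""
    vecs = _rng.random((len(text.split()), dim))
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

# Offline: encode and cache each document once, ahead of query time.
corpus = ["late interaction retrieval", "cosine similarity scoring"]
doc_index = [encode(d) for d in corpus]

# Online: encode only the query, then run cheap MaxSim against the cache.
Q = encode("late interaction")
scores = [float((Q @ D.T).max(axis=1).sum()) for D in doc_index]
best = int(np.argmax(scores))
```

A cross-encoder, by contrast, must run a full forward pass per query-document pair at query time, since nothing about the document can be precomputed independently of the query.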