
nighthawk454 · yesterday at 7:36 PM

Seems to be a trend away from mean-pooling into a single embedding. But instead of dealing with one embedding per token (which is a lot), you still want to reduce the count somewhat. This method seems to cluster token embeddings by random partitioning, mean-pool within each partition, and concatenate the resulting vectors into a fixed-length final embedding.
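
For concreteness, here's a minimal sketch of that pipeline in NumPy. The SimHash-style random hyperplanes, the bucket count, and the zero-fill for empty buckets are my own illustrative choices, not necessarily the paper's exact construction:

    import numpy as np

    def fixed_length_embedding(token_embs, num_planes=3, rng=None):
        """Partition token embeddings with random hyperplanes (SimHash-style),
        mean-pool each partition, and concatenate into one fixed-length vector.
        Illustrative sketch only."""
        rng = np.random.default_rng(0) if rng is None else rng
        n_tokens, d = token_embs.shape
        planes = rng.standard_normal((num_planes, d))      # random hyperplanes
        # Each token's bucket id is the sign pattern of its projections.
        bits = (token_embs @ planes.T) > 0                  # (n_tokens, num_planes)
        bucket_ids = bits @ (1 << np.arange(num_planes))    # (n_tokens,)
        k = 2 ** num_planes                                  # fixed number of partitions
        parts = []
        for b in range(k):
            members = token_embs[bucket_ids == b]
            # Empty buckets contribute zeros so the output length stays fixed.
            parts.append(members.mean(axis=0) if len(members) else np.zeros(d))
        return np.concatenate(parts)                         # shape: (k * d,)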

Essentially, full multi-vector comparison is challenging performance-wise, while the tooling and performance for single vectors are much better. As a compromise, cluster into k chunks and concatenate; then you can do a k-vector comparison in one shot with single-vector tooling and performance.
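
A quick way to see why single-vector tooling suffices: the dot product of two concatenated vectors is just the sum of the per-partition dot products, so one similarity computation over the concatenation implicitly does all k comparisons (toy example, assuming both sides use the same partitioning):

    import numpy as np

    rng = np.random.default_rng(0)
    k, d = 8, 128
    doc_parts   = rng.standard_normal((k, d))   # k mean-pooled partitions for a document
    query_parts = rng.standard_normal((k, d))   # same for a query

    single_vector_score = doc_parts.reshape(-1) @ query_parts.reshape(-1)
    per_partition_score = sum(doc_parts[i] @ query_parts[i] for i in range(k))
    assert np.isclose(single_vector_score, per_partition_score)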

Ultimately the fixed-length vector comes from having a fixed number of partitions, so this is kind of just k-means-style clustering of the token-level embeddings.

Presumably a dynamic clustering of the tokens could be even better, though that would leave you with a variable number of embeddings per document.
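
If you did want the dynamic version, a speculative sketch (my own, not from the paper) would run k-means with a document-dependent k and keep the centroids, accepting a variable output size per document:

    import numpy as np
    from sklearn.cluster import KMeans

    def dynamic_cluster_embeddings(token_embs, tokens_per_cluster=16):
        """Speculative variant: pick the number of clusters from the document
        length, so longer documents keep more centroids. Output size varies."""
        k = max(1, len(token_embs) // tokens_per_cluster)
        km = KMeans(n_clusters=k, n_init=10).fit(token_embs)
        return km.cluster_centers_            # shape: (k, d), with k varying per doc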