Hacker News

pleshkov last Tuesday at 11:32 AM · 4 replies

Author here — questions and pushback both welcome.


Replies

Devilstro today at 8:05 AM

In the article, you mention that this approach requires no search over hyperparameters, because the method is a closed-form solution using "simple" linear algebra. I agree with this, but don't you still need to tune the L2-regularization strength? To me that is a hyperparameter you would need to cross-validate over (or tune in some similar way).
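
To make the point concrete, here is a minimal sketch of what tuning that one knob could look like. It assumes the closed-form step is plain ridge regression, which may not match the article's actual solver; `ridge_fit`, `pick_lambda`, and the candidate grid are hypothetical names and values, not anything from the article.

```python
import numpy as np

def ridge_fit(X, Y, lam):
    # Closed-form ridge solution: W = (X^T X + lam * I)^-1 X^T Y
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

def pick_lambda(X, Y, lams, k=5, seed=0):
    # Plain k-fold cross-validation over a small grid of strengths
    # (hypothetical helper; the grid itself is an assumption).
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), k)
    scores = []
    for lam in lams:
        err = 0.0
        for i in range(k):
            val = folds[i]
            train = np.concatenate([folds[j] for j in range(k) if j != i])
            W = ridge_fit(X[train], Y[train], lam)
            # Mean squared error on the held-out fold
            err += np.mean((X[val] @ W - Y[val]) ** 2)
        scores.append(err / k)
    return lams[int(np.argmin(scores))], scores
```

Each fit is still closed-form, so the search is cheap, but it is a search nonetheless.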

yorwba today at 8:06 AM

You should benchmark the retrieval speed of each method in terms of queries per second. I suspect that the gain in bandwidth you get from slightly better compression will be defeated by decompression being much more expensive.
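
A rough sketch of the kind of measurement meant here, assuming each method can be wrapped as a callable that takes one query and performs the full lookup including any decompression. `queries_per_second` and `search_fn` are hypothetical names, not part of the article's code.

```python
import time

def queries_per_second(search_fn, queries, repeats=3):
    # Time several full passes over the query set and keep the fastest,
    # which reduces noise from other processes on the machine.
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for q in queries:
            search_fn(q)
        best = min(best, time.perf_counter() - start)
    # Guard against a zero elapsed time on very small workloads.
    return len(queries) / max(best, 1e-12)
```

Running this with the same query set for every method gives numbers that can sit next to the compression ratios.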

afxuh today at 9:02 AM

Cool idea. But it only works when the data never changes. Could you make a streaming/incremental version, one that updates the math cheaply when new data arrives instead of recomputing everything? Or does the math fundamentally prevent it?
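
A sketch of what I have in mind, under the assumption that the closed-form step only needs the sums X^T X and X^T Y (true for plain ridge regression; not confirmed for the article's method). `StreamingRidge` is a hypothetical class, not something from the article.

```python
import numpy as np

class StreamingRidge:
    # Keeps running sufficient statistics so new batches can be folded in
    # without revisiting old data (assumes a ridge-style closed form).
    def __init__(self, d_in, d_out, lam=1.0):
        self.lam = lam
        self.xtx = np.zeros((d_in, d_in))
        self.xty = np.zeros((d_in, d_out))

    def update(self, X, Y):
        # Cost depends on the batch size, not on the data seen so far.
        self.xtx += X.T @ X
        self.xty += X.T @ Y

    def solve(self):
        # Same closed form as the batch case, on the accumulated sums.
        d = self.xtx.shape[0]
        return np.linalg.solve(self.xtx + self.lam * np.eye(d), self.xty)
```

The solve still costs a d-by-d linear system per refresh, but that is independent of how many rows have been seen. If the method also centers the data or depends on other global statistics, those would need running versions too.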

stephantul today at 9:33 AM

Really cool! I was investigating PCA on retrieval, thanks for the references.