Hacker News

Matrix Orthogonalization Improves Memory in Recurrent Models

76 points by at2005 today at 5:13 AM | 26 comments

Comments

imurray today at 2:04 PM

Here is a PyTorch optimizer that can keep a matrix orthogonal throughout optimization:

https://github.com/adrianjav/pogo — POGO: A Proximal One-step Geometric Orthoptimizer

https://arxiv.org/abs/2602.14656 — An Embarrassingly Simple Way to Optimize Orthogonal Matrices at Scale; Adrián Javaloy, Antonio Vergari

hasley today at 3:23 PM

I suspect that by "orthogonalization" they mean finding vectors that form an orthogonal basis for the same subspace spanned by the vectors in the source matrix.
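To make that reading concrete, here is a minimal NumPy sketch of basis orthogonalization via a reduced QR decomposition. The matrix `A` and its shape are made up for illustration, and this is only one way to orthogonalize, not necessarily what the paper does.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical source matrix: its 3 columns are the vectors to orthogonalize.
A = rng.standard_normal((5, 3))

# Reduced QR: the columns of Q are orthonormal and span the same
# subspace as the columns of A (A has full column rank here).
Q, R = np.linalg.qr(A)

# Orthonormality check: Q^T Q should be the identity.
ortho_ok = np.allclose(Q.T @ Q, np.eye(3))

# Same-subspace check: projecting A onto span(Q) returns A unchanged.
span_ok = np.allclose(Q @ (Q.T @ A), A)

print(ortho_ok, span_ok)
```

Note that Q spans the same subspace as A but is not in general the orthogonal matrix closest to A.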

I wonder what the result would be if they instead used the orthogonal matrix closest to the source matrix. Closeness is usually measured in the Frobenius norm (the square root of the sum of all squared matrix entries). Maybe one could even try another norm that yields a sparser matrix.
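For the Frobenius norm, the closest orthogonal matrix has a known closed form: if A = U S V^T is the SVD, it is U V^T (the orthogonal polar factor). A small sketch, assuming a square matrix and using a hypothetical helper name `nearest_orthogonal`, compares it with the Q factor from QR:

```python
import numpy as np

def nearest_orthogonal(A):
    """Orthogonal matrix closest to A in Frobenius norm (polar factor).

    Hypothetical helper for illustration: drops the singular values of A.
    """
    U, _, Vt = np.linalg.svd(A)
    return U @ Vt

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))  # made-up square source matrix

Q_polar = nearest_orthogonal(A)
Q_qr, _ = np.linalg.qr(A)  # another orthogonal matrix, for comparison

# np.linalg.norm on a 2-D array defaults to the Frobenius norm.
dist_polar = np.linalg.norm(A - Q_polar)
dist_qr = np.linalg.norm(A - Q_qr)

print(dist_polar, dist_qr)  # the polar factor is never farther from A
```

The sparser-norm variant has no such closed form that I am aware of, so it is not sketched here.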

BirbSingularity today at 6:19 AM

I can't help but think of orthogonal frequency-division multiplexing and its use in encoding data on multiple carrier frequencies. It makes me wonder what other parallels we will discover between digital transmission technology and other domains like this one.

phkahler today at 1:04 PM

If it can be made orthogonal, can you go a step further and diagonalize it? The storage and performance improvement from that would be huge.
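A quick numerical look at the question, as a sketch with a made-up random matrix: a real orthogonal matrix is normal, so it can be diagonalized over the complex numbers, and its eigenvalues all have modulus 1. The diagonal form is generally complex, though, because a real matrix that is both diagonal and orthogonal can only have entries of +1 or -1.

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up example: a random 6x6 orthogonal matrix from a QR decomposition.
Q, _ = np.linalg.qr(rng.standard_normal((6, 6)))

# Eigendecomposition over the complex numbers.
eigvals, eigvecs = np.linalg.eig(Q)

# Every eigenvalue of an orthogonal matrix lies on the unit circle.
unit_modulus = np.allclose(np.abs(eigvals), 1.0)

# Q = V diag(lambda) V^{-1}, with V and lambda complex in general.
reconstructed = eigvecs @ np.diag(eigvals) @ np.linalg.inv(eigvecs)
reconstructs = np.allclose(reconstructed, Q)

print(unit_modulus, reconstructs)
print(eigvals)
```

Whether the complex diagonal form saves storage or compute in practice depends on also keeping the eigenvector matrix, which this sketch does not address.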

harveyrook today at 12:46 PM

Now I’m wondering what the eigenspace of an LLM is. If I take a set of LLMs with the same number of parameters, what are the eigenvectors? Do they have different personalities?

mv_d5339e31 today at 9:00 AM

[dead]