Matrix Orthogonalization Improves Memory in Recurrent Models

76 points • by at2005 • today at 5:13 AM • 26 comments

Comments

Here is a PyTorch optimizer that keeps a matrix orthogonal throughout optimization:

https://github.com/adrianjav/pogo — POGO: A Proximal One-step Geometric Orthoptimizer

https://arxiv.org/abs/2602.14656 — An Embarrassingly Simple Way to Optimize Orthogonal Matrices at Scale; Adrián Javaloy, Antonio Vergari

hasley • today at 3:23 PM

I suspect that by "orthogonalization" they mean finding vectors that form an orthogonal basis for the same subspace as the vectors in the source matrix.

I wonder what the result would be if they instead used the orthogonal matrix closest to the source matrix. Closeness is usually measured in the Frobenius norm (the square root of the sum of all squared matrix entries). Maybe one could even try another norm that yields a sparser matrix.
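
To make the two options concrete, here is a minimal NumPy sketch, not anything taken from the paper. It compares a QR-based orthogonalization (an orthonormal basis for the same column space) with the nearest orthogonal matrix in Frobenius norm, which for a square matrix is the polar factor U V^T from the SVD. The function names `qr_orthogonalize` and `nearest_orthogonal` and the random 4x4 test matrix are assumptions made for illustration.

```python
import numpy as np

def qr_orthogonalize(a):
    """Orthonormal basis for the column space of `a`, obtained from a QR factorization."""
    q, r = np.linalg.qr(a)
    # Flip column signs so diag(r) would be non-negative; this only fixes a sign convention.
    signs = np.sign(np.diag(r))
    signs[signs == 0] = 1.0
    return q * signs

def nearest_orthogonal(a):
    """Orthogonal matrix closest to the square matrix `a` in Frobenius norm (polar factor)."""
    u, _, vt = np.linalg.svd(a)
    return u @ vt

rng = np.random.default_rng(0)
a = rng.standard_normal((4, 4))  # hypothetical "source matrix"

q_qr = qr_orthogonalize(a)
q_polar = nearest_orthogonal(a)

# np.linalg.norm on a 2-D array defaults to the Frobenius norm.
dist_qr = np.linalg.norm(a - q_qr)
dist_polar = np.linalg.norm(a - q_polar)
print(f"distance to QR factor:    {dist_qr:.4f}")
print(f"distance to polar factor: {dist_polar:.4f}")
```

Both results are orthogonal, but the polar factor can never be farther from the source matrix than the QR factor, since it minimizes the Frobenius distance over all orthogonal matrices.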

BirbSingularity • today at 6:19 AM

I can't help but think of orthogonal frequency-division multiplexing and its use in encoding data on multiple carrier frequencies. It makes me wonder what other parallels with digital transmission technology we will discover in cross-domain stuff like this.

phkahler • today at 1:04 PM

If it can be made orthogonal, can you go a step further and diagonalize it? The storage and performance improvement from that would be huge.
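
One caveat worth sketching: a real orthogonal matrix is normal, so it can always be diagonalized, but in general only over the complex numbers, with eigenvalues on the unit circle. The small NumPy example below uses a 2x2 rotation matrix (an arbitrary choice for illustration, as is the angle `theta`) to show this.

```python
import numpy as np

theta = 0.3  # arbitrary rotation angle, chosen for illustration
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])

# A rotation by an angle that is not a multiple of pi has no real eigenvectors,
# so eig returns a complex-conjugate pair of eigenvalues.
eigvals, eigvecs = np.linalg.eig(rot)
moduli = np.abs(eigvals)

# Rebuild the matrix from its complex eigendecomposition.
reconstructed = eigvecs @ np.diag(eigvals) @ np.linalg.inv(eigvecs)

print("eigenvalues:", eigvals)
print("moduli:", moduli)
```

So the diagonal form exists, but it involves complex arithmetic, and the eigenvector matrix still has to be stored or applied somewhere. Whether that gives a net saving likely depends on whether the change of basis can be fixed or absorbed into neighboring layers.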

harveyrook • today at 12:46 PM

Now I’m wondering: what is the eigenspace of an LLM? If I take a set of LLMs with the same number of parameters, what are the eigenvectors? Do they have different personalities?
