Hacker News

raddan, yesterday at 10:03 PM (3 replies)

I’d like to know what the advantage is over KL divergence. It seems like the important idea is symmetry? Not clear to me why that matters; I’d love to know what application this is used for.


Replies

fumeux_fume, yesterday at 10:32 PM

There are many applications. I mainly see it used for detecting drift in datasets for ML models. It has a nice benefit over the KL divergence in the case where the two distributions you're measuring have no overlap: KL is infinite (or undefined) there, while JS stays finite and simply returns its maximum value (ln 2 in nats, or 1 with log base 2). Also, taking its square root gives you a true distance (a metric that satisfies the triangle inequality) rather than a divergence, and because it is bounded you can meaningfully compare it to JSD measurements of other pairs of distributions.
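
A quick way to see the no-overlap behaviour is to compute both quantities on two toy discrete distributions with disjoint support. This is only a minimal sketch in plain Python; the function names (`kl_divergence`, `js_divergence`, `js_distance`) and the example vectors `p` and `q` are made up for illustration, and everything is in nats (natural log).

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats. Returns inf if q is 0 somewhere p is positive."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # the term 0 * log(0 / q) is taken to be 0
        if qi == 0:
            return math.inf  # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total

def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their mixture m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def js_distance(p, q):
    """Square root of the JS divergence, which is a metric."""
    return math.sqrt(max(0.0, js_divergence(p, q)))

# Hypothetical distributions with no overlap at all.
p = [0.5, 0.5, 0.0, 0.0]
q = [0.0, 0.0, 0.5, 0.5]

print(kl_divergence(p, q))  # inf
print(js_divergence(p, q))  # ln 2, about 0.693 (the upper bound)
print(js_divergence(p, p))  # 0.0 for identical distributions
print(js_distance(p, q))    # sqrt(ln 2), about 0.833
```

Because the mixture `m` is positive wherever `p` or `q` is, the two KL terms inside `js_divergence` never blow up, which is why the result stays bounded. For real drift monitoring you would presumably bin or estimate the distributions first; that step is left out here.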

andy99, yesterday at 10:13 PM

IIRC (and I could be wrong, this is from memory), JS divergence is what is minimized in GANs, at least for some training methods. In a GAN we simultaneously train a generator and a real-vs-synthetic classifier, each trying to beat the other, so that the generator converges on realistic-looking synthetic data.
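
For reference, the connection can be written down for the original minimax GAN objective (this is the result as I understand it from the original GAN paper, and it assumes the discriminator is trained to optimality; other GAN variants minimize different quantities). Here $p_{\text{data}}$ is the real data distribution, $p_g$ the generator's distribution, and $D^*$ the optimal discriminator:

```latex
D^*(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)}
\qquad\Longrightarrow\qquad
\max_D V(G, D) = -\log 4 + 2\,\mathrm{JSD}\!\left(p_{\text{data}} \,\|\, p_g\right)
```

So, against an optimal discriminator, minimizing the generator's loss is the same as minimizing the JS divergence, and the minimum value $-\log 4$ is reached exactly when $p_g = p_{\text{data}}$.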

I don’t think GANs are used much now compared to diffusion models, but as recently as a few years ago they were the standard way to make fake data, a la “This Person Does Not Exist”.