Hacker News

canjobear, yesterday at 4:09 PM

Softmax isn't a loss function. It is used to transform model outputs into positive numbers that sum to 1, so that they can be interpreted as probabilities; those numbers are then passed into (typically) the cross-entropy loss function. I think you mean: which models are trained using some function other than softmax to transform the model outputs? There are a number of alternatives to softmax, such as the ones described here: https://www.emergentmind.com/topics/sparsemax
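A minimal sketch of that distinction in plain Python (function names are my own): softmax is just a transform from raw logits to a probability distribution, and cross-entropy is a separate loss computed on that distribution.

```python
import math

def softmax(logits):
    # Subtract the max logit for numerical stability; the output is
    # a list of positive numbers that sums to 1.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, target_index):
    # Negative log-probability of the true class -- the actual loss.
    return -math.log(probs[target_index])

logits = [2.0, 1.0, 0.1]   # raw model outputs
probs = softmax(logits)    # transform, not a loss
loss = cross_entropy(probs, target_index=0)  # the loss
```

Swapping softmax for something like sparsemax would change only the first step; the loss applied afterwards is a separate choice.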


Replies

jmalicki, yesterday at 4:20 PM

The cross entropy loss function is softmax. They are one and the same.
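The conflation is understandable, since in practice the two are usually fused into a single operation: cross-entropy applied to softmax outputs simplifies algebraically to logsumexp(logits) - logits[target], which is how frameworks typically implement it for numerical stability. A sketch of the equivalence, in plain Python with made-up function names:

```python
import math

def softmax_cross_entropy(logits, target):
    # Fused form: -log(softmax(logits)[target]) == logsumexp(logits) - logits[target].
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return lse - logits[target]

# Unfused reference: softmax first, then cross-entropy on the result.
logits = [2.0, 1.0, 0.1]
exps = [math.exp(x) for x in logits]
probs = [e / sum(exps) for e in exps]
unfused = -math.log(probs[0])

fused = softmax_cross_entropy(logits, 0)
# The two agree to floating-point precision, but softmax on its own
# is still just the probability transform, not the loss.
```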
