It's because the dataset is all lossy-compressed music, not the original source audio.
It was basically made with pirated MP3s.