Gradient boosting often handles tabular data better than neural networks, largely because the structure is simpler and the main difficulty becomes dealing with the noise. Do a like-for-like comparison on unstructured data such as images, audio, video, or text, and a well-designed NN will mop the floor with gradient boosting. That is because handling that sort of data requires encoding some form of bias about the convolutional patterns you expect in it, or you won't get anywhere. Both CNNs and transformers do this.
Would you agree or disagree with the following?
- It's not gradient boosting per se that's good on tabular data, it's trees. Other ways of fitting tree-based models (random forests, for example) are also usually superior to NNs on tabular data.
- Trees are better on tabular data because they encode a useful inductive bias that NNs currently do not, just as CNNs or ViTs are better on images because they encode spatial locality as an inductive bias.
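
To make the second point concrete, here is a minimal sketch of one candidate for that inductive bias. This is my assumption, since the claim above doesn't say which bias is meant: a tree split only looks at the ordering of a feature's values, so it doesn't care how each column is scaled. The `best_stump` helper and the toy data are hypothetical, written only for illustration, and a depth-1 tree (a stump) stands in for a full tree, which would apply the same step recursively.

```python
import math
import random


def best_stump(rows, labels):
    """Return (errors, feature, left_indices) for the best single split.

    Candidate splits fall between consecutive sorted values of a feature,
    so only the ordering of the values matters, not their magnitudes.
    """
    best = None
    for f in range(len(rows[0])):
        order = sorted(range(len(rows)), key=lambda i: rows[i][f])
        for cut in range(1, len(order)):
            errors = 0
            for side in (order[:cut], order[cut:]):
                ones = sum(labels[i] for i in side)
                # Each side predicts its majority class; count the mistakes.
                errors += min(ones, len(side) - ones)
            if best is None or errors < best[0]:
                best = (errors, f, frozenset(order[:cut]))
    return best


# Toy tabular data (hypothetical): 3 columns, each a shuffled grid of
# distinct values, with a label that depends on the first two columns.
rng = random.Random(0)
n = 60
columns = [rng.sample(range(n), n) for _ in range(3)]
rows = [[columns[f][i] / 10.0 for f in range(3)] for i in range(n)]
labels = [1 if r[0] + r[1] > 6.0 else 0 for r in rows]

# Warp every feature with a strictly increasing function (exp here).
warped = [[math.exp(v) for v in r] for r in rows]

stump_raw = best_stump(rows, labels)
stump_warped = best_stump(warped, labels)
print("same split chosen:", stump_raw == stump_warped)
```

The stump picks the same feature and the same partition of the rows before and after warping. A dense NN layer has no such guarantee, because it consumes the raw magnitudes, which is one reason NNs on tabular data tend to need per-column scaling. Whether this is the bias that explains the gap is a separate question, and the sketch doesn't settle it.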