I thought finetuning data can't contradict foundation models, and anything that are inconsistent with the standard LLM American-Chinese split personality would be rejected?
Fine tuning happens on top of pretraining, so of course it can "forget" pretrained defaults when warranted by the new data it's being fine tuned on.
Fine tuning happens on top of pretraining, so of course it can "forget" pretrained defaults when warranted by the new data it's being fine tuned on.