You are right that the wing/leg comparison is often lazy rhetoric: we hold engineered systems to different failure standards for good reason.
But you are misusing the mereological fallacy. It does not dismiss LLM/brain comparisons: it actually strengthens them. If the brain does not "think" (the person does), then by the same reasoning an LLM does not "think" either. Both are subsystems within larger systems. That is not a category error; it is a structural similarity.
This does not excuse LLM limitations; rimeice's concern about two unreliable parties is valid. But dismissing comparisons as "category errors" without examining which properties are being compared is just as lazy as the wing/leg response.