If LLMs' internal representations are essentially one-to-one mappings of input texts with no additional structure, how can those representations be useful for tasks like object manipulation in robotics?
How is transfer learning possible — that is, why does training on non-textual data enhance performance on textual tasks?