why is the breakdown from words to letters your highest priority thing to add to the training data?
what problem does this allow you to solve that you couldnt otherwise?
This comment is tangential to their point that a transformer architecture can or cannot be functionally equivalent to a human brain. Practicality of those limitations is a different discussion
This comment is tangential to their point that a transformer architecture can or cannot be functionally equivalent to a human brain. Practicality of those limitations is a different discussion