People have tried to suss this out on the ML subreddit, and it is confusing. Most of the worst messages from Tay were just people discovering a "repeat after me: __" function, so it's hard even to figure out which Tay messages count as actual responses from the model.
There seems to have been interest in a model that would pick up the language and style of its conversations (not actually learning information or looking up facts). If you haven't trained an LSTM model before: you could train one on Shakespeare's plays and get out ye olde English in a screenplay format, but from line to line there'd be no consistency in plot, characters, entrances and exits, etc., of the kind you'd expect after GPT-2. Twitter would be a good fit, since conversations there stay short-form. So I believe Tay and the Watson that appeared on Jeopardy come more from this 'classical NLP' thinking and aren't proto-LLMs, if that makes sense.