InfiniteLoup • today at 12:59 PM • 2 replies

I was always curious about how Tay worked technically, since it was built before the Transformer era.

Was it based on a specific scientific paper or research?

The controversy surrounding it seems to have polluted any search for a technical breakdown, a discussion, or the insights gained from it.

Replies

mapmeld • today at 3:56 PM

People have tried to suss this out on the ML subreddit, and it is confusing. Most of the worst messages from Tay were just people discovering a "repeat after me: __" function, so it's hard to figure out which Tay messages were actually generated by the model.

There seems to have been interest in a model that would pick up the language and style of its conversations (not actually learning information or looking up facts). If you haven't trained an LSTM model before: you could train one on Shakespeare's plays and get ye olde English in a screenplay format, but from line to line there was no consistency in plot, characters, entrances and exits, etc., of the kind you'd expect after GPT-2. Twitter would be a good fit, since it keeps conversations short-form. So I believe Tay and the Watson that appeared on Jeopardy come more from this 'classical NLP' thinking and are not proto-LLMs, if that makes sense.
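To make the "style without consistency" point concrete, here is a minimal sketch. It is not an LSTM and it is not how Tay worked (nobody in this thread knows that). It is a character-level n-gram Markov chain, swapped in because it runs with only the standard library and shows the same failure mode in miniature. The function names, the tiny corpus (a few lines from Hamlet), and the `order` parameter are all made up for illustration.

```python
import random
from collections import defaultdict

def train_char_model(text, order=4):
    """Map each `order`-character context to the characters that followed it."""
    model = defaultdict(list)
    for i in range(len(text) - order):
        model[text[i:i + order]].append(text[i + order])
    return model

def generate(model, seed_text, length=200, order=4, rng=None):
    """Sample one character at a time, looking only at the last `order` characters."""
    rng = rng or random.Random(0)  # fixed seed so runs are repeatable
    out = seed_text
    for _ in range(length):
        followers = model.get(out[-order:])
        if not followers:  # context never seen with a follower: stop early
            break
        out += rng.choice(followers)
    return out

# Hypothetical toy corpus; a real experiment would use the full plays.
corpus = (
    "HAMLET: To be, or not to be, that is the question.\n"
    "OPHELIA: Good my lord, how does your honour for this many a day?\n"
    "HAMLET: I humbly thank you; well, well, well.\n"
)

model = train_char_model(corpus, order=4)
sample = generate(model, "HAML", length=120, order=4)
print(sample)
```

Every short window of the output looks like the source (speaker labels, punctuation, archaic wording), but nothing ties one line to the next, because the model only ever sees the last few characters. An LSTM carries more context than this, but the Shakespeare experiments described above had the same flavor: locally plausible, globally incoherent.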

Kye • today at 1:10 PM

https://blogs.microsoft.com/blog/2016/03/25/learning-tays-in...

https://arxiv.org/abs/1812.08989
