pdyc • today at 2:52 AM • 3 replies

Thanks, I tested it, and it failed the strawberry test. Qwen 3.5 0.8B, which is of similar size, passes it and is far more usable.

Replies

I hope you are kidding. How is that a test of any capability? It's a miracle that any model can learn strawberry, because it cannot see the actual characters and, ALSO, the word is likely misspelled a lot in the corpus. I've been playing with this model and I'm pleasantly surprised: it certainly knows a lot, quite a lot for 1.1G.
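
To make the "cannot see the actual characters" point concrete, here is a minimal sketch of subword tokenization. The vocabulary, the token IDs, and the `toy_tokenize` helper are all made up for illustration; real tokenizers use much larger learned vocabularies and may split the word differently.

```python
# Hypothetical subword vocabulary: the pieces and IDs are invented for this sketch.
TOY_VOCAB = {"str": 101, "aw": 102, "berry": 103}


def toy_tokenize(word, vocab):
    """Greedy longest-match split of `word` into IDs of pieces found in `vocab`."""
    ids = []
    i = 0
    while i < len(word):
        # Try the longest remaining piece first, then shrink it.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no vocabulary piece matches at position {i}")
    return ids


# The model receives only these integers, not the letters inside each piece.
token_ids = toy_tokenize("strawberry", TOY_VOCAB)
print(token_ids)
```

Under this toy split the three r's are spread across two opaque IDs, so counting them requires the model to have memorized the spelling of each piece.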

algoth1 • today at 9:09 AM

Does asking it to think step by step, or character by character, improve the answer? It might be a combination of tokenization and the model's unawareness of its own tokenization shortcomings.
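
One rough way to try the character-by-character idea is to compute the ground truth and hand the model the word already spelled out. This is only a sketch: the helper names and the prompt wording are assumptions, and whether it actually helps a given model is untested here.

```python
def count_letter(word, letter):
    """Ground-truth count of `letter` in `word`, ignoring case."""
    return word.lower().count(letter.lower())


def spell_out(word):
    """Separate the characters with spaces so each one is more likely its own token."""
    return " ".join(word)


def build_prompt(word, letter):
    """Hypothetical prompt asking the model to count one character at a time."""
    return (
        f"Here is a word spelled one character at a time: {spell_out(word)}\n"
        f"Go through it character by character and count how many times "
        f"'{letter}' appears."
    )


expected = count_letter("strawberry", "r")  # reference answer to compare the reply against
prompt = build_prompt("strawberry", "r")
print(expected)
print(prompt)
```

Comparing the model's reply against `expected` with and without the spelled-out form would show whether the failure is mostly a tokenization effect.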

selcuka • today at 4:12 AM

Interesting. Qwen 3.5 0.8B failed the test for me.
