Thanks, I tested it and it failed the strawberry test. Qwen 3.5 0.8B, at a similar size, passes it and is far more usable.
Does asking it to think step by step, or character by character, improve the answer? It might be a combination of tokenization and the model being unaware of its own tokenization shortcomings.
Interesting. Qwen 3.5 0.8B failed the test for me.
I hope you are kidding. How is that a test of any capability? It's a miracle that any model can learn to spell strawberry at all, because it cannot see the actual characters, and the word is also likely misspelled a lot in the corpus. I've been playing with this model and I'm pleasantly surprised: it certainly knows a lot, quite a lot for 1.1G.
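For anyone wondering why letter counting is awkward for these models, here's a rough sketch of the point about not seeing characters. The token split and IDs below are made up for illustration; they are not the output of this model's tokenizer or of any real one.

```python
# Toy illustration only: the split and IDs are hypothetical,
# not taken from any real tokenizer.
word = "strawberry"

# What a person sees: individual characters, so counting is trivial.
chars = list(word)
r_count = word.count("r")

# What a model might see instead: a few opaque subword IDs.
hypothetical_tokens = ["str", "aw", "berry"]               # assumed split
hypothetical_vocab = {"str": 101, "aw": 102, "berry": 103}  # assumed IDs
token_ids = [hypothetical_vocab[t] for t in hypothetical_tokens]

print(chars)      # the ten letters, one per element
print(r_count)    # 3
print(token_ids)  # [101, 102, 103]
```

Nothing in `[101, 102, 103]` says how many r's are inside each piece, so the model has to have memorized the spelling of each token. That is also why asking it to spell the word out character by character first can help.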