Haiku/Flash/small models are underpowered for literally anything where getting the details right, without false positives, matters at least like 25%. (That's not to say they're only correct 25% of the time; it's definitely more than that. But they're blatantly, confidently wrong often enough that the wasted time is a significant net negative for me, even on relatively trivial tasks.)