The models can accept images directly as tokens: not a description of an image, but the actual image itself.
Yes, the visual intelligence is limited, but they do have vision capabilities.
Yes, I agree; we're saying the same thing. I'm just trying to highlight that the "visual intelligence" really isn't up to par for anything stringent when it comes to UI and UX. Explained further here: https://news.ycombinator.com/item?id=48133641