Hacker News

gertlabs, today at 2:32 AM

Taking into account that this is a flash model, it's a strong release. It's very fast and frontier-ish for the price.

Raw intelligence is high for a flash model. But Google's problem has always been productization and tool use, while its raw intelligence has always been competitive. It does not look like they solved that with this release -- in fact, their tool use delta (the improvement in scores when given arbitrary tools and a harness) has actually regressed relative to some previous models.
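
To make "tool use delta" concrete, here is a minimal sketch, assuming the delta is simply the score with tools and a harness minus the score without them. The function name and all scores are made up for illustration and are not taken from the rankings.

```python
def tool_use_delta(score_with_tools: float, score_without_tools: float) -> float:
    """Improvement in score when the model is given tools and a harness.

    Assumption: the delta is a plain difference of two benchmark scores.
    """
    return score_with_tools - score_without_tools


# Hypothetical scores, invented for illustration only.
older_model = tool_use_delta(score_with_tools=70.0, score_without_tools=55.0)
newer_model = tool_use_delta(score_with_tools=72.0, score_without_tools=62.0)

# A regressed delta: the newer model gains less from tools,
# even though both of its absolute scores are higher.
regressed = newer_model < older_model
```

The point of the toy numbers is that a model can improve on raw scores while its gain from tools shrinks.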

Data at https://gertlabs.com/rankings