This is a useless benchmark nowadays; every model provider trains their models on making good pelicans. Some have even trained on every combination of animal/mode of transportation.
Every model provider except OpenAI?