I think there's a different lesson to take from those cases: the LLM will build to whatever you give it a feedback loop for.
If you give it only the logical tests, it won't consider speed at all. If you include tests that measure speed and ask the LLM to match the performance, it'll do that too.
It's the same class of error as everything else with LLMs: it has no common-sense context for the things people consider important. If you don't enforce the boundaries, it will ignore them.
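To make that concrete, here's a rough sketch of what "logical tests plus a speed test" could look like. Everything in it is made up for illustration: `dedupe_reference` is a hypothetical baseline, `dedupe_candidate` stands in for whatever the LLM wrote, and the 5x `max_slowdown` budget is an arbitrary number, not a recommendation.

```python
import time


def dedupe_reference(items):
    """Hypothetical baseline: order-preserving dedupe using a set."""
    seen = set()
    out = []
    for x in items:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out


def dedupe_candidate(items):
    """Stand-in for the LLM-written implementation under test."""
    # dict preserves insertion order (Python 3.7+), so this keeps first occurrences.
    return list(dict.fromkeys(items))


def best_time(fn, data, repeats=5):
    """Return the fastest of several runs to reduce timing noise."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(data)
        best = min(best, time.perf_counter() - start)
    return best


def test_logic():
    # Correctness only: a quadratic implementation would pass this too.
    assert dedupe_candidate([3, 1, 3, 2, 1]) == [3, 1, 2]
    assert dedupe_candidate([]) == []


def test_speed(max_slowdown=5.0):
    # The boundary being enforced: stay within a budget relative to the baseline.
    data = list(range(20_000)) * 2
    baseline = best_time(dedupe_reference, data)
    candidate = best_time(dedupe_candidate, data)
    assert candidate <= baseline * max_slowdown, (
        f"candidate took {candidate:.4f}s vs baseline {baseline:.4f}s"
    )
```

With only `test_logic` in the loop, nothing stops a slow solution from passing. `test_speed` is what turns "performance matters" into something the model actually gets feedback on. Timing tests like this can be flaky on shared machines, so the budget probably needs some slack.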
Question is, are our optimization functions well specified enough? (No)
How important is a well-specified optimization function? No one knows. We will find out.