Claude 3.5 Sonnet outperforms GPT-4o and o1 in software engineering, OpenAI study shows

A new OpenAI study reveals Claude 3.5 Sonnet outperforms GPT-4o and o1 on SWE-Lancer, a new benchmark simulating real-world software engineering tasks.