tentative evals on o3-mini for @ellipsis_dev code review: reasoning_effort="low" is meh reasoning_effort="medium" is quite good, allows us to simplify our pipeline reasoning_effort="high" is so good it found a bunch of new bugs in our eval PRs that we hadn't noticed before
EPX
9.23%
From Twitter
Disclaimer: The content above is only the author's opinion which does not represent any position of Followin, and is not intended as, and shall not be understood or construed as, investment advice from Followin.
Like
Add to Favorites
Comments
Share
Relevant content