• drannex 3 hours ago

    Yes! But when they work, they only kinda work, sort of.

    • rogerkirkness a day ago

      Agents went from 10% to 30% reliable this year, which is still a big deal.

      • bogzz 9 hours ago

        lol

      • thebigspacefuck a day ago

        This is from Dec 2024, which feels like a while ago.

        • bsallthewaydown 14 hours ago

          AI is going to be the next bubble. It can't even figure out who the real author of a sculpture is. It's really all BS made up to play with markets and geopolitics. Enjoy it while it lasts.

          • JTbane a day ago

            "We test baseline agents powered by both closed API-based and open-weights language models (LMs), and find that the most competitive agent can complete 30% of tasks autonomously."

            • gavinray a day ago

              So you ask it to try every task 3.33 times for guaranteed success?
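
For what it's worth, the arithmetic behind the 3.33 joke can be sketched quickly. This assumes each attempt succeeds independently with probability 0.30, which is an assumption for illustration only: the quoted 30% is a share of tasks completed, not a per-attempt rate, and a task the agent cannot do may fail on every retry. The function names below are made up for this sketch. Under that assumption, 1/0.30 ≈ 3.33 is the expected number of attempts until the first success, not a guarantee.

```python
def p_at_least_one_success(p: float, n: int) -> float:
    """Chance that at least one of n independent attempts succeeds.

    Assumes every attempt succeeds with the same probability p,
    independently of the others (an illustrative assumption).
    """
    return 1.0 - (1.0 - p) ** n


def expected_attempts(p: float) -> float:
    """Mean number of attempts until the first success (geometric distribution)."""
    return 1.0 / p


if __name__ == "__main__":
    p = 0.30  # per-attempt success rate, assumed for illustration
    for n in (1, 3, 4, 10):
        print(f"{n} attempts: {p_at_least_one_success(p, n):.3f}")
    print(f"expected attempts until first success: {expected_attempts(p):.2f}")
```

Under the independence assumption the chance of at least one success approaches 1 as retries grow but never reaches it, so no number of retries makes success guaranteed.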