• afshinmeh a day ago

    Vibe coding aside [1], it's very interesting that software projects these days don't seem to care about adding even a single test [2].

    [1]: https://github.com/withastro/flue/blob/8fdf8e0e9df5bd33c3120...

    [2]: https://github.com/search?q=repo%3Awithastro%2Fflue+test+pat...

    • willio58 11 hours ago

      I find automated tests are the only way to keep vibe coded projects on the rails, especially as you grow something beyond demo phase.

      • amluto a day ago

        I find this impressive: in my experience, codex-rs loves to add tests even when not prompted. Of course, it’s a bit of a crapshoot as to whether the test tests useful behavior.

        (My favorite so far: it created an empty file in /home/whatever and added a test to verify that some code it wrote would indeed fail when tested on this empty input and that it would fail with the correct error message. Never mind that this covered approximately none of the desired behavior and that the test would, of course, fail on any other system.)
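
        For illustration only, here is a minimal sketch of how a test like that could avoid the hard-coded path: create the empty input in a temporary directory at test time and clean it up afterwards. `parseConfig`, `withTempFile`, and the error message are hypothetical stand-ins, not anything from codex-rs or Flue.

        ```typescript
        import { mkdtempSync, readFileSync, rmSync, writeFileSync } from "node:fs";
        import { tmpdir } from "node:os";
        import { join } from "node:path";

        // Hypothetical function under test: parses "key=value" lines and
        // rejects empty input with a fixed error message.
        function parseConfig(path: string): Record<string, string> {
          const text = readFileSync(path, "utf8").trim();
          if (text === "") {
            throw new Error("config is empty");
          }
          const result: Record<string, string> = {};
          for (const line of text.split("\n")) {
            const idx = line.indexOf("=");
            if (idx === -1) {
              throw new Error(`malformed line: ${line}`);
            }
            result[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
          }
          return result;
        }

        // Runs `fn` against a file created in a fresh temp directory, then
        // removes the directory, so the test works on any machine.
        function withTempFile<T>(contents: string, fn: (path: string) => T): T {
          const dir = mkdtempSync(join(tmpdir(), "cfg-test-"));
          try {
            const path = join(dir, "input.cfg");
            writeFileSync(path, contents);
            return fn(path);
          } finally {
            rmSync(dir, { recursive: true, force: true });
          }
        }

        // Returns the error message produced for empty input, or "no error".
        function emptyInputError(): string {
          return withTempFile("", (path) => {
            try {
              parseConfig(path);
              return "no error";
            } catch (err) {
              return (err as Error).message;
            }
          });
        }
        ```

        The same helper can feed non-empty input, e.g. `withTempFile("a=1\nb=2\n", parseConfig)`, which is where a test of the desired behavior would start.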

        • jstummbillig a day ago

          That would be really interesting. I doubt it's the case, actually probably the opposite? The harnesses seem very happy to write extensive test suites, without me having to ask much.

          • gerardnico 16 hours ago

            Tests are the new gold. You keep them to avoid a vibe coded fork.

            • ai_slop_hater a day ago

              And what would they test? This is a meaningless wrapper for Anthropic or OpenAI SDKs.

            • 9cb14c1ec0 a day ago

              Ok, real question. What products are people actually building with agent frameworks? I get the utility of AI coding tools and generic chat apps, but that is the extent of utility that I've been able to get from AI. I'm looking for examples that are real businesses, not toys.

              • steve_adams_86 12 hours ago

                I use a custom framework for creating basic but useful tools that work with sensitive data. There are cases in my organization where I like the idea of people using Claude or similar to assist with a process, but Claude Desktop or Claude Code doesn't offer the safety or security we need (in part because the people using it are unconstrained, in part because the harnesses aren't perfect and the LLMs can make bad choices).

                This provides a harness that's a state machine with very explicit directives, and it uses Deno as the runtime to constrain network, filesystem, environment, and other types of access at runtime as needed.

                Kind of like using skills in Claude Code to teach it how to do something, but with extremely tight guard rails. Like, you can only write a specific file when in a specific state, otherwise that tool isn't even callable.
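
                A rough sketch of that kind of state-gated tool access, assuming a tiny in-memory harness. Every name here (`GatedHarness`, `read_data`, `write_schema`, the state names) is hypothetical and only illustrates the idea; the real framework and its Deno permission setup are not shown.

                ```typescript
                // Hypothetical states for a schema-building workflow.
                type State = "inspect" | "transform" | "validate" | "done";

                interface Tool {
                  name: string;
                  allowedIn: readonly State[]; // states in which the tool is exposed at all
                  run: (arg: string) => string;
                }

                // Legal transitions between states; anything else is rejected.
                const TRANSITIONS: Record<State, readonly State[]> = {
                  inspect: ["transform"],
                  transform: ["validate"],
                  validate: ["transform", "done"],
                  done: [],
                };

                class GatedHarness {
                  private state: State = "inspect";
                  private readonly tools: readonly Tool[];

                  constructor(tools: readonly Tool[]) {
                    this.tools = tools;
                  }

                  // Names of the tools that may be offered to the model right now.
                  callableTools(): string[] {
                    return this.tools
                      .filter((tool) => tool.allowedIn.includes(this.state))
                      .map((tool) => tool.name);
                  }

                  // Dispatches a tool call, refusing tools not allowed in the current state.
                  call(name: string, arg: string): string {
                    const tool = this.tools.find((t) => t.name === name);
                    if (!tool || !tool.allowedIn.includes(this.state)) {
                      throw new Error(`tool "${name}" is not callable in state "${this.state}"`);
                    }
                    return tool.run(arg);
                  }

                  // Moves to the next state only along a declared transition.
                  advance(next: State): void {
                    if (!TRANSITIONS[this.state].includes(next)) {
                      throw new Error(`illegal transition ${this.state} -> ${next}`);
                    }
                    this.state = next;
                  }
                }

                // Demo wiring: writing is only possible in the "transform" state.
                function makeDemoHarness(): GatedHarness {
                  return new GatedHarness([
                    { name: "read_data", allowedIn: ["inspect", "transform"], run: (arg) => `read: ${arg}` },
                    { name: "write_schema", allowedIn: ["transform"], run: (arg) => `wrote schema: ${arg}` },
                  ]);
                }
                ```

                In the `inspect` state only `read_data` is listed, so a `write_schema` call throws until `advance("transform")` has succeeded.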

                It requires understanding the problem that's being solved quite well. This often leads to realizing it can be automated without a harness. Finding cases where an LLM is genuinely crucial to enabling the automation is difficult.

                A good example of one recently was getting a local LLM to define schemas for an internal tool based on existing research data. It looks at the data, figures out the semantics of the data, relationships, and how that maps to the target schema. This is impossible to automate without this semantic inference. It then uses duckdb to perform transformations from raw data to the appropriate schema, and finally, tests the schema in the validator with the data. It makes a very complex, often unappealing and confusing process very easy. Once it's done, the data is in better shape than we ever got it to by hand. This is partially because of a validator I created, but also because the LLM can identify patterns really well and retain a massive spec while it works.

                You could do it with all kinds of existing harnesses but this one lets us comfortably define processes we trust and lets us operate on data our partners would never allow into the cloud or on OpenAI/Anthropic's servers in particular.

                > I'm looking for examples that are real businesses, not toys.

                These tools are used within a real business (specifically a coastal science NGO) and they aren't toys, so hopefully that's useful information. Based on my experience so far, and it could be my lack of imagination, I have no idea how you'd use these as the foundation for a business. I find more cases that can be automated without an LLM than I do with one, and they tend to be so niche and strange that no one else would ever need them and they can't be generalized.

                • mindfulmark 17 hours ago

                  We're building https://brooked.io/. In the same way that Cursor provides a lot of features on top of the base agents, we want to do the same for spreadsheets. There are many workflows that benefit from having an agent available - resolving cell values from a prompt, writing functions, sheet insights, alerting, debugging.

                • leothecool a day ago

                  What is the problem this solves? Why would I use this instead of telling Claude to vomit out the underlying boilerplate?

                  • axpy906 a day ago

                    Yeah, I don’t understand. It’s a lot easier to just build your own.

                    • bronson a day ago

                      Right. So why would I use this?

                      • SpyCoder77 a day ago

                        If you want to hand make agent flows and have "software brain", this is better than using N8N.

                    • nextaccountic 21 hours ago

                      It's the exact same problem solved by any library whatsoever

                      If you (or your agent) have to write less code, there's less room to write bugs. There will be less code to understand when it needs to be modified too.

                    • sdevonoes a day ago

                      Why TS? The npm ecosystem is insane and insecure. Not a chance we are running this on our machines.

                      Go and Rust are way better choices. Besides, if it’s all vibe coded, it shouldn’t matter to the author.

                      • jdw64 a day ago

                        I think the TypeScript ecosystem is more suitable for this.

                        I do not think Rust is a bad language. But the agent ecosystem changes very quickly, and in Rust, assembling and reshaping agent workflows is difficult.

                        Many people prefer Rust, and I understand why. It is a genuinely excellent language, and “Rust is a great language” is a strong message that attracts many developers. But as long as lifetimes exist, I think it will remain difficult.

                        The lifetime system assumes, in some sense, that humans can fully predict the lifecycle of values and resources. I am not sure that is truly possible in all domains. I am also not sure whether that model is linguistically suitable for the agent ecosystem.

                        In agent systems, requirements change constantly. Tools change, workflows change, providers change, schemas change, and failure policies change. In that kind of environment, I am not sure Rust is the right fit.

                        I like Rust a lot, and it is a language I genuinely want to learn. But I am not sure that applying Rust to everything is really the right answer.

                        I think Rust makes a lot of sense in relatively stable infrastructure ecosystems: operating systems, runtimes, sandboxes, and core low-level layers. But agent code usually requires high-level abstraction and rapid workflow composition. Doing that in Rust takes a tremendous amount of time.

                        • EdwardDiego a day ago

                          They also mentioned Go.

                          • beepbooptheory a day ago

                            Why do agent systems change more than other things? Maybe while we're here: What even is an agent system anyway? Does one work on agent systems as the final product, or is the agent system what you work with to make something else?

                            • jdw64 a day ago

                              The definition of “agent” has changed quite a bit, even in ACL papers and other academic work.

                              Looking at recent examples, the practical boundary seems to be whether an LLM uses tools. In some 2023 papers, certain pipeline-based systems were still referred to as agents. More recently, the term seems to mean something looser but more action-oriented: a system that understands a goal, uses tool calls, selects actions, and executes them.

                              In other words, there is still no fully settled engineering definition of what an agent is. I am not an expert or a graduate student; I mostly work as a subcontractor who gets hired by university professors to reproduce specific paper metrics.

                              In general, every system changes frequently in its early stage. Agent systems are no different. The workflows keep changing because the field does not yet have stable, openly accepted standards for AI development.

                              That is also why Claude, Codex, and others are fighting to define the standard. I think the term "harness," which Anthropic has been popularizing recently, is part of the same trend. By harness I mean the execution layer around the model call itself: context management, tool dispatch, retry and fallback policies, eval loops. That layer is still actively shifting. The naming is not settled, the responsibilities are not settled, and the boundaries between the harness and the model are not settled either. Each provider is drawing those lines a little differently right now.
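
                              To make one of those responsibilities concrete, here is a minimal sketch of context management: trimming message history to a budget before the next model call. The `Message` shape, the four-characters-per-token estimate, and the keep-newest policy are all assumptions for illustration, not how any particular provider's harness works.

                              ```typescript
                              interface Message {
                                role: "system" | "user" | "assistant" | "tool";
                                content: string;
                              }

                              // Crude size estimate (assumption): roughly 4 characters per token.
                              function estimateTokens(message: Message): number {
                                return Math.ceil(message.content.length / 4);
                              }

                              // Keeps every system message, then as many of the most recent
                              // non-system messages as fit in the budget, preserving their order.
                              function trimContext(history: readonly Message[], budget: number): Message[] {
                                const system = history.filter((m) => m.role === "system");
                                const rest = history.filter((m) => m.role !== "system");

                                let used = system.reduce((sum, m) => sum + estimateTokens(m), 0);
                                const kept: Message[] = [];

                                // Walk backwards from the newest message and stop at the first
                                // one that would exceed the budget.
                                for (let i = rest.length - 1; i >= 0; i--) {
                                  const cost = estimateTokens(rest[i]);
                                  if (used + cost > budget) {
                                    break;
                                  }
                                  kept.unshift(rest[i]);
                                  used += cost;
                                }

                                return [...system, ...kept];
                              }
                              ```

                              Whether trimming like this belongs in the harness, the SDK, or the provider's API is exactly the kind of boundary that is still being drawn differently from one project to the next.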

                              So my view is this: agent systems change frequently because the definition differs from person to person, the field keeps updating rapidly, and there is no engineering standard that has been firmly established yet.

                              Even the I/O standard itself is not really settled.

                          • ChaseRensberger a day ago

                            Not ready for production yet, but I've been working on https://wingman.actor for quite a while. It's a Go-based portable agent runtime with minimal dependencies.

                            • d0100 a day ago

                              I am using TS sandboxed in deno for all our agent code generated from a UI builder (inspired by OpenAI's own agent builder, and spits out the same code output)

                            • wartywhoa23 a day ago

                              So we get it, all stuff agentic will be named after various diseases, how apt.

                              • spankalee a day ago

                                Uh... a flue is not a disease.

                                • minikomi a day ago

                                  It's like a heavier commone colde

                              • egonschiele a day ago

                                As other comments have said, it would be great to add what this does that existing solutions can't. I see the project has been active since Feb, and has < 150 commits. I'd assume this is still pretty immature. So why use this? I think more explanation is needed.

                                • systima a day ago

                                  How does this differ from Mastra?

                                  • bigethan a day ago

                                    Mastra is a business, this seems to be a helpful lib

                                    • esprehn a day ago

                                      One big difference seems to be Mastra has tests and this project doesn't.

                                    • _s_a_m_ a day ago

                                      The new JavaScript web frameworks are agent frameworks

                                      • dataviz1000 a day ago

                                        Why would I choose this over Mastra? [0]

                                        [0] https://mastra.ai/

                                        • maxdo a day ago

                                          Why would I choose Mastra over the Anthropic or Cursor SDK?

                                          • sexylinux a day ago

                                            why not pi.dev?

                                            • the_mitsuhiko a day ago

                                              Flue is built on top of pi.

                                          • claudiug a day ago

                                            why not?

                                            • dataviz1000 a day ago

                                              I don't know. That is why I ask why.

                                          • mrtksn a day ago

                                            What I wonder is, why do we still need code at all? Shouldn’t all code just be a prompt? That way it becomes language- and platform-agnostic.

                                            • yoz-y a day ago

                                              How would that actually work? Eventually the specification for the CPU has to be clear. For now, that means machine code.

                                              • piskov a day ago

                                                Natural language is too vague, ambiguous, and inefficient.

                                                That’s why we have _programming_ languages.

                                                And once you specify everything you need, the “prompt” becomes a program.

                                                Anything else is too lossy.

                                              • piskov a day ago

                                                If only there were great backend languages

                                                Go, C#, what have you.

                                                Nah, thank god we have JavaScript

                                                • gavmor a day ago

                                                  Anders Hejlsberg, who designed C#, is now leading TypeScript development. Why would I not join him at the frontiers of his creative and intellectual energies?

                                                  Go is a nice language, but it's not expressive the way TypeScript can be. I'm not convinced, either, that coroutines are all that snazzy an abstraction at the application level.

                                                  • vips7L a day ago

                                                    Kotlin and Scala too if you want the same type of strong type system as TypeScript

                                                    • braebo a day ago

                                                      No type system is as strong as TypeScript — certainly not Kotlin.

                                                      • vips7L a day ago

                                                        Give Scala a try :)

                                                      • wiseowise a day ago

                                                        Lipstick on Java with vendor lock in or another lipstick on Java made by and for academics, tough choice.

                                                        • dominotw a day ago

                                                          coded in scala for over a decade. i am glad i don't have to use it anymore. maybe i am just too stupid for it, never understood the point of all that.

                                                      • jdw64 a day ago

                                                        Not mentioning Java means we are already on the same wavelength.

                                                        I love C# too.

                                                        • morphology a day ago

                                                          You're welcome to port anything over to those languages. LLMs can do it in a couple of days at most.

                                                          • piskov a day ago

                                                            Tell me when the Codex app gets rewritten to a native stack across all platforms: SwiftUI, WinUI, etc.

                                                            Should be easy, yeah?

                                                          • verdverm a day ago

                                                            ADK comes in Go, Java, Python, and TS

                                                            Same framework, multiple languages, let people decide their preference while having consistency and interoperability

                                                            • wiseowise a day ago

                                                              One is stuck in the 80s and another doesn’t even have an official open source debugger, are you serious?

                                                              • pikedev a day ago

                                                                C# is still getting yearly updates and is a joy to work with really.

                                                            • doublerabbit a day ago

                                                                  import { getVirtualSandbox } from '@flue/sdk/cloudflare';
                                                              
                                                              You lost me there. Looked kind of cool too.

                                                              • jadar a day ago

                                                                What was offensive? Was it the use of Cloudflare? Or the way the import was written?

                                                                • doublerabbit a day ago

                                                                  Cloudflare, and the idea of a would-be "virtualsandbox".

                                                              • skullone a day ago

                                                                Another TypeScript library! Woohoo!