• TeamDman a minute ago

    For 50,000 rows I'd much rather just use fzf/nucleo/tv against JSON files instead of dealing with database schemas. Dealing with embedding vectors rather than plaintext gets slightly more annoying, but it still feels like such a pain in the ass to go full database when really it could still be a bunch of flat open files.

    More of a perspective from just trying to index crap on my own machine vs building a SaaS
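As a rough illustration of the flat-file approach described above, here is a minimal sketch that fuzzy-matches a query against JSON records using only Python's standard library. It stands in for tools like fzf or nucleo and is not how they work internally; the record fields, the `search` helper, and the `cutoff` value are all assumptions made for the example.

```python
import difflib
import json

# Hypothetical records; in practice these would be read from flat JSON files
# on disk with json.load().
RECORDS_JSON = """
[
  {"artist": "The Beatles", "album": "Abbey Road"},
  {"artist": "The Beatles", "album": "Let It Be"},
  {"artist": "Pink Floyd", "album": "The Wall"}
]
"""


def search(records, query, limit=3, cutoff=0.3):
    """Return up to `limit` records whose 'artist album' text is closest to the query."""
    # Map a lowercased search key back to its record.
    keys = {f"{r['artist']} {r['album']}".lower(): r for r in records}
    # get_close_matches ranks candidates by difflib's similarity ratio, best first.
    hits = difflib.get_close_matches(query.lower(), list(keys), n=limit, cutoff=cutoff)
    return [keys[h] for h in hits]


records = json.loads(RECORDS_JSON)
results = search(records, "beatles abbey rd")
print(results[0])
```

At 50,000 rows a linear scan like this is likely still workable for interactive use on one machine, though that depends on record size and has not been measured here.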

    • fsckboy 14 minutes ago

      these days i find myself yearning to type "Beatles abbey rd" and find only "Beatles abbey rd"

      • Manfred 11 minutes ago

        Especially with small datasets it’s more important to be exact at the expense of a user having to fix a typo.
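To make the exact-match preference in this subthread concrete, here is a minimal sketch of a whole-word filter: a record is kept only if it contains every query token. The `exact_match` helper and the sample titles are hypothetical, chosen only to show the behavior.

```python
def exact_match(titles, query):
    """Keep only titles that contain every query token as a whole word."""
    tokens = query.lower().split()
    return [t for t in titles if all(tok in t.lower().split() for tok in tokens)]


# Hypothetical sample data.
titles = ["Beatles abbey rd", "Beatles Abbey Road", "Abbey Road Studios tour"]
hits = exact_match(titles, "Beatles abbey rd")
print(hits)
```

With this filter "rd" does not match "Road", so a typo or abbreviation returns nothing until the user corrects it, which is the trade-off described above.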

      • pinkmuffinere 15 minutes ago

        The rewritten title is confusing imo. Can I propose:

        “Finding ‘Abbey Road’ given ‘beatles abbey rd’ search with Postgres”

        • pinkmuffinere 13 minutes ago

          (The missing close-apostrophe, and the use of “type” are what really confuse me in the original submission)

        • lbrito 35 minutes ago

          I was just starting to learn about embeddings for a very similar use on my project. Newbie question: what are the pros/cons of using an API like gpt Ada to calculate the embeddings, compared to importing some model in Python and running it locally like in this article?

          • storystarling 19 minutes ago

            The main trade-off I found is the RAM footprint on your backend workers. If you run the model locally, every Celery worker needs to load it into memory, so you end up needing much larger instances just to handle the overhead.

            With Ada your workers stay lightweight. For a bootstrapped project, I found it easier to pay the small API cost than to manage the infrastructure complexity of fat worker nodes.
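The memory trade-off described above can be sketched as back-of-the-envelope arithmetic. Every number below is a made-up placeholder, not a measurement of Celery or of any particular model; substitute figures from your own deployment.

```python
def total_ram_mb(workers, base_mb, model_mb=0):
    """Total memory if each worker process holds its own copy of the model."""
    return workers * (base_mb + model_mb)


# Illustrative assumptions only.
WORKERS = 8      # number of worker processes
BASE_MB = 150    # memory per worker without a model loaded
MODEL_MB = 500   # memory for one in-process copy of the embedding model

local_total = total_ram_mb(WORKERS, BASE_MB, MODEL_MB)  # model loaded in every worker
api_total = total_ram_mb(WORKERS, BASE_MB)              # embeddings fetched from an API

print(local_total, api_total)
```

Under these assumptions the model dominates the footprint, which is the "fat worker nodes" effect; the API route swaps that memory cost for a per-request fee and a network dependency.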

            • alright2565 22 minutes ago

              Do you want it to run on your CPU, or someone else's GPU?

              Is the local model's quality sufficient for your use case, or do you need something higher quality?

            • gingerlime 31 minutes ago

              Great post. Explains the concepts just enough that they click without going too deep, shows practical implementation examples, how it fits together. Simple, clear and ultimately useful. (to me at least)

              • cess11 30 minutes ago

                I found fuzzy search in Manticore to be straightforward and pretty good. Might be a decent alternative if one perceives the ceremony in TFA as a bit much.