Comments Page - A verification layer for browser agents: Amazon case study

A verification layer for browser agents: Amazon case study (sentienceapi.com)
Submitted by tonyww a day ago

tonyww a day ago
One clarification since a few comments from coworkers/friends are circling this: Amazon isn’t the point here.
We used it because it’s a dynamic, hostile UI, but the design goal is a site-agnostic control plane. That’s why the runtime avoids selectors and screenshots and instead operates on pruned semantic snapshots + verification gates.
If the layout changes, the system doesn’t “half-work” — it fails deterministically with artifacts. That’s the behavior we’re optimizing for.
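To make "pruned semantic snapshot" concrete, here is a minimal sketch of the idea, not the actual Sentience runtime. The `Element` record, the `INTERACTIVE_ROLES` set, and both helper functions are hypothetical names chosen for illustration; the real snapshot schema is not described in this thread. The point it tries to show: elements are resolved by role and label from a reduced view of the page, and an ambiguous or missing match is an immediate error rather than a best-effort guess.

```python
from dataclasses import dataclass


# Hypothetical element record; the real snapshot schema is an assumption here.
@dataclass(frozen=True)
class Element:
    role: str      # semantic role, e.g. "button" or "link"
    text: str      # accessible label or visible text
    visible: bool  # whether the element is currently rendered


# Assumed set of roles worth keeping; a real system may use a different list.
INTERACTIVE_ROLES = {"button", "link", "textbox", "searchbox", "checkbox"}


def prune_snapshot(elements, limit=50):
    """Keep only visible, interactive, labeled elements, capped at `limit`."""
    kept = [
        e for e in elements
        if e.visible and e.role in INTERACTIVE_ROLES and e.text.strip()
    ]
    return kept[:limit]


def find_by_label(snapshot, role, label):
    """Resolve one element by role and label; raise on zero or multiple matches."""
    matches = [
        e for e in snapshot
        if e.role == role and label.lower() in e.text.lower()
    ]
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one {role!r} matching {label!r}, found {len(matches)}"
        )
    return matches[0]
```

If a redesign renames or duplicates the "Add to Cart" button, `find_by_label` raises instead of clicking something plausible, which is the "doesn't half-work" behavior described above.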
- tomhow 14 hours ago
  Can you please clarify: is this project something that "people can play with"? I.e., can users download the code and sample data and try it out for themselves, or play with it some other way?
  That's a prerequisite for Show HN.
  I'm removing the Show HN prefix for now, until we get clarity. Then we can consider re-upping the post once we know exactly how to present it.
  tonyww an hour ago
  Yes, the repo is publicly available: https://github.com/SentienceAPI/sentience-sdk-playground
  You can pull it, set up the dependencies (including a Sentience API key), and then run main.py in the planner_executor_local folder.
- ares623 14 hours ago
  > If the layout changes, the system doesn’t “half-work” — it fails deterministically with artifacts. That’s the behavior we’re optimizing for.
  how is this different than building a scraper script that does it traditionally?
  tonyww an hour ago
  Good question. On the surface, it does look very similar to a traditional scraper/script, but there's a subtle difference in where the logic lives and how failures are handled.
  A traditional scraper/script hard-codes selectors and control flow up front. When the layout changes, it usually breaks at an arbitrary line and you debug it manually.
  In this setup, the agent chooses actions at *runtime* from a bounded action space, and the system uses built-in predicates (e.g. url_changes, drawer_appeared, etc.) to verify the outcomes. When it fails, it fails at a specific semantic assertion with artifacts, not at a missing selector.
  So it’s less “replace scripts” and more “apply test-style verification and recovery to AI-driven decisions instead of static code.”
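  A rough sketch of what a verification gate like this could look like, assuming a simplified page state. Only the predicate names url_changes and drawer_appeared come from the description above; their signatures, `PageState`, `VerificationError`, `ALLOWED_ACTIONS`, and `run_step` are hypothetical and are not the SDK's API.

```python
from dataclasses import dataclass, field


# Hypothetical, simplified page state; the real runtime's state is not shown here.
@dataclass
class PageState:
    url: str
    visible_roles: set = field(default_factory=set)


class VerificationError(Exception):
    """Raised when a post-action predicate does not hold; carries artifacts."""

    def __init__(self, predicate, artifacts):
        super().__init__(f"verification failed: {predicate}")
        self.predicate = predicate
        self.artifacts = artifacts


# Predicate names are taken from the thread; the signatures are assumptions.
def url_changes(before, after):
    return before.url != after.url


def drawer_appeared(before, after):
    return "drawer" not in before.visible_roles and "drawer" in after.visible_roles


# The bounded action space: anything outside it is rejected before execution.
ALLOWED_ACTIONS = {"click", "type", "navigate"}


def run_step(action, execute, predicate, state):
    """Execute one agent-chosen action, then gate on a semantic predicate."""
    if action["kind"] not in ALLOWED_ACTIONS:
        raise ValueError(f"action {action['kind']!r} is outside the action space")
    after = execute(action, state)
    if not predicate(state, after):
        artifacts = {
            "action": action,
            "predicate": predicate.__name__,
            "url_before": state.url,
            "url_after": after.url,
        }
        raise VerificationError(predicate.__name__, artifacts)
    return after
```

  Here `execute` stands in for whatever actually drives the browser. A failed step surfaces as, for example, `VerificationError("drawer_appeared")` with the recorded action and URLs attached, rather than as a selector lookup error somewhere in the middle of a script.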
  blibble 12 hours ago
  it costs a lot more
cjbarber 17 hours ago
looks interesting, though note:
> Show HN is for something you've made that other people can play with.
> Off topic: blog posts, sign-up pages, newsletters, lists, and other reading material. Those can't be tried out, so can't be Show HNs. Make a regular submission instead.
https://news.ycombinator.com/showhn.html
- tonyww an hour ago
  Sorry for the misunderstanding. I intended to post it as a news or engineering article, which is why I didn't include *Show HN* in the title.