Show HN: AI Subroutines – Run automation scripts inside your browser tab (rtrvr.ai)
Submitted by arjunchint 4 days ago

saadn92 2 days ago
I built something like this but much worse. No extension, no recording, I literally sit there with Chrome devtools open, do the action manually, copy the 3-4 network requests into a Python script, and replay them with urllib and a cookie jar.
It's absurd but it works. Gumroad's cover image upload for example, their actual API can't do it, but the browser makes 3 requests (presign to their Rails Active Storage endpoint, PUT the binary to S3, POST the signed_blob_id to attach it). Captured those once in April, been replaying them since. I uploaded covers and thumbnails to 9 products today without opening a browser.
Obviously falls apart the second they change anything.
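A minimal sketch of the three-request replay described above, using only `urllib` and a cookie jar. Everything site-specific here is an assumption for illustration: the host `app.example.com`, the `/products/<id>/cover` attach path, and the response field names (`signed_id`, `direct_upload.url`, `direct_upload.headers`), which are modeled on the usual Active Storage direct-upload shape rather than on the captured requests themselves.

```python
import base64
import hashlib
import json
import urllib.request
from http.cookiejar import CookieJar

BASE = "https://app.example.com"  # hypothetical host, not the real site


def make_opener():
    """Build an opener that keeps session cookies across requests."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def presign_request(filename, content_type, data):
    """Step 1: ask the app for a presigned upload URL."""
    checksum = base64.b64encode(hashlib.md5(data).digest()).decode()
    body = {"blob": {"filename": filename, "content_type": content_type,
                     "byte_size": len(data), "checksum": checksum}}
    return urllib.request.Request(
        BASE + "/rails/active_storage/direct_uploads",  # assumed path
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def put_request(upload_url, upload_headers, data):
    """Step 2: PUT the binary to the presigned storage URL."""
    return urllib.request.Request(
        upload_url, data=data, headers=upload_headers, method="PUT")


def attach_request(product_id, signed_blob_id):
    """Step 3: POST the signed blob id to attach it (hypothetical endpoint)."""
    return urllib.request.Request(
        f"{BASE}/products/{product_id}/cover",
        data=json.dumps({"signed_blob_id": signed_blob_id}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def upload_cover(opener, product_id, filename, content_type, data):
    """Replay the three captured requests in order and return the blob id."""
    with opener.open(presign_request(filename, content_type, data)) as resp:
        blob = json.load(resp)
    direct = blob["direct_upload"]
    with opener.open(put_request(direct["url"], direct["headers"], data)):
        pass
    with opener.open(attach_request(product_id, blob["signed_id"])):
        pass
    return blob["signed_id"]
```

In practice the cookie jar would first be loaded with a logged-in session, and any CSRF token the site requires would have to be added to the headers; both are left out of this sketch.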
- arjunchint 2 days ago
  Yes, exactly. Imagine that now anyone, even non-technical people, can just prompt and interact with this hidden, deeper layer of the web, all in their regular browser!
  nearestnabors 2 days ago
  Oh yes indeed
rvz 3 days ago
Aren't there just many ways for the website to just break the automation?
Does this work on sites that have protection against LLMs such as captchas, LLM tarpits and PoW challenges?
I just see this as a never ending cat and mouse game.
- acoyfellow 3 days ago
  It is. They are saying “we are willing to chase the mouse for you for money”.
- arjunchint 2 days ago
  The bigger goal is to build and maintain a global library of popular automations. Users can also quickly re-record and recreate the scripts to update.
  Since it runs inside your own browser, there should be no captchas or challenges. On failure it can fall back to our regular web agent, which can solve captchas.
  Big picture, with the launch of Mythos it might just become impossible for websites to keep up, and they will have to go the way of Salesforce and just expose APIs for everything.
tim-projects 2 days ago
If you could take this recording and turn it into a playwright script - that would be a massive time saver.
Having to redo recordings once they break sounds like too much hassle.
- arjunchint 2 days ago
  Hey, that's a great idea, we will look into this export option. But how would it save time by being a Playwright script?
  Right now since we have a custom sandbox to re-execute the code in, we are using our own syntax and exposed methods. So even now you can edit the generated script.
JSR_FDED 2 days ago
Maybe there’s a middle ground where a small local model can roll with the variations in a site that would break a script, while saving the per token costs?
- arjunchint 2 days ago
  We found Gemini Flash to be the sweet spot for both agentic actions as well as writing code. Even Flash-Lite is too hit or miss.
  We are thinking through self-healing mechanisms, like falling back to a live web agent and rewriting the script.
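One way the fallback idea above could be structured, as a sketch only and not rtrvr's implementation: run the cheap recorded script first, and only on failure hand the error to a live agent that may also return a rewritten script. The names `run_with_fallback`, `agent`, and `save_script` are hypothetical.

```python
from typing import Any, Callable, Optional, Tuple


def run_with_fallback(
    script: Callable[[], Any],
    agent: Callable[[Exception], Tuple[Any, Optional[str]]],
    save_script: Optional[Callable[[str], None]] = None,
) -> Tuple[Any, str]:
    """Try the recorded script; on any failure, fall back to the agent.

    Returns the result plus a label saying which path produced it.
    """
    try:
        return script(), "script"
    except Exception as err:
        # The agent sees the failure and may return a repaired script.
        result, rewritten = agent(err)
        if rewritten is not None and save_script is not None:
            save_script(rewritten)  # persist the healed script for next run
        return result, "agent"
```

The token cost is only paid on the failure path, which is the trade-off the parent comment is asking about.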
amelius 2 days ago
The problem: I don't trust extensions one bit.
- quarkcarbon279 2 days ago
  We open up our client-side code to build trust in putting rtrvr's DOM intelligence in your web apps - https://github.com/rtrvr-ai/rover/tree/main . Our monetization is straightforward, with a subscription - https://www.rtrvr.ai/pricing . Cases of extensions shipping anything or selling user data tend to come up when people build them as side gigs, not when a team pours more than a year into building a highly accurate automation engine. We also have cloud sandboxes if you prefer executing with the same intelligence in the cloud rather than on your own device.
  PS: Also, our data policy if you are interested: https://www.rtrvr.ai/blog/rtrvr-ai-privacy-security-how-we-h...
- arjunchint a day ago
  We don't sell user data, and we are in it to build a generational company.
  We already have 25k+ users and have an open-source extension as well: https://github.com/rtrvr-ai/rover/tree/main/apps/preview-hel...
- notepad0x90 2 days ago
  auditing the code is fairly straightforward if it isn't obfuscated. so long as it doesn't execute dynamic code that is. but the big issue is you can't control when the extension itself gets an update (to my knowledge). and it isn't uncommon to sell browsing data, or the extension itself to someone more shady than the original author down the road.
  amelius 2 days ago
  Yes, this exactly.
daylab 2 days ago
oh this is clever. running in main world dodges a lot of the usual scraping pain. how do you handle sites with strict csp that block inline scripts, is the extension somehow exempt?
- arjunchint 2 days ago
  We execute the code in a sandbox and proxy the fetch calls through main world!