This all reminds me a lot of the early 2000s, when big corporations thought they could save a lot of money by outsourcing development work to low-income countries and having their expensive in-house engineers only write specifications. It turns out most of those outsourcing parties won't truly understand the core ideas behind the system you're trying to build, won't think outside the box and make corrections where necessary, and will just build the thing exactly as written in the spec. The result is that to get the end product you want, the spec needs to be so finely detailed and refined that by the time you get both specification and implementation to the desired quality level, it would have taken the same amount of effort (and probably less time and frustration) to just build the system in-house.
Of course outsourcing software development hasn't gone away, but it hasn't become anywhere near as prevalent and dominant as its proponents would've had you believe. I see the same happening with AI coding - it has its place, certainly for prototyping and quick-and-dirty solutions - but it cannot and will not truly replace human understanding, ingenuity, creativity and insight.
More than one project manager has insisted that everything about the system must be documented--that's called the source code.
As you say, by the time you specify everything, you've written the code.
Theoretically a PM could say "the code is disposable and obsoleted by the next deployment. let's just document our prompts."
I don't know if that's a good idea but a lot of people are going to try it.
Do any of these vibe coding tools write out the prompts as specs and then keep the specs up to date as you continue prompting? Seems like specs == formal prompts.
You don't need a tool for that. "You're going to assist me in writing a detailed software spec in markdown. At each step adjust the document to incorporate new information. Suggest improvements and highlight areas which have been ignored so far. Initial description: ..."
If you have multiple of those, you can tell it about required sections / format, or provide a good past example.
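A minimal sketch of what that can look like in practice (assuming the openai Python SDK, v1+, and an API key in the environment; the model name is a placeholder, not a recommendation):

    # Sketch of a spec-maintenance loop: the model rewrites the full spec
    # after each round of new information, so the spec document (not the
    # chat transcript) is the artifact you keep under version control.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    SYSTEM = ("You're going to assist me in writing a detailed software spec "
              "in markdown. At each step, rewrite the full document to "
              "incorporate new information. Suggest improvements and "
              "highlight areas which have been ignored so far.")

    spec = "# Spec\n\n(nothing yet)"
    while True:
        note = input("New information (blank to stop): ")
        if not note:
            break
        resp = client.chat.completions.create(
            model="gpt-4o",  # placeholder model name
            messages=[
                {"role": "system", "content": SYSTEM},
                {"role": "user",
                 "content": f"Current spec:\n\n{spec}\n\nNew information: {note}"},
            ],
        )
        spec = resp.choices[0].message.content
        print(spec)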
Oh yes[1]
Then remember when we said tests should be the specs?
Then we said the end users are the specs?
All of them can be construed as jokes in our erratic search for the correct way to write software without those $150k developers who seem to be the only ones getting the job done, assuming they have a competent management hierarchy and stock-option incentives.
[1] We have waterfall software, and I wonder whether the clause in Crockford’s JSON license, “The Software shall be used for Good, not Evil,” was meant for me
I think this erratic manner of trying to find the correct way is the issue. I am currently nearing my 2nd year at company A in my industry, and while I did know they all kinda suck in their own special way, I honestly had no idea it was this bad until I had to try to make this craziness somehow work for us. Even where standards exist, I do not see people following them. Last year, the one girl who did seem to try to follow some common-sense approach got fired, effectively for using common sense against the big boss's wishes.
What I am saying is: it is a mess from beginning to end, and I am honestly not sure there is any one factor that could solve it.
Have we stopped, really?
Last time I was at a FAANG, my org also had offices in one of those “low-income countries”. So in a way we haven’t stopped.
Also, depending on how you define “low-income”, up to two-thirds of the organisation I worked in was in a low-income country.
> Have we stopped, really?
No, when I was at a large, well-known company a year ago, job listings were 2:1 off-shore (India, South America) vs on-shore. There was also an increasing number of contractors used, even for public-facing stuff you wouldn't expect.
Directly hiring (or hiring through a local company) developers in a “low-income country” - in my experience, Eastern Europe and Latin America - goes a lot better than just contracting out a body of work to a third party. Especially if your company is already fully remote, you’re able to get developers who integrate onto your team just like American devs, and who are just as good at coding.
Deleted
I don’t think this addresses the comment you’re replying to.
It’s sad you’re getting downvoted by gatekeepers. It’s absolutely a good thing that more people have access. Maybe not for inflated coastal salaries and egos, however.
So what have we redefined vibe coding to mean exactly?
The original tweet[1] talked very specifically about not caring about quality: just accepting whatever code the AI produces blindly, as long as you get the black-box output you're looking for, and randomly trying again if you don't.
Are people now using this term to mean "giving an AI agent broad tasks"?
[1] https://x.com/karpathy/status/1886192184808149383?lang=en
I wrote about this last month: "Not all AI-assisted programming is vibe coding" - https://simonwillison.net/2025/Mar/19/vibe-coding/
Vibe coding is when you don't review the code at all. If you're using LLMs to help you write code but you're actually reviewing what they produce (and iterating on it) that's not vibe coding any more.
This battle is almost certainly lost already, but dammit I'm gonna keep fighting anyway!
Salty version: https://bsky.app/profile/simonwillison.net/post/3ll2rtxeucs2...
> Feels like I'm losing the battle on this one, I keep seeing people use "vibe coding" to mean any time an LLM is used to write code
> I'm particularly frustrated because for a few glorious moments we had the chance at having ONE piece of AI-related terminology with a clear, widely accepted definition!
> But it turns out people couldn't be trusted to read all the way to the end of Andrej's tweet, so now we are back to yet another term where different people assume it means different things
I found out this anti-pattern where a newly coined term loses its definition as it spreads more widely is called "semantic diffusion": https://simonwillison.net/2025/Mar/23/semantic-diffusion/
We do need a simple term for "used AI to write code (semi)autonomously, but checked and/or tweaked the result and I care about the quality".
Vibe-but-verify? Faux-Vibe? AiPair? (... I'll see myself out...)
The term for this is "unicorn"
Entirely fictional creature that doesn't exist
Every person who I have seen embracing AI coding has been getting lazier and lazier about verifying
The definition seems to have pretty rapidly moved to 'used an AI coding assistant in some capacity'.
From that tweet:
> I don't read the diffs anymore. When I get error messages I just copy paste them in with no comment, usually that fixes it
That sums up vibe coding, imo.
The article talks about code quality with vibe coding, but I think that misses the point. The real problem is code knowledge. When a vibe coder inevitably needs to debug something, if they have no idea what any of the code does, or why it is the way it is, they are not going to have a good time.
Sure they can copy paste the error into the LLM and hope for the best, but what happens when that doesn’t fix it? I’ve already had to spend hours at work tracking down bugs that ended up in there because someone just blindly accepted the code an LLM wrote, and I fear it’s only going to get worse.
> When a vibe coder inevitably needs to debug something, if they have no idea what any of the code does, or why it is the way it is, they are not going to have a good time
Kernighan's law still applies: debugging is twice as hard as writing the code in the first place.
> I’ve already had to spend hours at work tracking down bugs that ended up in there because someone just blindly accepted the code an LLM wrote, and I fear it’s only going to get worse
And I’ve also had to spend hours at work tracking down badly copy-pasted Stack Overflow code, or code from other places in the codebase that didn’t do what the programmer thought it did. A shitty carpenter will build a shitty staircase whether they have a chisel or a Dremel.
Also from the tweet:
> It's not too bad for throwaway weekend projects, but still quite amusing
You were never supposed to vibe code on serious projects.
Any project that solves a real need will invariably become serious.
I have lost count of the number of quick one off scripts that ended up still being used in production workloads five (or more) years later.
It's fine for tooling or a weekend project. I did it on an internal tool. It's got so much functionality now, though, that Cursor struggles. It was great when I had a blank project and needed x, y, z, and it went off and did its thing. The moment the project got big and needed modifications, it was better that I do it myself.
Also, I am a backend engineer. I don't know what kind of code it's producing for my front end; I just hit accept all. But seeing how it does the backend code, and having to prompt it to do better (architecture, performance, code reuse), I have no doubt the front end of my tool is a pile of poop.
I fear that management will see these small quick gains and think it applies to everything.
I'm of the opinion now that vibe coding is good if you are familiar with the code & stack and can ensure quality. You don't want to give it to a product owner and have them try to be an engineer (or have a backend dev do front end, or vice versa).
Just my opinion.
Like you, I’m far too risk averse to not fact check everything an LLM outputs, but I’ve also fixed bugs that have been present for 5+ years. Maybe at a certain point you can just wait for the next generation of model to fix the bugs. And wait for the generation after that to fix the newly introduced and/or more subtle bugs.
> Sure they can copy paste the error into the LLM and hope for the best, but what happens when that doesn’t fix it?
Neither side cares unfortunately.
When users attempt to prompt away their problems without understanding the error, and it doesn't solve anything, that is still good news for Cursor and Anthropic: it is more money for them.
The influencers encouraging "vibe coding" don't care either. They need their Twitter payouts or YouTube ad revenue.
Vibe coder - someone who uses more coding assistance than I do
I absolutely hate the polarization around "vibe coding".
The whole point of AI agents is to eventually get good enough to do this stuff better than humans do. It's okay to dogfood and test them now and see how well they do, and improve them over time.
Software engineers will eventually become managers of AI agents. Vibe coding is just version 0.1 pre-alpha of that future.
> The whole point of AI agents is to eventually get good enough to do this stuff better than humans do. It's okay to dogfood and test them now and see how well they do, and improve them over time.
I agree with that. The problem I have is that people are getting sucked into the hype and evaluating the results of those tests with major rose-colored glasses. They glaze over all the issues and fool themselves into thinking that the overall result is favorable.
> Software engineers will eventually become managers of AI agents.
Source? This seems very optimistic on both ends (that AI will replace SE work, AND SEs will still be employed to manage them).
"The whole point of AI agents is to eventually get good enough to do this stuff better than humans do"
You can be an enthusiastic adopter of AI tooling (like I am) without wanting them to eventually be better than humans at everything.
I'm very much still in the "augment, don't replace" camp when it comes to AI tooling.
Anyone who has maintained code written by engineers new to the industry, who didn’t understand the context of the system or the underlying principles of what they were writing, may disagree.
It is a scam, invented by someone who is an AI researcher but not a software engineer; the latter discipline rigorously focuses on code quality.
"Vibe-coding" as it is defined, throws away all the principles of software engineering and adopts an unchecked approach into using AI generated code with "accept all changes" then duct-taping it with more code on top of a chaotic code architecture or none (single massive file) and especially with zero tests.
The fact is, it encourages carelessness and ignores security principles, just to make it acceptable to create low-quality, broken software that can be hacked very easily.
You can spot some of these "vibe-coders" if they believe that they can destroy Docusign in a day with Cursor + Claude with their solution.
Who's going to tell them?
Saying that Andrej Karpathy is "an AI researcher, but not a software engineer" isn't a very credible statement.
If you read to the end of his tweet, he specifically says "It's not too bad for throwaway weekend projects, but still quite amusing. I'm building a project or webapp, but it's not really coding - I just see stuff, say stuff, run stuff, and copy paste stuff, and it mostly works."
Your comment might make sense when it's scoped down to the tweet where he coined that term. If you take a look at his larger collection of recent statements on software engineering, it's hard not to put him in the bucket of today's overenthusiastic AI peddlers.
> Saying that Andrej Karpathy is "an AI researcher, but not a software engineer" isn't a very credible statement.
I think it is. He is certainly a great AI researcher / scientist, but not really a software engineer.
> It's not too bad for throwaway weekend projects, but still quite amusing. I'm building a project or webapp, but it's not really coding - I just see stuff, say stuff, run stuff, and copy paste stuff, and it mostly works.
So is that the future of software engineering? "Accept all changes", "copy paste stuff", "it mostly works", and little to no tests whatsoever - that is what "vibe coding" is.
Would you yourself want vibe-coded software that is in highly critical systems such as in aeroplanes, hospitals, or in energy infrastructure?
I don't think so.
Where did Andrej say it was "the future of software engineering"? He very clearly described vibe coding as an entertaining way to hack on throwaway weekend projects.
Try reading the whole tweet! https://twitter.com/karpathy/status/1886192184808149383
"Would you yourself want vibe-coded software that is in highly critical systems such as in aeroplanes, hospitals, or in energy infrastructure?"
Of course not. That's why I wrote https://simonwillison.net/2025/Mar/19/vibe-coding/#using-llm...
> treat the AI like a super-speedy but junior developer on your team
That sounds like it's going to take a lot more time than just writing the code, for an experienced developer. The issue with AI for me is that it produces plausible-looking code which requires a lot of attention to read through, because things that look superficially "right", including any justification in code comments, can actually have really problematic flaws.
I remember when I'd rushed a piece of work while studying, a lecturer told us something like:
There are a few kinds of developers: good, bad, slow and fast. Everyone wants to be a good developer but a fast and bad developer is worse than a slow and bad developer because they end up doing so much damage.
He could have been copying General Kurt von Hammerstein-Equord. Paraphrasing here:
There are four kinds of military officers: smart and lazy, smart and hardworking, dumb and hardworking, and dumb and lazy.
Keep the dumb and hardworking away from any high-level position. They actively make things worse. The dumb and lazy are useful for simple tasks with oversight. The smart and hardworking are excellent for high-level staff and analyst positions.
The smart and lazy should be saved for high-level command. They expend only the energy necessary to get the job done and then focus on the next important task.
I've found utility + speed go up the more conservative the model is in the number of lines it generates. If it only completed a line, it's much more likely to be exactly what I was about to type.
Different people clearly mean different things when they talk about software quality. There is quality as perceived by the user: few bugs, accurately models the problem they have, no more complicated than necessary, etc. Then there is this other notion of quality as something to do with how the software is built. How neat and clear it is. How easy it is to extend or change.
The first kind of quality is the only kind that matters in the end. The second kind has mattered a lot up until now because of how involved humans are in typing up and editing software. It doesn't need to matter going forward. To a machine, the entire application can be rewritten just as easily as making a small change.
I would gladly give up all semblance of the second kind of quality in exchange for formal specifications and testing methods, which an AI goes through the trouble of satisfying for me. Concepts and models matter in the problem domain (assuming humans are the ones using the software), but they will increasingly have no place in the solution domain.
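To make that concrete with a toy example: the artifact the human owns is a property test, and the implementation behind it is fair game for wholesale regeneration. A minimal Python sketch, assuming the hypothesis package; dedupe_sorted here is a hypothetical stand-in for whatever the machine produces:

    # The 'spec' the human owns: properties any implementation must satisfy.
    # The body of dedupe_sorted could be rewritten wholesale by a machine,
    # as long as these properties still hold. Run with pytest.
    from hypothesis import given, strategies as st

    def dedupe_sorted(xs):
        # hypothetical machine-generated implementation
        return sorted(set(xs))

    @given(st.lists(st.integers()))
    def test_dedupe_properties(xs):
        out = dedupe_sorted(xs)
        assert set(out) == set(xs)                       # no elements lost or invented
        assert all(a < b for a, b in zip(out, out[1:]))  # sorted, no duplicates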
The problem domain is part of the solution domain: writing a good specification and tests is a skill.
Moreover, I suspect the second kind of quality won't completely go away: a smart machine will develop new techniques to organize its code (making it "neat and clear" to the machine), which may resemble human techniques. Maybe even, buried within the cryptic code output by a machine, there will be patterns resembling popular design patterns.
Brute force can get results faster than careful planning, but brute force and planning gets results faster than both. AI will keep being optimized (even if one day it starts optimizing itself), and organization is presumably a good optimization.
LLMs think differently than humans, e.g. they seem to have much larger "context" (analogous to short-term memory) but their training (analogous to long-term memory) is immutable. Yet there are similarities as demonstrated in LLM responses, e.g. they reason in English, and reach conclusions without understanding the steps they took. Assuming this holds for later AIs, the structures those AIs organize their code into to make it easier to understand, probably won't be the structures humans would create, but they'll be similar.
The second type of quality is necessary to achieve the first type of quality for systems with nontrivial levels of complexity. It doesn’t need to be perfect, or even close to perfect, but it does need to be “good enough” -
Your end users will eventually notice how long bugs take to get fixed, how long and how often outages occur, and how long it takes to get new functionality into your software.
But beyond your end users, you likely have competitors: if your competitors start moving faster and build a reputation for dependability and responsiveness, your business WILL suffer. You will see attrition, your CAC (customer acquisition cost) will go up, and those costs get absorbed somewhere: in less runway, less capex/opex (layoffs), higher prices, or all of the above. And that’s an entire domain AI isn’t (yet) suited to assist with.
There’s no free lunch.
> The first kind of quality is the only kind that matters in the end.
How easy it is to maintain and extend does absolutely matter, in a world where software is constantly growing and evolving and never "finished"
> The first kind of quality is the only kind that matters in the end.
From a business perspective, this is what's exciting to a lot of people. I think we have to recognize that a lot of products fail not because the software was written poorly, but because the business idea wasn't very good.
If a business is able to spin up its product using some aspect of vibe coding to test its merits, and is able to explore product-market fit more quickly, does it really matter if the code quality is bad? Likewise, a well-crafted product can still fail because the market shifted (maybe it took too long to produce) or because there really wasn't a market for it to begin with. Obviously there's a middle ground here: if vibe coding produces something that constantly fails or is hard to maintain, you've gone too far, but that's a balance to be weighed against business risk.
Low/no code MVP solutions have existed for a long time. Vibe coding seems like you'll get worse results than just using one of those, at least from a bug/support standpoint.
The problem is that the first kind of quality is something that's hard for even human programmers to do well, while AI is, like the rest of the tools that came before, much better at the second.
> The first kind of quality is the only kind that matters in the end.
Yes. But the first kind of quality is enabled with the second kind.
Until we live in a faultless closed loop[1], where with AI "the entire application can be rewritten just as easily as making a small change." you still need the second kind.
[1] and it's debatable if we ever will
It totally depends on the use case.
As a consultant, the majority of my work is around business process automation and integrating cloud systems. We build a lot of small "applications" that change almost constantly. The number of concurrent users is low, the lifespan of the software is typically short, and to justify the effort it has to be done quickly and efficiently.
It's 100% "value engineering".
AI agent pairing has been an absolute game changer. It can single-shot most requirements and refactors. I basically just write technical requirements and read pull requests all day now.
I actually think the quality of the output has gone up significantly, because you can accomplish much more within the same limited budget.
I built Plandex[1] to try to enable a more sustainable approach to vibe coding. It writes all of the model's changes to a cumulative version-controlled sandbox by default, which in my experience helps a lot to address many of the points raised in the article. While there are still categories of tasks I'd rather do myself than vibe code, the default-sandbox approach makes it feel a lot less risky to give something a shot.
On another note, a related but somewhat different technique that I think is still under-appreciated is "vibe debugging", i.e. repeatedly executing commands (builds, tests, typechecks, dependency installs, etc.) until they run successfully. This helps a lot with what imo are some of the most tedious tasks in software development - stuff like getting your webpack server to start up correctly, getting a big C project to compile for the first time, fixing random dependency installation errors, getting your CloudFormation template to deploy without errors, and so on. It's not so much that these tasks are 'difficult' really. They just require a lot of trial-and-error and have a slow feedback loop, which makes them naturally a good fit for AI.
I put a lot of focus on execution control in Plandex in order to make it helpful for these kinds of problems. It's built to be transactional—you can apply the changes from the sandbox, run pending commands, and then roll back all changes if the commands fail. You can do this repeatedly, even letting it continue automatically for some number of tries until the commands succeed (or you hit the tries limit). While there are some limitations to the terminal modality in terms of UX, I think this is an area where a CLI-based agent can really shine.
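Stripped of any Plandex specifics, the core loop reduces to something like the sketch below; ask_model_for_fix is a hypothetical stand-in for the LLM call that would propose and apply changes:

    # Minimal sketch of a vibe-debugging loop: run a command, and on failure
    # hand the combined output to a model for a suggested fix, up to a fixed
    # number of tries. Standard library only.
    import subprocess

    def ask_model_for_fix(cmd, output):
        # Hypothetical: send the failing output to an LLM and apply its
        # suggested edits. Stubbed out here; a real tool would modify files.
        print(f"would ask model to fix `{' '.join(cmd)}` given:\n{output[-2000:]}")

    def run_until_green(cmd, max_tries=5):
        for attempt in range(1, max_tries + 1):
            result = subprocess.run(cmd, capture_output=True, text=True)
            if result.returncode == 0:
                print(f"succeeded on attempt {attempt}")
                return True
            ask_model_for_fix(cmd, result.stdout + result.stderr)
        return False  # out of tries; a human should take a look

    # e.g. run_until_green(["make"]) or run_until_green(["npx", "tsc"])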
Not an excuse, but maybe an explanation. I was just talking to someone who was bragging about getting the cost of coding down to a penny a line. Seems like an insane metric to use, to me, since it takes no account of features developed or maintainability/modifiability.
This is common sense. The whole article read like they asked ChatGPT to fluff one sentence "review vibe code with a human before pushing to prod" into an entire article.
Vibe blogging.
I don't mean to be dismissive (here goes, though) but the whole article boils down to one point, and it's one that frightens me that it needs to be repeated so often: use AI as your intern, not your replacement
Simple.
>In essence, let the AI handle the grunt work, not the brain work.
And importantly make sure the user doing the brain work has the experience and support to do it properly. The XY problem can cause a lot of damage with AI agents that implement what's asked of them instead of asking why.
If you use AI to make low-quality work, then you made low-quality work. It’s pretty simple.
I think all this rests on the unacknowledged fact that most software never goes into production. When you have a choice between paying expensive humans to create software that works vs cheap AI to create software that doesn't, if nobody is ever going to use it, the AI option is the one to pick.
"The big takeaway is that speed means nothing without quality" - I feel like this is not true in 'move fast and break things' ideology
I feel like people don’t really understand that ideology - if all you ever do is break things (or produce broken things), then you’re not actually moving fast at all.
It's hard to take anyone seriously who takes the term "vibe coding" this seriously, considering that the whole thing spawned from a casual tongue-in-cheek tweet. I recently saw a job posting for a "Vibe Coder". Nuts.
Vibe seo vibe seo vibe seo vibe seo
Code vibing.
What percentage of companies can hire an engineer who writes better code than o3?
Given that o3 just spun its wheels endlessly trying to fix a CSS issue by suggesting it create a "tailwind.config.X" file, despite being given the package.json, which contained a clear reference to Tailwind 4.x - I'd say any engineer capable of reading and learning from basic documentation.
For reference, Tailwind 4 will not read in config files by default (which is the older behavior) - the encouraged practice is to configure customizations directly in the CSS file where you import Tailwind itself.
It is most definitely 100%. Any competent programmer can write code better than the current AI tools.
I'm a big booster of AI, but this doesn't even make sense. Any project using even the very best code generator in existence is going to need to be stewarded and tightly monitored by a competent programmer. AI is miraculous, but can't do crap reliably on its own.
100%
highly doubt that
Then why ask the question, if you're so sure of the answer?
This question isn’t useful without context. But yes the answer is probably 100%.
Whatever percentage that can hire an engineer at all.
This won't be 100%, but it'll be the companies that are able to hire somebody to parse the problems that arise. Without that engineer, they'll be doing what OP calls 'vibe coding', meaning they'll neither understand the whole thing nor be able to fix it when it blows up.