> The researchers found that the number of code blocks with 5 or more duplicated lines increased by 8 times during 2024.
Counterpoint: devs would have written duplicate-functionality code anyway, but it wouldn't have shown up as an exact match in years prior.
LLMs seem to bring a kind of consistency in variable naming, comments, structure, etc. that people (and teams of people) don't usually have.
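To make that concrete, here is a minimal sketch (hypothetical Python, not GitClear's actual methodology) of why an exact-match duplication metric would fire far more often on LLM-generated code: two humans solving the same problem rarely match line-for-line, while a model asked twice tends to emit the same canonical form, which is exactly what a "5+ identical lines" detector counts.

    # Hypothetical exact-match detector: duplication only counts when two
    # snippets agree verbatim over a window of lines, so stylistic divergence
    # hides functionally duplicate code from a metric like this.
    def duplicated_blocks(a: str, b: str, min_lines: int = 5) -> int:
        """Count windows of >= min_lines consecutive lines appearing verbatim in both snippets."""
        a_lines = [line.strip() for line in a.strip().splitlines()]
        b_lines = [line.strip() for line in b.strip().splitlines()]
        b_windows = {
            tuple(b_lines[i:i + min_lines])
            for i in range(len(b_lines) - min_lines + 1)
        }
        return sum(
            1
            for i in range(len(a_lines) - min_lines + 1)
            if tuple(a_lines[i:i + min_lines]) in b_windows
        )

    # Two humans writing "retry with backoff" rarely match line-for-line...
    human_a = """
    def fetch_with_retry(url, tries=3):
        for attempt in range(tries):
            try:
                return requests.get(url, timeout=5)
            except requests.RequestException:
                time.sleep(2 ** attempt)
        raise RuntimeError("gave up")
    """

    human_b = """
    def get_url(u):
        delay = 1
        for _ in range(3):
            try:
                return requests.get(u, timeout=5)
            except Exception:
                time.sleep(delay)
                delay *= 2
        raise ConnectionError(u)
    """

    # ...while a model asked the same question twice tends to produce the same text.
    llm_a = llm_b = human_a

    print(duplicated_blocks(human_a, human_b))  # 0: same functionality, no exact match
    print(duplicated_blocks(llm_a, llm_b))      # 3: identical text trips the detector

The point is only about detection, not behavior: both human snippets duplicate functionality, but only the verbatim pair registers.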
> The researchers also noted a 39.9 percent decrease in the number of moved lines.
The chart shows moved lines dropping dramatically since 2021 and "churn" increasing dramatically by about the same amount. Neither of these seems remotely plausible, and both trends date to well before LLMs, which rather suggests some kind of measurement artifact: something is causing moved code to be increasingly classified as "churn", probably because it no longer comes through as an exact, precise match. But if you consider churn to be refactoring (which is very plausible), then refactoring is actually increasing, the total opposite of the article's conclusion.
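As a guess at how that artifact could arise (a sketch of one plausible mechanism, not GitClear's actual classifier): if a relocated line is only credited as "moved" when it matches the original byte-for-byte, then any reformatting, rename, or LLM-style rewording that happens during the move gets booked as "churn" instead.

    # Hypothetical classifier, assuming "moved" requires an exact textual
    # match at the new location; anything less falls through to "churn".
    def classify_relocated_line(old_line: str, new_line: str) -> str:
        if new_line == old_line:
            return "moved"   # byte-for-byte identical after relocation
        return "churn"       # reworded/reformatted during the move

    # The same refactor, before and after an auto-formatter or LLM touch-up:
    print(classify_relocated_line("total += price*qty", "total += price*qty"))    # moved
    print(classify_relocated_line("total += price*qty", "total += price * qty"))  # churn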
The yearly dots are the individual data points, so the line between e.g. 2021 and 2022, despite being the main visual marker, does not indicate a change that happened in 2021 and carried into 2022, but rather the change measured in 2022 compared to 2021.
GPT-3.5 came out early in 2022, ChatGPT later the same year. There was a lot of change and community discussion around this sort of thing that year.
The original research is available at: https://gitclear-public.s3.us-west-2.amazonaws.com/AI-Copilo...
Code churn is suboptimal, but I do wonder if this is mostly a sampling artifact: the companies that use GitClear may trend towards hipper companies more susceptible to 10x engineers.
My experience with LLMs has been that they’re good at lifting you from “incompetent” to “somewhere near the middle of the bell curve.” So this result is not surprising. When I have expertise in a topic, I have not found them useful; when I’m incompetent at a topic, they are a real timesaver. It’s a bit like Gell-Mann amnesia: when you’re talented in an area, it’s easy to spot the poor quality of the LLM’s output.
There is a fallacy here. If you know they don't perform well in places where you have the expertise to detect it, it is a mistake to assume they perform well in other areas. My conclusion has been that they are useful for kicking off a task, sometimes useful for ideation, but generally will fall down on anything reasonably complex. The dangerous bit is that it's hard to detect the line where they start to fail.
But like, the expert also knows that the middle-of-the-bell-curve engineer makes mistakes and knows not to blindly trust their code; meanwhile, the person who is "incompetent at a topic" might well literally hire the middle-of-the-bell-curve engineer as a tutor or a consultant... so I don't think it's fair to call trusting them in that position a "fallacy".
Garbage in, garbage out.