What I find really strange about this is that I use AI a lot as a “smart friend” to work through explanations of things I find difficult. I am currently preparing for some exams, so I will often give the AI a document and ask for some supporting resources to take the subject further, and it almost always produces something that is plausibly close to a real thing but wrong in the specifics. When you ask for a reference, it is almost invariably a hallucination. So it just amazes me that anyone would stick that in a brief and ship it without checking it even more carefully than they would check the work of a human underling (which they should obviously also check for something this important).
For example, yesterday I got a list of some study resources for abstract algebra. Claude referred me to a series by Benedict Gross (which is excellent, btw). It gave me a link to Harvard’s website, but it was a 404, and it was only with further searching that I found the real thing. It also suggested a YouTube playlist by Socratica (again, this exists, but the URL was wrong) and one by Michael Penn (same deal).
Literally every reference was almost right but actually wrong. How does anyone have the confidence to ship a legal brief that an AI produced without checking it thoroughly?
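One cheap safeguard against "almost right but actually wrong" links is to request every URL before trusting it. The following is only a minimal sketch using Python's standard library; the function names `http_status` and `find_dead_links` are made up for illustration. A successful response only shows that the page exists, not that it says what the AI claims it says.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the request fails."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 for a hallucinated path
    except (urllib.error.URLError, ValueError, OSError):
        return None  # DNS failure, timeout, malformed URL, ...


def find_dead_links(urls, fetch_status=http_status):
    """Return the URLs that did not answer with a 2xx status."""
    dead = []
    for url in urls:
        status = fetch_status(url)
        if status is None or not 200 <= status < 300:
            dead.append(url)
    return dead
```

Some servers reject `HEAD` requests, so anything this flags still deserves a manual look before it is written off. The `fetch_status` parameter exists so the logic can be exercised without network access.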
I think it's easy to understand why people are overestimating the accuracy and performance of LLM-based output: it's currently being touted as the replacement for human labor in a large number of fields. Outside of software development there are fewer optimistic skeptics and much less nuanced takes on the tech.
Casually scrolling through TechCrunch I see over $1B in very recent investments into legal-focused startups alone. You can't push the messaging that the technology to replace humans is here and expect people will also know intrinsically that they need to do the work of checking the output. It runs counter to the massive public rollout of these products which have a simple pitch: we are going to replace the work of human employees.
People are lazy. I’m enrolled in a language class in a foreign country right now - so presumably people taking that class want to actually get good at the language so they can actually live their life here - yet a significant portion of students just turn in ChatGPT essays.
And I don’t mean essays edited with ChatGPT, but essays that are clearly verbatim output. When the teacher asks the students to read them out loud to the class, they stumble over words and grammar that are obviously way beyond anything we’ve studied. The utter lack of self-awareness is both funny and really sad.
The transformer architecture behind LLMs was originally designed for translation, so it makes sense. We have basically eliminated the need to learn foreign languages for day-to-day use anyway; it's only helpful for high-level professional tasks, close literary study, or prestige.
There are a lot of shit tier lawyers who are just in it for the money and just barely passed their exams. Given his notoriety, Lindell is scraping the bottom of the barrel with people willing to provide legal services.
Could it be that they just have to attend the class for technical reasons? Also, once the gadgets can translate for free in real time, you can live in places where you don't speak the language, so maybe they are just prepping for that.
What this story tells us more than anything is that Lindell cannot convince a competent lawyer to defend him, so what he gets instead are clownshod phonies. Either he’s out of cash, or he’s such a terrible client that nobody with a shred of professional responsibility will take him.
I asked ChatGPT to give Wikipedia links in a table. Not one of the 50+ links was valid.
Which version of GPT? I've found that 4o has actually been quite good at this lately, rarely hallucinating links any more.
Just two days ago, I gave it a list of a dozen article titles from a newspaper website (The Guardian), asked it to look up their URLs and give me a list, and to summarise each article for me, and it made no mistakes at all.
Maybe your task was more complicated to do in some way, maybe you're not paying for ChatGPT and are on a less able model, or maybe it's a question of learning how to prompt, I don't know, I just know that for me it's gone from "assume sources cited are bullshit" to "verify each one still, but they're usually correct".
> asked it to look up their URLs and give me a list
Something missing from this conversation is whether we're talking about the raw model or model+tool calls (search). This sounds like tool calls were enabled.
And I do think this is a sign that the current UX of the chatbots is deeply flawed: even on HN we don't seem to interact with the UI components to toggle these features frequently enough that they're the intuitive answer, instead we still talk about model classes as though that makes the biggest difference in accuracy.
Ah, yes you're right - I didn't clarify this in my original comment, but my anecdote was indeed the ChatGPT interface and using its ability to browse the web[#], not expecting it to pull URLs out of its original training data. Thanks for pointing that out.
But the reason I suggested model as a potential difference between me and the person I replied to, rather than ChatGPT interface vs. plain use of model without bells and whistles, is that they had said their trouble was while using ChatGPT, not while using a GPT model over the API or through a different service.
[#] (Technically I didn't, and never do, have the "search" button enabled in the chat interface, but it's able to search/browse the web without that focus being selected.)
Right, but ChatGPT doesn't always automatically use search. I don't know what mechanisms it uses to decide whether to turn that on (maybe free accounts vs paid makes a difference?) but I rarely see it automatically turn on search, it usually tries to respond directly from weights.
And on the flip side, my local Llama 3 8b does a pretty good job at avoiding hallucinations when it's hooked up to search (through Open WebUI). Search vs no-search seems to me to matter far more than model class.
I'm just specific in my prompting, rather than letting it decide whether or not to search.
These models aren't (yet, at least) clever enough to understand what they do or don't know, so if you're not directly telling them when you want them to go and find specific info rather than guess at it you're just asking a mystic with a magic ball.
It doesn't add much to the length of prompts, just a matter of getting in the habit of wording things the right way. For the request I gave as my example a couple of comments above, I wrote "Please search for every one of the Guardian articles whose titles I pasted above and give me a list of URLs for them all." whereas if you write "Please tell me the URLs of these Guardian articles" then it may well act as if it knows them already and return bullshit.
Definitely more complicated. I've been playing around with using it to analyze historical data and using it to generate charts. And yes I've tried many different kinds of phrasing. I have experience working with and writing rules based "expert systems" and have a vague idea of how neural networks are used for image recognition. It's a pretty fun game to get useful information out of ChatGPT.
You cannot ask it to have crop yield as a column in a chart and get accurate information.
It only seems reasonable when producing a single list of items. Ask it for two columns of data and it starts making things up, like bogus Wikipedia links.
You could definitely make the argument I'm using it wrong but this is how people try to use it. I still find this useful because it gives me a start on where to point my research or ask clarifying questions.
It's much better at giving you a list of types of beer and wine that's been produced in history. Just don't trust any of the dates.
If you could share the actual prompts & info you wanted I would be curious to try and see if it is indeed too complex for it or if prompting differently would work better, because I've had it produce tables with multiple columns pulling info from different sources for different columns before so that's definitely not a hard limit... so would be happy to come back to you either with advice on how to do it next time, or with agreement that having tried it myself it is indeed ChatGPT not your prompting that was the problem.
Prompt:
I would like a list of east Indiamen from 1750 to 1800 where you can find how many tons burthen and how many crew. Show as a chart and give me the wikipedia links to the ships. Do not include any ships that do not have wikipedia links.
Here's my customization:
What do you do?:
Software Engineer
What traits should ChatGPT have?:
Show all the options
Be practical above all.
Anything else ChatGPT should know about you?:
I’m an author of science fiction and fantasy.
I like world building for stories.
I know there's hundreds of ways to phrase this and I could probably trick it into generating the chart first and finding the Wikipedia links second. :)
Sorry for going off topic here but I've had the same experience.
I'm not sure which update improved 4o so greatly but I get better responses from 4o than from o4-mini, o4-mini-high, and even o3. o4 and o3 have been disappointing lately - they have issues understanding intent, they have issues obeying requests, and it happened multiple times that they forgot the context even though the conversation consisted of only 4 messages without a huge number of tokens. In terms of chain-of-thought models I prefer DeepSeek over any OpenAI model (4.5 research seems great, but it’s just way too expensive).
It's rather disappointing how OpenAI releases new models that seem incredible, and then, to reduce the cost of running them, they slowly slim these models down until they're just not that good anymore.
No need for the apology, and FYI I broadly agree with everything you say (except about 4.5, which I don't actively disagree with I just haven't played with it myself).
Share the link to the conversation.
The media really hypes the capabilities up, so if you're new to it and it spews out something that looks very detailed and plausible, you just think "wow it worked". They would have no instinct for the failure modes either. The reference points here would be like a paralegal or a computer search tool. You would only really expect errors of omission: the paralegal wants to keep their job, and search cannot find things that don't exist. In that frame of mind when you see that the returned document seems to cover the relevant points and makes sense when you skim it, it seems like job done. The public doesn't get that the LLM will just completely bold-faced make stuff up.
> How does anyone have the confidence to ship a legal brief that an AI produced without checking it thoroughly?
They're treating it like they would a paralegal. Typically this means giving a research task and then using their results, but sometimes lawyers will just have them write documents and ship it, so to speak.
This is making me realize that tech bros treat ChatGPT like the 1930s secretary they never got to have
Everything you’ve said is correct. Now picture a quiet spread of subtle defects seeping through countless codebases, borne on the euphoria of GenAI driven “productivity”. When those flaws surface, the coming AI winter will be long and bitter.
I use it in much the same way as you, and it's been extremely beneficial. But I also would not dream of signing my name on something that has been independently produced by AI, it's just too often blatantly wrong on specifics.
I think people who do are simply not aware that AI is not deterministic the same way a calculator is. I would feel entirely safe signing my name on a mathematical result produced by a calculator (assuming I trusted my own input).
LLMs are deterministic [0]. An LLM is a pure function that takes a list of tokens and returns a set of token probabilities. To make it "chat" you use the generated probabilities to pick a token, append that token to the list, and run the LLM again. Any randomness is introduced by the external component that picks a token using the probabilities: the sampler. Always picking the most likely token is a valid strategy.
The problem is that all output is a "hallucination", and only some of it coincidentally matches the truth. There's no internal distinction between hallucination and truth.
[0] Theoretically; race conditions in a parallel implementation could add non-determinism.
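The split between the model (a pure function) and the sampler can be sketched in a few lines. This is a toy illustration, not a real LLM: `toy_model`, its five-word vocabulary, and its hand-written scoring rule are all invented for the example. It only shows that with greedy picking, the same prompt always yields the same continuation.

```python
VOCAB = ["the", "court", "cited", "cases", "."]


def toy_model(tokens):
    """Stand-in for an LLM: a pure function from a token list to probabilities.

    The scoring rule is arbitrary (hypothetical); a real model computes these
    numbers with a neural network, but the interface is the same.
    Assumes every token in `tokens` is a member of VOCAB.
    """
    last = tokens[-1] if tokens else "."
    # Favor the word that follows `last` in VOCAB, wrapping around.
    favored = (VOCAB.index(last) + 1) % len(VOCAB)
    return [0.6 if i == favored else 0.1 for i in range(len(VOCAB))]


def greedy_sampler(probs):
    """Always pick the most likely token; no randomness involved."""
    return max(range(len(probs)), key=lambda i: probs[i])


def generate(prompt, steps, sampler=greedy_sampler):
    """Pick a token, append it, and run the model again, `steps` times."""
    tokens = list(prompt)
    for _ in range(steps):
        probs = toy_model(tokens)
        tokens.append(VOCAB[sampler(probs)])
    return tokens
```

Nothing inside `toy_model` or `generate` records whether the continuation is true. The loop only ever sees probabilities, which is the sense in which there is no internal distinction between hallucination and truth.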
True, though speed optimizations and instabilities on the GPU often make LLMs very non-deterministic in practice.
Which doesn't detract from your main point: there's not a lot of distinction between hallucinations and what we'd consider to be the "real thing." There have been various attempts to measure hallucinations, and we can figure out things like how confident the model is in a particular answer...but there's nothing grounding that answer. Saturate the dataset with the wrong answer and you'll get an overconfident wrong result.
While this is technically correct, everyday use of LLMs involves a non-zero temperature, so they (the whole package that people think of as “AI”) are non-deterministic in practice.
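To make "non-zero temperature" concrete, here is a rough sketch of temperature sampling. The logits are made up, and real inference stacks add further steps (such as top-p filtering) that are omitted here. It assumes the common formulation in which the logits are divided by the temperature before the softmax.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive; use argmax for greedy")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature, rng):
    """Draw one token index at random according to the scaled probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
rng = random.Random()  # unseeded, so repeated runs can differ
draws = [sample_token(logits, temperature=1.0, rng=rng) for _ in range(20)]
print(draws)  # token indices drawn at random; the list varies between runs
```

As the temperature approaches zero the distribution collapses onto the top-scoring token, which recovers the "always pick the most likely token" strategy; at higher temperatures the less likely tokens get picked more often.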
No, hallucinations occur when the LLM is missing information.
That's not correct, and seems to be based on a common misunderstanding of how LLMs work, the rough idea being that when the info the model is being asked for had been in the data used for training, it "looks it up" not unlike software looking up info from a huge database of general knowledge, and that when that lookup fails it falls back to making stuff up. But that's wrong, the models are actually doing the exact same thing when they're hallucinating as when they're correct, just the result is different.
Hallucinations happen when the string of tokens the model judges most likely turns out to contain incorrect information. That is true regardless of whether the correct information is "missing", and regardless of whether the correct information would have been output had the model, when selecting the first token of the response, picked the option it considered second best rather than best.
Whether or not a piece of information was in the training set can obviously influence the likelihood of a model hallucinating when asked about the subject, but it can easily hallucinate about stuff that was in the training and it can also get things right that weren't in the training data.
If an LLM happens to know the answer to your question, that answer will have the greatest weight, and will therefore become a non-hallucinated output. Otherwise the output will be hallucinated. Note that a hallucination may manifest as an attempt to extrapolate, which may be successful. If you query an LLM with prior knowledge that the LLM doesn’t know the answer, you are guaranteed to receive a hallucinated output.
Or at least this is how I interpret the term.
But then isn't this also technically true that any software including a pseudo-random number generator is deterministic ? (Starting with itself, like that sampler you mention ?)
And while it might be important in some contexts, like debugging using either the exact same or different seeds, isn't this one of them where it rather confuses the issue ?
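The point about pseudo-random number generators can be checked directly: a sampler driven by a seeded PRNG is reproducible, while an unseeded one generally is not. A small sketch with Python's `random` module (the token list and weights are made up for illustration):

```python
import random

TOKENS = ["yes", "no", "maybe"]
WEIGHTS = [0.5, 0.3, 0.2]  # hypothetical token probabilities


def sample_sequence(seed, length=10):
    """Draw `length` tokens using a PRNG initialized with `seed`."""
    rng = random.Random(seed)
    return rng.choices(TOKENS, weights=WEIGHTS, k=length)


# Same seed, same "random" output, every time.
assert sample_sequence(seed=42) == sample_sequence(seed=42)

# seed=None falls back to OS entropy or the clock, so runs usually differ.
print(sample_sequence(seed=None))
```

In that sense both views hold: the sampler is deterministic given its seed, but a user who never sees or controls the seed experiences the output as random.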
Lindell's lawyer claimed that somehow the preliminary copy (before human editing) got submitted to the court - that they actually did the work to fix it, but then slipped up in submitting it.
I could see that, especially with sloppy lawyers in the first place. Or, I could see it being a convenient "the dog ate my homework" excuse.
Having not looked into it, I would guess that his lawyers know they aren’t going to get paid any time soon.
Seems like it's a fast track to not getting paid ever (disbarment).
Reading your comment, I'd like to coin the "AI-enhanced Dunning-Kruger".
> Wang ordered attorneys Christopher Kachouroff and Jennifer DeMaster to show cause as to why the court should not sanction the defendants, law firm, and individual attorneys. Kachouroff and DeMaster also have to explain why they should not be referred to disciplinary proceedings for violations of the rules of professional conduct.
Glad to see that this is the outcome. Similar to bribes and other similar issues, the hammer has to be big and heavy so that people stop considering this as an option.
> "[T]he Court identified nearly thirty defective citations in the Opposition. These defects include but are not limited to misquotes of cited cases; misrepresentations of principles of law associated with cited cases, including discussions of legal principles that simply do not appear within such decisions; misstatements regarding whether case law originated from a binding authority such as the United States Court of Appeals for the Tenth Circuit; misattributions of case law to this District; and most egregiously, citation of cases that do not exist," US District Judge Nina Wang wrote in an order to show cause Wednesday
30+ years ago when I was in law school [1] I would practice legal research by debunking sovereign citizen and related claims on Usenet. The errors listed above are pretty much a catalog of common sovereign citizen legal research errors.
Just add something about gold fringed flags and Admiralty jurisdiction and it would be nearly complete.
The sovereign citizen documents I debunked were usually not written by lawyers. At best the only legal experience the authors usually had was as defendants who had represented themselves and lost.
Even they usually managed to only get a couple major errors per document. That these lawyers managed to get such a range of errors in one filing is impressive.
[1] I am not a lawyer. Do not take anything I write as legal advice. Near the end of law school I decided I'd rather be a programmer with a good knowledge of law than a lawyer with a good knowledge of programming and went back to software.
What is it with the American far-right and hiring the most _incompetent possible lawyers_? Like, between this and Giuliani...
Think about the quality of lawyer who would take Lindell as a client.
He’s a bankrupt, likely mentally ill acolyte of a dude who is infamous for stiffing his lawyers. His connection with reality is tenuous at best.
Our justice system prides itself on giving everyone due process and a fair trial, even people you hate
I don't think anyone claimed he doesn't deserve due process. The only people I know of arguing against due process lately are in fact those in the White House.
This guy is getting exactly the kind of lawyers he deserves, and it’s nobody’s fault but his own
This is an appeal for a civil suit he lost.
This. As a junior lawyer at a large law firm, one of my jobs was checking every cited case to confirm that it was cited correctly, that it actually supported the theory that it was being used to support, and that it hadn't been overturned or qualified by subsequent case law. It's a process called Shepardizing that every law student does. So I can't fathom how fictitious cases could possibly be included in a brief. Also, just slightly mischaracterizing a case in a brief could be cause for being sanctioned. So I don't see how this type of issue would not undoubtedly result in sanctions.
One of the defining characteristics of this crop of alt-right, populist nutters is disdain for experts. Doesn't work so well when they need legal or medical advice.
Because competent lawyers tend to adhere to professional standards and codes of ethics, which makes them more selective in the work and clients they take on.
They think everyone is doing it and not getting caught.
Everything the right accuses anyone of, they're doing it too. That's why they don't really care about criminals and pedophiles and racists in their ranks. They think everyone is a child diddling criminal racist.
The problem is that Trump, Musk, Lindell, etc are all extremely arrogant and constantly disregard sound legal advice. Their lawyers aren't merely associated with a controversial client; their professional reputation is put at risk because they might lose easily winnable cases due to a client's dumb tweet. You have to be a crappy lawyer (or an unethical enforcer like Alex Spiro and Roy Cohn) to even want to work with them.
>The problem is that Trump, Musk, Lindell, etc are all extremely arrogant and constantly disregard sound legal advice. Their lawyers aren't merely associated with a controversial client; their professional reputation is put at risk because they might lose easily winnable cases due to a client's dumb tweet.
Bingo. This has nothing to do with ideology. Good lawyers like to win. And when a client is demonstrably too stupid to let them do that, why bother.
Being stupid and controversial is now a popular ideological option.
There's a quote I can't find right now about how fascism is associated with lower competence because it not only prioritizes but demands loyalty over all else and you get a bench made up of just the best asskissers, ideologues, extremists.
If their goal is to hire people who believe in their cause, their hands are tied
Some of the prominent people on the right have tried to ignore the law, to not let the law modify their behavior, fighting off lawsuit after lawsuit, and adverse ruling after adverse ruling. If you're going to do that, you have to file a lot of motions. That seems to drive an emphasis on volume rather than quality of motions in reply. At least, that's my perspective as an outside observer.
It's not like there are many lawyers left who are willing to represent them. Either because they have behaved so utterly vile like Alex Jones, the case is so clear cut due to their own behavior that there is zero chance of achieving more than a token reduction in sentence (while risking the ire of the clueless fanbase for a "bad defense job") like in this case, or because they have a history of not paying their bills like Trump.
That leaves only those as lawyers who already have zero reputation left to lose, want to make a name for themselves in the far-right scene, who are members of the cult as well, and those who think they can milk an already dead/insolvent horse.
These are often also simply hard clients.
Jones is a good example of this. He cycled through about 20 different lawyers during the Sandy Hook trials. The reason he was defaulted is that whenever he was required to produce something, he'd fire the lawyers (or they'd quit) and hire new ones, and invariably in the depositions the answer to "did you bring this document the court mandated that you produce" was "oh, sorry, I'm brand new to this case and didn't know anything about that".
Jones wasn't cooperating with his lawyers.
There are plenty of good lawyers that have no problem representing far right figures. The issue really comes down to those figures being willing to follow their lawyer's advice.
The really bad lawyers simply don't care if their clients ignore their advice.
Selection bias on your part. There's plenty of incompetence (and outright fraud) on the other side as well.
Remember Michael Avenatti?
The attorney for a porn actress who had an affair with a political candidate who embezzled funds to pay her off does have a certain similarity or common nexus to an attorney for a key member of the presidential whack pack.
I don’t think that nexus is political, for either party. It’s all tied to one man.
This seems like both-sidesism at its worst. Michael Avenatti is one man, and he represented Stormy Daniels, who is hardly a significant figure on the left compared to Rudy Giuliani or Mike Lindell. I don't see Democrat-leaning CEOs (e.g. Howard Schultz) hiring lawyers like this. And Trump's lawyers are far worse than Biden's!
is Stormy Daniels the far left?
See, the logic is that Stormy Daniels caused Dear Leader trouble, anyone who gets in Dear Leader's way or causes him trouble is Radical Left, therefore Stormy Daniels is Radical Left.
I don’t think she actually has a known political affiliation.
I wonder what the effects of an echo chamber in a forum like this would be.. maybe something similar to what Reddit has become
You couldn't criticize Musk here a few years ago without the fanboys dog-piling. Same for Apple before their more recent stumbles.
I don't understand how a lawyer can use AI like this and not just spend the little time required to check that the citations actually exist.
I constantly see people reply to questions with "I asked ChatGPT for you and this is what it says" without a hint of the shame they should feel. The willingness to just accept plausible-sounding AI spew uncritically and without further investigation seems to be baked into some people.
I've seen this as well and I've seen pushback when pointing out it's a hallucination machine that sometimes gets good results, but not always.
Way too many people think that LLMs understand the content in their dataset.
That sort of response seems not too different from the classic "let me google that for you". It seems to me that it is a way to express that the answer to the question can be "trivially" obtained yourself by doing research on your own. Alternatively it can be interpreted as "I don't know anything more than Google/ChatGPT does".
What annoys me more about this type of response is that I feel there's a less rude way to express the same.
Let me google that for you is typically a sarcastic response pointing out someone’s laziness to verify something exceptionally easy to answer.
The ChatGPT responses seem to generally be in the tone of someone who has a harder question that requires a human (not googleable), and the laziness is the answer, not the question.
In my view the role of who is wasting others time with laziness is reversed.
It's worse, because the magic robot's output is often _wrong_.
Well wrong more often. It's not like Google et al has a monopoly on truth.
The issue is not truth, though. It's the difference between completely fabricated but plausible text generated through a stochastic process and a result pointing towards writing that at least exists somewhere on the internet and can be referenced. Said source may have completely unhinged and bonkers content (Time Cube, anyone?), but it at least exists prior to the query.
At least those folks are acknowledging the source. It's the ones who ask ChatGPT and then give the answer as if it were their own that are likely to cause more of a problem.
Go look at "The Credit Card Song" from 1974. It's intended to be humorous, but the idea of uncritically accepting anything a computer said was prevalent enough then to give the song an underlying basis.
I downvote comments like that, regardless of platform, in almost all situations. They don't really contribute much to the majority of discussions.
I think shame is disappearing from American culture. And that's a shame.
Shame? It's often constructive! Just treat it for what it is, imperfect information.
If I wanted ChatGPT's opinion, I'd have asked ChatGPT. If I'm asking others, it's because it's too important to be left to ChatGPT's inaccuracies and I'm hoping someone has specific knowledge. If they don't, then they don't have to contribute.
It's not constructive to copy-paste LLM slop to discussions. I've yet to see a context where that is welcome, and people should feel shame for doing that.
I see your frustration that these people exist who don’t share your values, but their comments already get downvoted. Take the win and move on.
'member the moral panic when students started (often uncritically) using Wikipedia ?
Ah, we didn't know just how good we had it...
(At least it is (was ?) real humans doing the writing, you can look at modification history, well made articles have sources, and you can debate issues with the article in the Talk page and even maybe contribute directly to it...)
You could probably use AI to check that the citations exist
The multiplying of numbers less than 1 together will continue until 1 is reached.
Clearly we just need to invent a "-2" AI
And if they don't the AI will make up some for you
Maybe someone can make a browser extension that does not take 404 for an answer but just silently makes up something plausible?
It's not "a little time"
The Judge spent the time to do exactly this. Judges are busy. Their time is valuable. The lawyer used AI to make the judge do work. The lawyer was too lazy to do the verification work that they expected the judge to perform. This speaks to a profound level of disrespect.
I highly doubt the judge was tracking down citations or reading those cited cases herself to verify what was in them. They have law clerks for that. It doesn’t make it any less an egregious waste of the court’s time and resources, but I would be surprised if a district court judge is personally doing much, if any, of that sort of spadework.
Checking if a case exists or not is little time in the context of legal research.
Perhaps not, but it is the time required to discharge their obligation under Rule 11 of the Federal Rules of Civil Procedure (IANAL).
It’s “paralegal time” which is nearly free …
Courts are not allocated an unlimited budget for clerks.
Outside of the literal dollar cost, the opportunity cost here is further delays on the docket because the clerk was unable to do something else, and the court time that must now be spent dealing with the issue.
First, you're confusing time with money
Second, the mistakes weren't just incorrect citations any paralegal could check
> Second, the mistakes weren't just incorrect citations any paralegal could check
... Some of the 'mistakes' (strictly speaking they are not mistakes, of course) are _citations of cases which do not exist_.
... just ...
Wait until you guys hear about how they used AI in the California bar exam.
https://www.sfgate.com/bayarea/article/controversy-californi...
The lawyer jokes aren't funny anymore...
> ... response to the widespread disruptions, the Committee of Bar Examiners, or CBE, voted on April 18 to lower the raw passing score for the February exam from 560 to 534, “two standard errors of measurement below ...
Although different states are involved, perhaps this goes some way toward explaining how Lindell's lawyers could have passed their bar exams.
I'm not too familiar with his lawyers, but I suspect they passed their exams a long time ago in another state.
Bar exams are funny things. Most states have a reciprocity with the NY bar, so when you think lawyer, think the NY bar.
But California is considered a harder bar to pass and has little reciprocity.
Somewhat surprisingly the hardest bar is Louisiana's. This is because their legal system is a crawdad fucking mess. They inherited their code based system from the French for a lot of local matters, but then also have to deal with the precedent based system the rest of the US uses. So you have to memorize two completely different types of law at a very high level. So, if you ever meet a Louisiana lawyer, you know you've met a very intelligent and dedicated person.
Obligatory IANAL here.
How ? The issue seems to have been that they had not revealed that LLMs were involved in the creation of the multiple choice questions. The questions/answers themselves seem to have passed the bar ? (no pun intended)
A much worse failure seems to have been the incompetent software to run the tests. And that for something as high level they would have decided to do it through the mediation of a computer as well as used multiple choice questions in the first place.
This is a really crazy story
This is just Mata v. Avianca again
Everything about this entire situation is comically dumb, but shows how far the US has degraded, that this is meaningful news. If this were a fiction book, people would dismiss it as being lazy writing - an ultra conservative CEO of a pillow company spreads voting conspiracies leading to a lawsuit in which they hire lawyers that risk losing the case because they relied on AI.
Let's not forget the majestic event that was Cyber Symposium.
But here we have an example of someone not escaping justice due to his now-evaporated wealth. I'd call it a positive.
Quite dumb. If it were a book it would be "Infinite Jest", and the receipts of everyone who bought the pillows could be used to enter into some inane raffle.
Because this sort of thing is totally geographically bound.
‘Murica is currently the most notable nation run by a cult of personality. Clown car legal maneuvers by politicians and politician-adjacent people aren’t supposed to be like this.
Russia? North Korea? China? India? Turkmenistan? Azerbaijan?
10 years ago, would you have put the United States on a list with Turkmenistan?
Argentina, Italy usually, ...
"You IDIOT!!! And you have IDIOT lawyers too." There. I said it. It needed to be said and I feel so much better.
dupe https://news.ycombinator.com/item?id=43799823
You missed this one, gnabgib
That’s so stupid, he almost deserves to lose the case just for that
He needs punishment for himself, not for the people or entity he's representing.
Is it possible that these AI models will tell someone what they want to hear rather than the truth?
I mean, that's always been tech's modus operandi....
As an attorney, I’ve found that this isn’t the issue it was a year ago.
1. Use reasoning models and include in the prompt an instruction to check the cited cases and verify holdings.
2. Take the draft, run it through ChatGPT deep research, Gemini deep research, and Claude, and tell each to verify holdings.
I still double check, for now, but this is catching every hallucination.
Thanks for giving us the reality check. Distasteful as some here find it, squarely presenting the facts of what is surely becoming common practice is a service to the public.
With the Court's reply to Lindell, you now have an independent test case upon which to test your verification process and compare results against a "rival implementation" -- the Court's. One wonders if it may be AI-assisted as well. I'd be quite interested in hearing how the two stack up.
> this isn’t the issue it was a year ago
From the article, it looks like this brief was dated Feb 25 this year.
You can't tell what is and isn't parody anymore.
> still double check, for now
Whew, that's 4 LLM inference requests and still requires manual checking. Criminal levels of waste and inefficiency. Learn how to use LexisNexis, spend some time in a law library handling actual physical casebooks. Learn to do your job.
Even with checking, it turns a 3 day brief into a 4 hour brief.
And, part of the process is to do some research first, find the key cases, and the briefs of better lawyers on the same issue, and include them in the context.
And the time savings are passed onto the clients?
This is incompetent use of AI, and the news about it is becoming tiring. The result is that whenever I talk to some people outside the tech circle they just undeniably believe that AI will never be commonplace in high stakes situations, which is just a rapidly moving bar.
> The result is that whenever I talk to some people outside the tech circle they just undeniably believe that AI will never be commonplace in high stakes situations
And, I mean, they're probably right, because, well, see the pillow guy's lawyer.
The most important thing to understand about AI is that people (not you, I'm sure, but the majority) will use it incompetently and unquestioningly.
These stories are important, you personally don't have to read them if you're tired. But the more cases there are the bigger the extant threat, and the more we need to be educated so we can defend against it.
We are all going to be affected by the omnipresent reliance on AI that allows people to rush out their tasks and get home from work sooner.