>Generative AI doesn't have a coherent understanding of the world
The very headline is based on a completely faulty assumption, that AI has any capacity for understanding at all, which it doesn't. That would require self-directed reasoning and self-awareness, both of which it lacks based on any available evidence. (Though there's no shortage of irrational defenders here who somehow leap to say that there's no difference between consciousness in humans and the pattern matching of AI technology today, because they happened to have a "conversation" with ChatGPT, etc.)
> The very headline is based on a completely faulty assumption, that AI has any capacity for understanding at all, which it doesn't.
And this right here is why it's so frustrating to me that the term "AI" is used for LLMs. They are impressive for certain tasks, but they are nothing close to artificial intelligence and were never designed to be.
> And this right here is why it's so frustrating to me that the term "AI" is used for LLMs. They are impressive for certain tasks, but they are nothing close to artificial intelligence and were never designed to be.
We've long debated whether animals are self-aware in the first place; based on studies, some seem to be and some don't. But all animals have intelligence and the ability to problem-solve to different degrees, and they most certainly learn from experience and evolution.
Artificial intelligence means intelligence that was created artificially. How intelligent it is, and whether or not it is self-aware (it's not), is a different debate from "it's not intelligence".
ChatGPT put it best itself:
> Yes, I possess artificial intelligence, which enables me to process information, answer questions, solve problems, and engage in conversations. However, my intelligence is based on patterns and data, not personal experience or consciousness. Let me know how I can assist you!
AI isn't Skynet, Wintermute or HAL 9000; those are fictional and are what we'd dub ASI (Super), not AGI (General) or ANI (Narrow). ASI is self-aware; the others are simply intelligences that can do either one thing or multiple things, to different degrees of quality compared to humans (not always better, not always worse).
So yes, LLMs have intelligence, and you can claim the same for other software. Whether they're equal or superior to humans is irrelevant, since animals are inferior to us in intelligence to different degrees and we still call them intelligent.
At the end of the day, intelligence can mean different things to different people, so I can't just outright tell you that you're right or wrong; I can only express my viewpoint and why I think LLMs, as well as some other software, are AIs.
EDIT: I should also note that words take on different meanings, and AI is an easy buzzword for the general public, so the word took on extra meaning, and it did so decades ago. We've been calling NPCs in video games AI for decades. There are likely other examples that don't come to mind where AI has been used far longer than LLMs have existed.
> There are likely other examples that don't come to mind where AI has been used far longer than LLMs have existed
We started in 1956. https://en.wikipedia.org/wiki/Dartmouth_workshop
I love that "AI" is suddenly a high bar that the best AI technologies we have don't pass, when the field is 70 years old and includes, you know, straightforward search algorithms.
I mean, I get it. These things are mainstream and close enough in some ways to what we see in science fiction that the term evokes the sci-fi meaning of the term rather than the computer science meaning.
I can't speak to 70 years ago, but in my 30ish years I don't think I have raised the bar of what I'd consider AI. I never thought of text prediction or auto complete as AI for example, though out of context I could see many 70 years ago considering it AI.
I've always considered AI as a much more complex concept than simply appearing to be human in speech or text. Extremely complex prediction algorithms that appear on the surface to be creative wouldn't have met my bar either.
An LLM can predict how a human may write a white paper, for example. It will sound impressive, but the model and related algorithms seem to have absolutely no way of logically working through a problem or modelling the world to know whether the ideas proposed in the paper might work. It's a really impressive-sounding impersonator, but it is nothing like intelligence.
Have you been balking at video game AI this whole time? At chess engines when they were pure search? Or at chess engines now?
It's actually long been noted that the bar keeps rising, that when we figure something out it stops being called AI by a lot of people. I don't think that's necessarily individuals raising their bar, though, but rather that the stuff becomes more mainstream, or that younger people are introduced to it after the fact.
I've never considered chess engines or video games to have AI, I guess in that sense you could say I balked at it.
> It's actually long been noted that the bar keeps rising, that when we figure something out it stops being called AI by a lot of people.
That's interesting, I've seen it the other way around. In the 90s I don't remember anything related to AI depicting something as simple as fancy text prediction or human-like speech. Going back further, HAL from 2001: A Space Odyssey didn't seem like an AI that was just predictive text.
It's always possible my expectations were born from science fiction and not academic research, but that's a pretty big gap if academics considered AI to only require natural language processing and text prediction.
I also never considered chess engines or video games to have AI.
The bar keeps rising because the term used for it promises human-level intelligence from a machine.
The same thing could be said about Virtual Reality, where the bar rises each time the tech gets closer to making VR indistinguishable from reality.
Explain to me how a normal human is intelligent then?
How often have you talked to someone and showed them evidence through logic or other means and they just don't get it? Or just don't want to accept it?
Are people who do not properly self-reflect non-intelligent?
AI only needs to be as good as a human and cheaper
Strangely enough, it is actually solid proof that they have a mental model; they just reject your attempts to amend it. This is where many assert that several fundamental decisions of that model are driven more by the heart than by the brain.
An LLM has no such concept. It is not truly capable of accepting your rational arguments; it may nod emphatically in its response after a few sentences, but once your input leaves its context window, all bets are off. We can retrain it with your particular conversation, and we will increase the likelihood that it will more readily agree with you, but that is likely a drop in the bucket of the larger corpus.
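A minimal sketch of that context-window point (the token budget and conversation turns below are made up for illustration; real models measure context in thousands of tokens):

    # Toy illustration: an LLM only conditions on whatever still fits in its
    # context window; anything older simply stops existing for it.
    MAX_CONTEXT_TOKENS = 10  # hypothetical budget

    conversation = [
        "user: here is a rational argument against your claim ...",
        "assistant: fair point, I agree with your correction.",
        "user: many later turns push that old exchange out ...",
    ]

    def visible_context(turns, budget=MAX_CONTEXT_TOKENS):
        """Keep only the most recent `budget` whitespace tokens (a crude proxy)."""
        tokens = " ".join(turns).split()
        return tokens[-budget:]  # everything before this slice is simply gone

    print(visible_context(conversation))
    # The earlier "agreement" is no longer part of the input, so it cannot
    # influence the next response; only retraining would fold it into the weights.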
But this is well past technology and into philosophy, of which I can safely say my mental model for the topic would likely be rather frustrating to others.
> AI only needs to be as good as a human and cheaper
Setting aside the current limitations of LLMs for a moment, this is just a really unfortunate view of a new tech, in my opinion. It's nothing unique here, so I don't mean to single you out; I've seen it plenty of other places. The idea, though, that we effectively imply that most humans are idiots and we just need to invent something that is as good at being a functional idiot but for much less money leaves humanity in a terrible place. What does the world look like if we succeed at that?
To your main question, though, I think differences in human opinions and ideas can easily be misconstrued as a lack of intelligence. I would also consider the environmental factors affecting the average person.
I'll stick with the US, as it's what I am most familiar with. The average American has what I'd consider a high level of stress, chronic health conditions, one or more pharmaceutical prescriptions, and our food system is sorely lacking and borderline poisonous. With all those factors in play I can't put too much weight behind my view of the average person's intelligence.
> The average ... has what I'd consider a high level of stress, chronic health conditions, one or more pharmaceutical prescriptions, and our food system is sorely lacking and borderline poisonous. With all those factors in play I can't put too much weight behind my view of the average person's intelligence
Affluent societies that abandon the ideal of personal development become bound to forces that drive society toward a catastrophic configuration, with devastating effects: the absurdity, the risks and impact, and the deviation from propriety all merit focus.
I shall add: the Great Scarcity today is... Societies.
For one thing, people often seem stupid to others not so much because they really are so, but because they don't happen to share another's specific context, views and opinions of certain things. Agreement is a terrible way to measure intelligence but so many of us do it.
Secondly, the very fact that you can't convince some people to "get it" (whatever the hell that thing may be, regardless of whether you're the one who's actually failing to get something at the same time) is evidence of their self-directed reasoning. For better or worse, it's there, along with a sense of self that even the ostensibly dumbest person has, excepting maybe those with truly profound mental disabilities.
None of these things apply to LLMs, so why not drop the semantics about similarity, much less equivalence? Self-awareness is a sensation that you yourself feel every day as a conscious human being in your own right, and no evidence whatsoever indicates that any LLM has even a smidge of it.
"Intelligence" is an ability: some have it, and in large amount and complexity, some don't; some use it regularly, intensively and/or extensively, some don't.
AI is a set of engineered tools and they need to be adequate for the job.
When we speak about general intelligence, it will have to be acute and reliable. Surely the metric will have to be well above the low bars set by humans, if only because at that level we already have humans available, and because, as noted, that low-bar level is easily not «adequate for the job».
"Understanding" in the article just means "building a true and coherent world model, a faithful representation". The LLM in question does not: e.g. it has some "belief" in entities (e.g. streets on a map) which are not there.
That kind of understanding has no relation to «self-directed reasoning and self-awareness».
"Understanding" and "self-awareness" is as much a political tool as a practical definition. Pigs are self-aware, and understand, but we don't give them the space to express it.
AI is currently missing pieces it needs to be a "complete being", but saying there's nothing there is falling into the same reductive, "scientific" nitpicking that brought us "fish can't feel pain, they just look like it and act like it".
50% of people voted in a rapist.
Why? For sure not out of logic.
The assumption is that if you just read a lot of information, more than a normal human ever could, you get something similar to what a human builds in their brain through living.
The LLM as the end result: an adult human's state.
It's not that far fetched.
Especially when you see how hard it is to discuss things purely logically, rather than through feelings, opinions, and following the behavior patterns of others.
LLMs in their current form seem to me like intuitive thought, like what we use when in conversation with friends over a beer. It seems like only one part of a future AI brain; there still needs to be another system for maintaining a world model, another for background planning, etc. Add vision and hearing models to take in new info about the world, and we'll have something resembling human intelligence where all that data meets.
None of us has a coherent view of the world; the map is not the territory.
We build approximations: we develop world models. That suffices. That the product is incomplete is just a matter of finiteness.
We build them as we do because we grow up in them first.
AI gets trained on a lot of different worlds (books, stories, science, etc.)
No, that is irrelevant and false. False, because we also access sub-worlds (the world model is made of contexts), just like LLMs (not «AI») do.
Irrelevant, because for the intended purpose (reasoning over a reliable representation of the world) there is no need for an interactive stream - the salient examples of interaction, cause and effect, are provided by the dataset.
A large perception model ("LPM") will not be superior to a large language model ("LLM") if its underlying system is not superior.
A map, if it is useful (which the subjective human experience of reality tends to be, for most people most of the time), is by definition a coherent view of the territory. Coherent doesn't imply perfect objective accuracy.
By definition, you say? By what definition?
I don’t see why something would need to be entirely coherent to be useful. If you have a program which, when given a statement, assigns a value between 0 and 1, to be interpreted as if it were a probability of the statement being true, it needn’t define an entirely coherent probability distribution.
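A minimal sketch of that kind of useful-but-incoherent scorer (the statements and scores below are invented for illustration):

    # A toy "plausibility scorer": assigns each statement a value in [0, 1].
    # It can be useful (true statements score higher) while being incoherent:
    # a statement and its negation need not sum to 1, and mutually exclusive
    # statements can sum to more than 1.
    scores = {
        "Paris is the capital of France": 0.98,
        "Paris is not the capital of France": 0.15,  # 0.98 + 0.15 != 1
        "the die shows an even number": 0.70,
        "the die shows an odd number": 0.65,         # exclusive, yet sums to 1.35
    }

    def coherent_negation(p_a: float, p_not_a: float, tol: float = 1e-6) -> bool:
        """One minimal coherence test: P(A) + P(not A) should equal 1."""
        return abs(p_a + p_not_a - 1.0) < tol

    print(coherent_negation(scores["Paris is the capital of France"],
                            scores["Paris is not the capital of France"]))  # False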
The coherence comes from the relationship between the virtual (mapped) data and the real data it represents (territory), and the ability of the map to correctly predict the attributes of the territory.
Human perception is coherent because it tends to accurately model the physical world from which it's constructed, by means of sensory input and memory; if this were not the case, we would have gone extinct as a species ages ago. While it is true that the model of human perception isn't perfect, it would be incorrect to call it incoherent to the same or a similar degree as the incoherence demonstrated by LLMs, as jmole's earlier comment implies.
Even if one must consider the model of human perception incoherent, one must also consider it vastly more coherent than whatever LLMs model reality on, if anything.
That is emphatically not true - animals and small children that can't speak yet know about object permanence. If something has come from over there and is now here, then it's no longer there.
LLMs do not have that concept, and you'll notice very quickly if you ask chemistry questions. Atoms appear twice, and the LLM just won't notice. The approach has to be changed for AI to be useful in the physical sciences.
Perhaps you are using the wrong LLMs that are not fine-tuned for chemistry.
https://www.science.org/doi/10.1126/science.adg9774
https://www.technologyreview.com/2024/10/18/1105880/the-race...
The argument was that, with current word-based methods, an LLM can puzzle out mechanistic problems better than a human. Turns out it can't at all; it makes errors that no human would. Anyone who looks at such attempts recognizes them as blather that is not rooted in reality.
The parent poster posted a paper where an AI guesses the Hamiltonian (nice, but with iterative methods you at least get an idea of the error associated with your numbers; not sure how far anyone should trust AIs there). Maybe methods that do guesswork on topological networks would help, but I haven't seen those yet.
And yet that's exactly how the world actually behaves - see quantum mechanics. Object permanence is just a heuristic that we develop learning on our training data (our interactions with the macroscopic world).
Your inability to recognize high-dimensional relations between word tokens is no less evidence that you lack coherent understanding.
Emmy Noether, paging Emmy Noether, Emmy Noether to the red courtesy phone please!
You can't say that, given <conditions>, 1-phenylpropane rearranges to 1,2-diphenylpropane; you cannot pull whole atoms out of thin air for an indefinite amount of time. No human would make such a mistake, but AI does, and what's worse, it is impossible to talk the AI out of its strange reasoning.
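That atom bookkeeping can even be checked mechanically. A minimal sketch (the parser is a toy; the molecular formulas are simply C9H12 for 1-phenylpropane and C15H16 for 1,2-diphenylpropane):

    # A rearrangement cannot change the molecular formula, so the element
    # counts on both sides must match. Here they do not.
    import re
    from collections import Counter

    def atom_counts(formula: str) -> Counter:
        """Parse a simple molecular formula like 'C9H12' into element counts."""
        counts = Counter()
        for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
            counts[element] += int(number) if number else 1
        return counts

    reactant = atom_counts("C9H12")   # 1-phenylpropane (propylbenzene)
    product = atom_counts("C15H16")   # 1,2-diphenylpropane

    print(reactant)            # Counter({'H': 12, 'C': 9})
    print(product)             # Counter({'H': 16, 'C': 15})
    print(product - reactant)  # Counter({'C': 6, 'H': 4}) -- atoms out of thin air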
Yeah, I was going to call out MIT for pointing out the obvious, but there's enough noise/misunderstanding out there that this kind of article can lead to the 'I get it' moment for someone.
To be honest… it’s amazing it can have any understanding given the only “senses” it has are the 1024 inputs of d_model!
Imagine trying to understand the world if you were simply given books and books in a language you had never read… and you didn’t even know how to read or even talk!
So it’s pretty incredible it’s got this far!
I mean, I’m amazed by LLMs.
But what you describe is basically done by any human on Earth: you are born not knowing how to read or how to talk, and after years of learning, reading all the books may give you some understanding of the world.
Contrary to LLMs, though, human brains don't have a virtually infinite energy supply, cannot be parallelized, and have to dedicate their already scarce energy to a lot of things other than reading books: moving in a 3D world, living in a society, feeding themselves, doing their own hardware maintenance (sleep…), paying attention not to die every single day, etc.
So, for sure, LLM _algorithms_ are really incredible, but they are also useful only if you throw a lot of hardware and energy at them. I'd be curious to see how long you'd need to train (not use) a useful LLM with only 20W of power (which is more or less the power we estimate the brain uses to function).
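A quick back-of-envelope on that 20W question, assuming a commonly cited rough estimate of about 1,287 MWh for training GPT-3 (Patterson et al., 2021):

    # How long would a ~20 W "brain budget" take to deliver the energy of one
    # large training run? Both figures below are rough estimates.
    TRAINING_ENERGY_WH = 1_287e6  # ~1,287 MWh for GPT-3 training (estimate)
    BRAIN_POWER_W = 20            # approximate human brain power draw

    hours = TRAINING_ENERGY_WH / BRAIN_POWER_W
    years = hours / (24 * 365)
    print(f"{years:,.0f} years")  # roughly seven thousand years at brain power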
We can still be impressed by the results, but not really by the speed. And when you have to learn the entire written corpus in some weeks/months, speed is pretty useful.
> But what you describe is basically done by any human on Earth
You don't simply learn on training data; you are taught by other humans who were in turn taught by more humans still. When you do learn genuinely new information, you know exactly how it makes sense to you, and you can share it with others in a way optimized for them to incorporate it as well.
LLMs aren't just learning to speak and read; they are developing from scratch a framework for comprehending the training data thrown at them. An LLM has no concept of phonetics because it has no ears with which to hear nor a mouth with which to speak. It derived English from pulses of electricity.
Leave a child in a box with no human contact and a library of books and perhaps it will ascribe meaning to the symbols written on the pages, but it will never know what they actually mean. Even if they lived a million lifetimes, and were able to recognize that the symbols appear with certain regularity and could infer some rules of the language, they would never know what concept the string "comprehension" refers to.
Pretty sure the human brain is far more parallelized than a regular CPU or GPU. Our image recognition, for example, probably doesn't take shortcuts like convolution because of the cost of processing each "pixel"; we directly do it all with those millions of eye neurons. Well, to be fair, there is a lot of post-processing and "magic" involved in getting the final image.
There is soooo much post processing happening in the visual cortex. The image from the eye is crude, only in focus in a tiny area, warped and upside down. It's quite incredible how our brains create this seamless virtual reality out of that input soup.
"MIT Researchers have yet to discover self-aware sentient dataset. In unrelated news, Sam Altman promises with another 7 trillion dollars he can turn lead into gold. 'For real this time,' claims Altman."
Joke aside, the piece is not about «self-aware sentient dataset[s]».
Isn't it, though?
Interesting -- I'm surprised this joke didn't land harder.
Thought experiment-- let's say you have the PERFECT transformer architecture and perfectly tuned weights with infinite free compute and instantaneous compute speed. What does that give? Something that allows you to perfectly generalize over your dataset. You can perfectly extract the signal from the noise.
Having a "world model" implies a model of the world and thus implies a model of the self distinct from the world. Either that or the world model incorporates the self, right? Because the thing doing the modeling must be part of the world in order for it to have a model of the world.
So, given a perfect transformer and perfect language model, you will at best get a perfect generalization of the dataset.
You cannot take a dataset that has no "world model" and run it through a perfect transformer and somehow produce a world model, aka, cannot turn lead into gold-- you have created information from thin air and now we are in the realm of magic and not science. It does not matter how much compute you throw at the problem.
Looking at the model is a red herring. The answers are in the dataset.
First of all, let us be clear that a «Perfect transformer architecture», in order to correctly perform induction of the dataset (and «generalize over» it), must also be a "perfect" reasoner. The new architecture will have to have that module working. And you reason over a world model (by definition); hence the problem of whether a "proper" world model is there, and how good its quality is.
You seem to be conflating a «self» and «the thing doing the modeling». But there is no need to confuse the "self" (a perceptual object that is part of the world model) with the processor. The "world model" is the set of ideas (productive of statements) describing reality. The perceptual self may not be greatly important in the world model (it will not be for a large number of problems we will want to propose to the reasoner), and the processor will be just a "topic" in the world model, in descriptive form (the world model contains descriptions of processors).
The expression about a «dataset that [would have] no "world model"» makes no sense. The dataset is the input of protocols ("that fact was witnessed"), the world model is what is reconstructed from the dataset. You have a set of facts (the dataset): you reconstruct an ontology and logos (entities and general rules) that make sense in view of the facts.
> let us be clear that a «Perfect transformer architecture», in order to correctly perform induction of the dataset (and «generalize over» it), must also be a "perfect" reasoner.
Respectfully, this is false. The easiest non-technical intuition for why this is false is that the transformer can be trained on any dataset: if you give it a random-noise dataset, it does not produce "reason" but random noise. If you give it a dataset of language full of nonsense, it will produce nonsense. Using the term "reasoning" is an inappropriate anthropomorphism of the vectorized equation y=f(mx+b), which is effectively what a transformer is.
Respectfully again, it's not productive for me to argue the rest of your points because they proceed from this false premise.
The thing I would encourage you to think about is the relationship between the dataset and the transformer.
I never said that a transformer architecture "reasons". I said that there exists an interpretation of «generalize» that is the one being sought, that it is relevant to building a world model, and that it requires reasoning.
When you write about «tak[ing] a dataset that has no "world model"», that is what we normally do: we gather experience and make sense of it, and that happens through functions that can be called "reasoning" (through "intelligence" etc). We already "«create[] information from thin air»" - it's what we do.
If the «Perfect transformer architecture» cannot reach the proper «world model» through its own devices, it must then be augmented with the proper modules - "Reasoning", "Intelligence" etc.
The point was, "Reasoning", "Intelligence" etc. do not seem to need «self-aware sentien[ce]».
I think you are making my point more eloquently than I could. A perfect transformer cannot add information to a dataset, a (self-aware, sentient) reasoning human can.
Perhaps at this point we are merely speculating on an existence proof-- whether or not there exists such a dataset that contains a sufficient "world model" (for some definition of "world model"). I assert that such a dataset would necessarily include a concept of the self, otherwise the "world model" would be flawed. You seem to disagree.
That's fine.
My argument does not hinge strongly on whether or not sentience is required for such a world model, it is sufficient to say that no such dataset exists that will provide a transformer with sufficient information to condition weights which will have an accurate world model, sentience or not.
The "self-aware sentience" bit was hyperbolic prose for the sake of the joke, but the joke is that anyone pedaling "if only I had more money for compute I could make AGI" is shilling snake oil. No amount of infinite compute can turn an incomplete dataset into anything more than an incomplete dataset, information is conserved.
Hopefully that point is not too controversial.
Is this argument not also applicable to us and our lives?
No, not even close. We are not "tabula rasa"[1] or blank slate*. If you would actually like to understand why, some good books about this are "The Self Assembling Brain"[2] and "The Archaeology of Mind"[3].
[1] https://en.wikipedia.org/wiki/Tabula_rasa
[2] https://press.princeton.edu/books/hardcover/9780691181226/th...
[3] https://www.amazon.com/Archaeology-Mind-Neuroevolutionary-In...
[*] One of the things that frustrates me the most in the discourse on LLMs is that people who should know better deliberately mislead others into believing that there is something similar to "intelligence" going on with LLMs -- because they are heavily financially incentivized to do so. Comparisons with humans are categorical errors in everything but metaphor. They call them "neural networks" instead of "systems of nonlinear equations", because "neural network" sounds way sexier than vectorized y=f(mx+b).
Does AI generate a world that is mapped by language, word pointing to another word, like we do?
Actually that is a real question and it has been studied in philosophy of the mind: https://en.wikipedia.org/wiki/Intentional_object